G-sofa

XML SiteMap - Duplicate URL issue

Hi,

I recently took a WordPress website live, and since then I have been adding a list of plugins. I installed Google XML Sitemaps to generate a sitemap automatically and forward it to Google, Bing, etc. Being new to WordPress, though, I wasn't sure which plugin was best, with Yoast and All In One being alternatives that get good reviews. When developing websites outside WordPress, without the luxury of plugins, I used www.xml-sitemaps.com, so I decided to run it to see what it produced. I got a comprehensive XML sitemap; too comprehensive, to be honest. There were duplicate URLs with an additional /?s= on the end (screenshot attached). I am flummoxed! I tried these URLs and they work, even though the pages don't exist under 'Pages' in the WP admin. So I have three questions (more, I am sure, but three for now).

One:
What is the best plugin to use so the XML sitemap is generated and forwarded automatically at a set frequency to all the major search engines as well as updating Webmaster Tools?

Two:
I have the Google XML Sitemaps plugin activated. Will it already have submitted the most recent version of the sitemap to Google, Bing, etc.?

Three:
I don't want to upload the sitemap generated using www.xml-sitemaps.com while it still contains the duplicates with the additional /?s= on the end. What is the best way to remove the duplicates permanently, so that I have an up-to-date XML sitemap in place from now on?

Many thanks for looking

Screen Shot 2018-02-04 at 20.10.06.png

It could be that the sitemap generator is crawling search results. ?s= is the URL parameter that WP uses for search queries.

If you are worried that search engines will make the same "mistake" as the sitemap generator you could take one of these precautions:
 

Note: First of all, go through the internal links on your site and make sure none of them contain the search query string.

METHOD 1

In your robots.txt file, add this line (under a User-agent group such as User-agent: *) to stop crawlers from fetching any search query URLs

Disallow: /*?s=

METHOD 2

In your document's head element (probably located in header.php) you can use a WordPress function to check whether the page is a search results page (which can have an endless number of URLs) and add a robots meta tag.

<?php
if ( is_search() ) {
    echo '<meta name="robots" content="noindex">';
}
?>

Or use the Yoast settings to do it manually on each page (you can also manually set a canonical URL).

https://kb.yoast.com/kb/canonical-urls-in-wordpress-seo/ 

METHOD 3

Submit a sitemap to Google, Bing and other search engines with only the URLs you want indexed
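If you want to keep using the sitemap from www.xml-sitemaps.com, something like the sketch below could strip the search URLs out before you submit it. It is only a rough example in Python, not part of WordPress or any plugin: the function names are made up for illustration, and it assumes the file is a standard sitemap in the sitemaps.org 0.9 namespace and that the unwanted entries are exactly the ones carrying the `s` query parameter.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

# Namespace used by standard XML sitemaps (assumed for the generated file).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def is_search_url(url):
    """Return True if the URL carries the `s` search parameter, even when empty."""
    query = urlsplit(url).query
    return "s" in parse_qs(query, keep_blank_values=True)


def strip_search_urls(sitemap_xml):
    """Return the sitemap XML with every <url> entry pointing at a search URL removed."""
    # Keep the sitemap namespace as the default one when serialising.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Iterate over a copy so entries can be removed while looping.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and loc.text and is_search_url(loc.text.strip()):
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

You would read the downloaded sitemap file, pass its contents to strip_search_urls, and save the result as the file you upload. Note that only the exact parameter name `s` is matched, so a URL such as ?sort=asc would be left alone.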

Edited by Nillervision

Nillervision, 

Thank you so much for the three detailed options to fix the issue. So just to confirm, the ?s= is a WP query string? Is there a way to avoid this situation in future? Might it have been caused when I migrated the development website to the live domain using the plugin All-in-One WP Migration?

I'm sure this is only an issue because you have some links containing the query string somewhere on your site, probably because of a quick copy/paste action. Check the URLs in your menus and the inline links in your posts.
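If checking every menu and post by hand is tedious, a small script along these lines could flag the offending links for you. It is only a sketch in Python with made-up names (SearchLinkFinder, find_search_links), and it assumes you have the rendered HTML of a page as a string, for example saved from your browser.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


def has_search_param(href):
    """Return True if the link carries the `s` search parameter, even when empty."""
    return "s" in parse_qs(urlsplit(href).query, keep_blank_values=True)


class SearchLinkFinder(HTMLParser):
    """Collects the href of every <a> tag that contains the search parameter."""

    def __init__(self):
        super().__init__()
        self.search_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and has_search_param(value):
                self.search_links.append(value)


def find_search_links(html):
    """Return all link targets in the HTML that include the `s` query parameter."""
    finder = SearchLinkFinder()
    finder.feed(html)
    return finder.search_links
```

Run find_search_links on the HTML of your home page and a few posts; any link it returns is one a crawler could follow into a search results URL.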


You don’t need a sitemap. Disable the generator and any plugins you have.

If you really feel you can’t live without a sitemap, you need to set the priority for each page. If you don’t, it may affect your ranking, and not in a good way.

