If you create a page on your website (like /some-page.html), then it is just one page, even if URL parameters are added (like /some-page.html?foo=bar), right? The myth is that search engines see both of those URLs as a single page.
In reality, search engines have to treat URLs with parameters as if they could be completely separate pages. Some sites rely on URL parameters to show different content (eg
Search engine bots need to crawl URLs both with and without URL parameters when they find them. There is no way for them to know whether the content is the same without crawling them to look.
Once a search engine bot crawls a URL, it compares its contents with the contents of other pages. The duplicate content detection algorithms it uses will determine that /some-page.html and /some-page.html?foo=bar have the same content. At that point it will decide which of those two duplicates it wants to index.
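As a rough illustration of the duplicate detection described above, the sketch below groups already-crawled URLs by a hash of their response body. Real search engines use far more sophisticated near-duplicate detection; the exact-match hashing, the group_duplicates name, and the sample pages are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose crawled body is exactly identical.

    `pages` maps URL -> response body (str). This is a simplification:
    real crawlers detect near-duplicates, not just exact matches.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        fingerprint = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[fingerprint].append(url)
    # Keep only fingerprints shared by more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl results: the parameter does not change the content.
crawled = {
    "/some-page.html": "<h1>Some page</h1>",
    "/some-page.html?foo=bar": "<h1>Some page</h1>",
    "/other-page.html": "<h1>Other page</h1>",
}
duplicate_groups = group_duplicates(crawled)
```

Note that both URLs had to be fetched before they could be compared, which is exactly the extra crawling cost the article describes.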
The extra crawling caused by URL parameters can be a huge problem when parameters that don't change the contents of the page are used frequently. Using parameters for tracking where visitors came from or for tracking session IDs causes lots of unneeded crawling. It can hurt SEO if search engines have to crawl so many URLs that they can't crawl everything, or if they can't determine which URL you would prefer they index.
The best way to avoid problems is to not use tracking parameters. If you use URL parameters, it is best to only use ones that change the content of the page.
If you do use tracking parameters, it is usually a good idea to block them in robots.txt. If you want to prevent search engines from crawling all parameters, you can use the rule Disallow: /*? which prevents crawling of any URL with a question mark.
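To check which URLs a rule such as Disallow: /*? would cover, here is a minimal sketch of the wildcard matching that major search engines apply to robots.txt paths, where * matches any run of characters and a trailing $ anchors the end of the URL. The rule_matches function is a hypothetical helper, not a full robots.txt parser.

```python
import re


def rule_matches(rule_path, url_path):
    """Return True if a robots.txt path pattern matches the URL path.

    Supports `*` (any sequence of characters) and a trailing `$`
    (end of URL). Matching is anchored at the start of the path.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with `.*` for `*`.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, url_path) is not None


# The rule from the article blocks the parameterized URL only.
blocked = rule_matches("/*?", "/some-page.html?foo=bar")
allowed = not rule_matches("/*?", "/some-page.html")
```

In an actual robots.txt file, the Disallow line sits under a User-agent line that names the crawlers it applies to.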
Alternatively, you could use canonical tags to specify which URLs you prefer to have indexed (probably the ones without parameters), or use redirects to send visitors away from URLs with parameters.
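One way to work out the preferred URL for a canonical tag or a redirect target is to strip only the tracking parameters and keep the ones that change the content. The sketch below assumes a hand-maintained TRACKING_PARAMS set and uses example.com as a placeholder domain; both are illustrative, as are the function names.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking-only parameters; adjust for your own site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url):
    """Drop tracking parameters, keeping ones that change the content."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for a page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


tag = canonical_link_tag(
    "https://example.com/some-page.html?utm_source=news&page=2"
)
```

The same canonical_url result could serve as the destination of a redirect whenever it differs from the requested URL.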
Google also had a URL parameters tool in Search Console that allowed you to tell Googlebot which parameters are only used for tracking and don't change the page content, but Google retired it in 2022.
This article was written as part of a series about SEO myths.