
Optimal Robots.txt

Posted: Fri Apr 17, 2015 10:57 pm
by sk786
I have put together a comprehensive robots.txt file for OpenCart. What do you guys think about it?

Sitemap: http://www.example.com/sitemap.xml

User-agent: *
Disallow: /*&
Disallow: /*?
Disallow: /*&limit
Disallow: /*?sort
Disallow: /*&sort
Disallow: /*?route
Disallow: /*?page
Disallow: /*&create
Disallow: /*?keyword
Disallow: /*?av
Disallow: /admin/
Disallow: /system/
Disallow: /catalog/
Disallow: /vqmod/

Re: Optimal Robots.txt

Posted: Sat Apr 18, 2015 1:38 am
by IP_CAM
Just be aware: bad bots don't care about robots.txt. On the contrary, it sometimes helps them find subdirectories more easily.
Just to note it.
Ernie
bigmax.ch/shop/

Re: Optimal Robots.txt

Posted: Thu Apr 23, 2015 1:57 am
by Dhaupin
In my opinion you shouldn't block pagination, sorts, limits, or other query strings like that anymore. Instead, teach Google how to handle them under Webmaster Tools > Crawl > URL Parameters. You can also, to a lesser extent, teach Bing under Webmasters > Configure My Site > Ignore URL Parameters.

Also, if you have a huge store and don't cache your sitemap, generating it can create a TON of extra load and possibly cause server faults (500 errors), especially with aggressive crawlers or someone simply hitting a refresh macro. We're on the fence, but we don't list the sitemap in robots.txt anymore for this reason. Bots will try sitemap.xml first anyway, but leaving it out of robots.txt will *attempt* to lessen the exposure.