Post by sk786 » Fri Apr 17, 2015 10:57 pm

I have put together a comprehensive robots.txt file for OpenCart. What do you guys think about it?
Sitemap: http://www.example.com/sitemap.xml

User-agent: *
Disallow: /*&
Disallow: /*?
Disallow: /*&limit
Disallow: /*?sort
Disallow: /*&sort
Disallow: /*?route
Disallow: /*?page
Disallow: /*&create
Disallow: /*?keyword
Disallow: /*?av
Disallow: /admin/
Disallow: /system/
Disallow: /catalog/
Disallow: /vqmod/
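
One way to sanity-check rules like these is to emulate Googlebot-style wildcard matching, where `*` matches any run of characters and patterns are anchored at the start of the path. This is only a quick sketch (the helper names and test URLs are my own, and real crawlers differ in details):

```python
import re

# A subset of the rules from the robots.txt above
rules = ["/*&", "/*?", "/admin/", "/system/", "/catalog/", "/vqmod/"]

def to_regex(rule):
    # Googlebot-style matching: '*' matches any run of characters,
    # a trailing '$' anchors the end; otherwise rules are prefixes.
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.compile("^" + pattern)

def is_blocked(path):
    return any(to_regex(r).match(path) for r in rules)

print(is_blocked("/index.php?route=common/home"))  # blocked: has a query string
print(is_blocked("/admin/"))                       # blocked: prefix rule
print(is_blocked("/nike-shoes"))                   # allowed: clean SEO URL
```

Note the broad `/*?` rule blocks every URL with a query string, so on a stock OpenCart install without SEO URLs enabled it would block the entire catalog, including `index.php?route=common/home`.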


Post by IP_CAM » Sat Apr 18, 2015 1:38 am

Just be aware: bad bots don't care about robots.txt. On the contrary, it sometimes even helps them find your subdirectories more easily...
Just to note it,
Ernie
bigmax.ch/shop/

My Github OC Site: https://github.com/IP-CAM
5'600 + FREE OC Extensions, on the World's largest private Github OC Repository Archive Site.



Post by Dhaupin » Thu Apr 23, 2015 1:57 am

In my opinion you shouldn't block pagination, sorts, limits, or other query strings like that anymore. Instead, teach Google how to handle them under "Webmaster Tools > Crawl > URL Parameters". You can also, to some extent, teach Bing under "Webmasters > Configure My Site > Ignore URL Parameters".

Also, if you have a huge store and don't cache your sitemap, it can create a TON of extra load and possibly cause server faults (500 errors), especially with speeders or someone simply running a refresh macro. We're on the fence, but we don't report the sitemap in robots.txt anymore for this reason. Bots will try sitemap.xml first anyway, but leaving it out of robots.txt will at least *attempt* to lessen the liability.
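
One way to avoid regenerating the sitemap on every hit is to dump OpenCart's Google Sitemap feed to a static file on a schedule and let the web server serve that copy. A rough crontab sketch (the domain and docroot path are examples; adjust for your store):

```shell
# Crontab entry: refresh the static sitemap once a day at 03:00,
# so bots hitting /sitemap.xml never trigger the dynamic feed.
0 3 * * * curl -s "http://www.example.com/index.php?route=feed/google_sitemap" -o /var/www/html/sitemap.xml
```

This trades freshness for stability: the sitemap can be up to a day stale, but a refresh macro against /sitemap.xml just serves a flat file.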

https://creadev.org | support@creadev.org - Opencart Extensions, Integrations, & Development. Made in the USA.

