
Robot txt

Posted: Fri Aug 31, 2018 3:48 am
by SherryM
Hi
I have OpenCart 3.0.2.0 and I'm having a very hard time working out what a proper robots.txt should look like. I'm so scared I will allow something that is personal, or customers' personal information. Can someone show me what I should allow and what I should not allow? Thanks

Re: Robot txt

Posted: Fri Aug 31, 2018 5:06 am
by straightlight
I'm so scared I will allow something that is personal or customers personal information.
If that is the case, make sure your store uses SSL.

Re: Robot txt

Posted: Fri Aug 31, 2018 8:59 am
by SherryM
Hi
Thank you, I do have SSL. But what does a normal robots.txt file look like? I'm so lost on this. I want to make sure I allow what I should for Google and for mobile-friendly pages.

Re: Robot txt

Posted: Fri Aug 31, 2018 10:12 am
by straightlight
Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website. ... In practice, robots.txt files indicate whether certain user agents (web-crawling software) can or cannot crawl parts of a website.
Source: https://moz.com/learn/seo/robotstxt .

Re: Robot txt

Posted: Fri Aug 31, 2018 4:59 pm
by paulfeakins
SherryM wrote:
Fri Aug 31, 2018 3:48 am
I'm so scared I will allow something that is personal or customers personal information.
Hahaha don't worry, even the worst robots.txt cannot do this.

Re: Robot txt

Posted: Sat Sep 01, 2018 1:29 am
by SherryM
Hi
Thank you. This is what I have for my robots.txt file, but I'm not sure if it is right or needs fixing. Yes, I have been reading the links and what Google says, but I get so confused over it:
User-agent: Googlebot
Allow:

User-agent: AdsBot-Google
Disallow:

User-agent: Googlebot-Image
Allow:

User-agent: *
Disallow: /*&limit
Disallow: /*?limit
Disallow: /*?sort
Disallow: /*&sort
Disallow: /*?order
Disallow: /*&order
Allow: /*?price
Allow: /*&price
Disallow: /*?brand_tabletpc
Disallow: /*&brand_tabletpc
Disallow: /*?color_default
Disallow: /*&color_default
Disallow: /*?filter_tag
Disallow: /*&filter_tag
Disallow: /*?mode
Disallow: /*&mode
Disallow: /*?cat
Disallow: /*&cat
Disallow: /*?dir
Disallow: /*&dir
Disallow: /*?color
Disallow: /*&color
Allow: /*?product_id
Allow: /*&product_id
Disallow: /*?minprice
Disallow: /*&minprice
Disallow: /*?maxprice
Disallow: /*&maxprice
Disallow: /*?route=checkout/
Disallow: /*?route=account/
Disallow: /*?route=product/search
Disallow: /*?page=1
Disallow: /*&create=1
Disallow: /?route=information/contact
Disallow: /*?route=affiliate/
Disallow: /*?keyword
Disallow: /*?av
Disallow: /admin/
Disallow: /system/
#Disallow: /catalog/

Sitemap: https://www.iowagoatmilksoap.com/index. ... le_sitemap
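The rules in a file like this lean on the `*` wildcard, which Googlebot and other major crawlers support. As a rough illustration of how such a crawler might evaluate them, here is a minimal Python sketch. It assumes the commonly documented precedence (the longest matching pattern wins, and a tie goes to Allow). The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and only a handful of the rules above are included:

```python
import re

def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex: '*' matches any run
    of characters, and a trailing '$' anchors the end of the URL."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules):
    """rules is a list of (directive, pattern) pairs from one group.
    Assumption: longest matching pattern wins, ties go to 'allow'."""
    best_directive, best_pattern = "allow", ""
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Allow:/Disallow: value matches nothing
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > len(best_pattern)
            tie = len(pattern) == len(best_pattern) and directive == "allow"
            if longer or tie:
                best_directive, best_pattern = directive, pattern
    return best_directive == "allow"

# A small subset of the rules from the "User-agent: *" group above.
RULES = [
    ("disallow", "/*?sort"),
    ("disallow", "/*&sort"),
    ("allow", "/*?product_id"),
    ("allow", "/*&product_id"),
    ("disallow", "/*?route=checkout/"),
    ("disallow", "/admin/"),
]

for path in [
    "/index.php?route=product/category&path=20&sort=p.price",
    "/index.php?route=product/product&product_id=42",
    "/index.php?route=checkout/cart",
    "/index.php?product_id=42&sort=name",
]:
    print(path, "->", "allowed" if is_allowed(path, RULES) else "blocked")
```

Under these assumptions, a URL that carries both `product_id` and `sort` stays crawlable, because the longer `Allow: /*?product_id` pattern outranks the shorter `Disallow: /*&sort`. That may or may not be what was intended, so it is worth checking each pair of overlapping rules this way.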

Re: Robots txt

Posted: Sat Sep 01, 2018 3:25 am
by MrPhil
Keep in mind that robots.txt is just a suggestion to search engines that they stay out of certain areas. There's nothing that can enforce this. A robots.txt file is simply a way to shape how search engine bots see and catalog the things that you most want cataloged, and hopefully to avoid cataloging things that you don't want visitors snooping around in. All bots are free to look anywhere they want (unless permissions or password protection stops them) and index anything they want, regardless of what robots.txt says. In fact, some malicious bots may read robots.txt specifically to find entries "prohibiting" access to sensitive files that they would otherwise never see (e.g., files that are not linked to from anywhere).
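Seen from the crawler's side, the advisory nature is easy to demonstrate: the check is something a well-behaved bot runs on itself before requesting a page. Below is a minimal sketch using Python's standard `urllib.robotparser`. The `example.com` URLs, the `ExampleBot` name, and the `may_fetch` helper are made up for illustration, and the rules are a simplified stand-in for the file posted earlier in this thread. Plain paths are used because, as far as I know, the standard-library parser does simple prefix matching rather than `*` wildcards.

```python
from urllib.robotparser import RobotFileParser

# Simplified stand-in for a store's robots.txt (an assumption for
# illustration, not a recommended file).
ROBOTS_TXT = """\
User-agent: Googlebot
Allow:

User-agent: *
Disallow: /admin/
Disallow: /system/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_fetch(agent, url):
    """What a cooperating crawler would decide. Nothing on the server
    enforces this answer; a rude bot simply never asks."""
    return parser.can_fetch(agent, url)

# A bot without its own group falls back to the "*" rules.
print(may_fetch("ExampleBot", "https://example.com/admin/"))
print(may_fetch("ExampleBot", "https://example.com/"))

# A bot with its own group follows only that group, so the "*" rules
# do not apply to it.
print(may_fetch("Googlebot", "https://example.com/admin/"))
```

The last call is the one to watch. A crawler that finds a group naming it follows only that group, so a `User-agent: Googlebot` section with an empty `Allow:` leaves Googlebot unrestricted, and the long `Disallow` list under `User-agent: *` never applies to it. Anything that must stay private needs a login or server-side permissions, not a robots.txt entry.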

Re: Robot txt

Posted: Tue Sep 18, 2018 3:12 pm
by sirenaplus
User-agent: *
Sitemap: https://yoursite/sitemap.xml
This is a sample robots.txt, and Googlebot reads it fine.

Re: Robot txt

Posted: Tue Sep 18, 2018 8:14 pm
by khnaz35
SherryM wrote:
Sat Sep 01, 2018 1:29 am
Sitemap: https://www.iowagoatmilksoap.com/index. ... le_sitemap
The link you mention is for the sitemap, not for robots.txt.