Post by larproman » Fri Mar 01, 2019 5:30 am

Version 3.0.2.0
I have had to delete and re-setup my website several times, along with the products and categories. All the links now work and my site works OK. But Google and every other search engine return very old or nonexistent URLs. I have spent months on the Google Console deleting old URLs, but they keep reappearing. Currently Google Console tells me I have over 8,000 pages it cannot find.
Many categories are mismatched, so searches return URLs that look inside my Cable Tie section for switches.
Very frustrating and disappointing.
Thanks in advance.
Graeme

New member

Posts

Joined
Sun Apr 29, 2018 8:37 am

Post by IP_CAM » Fri Mar 01, 2019 12:21 pm

larproman wrote: "Very frustrating and disappointing."
I guess so, but I still can't work out why you came here in the first
place. Complaining about search engine behaviour around here
does not change anything; it's their decision what to index and
how. And it depends on the information you offer them. So
better make sure your shop is set up and SEO-linked accordingly,
and after a few months you won't be linked wrongly anymore.

And if you don't know how, just get one of the pros to do it for you.
If you have 8,000 wrongly linked pages, you must be running a wealthy
place that creates enough funds to afford it, I assume ... :D
Good Luck!
Ernie

My Github OC Site: https://github.com/IP-CAM
5'200 + FREE OC Extensions, on the World's largest private Github OC Repository Archive Site.


Legendary Member

Posts

Joined
Tue Mar 04, 2014 1:37 am
Location - Switzerland

Post by larproman » Fri Mar 01, 2019 1:08 pm

Hi,
That's my point. Search engines don't make up URLs; they only return them in answer to what somebody asks.
My site is: https://autoandmarine.com.au and I sell auto electrical parts. Google is returning URLs that point to pages that don't exist, so I get constant 404s.
The OC online report shows a Google search with this: "Black-UV/Electric-Terminals/Battery-Terminal" (looking inside the cable tie category for terminals)
And "Auto-Switch/Jcase" (looking inside the switch category for fusible links)
Also, I don't have 8,000 pages on my site, but something is telling the search engines that I do. There must be a bad file or an error in a database, or something...
Graeme

Post by IP_CAM » Fri Mar 01, 2019 2:02 pm

Black-UV/Electric-Terminals/Battery-Terminal
Well, Google and I have been friends for as long as it has existed, but I was
never aware that something like the above could be used as searchable content.

But, as I wrote above, you'll need an SEO professional for this; there
is no way around it if you can't do it yourself. Successful SEO is
and means business, and only fools would spread their hard-earned
wisdom for free. But you know how it works, after coming here for a
good while already, I guess ... :D

Good Luck
Ernie

Post by larproman » Sat Mar 02, 2019 5:57 am

"searchable Content."
You are correct. But how did OC generate "Black-UV/Electric-Terminals/Battery-Terminal" when I have never entered it? OC has mismatched my categories, but nobody seems to know how it could have happened or where this data could be stored.
I should explain the 8,000 pages that Google can't find. In Google Console I have 3,787 pages submitted in my sitemap. When I check errors under "Smartphone" I have 8,628 pages with either 500 or 404 errors. When I check errors under "Desktop" I have 155 errors, all 404s. All of these URLs have been deleted but keep showing up again. They are ALL old deleted addresses or mismatched categories / items.
How did OC mismatch my categories and products? Are there double entries in my database?
Any help would be appreciated.
Graeme

Post by letxobnav » Sat Mar 02, 2019 10:15 am

Bots only see what you send them in response to their requests; they do not access your database.

So I guess that at some time in the past you had issues with your SEO URL settings and sent out wrong URLs in your responses to Google, either via pages or via submitted XML sitemaps. Google indexed those and is now coming back for them.

404s are fine as such; if you are no longer providing those wrong URLs, Google will eventually stop requesting them.
500 errors are a different matter. Those are server errors and could be caused by anything from .htaccess to simple code errors (even a missing closing quote in PHP can cause a 500 error), and they are very difficult to pin down.
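
One way to start separating the harmless 404s from the 500s is to tally what the server actually answered to the crawler. The following is only a sketch in Python, not anything from OpenCart: it assumes the server writes the Apache "combined" log format, and the `triage` name and the "Googlebot" user-agent hint are choices made for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def triage(lines, agent_hint="Googlebot"):
    """Count 404 and 5xx responses per path for requests whose
    user agent contains agent_hint (hypothetical helper)."""
    not_found, server_errors = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or agent_hint not in match.group("agent"):
            continue  # unparseable line or a different client
        status = int(match.group("status"))
        if status == 404:
            not_found[match.group("path")] += 1
        elif 500 <= status <= 599:
            server_errors[match.group("path")] += 1
    return not_found, server_errors


# Usage (the log path is an assumption about your server):
# with open("/var/log/apache2/access.log") as fh:
#     not_found, server_errors = triage(fh)
#     print(server_errors.most_common(20))
```

The paths that show up under `server_errors` are the ones worth investigating first; the `not_found` list is mostly a record of what Google still remembers.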

Crystal Light Centrum Taiwan
Extensions: MailQueue | SUKHR | VBoces

“Data security is paramount at [...], and we are committed to protecting the privacy of anyone who is associated with our [...]. We’ve made a lot of improvements and will continue to make them.”
When you know your life savings are gone.


Expert Member

Posts

Joined
Fri Aug 18, 2017 4:35 pm
Location - Taiwan

Post by larproman » Mon Mar 04, 2019 6:22 am

Hi,
Yes, you are correct. I did a re-install last year and also changed several URLs. And even though I continually request Google to remove the links, they keep reappearing.
I guess I will have to wait for Google to catch up.
Thank you for your response.
Graeme

Post by victorj » Mon Mar 04, 2019 6:35 am

I use a 301 redirect manager.
All the old URLs I have (from deleting or editing a product) are redirected to an alternative URL on my site.

There are quite a few redirect extensions.
Do not wait until Google deletes those entries; use them to point to a new product and turn them to your advantage.

Koeltechnische deurrubbers eenvoudig online op maat bestellen.
Alle niet stekplichtige onderdelen zoals scharnieren, sloten, randverwarming en verlichting voor alle typen koelingen en vriezers.
https://koelcel-onderdelen.com


Expert Member

Posts

Joined
Sat Jun 25, 2011 4:09 am
Location - Alkmaar Holland

Post by letxobnav » Mon Mar 04, 2019 11:26 am

I do the same. I changed the admin side so that when I change and save an SEO URL keyword of a product, it saves the old one (up to 2 versions) first.
When a bot then requests a URL with a keyword that does not exist, I check whether it is one of the old ones and redirect to the current one.
Mass redirects you should of course handle via server conf or .htaccess.
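
For the mass-redirect case, the .htaccess lines can be generated from a list of old and new paths instead of being typed by hand. This is a sketch only: `redirect_lines` is a made-up helper, `https://example.com` is a placeholder domain, and the target path in the usage note is invented for illustration. It emits mod_alias `Redirect 301` directives.

```python
def redirect_lines(mapping, base="https://example.com"):
    """Build mod_alias 'Redirect 301' lines from {old_path: new_path}.

    Hypothetical helper. Paths must start with '/' and contain no
    whitespace, since mod_alias splits its arguments on spaces.
    """
    lines = []
    for old, new in sorted(mapping.items()):
        for path in (old, new):
            if not path.startswith("/") or any(ch.isspace() for ch in path):
                raise ValueError(f"unsupported path: {path!r}")
        if old == new:
            continue  # nothing to redirect
        lines.append(f"Redirect 301 {old} {base}{new}")
    return lines
```

For example, `redirect_lines({"/Auto-Switch/Jcase": "/Fusible-Links/Jcase"})` gives the single line `Redirect 301 /Auto-Switch/Jcase https://example.com/Fusible-Links/Jcase`. Keep in mind that `Redirect` matches by path prefix, so anything below the old path is redirected to the same place under the new one.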

Post by larproman » Thu Mar 07, 2019 2:20 pm

It makes sense to redirect. I thought telling Google to remove them would, well, remove them. But obviously not...
I like yours that saves the old URL.
Any suggestions for a program to do this? Free ones are helpful, as it looks like there is a new version of OC out.

Post by letxobnav » Sat Mar 09, 2019 8:38 am

Well, I do not know of an extension for that; all I can tell you is how I do it myself.

1) I added 2 columns to the seo_url table, keyword_old1 and keyword_old2.

2) I changed the seo_url part of the product saving function in admin so that it checks whether the new keyword is different from the existing one. If it is, I set keyword_old2 = keyword_old1, keyword_old1 = keyword and keyword = the new keyword.

3) Then in the seo_url class where the query is fetched based on the keyword, if it is not found I do another query to check whether keyword_old1 or keyword_old2 equals the sought keyword. If one of them does, I redirect with the obsolete keyword replaced by the current keyword; if neither does, I give a 404 as usual.

4) Normally you would only need to redirect if the URL is requested by a bot, so for customers you could simply return the query if any of the 3 keywords matches (customers do not care about a correct URL and this saves you a redirect), but this depends on how confident you are in identifying bots. There are legitimate hidden bots out there (especially Chinese ones), so I redirect.

Post by larproman » Tue Mar 19, 2019 8:32 am

Hi all,
Further to my mismatched URLs and why Google can't find over 8,000 pages, etc.
All my page URLs are https://
I checked my sitemap and it lists ALL <loc> as http: (not https:)
My sitemap lists SOME <image:loc> as https: and SOME <image:loc> as http:
ALL the <image:loc> in the bottom half of my sitemap are http:
Google says this is a mismatch. Could this explain why I have so many errors?
I was using http: for a month before I went to https:
I am looking to confirm the problem, find out why this could have happened (me) and how to fix it.
Thanks in advance,
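
A check like the one described above can be scripted rather than done by eye. The sketch below is Python using only the standard library; it assumes the sitemap uses the standard sitemaps.org namespace and Google's image extension namespace, and `scheme_counts` is a name made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

# Namespaces of the sitemap protocol and of Google's image extension.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def scheme_counts(sitemap_xml):
    """Count URL schemes (http / https) used in <loc> and <image:loc>."""
    root = ET.fromstring(sitemap_xml)
    counts = {"loc": Counter(), "image:loc": Counter()}
    for el in root.iterfind(".//sm:url/sm:loc", NS):
        counts["loc"][urlparse((el.text or "").strip()).scheme] += 1
    for el in root.iterfind(".//image:image/image:loc", NS):
        counts["image:loc"][urlparse((el.text or "").strip()).scheme] += 1
    return counts


# Usage (the file name is an assumption):
# with open("sitemap.xml", "rb") as fh:
#     print(scheme_counts(fh.read()))
```

If every page is served over https, any `http` count in the result marks entries the sitemap generator is still writing with the old scheme.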

Post by letxobnav » Wed Mar 20, 2019 8:27 am

In config.php, have you set both HTTP_SERVER and HTTPS_SERVER to https?

// HTTP
define('HTTP_SERVER', 'https://domain');

// HTTPS
define('HTTPS_SERVER', 'https://domain');

Post by larproman » Wed Mar 20, 2019 1:48 pm

Hi,
I changed as you suggested, OK so far.
Thank you for your advice.
Do I have to change my .htaccess file?

Post by letxobnav » Wed Mar 20, 2019 3:09 pm

On most old-school sites (the early days of SSL), http and https links were kept separate: http for links that did not require encryption (less processing) and https for those that did, like checkout pages where you enter user IDs, passwords and such.

That can, however, cause problems if you use a combination of absolute (http://domain/.....) links and relative (products/....) links.
Besides, Google wants all links in https anyway, except perhaps for images and such.
So the config.php change should put all the links in the HTML you send out on https.

The only thing I have in .htaccess for this is :

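# Only act on requests that arrive over plain http
RewriteCond %{HTTPS} !=on
# Skip static assets; match the extension at the end of the path, any case
RewriteCond %{REQUEST_URI} !\.(ico|mp3|mpeg|cur|webp|svg|ttf|eot|woff|woff2|gif|jpe?g|png|js)$ [NC]
RewriteRule ^(.*)$ https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]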

This basically forces https, except for images and fonts, when a request comes in over http, but I am not even sure I need it.
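
To see in advance which requests such a rule would redirect, its decision can be mirrored in a few lines of Python. This is an illustration only, not part of the .htaccess: it assumes the extension list is meant to match at the end of the path regardless of case, and `https_target` is a hypothetical name.

```python
import re

# Assumption: extensions are matched at the end of the path, case-insensitively.
EXEMPT = re.compile(
    r"\.(ico|mp3|mpeg|cur|webp|svg|ttf|eot|woff|woff2|gif|jpe?g|png|js)$",
    re.IGNORECASE,
)


def https_target(scheme, host, path):
    """Return the https URL a request would be redirected to,
    or None when the request is left alone (hypothetical helper)."""
    if scheme == "https" or EXEMPT.search(path):
        return None  # already secure, or an exempt static asset
    return f"https://{host}{path}"
```

Running a handful of real paths from the shop through `https_target` shows quickly whether a page URL is accidentally treated as a static asset.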
