Post by newbgweb@abv.bg » Wed Jun 24, 2020 12:35 pm

Hello,
I have a very strange problem on my site. I use the Complete SEO Package extension, which lets me see 404 pages, and for about a month I have been seeing strange 404 URLs. I am on the Journal 3 theme, OpenCart version 3.0.3.2. I contacted the Complete SEO Package developer, who told me to contact Journal theme support, but Journal support told me to post here in the forum. I hope someone can help me with this. See screenshots.
Regards!

Attachments

sni13.PNG (89.12 KiB)
sni12.PNG (85.67 KiB)
sni11.PNG (66.48 KiB)




Post by letxobnav » Wed Jun 24, 2020 2:11 pm

That is normal, get used to it.

Crystal Light Centrum Taiwan
Extensions: MailQueue | SUKHR | VBoces

“Data security is paramount at [...], and we are committed to protecting the privacy of anyone who is associated with our [...]. We’ve made a lot of improvements and will continue to make them.”
When you know your life savings are gone.


User avatar
Expert Member

Posts

Joined
Fri Aug 18, 2017 4:35 pm
Location - Taiwan

Post by Cue4cheap » Wed Jun 24, 2020 11:12 pm

Script kiddies and the like: people probing your website to see if they can get something from it.
It keeps you on your toes to make sure the extensions etc. you use are reputable, since OpenCart is pretty good at plugging holes, and most vulnerabilities are introduced by the admin doing something careless, or by extensions.
Mike

cue4cheap not cheap quality



Post by newbgweb@abv.bg » Thu Jun 25, 2020 12:30 pm

Thanks for the reply. What can I do to protect my store? Regards!




Post by JNeuhoff » Thu Jun 25, 2020 7:22 pm

newbgweb@abv.bg wrote:
Thu Jun 25, 2020 12:30 pm
Thanks for the reply. What can I do to protect my store? Regards!
The page below has a useful list of bad bots and crawlers which you can block via your '.htaccess' file:

http://tab-studio.com/en/blocking-robots-on-your-page/

Also, you may want to block, in your '.htaccess', some of the IP addresses causing your bad web traffic, as in the example below:

Code: Select all

# reject known hackers and spammers
Order Deny,Allow
Deny from 37.187.90.226
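
(Side note: Order/Deny are Apache 2.2 mod_access_compat directives; on Apache 2.4 the equivalent uses Require from mod_authz_core. A sketch with the same example IP as above:)

Code: Select all

# reject known hackers and spammers (Apache 2.4 syntax)
<RequireAll>
    Require all granted
    Require not ip 37.187.90.226
</RequireAll>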

Export/Import Tool * SpamBot Buster * Unused Images Manager * Instant Option Price Calculator * Number Option * Google Tag Manager * Survey Plus * OpenTwig




Post by letxobnav » Thu Jun 25, 2020 8:55 pm

I can assure you, the truly bad bots do not identify themselves via their user-agent; only the amateurs do that.
Most of those scanners will try to find clues as to what you are running, and they will receive 99% 404s, because you do not have what they seek.
Putting them in .htaccess will give them 403s instead, which makes no difference, other than that .htaccess files are parsed on every request, all the way up the directory tree, without caching.
In short, a waste of time and resources.
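
(If that per-request .htaccess parsing bothers you, one option, assuming you control the main server config, is to move your rules into the vhost and switch overrides off; a sketch, with /var/www/html standing in for your docroot:)

Code: Select all

# in the vhost/server config, not in .htaccess
<Directory "/var/www/html">
    AllowOverride None   # Apache stops looking for .htaccess files on every request
</Directory>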

If you want to put a little security in your .htaccess, use this:

Code: Select all

# BLOCK UNNECESSARY REQUEST METHODS
RewriteCond %{REQUEST_METHOD} ^(CONNECT|DEBUG|DELETE|MOVE|PUT|TRACE|TRACK) [NC]
RewriteRule .* - [F,L]

ServerSignature Off
RewriteCond %{QUERY_STRING} (javascript:).*(\;) [NC,OR]
RewriteCond %{QUERY_STRING} (<|%3C).*script.*(>|%3E) [NC,OR]
RewriteCond %{QUERY_STRING} (\;|\'|\"|%22).*(union|UNION|select|SELECT|insert|INSERT|drop|DROP|update|UPDATE|md5|MD5|benchmark) [NC,OR]
RewriteCond %{QUERY_STRING} (base64_encode|localhost|mosconfig) [NC,OR]
RewriteCond %{QUERY_STRING} (boot\.ini|echo.*kae|etc/passwd|passwd|eval|\$_POST) [NC,OR]
RewriteCond %{QUERY_STRING} (GLOBALS|REQUEST)(=|\[|%) [NC]
RewriteRule ^(.*)$ blacklist.php [L]
and put a blacklist.php in your root; in there you can log the blocked request:

Code: Select all

<?php
// log the blocked client and request, then answer with a plain 404
error_log('BLACKLISTING: ' . $_SERVER['REMOTE_ADDR'] . ' ' . $_SERVER['REQUEST_URI']);
header($_SERVER['SERVER_PROTOCOL'] . " 404 Not Found", true);
exit();
or put this function in startup.php

Code: Select all

function valid_request () {
	$invalid		= "(\(\))"; // let's not look for quotes, [good] bots use them constantly; look for () instead, since technically parentheses aren't valid
	$period 		= "(\\002e|%2e|%252e|%c0%2e|\.)";
	$slash 			= "(\\2215|%2f|%252f|%5c|%255c|%c0%2f|%c0%af|\/|\\\)"; // http://security.stackexchange.com/questions/48879/why-does-directory-traversal-attack-c0af-work
	$routes 		= "(etc|dev|irj)" . $slash . "(passwds?|group|null|portal)|allow_url_include|auto_prepend_file|route_*=http";
	$filetypes 		= $period . "+(sql|db|sqlite|log|ini|cgi|bak|rc|apk|pkg|deb|rpm|exe|msi|bak|old|cache|lock|autoload|gitignore|ht(access|passwds?)|cpanel_config|history|zip|bz2|tar|(t)?gz)";
	$cgis 			= "cgi(-|_){0,1}(bin(-sdb)?|mod|sys)?";
	$phps 			= "(changelog|version|license|command|xmlrpc|admin-ajax|wsdl|tmp|shell|stats|echo|(my)?sql|sample|modx|load-config|cron|wp-(up|tmp|sitemaps|sitemap(s)?|signup|settings|" . $period . "?config(uration|-sample|bak)?))" . $period . "php";
	$doors 			= "(" . $cgis . $slash . "(common" . $period . "(cgi|php))|manager" . $slash . "html|stssys" . $period . "htm|((mysql|phpmy|db|my)admin|pma|sqlitemanager|sqlite|websql)" . $slash . "|(jmx|web)-console|bitrix|invoker|muieblackcat|w00tw00t|websql|xampp|cfide|wordpress|wp-admin|hnap1|tmunblock|soapcaller|zabbix|elfinder)";
	$sqls 			= "((un)?hex\(|name_const\(|char\(|a=0)";
	$nulls 			= "(%00|%2500)";
	$truth 			= "(.{1,4})=\1"; // catch OR always-true (1=1) clauses via SQL injection - not used at the moment, it's too broad and may capture search=chowder (ch=ch) for example
	$regex 			= "/$invalid|$period{1,2}$slash|$routes|$filetypes|$phps|$doors|$sqls|$nulls/i";
	$regex_ignore 	= "/captcha=0/i";
	$results 		= '';
	$matches 		= array();
	$str 			= $_SERVER['SERVER_NAME'] . $_SERVER['REQUEST_URI'];
	$has_agent 		= isset($_SERVER['HTTP_USER_AGENT']);
	$user_agent 	= ($has_agent ? $_SERVER['HTTP_USER_AGENT'] : 'no user agent');

	if (preg_match_all($regex, preg_replace($regex_ignore, '', $str), $matches) || !$has_agent) {
		if ($matches[0]) {
			$matches[0] = array_unique($matches[0]);
			foreach ($matches[0] as $match) {
				$results .= $match . ' ';
			}
		}
		if (!$has_agent) $results .= ' | No User Agent';
		return $results;
	}
	return true;
}
and after index() add:

Code: Select all

/* VALIDATE URL */
$valid_request = $this->valid_request();
if ($valid_request !== true) {
	error_log('Invalid Request from IP: '.$_SERVER['REMOTE_ADDR']);
	error_log('Reason: '.$valid_request);
	error_log('URI: '.$_SERVER['REQUEST_URI']);
	if (isset($_SERVER['HTTP_USER_AGENT'])) {
		error_log('USER-AGENT: '.$_SERVER['HTTP_USER_AGENT']);
	} else {
		error_log('USER-AGENT: None');
	}
	header($_SERVER['SERVER_PROTOCOL'] . " 404 Not Found", true);
	exit;
}


Post by newbgweb@abv.bg » Tue Jun 30, 2020 9:48 pm

Thank you letxobnav !!
That solved the problem!




Post by letxobnav » Wed Jul 01, 2020 9:17 pm

Well, these security measures (filtering out malicious requests) do not prevent simple probing requests by humans or bots.
Anyone can request any resource from your site, whether it exists or not.

They do have an impact on your site if you use SEO URLs, though.

In the original .htaccess code to enable SEO URLs you will find:

Code: Select all

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !.*\.(ico|gif|jpg|jpeg|png|js|css)
RewriteRule ^([^?]*) index.php?_route_=$1 [L,QSA]
That piece of code translates into:
if the requested resource does not exist as a file or directory, and the URI does not contain .ico or .gif etc.,
then pass the request as the _route_ parameter in the query string to the PHP script index.php and let OC handle it;
a non-existent resource that does have one of those extensions just gets a normal webserver 404 Not Found.

This is done for SEO URLs, as those do not exist in your filesystem; such URLs become a parameter and are handled by the SEO URL class.
That class performs the translation from SEO URL keywords to the ultimate GET variables your script will work with, which involves one or more DB queries against the seo_url table.
If a proper query cannot be found, OC will generate a 404 Not Found page and return that.
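
(To make that cost concrete, here is a rough sketch of what the per-keyword lookup amounts to; DB_PREFIX and the seo_url table exist in a default OpenCart 3 install, but treat this as an illustration, not the actual class code:)

Code: Select all

<?php
// Sketch only: every path segment of an unknown URL triggers a query like this
foreach (explode('/', trim($request_path, '/')) as $part) {
    $result = $db->query(
        "SELECT * FROM " . DB_PREFIX . "seo_url" .
        " WHERE keyword = '" . $db->escape($part) . "'"
    );
    // if no segment matches any keyword, OC ends up building the 404 page
}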

The problem with this is that any request for any resource you do not have is treated this way.
So if a bot requests:
/your_database.sql
/data.bak
/config_backup.php
/mysql.tar
etc.

All of these are probably resources not present on your site, yet OC will go through all of that effort, DB queries and 404 page generation, because it believes they might be SEO URLs which need translation.
Useless effort, which can easily be prevented by also excluding the extensions probes usually request and you do not have:

Code: Select all

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !.*\.(env|php|xml|ashx|cfg|dat|ico|cur|txt|mp3|webp|svg|ttf|eot|woff|woff2|gif|jpg|jpeg|png|js|css|pdf|zip|tar|sql|gz|exe|rar|arj|cab|iso|rpm|tbz|tgz|old|bak|backup|dump|db|7z|asp|aspx|exp|html|htm)$ [NC]
RewriteRule ^([^?]*) index.php?_route_=$1 [L,QSA]
which means that requests for resources with those extensions, which you do not have, are no longer passed to PHP and are simply rejected by your webserver with a 404 Not Found. Those requests will still come and fill your server logs, but they will have virtually no impact any longer.
That impact can be substantial as I sometimes get probes for resources I do not have at 100 requests per second.

Note: just make sure you do not have .htm or .html in your SEO URLs, or remove those two from that list.

I go as far as doing this:

Code: Select all

	RewriteCond %{REQUEST_URI} !.*\..*$
	RewriteCond %{REQUEST_FILENAME} !-f
	RewriteCond %{REQUEST_FILENAME} !-d
	RewriteRule ^([^?]*) index.php [L,QSA]

Meaning: any request for a resource which does not exist and does not contain a dot is passed to PHP; all others are rejected by the webserver (404-ed), i.e. any request for a resource with any extension I do not have.
Of course, that means I have to make sure not to put any dots in my SEO URLs.
