Post by Dhaupin » Sat Sep 13, 2014 8:27 am

OK, this is how to start catching bad bots on your site(s), then reporting them, banning them, or whatever else you want to do with them. The idea is simple: bots read code from top to bottom, so the sooner they run into an email, login, register, or comment link, the better for trapping.

Skill Level: Advanced (Since setting up wrong could trap good creatures)


The theory: robots.txt, 403 Forbidden, and noindex/nofollow tell crawlers to stay out of a page. Good crawlers obey; bad bots ignore these warnings and dive in. Let's use that to our advantage. First we need a honeypot or tarpit, then some tasty links.


Step 1) Go to projecthoneypot.org and make an account. This is the most extensive yet safe blacklist. Sign up for a honeypot and/or the optional MX forwarder. These are direct ins to snag your bots and strengthen their blacklist. Get the PHP honeypot running somewhere on your domain(s) and take note of its URL.


Step 2) Go to stopforumspam.com and make an account. This is the most comprehensive spammer blacklist. If you like, you can install this basic registration post protection we made that checks against SFS. It's a blueprint, so bend it as you need (it'll always be free): http://forum.opencart.com/viewtopic.php ... 3&p=509025 Most likely bots will never even get to it :)
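
If you want to see roughly what an SFS check boils down to before grabbing the blueprint, here is a minimal sketch. It assumes the public JSON endpoint at api.stopforumspam.org and its "appears" / "confidence" fields; the function names and the 50% threshold are made up for illustration and are not taken from the linked extension.

```php
<?php
// Sketch only: function names and the threshold are illustrative assumptions.

// Build the lookup URL for one IP (assumes the public SFS JSON endpoint).
function sfsBuildUrl($ip) {
    return 'https://api.stopforumspam.org/api?json&ip=' . urlencode($ip);
}

// Decide from a raw JSON response whether the IP should be treated as a spammer.
function sfsIsListed($json, $minConfidence = 50.0) {
    $data = json_decode($json, true);
    if (!is_array($data) || empty($data['success']) || !isset($data['ip'])) {
        return false; // fail open so a broken lookup never blocks real customers
    }
    $entry = $data['ip'];
    if (empty($entry['appears'])) {
        return false; // IP is not in the SFS database
    }
    $confidence = isset($entry['confidence']) ? (float) $entry['confidence'] : 0.0;
    return $confidence >= $minConfidence;
}
```

To use it, fetch the response yourself, e.g. $json = @file_get_contents(sfsBuildUrl($ip)); and only call sfsIsListed($json) when the fetch did not return false. Failing open is a design choice: a dead API should cost you a few spam signups, not real customers.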


Step 3) Let's make a tarpit too, which is like a never-ending black sinkhole of junk data, links, emails, and fake forms/comments/registers. Once bots are stuck, they harvest to their heart's desire until they get bored or their botmaster stops them. Since a tarpit page generates in under a second and doesn't use the database, it can save immense load. Here is a basic tarpit blueprint. Edit it to work with your OC using the settings inside. If you see errors, let's hash them out to make it stable. Set your log location, then get a Wordnik API key from their site if you need random words:

Code:

<?php
// Dribbler Tarbaby Community v.0.0.9 - Copyright 2013-2014 under GNU/GPL
// Original script by Mike (zaphod@spambotsecurity.com)
// http://www.stopforumspam.com/forum/viewtopic.php?pid=41173
//
// Contributors:
//	John Darkhorse
//	Derek Haupin (dhaupin@gmail.com)
//
// @@ WARNING @@
// The tarbaby will make and hold an entry process for the entire render time. This may cause server faults if you get many bots or set your delay too high. If you are getting faults, reduce the tar generate time by changing $enableTar_delay. You can also turn off Tar completely by entering false for $enableTar. Once you catch up with bans, or enable logging->fail2ban jail, these faults will be reduced.
//
// @@ TIPS @@
// File can be named anything, so long as it ends in .php - if you DONT use wordpress, try wp-admin.php wp-login.php or x.php. If you DO use wordpress, try node.php or admin.php. Place this on your website, and make sure to add it to your robots.txt so good crawlers don't get caught in it. Place hidden links to this file from your frontside pages. "Good" users won't see the hidden links, but bots will. Adding the "nofollow noindex" attribute to the link would also warn "good" crawlers to stay away.
//
// Example Link: <a href="URL_to_filename.php" rel="nofollow noindex" style="display:none;">Login Register Comment</a>
//
// @@ SETUP @@
// Set configs below to turn on or off functions or customize your settings. If you require random words, please sign up on wordnik.com for an API key, then enter it for $enableWords_key. If you are behind a reverse proxy, or want to attempt to detect proxies, turn on enableProxyBust.

// turn on or off the anti-human lockout overlay - note: bots that run JS will see the overlay but still harvest code to submit
  $enableNohumans = true;
  $enableNohumans_msg = '403 Forbidden: You should not be on this page, please leave immediately. Thank you!';

// turn on or off the meta noindex or nofollow in html head of page
  $enableNoindex = true;

// turn on or off the meta wordpress or drupal identifiers in html head of page
  $enableWordpress = true;
  $enableDrupal = true;

// turn on or off the no-caching header to prevent cache
  $enableNocache = true;

// turn on or off the 403 forbidden header to drive away more bots
  $enable403 = false;
  
// turn on or off fake comment or login forms
  $enableForm_comment = true;
  $enableForm_login = true;

// turn on or off timed corrupt email links | limit = number of emails shown | length = number of rand chars in emails | delay = time in seconds between generating an email | call custom tar output with tarBaby(limit, length, delay) as in tarBaby(10, 3, 1) would pull 10 emails with 3 extra random letters, with 1 second delay between email gen.
  $enableTar = true;
  $enableTar_limit = '20';
  $enableTar_length = '8';
  $enableTar_delay = '0';

// turn on or off the unique urls as querystrings | separator = what to put between urls | length = number of rand chars in urls
  $enableQuery = true;
  $enableQuery_separator = '/';
  $enableQuery_length = '3';

// turn on or off the random words via API | limit = total amount of rand words | key = your www.wordnik.com API key | call custom random words with randWords(length) as in randWords(3) would make 3 spaced words.
  $enableWords = false;
  $enableWords_limit = '200';
  $enableWords_key = '';

// turn on or off logs | sys = server level logging to var/log/messages | html = account level logging | file = the file location for html logging | timezone = your local timezone for date format
  $enableLog_sys = false;
  $enableLog_html = true;
  $enableLog_file = 'system/logs/error.txt';
  $enableLog_timezone = 'America/New_York';

// turn on or off proxybuster - attempts to un-hide originating IP if [reverse]proxy provides methods to do so
  $enableProxyBust = true;

// turn on or off honeypot | url = link to another honeypot | email = a mailto an email trap
  $enableHoneyPot = true;
  $enableHoneyPot_url = 'http://srce.dsms.net/adlibituminvolved.php?blogid=5';
  $enableHoneyPot_email = 't.r.o.e.g.d+wp-admin@gmail.com';





// @GLOBALS - define utility variables or routes
date_default_timezone_set($enableLog_timezone);
$dateTime = date("m-d-Y h:i:s A");
$userAgent = (isset($_SERVER['HTTP_USER_AGENT'])?$_SERVER['HTTP_USER_AGENT']:'');
$requestURL = (isset($_SERVER['SERVER_NAME'])?$_SERVER['SERVER_NAME']:'');
$requestURI = (isset($_SERVER['REQUEST_URI'])?$_SERVER['REQUEST_URI']:'');

if (($enableProxyBust == true) && (isset($_SERVER['REMOTE_ADDR'])) && (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) && (!empty($_SERVER['HTTP_X_FORWARDED_FOR']))) {
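	// end() takes its argument by reference, so build the list in a real variable first
	$ipChain = array_values(array_filter(array_map('trim', explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']))));
	// fall back to the connecting address if the header held nothing usable
	$ip = $ipChain ? end($ipChain) : $_SERVER['REMOTE_ADDR'];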
	$ipProxy = $_SERVER['REMOTE_ADDR'];
	$ipProxy_label = ' behind proxy ';
} elseif (($enableProxyBust == true) && (isset($_SERVER['REMOTE_ADDR']))) {
	$ip = $_SERVER['REMOTE_ADDR'];
	$ipProxy = '';
	$ipProxy_label = ' no proxy detected ';
} elseif (($enableProxyBust == false) && (isset($_SERVER['REMOTE_ADDR']))) {
	$ip = $_SERVER['REMOTE_ADDR'];
	$ipProxy = '';
	$ipProxy_label = '';
} else {
	$ip = '';
	$ipProxy = '';
	$ipProxy_label = '';
}


// @TIMER START - begin script timer if logs config is on
if (($enableLog_sys == true) || ($enableLog_html == true)) {
	$timerStart = microtime(true);
}


// @HEADER - set headers according to config
if ($enableNocache == true) {
	header("Cache-Control: no-cache, must-revalidate");
}
if ($enable403 == true) {
	header('HTTP/1.1 403 Forbidden');
}


// @GENWORD - generate some random words from wordnik.com API (you need an API key)
function randWords ($length) {
	global $enableWords, $enableWords_key;
	if (($enableWords == true) && (isset($enableWords_key)) && (!empty($enableWords_key))) {
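		$randWords_get = @file_get_contents('http://api.wordnik.com:80/v4/words.json/randomWords?hasDictionaryDef=false&minCorpusCount=0&maxCorpusCount=-1&minDictionaryCount=1&maxDictionaryCount=-1&minLength=5&maxLength=-1&limit=' . (int) $length . '&api_key=' . urlencode($enableWords_key));
		$outputWords = ($randWords_get !== false) ? json_decode($randWords_get, true) : null;

		// print nothing if the API call failed or returned something unexpected
		if (is_array($outputWords)) {
			foreach ($outputWords as $outputWord) {
				if (isset($outputWord['word'])) {
					echo htmlspecialchars($outputWord['word']) . ' ';
				}
			}
		}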
	}
}


// @GENRANDOM - make random letter number strings for url or email
function randChars($length) {
    $randMake = '';
    $randMakes = array_merge(range(0, 9), range('a', 'z'));
    for ($i = 0; $i < $length; $i++) {
        $randMake .= $randMakes[array_rand($randMakes)];
    }
    return $randMake;
}
$randomEmail = randChars(12) . '@' . randChars(6) .'.com ';
$randomEmailAlt = randChars(10) . '@' . randChars(8) .'.com ';


// @TARBABY - start the tarpit - reduce delay or disable it in configs if its too heavy entries
function tarBaby ($limit, $length, $delay) {
	echo '<h2>There are ' . $limit . ' Users Online:</h2>';
	
	if ((ob_get_level() == 0)) ob_start();

	echo(str_pad('',1024));

	for ($i = 0; $i < $limit; $i++) {
		// wrap the random string in junk bytes so the harvested address is useless
		$corruptMail = chr(mt_rand(0,255)) . randChars($length) . chr(mt_rand(0,255)) . '@' . chr(mt_rand(0,255)) . 'gmail' . chr(mt_rand(0,255)) . '.com';
		echo '<a href="mailto:' . $corruptMail . '" target="_blank">' . $corruptMail . '</a>, ';
		// push each link to the client right away, then stall before the next one
		ob_flush();
		flush();
		sleep((int) $delay);
	}
	ob_end_flush();
}


// @URLS - check page URL to make pagination and include querystring config setting
function curPageURL() {
	global $enableQuery, $enableQuery_separator, $enableQuery_length, $requestURL, $requestURI;
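	// $_SERVER['HTTPS'] is not set on plain HTTP, so test it safely
	$isHttps = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off');
	$port = isset($_SERVER['SERVER_PORT']) ? (string) $_SERVER['SERVER_PORT'] : '';
	$defaultPort = $isHttps ? '443' : '80';

	$pageURL = ($isHttps ? 'https' : 'http') . '://';
	// only spell out the port when it is not the default for the scheme
	if ($port != '' && $port != $defaultPort) {
		$pageURL .= $requestURL . ':' . $port . $requestURI;
	} else {
		$pageURL .= $requestURL . $requestURI;
	}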
	if ($enableQuery == true) {
		return $pageURL . $enableQuery_separator . randChars($enableQuery_length);
	} else {
		return $pageURL;
	}
}


// @HONEYPOT - generate or use existing honeypots and email traps
function makeHoneyPot() {
	global $enableHoneyPot, $enableHoneyPot_url, $enableHoneyPot_email;
	$outputHoney = '';
	if ($enableHoneyPot == true) {
		$outputHoney .= '<a href="' . $enableHoneyPot_url . '" rel="nofollow noindex" target="_blank">Forum Blog</a> <a href="mailto:' . $enableHoneyPot_email . '" rel="nofollow noindex" target="_blank">' . $enableHoneyPot_email . '</a>';
	} else {
		$outputHoney = '<a href="' . curPageURL() . '">Forum Blog</a>';
	}
	return $outputHoney;
}


// @UTILS - end timer, begin log entry for fail2ban in syslog /var/log/messages and wherever you defined your log file
function banHammer() {
	global $enableLog_sys, $enableLog_html, $enableLog_file, $timerStart, $dateTime, $userAgent, $requestURL, $requestURI, $ip, $ipProxy, $ipProxy_label;
	if (($enableLog_sys == true) || ($enableLog_html == true)) {
		$timerEnd = microtime(true) - $timerStart;
		$timerRounded = round($timerEnd, 2);
		$memoryUsed = memory_get_peak_usage(true)/1024/1024;
	}
	
	if ($enableLog_sys == true) {
		$logSYS = '@Spam | Tarpit caught ' . $ip . $ipProxy_label . $ipProxy . ' | ' . $requestURL . ' | ' . $userAgent . ' Memory: ' . $memoryUsed . 'MB - Time: ' . $timerRounded . ' Sec';
			syslog(LOG_NOTICE, $logSYS);
	}
			
	if ($enableLog_html == true) {
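		// the IP (via X-Forwarded-For) and user agent are attacker-controlled, so escape them before writing an HTML log
		$logHTML = '<b>@Spam</b> - ' . $dateTime . ' | <b>Tarpit</b> caught an IP <b>' . htmlspecialchars($ip, ENT_QUOTES) . '</b>' . $ipProxy_label . '<b>' . $ipProxy . '</b> | ' . $requestURL . ' | ' . htmlspecialchars($userAgent, ENT_QUOTES) . ' Memory: ' . $memoryUsed . 'MB - Time: ' . $timerRounded . ' Sec' . "\r\n";
		$logThread = @fopen($enableLog_file, 'a+');
		// skip logging quietly if the file cannot be opened (bad path or permissions)
		if ($logThread !== false) {
			fwrite($logThread, $logHTML);
			fclose($logThread);
		}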
	}
}
?>

<!-- @VIEW - render some output from above -->
<html><head>
<?php if ($enableNoindex == true) { ?>
	<meta name="robots" content="noindex, nofollow">
<?php } ?>
<?php if ($enableWordpress == true) { ?>
	<meta name="generator" content="WordPress 3.9.1">
<?php } ?>
<?php if ($enableDrupal == true) { ?>
	<meta name="generator" content="Drupal 7 (http://drupal.org)">
<?php } ?>
</head><body>

<?php if ($enableNohumans == true) { ?>
	<div id="overlay" style="position: fixed; top: 0; left: 0; width: 100%; height: 120%; background: #000000; opacity: 0.95;"><div id="overlay_inner" style="max-width: 600px; margin: 0 auto; position: relative; top: 25%; color: #FFFFFF; font-size: 2em; text-align: center;"><?php echo $enableNohumans_msg; ?></div></div>
<?php } ?>

<h1>Welcome to The <?php echo $requestURL; ?> Blog! <a href="<?php echo curPageURL(); ?>">Login</a> <a href="<?php echo curPageURL(); ?>">Register</a> <a href="<?php echo curPageURL(); ?>">My Account</a></h1>

<?php if ($enableForm_login == true) { ?>
	<form>
		Email or Username:<br /><input type="email" name="email" size="50" /><br />
		Password:<br /><input type="password" name="password" size="50" /><br />
		<input type="submit" name="submit" value="Login Register" />
	</form>
<?php } ?>

<p><?php randWords($enableWords_limit); ?></p>

<?php if ($enableForm_comment == true) { ?>
	<p><a href="<?php echo curPageURL(); ?>">Reply</a> to comment by: <a href="mailto:<?php echo $randomEmail ?>"><?php echo $randomEmail ?></a></p>
	<form>
		<input type="title" name="title" placeholder="Title..." size="50" /><br/>
		<textarea name="comment" placeholder="Type your comment here..." cols="39" rows="10"></textarea><br />
		Your Name:<br /><input type="name" name="name" placeholder="Guest" size="50" /><br />
		Your Email:<br /><input type="email" name="email" placeholder="<?php echo $randomEmailAlt; ?>" size="50" required /><br />
		Your Website:<br /><input type="url" name="url" placeholder="http://www.<?php echo randChars(10); ?>.com" size="50" /><br />
		<input type="submit" name="submit" value="Reply Post Comment Email Contact" />
	</form>
<?php } ?>

<?php if ($enableTar == true) { ?>
	<p><?php tarBaby($enableTar_limit, $enableTar_length, $enableTar_delay); ?></p>
<?php } ?>

<p><a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Previous</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Home</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Next</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Login</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Register</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Profile</a>&nbsp;<a href="<?php echo curPageURL(); ?>" title="<?php echo curPageURL(); ?>">Contact</a>&nbsp;<a href="<?php echo curPageURL(); ?>.xml" title="<?php echo curPageURL(); ?>.xml">Sitemap</a></p>

<h2><?php echo makeHoneyPot(); ?></h2>

<p>Powered By: Large Energy Companies</p>

<?php banHammer(); ?>

</body></html>


Intermission: Now we are set with some goods: 2 honeytraps, a post protection, and a fake email (see the next steps if you didn't make a fake email). All we have to do now is make some ins, protect them from good bots, and start watching the catches of the day.



Step 4) Make sure the projecthoneypot trap, the SFS protection, and the tarpit are working and accessible. Then deny the trap files in robots.txt so good crawlers stay out of them. Add lines to your store/robots.txt file like so:

Code:

User-agent: *
Disallow: /my-honeypot-file.php
Disallow: /my-tarpit-file.php
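
A quick way to double-check that the trap files really are denied before you put bait links on the site is to run robots.txt through a small script. This is only a sketch, assuming the usual User-agent / Disallow directives: robotsDisallows() is a hypothetical helper, it only looks at "User-agent: *" groups, and it ignores Allow rules and wildcards.

```php
<?php
// Sketch only: a deliberately simple robots.txt check, not a full parser.
// It reads "User-agent: *" groups only and ignores Allow rules and wildcards.
function robotsDisallows($robotsTxt, $path) {
    $appliesToAll = false;  // true while inside a group that includes "User-agent: *"
    $inAgentLines = false;  // true while reading consecutive User-agent lines
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            if (!$inAgentLines) {
                $appliesToAll = false; // a new group starts here
            }
            if ($value === '*') {
                $appliesToAll = true;
            }
            $inAgentLines = true;
            continue;
        }
        $inAgentLines = false;
        // a Disallow rule matches when the path starts with the rule value
        if ($field === 'disallow' && $appliesToAll && $value !== '' && strpos($path, $value) === 0) {
            return true;
        }
    }
    return false;
}
```

For example, robotsDisallows(file_get_contents('robots.txt'), '/my-tarpit-file.php') should come back true once the lines are in place. If it comes back false, fix robots.txt before going live, otherwise good crawlers can wander into the trap.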

Step 5) Now the fun part: we need to lay a trail of bait through your site. The placement and naming are very important. As far as naming your trap/honeypot goes, it's best to make the names as mouth-watering as possible. What's a good starting point? Name one wp-login.php (for spam, i.e. the honeypot) and the other wp-admin.php (for the tarpit trap). This will catch all types of scanners as well as spammers. Now we must link to those locations with hidden, off-UI, nofollow/noindex links.

Edit this to your locations/emails and put it directly after the opening <body> tag on every page of your site. The easiest way is with header.tpl. It's important that this comes before any other register/login/admin/comment/email style links so it catches everything (note: replace EXAMPLE with your domain, and replace t.r.o.e.g.d+oc@gmail.com with your donated MX if you made one):

Code:

<div style="position: absolute; top: -250px; left: -250px; display: none;">
	<a href="http://www.EXAMPLE.com/wp-admin.php" rel="nofollow noindex" target="_blank" style="display: none;">Login</a>
	<a href="http://www.EXAMPLE.com/wp-login.php" rel="nofollow noindex" target="_blank" style="display: none;">Register</a>
	<a href="http://srce.dsms.net/adlibituminvolved.php?blogid=5" rel="nofollow noindex" target="_blank" style="display: none;">Blog Forum</a>
	<a href="mailto:t.r.o.e.g.d+oc@gmail.com?subject=Just%20Spamming%20You" rel="nofollow noindex" target="_blank" style="display: none;">Contact</a>
</div>
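
If you run several stores and would rather not hand-edit the bait block for each one, the same markup can be generated. A minimal sketch: baitLinks() and its parameters are hypothetical helpers, not part of OpenCart, and you would need to define the function somewhere your header template can reach it.

```php
<?php
// Sketch only: baitLinks() and its parameters are hypothetical, not OpenCart API.
// Returns the hidden bait block for one store, with every href escaped.
function baitLinks($baseUrl, $tarpitFile, $honeypotFile, $trapEmail) {
    $base = rtrim($baseUrl, '/') . '/';
    $attrs = ' rel="nofollow noindex" target="_blank" style="display: none;"';
    // link text => target; the text is what a harvesting bot goes looking for
    $links = array(
        'Login'    => $base . $tarpitFile,
        'Register' => $base . $honeypotFile,
        'Contact'  => 'mailto:' . $trapEmail,
    );
    $html = '<div style="position: absolute; top: -250px; left: -250px; display: none;">' . "\n";
    foreach ($links as $label => $href) {
        $html .= "\t" . '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '"' . $attrs . '>' . $label . '</a>' . "\n";
    }
    return $html . '</div>' . "\n";
}
```

In the template this would be something like echo baitLinks('http://www.EXAMPLE.com', 'wp-admin.php', 'wp-login.php', 'your-trap-address'); placed right after the opening <body> tag.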

Now keep watching the logs or, in advanced mode, set up fail2ban based on them.
You should see entries flowing into your error.txt (or wherever you set logs in those extensions). If you have fail2ban or a VPS and turned on server logging, it'll dump there for your banning pleasure. For now these mods don't necessarily report back, but that's why they're blueprint points.
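
If you are on shared hosting without fail2ban, one way to get from the log to a ban list is to count how often each IP hit the tarpit. A rough sketch, assuming the "<b>Tarpit</b> caught an IP <b>x.x.x.x</b>" line format that banHammer() writes when HTML logging is on; tarpitOffenders() and the 3-hit threshold are made up for illustration.

```php
<?php
// Sketch only: tarpitOffenders() is a hypothetical helper that assumes the
// "<b>Tarpit</b> caught an IP <b>x.x.x.x</b>" line format written by banHammer().
// Returns the IPs that hit the tarpit at least $minHits times.
function tarpitOffenders($logText, $minHits = 3) {
    $hits = array();
    preg_match_all('/<b>Tarpit<\/b> caught an IP <b>([0-9a-fA-F:.]+)<\/b>/', $logText, $matches);
    foreach ($matches[1] as $ip) {
        if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
            continue; // skip anything that is not a well-formed IPv4/IPv6 address
        }
        $hits[$ip] = isset($hits[$ip]) ? $hits[$ip] + 1 : 1;
    }
    $offenders = array();
    foreach ($hits as $ip => $count) {
        if ($count >= $minHits) {
            $offenders[] = $ip;
        }
    }
    return $offenders;
}
```

Feed it the log with something like tarpitOffenders(file_get_contents('system/logs/error.txt')). What you do with the list (firewall rules, server deny rules, reporting) is up to you. Requiring a few hits before banning is a safety margin in case a real visitor or a misbehaving but harmless crawler stumbles in once.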


OK, that's a rough overview, but there is not much else to it! Because you're now trapping and can log the heck outta these things, you can start banning (or autobanning), and your server load will be reduced during big bot waves or amateur floods. Have fun!

https://creadev.org | support@creadev.org - Opencart Extensions, Integrations, & Development. Made in the USA.



Post by Dunald » Tue Jul 10, 2018 9:03 pm

How do I "start banning (or autobanning)"?
