Archive for March 21st, 2008

PHPBB3 Captcha difficulty

Is phpBB3 more secure than phpBB2? Here is a default phpBB3 sample.

PHPBB3 captcha

This is a lot stronger than a phpBB2 captcha. We can’t separate a letter based purely on its colour anymore: notice the line running underneath the B that is the same colour as the B. The background colour is annoying as anything, but only from a person’s point of view. Our PC doesn’t really mind.

One of its weaknesses is that no lines cut across the squares; they all go underneath them. That means there are no breaks in the squares for us to detect. The only other weakness I can see is that the lines go straight across without intersecting at any point, so the noise never forms objects that look like the squares making up the letters.

So here’s my algorithm, which I think would solve it. Admittedly I haven’t tested this, but I don’t see why it wouldn’t work. All the letters are made up of squares. Starting at one pixel, we need to test whether we can get back to the start by following pixels of the same colour. That obviously would make a square :P , or something close to one, like a distorted rectangle. It’s almost like a dot-to-dot puzzle. If we can get back to the start, we keep the line and colour the shape in using the previous post’s fill function (or PHP GD’s one :D ). If we can’t get back to the start, or the line keeps travelling too far, we remove it and find another coloured pixel that doesn’t match the background colour.
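
The loop test described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not tested against real phpBB3 captchas: the image is represented as a plain 2D array of colour values rather than a GD resource, pixels are treated as connected only horizontally and vertically, and the function name `tracesClosedLoop` and the `$maxPixels` cut-off are made up for the example. The idea is that the start pixel sits on a closed loop if two of its same-coloured neighbours can reach each other without passing back through the start.

```php
<?php
// Hypothetical helper: true if same-coloured pixels lead from ($sx, $sy)
// around and back to it. $grid[$y][$x] holds a colour value.
function tracesClosedLoop(array $grid, int $sx, int $sy, int $maxPixels = 500): bool
{
    $h = count($grid);
    $w = count($grid[0]);
    $colour = $grid[$sy][$sx];
    $steps = [[1, 0], [-1, 0], [0, 1], [0, -1]];

    // Collect the same-coloured neighbours of the start pixel.
    $startNeighbours = [];
    foreach ($steps as [$dx, $dy]) {
        $nx = $sx + $dx;
        $ny = $sy + $dy;
        if ($nx >= 0 && $nx < $w && $ny >= 0 && $ny < $h && $grid[$ny][$nx] === $colour) {
            $startNeighbours["$nx,$ny"] = [$nx, $ny];
        }
    }

    // The start is marked visited so no walk can pass back through it.
    $visited = ["$sx,$sy" => true];
    foreach ($startNeighbours as $key => [$ux, $uy]) {
        if (isset($visited[$key])) {
            continue;
        }
        $visited[$key] = true;
        $stack = [[$ux, $uy]];
        while ($stack) {
            [$x, $y] = array_pop($stack);
            foreach ($steps as [$dx, $dy]) {
                $nx = $x + $dx;
                $ny = $y + $dy;
                $nKey = "$nx,$ny";
                if ($nx < 0 || $nx >= $w || $ny < 0 || $ny >= $h) {
                    continue;
                }
                if (isset($visited[$nKey]) || $grid[$ny][$nx] !== $colour) {
                    continue;
                }
                // Reaching a different neighbour of the start closes the loop.
                if (isset($startNeighbours[$nKey])) {
                    return true;
                }
                $visited[$nKey] = true;
                // The line has travelled too far: give up on it.
                if (count($visited) > $maxPixels) {
                    return false;
                }
                $stack[] = [$nx, $ny];
            }
        }
    }
    return false;
}

// A hollow square (colour 1) with a noise line (colour 2) running underneath.
$grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [2, 2, 2, 2, 2, 2],
];
var_dump(tracesClosedLoop($grid, 1, 1)); // bool(true)  - the square outline
var_dump(tracesClosedLoop($grid, 2, 4)); // bool(false) - the straight line
```

Note that a solid 2x2 block of one colour also counts as a loop here, which is exactly the thick-line problem mentioned below.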

The main issues we would have to overcome are lines which are thicker than 1 pixel and small blocks of colour found at the side of some of the letters. The other issue would be making sure we don’t recheck the part we just shaded in (Maybe use a unique colour for it?).
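
One possible way to avoid rechecking shaded parts, sketched here as an assumption rather than something tried on real captchas, is to keep a separate label map next to the image instead of recolouring it: every shape that gets filled receives its own number, and any pixel that already has a number is skipped. The function name `labelShapes` and the 2D-array image representation are invented for the example.

```php
<?php
// Hypothetical sketch: give every connected same-coloured shape a unique
// label so that no pixel is ever examined as a starting point twice.
function labelShapes(array $grid, int $background): array
{
    $h = count($grid);
    $w = count($grid[0]);
    // 0 means "not labelled yet"
    $labels = array_fill(0, $h, array_fill(0, $w, 0));
    $next = 0;

    for ($sy = 0; $sy < $h; $sy++) {
        for ($sx = 0; $sx < $w; $sx++) {
            // skip background and anything already shaded in
            if ($grid[$sy][$sx] === $background || $labels[$sy][$sx] !== 0) {
                continue;
            }
            $next++;
            $colour = $grid[$sy][$sx];
            $labels[$sy][$sx] = $next;
            $stack = [[$sx, $sy]];
            while ($stack) {
                [$x, $y] = array_pop($stack);
                foreach ([[1, 0], [-1, 0], [0, 1], [0, -1]] as [$dx, $dy]) {
                    $nx = $x + $dx;
                    $ny = $y + $dy;
                    if ($nx < 0 || $nx >= $w || $ny < 0 || $ny >= $h) {
                        continue;
                    }
                    if ($grid[$ny][$nx] !== $colour || $labels[$ny][$nx] !== 0) {
                        continue;
                    }
                    $labels[$ny][$nx] = $next;
                    $stack[] = [$nx, $ny];
                }
            }
        }
    }
    // the label map and the number of shapes found
    return [$labels, $next];
}

[$labels, $count] = labelShapes([
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 2],
], 0);
echo $count, "\n"; // 3
```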

Friday, March 21st, 2008

A custom floodfill routine

Last post I said that to separate characters you simply need to flood fill them and calculate the extreme points to find out where to fit your rectangle. Hahaha. I’ve spent the last few days trying to port optimized floodfill functions to PHP. Normally I’d just take the sane, easy option and use pre-written code like GOCR, but apparently I like pain.

The problem is the optimized bit. I can write a simple recursive floodfill function that calls itself until it’s done, but I had no idea how to write something that would be reasonably fast. The more complicated captchas will require a fair bit of speed, because you will want to crack a fair few of them in a short period of time. This site is where I eventually found a simple routine that worked. My issue was that I used another routine, ported it, and then it broke. Miserably. After filling only two lines.
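
For comparison, the simple recursive version mentioned above looks roughly like this. It is a sketch only, working on a plain 2D array instead of a GD image so it can run on its own; the name `floodFillRecursive` is made up for the example. It makes one function call per pixel in four directions, so large shapes mean very deep recursion, which is why a stack-based scanline routine is the better fit.

```php
<?php
// Illustrative 4-way recursive flood fill on $grid[$y][$x].
function floodFillRecursive(array &$grid, int $x, int $y, int $old, int $new): void
{
    // filling a colour with itself would recurse forever
    if ($old === $new) {
        return;
    }
    // stop at the image edges
    if ($y < 0 || $y >= count($grid) || $x < 0 || $x >= count($grid[0])) {
        return;
    }
    // stop at anything that is not the colour being replaced
    if ($grid[$y][$x] !== $old) {
        return;
    }
    $grid[$y][$x] = $new;
    floodFillRecursive($grid, $x + 1, $y, $old, $new);
    floodFillRecursive($grid, $x - 1, $y, $old, $new);
    floodFillRecursive($grid, $x, $y + 1, $old, $new);
    floodFillRecursive($grid, $x, $y - 1, $old, $new);
}

$grid = [
    [0, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
];
floodFillRecursive($grid, 0, 0, 0, 7);
// $grid is now [[7, 7, 1], [1, 7, 1], [1, 1, 0]];
// the bottom-right 0 only touches the filled area diagonally, so it stays.
```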

Here is my code. First it loads in an image of a captcha. It then scans two lines along the horizontal axis, one at 1/4 of the way down the picture and one at 3/4 of the way down. When it hits a character it floodfills it. The floodfill function returns the extreme positions of pixels which gives us a rectangle around that letter.

<?php

function floodFillScanlineStack($image, $x, $y)
{
    // the colour we are shading in - black letters
    $oldColour = 0;
    // the colour we want to shade the letters in - red just because we can
    $fillColour = imagecolorallocate($image, 255, 0, 0);

    // we need the image width & height
    $w = imagesx($image);
    $h = imagesy($image);

    // set the rectangle co-ords
    $rectangle = array("x1" => $x, "x2" => $x, "y1" => $y, "y2" => $y);

    // nothing to do if the fill colour is the colour being replaced
    if ($oldColour == $fillColour) return $rectangle;

    $stack = array();
    $stack[] = array("x" => $x, "y" => $y);

    while (count($stack) > 0)
    {
        $pos = array_pop($stack);
        $x = $pos['x'];
        $y = $pos['y'];

        // walk up to the top of this vertical run of letter pixels
        $y1 = $y;
        while ($y1 >= 0 && imagecolorat($image, $x, $y1) == $oldColour) $y1--;
        $y1++;
        $spanLeft = 0;
        $spanRight = 0;
        // then fill downwards, queueing the neighbouring columns as we go
        while ($y1 < $h && imagecolorat($image, $x, $y1) == $oldColour)
        {
            // here we set the pixel colour
            imagesetpixel($image, $x, $y1, $fillColour);

            // use these to find our rectangle around the letter
            if ($x < $rectangle['x1'])
                $rectangle['x1'] = $x;
            if ($y1 < $rectangle['y1'])
                $rectangle['y1'] = $y1;
            if ($x > $rectangle['x2'])
                $rectangle['x2'] = $x;
            if ($y1 > $rectangle['y2'])
                $rectangle['y2'] = $y1;

            if ($spanLeft == 0 && $x > 0 && imagecolorat($image, $x - 1, $y1) == $oldColour)
            {
                $stack[] = array("x" => $x - 1, "y" => $y1);
                $spanLeft = 1;
            }
            else if ($spanLeft == 1 && $x > 0 && imagecolorat($image, $x - 1, $y1) != $oldColour)
            {
                $spanLeft = 0;
            }
            // $x + 1 must stay inside the image, so compare against $w - 1
            if ($spanRight == 0 && $x < $w - 1 && imagecolorat($image, $x + 1, $y1) == $oldColour)
            {
                $stack[] = array("x" => $x + 1, "y" => $y1);
                $spanRight = 1;
            }
            else if ($spanRight == 1 && $x < $w - 1 && imagecolorat($image, $x + 1, $y1) != $oldColour)
            {
                $spanRight = 0;
            }
            $y1++;
        }
    }

    return $rectangle;
}

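function floodfill_char($image, $x)
{
    // try the upper scan line first, then the lower one
    if (imagecolorat($image, $x, 12) == 0)
        return floodFillScanlineStack($image, $x, 12);

    if (imagecolorat($image, $x, 38) == 0)
        return floodFillScanlineStack($image, $x, 38);

    // no black pixel on either scan line in this column
    return null;
}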

function split_chars_along_vertical($image)
{
    $w = imagesx($image);
    $h = imagesy($image);

    /* $rgb = imagecolorat($img, $x, $y);
    $r += $rgb >> 16;
    $g += $rgb >> 8 & 255;
    $b += $rgb & 255; */

    // step across each column looking for black pixels
    // we'll only scan two rows of pixels to save time. Both near the centre,
    // split slightly apart
    $letters = array();
    for ($index = 0; $index < $w; $index++)
    {
        // check two rows of pixels, one at 12 down, one at 38 down
        // the picture is 50 pixels tall by the way
        if ((imagecolorat($image, $index, 12) == 0) || (imagecolorat($image, $index, 38) == 0))
        {
            // fill the character and return a rectangle around the image
            $rectangle = floodfill_char($image, $index);

            // pull this letter out into a new image
            $singleLetter = imagecreatetruecolor($rectangle['x2'] - $rectangle['x1'] + 1,
                $rectangle['y2'] - $rectangle['y1'] + 1);
            imagecopy($singleLetter, $image, 0, 0, $rectangle['x1'], $rectangle['y1'],
                $rectangle['x2'] - $rectangle['x1'] + 1,
                $rectangle['y2'] - $rectangle['y1'] + 1);
            $letters[] = $singleLetter;

            // find the next character: the loop's $index++ moves us to the
            // first column after this letter
            $index = $rectangle['x2'];
        }
    }

    return $letters;
}

$image = @imagecreatefrompng('82.clean.png');
if ($image === false) { die('Unable to open image'); }

$letters = split_chars_along_vertical($image);
if (count($letters) == 0) { die('No letters found'); }

// dump the first letter to the screen
header("Content-Type: image/png");
imagepng($letters[0]);

?>
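
The listing hard-codes the scan rows 12 and 38 for a picture 50 pixels tall. As a small sketch of how the same "1/4 and 3/4 of the way down" idea could be derived from the image height instead, here is a version working on a plain 2D array; `scanRows` and `findLetterStart` are hypothetical names, and integer division gives rows 12 and 37 for a height of 50 rather than exactly 12 and 38.

```php
<?php
// Hypothetical helper: the two rows to scan, a quarter and three quarters
// of the way down an image of the given height.
function scanRows(int $height): array
{
    return [intdiv($height, 4), intdiv(3 * $height, 4)];
}

// Hypothetical helper: step across the columns from $fromX and return the
// [x, y] of the first letter-coloured pixel on either scan row, or null.
function findLetterStart(array $grid, int $fromX, int $letterColour = 0): ?array
{
    $rows = scanRows(count($grid));
    $w = count($grid[0]);
    for ($x = $fromX; $x < $w; $x++) {
        foreach ($rows as $y) {
            if ($grid[$y][$x] === $letterColour) {
                return [$x, $y];
            }
        }
    }
    return null;
}

print_r(scanRows(50)); // rows 12 and 37
```

The returned position is where a flood fill such as the one above would start.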

To run through the code and show how it works I made this neat little gif. To be honest it’s probably just a waste of my bandwidth but it looks pretty cool.

Floodfill gif

Now I can finally sort this damn neural network out :P . You just know something is going to break. Of course that’s the fun part, right? Anyone?

Scripts to separate characters

Friday, March 21st, 2008