Get lyrics for any song using XMPP and PHP right into your IM – Add [email protected]

XMPP is quickly finding its way into real-time applications other than chat. I have combined JAXL (a Jabber XMPP client library written in PHP) with the lyricsfly API to build a real-time chat bot that can fetch lyrics for any song. You can start using it by simply adding [email protected] to your IM account (e.g. Gtalk, Jabber). In this blog post, I will briefly explain how the lyricsfly bot works and how you can integrate XMPP into your own application.

Try out [email protected]
Follow these steps to get the bot working for you:

  • Log in to your Gtalk account using any available IM client
  • Press Add Contact
  • Add [email protected] as your chat buddy
  • Send a chat message in the format “Song Title – Song Artist”, e.g. “one – metallica”
  • You should see something like this: Demo for "one-metallica"

Working of [email protected] with Jaxl
Here, in brief, is how the lyricsfly bot works with the Jaxl client library:

  • When someone sends a message like “one – metallica” to the bot, the eventMessage() method inside jaxl.class.php is called
  • eventMessage() extracts the song title and artist name from the message using PHP's explode(), then filters both for allowed characters
  • eventMessage() then calls the lyricsfly API to fetch the lyrics and sends them back to the requester as a message
  • eventMessage() also caches the lyrics in memcached, which reduces both the response time and the load on the lyricsfly servers
  • The bot also keeps a count of the queries made by each user. Since it is still under development, there is currently a limit on the number of lyrics you can fetch in a single day.
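
The message-parsing step described above could look roughly like the sketch below. The function name parse_lyrics_request() and the set of allowed characters are assumptions made for illustration; they are not taken from the Jaxl source.

```php
<?php
// Hypothetical helper: the name and the allowed-character set are assumptions.
function parse_lyrics_request(string $message): ?array {
    // Chat clients may turn "-" into an en dash, so normalise it first.
    $message = str_replace("\u{2013}", '-', $message);

    // Split into at most two parts: "title - artist".
    $parts = explode('-', $message, 2);
    if (count($parts) !== 2) {
        return null;
    }

    // Keep only letters, digits, apostrophes and spaces.
    $clean = function (string $s): string {
        return trim(preg_replace('/[^a-z0-9\' ]/i', '', $s));
    };

    $title  = $clean($parts[0]);
    $artist = $clean($parts[1]);
    if ($title === '' || $artist === '') {
        return null;
    }
    return array('title' => $title, 'artist' => $artist);
}

print_r(parse_lyrics_request("one - metallica"));
```

Returning null for anything that does not match the expected format lets the bot reply with a usage hint instead of calling the lyrics API with bad input.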

Making your own custom bot

  • Check out the latest code from trunk
    sabhinav$ svn checkout jaxl-read-only
  • Edit the config file with your bot's username, password and Jabber servers
  • Run from the command line:
    php index.php
  • To customize the bot, modify the eventMessage and eventPresence methods of the Jaxl class inside jaxl.class.php

For a full-fledged running bot example, edit index.php to include jaxl4dzone.class.php instead of jaxl.class.php and re-run the bot.

Have fun and enjoy singing songs along with the lyrics.

WordPress style “Duplicate comment detected” using Memcached and PHP

If you are in the habit of leaving comments on blogs, chances are you have run into a WordPress error page saying “Duplicate comment detected; it looks as though you’ve already said that!“, probably because you were not sure that your comment had been saved and you tried to post it again. In this blog post, I will put up some sample PHP code for duplicate comment detection using Memcached, without touching the database. Towards the end, I will also discuss how the script can be modified for use in any environment, including forums and social networking websites.

Duplicate comment detection using Memcached
Here is a PHP function called is_repetitive_comment which returns the previously stored value if the comment is repetitive, and FALSE otherwise.


        <?php
        define('COMPRESSION', 0);
        define('SIGNATURE_TTL', 60);

        $mem = new Memcache;
        $mem->addServer("localhost", 11211);

        // $username can be an ip address in an anonymous environment
        // for per blog/forum checks pass the forum id too
        // for multiple hosts sharing one memcached instance, pass the hostname too
        // for restricting posts of the same comment site-wide, don't pass a username
        function is_repetitive_comment($comment, $username) {
                $comment = trim($comment);
                $signature = md5(implode('', func_get_args()));

                global $mem;
                if(($value = $mem->get($signature)) !== FALSE) {
                        error_log($signature." found at ".time());
                        return $value;
                }
                else {
                        $value = array('comment' => $comment,
                                       'by' => $username,
                                       /* Other information you may want to save */
                                       );
                        $mem->set($signature, $value, COMPRESSION, SIGNATURE_TTL);
                        error_log($signature." set at ".time());
                        return FALSE;
                }
        }


Is it working?
Let's verify that the code works before digging into it:

  • Save the sample code in a file and name it index.php
  • Towards the end of the script, add the following 3 lines of code:
            var_dump(is_repetitive_comment("User Comment", "username"));
            sleep(5); // Simulating the case when a user might try to post the same comment again knowingly or unknowingly
                      // Similar kind of check is done in wordpress comment submission (though without memcached)
            var_dump(is_repetitive_comment("User Comment", "username"));
  • Run from the command line:
    sabhinav$ php index.php
    6105b67d969642fe9e27bc052f29e259 set at 1262393877
    bool(false)
    6105b67d969642fe9e27bc052f29e259 found at 1262393882
    array(2) {
      ["comment"]=>
      string(12) "User Comment"
      ["by"]=>
      string(8) "username"
    }
  • As seen, is_repetitive_comment returns bool(false) the first time. However, when the same comment is submitted again 5 seconds later, it returns the information stored during the previous submission.

Working of is_repetitive_comment
Here, in brief, is how the script uses memcached for duplicate comment detection:

  • SIGNATURE_TTL defines the minimum time allowed between two identical comment submissions. It is set to 60 seconds by default
  • is_repetitive_comment takes two parameters: the comment itself and the username of the user trying to post it
  • The function creates a signature by combining the passed parameters and checks whether a key=$signature exists in memcached
  • If the key is found, the same user has posted the same comment within the past SIGNATURE_TTL, i.e. 60 seconds. The function simply returns the value stored under that key
  • If the key is NOT found, the function sets key=$signature in memcached and returns FALSE, which allows the user to post the comment

The value stored under key=$signature depends upon your application and use case. You might want to save some useful parameters so that you can show an appropriate error message without hitting the database at all.
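
Note that calling get() and then set() leaves a small window in which two simultaneous submissions can both pass the check. Memcache::add() only stores a key if it does not exist yet, so the check and the write can be folded into one call. The sketch below illustrates the idea with a tiny in-memory stand-in (ArrayStore, a hypothetical class written for this example) so that it runs without a memcached server; with the real client the equivalent call would be $mem->add($signature, $value, COMPRESSION, SIGNATURE_TTL).

```php
<?php
// Hypothetical in-memory stand-in for memcached, only for this sketch.
class ArrayStore {
    private $items = array();

    // Mimics add(): store only if the key is absent or already expired.
    public function add($key, $value, $ttl) {
        $now = time();
        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return false;
        }
        $this->items[$key] = array('value' => $value, 'expires' => $now + $ttl);
        return true;
    }
}

// Returns TRUE if the same comment was already seen within $ttl seconds.
function is_duplicate(ArrayStore $store, $comment, $username, $ttl = 60) {
    $signature = md5(trim($comment) . '|' . $username);
    // add() fails when the signature is already stored, i.e. a duplicate.
    return !$store->add($signature, 1, $ttl);
}

$store = new ArrayStore();
var_dump(is_duplicate($store, "User Comment", "username")); // first submission
var_dump(is_duplicate($store, "User Comment", "username")); // repeat within the TTL
```

The trade-off is that this variant only answers yes or no; if you need the stored details of the earlier submission for an error message, a get() after a failed add() can fetch them.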

Extracting more from the sample script
Here is how you can modify the above sample script for various environments:

  • If you are checking for repetitive comments in an anonymous environment, i.e. commenters may not be registered users, you can pass the commenter’s IP address instead of a username
  • If you serve multiple sites out of the same box and they all share the same memcached instance, you SHOULD also pass the site’s root URL to the function. Otherwise you might end up showing the error message to the wrong users
  • If you want to restrict submission of the same comment per blog or forum, also pass the blog id to the function
  • If you simply want to restrict submission of the same comment throughout your site, pass only the comment to the function
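
The variations above all come down to which parts go into the signature. Here is a small sketch, assuming a hypothetical helper comment_signature() that is not part of the original script. It joins the parts with a separator so that ("ab", "c") and ("a", "bc") do not produce the same key. The IP address, hostname and blog id are placeholder values.

```php
<?php
// Hypothetical helper: builds a memcached key from the comment plus any scope values.
function comment_signature($comment, ...$scope) {
    $parts = array_merge(array(trim($comment)), $scope);
    // The "\0" separator keeps ("ab", "c") distinct from ("a", "bc").
    return md5(implode("\0", $parts));
}

$comment = "Nice post!";
$per_user  = comment_signature($comment, "username");
$per_ip    = comment_signature($comment, "203.0.113.7");             // anonymous commenter
$per_site  = comment_signature($comment, "username", "example.com"); // shared memcached instance
$per_blog  = comment_signature($comment, "username", "blog_42");     // per blog or forum
$site_wide = comment_signature($comment);                            // same text from anyone
```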

Let me know if you do similar tiny little hacks using memcached 😀

How to use locks in PHP cron jobs to avoid cron overlaps

Cron jobs are the hidden building blocks of most websites. They are generally used to process or aggregate data in the background. However, as a website grows and there are gigabytes of data to be processed by every cron job, chances are that our cron jobs will overlap and possibly corrupt our data. In this blog post, I will demonstrate how we can avoid such overlaps by using simple locking techniques. I will also discuss a few edge cases we need to consider while using locks to avoid overlaps.

Cron job helper class
Here is a helper class (cron.helper.php) which will help us avoid cron job overlaps. (See the usage example below)


	<?php
	define('LOCK_DIR', '/Users/sabhinav/Workspace/cronHelper/');
	define('LOCK_SUFFIX', '.lock');

	class cronHelper {

		private static $pid;

		function __construct() {}

		function __clone() {}

		// Is the process whose id is stored in self::$pid still alive?
		private static function isrunning() {
			$pids = explode(PHP_EOL, `ps -e | awk '{print $1}'`);
			if(in_array(self::$pid, $pids))
				return TRUE;
			return FALSE;
		}

		public static function lock() {
			global $argv;

			// basename() keeps the lock file inside LOCK_DIR even if the job is started with a path
			$lock_file = LOCK_DIR.basename($argv[0]).LOCK_SUFFIX;

			if(file_exists($lock_file)) {
				// Is running?
				self::$pid = trim(file_get_contents($lock_file));
				if(self::isrunning()) {
					error_log("==".self::$pid."== Already in progress...");
					return FALSE;
				}
				else {
					error_log("==".self::$pid."== Previous job died abruptly...");
				}
			}

			self::$pid = getmypid();
			file_put_contents($lock_file, self::$pid);
			error_log("==".self::$pid."== Lock acquired, processing the job...");
			return self::$pid;
		}

		public static function unlock() {
			global $argv;

			$lock_file = LOCK_DIR.basename($argv[0]).LOCK_SUFFIX;

			// delete the lock file so that the next run can acquire the lock
			if(file_exists($lock_file))
				unlink($lock_file);

			error_log("==".self::$pid."== Releasing lock...");
			return TRUE;
		}

	}



Using cron.helper.php
Here is how the helper class can be integrated into your existing cron job code:

  • Save cron.helper.php in a folder called cronHelper
  • Update LOCK_DIR as per your needs
  • You might have to set proper permissions on the cronHelper folder, so that the running cron job has write permission
  • Wrap your cron job code as shown below:
    	require 'cronHelper/cron.helper.php';
    	if(($pid = cronHelper::lock()) !== FALSE) {
    		// Cron job code goes here
    		sleep(10); // Cron job code for demonstration
    		cronHelper::unlock();
    	}

Is it working? Verify
Let's verify that the helper class really takes care of all the edge cases.

  • sleep(10) is our cron job code for this test
  • Run from command line:
    sabhinav$ php job.php
    ==40818== Lock acquired, processing the job...
    ==40818== Releasing lock...

    where 40818 is the process id of current running cron job

  • Run from the command line and terminate the cron job midway by pressing Ctrl+C:
    sabhinav$ php job.php
    ==40830== Lock acquired, processing the job...

    By pressing Ctrl+C, we simulate cases where a cron job dies midway due to a fatal error or a system shutdown. In such cases, the helper class fails to release the lock held by this cron job.

  • With the lock in place (ls -l cronHelper | grep lock), run from command line:
    sabhinav$ php job.php
    ==40830== Previous job died abruptly...
    ==40835== Lock acquired, processing the job...
    ==40835== Releasing lock...

    As seen, the helper class detects that a previous cron job died abruptly and allows the current job to run successfully.

  • Run the cron job from two command line windows; one of them will not proceed, as shown below:
    centurydaily-lm:cronHelper sabhinav$ php job.php
    ==40856== Already in progress...

    One of the cron jobs exits, since a cron job with $pid=40856 is already in progress.

Working of cron.helper.php
The helper class creates a lock file inside LOCK_DIR. For our test cron job above, the lock file name will be job.php.lock. The lock file name suffix can be configured using LOCK_SUFFIX.

cronHelper::lock() writes the process id of the currently running cron job into the lock file. Upon job completion, cronHelper::unlock() deletes the lock file.

If cronHelper::lock() finds that the lock file already exists, it reads the previous cron job's process id from the lock file and checks whether that job is still running. If the previous job is still in progress, we abort the current job. If the previous job is not in progress, i.e. it died abruptly, the current cron job acquires the lock.

This is the classic method for avoiding cron overlaps. However, there are various other ways of achieving the same thing. If you know any, do let me know through your comments.
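
One such alternative is to let the operating system manage the lock with flock(). The lock is dropped when the process exits, even after a crash, so no stale-lock check is needed. Below is a minimal sketch, assuming a lock file in the system temp directory; the helper names acquire_job_lock() and release_job_lock() are hypothetical and not part of cron.helper.php.

```php
<?php
// Hypothetical helpers; the names and the lock file location are assumptions.
function acquire_job_lock($name) {
    $path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . $name . '.lock';
    $fp = fopen($path, 'c'); // create the file if missing, never truncate it
    if ($fp === false) {
        return false;
    }
    // LOCK_NB makes flock() return immediately instead of waiting.
    if (!flock($fp, LOCK_EX | LOCK_NB)) {
        fclose($fp);
        return false; // another run holds the lock
    }
    return $fp; // keep the handle open for as long as the job runs
}

function release_job_lock($fp) {
    flock($fp, LOCK_UN);
    fclose($fp);
}

$lock = acquire_job_lock('job.php');
if ($lock === false) {
    error_log("Already in progress...");
} else {
    // Cron job code goes here
    release_job_lock($lock);
}
```

The design difference is that the lock lives with the open file handle rather than with the file's existence, so a leftover lock file from a crashed run is harmless.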

How to add content verification using hmac in PHP

We often need to expose an API to a set of intended users, who can use its endpoints to GET/POST data on our servers. But how do we verify that only the intended users are using these APIs, and not a hacker or attacker? In this blog post, I will show you an elegant way of adding content verification using hash_hmac (Hash-based Message Authentication Code) in PHP. This allows us to restrict possible misuse of our API by simply issuing an API key to each intended user.

Here are the steps for adding content verification using hmac in PHP:

  • Issue a $private_key and a $public_key to each user allowed to post data through our API. You can use a method similar to the one described here for generating public and private keys.
  • Users holding these keys can now use the following sample script (hmac-sender.php) to submit data:
            // User Public/Private Keys
            $private_key = 'private_key_user_id_9999';
            $public_key = 'public_key_user_id_9999';
            // Data to be submitted
            $data = 'This is a HMAC verification demonstration';
            // Generate content verification signature
            $sig = base64_encode(hash_hmac('sha1', $data, $private_key, TRUE));
            // Prepare json data to be submitted
            $json_data = json_encode(array('data'=>$data, 'sig'=>$sig, 'pubKey'=>$public_key));
            // Finally submit to api end point
  • At hmac-receiver.php, we validate the incoming data in the following fashion:
            function get_private_key_for_public_key($public_key) {
                    // extract private key from database or cache store
                    return 'private_key_user_id_9999';
            }
            // Data submitted
            $data = $_GET['data'];
            $data = json_decode(stripslashes($data), TRUE);
            // User hit the end point API with $data, $signature and $public_key
            $message = $data['data'];
            $received_signature = $data['sig'];
            $private_key = get_private_key_for_public_key($data['pubKey']);
            $computed_signature = base64_encode(hash_hmac('sha1', $message, $private_key, TRUE));
            // hash_equals() compares in constant time, unlike ==
            if(hash_equals($computed_signature, (string) $received_signature)) {
                    echo "Content Signature Verified";
            }
            else {
                    echo "Invalid Content Verification Signature";
            }

Where to use such verification?
This is an age-old method for content verification which is widely used in a variety of applications. Below are a few places where hmac verification is a good fit:

  • If you have exposed an API for your vendors to submit requested data
  • If you are looking to enable third party applications on your website, similar to the developer application model of Facebook.
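
To see the signing and verification round trip in one place, here is a self-contained sketch that signs a message, verifies it, and then verifies a tampered copy. The function names sign_payload() and verify_payload() are hypothetical. SHA-256 is used here instead of the SHA-1 in the sender and receiver scripts; that swap is a choice made for this example, not something the original scripts do.

```php
<?php
// Hypothetical helpers for illustration; not part of hmac-sender.php or hmac-receiver.php.
function sign_payload($data, $private_key, $public_key) {
    $sig = base64_encode(hash_hmac('sha256', $data, $private_key, true));
    return json_encode(array('data' => $data, 'sig' => $sig, 'pubKey' => $public_key));
}

function verify_payload($json, $private_key) {
    $payload = json_decode($json, true);
    if (!is_array($payload) || !isset($payload['data'], $payload['sig'])
            || !is_string($payload['data']) || !is_string($payload['sig'])) {
        return false; // malformed input never verifies
    }
    $expected = base64_encode(hash_hmac('sha256', $payload['data'], $private_key, true));
    // hash_equals() compares in constant time, which avoids leaking
    // how many leading characters of the signature matched.
    return hash_equals($expected, $payload['sig']);
}

$json = sign_payload('This is a HMAC verification demonstration',
                     'private_key_user_id_9999', 'public_key_user_id_9999');
var_dump(verify_payload($json, 'private_key_user_id_9999'));

// Changing the message without knowing the private key invalidates the signature.
$tampered = json_decode($json, true);
$tampered['data'] = 'Tampered message';
var_dump(verify_payload(json_encode($tampered), 'private_key_user_id_9999'));
```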

Hope you liked the post. Do leave your comments.

How to use locks for assuring atomic operation in Memcached?

Memcached provides atomic increment and decrement commands to manipulate integer (key,value) pairs. However, special care is needed to maintain application performance and avoid race conditions while using memcached. In this blog post, I will first build a Facebook-style “like” application using the atomic increment command of memcached. I will also discuss the technical difficulties one faces while ensuring atomicity in this application. Finally, I will demo how to ensure atomicity over a whole process using custom locks in memcached.

Where should I care about it?
Let's consider a sample application, as depicted by the flow diagram below:
Facebook style "like" demo architecture using "memcached"

The above application is similar to Facebook's “like” feature. In brief, we maintain one key per post, e.g. $key="post_id_1234_likes_count", storing the count of users who liked the post. Another key, $key="post_id_1234_user_id_9999", stores the relationship of user_id_9999 with post_id_1234: for example, “liked”, which is set to 1 if the user liked the post, and “timestamp”, the time at which the user liked it.

Since this application is going to reside on a high-traffic website, an early design decision was made to put memcached in front of the MySQL database and have it act as the primary storage medium, with periodic syncs to the database. For me, like/dislike functionality is not as important as the other social functionality on my website.

Here is a sample code for the above functionality:

	<?php
	$mem = new Memcache;
	$mem->addServer("localhost", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		// get old count
		$old_count = $mem->get($key);

		// false means no one liked this post before
		if($old_count === FALSE) $old_count = 0;

		// increment count
		$new_count = $old_count+1;

		// set new count value
		if($mem->set($key, $new_count, 0, 0)) {
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}
		else {
			error_log("Error occurred in incrementing key ".$key);
			return FALSE;
		}
	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

Why should I care about it?
Save the above code sample in a file called memcached_no_lock.php and hit the url http://localhost/memcached_no_lock.php?post_id=1234 five times. Verify the key value in memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 2

Alright, the application seems to give the expected results. Next, let's see how it behaves under high traffic using Apache benchmark:

centurydaily-lm:Workspace sabhinav$ ab -n 100 -c 10 http://localhost/memcached_no_lock.php?post_id=1234
Concurrency Level:      10
Time taken for tests:   0.090 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      22200 bytes
HTML transferred:       0 bytes
Requests per second:    1112.03 [#/sec] (mean)

Verify the key value in memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 2
36
END

What happened? We expected the value of $key="post_id_1234_likes_count" to reach 100, but it is actually 36. What went wrong? This behavior can be explained by simply looking at the Apache error log file:

[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3

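OK, from the log above we can see that concurrency broke our application: $key is being set to the same value by more than one incoming request. The get() and the set() are separate operations, so several requests can read the same old count before any of them writes the new one back.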
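
The lost update can be reproduced without any real concurrency by interleaving the steps of two requests by hand. The sketch below uses a plain PHP array as a stand-in for memcached, an assumption made so that the example runs anywhere:

```php
<?php
// A plain array stands in for memcached in this illustration.
$store = array();
$key = "post_id_1234_likes_count";

// Both requests read the counter before either one writes it back.
$count_seen_by_a = isset($store[$key]) ? $store[$key] : 0;
$count_seen_by_b = isset($store[$key]) ? $store[$key] : 0;

// Each request writes "what I read + 1", so the second write
// overwrites the first instead of adding to it.
$store[$key] = $count_seen_by_a + 1;
$store[$key] = $count_seen_by_b + 1;

echo "likes after two requests: " . $store[$key] . PHP_EOL; // prints 1, not 2
```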

How should I take care of this?
Below is the modified code sample which will allow us atomic increments:

	<?php
	$mem = new Memcache;
	$mem->addServer("localhost", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		// increment() is atomic, but returns FALSE if the key does not exist yet
		$new_count = $mem->increment($key, 1);
		if($new_count === FALSE) {
			// add() returns TRUE on success, FALSE if the key already exists
			$new_count = $mem->add($key, 1, 0, 0);
			if($new_count === FALSE) {
				error_log("Someone raced us for first count on key ".$key);
				$new_count = $mem->increment($key, 1);
				if($new_count === FALSE) {
					error_log("Unable to increment key ".$key);
					return FALSE;
				}
				else {
					error_log("Incremented key ".$key." to ".$new_count);
					return TRUE;
				}
			}
			else {
				error_log("Initialized key ".$key." to ".$new_count);
				return TRUE;
			}
		}
		else {
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}
	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

To ensure atomicity, we start by incrementing $key="post_id_1234_likes_count". Since memcached's increment() is atomic by itself, we need no locking mechanism here. However, increment() returns FALSE if $key doesn’t already exist.

Hence, if we get FALSE from the first increment, we try to initialize $key using the memcached add() command. The good thing about add() is that it returns FALSE if $key is already present. So if more than one thread tries to initialize $key, only one of them will succeed; add() returns FALSE for all the others. Finally, if add() returns FALSE, we try to increment $key again.
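
The same increment, add, increment sequence can be tried without a server. The CounterStore class below is a hypothetical in-memory stand-in that mimics only the two behaviours the explanation relies on: increment() returns FALSE for a missing key, and add() returns FALSE for an existing one. It does not simulate concurrency.

```php
<?php
// Hypothetical stand-in mimicking the relevant memcached semantics.
class CounterStore {
    private $data = array();

    public function increment($key, $by = 1) {
        if (!isset($this->data[$key])) return false; // missing key
        $this->data[$key] += $by;
        return $this->data[$key];
    }

    public function add($key, $value) {
        if (isset($this->data[$key])) return false;  // key already present
        $this->data[$key] = $value;
        return true;
    }
}

function increment_like_count($store, $key) {
    $count = $store->increment($key, 1);
    if ($count !== false) return $count;   // common case: the key exists
    if ($store->add($key, 1)) return 1;    // we initialised the key
    return $store->increment($key, 1);     // someone else initialised it first
}

$store = new CounterStore();
for ($i = 0; $i < 5; $i++) {
    $last = increment_like_count($store, "post_id_1234_likes_count");
}
echo $last . PHP_EOL; // prints 5
```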

Let's test this modified code with Apache benchmark. This time we will also increase concurrency from 10 to 100. Save the above modified code in a file called memcached_lock.php and issue the following ab command:

centurydaily-lm:Workspace sabhinav$ ab -n 10000 -c 100 http://localhost/memcached_lock.php?post_id=1234
Concurrency Level:      100
Time taken for tests:   11.006 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      2224884 bytes
HTML transferred:       0 bytes
Requests per second:    908.61 [#/sec] (mean)

Let's verify the key value inside memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 5
10000
END

Bingo! As desired we have a value of 10000 for $key inside memcached.

Using custom locks for atomicity:
There are many situations where you SHOULD process a request atomically using locks, for example while fetching the result of a query from the database, or while regenerating a requested page template in your custom template caching engine.

In the example below, I modify the memcached_lock.php script to ensure atomic increments without using the increment command. Instead, I use custom locks built on memcached:

	<?php
	$mem = new Memcache;
	$mem->addServer("localhost", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		// initialize lock
		$lock = FALSE;

		// initialize configurable parameters
		$tries = 0;
		$max_tries = 1000;
		$lock_ttl = 10;

		$new_count = $mem->get($key); // fetch older value
		while($lock === FALSE && $tries < $max_tries) {
			if($new_count === FALSE) $new_count = 0;
			$new_count = $new_count + 1;

			// add() will return false if someone raced us for this lock
			// ALWAYS USE add() FOR CUSTOM LOCKS
			// the lock name includes $key so that different posts do not share locks
			$lock = $mem->add("lock_".$key."_".$new_count, 1, 0, $lock_ttl);

			$tries++; // count the attempt so that the loop ends after $max_tries
			usleep(100*($tries%($max_tries/10))); // sleep grows with each try, wrapping every $max_tries/10 tries
		}

		if($lock === FALSE && $tries >= $max_tries) {
			error_log("Unable to increment key ".$key);
			return FALSE;
		}
		else {
			$mem->set($key, $new_count, 0, 0);
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}
	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

Try testing it using apache benchmark as above and then verify it with memcached.

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 3

We see a drop in performance from 1112 hits/sec (memcached_no_lock) to 908 hits/sec (memcached_lock using increment). This is mainly because of the increased concurrency: at the same concurrency level of 10, the increment-based code reached 1128 hits/sec. The custom lock code above, however, managed only 275 hits/sec.

Always use memcached increment/decrement when you need atomic updates on integer-valued keys. To lock a whole process, use custom locks built on the memcached add command, as demoed above. Keep in mind that custom locks depend on tunable options such as $max_tries and $lock_ttl.
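
For locking a whole operation rather than a counter, the usual shape is: acquire the lock with add(), do the work, then delete the lock key. The sketch below assumes a hypothetical LockStore stand-in and a with_lock() helper, both written for this example so that it runs without a server. Against a real server the same calls would be Memcache::add() and Memcache::delete().

```php
<?php
// Hypothetical in-memory stand-in exposing add()/delete() like a memcached client.
class LockStore {
    private $keys = array();

    public function add($key, $value, $ttl) {
        $now = time();
        if (isset($this->keys[$key]) && $this->keys[$key] > $now) return false;
        $this->keys[$key] = $now + $ttl; // the lock expires on its own after $ttl
        return true;
    }

    public function delete($key) {
        unset($this->keys[$key]);
        return true;
    }
}

// Runs $work while holding the lock; returns FALSE if the lock was never acquired.
function with_lock($store, $name, callable $work, $max_tries = 10, $lock_ttl = 10) {
    for ($tries = 0; $tries < $max_tries; $tries++) {
        if ($store->add("lock_" . $name, 1, $lock_ttl)) {
            try {
                return $work();
            } finally {
                $store->delete("lock_" . $name); // always release the lock
            }
        }
        usleep(100 * ($tries + 1)); // wait a little longer on each retry
    }
    return false;
}

$store = new LockStore();
$result = with_lock($store, "regenerate_template", function () {
    return "rebuilt";
});
var_dump($result);
```

The TTL on the lock key is the safety net: if a process dies while holding the lock, the key expires and other requests can proceed.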

Hope you enjoyed reading.
Do let me know through your comments.

PHP tokens and opcodes : 3 useful extensions for understanding the working of Zend Engine

“PHP tokens and opcodes” – When a PHP script is executed, it goes through a number of stages before the final result is displayed: lexing, parsing, compiling and executing. In this blog post, I will walk you through all of these stages with a sample script. At the end, I will list some useful PHP extensions that can be used to analyze the result of each intermediate stage.

Let's take a sample PHP script as an example:

	<?php
	function increment($a) {
		return $a+1;
	}
	$a = 3;
	$b = increment($a);
	echo $b;

Try running this script through command line:

~ sabhinav$ php debug.php

This PHP script goes through the following processes before outputting the result:

  • Lexing: The PHP code inside debug.php is converted into tokens
  • Parsing: The tokens are processed to arrive at meaningful expressions
  • Compiling: The derived expressions are compiled into opcodes
  • Execution: The opcodes are executed to arrive at the final result

Let's see how a PHP script passes through all of the above steps.

Lexing:
During this stage, the human-readable PHP script is converted into tokens. For the first two lines of our PHP script:

	<?php
	function increment($a) {

tokens will look like this (try to match the tokens below line by line with the above 2 lines of PHP code and you will get a feel):

~ sabhinav$ php -r 'print_r(token_get_all(file_get_contents("debug.php")));';
    [0] => Array
        (
            [0] => 368             // 368 is the token number and its symbolic name is T_OPEN_TAG, see below
            [1] => <?php

            [2] => 1
        )

    [1] => Array
        (
            [0] => 371
            [1] =>
            [2] => 2
        )

    [2] => Array
        (
            [0] => 334
            [1] => function
            [2] => 2
        )

    [3] => Array
        (
            [0] => 371
            [1] =>
            [2] => 2
        )

    [4] => Array
        (
            [0] => 307
            [1] => increment
            [2] => 2
        )

    [5] => (
    [6] => Array
        (
            [0] => 309
            [1] => $a
            [2] => 2
        )

    [7] => )
    [8] => Array
        (
            [0] => 371
            [1] =>
            [2] => 2
        )

    [9] => {
    [10] => Array
        (
            [0] => 371
            [1] =>

            [2] => 2
        )

A list of parser tokens can be found in the PHP manual.

Every token number has a symbolic name attached to it. The command below prints our PHP script with the human-readable code replaced by the symbolic name of each generated token (single-character tokens such as ( and { are printed as they are):

~ sabhinav$ php -r '$tokens = token_get_all(file_get_contents("debug.php")); foreach($tokens as $token) { echo is_array($token) ? token_name($token[0])." " : $token." "; }'
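
The one-liner is easier to follow as a small script. This sketch tokenizes an inline copy of the first lines of the sample instead of reading debug.php (an assumption made so that it runs on its own), and looks names up with token_name() so that it does not depend on the numeric token ids, which differ between PHP versions.

```php
<?php
// Inline copy of the start of the sample script, so no debug.php file is needed.
$source = "<?php\nfunction increment(\$a) {\n\treturn \$a+1;\n}\n";

$names = array();
foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // [0] is the token id, [1] the source text, [2] the line number
        $names[] = token_name($token[0]);
    } else {
        // Single characters such as ( ) { } ; come back as plain strings.
        $names[] = $token;
    }
}
echo implode(' ', $names) . PHP_EOL;
```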

Parsing and Compiling:
By generating the tokens in the above step, the Zend engine is able to recognize each and every detail in the script: where the spaces are, where the new line characters are, where a user-defined function is, and so on. Over the next two stages, the generated tokens are parsed and then compiled into opcodes. Below are the compiled opcodes for our complete sample script:

~ sabhinav$ php -r '$op_codes = parsekit_compile_file("debug.php", $errors, PARSEKIT_SIMPLE); print_r($op_codes); print_r($errors);';
    [3] => ZEND_ASSIGN T(0) T(0) 3
    [6] => ZEND_SEND_VAR UNUSED T(0) 0x1
    [7] => ZEND_DO_FCALL T(1) 'increment' 0x83E710CA
    [9] => ZEND_ASSIGN T(2) T(0) T(1)
    [function_table] => Array
            [increment] => Array
                    [0] => ZEND_EXT_NOP UNUSED UNUSED UNUSED
                    [1] => ZEND_RECV T(0) 1 UNUSED
                    [2] => ZEND_EXT_STMT UNUSED UNUSED UNUSED
                    [3] => ZEND_ADD T(0) T(0) 1
                    [4] => ZEND_RETURN UNUSED T(0) UNUSED
                    [5] => ZEND_EXT_STMT UNUSED UNUSED UNUSED
                    [6] => ZEND_RETURN UNUSED NULL UNUSED


    [class_table] =>

As we can see above, the Zend engine is able to recognize the flow of our PHP script. For instance, [3] => ZEND_ASSIGN T(0) T(0) 3 is a replacement for $a = 3; in our PHP code. Read on to understand what the T(0) in the opcodes means.

Executing the opcodes:
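The generated opcodes are executed one by one. The listing below, produced by vld, shows the details of each opcode in execution order (vld.execute=0 tells vld to dump the opcodes without running the script):

~ sabhinav$ php -d vld.active=1 -d vld.execute=0 -f debug.php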
Branch analysis from position: 0
Return found
filename:       /Users/sabhinav/Workspace/interview/facebook/peaktraffic/debug.php
function name:  (null)
number of ops:  13
compiled vars:  !0 = $a, !1 = $b
line     #  op                           fetch          ext  return  operands
   2     0  EXT_STMT
         1  NOP
   5     2  EXT_STMT
         3  ASSIGN                                                   !0, 3
   6     4  EXT_STMT
         5  EXT_FCALL_BEGIN
         6  SEND_VAR                                                 !0
         7  DO_FCALL                                      1          'increment'
         8  EXT_FCALL_END
         9  ASSIGN                                                   !1, $1
   7    10  EXT_STMT
        11  ECHO                                                     !1
   8    12  RETURN                                                   1

Function increment:
Branch analysis from position: 0
Return found
filename:       /Users/sabhinav/Workspace/interview/facebook/peaktraffic/debug.php
function name:  increment
number of ops:  7
compiled vars:  !0 = $a
line     #  op                           fetch          ext  return  operands
   2     0  EXT_NOP
         1  RECV                                                     1
   3     2  EXT_STMT
         3  ADD                                              ~0      !0, 1
         4  RETURN                                                   ~0
   4     5* EXT_STMT
         6* RETURN                                                   null

End of function increment.

The first table represents the main script, while the second table represents the user-defined function in the PHP script. The line compiled vars: !0 = $a tells us that, internally, !0 stands for $a during script execution, so we can now relate it to [3] => ZEND_ASSIGN T(0) T(0) 3 from the parsekit output.

The listing also reports the number of operations (number of ops: 13), which can be used for benchmarking and tuning the performance of your PHP script.

If APC is enabled, it caches the opcodes, thereby avoiding repeated lexing, parsing and compiling every time the same PHP script is called.

3 PHP extensions providing interface to Zend Engine:
Below are 3 very useful PHP extensions for geeky PHP developers. (Especially helpful for PHP extension developers)

  • Tokenizer: The tokenizer functions provide an interface to the PHP tokenizer embedded in the Zend Engine. Using these functions you may write your own PHP source analyzing or modification tools without having to deal with the language specification at the lexical level.
  • Parsekit: These parsekit functions allow runtime analysis of opcodes compiled from PHP scripts.
  • Vulcan Logic Disassembler (vld): Provides functionality to dump the internal representation of PHP scripts. See the homepage of the VLD project for download instructions.

Hope this is of some help for PHP geeks out there.

Writing a custom unix style tail in PHP using Libevent API on Mac OS X 10.5.x and other platforms

Libevent is a library which provides a mechanism to execute a callback function when a specific event occurs on a file descriptor or after a timeout has been reached. Many famous applications/frameworks/libraries like memcached are using libevent. In this blog post, I will demonstrate how to write a custom unix style tail script using Libevent API in PHP.

Setting up the environment:
Setting up libevent with PHP is a little tricky. Below are the steps I followed to make it work on Mac OS X 10.5. However, the steps should be the same for any other OS you choose to code on. Here we go:

  1. Check the version of libevent installed on your system. If you don’t have libevent or the installed version is < 1.4, you will need to compile libevent-1.4.x
    ~ sabhinav$ port list | grep libevent
    libevent                       @1.4.12         devel/libevent
  2. Uninstall existing libevent
    ~ sabhinav$ port uninstall libevent
  3. Add the following into your .bash_profile file:
    export CFLAGS="-arch x86_64 -g -Os -pipe -no-cpp-precomp"
    export CCFLAGS="-arch x86_64 -g -Os -pipe"
    export CXXFLAGS="-arch x86_64 -g -Os -pipe"
    export LDFLAGS="-arch x86_64 -bind_at_load"
  4. Open a new terminal window. Download and extract libevent-1.4.12-stable
    ~ sabhinav$ wget
    ~ sabhinav$ tar -xvzf libevent-1.4.12-stable.tar.gz
  5. Compile libevent-1.4.12-stable
    ~ sabhinav$ cd libevent-1.4.12-stable
    ~ sabhinav$ ./configure
    ~ sabhinav$ make
    ~ sabhinav$ sudo make install
  6. Assuming you have a successful installation, let's install PECL package libevent-0.0.2.
    ~ sabhinav$ pecl download libevent-0.0.2
    ~ sabhinav$ tar -xvzf libevent-0.0.2.tgz libevent-0.0.2
    ~ sabhinav$ cd libevent-0.0.2
    ~ sabhinav$ phpize
    ~ sabhinav$ ./configure
    ~ sabhinav$ make
    ~ sabhinav$ sudo make install
  7. Enable libevent extension in your php.ini
  8. Reload apache server
    ~ sabhinav$ sudo apachectl restart
  9. Confirm we have libevent extension enabled using phpinfo(); or
    ~ sabhinav$ php -i | grep libevent

Writing a custom unix style tail script in PHP (tail.php)
Below is a sample script which can be used as a base for writing a custom unix style tail script. Comments in the code will help you understand its flow. Also have a look at the official documentation for the PHP Libevent extension.


	<?php

	// callback function called whenever the registered event is triggered
	function eventFd($fd, $events, $arg) {
		echo fread($fd, 4096);
	}

	// create event base
	$base_fd = event_base_new();

	// create a new event
	$event_fd = event_new();

	// resource to be monitored
	$fd = fopen($argv[1], 'r');

	// set event on passed file name
	event_set($event_fd, $fd, EV_WRITE | EV_PERSIST, 'eventFd', array($event_fd, $base_fd));

	// associate base with this event
	event_base_set($event_fd, $base_fd);

	// register event
	event_add($event_fd);

	// start event loop
	event_base_loop($base_fd);


Trying out tail.php
Save the above code as tail.php and issue the following on the terminal:

~ sabhinav$ php tail.php /var/log/apache2/access_log

Try accessing a page on your webserver and you should see the access log being tailed by the php script. 😀


Web Security : Using crumbs to protect your PHP API (Ajax) call from Cross-site request forgery (CSRF/XSRF) and other vulnerabilities

Have your API calls ever been used directly by someone without your permission? If yes, read on to find out how we can protect our APIs from such spammers and hackers. Before we go ahead and see a possible solution, let's list a few cases in which our APIs can be accessed without our permission.

Common cases of vulnerable API/Ajax calls

  • Ajax calls having no user authentication: This is the first place where a spammer will look for a loophole. Take this example: suppose I created a group chat plugin for my blog. Since it’s a group chat plugin, I don’t really want blog viewers to register before they can write a message. Blog viewers only need to provide their name, email and url (just like wordpress comments). Thereafter, they can write messages, which are submitted to the server using ajax calls. And here is the “problem”. Anyone can pick up the ajax url, write a curl script, post the required parameters and fill up my database with millions of messages.
  • Ajax calls having user authentication: One day I realize my group chat plugin received more than 1 million messages last night (all spam), simply because someone is filling up my database by simulating ajax calls through a curl script. Anyone can write such a script, since these ajax calls do not authenticate the user making the call. Hence I decide to make my blog viewers register before they can post a message on the group chat plugin. But are my ajax calls safe after forcing users to register? NO, a registered user can still simulate these ajax calls and pass authentication by sending the right cookies.

Possible solutions and their flaws
If you look around on the web, you will find a bunch of solutions to such problems. But every solution has its own problems, which may stop you from using it. Listed below are 2 possible solutions to our problem:

  • Using X-Requested-With to protect ajax calls: All famous javascript frameworks like jQuery, YUI, Mootools etc. send an additional header while making an XHR request. These libraries set the “X-Requested-With: XMLHttpRequest” header, which can then be used on the server side to detect whether the call was made through ajax. But a programmer can easily send this header from a curl script, making the server believe that the call was made through an XHR request.
  • Using HTTP Referrer: This solution comes in handy when a spammer/hacker tries to POST data to your site. We can check the referrer page before we go ahead and accept the POST data. If the POST data is coming from a page within your site, you go ahead and accept the data, otherwise you reject it. But this solution again has its shortcomings. The HTTP Referrer can be tampered with in certain browsers using javascript, and it can also be stripped away by some proxies and firewalls.
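
To make the weakness of both checks concrete, here is a minimal sketch of what they typically look like on the server side. The function names and the $server parameter (a stand-in for $_SERVER) are assumptions made for this example.

```php
<?php
// Hypothetical helper: does the request claim to be an XHR call?
// $server stands in for $_SERVER so the function can be called directly.
function claims_to_be_xhr(array $server) {
    return isset($server['HTTP_X_REQUESTED_WITH'])
        && strtolower($server['HTTP_X_REQUESTED_WITH']) === 'xmlhttprequest';
}

// Hypothetical helper: does the Referer header point at our own host?
function referer_is_same_host(array $server, $ourHost) {
    if (empty($server['HTTP_REFERER'])) {
        return false; // stripped by a proxy, or never sent
    }
    return parse_url($server['HTTP_REFERER'], PHP_URL_HOST) === $ourHost;
}

// Both values are plain request headers supplied by the client,
// so a script can set them to whatever the server expects.
var_dump(claims_to_be_xhr(array('HTTP_X_REQUESTED_WITH' => 'XMLHttpRequest')));
```

Neither function can tell a browser from a script that sends the same headers, which is why these checks only filter out the laziest attackers.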

Using crumbs
Finally, the idea is to have crumbs: unique electronic keys which are shared between server and client and which have a short life time. But how are these useful? Suppose that in my group chat module I generate a crumb upon page load whose life time is 30 minutes (tunable). Why 30 minutes? Because I assume my blog viewers will either engage with the group chat module or leave that specific blog post within 30 minutes.

Now whenever a user writes a message, this crumb is passed back to the server. If the user writes a message within 30 minutes, the crumb is validated and the user's message is submitted. (30 minutes should take care of 99.99% of the cases.) In response, the server api sends back a new crumb, which should be sent with the next ajax call.

Now when a spammer tries to simulate the ajax request using curl calls, he will not succeed because of the absence of the crumb. But he can capture the crumb from the site and simulate the effect, right? YES he can, but we can take care of this by reducing the life time of the generated crumb.

Generating crumbs using PHP
Here are the functions I use to generate and verify crumbs in PHP:

        // user for whom crumb is to be generated
        $uid = "[email protected]";

        // usually $salt = DB_PASSWORD . DB_USER . DB_NAME . DB_HOST . ABSPATH;
        $salt = "abcdefghijklmnopqrstuvwxyz";

        // keyed hash of the data, so crumbs cannot be forged without the salt
        function challenge($data) {
                global $salt;
                return hash_hmac('md5', $data, $salt);
        }

        function issue_crumb($ttl, $action = -1) {
                global $uid;

                // ttl: index of the current time slot of length $ttl seconds
                $i = ceil(time() / $ttl);

                // log
                echo "Generating crumb at time:".time().", i:".$i.", action:".$action.", uid:".$uid.PHP_EOL;

                // return crumb
                return substr(challenge($i . $action . $uid), -12, 10);
        }

        function verify_crumb($ttl, $crumb, $action = -1) {
                global $uid;

                // ttl: index of the current time slot of length $ttl seconds
                $i = ceil(time() / $ttl);

                // log
                echo "Verifying crumb:".$crumb." at time:".time().", i:".$i.", action:".$action.", uid:".$uid.PHP_EOL;

                // verify crumb: accept the current time slot and the previous one
                if(substr(challenge($i . $action . $uid), -12, 10) == $crumb || substr(challenge(($i - 1) . $action . $uid), -12, 10) == $crumb)
                        return true;
                return false;
        }

I can generate crumbs with a simple call:

$crumb = issue_crumb(300, "group_chat_module");

where $ttl = 300 (required), $action = “group_chat_module” (optional, defaults to -1)

Later on I can verify the crumb using another call:

var_dump(verify_crumb(300, $crumb, "group_chat_module"));
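
Because verification accepts the current time slot and the previous one, a crumb stays valid for somewhere between $ttl and just under 2 × $ttl seconds, depending on where in the slot it was issued. The sketch below shows this with variants of the two functions that take the clock, uid and salt as arguments instead of using time() and globals. The names crumb_at() and crumb_valid_at() are made up for this example, and I have swapped the == comparison for hash_equals() (PHP 5.6+), which compares in constant time.

```php
<?php
// Hypothetical clock-injected variant of issue_crumb().
function crumb_at($now, $ttl, $action, $uid, $salt) {
    $i = (int) ceil($now / $ttl);
    return substr(hash_hmac('md5', $i . $action . $uid, $salt), -12, 10);
}

// Hypothetical clock-injected variant of verify_crumb().
function crumb_valid_at($now, $ttl, $crumb, $action, $uid, $salt) {
    $i = (int) ceil($now / $ttl);
    // accept the current slot and the one before it
    foreach (array($i, $i - 1) as $slot) {
        $expected = substr(hash_hmac('md5', $slot . $action . $uid, $salt), -12, 10);
        if (hash_equals($expected, $crumb)) {
            return true;
        }
    }
    return false;
}

// Issued at t=1000 with ttl=300, i.e. in slot 4 (901..1200)
$crumb = crumb_at(1000, 300, 'group_chat_module', 'user@example.com', 'salt');

// Still accepted at t=1500 (slot 5), rejected from t=1501 (slot 6) onwards
var_dump(crumb_valid_at(1500, 300, $crumb, 'group_chat_module', 'user@example.com', 'salt'));
var_dump(crumb_valid_at(1501, 300, $crumb, 'group_chat_module', 'user@example.com', 'salt'));
```

So if you want a crumb to live for at most N seconds, pass a $ttl of about N / 2.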

I hope this helps you protect your APIs. Let me know of better methods to stop such attacks.

5 exciting (gaming) bots you can create using Jaxl (Jabber XMPP Library) in PHP

Jaxl is an open source XMPP client library written in PHP. The object oriented structure of JAXL allows developers to build various extensions using the Jaxl library as their base. If used intelligently, the JAXL client library is capable of doing more than just chat message transfers. Here are a few applications where developers have tried using JAXL for delivering more than just chat messages:

“I used your library to develop a prototype that connects dynamically some users to a XMPP server if an external event is detected. The script runs like a daemon. Because of your object-oriented class design it was very easy to set up dynamic number of parallel XMPP sessions. I would like to use your library as part of a software that integrates telephony and XMPP functionality. The software will also be licenced unter GPL. Thanks again for your great work.”

“I’m thinking on creating a symfony plugin for jaxl library. I’ve worked before with the Jabber php library. This one of yours is much nicer! And it is working really nice, good job! “

The possibilities are endless. In this blog post I will discuss various possible use cases of the JAXL client library: creating a 24×7 online chat bot, broadcasting messages to your gtalk friend list, an rss feed aggregator, a custom out-of-office email bot for gmail and google apps users, and finally a simple game (similar to the anagram gaming bot [email protected]).

Setting up the environment
Jaxl is hosted on Google Code. Checkout the latest version of JAXL client library:

svn checkout jaxl-read-only

Alternately you can download the latest version of Jaxl from here:

From here on I will assume you have all the library files in a folder called jaxl. You should see the following set of php files inside the jaxl folder:

  • config.ini.php : Holds your jabber account and mysql connection information
  • mysql.class.php : Basic MySQL connection class used to insert received messages and presence into MySQL database
  • logger.class.php : A very basic logger class which allows you to log all XML stanzas sent to and received from the jabber server
  • xmpp.class.php : Base XMPP class library which implements the XMPP protocol
  • jaxl.class.php : JAXL class which extends XMPP class library. Should be the starting point for your application
  • index.php : After building your application in jaxl.class.php, you finally initialize and call your methods here

You will also see a bunch of other php files: jaxl4broadcast.class.php, jaxl4gmail.class.php, jaxl4dzone.class.php, which are extensions written using the Jaxl library as their base. We will discuss them all as we proceed through this post.

Another point I would like to discuss before we go ahead is the structure of the Jaxl client library. The XMPP class, written in xmpp.class.php, implements the XMPP protocol. It takes care of user authentication, user presence (available, busy, idle), user status, sending and receiving messages, and everything else we will see as we proceed. The JAXL class in jaxl.class.php extends the XMPP class. We will develop all our bots/applications in jaxl.class.php and will never need to touch the base xmpp.class.php file. Finally, index.php is the file which invokes our application written in jaxl.class.php. As a convention, we rename jaxl.class.php to jaxl4app.class.php, where app is the name of our application.

The XMPP class defined in xmpp.class.php passes the program handle to various methods whenever an event occurs. The following 4 methods are of use to us while developing an application inside jaxl4app.class.php:

  • eventMessage($fromJid, $content, $offline = FALSE) : This is the method where XMPP class passes the handle, when it receives a message. The message can be either online or offline, which is indicated by the $offline parameter being passed to this method. Other two parameters received are $fromJid which is the jabber id of the user sending the message. e.g. [email protected], and $content which is the actual message sent by the user identified by $fromJid
  • eventPresence($fromJid, $status, $photo) : This is the method where XMPP class passes the handle, when it receives a presence i.e. notification about status change of a user. A change in status event is triggered by either user changing his status text or by user changing his online presence i.e. available, idle, busy. $fromJid is the parameter passed to this method which is the jabber id of the user who changed his status. $status is the new status set by the user.
  • eventNewEMail($total, $thread, $url, $participation, $messages, $date, $senders, $labels, $subject, $snippet) : The XMPP library also implements the Gmail Notification extension for the XMPP protocol, and passes the handle to the eventNewEMail() method whenever a new email is received on Gmail or Google Apps mail. A real life example of Gmail Notification can be seen in Gtalk, which pops up a window whenever you receive a new email on Gmail. For more detail about the various parameters passed to this method, refer to the Gmail Notification documentation.
  • setStatus() : The XMPP class passes the handle to this method before setting the status of the bot/application we intend to run using the Jaxl library. Customize this method to set custom status messages on logon or anytime during the execution of the application.
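
To give a feel for how these hooks are used, here is a minimal sketch of an echo bot. So that it runs on its own, it declares a tiny stand-in XMPP class that only records outgoing messages; the real base class in xmpp.class.php talks to the jabber server instead. The $outbox property and the reply text are assumptions made for this example.

```php
<?php
// Stand-in for the real XMPP base class (assumption for this sketch):
// it only records what would have been sent.
class XMPP {
    public $outbox = array();

    function sendMessage($jid, $message) {
        $this->outbox[] = array($jid, $message);
    }
}

class JAXL extends XMPP {
    // Called when a message arrives: echo online messages back to the sender
    function eventMessage($fromJid, $content, $offline = FALSE) {
        if ($offline) {
            return; // ignore messages queued while the bot was offline
        }
        $this->sendMessage($fromJid, 'You said: ' . trim($content));
    }

    // Called when a contact changes presence; nothing to do in this sketch
    function eventPresence($fromJid, $status, $photo) {
    }
}

$bot = new JAXL();
$bot->eventMessage('friend@example.com', ' hello ');
print_r($bot->outbox);
```

In a real bot you would drop the stand-in class, include xmpp.class.php, and let index.php start the session.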

In 99.99% of the applications we intend to build, we will not need a handle to the other events, which happen in the background and are handled by the base XMPP class. For more information about all the events, refer to this blog post: Behind the scenes – How and What XML’s are exchanged by JAXL

Jaxl also lets you switch your bot between production and development environments with a simple change in the config.ini.php file. The $env parameter (allowed values are “prod” or “devel”) decides which environment the bot runs in. Set $env to “devel” while you are developing your bot/application and don’t want to connect to a production jabber server repeatedly. Setting $env to “prod” configures the bot/application to connect to the production jabber server.

1. My first bot: Creating a 24×7 online status aggregation bot
This was how I started working on JAXL. I wanted to collect the status messages of all my gtalk friends. I also wanted to plot on a graph when and which of my gtalk friends come online or go offline. A basic example of this graph can be seen on my timeline at Gtalkbots.

The good thing about the Jaxl library is that it comes with a built-in bot capable of doing the above tasks for us. (This is how I started working on Jaxl, hence it is the default behaviour.) Here is how you can configure Jaxl for the same:

  • Choosing an environment: Open config.ini.php and choose an environment. Since we want to collect information about all our friends on gtalk, we will set $env="prod". This will allow our bot to connect to the jabber servers hosted at (see config file). Also set $logDB = TRUE;, which will enable logging of user information to the MySQL database.
  • Updating user credentials: Register a username at Gmail (if you don’t have one already) and update the username and password of this user in the config file. Our bot will use these credentials to authenticate with the google talk servers. Add a few friends using gtalk to start with. Also update your MySQL database hostname, username and password. Leave the database name as jaxl.
  • Creating jaxl database: Run the database.sql file against your MySQL database. It will create a database called jaxl with two tables called message and presence. Our application will use these tables for storing information about our gtalk friends.
  • Creating the bot: jaxl.class.php by default is ready to work as we want it to. It will log all your gtalk friend’s information in the MySQL database if $logDB is set to TRUE inside the config.ini.php file. Also by default it replies back a welcome message for every message received (online or offline). You may want to un-comment that section of the code as of now.
  • Running the bot: Open the command line (windows) or a terminal window (unix, mac) and change to the jaxl folder. (Remember you cannot run your application with something like http://localhost/jaxl in your browser. XMPP is a TCP/IP level protocol and is not made for running directly over HTTP, i.e. in browsers. However, we can use the BOSH extension of the XMPP protocol to make it run over HTTP. Jaxl currently doesn’t support the BOSH extension; it is under testing.) Now simply run the following command on the terminal:
    sudo php index.php

    You should see something like this on your terminal: jaxl-jabber-xmpp-library-demo-1

    To debug more during development, you should enable logging in the config file. The Jaxl library will then log every xmpp stanza sent to or received from the google talk servers. This is also a good way of learning more about the internals of the xmpp protocol.

  • Running the bot 24×7: One of the most common queries I get (especially from college enthusiasts) is how to run an application 24×7, just like the anagram gaming bot at gtalkbots ([email protected]). Run the following command on your terminal to run your application as a background process (only possible on unix or osx, not on windows):
    sudo nohup php index.php > log/logger.log &

    This will start the bot as a background process, so the bot will not stop executing even if you close the terminal window. When you want to kill your application, simply search for its process id using:

    ps aux | grep index.php

    Note the process id corresponding to your application and issue

    kill -9 process_id

    to kill the application.

2. Broadcasting messages to your gtalk friend list
You might want to broadcast a message to your gtalk buddy list for a number of reasons. The extension jaxl4broadcast.class.php does exactly that for us. To run this extension you should checkout the latest xmpp.class.php file from the repository. Jaxl v 1.0.4 doesn’t support Gmail Extension. Leave your configuration file as it is from the previous example. Simply include jaxl4broadcast.class.php in your index.php file instead of jaxl.class.php. Alternatively, you might want to create a separate index file for each application you build using Jaxl.

Now simply type in

sudo php index.php

on the terminal window. You will see the action logs on the terminal window.
The script broadcasts the default message to everyone on the gtalk friend list. In the screen shot you can also see me receiving the default message sent by our bot. If any of the friends are offline, the bot sends an offline message to them.

3. RSS feed aggregator bot
Here we will try to make a bot which keeps processing RSS feeds in the background. We will also make provisions in this application to retrieve aggregated RSS feed results just by sending simple text messages to the bot. One such application built using the Jaxl library is the RSS feed integrator for Dzone. You can find this application in the jaxl folder if you have checked out the code from the repository. Otherwise download the application from here.

Leave your configuration file as it is from previous applications and instead of jaxl.class.php, include jaxl4dzone.class.php inside index.php. As before simply run the following command on the terminal window:

nohup php index.php > log/logger.log &


Read How to get dzone feeds as IM using JAXL? Add [email protected] for the pre-requisites required before running this application and a complete list of provisions made for retrieving rss feeds from this running bot. Here is a response from the bot when I send it a message reading “php”:
The application, running in the background, checks the incoming message. It then looks for a cached RSS feed in the cache folder. If the cache is stale or expired, it refetches the RSS feed from Dzone and sends back the results as seen in the screenshot above. If the bot finds a fresh cache of the RSS feed in the cache directory, it simply sends back that cached feed.
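
The cache check described above can be sketched as follows. This is not the code from jaxl4dzone.class.php; the function names, the file-modification-time freshness rule and the $fetch callback are assumptions made for this example.

```php
<?php
// Hypothetical helper: a cache file is fresh if it exists
// and was written less than $maxAge seconds ago.
function cache_is_fresh($path, $maxAge, $now = null) {
    clearstatcache(); // make sure we see the file's current mtime
    if (!is_file($path)) {
        return false;
    }
    $now = ($now === null) ? time() : $now;
    return ($now - filemtime($path)) < $maxAge;
}

// Hypothetical helper: return the cached content if it is fresh,
// otherwise call $fetch, store its result and return it.
function cached_fetch($path, $maxAge, $fetch) {
    if (cache_is_fresh($path, $maxAge)) {
        return file_get_contents($path);
    }
    $content = call_user_func($fetch);
    file_put_contents($path, $content);
    return $content;
}
```

Here $fetch would be whatever downloads the feed, and $path a file inside the cache folder, for example one file per search keyword.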

4. Custom out-of-office email bot for gmail
jaxl4gmail.class.php is an extension which shows the power of the Gmail Extension integration in the Jaxl library. This extension allows you to send custom out-of-office emails to your contacts. For example, I might want to send one custom out-of-office mail to my colleagues in the office and a different one to my friends and family.

Download the extension from here, if not already present in the jaxl directory. Include jaxl4gmail.class.php in index.php. Finally run the bot using:

nohup php index.php > log/logger.log &

For detailed information on how to customize this extension, refer to this blog post: Programatically control your google mails using JAXL v 1.0.4

5. Building an online multi-user gaming bot
By now we know how to run our bot 24×7 using the Jaxl library. In this section we will develop a basic online multi-user gaming bot. Before we go ahead and code our bot, let's decide on a few rules for our game:

Users worldwide can add our bot as buddy in gtalk (or using any other IM client). Below are the rules and actions a user can perform:

  • Send a message “start” to enter the multi-user gaming arena.
  • Send a message “stop” to exit the gaming arena.
  • Send a message “options” to view available options for the gaming arena.
  • Any other message sent by the user is considered his answer to the previously broadcast question. We will make sure “start”, “stop” and “options” are not the answer to any of the questions being broadcast.
  • Whoever sends the right answer to the broadcast question receives 5 points. The bot immediately notifies everyone in the arena that the right answer has been received. Thereafter, the bot broadcasts the next question to all the users in the arena.
  • The bot reads a list of questions from a file or database as soon as it is started. Thereafter, it keeps reading questions from the list randomly and broadcasting them to users in the arena.
  • Bot keeps a track of jabber id for incoming (identified by “start”) and outgoing (identified by “stop”) users.
  • Bot also maintains the index of current question being broadcasted

We will code our application in a file called jaxl4gaming.class.php. Here is what the final code looks like. See the comments inside the code for more explanation:

jaxl4gaming.class.php (download)

  <?php

  /* Include XMPP Class */
  include_once("xmpp.class.php");

  class JAXL extends XMPP {

    // List of question contained in an array
    var $questions = array();

    // List of answers corresponding to above questions
    var $answers = array();

    // list of answers which are not allowed for any question
    var $answers_not_allowed = array('start','stop','options');

    // last sent question key (basically index value of question in questions array)
    var $last_question_key = -1;

    // an associative array storing user scores
    var $user_scores = array();

    // stores jabber id of users currently in the arena
    var $user_jids = array();

    // game status
    var $game_status = FALSE;

    function eventMessage($fromJid, $content, $offline = FALSE) {
      // Take action only if the message received is online
      if(!$offline) {
        // trim incoming content
        $content = trim($content);

        // get bare jid for the user
        $fromJid = $this->getBareJid($fromJid);
        switch($content) {
          case 'start':
            // join the arena and receive the question currently in play
            $this->add_user_to_arena($fromJid);
            $this->send_current_question($fromJid);
            break;
          case 'stop':
            $this->remove_user_from_arena($fromJid);
            break;
          case 'options':
            $this->display_options($fromJid);
            break;
          default:
            // anything else is treated as an answer
            $this->handle_user_message($fromJid, $content);
            break;
        }
      }
    }

    // not required for this gaming demo
    function eventPresence($fromJid, $status, $photo) {
    }

    // set the status for our gaming bot
    function setStatus() {
      // Set a custom status or use $this->status
      $this->sendStatus("Type *options* for getting started");
      print "Setting Status...\n";
      print "Done\n";

      // initialize game
      if(!$this->game_status) {
        $this->logger->logger('Initializing gaming arena....');
        $this->init();
        $this->game_status = TRUE;
      }
    }

    function init() {
      // called when the bot starts
      // read the list of questions and their answers from a txt file
      // populate the $questions and $answers arrays
      // HARDCODING arrays for DEMO purpose.
      $this->questions = array('q1','q2','q3','q4','q5');
      $this->answers = array('a1','a2','a3','a4','a5');
      return TRUE;
    }

    function broadcast_message($message, $except=array()) {
      foreach($this->user_jids as $jid => $info) {
        // skip excluded users, send to everyone else who is in the arena
        if(in_array($jid, $except)) continue;
        else if($this->user_jids[$jid]['status'] == 'online') {
          $this->sendMessage($jid, $message);
        }
      }
      return TRUE;
    }

    function add_user_to_arena($jid) {
      // check if user visited the game before
      // you may want to send some custom welcome messages depending upon the user type
      if(!isset($this->user_jids[$jid])) {
        $this->logger->logger('Adding user_jids key for: '.$jid);
        $this->user_jids[$jid] = array();
      }

      $this->user_jids[$jid]['status'] = 'online';
      $this->user_jids[$jid]['start_time'] = time();
      $this->logger->logger($jid.' joined the arena: '.json_encode($this->user_jids[$jid]));

      return TRUE;
    }

    function send_current_question($jid) {
      // is this the 1st user in the arena
      if($this->last_question_key == -1) $this->last_question_key++;
      $current_question = $this->questions[$this->last_question_key];

      $this->logger->logger('Sending current question at index: '.$this->last_question_key.', question: '.$current_question.' to: '.$jid);
      $this->sendMessage($jid, $current_question);
      return TRUE;
    }

    function broadcast_next_question($except=array()) {
      // wrap around to the first question after the last one
      if($this->last_question_key == count($this->questions)-1) $this->last_question_key = 0;
      else $this->last_question_key++;

      $this->broadcast_message($this->questions[$this->last_question_key], $except);
      return TRUE;
    }

    function broadcast_right_answer($fromJid, $answer, $except) {
      $message = '*'.$fromJid.'* gave the right answer: '.$answer;
      $this->broadcast_message($message, $except);
      return TRUE;
    }

    function remove_user_from_arena($jid) {
      if(isset($this->user_jids[$jid])) {
        $this->user_jids[$jid]['status'] = 'offline';
        // record when the user left the arena
        $this->user_jids[$jid]['end_time'] = time();
      }
      return TRUE;
    }

    function display_options($jid) {
      $options = '*start* To join the arena, *stop* To quit the arena, *options* To display this help';
      $this->sendMessage($jid, $options);
      return TRUE;
    }

    function handle_user_message($jid, $message) {
      // users who are not in the arena only get the list of options
      if(!isset($this->user_jids[$jid]) || $this->user_jids[$jid]['status'] == 'offline') {
        $this->display_options($jid);
        return TRUE;
      }

      // we treat this message as an answer
      $current_answer = $this->answers[$this->last_question_key];
      if($message == $current_answer) {
        $this->increase_user_points($jid);
        $this->broadcast_right_answer($jid, $message, array($jid));
        $this->broadcast_next_question();
      }
      else {
        $message = $message.' is a wrong answer. Try again!';
        $this->sendMessage($jid, $message);
      }
      return TRUE;
    }

    function increase_user_points($jid) {
      if(!isset($this->user_jids[$jid]['points'])) $this->user_jids[$jid]['points']=0;
      $this->user_jids[$jid]['points'] += 1;
    }

  }


This is the basic architecture you will generally follow when building games using the Jaxl library.
In brief, here is the explanation of the above code:

  • Initializing game: setStatus() is the last method called during the whole initialization process of the bot. Hence it is the right place to call our game initialization method init(). You can do a number of things in this method. For this demo, I simply hard code the questions and their answers in the respective arrays. Once the game is initialized, $questions and $answers reside in program memory for the life time of the bot.
  • Basic game flow: The flow is simple. I have customized the eventMessage() method provided by the Jaxl class. I simply check for a number of cases and divert the flow of the game. If the incoming message is one of the available options, I perform the respective action (start triggers add_user_to_arena(), stop triggers remove_user_from_arena() and options triggers display_options()). If the incoming message doesn’t match any available option, I consider it an attempt to answer the current question and redirect to handle_user_message(). However, if the user is not yet part of the gaming arena and tries to answer a question, handle_user_message() simply redirects to display_options().
  • User stats: I also maintain basic user stats in a variable called $user_jids. For each user, I maintain the following fields. The 'status' field can be ‘online’ or ‘offline’ depending upon user availability in the arena. The 'start_time' field indicates when the user last joined the arena. You might want to have this for a number of reasons. Every time a user quits the arena, I also save a field called 'end_time' indicating when the user last left the arena. Finally, I maintain user points in a field called 'points'. This field is incremented by 1 for every correct answer by the user.
  • Infinite questions: Every time a new question is broadcast, I check the current question key. If it has reached the end of the questions array, I simply reset it to 0. Hence the bot never runs out of questions. This logic is inside the broadcast_next_question() method.
  • Broadcasting messages: broadcast_message() is the main method, which broadcasts all messages from the bot to the users playing in the arena. It takes two parameters: $message, i.e. the message you want to broadcast, and the $except array, which contains the user jids you want to skip while broadcasting.
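
The wrap-around used for the question key can also be written as a single modulo step. The helper below is a hypothetical standalone version of that logic, not a method of the class above; it assumes there is at least one question.

```php
<?php
// Hypothetical helper: index of the question that follows $current,
// wrapping back to 0 after the last one. Also works for the initial -1.
function next_question_key($current, $count) {
    return ($current + 1) % $count;
}

$questions = array('q1', 'q2', 'q3', 'q4', 'q5');
$key = -1;
for ($n = 0; $n < 7; $n++) {
    $key = next_question_key($key, count($questions));
    echo $questions[$key] . PHP_EOL;
}
```

The loop prints q1 to q5 and then starts again with q1 and q2.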

Now, how do we test our game? Simply follow these steps:

  • Download jaxl4gaming.class.php and include it inside index.php
  • Update the config.ini.php file with your production username and password. We will run this bot using gtalk user credentials.
  • Run the bot using
    sudo php index.php
  • Add the bot to your gtalk and send it the message options. If everything is fine, you should see the bot behaving as described in the game rules above.
  • Finally customize the methods inside jaxl4gaming.class.php and build your own games.

Debugging your Jaxl bot
In case you run into an error while trying to run Jaxl, here are a few things you SHOULD do:

  • Check out the open/closed issues here. I get a lot of queries, especially from college enthusiasts, and in 99% of the cases the solution can be found on the issues link above.
  • Another thing you should do is enable error logging in your php.ini and check the error logs.
  • If you are still unable to find a solution, file a new issue with relevant information here.
  • Search the jaxl forum and discuss with other users who may have encountered similar errors before. Join other users on jaxl’s google group here.
  • Finally, if nothing helps, send me a mail or IM me.

All the best with Jaxl. 😀

Building a Custom PHP Framework with a custom template caching engine using Output Control functions

In the past year or so, I have had the opportunity to use a lot of PHP frameworks, including Zend, Symfony, CakePHP and CodeIgniter. All frameworks have their pros and cons; however, that is out of scope for this blog post. You may want to check out this comparison list of PHP frameworks here.

In this blog post I will build a custom PHP framework (MVC architecture), then briefly discuss the output control functions, and finally show how to build a custom template caching engine for our framework using these functions.

Source Code
You may want to download the complete source code for this blog post from here.

Building a custom PHP Framework
We will choose an MVC architecture for our framework. Here is a basic directory structure for our custom framework:


The view, model, controller, log and cache directories contain the following framework modules respectively:

  1. view directory contains our view level files, i.e. files containing our HTML, js and css code.
  2. model directory contains the model class, responsible for interacting with the database and other storage
  3. controller directory contains our controller class. Each incoming request is first received by the controller class constructor, which thereafter controls the flow of the request through the framework
  4. log directory contains our logger class. This class is loaded for every request and provides a basic logger::log($log_message) method throughout the framework. It logs all data in a file called log.log.
  5. cache directory contains our cache class. For this blog tutorial, we will only write the template caching engine class. In production systems, we might have individual classes for other types of cache systems, e.g. memcached. (Read Memcached and “N” things you can do with it – Part 1 to know more about memcached, and MySQL Query Cache, WP-Cache, APC, Memcache – What to choose for a comparison of various other caching techniques.)

Let’s see in detail which files each directory contains.

Root directory files
We have 4 files in our root directory, namely .htaccess, index.php, config.ini.php and 404.php, in order of relevance. Let’s look through the content of these files, starting with .htaccess:


RewriteEngine on
RewriteBase /

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) index.php
  1. The first two lines switch on the apache rewrite engine and set RewriteBase to /, i.e. the root directory
  2. The last 3 lines mean that if the incoming request is for a file or directory which physically exists under the root directory, it is served as is; all other requests are routed to index.php in the root directory

Hence for an incoming request like http://localhost/test1.php, apache will route the request to index.php in the root directory, because there is no test1.php under the root directory. Cool, let’s see what index.php has to offer.



  // include configuration file
  include_once("config.ini.php");

  // include controller files
  include_once("controller/controller.class.php");


index.php doesn’t do much except include our core configuration file and the controller class file. The controller is instantiated, and hence its constructor invoked, as soon as the class file is included.

config.ini.php is our core configuration file. It provides the framework with an array called $config containing information such as mysql database credentials, requested url details and various other global parameters. Let’s see which parameters it provides us with.



  $config = array(
    // details about the host and the incoming request
    'host' => array(
      'name' => "http://".$_SERVER['HTTP_HOST'].'/',
      'uri' => $_SERVER['REQUEST_URI'],
      'url' => parse_url($_SERVER['REQUEST_URI']), // note it contains the parsed url contents
    ),
    // mysql database credentials
    'mysql' => array(
      'host' => 'localhost',
      'name' => 'testdb',
      'user' => 'root',
      'pass' => 'password',
    ),
    // caching modules
    'cache' => array(
      'template' => 'On', // template caching switched on by default
      'memcached' => 'Off', // switch off memcached caching
    ),
  );


The $config['host'] array saves various parameters about the host itself: 'name' (the host name), 'uri' (the requested uri) and 'url' (the result of calling parse_url() on the requested uri).

The $config['mysql'] array contains the mysql database parameters. However, in this blog post we will not interact with databases.

The $config['cache'] array tells the framework which caching modules are switched on.

Finally, 404.php is a simple page which is served whenever the requested template does not exist:



    <h1>404 Page</h1>

Controller directory files
For this blog post, the controller directory consists of a single class file, i.e. controller.class.php. We saw this being included by index.php in the root folder above. As soon as the controller class file is included, its constructor is invoked. Before we dig in more, let’s see the controller class file:



  // include logger class
  include_once("log/logger.class.php");

  // include cache class (contains template caching)
  include_once("cache/cache.class.php");

  // include model class
  include_once("model/model.class.php");

  class controller {

    function __construct($config) {
        // work on the global copy, since the cache classes read it too
        global $config;

        // generate requested template name and path
        $config['template']['name'] = $config['host']['uri'] == '/' ? 'index.php' : substr($config['host']['uri'], 1, strlen($config['host']['uri']));
        $config['template']['path'] = "view/".$config['template']['name'];

        // check 404
        if(!file_exists($config['template']['path'])) {
            $config['template']['name'] = "404.php";
            $config['template']['path'] = "404.php";
        }
        logger::log("Requested template name ".$config['template']['name'].", path ".$config['template']['path']);

        // invoke template caching engine
        $template_cache = new template_cache();

        // include the template
        include_once($config['template']['path']);

        // cache template
        $template_cache->setTemplate();
    }
  }

  $controller = new controller($config);


At the top, controller class includes the logger.class.php, cache.class.php and model.class.php files. At the bottom, the controller object is instantiated.

The constructor performs the following 5 tasks:

  1. First, it generates a template name and a template path for the incoming request, i.e. for http://localhost/, $config['template']['name']='index.php' and for http://localhost/test1.php, $config['template']['name']='test1.php'.
  2. Second, it checks for 404. For the template path generated above, e.g. $config['template']['path']='view/test1.php', it checks whether this file exists. If it doesn’t, the template path and name are set to 404.php
  3. Third, it invokes the template caching engine, i.e. $template_cache = new template_cache();
  4. Fourth, it includes the template path generated above, i.e. include_once($config['template']['path']);
  5. Fifth and finally, it caches the HTML, js and css code generated by the template file included above. This is achieved by the following code: $template_cache->setTemplate();
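The first two steps can be isolated into a small function which is easy to try out on its own. This is a sketch under assumptions, not part of the framework code: resolve_template() and its $view_dir parameter are hypothetical names. Unlike the constructor, it strips the query string with parse_url(), so that /test1.php?x=1 still maps to test1.php, and it uses is_file() rather than file_exists() so that a directory never counts as a view.

```php
<?php
// Hypothetical helper mirroring steps 1 and 2 of the controller constructor.
// Returns array('name' => ..., 'path' => ...) for a requested uri.
function resolve_template($uri, $view_dir = "view/") {
    // keep only the path part of the uri (drops any ?query=string)
    $path = parse_url($uri, PHP_URL_PATH);

    // "/" maps to index.php, anything else to the path without its leading slash
    $name = ($path == '/' || $path == '') ? 'index.php' : substr($path, 1);
    $file = $view_dir.$name;

    // fall back to the 404 page when no such view file exists
    if (!is_file($file)) {
        return array('name' => '404.php', 'path' => '404.php');
    }
    return array('name' => $name, 'path' => $file);
}
```

Keeping the lookup free of globals makes it possible to point it at a temporary directory and check the 404 fallback without touching the real view folder.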

Before we move our attention to caching, let’s briefly look at the content of the log, view and model directories.

Log directory files
Log directory contains our logger class. This class is auto loaded for each incoming request that is being routed to index.php in the root directory (as we saw above). The logger.class.php provides a static logger::log($log_message) method, which can be used throughout the framework for logging messages. We will be using it everywhere.



  class logger {

    static $log_file = "log/log.log";

    // append a timestamped entry to the log file
    static function log($log) {
      if($log != '') {
        $fh = fopen(self::$log_file, "a");
        fwrite($fh, date('Y-m-d H:i:s')."\n".$log."\n\n");
        fclose($fh);
      }
    }
  }



The logger class by default logs all data to a file called log.log.

View directory files
For this blog post, we have two simple test pages in the view directory, namely test1.php and test2.php, which can be accessed by typing http://localhost/test1.php and http://localhost/test2.php respectively in the browser.



      <?php echo model::test1data(); ?>

test1.php simply calls the static model class method model::test1data(). In a real application this method would extract its text from the database; here it simply returns some dummy text.

Model directory files
The model directory contains the model class file. In production systems, the model class will provide various methods to select and insert data in the databases. However, for this blog post we will simply return some static dummy text.



  class model {

    // This method will return data generally from a database table
    // To keep it simple for the post we return some dummy lipsum text
    static function test1data() {
      logger::log("Returning test1data() from database");
      return "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Proin ut nulla ac risus viverra ornare. Nulla consectetur, metus eleifend pharetra posuere, lacus nibh elementum leo, in fermentum lectus lorem in ipsum. Nullam pulvinar purus at erat pharetra volutpat. Pellentesque egestas rutrum lectus, ut rutrum tellus tristique sed. Integer diam est, ornare ac ultricies vel, aliquam non mi. Etiam tempor leo eu lacus tempus sagittis sagittis turpis dictum. Sed leo sapien, pharetra sit amet faucibus et, mollis id nulla. Praesent feugiat mi nec dui scelerisque mollis vehicula magna feugiat. Aliquam erat volutpat. Curabitur quis velit ut nibh rhoncus convallis. Proin mauris nunc, rhoncus vel laoreet vel, aliquet quis nunc. Aenean interdum risus non neque blandit sed adipiscing ipsum mollis. Vivamus enim orci, ultrices at scelerisque vel, laoreet a turpis. Nullam posuere ante sed nisl porta porta aliquam metus suscipit. Fusce enim odio, iaculis at suscipit eget, vestibulum volutpat enim. Nam dictum turpis quis velit posuere in malesuada mi convallis. Donec faucibus, felis id dictum imperdiet, orci tortor tristique neque, vitae lobortis libero tellus sed lorem. Duis tellus magna, commodo eget blandit ut, auctor nec nibh. Maecenas ornare ornare risus nec ultrices. Pellentesque lectus eros, imperdiet ut rhoncus vel, tempus ut nisi.";
    }

    static function test2data() {
      logger::log("Returning test2data() from database");
      return "Vestibulum laoreet nibh sed nulla mollis cursus. Maecenas sodales mauris sit amet ligula euismod a lacinia turpis adipiscing. Nulla gravida porta augue, id adipiscing libero tincidunt ac. Morbi non velit id odio porta tempus id eget massa. Cras nibh purus, gravida sed suscipit ut, tincidunt eu neque. In id est eros, ac sodales orci. Ut lectus augue, feugiat sit amet consectetur id, pharetra quis tellus. Maecenas eget lobortis urna. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce tincidunt eleifend neque. Aenean accumsan orci vitae erat blandit porttitor. Aliquam tristique dolor ac nibh elementum id lacinia diam cursus.";
    }
  }



Cache directory files
For this blog post, cache directory contains the main cache.class.php file which in turn includes various other cache classes e.g. template.cache.class.php



  // check for switched on cache modules
  foreach($config['cache'] as $key => $value) {
    // include all cache classes which are switched on
    if($value == 'On') {
      // naming convention is <modulename.cache.class.php>
      include_once("cache/".$key.".cache.class.php");
    }
  }


Output Control Functions
PHP is a very simple language. You can write Hello World! or calculate the similarity between two strings (see similar_text()), each with a single line of code. As a result there are a lot of fundamental PHP concepts which not only beginners but even some advanced coders tend to overlook. One such concept is Output Control.

The Output Control functions allow you to control when the output of your PHP script is sent to the browser (or console), i.e. you can pre-process the final HTML output (append, prepend, chip-chop, insert ad codes, link urls, highlight keywords, cache templates) before it reaches the browser. Interesting, isn’t it? Can you feel the power of Output Control functions?

  1. ob_start: This turns on output buffering, i.e. no output is sent from the script; instead the output is saved in an internal buffer. Note that output buffering does not buffer your headers. ob_start() also takes an optional callback function name. The callback is called when the output buffer is flushed (see ob_flush()) or cleaned (see ob_clean()). We can access this internal buffer using functions like ob_get_contents()
  2. ob_end_flush: This function sends the content of the buffer (if any) and turns off output buffering. We should always call functions like ob_get_contents() before ob_end_flush(), since the buffer is no longer available afterwards and any later change will not be reflected in the browser
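To make the buffering idea concrete, here is a minimal sketch of pre-processing output before it is sent. The helper name render_highlighted() is hypothetical. It uses ob_end_clean() instead of ob_end_flush(), because it hands the processed string back to the caller rather than sending the raw buffer to the browser.

```php
<?php
// Hypothetical helper: run $render with output buffering switched on and
// return what it printed, with every occurrence of $keyword wrapped in <b> tags.
function render_highlighted(callable $render, $keyword) {
    ob_start();                     // from now on output goes to the buffer
    $render();                      // anything echoed here is captured
    $html = ob_get_contents();      // read the buffer while it is still open
    ob_end_clean();                 // discard the buffer and stop buffering
    return str_replace($keyword, "<b>".$keyword."</b>", $html);
}
```

A caller would then decide when to send the result, for example echo render_highlighted(function() { echo "PHP output control"; }, "output");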

Building a custom Template Caching Engine
We saw above some of the output control functions PHP has to offer. ob_start(), ob_get_contents() and ob_end_flush() are the 3 functions we will use to create our custom template caching engine.



  class template_cache {

    var $template_cache_file = FALSE;
    var $template_cache_file_ext = ".tmp";
    var $template_cache_dir = "cache/template/";
    var $template_cache_ttl = 300; // secs

    function __construct() {
      // initiate template caching
      $this->init();
    }

    function init() {
      // get template path
      $this->template_cache_file = $this->generateTemplatePath();

      // get template from cache if exists
      $this->getTemplate();

      // start output buffering
      ob_start();
    }

    function generateTemplatePath() {
      global $config;

      // generate template file name
      return $this->template_cache_dir.$config['template']['name'].$this->template_cache_file_ext;
    }

    function getTemplate() {
      global $config;

      // check if a cached template exists
      if(file_exists($this->template_cache_file)) {
        if(time() - filemtime($this->template_cache_file) < $this->template_cache_ttl) {
          logger::log("Cache hit for template ".$config['template']['name']);
          $content = file_get_contents($this->template_cache_file);
          echo $content;
          // serve the cached copy and stop, control never returns to the controller
          exit;
        }
        else {
          logger::log("Cache stale for template ".$config['template']['name']);
          return FALSE;
        }
      }
      else {
        logger::log("Cache miss for template ".$config['template']['name']);
        return FALSE;
      }
    }

    function setTemplate() {
      global $config;

      // get buffer
      $content = ob_get_contents();

      // save template
      logger::log("Caching template ".$config['template']['name']);
      $fh = fopen($this->template_cache_file, 'w');
      fwrite($fh, $content);
      fclose($fh);

      // Flush the output buffer and turn off output buffering
      ob_end_flush();
    }
  }



As we saw above in the controller class, the template engine class is instantiated before the actual template file is included. The template engine constructor does the following 3 tasks:

  1. First, it generates a cached file name for the requested uri by calling the $this->generateTemplatePath(); method, e.g. if http://localhost/test1.php is the requested uri, test1.php.tmp is its static cached template
  2. Second, it tries to fetch the cached template file by calling the method $this->getTemplate(); (read on for details of this method)
  3. Finally, it turns on output buffering by calling ob_start();

The methods provided by template.cache.class.php are:

  1. generateTemplatePath() generates a cache file name for the incoming request. By default all cached files have the extension “.tmp” and are stored under the /cache/template directory.
  2. getTemplate() does a number of tasks. First, it checks if a cached template exists for the requested uri. If it does not exist, or if it is not a fresh cache (see $template_cache_ttl), this method simply returns control back to the controller, which goes ahead and includes the actual template file. However, if the file exists and is fresh, it reads the content of the file and sends it to the browser. At this point control is not transferred back to the controller, hence saving various unnecessary processing and database calls.
  3. setTemplate() is called by the controller after including the actual template file from under the view directory. The point to note is that, before control returns to the controller (in case of a missed or stale cache), the template cache class constructor switches on output buffering. So when setTemplate() is called, we can access this buffer using output functions like ob_get_contents() and then save the template for the next incoming request. Bingo! Finally this method sends the buffer to the browser using ob_end_flush();
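The hit, stale and miss decision made by getTemplate() boils down to one comparison between the age of the cache file and the ttl. A minimal sketch, assuming a hypothetical helper cache_state() whose optional $now parameter exists only so the check can be exercised without waiting for the ttl to expire:

```php
<?php
// Hypothetical helper: classify a cache file as 'hit', 'stale' or 'miss'.
// $now defaults to the current time but can be injected for testing.
function cache_state($file, $ttl, $now = null) {
    if ($now === null) {
        $now = time();
    }
    if (!file_exists($file)) {
        return 'miss';              // nothing cached yet
    }
    // fresh while the file is younger than $ttl seconds
    return ($now - filemtime($file) < $ttl) ? 'hit' : 'stale';
}
```

With a ttl of 10 seconds, a file written 5 seconds ago is a hit and one written 14 seconds ago is stale.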

Is it working?
To verify the flow of framework, I hit the url http://localhost/test1.php 3 times, with $template_cache_ttl = 10; (seconds).

  1. Once after clearing the template cache folder
  2. Once within next 10 seconds
  3. And finally after 10 seconds

Here is what the log file looks like:

2009-08-16 19:50:49
Requested template name test1.php, path view/test1.php (1st REQUEST)

2009-08-16 19:50:49
Cache miss for template test1.php

2009-08-16 19:50:49
Returning test1data() from database

2009-08-16 19:50:49
Caching template test1.php

2009-08-16 19:50:54
Requested template name test1.php, path view/test1.php (2nd REQUEST)

2009-08-16 19:50:54
Cache hit for template test1.php

2009-08-16 19:51:03
Requested template name test1.php, path view/test1.php (3rd REQUEST)

2009-08-16 19:51:03
Cache stale for template test1.php

2009-08-16 19:51:03
Returning test1data() from database

2009-08-16 19:51:03
Caching template test1.php

Moving forward, What’s Next? Extending template.cache.class.php
The template cache class can be extended to do a lot more than caching the template files. For instance, we might want to perform a few tasks (chip-chop, append, prepend etc.) before we cache the final template and send it to the browser. A few tasks which look quite obvious to me are:

  1. Short Codes: We can insert short codes in our HTML templates, which can later be expanded into full-fledged code, e.g. for embedding a YouTube video, we can simply put something like [[YouTube yjPBkvYh-ss]] into test1.php. Then in the setTemplate() method we can call helper/plugin methods to process such short codes. Better still, we can add hooks for the various tasks we might want to perform before caching the template. Read How to add wordpress like add_filter hooks in your PHP framework for a more professional approach.
  2. Inserting page header and footer: Instead of including the page header and footer inside test1.php, we can simply put our main <body> code inside test1.php. Then, before caching the template file, we can prepend and append the header and footer modules to the buffer of each page, thus avoiding including the same header and footer files across various pages.
  3. HTML module caching: There are several instances where we can have a common module across all pages. For instance, I can have an events module across all my pages, which displays a calendar with the events for the week or month marked on it. The event details are extracted from the database. Since this module of mine is a static HTML chunk for at least a week, I would like to have a separate cache for this module. Intelligently hooking up such modules with the template caching engine can allow us to do module level caching
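The first two ideas can be sketched as plain string filters applied to the buffer before it is cached. Both function names below are hypothetical and not part of template.cache.class.php. The iframe markup is only an assumed expansion for the [[YouTube ...]] short code, using YouTube’s /embed/ url form.

```php
<?php
// Hypothetical filter: expand [[YouTube <video id>]] short codes into embed markup.
function expand_shortcodes($html) {
    return preg_replace_callback(
        '/\[\[YouTube ([A-Za-z0-9_-]+)\]\]/',
        function($m) {
            // $m[1] holds the video id captured from the short code
            return '<iframe src="https://www.youtube.com/embed/'.$m[1].'"></iframe>';
        },
        $html
    );
}

// Hypothetical filter: wrap the page body with shared header and footer markup.
function wrap_layout($body, $header, $footer) {
    return $header.$body.$footer;
}
```

One way to use such filters inside setTemplate() would be to run them over the result of ob_get_contents(), write the filtered string to the cache file, then discard the raw buffer with ob_end_clean() and echo the filtered string instead of calling ob_end_flush().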

I can probably write down 10-15 more such applications, and there might be many more uses for the template caching engine coded above. (Note: The power actually lies in the Output Control functions provided by PHP.)

Let me know if you liked the post or found any bug in it.