How to perform X-FACEBOOK-PLATFORM and Google Talk X-OAUTH2 XMPP authentication with PHP Jaxl library

Ever since Jaxl library first introduced support for X-FACEBOOK-PLATFORM XMPP authentication mechanism, it has changed significantly. Also, Google Talk now supports OAuth 2.0 Authorization, an XMPP extension to allow users to log in using OAuth 2.0 credentials.

Both these mechanisms are a big win for XMPP developers, since real-time conversation experience can now be provided to their application users without asking them for their passwords. In this blog post, I will demonstrate how to perform X-FACEBOOK-PLATFORM and X-OAUTH2 XMPP authentication mechanism using Jaxl v3.x PHP Library.

Here is a quick guide on how to perform X-FACEBOOK-PLATFORM XMPP authentication using xfacebook_platform_client.php which comes bundled with Jaxl v3.x examples:

  • Visit Facebook Developer Apps page and register your application
  • Once registered, visit access token tool to get required parameters to perform X-FACEBOOK-PLATFORM authentication Facebook Access Token Tool
  • Click on the debug button next to User Token and make sure xmpp_login is one of the extended permissions (scope)
  • Enter downloaded Jaxl library folder and run from command line as follows:

    $ php examples/xfacebook_platform_client.php fb_user_id_or_username fb_app_key fb_access_token

You can now take the source code of xfacebook_platform_client.php and customize it for your application needs.

Google Talk X-OAUTH2 XMPP Authentication
Here is a quick guide on how to perform Google Talk X-OAUTH2 XMPP authentication using xoauth2_gtalk_client.php which comes bundled with Jaxl v3.x examples:

  • Visit Google OAuth Playground and input as the required scope. Press “Authorize API” and then “Allow Access” button on the redirected page
  • In step 2, simply press “Exchange authorize code for tokens” and copy the access token
  • Enter downloaded Jaxl library folder and run from command line as follows:

    $ php examples/xoauth2_gtalk_client.php [email protected] access_token

You can now take the source code of xoauth2_gtalk_client.php and customize it for your application needs.

Wasn’t that simple :)

JAXLXml – Strophe style XML Builder : Working with Jaxl – A Networking Library in PHP – Part 2

Prior to Jaxl v3.x, the most ugliest piece of code inside Jaxl library was handling of XML packets. If you are working with XMPP protocol which is all about sending and receiving XML packets, it can become a nightmare if you don’t have a proper XML manipulation library in your toolkit. For Jaxl v3.x, first thing I decided to write was JAXLXml class, which is a custom XML packet implementation with no external dependencies and is an extension over the ideas from Strophe.Builder class written by Jack Moffitt.

JAXLXml is generic enough to find a place inside any PHP application that requires easy and elegant XML packet creation. In this blog post, I will give an exhaustive overview of how to create XML packets using JAXLXml class.

JAXLXml Constructor
Depending upon the need, there are several different ways of initializing a JAXLXml object:

  • $xml_obj = new JAXLXml($name, $ns, $attrs, $text);
  • $xml_obj = new JAXLXml($name, $ns, $attrs);
  • $xml_obj = new JAXLXml($name, $ns, $text);
  • $xml_obj = new JAXLXml($name, $attrs, $text);
  • $xml_obj = new JAXLXml($name, $attrs);
  • $xml_obj = new JAXLXml($name, $ns);
  • $xml_obj = new JAXLXml($name);


  • $name – the XML node name
  • $ns – the XML namespace
  • $attrs – Key-Value (KV) pair of XML attributes
  • $text – XML content

Here are a few examples for each constructor style shown above:

JAXLXml will sanitize attributes and text values as shown below:

Manipulating Attributes, Child Nodes and Content
Below is an exhaustive list of methods available over initialized JAXLXml object $xml_obj for manipulating attributes, child nodes and content:

  • c($name, $ns=null, $attrs=array(), $text=null) : Append a child node at current rover and update the rover to point at newly added child node. Rover is nothing but a pointer indicating the level in the XML tree where this and other methods will perform. When an JAXLXml instance is initialized, rover points to the top level node.
  • cnode($node) : Append a child node given by $node (a JAXLXml object) at current rover and update the rover to point at newly added child node.
  • t($text, $append=FALSE) : Update text of the node pointed by current rover
  • top() : Move rover back to the top in the XML tree
  • up() : Move rover one step up the XML tree
  • attrs($attrs) : Merge new attributes specified as KV pair $attrs with existing attributes at the current rover.
  • match_attrs($attrs) : Accepts a KV pair of attributes $attrs, return bool if all keys exist and have same value as specified in the passed KV pair.
  • exists($name, $ns=null, $attrs=array()) : Checks if a child with $name exist. If found, return matching child as JAXLXml object otherwise false. If multiple children exist with same name, this function will return on first matching child
  • update($name, $ns=null, $attrs=array(), $text=null) : Update $ns, $attrs and $text (all at once) of an existing child node $name
  • to_string($parent_ns=null) : Return string representation of JAXLXml object

Method Chaining
The best thing one will find while working with JAXLXml class is that all the above methods are chain-able i.e. Any complex XML structure can be built with a single line of code.

Here is an example building a fairly nested XML structure in a single line of code:

Working with Jaxl – A Networking Library in PHP – Part 1 – An Introduction, Philosophy and History

Development of Jaxl library started way back in December’07 while I was working on a self-initiated project called Gtalkbots. The project is now dead, if you are interested in knowing more about it go through Gtalkbots BlogSpot. Jaxl v1.x was first released in Jan’09 and about a year later in Aug’10 Jaxl v2.x was released. First two versions were released as JAbber XMPP Library for writing clients and external server components.

While working on my startup Jaxl – A Platform As A Service (PAAS) for developing real-time applications, I started experiencing v2.x limitations when my external server side components were unable to process XMPP packets at the speed they were sent by ejabberd server. I started restructuring and refactoring the library which gave birth to Jaxl v3.x. Since v3.x was initially being used for developing the entire infrastructure, it shaped up as a networking library in PHP with stable support for XMPP protocol. However, later I had to rewrite several infrastructure components in Erlang Programming Language due to several issues that PHP as a language couldn’t solve (after all PHP wasn’t made for such tasks). Finally in April’12, Jaxl v3.x was open sourced.

Jaxl v3.x is an asynchronous, non-blocking, event based networking library in PHP for writing custom TCP/IP client and server implementations. From previous versions, Jaxl library inherits a full blown stable support for XMPP protocol stack. In v3.0, support for HTTP protocol stack was also introduced. At the heart of every protocol stack sits a Core stack. It contains all the building blocks for everything that we aim to do with Jaxl library. Both XMPP and HTTP protocol stacks are written on top of the Core stack. Infact the source code of these protocol implementations knows nothing about the standard (inbuilt) PHP socket and stream methods.

Jaxl is designed to work asynchronously in a non-blocking fashion and provides an event based callback API. Now what does all that mean?

By non-blocking and asynchronous it means, when a library function like:
$jaxl->send($stanza); is called, it will return immediately i.e. this function call will NOT block any further execution of your application script until $stanza has actually been sent over the connected TCP socket. Infact, when this function is called, passed $stanza object is put into an output buffer queue, which will be flushed as and when underlying TCP socket is available for writes. Similarly, most of the available methods (wherever required and possible) inside Jaxl library are non-blocking and asynchronous in nature.

By event based callback API it means, application code will need to register/add callbacks over necessary events as they occur inside Jaxl instance lifecycle. A list of available event callbacks with some explanation can be found here. For example, most of the XMPP applications will usually register a callback over on_auth_success event. As and when this event occurs inside Jaxl instance lifecycle, registered function will be callback’d with necessary parameters (if any).

Related Links

  • Read library documentation
  • Download the latest and greatest source from GitHub.
  • Have any Question? Want to discuss? Need Help? Use Google Group/Forum.
  • Found something missing or a bug in the source code? Kindly report an issue.
  • Fixed a bug? Want to submit a patch? Want to improve documentation? Checkout source code and contribute to the library

XMPP Application Examples

HTTP Application Examples

Stay Tuned
In coming weeks, under this series of blog posts titled “Working with Jaxl – A Networking Library in PHP”, I will cover following major topics with sample code:

  • Explanation of each Core stack class and how to use them
  • Design of each XMPP and HTTP stack class
  • XMPP over HTTP
  • XMPP File Transfer and Multimedia Sessions
  • Understanding and Using External Jabber Components
  • Asynchronous Job/Task Queues
  • Developing Concurrent and Parallel Systems

If you have any specific topic that you would like me to be cover, kindly let me know via your comments here.

Announcing Jaxl v3.x – asynchronous, non-blocking I/O, event based PHP client/server library

Jaxl v3.x is a successor of v2.x (and is NOT backward compatible), carrying a lot of code from v2.x while throwing away the ugly parts. A lot of components have been re-written keeping in mind the feedback from the developer community over the last 4 years. Also Jaxl shares a few philosophies from my experience with erlang and python languages.

Jaxl is an asynchronous, non-blocking I/O, event based PHP library for writing custom TCP/IP client and server implementations. From it’s previous versions, library inherits a full blown stable support for XMPP protocol stack. In v3.0, support for HTTP protocol stack was also added.

At the heart of every protocol stack sits a Core stack. It contains all the building blocks for everything that we aim to do with Jaxl library. Both XMPP and HTTP protocol stacks are written on top of the Core stack. Infact the source code of protocol implementations knows nothing about the standard (inbuilt) PHP socket and stream methods.

Source code on GitHub



Group and Mailing List

Create a bug/issue

Read why v3.x was written and what traffic it has served in the past.

PHP Code, Setup and Demo of Jaxl boshchat application

Jaxl 2.0 bosh support allow web developers to write real time web applications within minutes, without having any pre-requisite knowledge about the XMPP protocol itself. In this blog post, I will walk you through setup and demo of an XMPP based web chat application using Jaxl library.

Get the code
Follow the following steps to download and install this sample web application on your systems:

  • Clone the development branch of Jaxl library
    [email protected]:~/git# git clone [email protected]:abhinavsingh/JAXL.git
    [email protected]:~/git# cd JAXL/
    [email protected]:~/git/JAXL#

    If you are not familiar with git, simply visit [email protected], click Download Source and extract under ~/git/JAXL directory on your system

  • Once inside Jaxl source directory, build the latest development package
    [email protected]:~/git/JAXL# ./
  • Install Jaxl library (view installation detail and options)
    [email protected]:~/git/JAXL# ./ install
    uninstalling old package...

Setup web chat application
Jaxl library is default installed under /usr/share/php/jaxl folder. Application code for our web chat application can be found under /usr/share/php/jaxl/app/boshchat folder.

Follow these steps to setup web chat application on your system:

  • I assume you have http://localhost/ configured on your local web server and it runs out of /var/www folder. Create following symlinks:
    [email protected]:~/git/JAXL# cd /var/www
    [email protected]:/var/www# ln -s /usr/share/php/jaxl/app/boshchat/boshchat.php index.php
    [email protected]:/var/www# ln -s /usr/share/php/jaxl/app/boshchat/jaxl.ini jaxl.ini
    [email protected]:/var/www# ln -s /usr/share/php/jaxl/env/jaxl.js jaxl.js
    [email protected]:/var/www# ln -s /usr/bin/jaxl jaxl.php

    Edit/Remove #!/usr/bin/env php inside jaxl.php if it causes any problem on your system.

  • Open and edit jaxl.ini
    define('JAXL_BOSH_COOKIE_DOMAIN', false);
  • I assume you have access to XMPP over Bosh enabled jabber server. Ejabberd users can verify this by hitting http://localhost:5280/http-bind in the browser
  • Open and edit index.php
    define('BOSHCHAT_ADMIN_JID', [email protected]');

    All messages sent using this web chat application will be routed to BOSHCHAT_ADMIN_JID

Ready for the demo
To run this example web chat application, visit http://localhost in your browser window. Enter a username/password already registered on your jabber server and press connect.

Login as BOSHCHAT_ADMIN_JID using a desktop client, so that you can receive messages sent from the browser on your desktop client.

Below is a screenshot when I logged in as “abhinavsingh” from browser and BOSHCHAT_ADMIN_JID was set to “[email protected]”:

Releasing Jaxl 2.0 – Object oriented XMPP framework in PHP

After months of restructuring the Jaxl library, I am pleased to announce Jaxl 2.0, an object oriented XMPP framework in PHP for developing real time applications for browsers, desktops and hand held devices.

What’s new in Jaxl 2.0?

  • A lot of structural changes has been done from the previous version to make it more scalable, robust, flexible and easy to use
  • Library now provides an event mechanism, allowing developers to register callbacks for various xmpp events in their application code
  • Use integrated BOSH support to write real time web applications in minutes
  • More than 10 new implemented XMPP extensions (XEP’s) added
  • Development hosting moves to github, stable releases available at google code

Documentation for Jaxl users
Below is a list of getting started documentation for XMPP app developers:

Implemented XEP’s
A lot of new XEP’s has been implemented and packaged with Jaxl 2.0. Developers can use Jaxl event mechanism to implement new XEP’s without knowing the working of other core parts of the library.

Below is a list of released implemented XEP with Jaxl 2.0:

Documentation for project contributors
For developers interested in contributing to the Jaxl project, here is a list of insight documentation to get you started:

  • Jaxl core workflow and architecture (coming soon)
  • How to implement new XMPP extensions using Jaxl (coming soon)

Useful Links
For live help and discussion join [email protected] chat room

MEMQ : Fast queue implementation using Memcached and PHP only

Memcached is a scalable caching solution developed by Danga interactive. One can do a lot of cool things using memcached including spam control, online-offline detection of users, building scalable web services. In this post, I will demonstrate and explain how to implement fast scalable queues in PHP.

MEMQ: Overview
Every queue is uniquely identified by it’s name. Let’s consider a queue named “foo” and see how MEMQ will implement it inside memcached:

  • Two keys namely, foo_head and foo_tail contains meta information about the queue
  • While queuing, item is saved in key foo_1234, where 1234 is the current value of key foo_tail
  • While de-queuing, item saved in key foo_123 is returned, where 123 is the current value of key foo_head
  • Value of keys foo_head and foo_tail start with 1 and gets incremented on every pop and push operation respectively
  • Value of key foo_head NEVER exceeds value of foo_tail. When value of two meta keys is same, queue is considered empty.

MEMQ: Code
Get the source code from GitHub:


	define('MEMQ_POOL', 'localhost:11211');
	define('MEMQ_TTL', 0);

	class MEMQ {

		private static $mem = NULL;

		private function __construct() {}

		private function __clone() {}

		private static function getInstance() {
			if(!self::$mem) self::init();
			return self::$mem;

		private static function init() {
			$mem = new Memcached;
			$servers = explode(",", MEMQ_POOL);
			foreach($servers as $server) {
				list($host, $port) = explode(":", $server);
				$mem->addServer($host, $port);
			self::$mem = $mem;

		public static function is_empty($queue) {
			$mem = self::getInstance();
			$head = $mem->get($queue."_head");
			$tail = $mem->get($queue."_tail");

			if($head >= $tail || $head === FALSE || $tail === FALSE)
				return TRUE;
				return FALSE;

		public static function dequeue($queue, $after_id=FALSE, $till_id=FALSE) {
			$mem = self::getInstance();

			if($after_id === FALSE && $till_id === FALSE) {
				$tail = $mem->get($queue."_tail");
				if(($id = $mem->increment($queue."_head")) === FALSE)
					return FALSE;

				if($id <= $tail) {
					return $mem->get($queue."_".($id-1));
				else {
					return FALSE;
			else if($after_id !== FALSE && $till_id === FALSE) {
				$till_id = $mem->get($queue."_tail");

			$item_keys = array();
			for($i=$after_id+1; $i<=$till_id; $i++)
				$item_keys[] = $queue."_".$i;
			$null = NULL;

			return $mem->getMulti($item_keys, $null, Memcached::GET_PRESERVE_ORDER);

		public static function enqueue($queue, $item) {
			$mem = self::getInstance();

			$id = $mem->increment($queue."_tail");
			if($id === FALSE) {
				if($mem->add($queue."_tail", 1, MEMQ_TTL) === FALSE) {
					$id = $mem->increment($queue."_tail");
					if($id === FALSE)
						return FALSE;
				else {
					$id = 1;
					$mem->add($queue."_head", $id, MEMQ_TTL);

			if($mem->add($queue."_".$id, $item, MEMQ_TTL) === FALSE)
				return FALSE;

			return $id;



MEMQ: Usage
The class file provide 3 methods which can be utilized for implementing queues:

  1. MEMQ::is_empty – Returns TRUE if a queue is empty, otherwise FALSE
  2. MEMQ::enqueue – Queue up the passed item
  3. MEMQ::dequeue – De-queue an item from the queue

Specifically MEMQ::dequeue can run in two modes depending upon the parameters passed, as defined below:

  1. $queue: This is MUST for dequeue to work. If other optional parameters are not passed, top item from the queue is returned back
  2. $after_id: If this parameter is also passed along, all items from $after_id till the end of the queue are returned
  3. $till_id: If this paramater is also passed along with $after_id, dequeue acts like a popRange function

Whenever optional parameters are passed, MEMQ do not remove the returned items from the queue.

MEMQ: Is it working?
Add following line of code at the end of the above class file and hit the class file from your browser. You will get back inserted item id as response on the browser:

var_dump(MEMQ::enqueue($_GET['q'], time()));

Lets see how cache keys looks like in memcached:

[email protected]:~$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.

get foo_head
VALUE foo_head 1 1

get foo_tail
VALUE foo_tail 1 1

get foo_1
VALUE foo_1 1 10

get foo_2
VALUE foo_2 1 10

MEMQ: Benchmark
Below are the benchmarking results for varying load:

  1. Queuing performance: 697.18 req/sec (n=1000, c=100) and 258.64 req/sec (n=5000, c=500)
  2. Dequeue performance: 641.27 req/sec (n=1000, c=100) and 242.87 req/sec (n=5000, c=500)

MEMQ: Why and other alternatives
There are several open source alternatives which provide a lot more scalability. However, MEMQ was written because my application doesn’t expect a load in order of 10,000 hits/sec. Listed below are a few open source alternatives for applications expecting high load:

  1. ActiveMQ: A reliable and fast solution under apache foundation
  2. RabbitMQ: Another reliable solution based on AMQP solution
  3. Memcacheq: A mash-up of two very stable stacks namely memcached and berkleyDB. However, it’s installation is a bit tricky.

MEMQ: Mantra and Customization
At the base MEMQ implementation can be visualized as follows:

There is a race between two keys in memcached (foo_head and foo_tail). Both are incremented on every dequeue and queue operation respectively. However, foo_tail is strong enough and never allows foo_head to exceed. When value of keys foo_tail and foo_head are equal, queue is considered empty.

The above code file still doesn’t include utility methods like MEMQ::total_items etc. However, writing such methods should be pretty easy depending upon your application needs. Also depending upon your application requirement, you should also take care of overflowing integer values.

How to use locks in PHP cron jobs to avoid cron overlaps

Cron jobs are hidden building blocks for most of the websites. They are generally used to process/aggregate data in the background. However as a website starts to grow and there is gigabytes of data to be processed by every cron job, chances are that our cron jobs might overlap and possibly corrupt our data. In this blog post, I will demonstrate how can we avoid such overlaps by using simple locking techniques. I will also discuss a few edge cases we need to consider while using locks to avoid overlap.

Cron job helper class
Here is a helper class (cron.helper.php) which will help us avoiding cron job overlaps. (See usage example below)


	define('LOCK_DIR', '/Users/sabhinav/Workspace/cronHelper/');
	define('LOCK_SUFFIX', '.lock');

	class cronHelper {

		private static $pid;

		function __construct() {}

		function __clone() {}

		private static function isrunning() {
			$pids = explode(PHP_EOL, `ps -e | awk '{print $1}'`);
			if(in_array(self::$pid, $pids))
				return TRUE;
			return FALSE;

		public static function lock() {
			global $argv;

			$lock_file = LOCK_DIR.$argv[0].LOCK_SUFFIX;

			if(file_exists($lock_file)) {
				//return FALSE;

				// Is running?
				self::$pid = file_get_contents($lock_file);
				if(self::isrunning()) {
					error_log("==".self::$pid."== Already in progress...");
					return FALSE;
				else {
					error_log("==".self::$pid."== Previous job died abruptly...");

			self::$pid = getmypid();
			file_put_contents($lock_file, self::$pid);
			error_log("==".self::$pid."== Lock acquired, processing the job...");
			return self::$pid;

		public static function unlock() {
			global $argv;

			$lock_file = LOCK_DIR.$argv[0].LOCK_SUFFIX;


			error_log("==".self::$pid."== Releasing lock...");
			return TRUE;



Using cron.helper.php
Here is how the helper class can be integrated in your current cron job code:

  • Save cron.helper.php in a folder called cronHelper
  • Update LOCK_DIR as per your need
  • You might have to set proper permissions on folder cronHelper, so that running cron job have write permissions
  • Wrap your cron job code as show below:
    	require 'cronHelper/cron.helper.php';
    	if(($pid = cronHelper::lock()) !== FALSE) {
    		 * Cron job code goes here
    		sleep(10); // Cron job code for demonstration

Is it working? Verify
Lets verify is the helper class really take care of all the edge cases.

  • sleep(10) is our cron job code for this test
  • Run from command line:
    sabhinav$ php job.php
    ==40818== Lock acquired, processing the job...
    ==40818== Releasing lock...

    where 40818 is the process id of current running cron job

  • Run from command line and terminate the cron job in between by pressing CNTR+C:
    sabhinav$ php job.php
    ==40830== Lock acquired, processing the job...

    By pressing CNTR+C, we simulate the cases when a cron job can die in between due to a fatal error or system shutdown. In such cases, helper class fails to release the lock on this cron job.

  • With the lock in place (ls -l cronHelper | grep lock), run from command line:
    sabhinav$ php job.php
    ==40830== Previous job died abruptly...
    ==40835== Lock acquired, processing the job...
    ==40835== Releasing lock...

    As seen, helper class detects that one of the previous cron job died abruptly and then allow the current job to run successfully.

  • Run the cron job from two command line window and one of them will not proceed as shown below:
    centurydaily-lm:cronHelper sabhinav$ php job.php
    ==40856== Already in progress...

    One of the cron job will die since a cron job with $pid=40856 is already in progress.

Working of cron.helper.php
The helper class create a lock file inside LOCK_DIR. For our test cron job above, lock file name will be job.php.lock. Lock file name suffix can be configured using LOCK_SUFFIX.

cronHelper::lock() places the current running cron job process id inside the lock file. Upon job completion cronHelper::unlock() deletes the lock file.

If cronHelper::lock() finds that lock file already exists, it extracts the previous cron job process id from the lock file and checks whether a previous cron job is still running. If previous job is still in progress, we abort our current current job. If previous job is not in progress i.e. died abruptly, current cron job acquires the lock.

This is the classic method for avoiding cron overlaps. However there can be various other methods of achieving the same thing. If you know any do let me know through your comments.

How to add content verification using hmac in PHP

Many times a requirement arises where we are supposed to expose an API for intended users, who can use these API endpoints to GET/POST data on our servers. But how do we verify that only the intended users are using these API’s and not any hacker or attacker. In this blog post, I will show you the most elegant way of adding content verification using hash_hmac (Hash-based Message Authentication Code) in PHP. This will allow us to restrict possible misuse of our API by simply issuing an API key for intended users.

Here are the steps for adding content verification using hmac in PHP:

  • Issue $private_key and $public_key for users allowed to post data using our API. You can use the method similar to one described here for generating public and private keys.
  • Users having these keys can now use following sample script (hmac-sender.php) to submit data:
            // User Public/Private Keys
            $private_key = 'private_key_user_id_9999';
            $public_key = 'public_key_user_id_9999';
            // Data to be submitted
            $data = 'This is a HMAC verification demonstration';
            // Generate content verification signature
            $sig = base64_encode(hash_hmac('sha1', $data, $private_key, TRUE));
            // Prepare json data to be submitted
            $json_data = json_encode(array('data'=>$data, 'sig'=>$sig, 'pubKey'=>$public_key));
            // Finally submit to api end point
  • At hmac-receiver.php, we validate the incoming data in following fashion:
            function get_private_key_for_public_key($public_key) {
                    // extract private key from database or cache store
                    return 'private_key_user_id_9999';
            // Data submitted
            $data = $_GET['data'];
            $data = json_decode(stripslashes($data), TRUE);
            // User hit the end point API with $data, $signature and $public_key
            $message = $data['data'];
            $received_signature = $data['sig'];
            $private_key = get_private_key_for_public_key($data['pubKey']);
            $computed_signature = base64_encode(hash_hmac('sha1', $message, $private_key, TRUE));
            if($computed_signature == $received_signature) {
                    echo "Content Signature Verified";
            else {
                    echo "Invalid Content Verification Signature";

Where to use such verification?
This is an age old method for content verification which is used widely in a variety of applications. Below are a few places where hmac verification finds a place:

  • If you have exposed an API for your vendors to submit requested data
  • If you are looking to enable third party applications in your website. Similar to developer application model of facebook.

Hope you liked the post. Do leave your comments.

PHP tokens and opcodes : 3 useful extensions for understanding the working of Zend Engine

“PHP tokens and opcodes” – When a PHP script is executed it goes through a number of processes, before the final result is displayed. These processes are namely: Lexing, Parsing, Compiling and Executing. In this blog post, I will walk you through all these processes with a sample example. In the end I will list some useful PHP extensions, which can be used to analyze results of every intermediate process.

Lets take a sample PHP script as an example:

	function increment($a) {
		return $a+1;
	$a = 3;
	$b = increment($a);
	echo $b;

Try running this script through command line:

~ sabhinav$ php -r debug.php

This PHP script goes through the following processes before outputting the result:

  • Lexing: The php code inside debug.php is converted into tokens
  • Parsing: During this stage, tokens are processed to derive at meaningful expressions
  • Compiling: The derived expressions are compiled into opcodes
  • Execution: Opcodes are executed to derive at the final result

Lets see how a PHP script passes through all the above steps.

During this stage human readable php script is converted into token. For the first two lines of our PHP script:

	function increment($a) {

tokens will look like this (try to match the tokens below line by line with the above 2 lines of PHP code and you will get a feel):

~ sabhinav$ php -r 'print_r(token_get_all(file_get_contents("debug.php")));';
    [0] => Array
            [0] => 368             // 368 is the token number and it's symbolic name is T_OPEN_TAG, see below
            [1] => <?php

            [2] => 1

    [1] => Array
            [0] => 371
            [1] =>
            [2] => 2

    [2] => Array
            [0] => 334
            [1] => function
            [2] => 2

    [3] => Array
            [0] => 371
            [1] =>
            [2] => 2

    [4] => Array
            [0] => 307
            [1] => increment
            [2] => 2

    [5] => (
    [6] => Array
            [0] => 309
            [1] => $a
            [2] => 2

    [7] => )
    [8] => Array
            [0] => 371
            [1] =>
            [2] => 2

    [9] => {
    [10] => Array
            [0] => 371
            [1] =>

            [2] => 2

A list of parser tokens can be found here:

Every token number has a symbolic name attached with it. Below is our PHP script with human readable code replaced by symbolic name for each generated token:

~ sabhinav$ php -r '$tokens = (token_get_all(file_get_contents("debug.php"))); foreach($tokens as $token) { if(count($token) == 3) { echo token_name($token[0]); echo $token[1]; echo token_name($token[2]);  }  }';

Parsing and Compiling:
By generating the tokens in the above step, zend engine is able to recognize each and every detail in the script. Where the spaces are, where are the new line characters, where is a user defined function and what not. Over the next two stages, the generated tokens are parsed and then compiled into opcodes. Below is the compiled opcode for the complete sample script of ours:

~ sabhinav$ php -r '$op_codes = parsekit_compile_file("debug.php", $errors, PARSEKIT_SIMPLE); print_r($op_codes); print_r($errors);';
    [3] => ZEND_ASSIGN T(0) T(0) 3
    [6] => ZEND_SEND_VAR UNUSED T(0) 0x1
    [7] => ZEND_DO_FCALL T(1) 'increment' 0x83E710CA
    [9] => ZEND_ASSIGN T(2) T(0) T(1)
    [function_table] => Array
            [increment] => Array
                    [0] => ZEND_EXT_NOP UNUSED UNUSED UNUSED
                    [1] => ZEND_RECV T(0) 1 UNUSED
                    [2] => ZEND_EXT_STMT UNUSED UNUSED UNUSED
                    [3] => ZEND_ADD T(0) T(0) 1
                    [4] => ZEND_RETURN UNUSED T(0) UNUSED
                    [5] => ZEND_EXT_STMT UNUSED UNUSED UNUSED
                    [6] => ZEND_RETURN UNUSED NULL UNUSED


    [class_table] =>

As we can see above, Zend engine is able to recognize the flow of our PHP. For instance, [3] => ZEND_ASSIGN T(0) T(0) 3 is a replacement for $a = 3; in our PHP code. Read on to understand what do these T(0) in the opcode means.

Executing the opcodes:
The generated opcode is executed one by one. Below table shows various details as every opcode is executed:

~ sabhinav$ php -d -d vld.execute=0 -f debug.php
Branch analysis from position: 0
Return found
filename:       /Users/sabhinav/Workspace/interview/facebook/peaktraffic/debug.php
function name:  (null)
number of ops:  13
compiled vars:  !0 = $a, !1 = $b
line     #  op                           fetch          ext  return  operands
   2     0  EXT_STMT
         1  NOP
   5     2  EXT_STMT
         3  ASSIGN                                                   !0, 3
   6     4  EXT_STMT
         5  EXT_FCALL_BEGIN
         6  SEND_VAR                                                 !0
         7  DO_FCALL                                      1          'increment'
         8  EXT_FCALL_END
         9  ASSIGN                                                   !1, $1
   7    10  EXT_STMT
        11  ECHO                                                     !1
   8    12  RETURN                                                   1

Function increment:
Branch analysis from position: 0
Return found
filename:       /Users/sabhinav/Workspace/interview/facebook/peaktraffic/debug.php
function name:  increment
number of ops:  7
compiled vars:  !0 = $a
line     #  op                           fetch          ext  return  operands
   2     0  EXT_NOP
         1  RECV                                                     1
   3     2  EXT_STMT
         3  ADD                                              ~0      !0, 1
         4  RETURN                                                   ~0
   4     5* EXT_STMT
         6* RETURN                                                   null

End of function increment.

First table represents the main loop run, while second table represents the run of user defined function in the php script. compiled vars: !0 = $a tells us that internally while script execution !0 = $a and hence now we can relate [3] => ZEND_ASSIGN T(0) T(0) 3 very well.

Above table also returns back the number of operations number of ops: 13 which can be used to benchmark and performance enhancement of your PHP script.

If APC cache is enabled, it caches the opcodes and thereby avoiding repetitive lexing/parsing/compiling every time same PHP script is called.

3 PHP extensions providing interface to Zend Engine:
Below are 3 very useful PHP extensions for geeky PHP developers. (Specially helpful for all PHP extension developers)

  • Tokenizer: The tokenizer functions provide an interface to the PHP tokenizer embedded in the Zend Engine. Using these functions you may write your own PHP source analyzing or modification tools without having to deal with the language specification at the lexical level.
  • Parsekit: These parsekit functions allow runtime analysis of opcodes compiled from PHP scripts.
  • Vulcan Logic Disassembler (vld): Provides functionality to dump the internal representation of PHP scripts. Homepage of VLD project for download instructions.

Hope this is of some help for PHP geeks out there.