How to use locks in PHP cron jobs to avoid cron overlaps

Cron jobs are the hidden building blocks of most websites. They are generally used to process or aggregate data in the background. However, as a website grows and every cron job has gigabytes of data to process, a run can take longer than its schedule interval, so consecutive runs may overlap and possibly corrupt our data. In this blog post, I will demonstrate how we can avoid such overlaps using a simple locking technique. I will also discuss a few edge cases we need to consider while using locks to avoid overlaps.

Cron job helper class
Here is a helper class (cron.helper.php) which will help us avoid cron job overlaps (see the usage example below).

<?php

	define('LOCK_DIR', '/Users/sabhinav/Workspace/cronHelper/');
	define('LOCK_SUFFIX', '.lock');

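	class cronHelper {

		private static $pid;

		// Static-only helper: block instantiation and cloning.
		private function __construct() {}

		private function __clone() {}

		// Lock file path, named after the script's base name (e.g. job.php.lock),
		// so that "php /some/path/job.php" does not leak the path into the file name.
		private static function lockfile() {
			global $argv;
			return LOCK_DIR.basename($argv[0]).LOCK_SUFFIX;
		}

		// TRUE if a process with the stored pid is currently listed by ps.
		private static function isrunning() {
			$pid = trim((string) self::$pid);
			if($pid === '')
				return FALSE; // empty lock file: nothing to compare against
			$pids = array_map('trim', explode(PHP_EOL, (string) `ps -e | awk '{print $1}'`));
			return in_array($pid, $pids, TRUE);
		}

		public static function lock() {
			$lock_file = self::lockfile();

			if(file_exists($lock_file)) {
				// A lock exists: check whether the job that wrote it is still alive.
				self::$pid = trim(file_get_contents($lock_file));
				if(self::isrunning()) {
					error_log("==".self::$pid."== Already in progress...");
					return FALSE;
				}
				else {
					error_log("==".self::$pid."== Previous job died abruptly...");
				}
			}

			// No lock, or a stale one: record our own pid and proceed.
			self::$pid = getmypid();
			file_put_contents($lock_file, self::$pid);
			error_log("==".self::$pid."== Lock acquired, processing the job...");
			return self::$pid;
		}

		public static function unlock() {
			$lock_file = self::lockfile();

			if(file_exists($lock_file))
				unlink($lock_file);

			error_log("==".self::$pid."== Releasing lock...");
			return TRUE;
		}

	}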

?>

Using cron.helper.php
Here is how the helper class can be integrated into your current cron job code:

  • Save cron.helper.php in a folder called cronHelper
  • Update LOCK_DIR as per your need
  • You might have to set proper permissions on the cronHelper folder, so that the running cron job has write permission
  • Wrap your cron job code as shown below:
    <?php
    
    	require 'cronHelper/cron.helper.php';
    
    	if(($pid = cronHelper::lock()) !== FALSE) {
    
    		/*
    		 * Cron job code goes here
    		*/
    		sleep(10); // Cron job code for demonstration
    
    		cronHelper::unlock();
    	}
    
    ?>

Is it working? Verify
Let's verify whether the helper class really takes care of all the edge cases.

  • sleep(10) is our cron job code for this test
  • Run from command line:
    sabhinav$ php job.php
    ==40818== Lock acquired, processing the job...
    ==40818== Releasing lock...
    

    where 40818 is the process id of the currently running cron job

  • Run from the command line and terminate the cron job midway by pressing Ctrl+C:
    sabhinav$ php job.php
    ==40830== Lock acquired, processing the job...
    

    By pressing Ctrl+C, we simulate the cases where a cron job dies midway due to a fatal error or system shutdown. In such cases, the helper class fails to release the lock on this cron job.

  • With the lock in place (ls -l cronHelper | grep lock), run from command line:
    sabhinav$ php job.php
    ==40830== Previous job died abruptly...
    ==40835== Lock acquired, processing the job...
    ==40835== Releasing lock...
    

    As seen, the helper class detects that the previous cron job died abruptly and then allows the current job to run successfully.

  • Run the cron job from two command line windows; one of them will not proceed, as shown below:
    centurydaily-lm:cronHelper sabhinav$ php job.php
    ==40856== Already in progress...
    

    One of the cron jobs will exit, since a cron job with $pid=40856 is already in progress.

Working of cron.helper.php
The helper class creates a lock file inside LOCK_DIR. For our test cron job above, the lock file name will be job.php.lock. The lock file name suffix can be configured using LOCK_SUFFIX.

cronHelper::lock() writes the process id of the currently running cron job into the lock file. Upon job completion, cronHelper::unlock() deletes the lock file.

If cronHelper::lock() finds that the lock file already exists, it reads the previous cron job's process id from the lock file and checks whether that job is still running. If the previous job is still in progress, we abort the current job. If the previous job is no longer running, i.e. it died abruptly, the current cron job acquires the lock.
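The "is the previous job still running?" check can also be done without parsing ps output. Below is a minimal sketch, assuming PHP's POSIX extension is usually available on the server (it falls back to ps -p otherwise). The function name pid_is_alive is hypothetical and not part of cron.helper.php, and the errno value 1 for EPERM is an assumption that holds on Linux and macOS.

```php
<?php

// Hypothetical helper, not part of cron.helper.php.
// Returns TRUE if a process with the given pid appears to exist.
function pid_is_alive($pid) {
	$pid = (int) trim((string) $pid);
	if ($pid <= 0)
		return FALSE; // empty or corrupt lock file content

	if (function_exists('posix_kill')) {
		// Signal 0 sends nothing; it only performs the existence/permission check.
		if (posix_kill($pid, 0))
			return TRUE;
		// Assumption: errno 1 is EPERM, i.e. the process exists but belongs to another user.
		return posix_get_last_error() === 1;
	}

	// Fallback without the POSIX extension: ps -p exits non-zero if no such pid exists.
	exec('ps -p '.$pid, $output, $status);
	return $status === 0;
}

// The pid read from a lock file is a string, possibly with trailing whitespace.
var_dump(pid_is_alive(getmypid()."\n"));
```

Compared to listing every process with ps -e, this asks the operating system about one pid only. Like the ps approach, it cannot tell whether a recycled pid now belongs to some unrelated process.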

This is a classic method for avoiding cron overlaps. However, there are various other methods of achieving the same thing. If you know any, do let me know through your comments.
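One such alternative is advisory file locking with PHP's flock(). The operating system releases the lock when the process exits, even if it is killed, so no stale-lock check is needed, and taking the lock is a single atomic step. The following is only a sketch under stated assumptions: acquire_flock() and release_flock() are hypothetical names, and the lock file location under sys_get_temp_dir() is chosen purely for illustration.

```php
<?php

// Hypothetical helpers for illustration; not part of cron.helper.php.

// Returns an open file handle holding the lock, or FALSE if another process holds it.
function acquire_flock($lock_file) {
	// 'c+' creates the file if it is missing and does not truncate an existing one.
	$handle = fopen($lock_file, 'c+');
	if ($handle === FALSE)
		return FALSE;

	// LOCK_NB makes flock() return immediately instead of waiting for the lock.
	if (!flock($handle, LOCK_EX | LOCK_NB)) {
		fclose($handle);
		return FALSE;
	}

	// Store our pid in the file, for debugging only; the lock itself is the flock.
	ftruncate($handle, 0);
	fwrite($handle, (string) getmypid());
	fflush($handle);
	return $handle;
}

function release_flock($handle) {
	flock($handle, LOCK_UN);
	fclose($handle);
}

// Usage: the handle must stay open for as long as the job runs.
$lock_file = sys_get_temp_dir().'/job.php.lock'; // assumed location
if (($handle = acquire_flock($lock_file)) !== FALSE) {
	// Cron job code goes here
	release_flock($handle);
}
else {
	error_log("Already in progress...");
}
```

Note that the lock file is deliberately left on disk after the job ends; with this approach only the lock on the file matters, not its existence.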

Calendar with Auto-Notification: API and demo

I was quite impressed with Google Calendar, Yahoo Calendar and Outlook, and wondered how exactly such a calendar is built. What are the challenges? So I thought of making a clone of one of them to see whether I could achieve the same level of polish.

I must say my one day of work did indeed bring a smile to my face, as I could see my application up and running. Here it is; try it out for yourself and let me know if it worked for you. You can even use it for your day-to-day needs, my server is up and running forever 😉

Click to visit the Calendar API Demo Page

Important before you try it out:

  1. It will ask you for an email id and a password.
  2. Give the personal email id where you want to receive event notifications.
  3. The password can be anything; you can reuse it to log in later.
  4. It will send an email to the above email id 30 minutes before the event starts.
  5. For example, if you mark an event for 26th July, 2008 – 06:00 PM, you will receive a mail notification at 26th July, 2008 – 05:30 PM.
  6. There must be at least an hour's gap between the event time and the moment you mark it.
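The timing rules in the list above can be sketched in a few lines of PHP. This is not the demo's actual code; notification_time() and can_schedule() are hypothetical names, and only the 30-minute lead time and the one-hour minimum gap are taken from the rules above. UTC is used here just to keep the example deterministic.

```php
<?php

// Hypothetical helpers illustrating the rules above; not the demo's real implementation.

// The notification goes out 30 minutes before the event starts.
function notification_time(DateTimeImmutable $event_start) {
	return $event_start->sub(new DateInterval('PT30M'));
}

// An event can be marked only if it starts at least one hour (3600 seconds) from now.
function can_schedule(DateTimeImmutable $event_start, DateTimeImmutable $now) {
	return $event_start->getTimestamp() - $now->getTimestamp() >= 3600;
}

$utc   = new DateTimeZone('UTC');
$event = new DateTimeImmutable('2008-07-26 18:00:00', $utc);

echo notification_time($event)->format('Y-m-d H:i'), PHP_EOL; // 2008-07-26 17:30
```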

STEP 1:



STEP 2:



STEP 3:

That's it, try it out. It works perfectly for me. Let me know your feedback.

Thanks
Abhinav