How to use locks for assuring atomic operation in Memcached?

Memcached provide atomic increment and decrement commands to manipulate integer (key,value) pairs. However special care should be taken to ensure application performance and possible race conditions while using memcached. In this blog post, I will first build a facebook style “like” application using atomic increment command of memcached. Also, I will discuss various technical difficulty one would face while ensuring atomicity in this application. Finally, I will demo how to ensure atomicity over a requested process using custom locks in memcached.

Where should I care about it?
Lets consider a sample application as depicted by the flow diagram below:
Facebook style "like" demo architecture using "memcached"

The above application is similar to facebook “like” feature. In brief, we maintain a key per post e.g. $key="post_id_1234_likes_count", storing count of users who liked this post. Another $key="post_id_1234_user_id_9999", stores user_id_9999 relationship with post_id_1234. Example, “liked” which is set to 1 if liked and “timestamp” which is the time when user liked this post.

Since this application is going to reside on a high traffic website, earlier design decisions are made to have memcached in-front of MySQL database and will act as the primary storage medium with periodic syncs to the database. For me a like/dislike functionality is not so important as compared to other social functionality on my website.

Here is a sample code for the above functionality:

	$mem = new Memcache;
	$mem->addServer("127.0.0.1", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		// get old count
		$old_count = $mem->get($key);

		// false means no one liked this post before
		if($old_count === FALSE) $old_count = 0;

		// increment count
		$new_count = $old_count+1;

		// set new count value
		if($mem->set($key, $new_count, 0, 0)) {
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}
		else {
			error_log("Error occurred in incrementing key ".$key);
			return FALSE;
		}
	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

Why should I care about it?
Save the above code sample in a file called memcached_no_lock.php and hit the url http://localhost/memcached_no_lock.php?post_id=1234 five times. Verify the key value in memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 2
5
END

Alright, application seems to give expected results. Next, lets verify this application for high traffic websites using apache benchmark:

centurydaily-lm:Workspace sabhinav$ ab -n 100 -c 10 http://localhost/memcached_no_lock.php?post_id=1234
Concurrency Level:      10
Time taken for tests:   0.090 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      22200 bytes
HTML transferred:       0 bytes
Requests per second:    1112.03 [#/sec] (mean)

Verify the key value in memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 2
36
END

What happened? We expected value for $key="post_id_1234_likes_count" to reach 100, but actually it is 36. What went wrong? This behavior can be explained by simply looking at the apache error log file:

[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3

Ohk, from above log we understand concurrency killed our application, since we see $key being incremented to the same value by more than 1 incoming request.

How should I take care of this?
Below is the modified code sample which will allow us atomic increments:

	$mem = new Memcache;
	$mem->addServer("127.0.0.1", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		$new_count = $mem->increment($key, 1);
		if($new_count === FALSE) {
			$new_count = $mem->add($key, 1, 0, 0);
			if($new_count === FALSE) {
				error_log("Someone raced us for first count on key ".$key);
				$new_count = $mem->increment($key, 1);
				if($new_count === FALSE) {
					error_log("Unable to increment key ".$key);
					return FALSE;
				}
				else {
					error_log("Incremented key ".$key." to ".$new_count);
					return TRUE;
				}
			}
			else {
				error_log("Initialized key ".$key." to ".$new_count);
				return TRUE;
			}
		}
		else {
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}

	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

To ensure atomicity, we start with incrementing the $key="post_id_1234_likes_count". Since memcached increment() is atomic by itself, we need not put any locking mechanism in here. However, memcached increment returns FALSE, if the $key doesn’t already exists.

Hence, if we get a FALSE response from the first increment, we will try to initialize $key using memcached add() command. Good thing about memcached add is that, it will return a false FALSE, if the $key is already present. Hence, if more than one thread is trying to initialize $key, only one of them will succeed. All the rest of the threads will return FALSE for add command. Finally, if the response is FALSE from the first add, we will try to increment the $key again.

Lets try to test this modified code with apache benchmark. Also, this time we will increase concurrency from 10 to 100 threads. Save the above modified code in a file called memcached_lock.php and issue the following ab command:

centurydaily-lm:Workspace sabhinav$ ab -n 10000 -c 100 http://localhost/memcached_lock.php?post_id=1234
Concurrency Level:      100
Time taken for tests:   11.006 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      2224884 bytes
HTML transferred:       0 bytes
Requests per second:    908.61 [#/sec] (mean)

Lets verify the key value inside memcached:

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 5
10000
END

Bingo! As desired we have a value of 10000 for $key inside memcached.

Using custom locks for atomicity:
There can be many instances where you SHOULD try to process a request atomically using locks. For e.g. while trying to fetch a query from database or while trying to regenerate a requested page template in your custom template caching engine.

In the example below, I will modify the memcached_lock.php script to ensure atomic increments without using increment command. Instead I will use custom locks using memcached:

	$mem = new Memcache;
	$mem->addServer("127.0.0.1", 11211);

	function incrementLikeCount($post_id) {
		global $mem;

		// prepare post key
		$key = "post_id_".$post_id."_likes_count";

		// initialize lock
		$lock = FALSE;

		// initialize configurable parameters
		$tries = 0;
		$max_tries = 1000;
		$lock_ttl = 10;

		$new_count = $mem->get($key); // fetch older value
	    while($lock === FALSE && $tries < $max_tries) {
	    	if($new_count === FALSE) $new_count = 0;
	        $new_count = $new_count + 1;

			// add() will return false if someone raced us for this lock
                       // ALWAYS USE add() FOR CUSTOM LOCKS
	        $lock = $mem->add("lock_".$new_count, 1, 0, $lock_ttl);

			$tries++;
			usleep(100*($tries%($max_tries/10))); // exponential backoff style of sleep
		}

		if($lock === FALSE && $tries >= $max_tries) {
			error_log("Unable to increment key ".$key);
			return FALSE;
		}
		else {
	    	$mem->set($key, $new_count, 0, 0);
			error_log("Incremented key ".$key." to ".$new_count);
			return TRUE;
		}

	}

	// get incoming parameters
	$post_id = $_GET['post_id'];

	// take action
	incrementLikeCount($post_id);

Try testing it using apache benchmark as above and then verify it with memcached.

centurydaily-lm:Workspace sabhinav$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
get post_id_1234_likes_count
VALUE post_id_1234_likes_count 0 3
100
END

Benchmarks:
We see a drop in performance from 1112 hits/sec (memcached_no_lock) to 908 hits/sec (memcached_lock using increment). This is majorly because of increased concurrency. At same concurrency level of 10, I received a performance benchmark of 1128 hits/sec with our thread protected code. However, for our custom lock code above, I received a performance benchmark of 275 hits/sec.

Conclusion:
Always use memcached increment/decrement while dealing with locks on integer valued keys. For achieving locks on a process, use custom locks as demoed above using memcached add command. Also custom locks are subjected to configurable options like $max_tries and others.

Hope you enjoyed reading.
Do let me know through your comments.