Memcached provide atomic increment and decrement commands to manipulate integer (key,value) pairs. However special care should be taken to ensure application performance and possible race conditions while using memcached. In this blog post, I will first build a facebook style “like” application using atomic increment
command of memcached. Also, I will discuss various technical difficulty one would face while ensuring atomicity in this application. Finally, I will demo how to ensure atomicity over a requested process using custom locks in memcached.
Where should I care about it?
Lets consider a sample application as depicted by the flow diagram below:
The above application is similar to facebook “like” feature. In brief, we maintain a key per post e.g. $key="post_id_1234_likes_count"
, storing count of users who liked this post. Another $key="post_id_1234_user_id_9999"
, stores user_id_9999 relationship with post_id_1234. Example, “liked” which is set to 1 if liked and “timestamp” which is the time when user liked this post.
Since this application is going to reside on a high traffic website, earlier design decisions are made to have memcached in-front of MySQL database and will act as the primary storage medium with periodic syncs to the database. For me a like/dislike functionality is not so important as compared to other social functionality on my website.
Here is a sample code for the above functionality:
$mem = new Memcache; $mem->addServer("127.0.0.1", 11211); function incrementLikeCount($post_id) { global $mem; // prepare post key $key = "post_id_".$post_id."_likes_count"; // get old count $old_count = $mem->get($key); // false means no one liked this post before if($old_count === FALSE) $old_count = 0; // increment count $new_count = $old_count+1; // set new count value if($mem->set($key, $new_count, 0, 0)) { error_log("Incremented key ".$key." to ".$new_count); return TRUE; } else { error_log("Error occurred in incrementing key ".$key); return FALSE; } } // get incoming parameters $post_id = $_GET['post_id']; // take action incrementLikeCount($post_id);
Why should I care about it?
Save the above code sample in a file called memcached_no_lock.php
and hit the url http://localhost/memcached_no_lock.php?post_id=1234 five times. Verify the key value in memcached:
centurydaily-lm:Workspace sabhinav$ telnet localhost 11211 Trying ::1... Connected to localhost. Escape character is '^]'. get post_id_1234_likes_count VALUE post_id_1234_likes_count 0 2 5 END
Alright, application seems to give expected results. Next, lets verify this application for high traffic websites using apache benchmark:
centurydaily-lm:Workspace sabhinav$ ab -n 100 -c 10 http://localhost/memcached_no_lock.php?post_id=1234 Concurrency Level: 10 Time taken for tests: 0.090 seconds Complete requests: 100 Failed requests: 0 Write errors: 0 Total transferred: 22200 bytes HTML transferred: 0 bytes Requests per second: 1112.03 [#/sec] (mean)
Verify the key value in memcached:
centurydaily-lm:Workspace sabhinav$ telnet localhost 11211 Trying ::1... Connected to localhost. Escape character is '^]'. get post_id_1234_likes_count VALUE post_id_1234_likes_count 0 2 36 END
What happened? We expected value for $key="post_id_1234_likes_count"
to reach 100, but actually it is 36. What went wrong? This behavior can be explained by simply looking at the apache error log file:
[Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 1 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 2 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3 [Sat Dec 05 14:32:08 2009] [error] [client ::1] Incremented key post_id_1234_likes_count to 3
Ohk, from above log we understand concurrency killed our application, since we see $key being incremented to the same value by more than 1 incoming request.
How should I take care of this?
Below is the modified code sample which will allow us atomic increments:
$mem = new Memcache; $mem->addServer("127.0.0.1", 11211); function incrementLikeCount($post_id) { global $mem; // prepare post key $key = "post_id_".$post_id."_likes_count"; $new_count = $mem->increment($key, 1); if($new_count === FALSE) { $new_count = $mem->add($key, 1, 0, 0); if($new_count === FALSE) { error_log("Someone raced us for first count on key ".$key); $new_count = $mem->increment($key, 1); if($new_count === FALSE) { error_log("Unable to increment key ".$key); return FALSE; } else { error_log("Incremented key ".$key." to ".$new_count); return TRUE; } } else { error_log("Initialized key ".$key." to ".$new_count); return TRUE; } } else { error_log("Incremented key ".$key." to ".$new_count); return TRUE; } } // get incoming parameters $post_id = $_GET['post_id']; // take action incrementLikeCount($post_id);
To ensure atomicity, we start with incrementing the $key="post_id_1234_likes_count"
. Since memcached increment()
is atomic by itself, we need not put any locking mechanism in here. However, memcached increment returns FALSE, if the $key doesn’t already exists.
Hence, if we get a FALSE response from the first increment, we will try to initialize $key using memcached add()
command. Good thing about memcached add is that, it will return a false FALSE, if the $key is already present. Hence, if more than one thread is trying to initialize $key, only one of them will succeed. All the rest of the threads will return FALSE for add command. Finally, if the response is FALSE from the first add, we will try to increment the $key again.
Lets try to test this modified code with apache benchmark. Also, this time we will increase concurrency from 10 to 100 threads. Save the above modified code in a file called memcached_lock.php
and issue the following ab
command:
centurydaily-lm:Workspace sabhinav$ ab -n 10000 -c 100 http://localhost/memcached_lock.php?post_id=1234 Concurrency Level: 100 Time taken for tests: 11.006 seconds Complete requests: 10000 Failed requests: 0 Write errors: 0 Total transferred: 2224884 bytes HTML transferred: 0 bytes Requests per second: 908.61 [#/sec] (mean)
Lets verify the key value inside memcached:
centurydaily-lm:Workspace sabhinav$ telnet localhost 11211 Trying ::1... Connected to localhost. Escape character is '^]'. get post_id_1234_likes_count VALUE post_id_1234_likes_count 0 5 10000 END
Bingo! As desired we have a value of 10000 for $key inside memcached.
Using custom locks for atomicity:
There can be many instances where you SHOULD try to process a request atomically using locks. For e.g. while trying to fetch a query from database or while trying to regenerate a requested page template in your custom template caching engine.
In the example below, I will modify the memcached_lock.php script to ensure atomic increments without using increment command. Instead I will use custom locks using memcached:
$mem = new Memcache; $mem->addServer("127.0.0.1", 11211); function incrementLikeCount($post_id) { global $mem; // prepare post key $key = "post_id_".$post_id."_likes_count"; // initialize lock $lock = FALSE; // initialize configurable parameters $tries = 0; $max_tries = 1000; $lock_ttl = 10; $new_count = $mem->get($key); // fetch older value while($lock === FALSE && $tries < $max_tries) { if($new_count === FALSE) $new_count = 0; $new_count = $new_count + 1; // add() will return false if someone raced us for this lock // ALWAYS USE add() FOR CUSTOM LOCKS $lock = $mem->add("lock_".$new_count, 1, 0, $lock_ttl); $tries++; usleep(100*($tries%($max_tries/10))); // exponential backoff style of sleep } if($lock === FALSE && $tries >= $max_tries) { error_log("Unable to increment key ".$key); return FALSE; } else { $mem->set($key, $new_count, 0, 0); error_log("Incremented key ".$key." to ".$new_count); return TRUE; } } // get incoming parameters $post_id = $_GET['post_id']; // take action incrementLikeCount($post_id);
Try testing it using apache benchmark as above and then verify it with memcached.
centurydaily-lm:Workspace sabhinav$ telnet localhost 11211 Trying ::1... Connected to localhost. Escape character is '^]'. get post_id_1234_likes_count VALUE post_id_1234_likes_count 0 3 100 END
Benchmarks:
We see a drop in performance from 1112 hits/sec (memcached_no_lock) to 908 hits/sec (memcached_lock using increment). This is majorly because of increased concurrency. At same concurrency level of 10, I received a performance benchmark of 1128 hits/sec with our thread protected code. However, for our custom lock code above, I received a performance benchmark of 275 hits/sec.
Conclusion:
Always use memcached increment/decrement while dealing with locks on integer valued keys. For achieving locks on a process, use custom locks as demoed above using memcached add command. Also custom locks are subjected to configurable options like $max_tries
and others.
Hope you enjoyed reading.
Do let me know through your comments.
Intelligent use of inner add and increment for avoiding race condition on first set. Was a nice read
Wouldn’t this be significantly less complicated if you just use CAS operations?
Hi Dustin,
Yes indeed if you use a library supporting CAS you can go ahead using the same. I know libmemcached do support that.
Pingback: Abhinav Singh’s Blog: How to use locks for assuring atomic operation in Memcached? | Development Blog With Code Updates : Developercast.com
Pingback: Abhinav Singh’s Blog: How to use locks for assuring atomic operation in Memcached? | Webs Developer
Thanks, interesting post.
Pingback: Abhinav Singh’s Blog: How to use locks for assuring atomic operation in Memcached? | TheUnical Technologies Blog
Pingback: Twitted by imoracle
In the custom locks example, shouldn’t we be doing a get() inside the while loop, in case someone else incremented the value during our time in the loop?
Also, if we’re not doing a lock on a pure “lock_” key, but rather a more global keyname (prefixed, so “lock_likevalues”, for example, to increment the key “likevalues”), shouldn’t we delete the lock key after doing the set so that other threads can immediately get the lock without waiting for the expiration timeout?
hi, great article but i have a question.
how do you know the keys you want to upgrade to bbdd in cron job?
Pingback: 利用分布式缓存实现分布式锁 | 求知若饥,虚心若愚
Pingback: Orleans vs a Distributed Cache | Richard Astbury's Blog