Memcache

Memcache:
Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.

It’s fast, flexible and lightweight, and if you believe the stories, installing memcache on your servers automatically increases your website speed tenfold or more.

At heart it is a simple Key/Value store.

In reality, what helps your website or application is having a good caching strategy in place.

What is Memcache made up of?
1. Client software, which is given a list of available memcached servers.
2. A client-based hashing algorithm, which chooses a server based on the “key” input.
3. Server software, which stores your values with their keys into an internal hash table.
4. Server algorithms, which determine when to throw out old data (if out of memory), or reuse memory.
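
Putting the first three pieces together from the client side, here is a minimal sketch using the PHP Memcached extension (server addresses are placeholders):

$mc = new Memcached();
// 1. give the client a list of available memcached servers
$mc->addServers(array(
    array('10.0.0.1', 11211),
    array('10.0.0.2', 11211),
));
// 2. the client hashes the key to pick one of those servers,
// 3. which then stores the value under that key in its internal hash table
$mc->set('user:1234', array('name' => 'Alice'), 300);
$user = $mc->get('user:1234');   // the same hash sends the get to the same server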

Design Philosophies:

Simple Key/Value Store
The server does not care what your data looks like. Items are made up of a key, an expiration time, optional flags, and raw data. It does not understand data structures; you must upload data that is pre-serialized. The implementation is partially in the client and partially in the server. Clients understand how to send items to particular servers, what to do when they cannot contact a server, and how to fetch keys from the servers. The servers understand how to receive items and how to expire them.

Memcached servers are generally unaware of each other. There is no crosstalk, no synchronization, no broadcasting. The lack of interconnections means adding more servers will usually add more capacity, as you would expect.

O(1) Everything
For everything it can, memcached commands are O(1). Most memcached operations (add, get, set, flush, etc.) are O(1), meaning they are constant-time functions: it does not matter how many items there are inside the cache, they take just as long as they would with only 1 item in the cache.

Having O(1) functions has a lot of advantages (memcache is and STAYS fast), but also some drawbacks. For instance: you cannot iterate over all your items inside memcache (if you need to do that, you are seriously misusing memcache). An iterator would be an O(N) operation, which means the time it takes doubles when the number of items in the cache doubles.

Forgetting Data is a Feature (LRU):
Before you start a memcache daemon, you have to tell it how much memory it may use. That memory is dedicated to memcache storage: if you give it 1 GB, that 1 GB is claimed for memcache and cannot be used for anything else. So what happens once that whole 1 GB is filled with data objects? How can we fit new data into memcache?

To make room for new data, memcache will delete old data, but before deleting anything it has to know which data to delete. How does it decide?

Memcache uses a technique called LRU (Least Recently Used): it identifies the item that has not been used for the longest period of time and deletes it. Note that this is not necessarily the largest item.

Memcached is, by default, a Least Recently Used cache. It is designed to have items expire after a specified amount of time. Both of these are elegant solutions to many problems: expire items after a minute to limit stale data being returned, or flush unused data in an effort to retain frequently requested information.

This further allows great simplification in how memcached works. No “pauses” waiting for a garbage collector ensures low latency, and free space is lazily reclaimed.

Internally, all objects have a “counter”. This counter holds a timestamp. Every time a new object is created, that counter is set to the current time. When an object gets FETCHED, its counter is reset to the current time as well. As soon as memcache needs to “evict” an object to make room for newer objects, it looks for the lowest counter: that is the object that has never been fetched, or was fetched the longest time ago (and probably isn’t needed that much, otherwise the counter would be closer to the current timestamp).
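
For example, with the PHP Memcached extension you can give an item a 60-second expiration time; when the get misses (expired or evicted), you rebuild the value and cache it again (the database helper below is hypothetical):

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$rows = $mc->get('latest:news');
if ($rows === false && $mc->getResultCode() === Memcached::RES_NOTFOUND) {
    // expired or evicted: rebuild from the database and cache it for 60 seconds
    $rows = fetch_latest_news_from_db();   // hypothetical helper
    $mc->set('latest:news', $rows, 60);
}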

Cache Invalidation is a Hard Problem:
Cache invalidation in general is a hard problem, but given memcached’s centralized-as-a-cluster nature, the mechanical job of invalidating a cache entry is trivial. Because every key maps deterministically to exactly one server, clients do not broadcast to all available hosts; they home in on the exact location of the data to be invalidated.
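
With the PHP client, invalidating an entry is just a delete on the key; the client hashes the key and contacts only the one server that holds it:

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);
$mc->delete('user:1234');   // only the server that owns this key is contacted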

Command line arguments:
Memcached comes equipped with basic documentation about its commandline arguments. View memcached -h or man memcached for up to date documentation. The service strives to have mostly sensible defaults.

When setting up memcached for the first time, pay close attention to -m and -d.

-m tells memcached how much RAM to use for item storage (in megabytes). Note carefully that this isn’t a global memory limit, so memcached will use a few % more memory than you tell it to. Set this to a safe value: setting it to less than 48 megabytes does not work properly in 1.4.x and earlier; it will still use the memory.

-d tells memcached to daemonize. If you’re running from an init script you may not be setting this. If you’re using memcached for the first time, it might be educational to start the service without -d and watch it.
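
For example, to start a daemonized instance that may use 64 megabytes for item storage on the default port:

$ memcached -m 64 -p 11211 -d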

Connection Limit:
By default the maximum number of concurrent connections is set to 1024. Configuring this correctly is important. Extra connections to memcached may hang while waiting for slots to free up. You can detect whether your instance has been running out of connections by issuing a stats command and looking at “listen_disabled_num”. That value should be zero or close to zero.
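
Assuming a local instance on the default port and that netcat is installed, one way to check that stat from the command line is:

$ printf "stats\nquit\n" | nc 127.0.0.1 11211 | grep listen_disabled_num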

Memcached can scale with a large number of connections very simply. The amount of memory overhead per connection is low (even lower if the connection is idle), so don’t sweat setting it very high.

Let’s say you have 5 webservers, each running Apache, and each Apache instance has a MaxClients setting of 12. This means that the maximum number of concurrent connections you may receive is 5 x 12 = 60. Always leave a few extra slots open if you can, for administrative tasks, adding more webservers, crons/scripts, etc.
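
The connection limit itself is raised with the -c flag, for example:

$ memcached -m 64 -c 2048 -d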

Threading:
Threading is used to scale memcached across CPUs. Its model uses “worker threads”, meaning that each thread makes itself available to process as much work as possible. Since libevent allows good scalability with concurrent connections, each thread is able to handle many clients.

This is different from some webservers, such as apache, which use one process or one thread per active client connection. Since memcached is highly efficient, low numbers of threads are fine. In webserver land, it means it’s more like nginx than apache.

By default 4 threads are allocated. Unless you are running memcached extremely hard, you should not set this number any higher. Setting it to very large values (80+) will not make it run any faster.
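
The thread count is controlled with the -t flag, for example:

$ memcached -m 64 -t 4 -d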

Memory allocation:
Memory allocation is an area that lies outside the reach of most PHP developers. That is because memory management is normally taken care of by the underlying operating system and the language runtime. Memcache, however, is written in C, so it has its own memory allocation and management.
Memcache’s memory manager will allocate the maximum amount of memory from the operating system that you have set (for instance, 64 MB, though in practice a bit more). From that point on, it uses its own memory management system called the slab allocator.

Slab allocation:
When memcache starts, it partitions its allocated memory into smaller parts called pages. Each page is 1 MB large (which is also the default maximum size of an object you can store in memcache). Each of those pages can be assigned to a slab class, or can be unassigned (a free page). A slab class decides how large the objects are that can be stored inside that particular page. Each page that is assigned to a particular slab class is divided into smaller parts called chunks. The chunks in each slab have the same size, so there cannot be two differently sized chunks inside the same page. For instance, there could be a page with 64-byte chunks (slab class 1), a page with 128-byte chunks (slab class 2) and so on, until we get to the largest slab class with only 1 chunk (the 1 MB chunk). There can be multiple pages for each slab class, but as soon as a page is assigned to a slab class (and thus split up into chunks), it cannot be changed to another slab class.

The smallest chunk size starts at 80 bytes and increases by a factor of 1.25 (rounded up to keep chunks aligned). So the second smallest chunk size is 100 bytes, and so on. You can see the actual sizes by passing the -vv flag when starting memcache. You can also set the growth factor (-f) and the minimum item size (-n), but unless you really know what you are doing, don’t change the defaults.

$ memcached -vv
slab class   1: chunk size     80 perslab 13107
slab class   2: chunk size    100 perslab 10485
slab class   3: chunk size    128 perslab  8192
slab class   4: chunk size    160 perslab  6553
slab class   5: chunk size    200 perslab  5242
slab class   6: chunk size    252 perslab  4161
slab class   7: chunk size    316 perslab  3318
slab class   8: chunk size    396 perslab  2647
slab class   9: chunk size    496 perslab  2114

Here you can see the slab class numbers, the chunk size inside each slab and how many chunks fit per page (the higher the chunk size, the lower the number of chunks, obviously). Memcache will initially create 1 page per slab class and the rest of the pages will be free (whichever slab class needs a page gets one). Now that memcache has partitioned the memory, it can add data to the slabs. Suppose I have a data object of 105 bytes (this includes the overhead from memcache, so what you actually store inside memcache is a bit less). Memcache knows it should use a chunk from slab class 3, since that is the first class where a 105-byte object fits. This also means that we pay a penalty of 128 - 105 = 23 bytes that go unused. There is NO WAY that memory can be used for something else: the chunk is marked as used and that’s it. It’s the penalty we pay for using the slab allocator, but it also means our memory doesn’t get fragmented. In fact, it’s all a speed versus wasted-memory trade-off.
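
A rough sketch of how a slab class could be picked for an item (this only illustrates the idea, it is not memcached’s actual code; real chunk sizes are additionally aligned upward, which is why the real class 3 is 128 bytes rather than 125):

// walk the chunk sizes (starting at 80 bytes, growth factor 1.25) until the item fits
function slab_class_for($item_size, $min_chunk = 80, $factor = 1.25) {
    $class = 1;
    $chunk = $min_chunk;
    while ($chunk < $item_size) {
        $chunk = (int) ceil($chunk * $factor);
        $class++;
    }
    return array($class, $chunk);
}

list($class, $chunk) = slab_class_for(105);
echo "class $class, chunk $chunk, wasted " . ($chunk - 105) . " bytes\n";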

Now, as soon as a complete page is full (all chunks in the page are filled) and we need to add another piece of data, memcache will fetch a new free page, assign it to the specified slab class, partition it into chunks and use the first available chunk to store the data. But as soon as there are no more free pages left to assign to our slab class, it will use the LRU algorithm to evict one of the existing chunks to make room. This means that when we need a 128-byte chunk, it will evict a 128-byte chunk, even though there might be a 256-byte chunk that is even older. Each slab class has its own LRU.

A hypothetical example:

If you fill your whole cache with 128-byte chunked data (slab class 3 in our case), memcache will assign all free pages to that slab class. Every other slab class would in effect be left with only its 1 initial page, which means that once you have stored one object in the 1 MB slab class and want to store a second 1 MB object, memcache cannot allocate a new page for it. It must use the LRU to evict something, and since there can only be one object in the 1 MB slab class, that first object is the one that gets removed. If you got that last paragraph, you understand how the slab allocator works.
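
On a running instance you can watch how pages and chunks are distributed over the slab classes with the stats slabs command (assuming a local server and netcat):

$ printf "stats slabs\nquit\n" | nc 127.0.0.1 11211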

Consistent hashing:

Consistent Hashing is a model that allows for more stable distribution of keys given addition or removal of servers. In a normal hashing algorithm, changing the number of servers can cause many keys to be remapped to different servers, causing huge sets of cache misses. Consistent Hashing describes methods for mapping keys to a list of servers, where adding or removing servers causes a very minimal shift in where keys map to.
Your web application can talk to multiple memcache servers at the same time. You only need to give your application a list of the IPs where your memcache servers are located, and it will automatically use all of them.

As soon as we add an object to memcache, it will automatically choose a server where it can store the data. With only one server that decision is easy, but as soon as you have multiple servers running it must find a way to place items. Lots of different algorithms could be used for this, for instance a round-robin system that stores each object on the next server on every save action (1st object saved to server 1, 2nd object to server 2, 3rd object to server 1, and so on). But how would such a system know the correct server when we want to fetch that data back?

What memcache normally does is a simple, yet very effective load-balancing trick: for each key that gets stored or fetched, it creates a hash (think of it as md5(key), although in fact it’s a more specialized, quicker hash method). The hashes it creates are pretty much evenly distributed, so we can use a modulus to find out which server to store the object on:

In php code, it would do something like this:

$server_id = hashfunc($key) % $servercount;

That’s great. Now we can deduce which server holds the specified key just from this simple formula. The trouble with this system: as soon as $servercount (the number of servers) changes, almost 100% of all keys will change server as well. Some keys may happen to map to the same server, but that is coincidence. In effect, when you change your memcache server count (either up or down, it doesn’t matter), you get a big stampede on your backend systems since almost all keys are invalidated at once.
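
A quick illustration of that claim, using PHP’s crc32() as a stand-in hash: count how many keys map to a different server when the pool grows from 4 to 5 servers.

$moved = 0;
$total = 10000;
for ($i = 0; $i < $total; $i++) {
    $key = "user:$i";
    if (crc32($key) % 4 !== crc32($key) % 5) {
        $moved++;
    }
}
printf("%.1f%% of keys moved\n", 100 * $moved / $total);   // typically around 80%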

Now, let’s meet consistent hashing. With this algorithm, we do not have to worry (much) about keys changing servers when your server count goes up or down. Here’s how it works:

Consistent hashing uses a counter that acts like a clock: once it reaches “12”, it wraps around to “1” again. Suppose this counter is 16 bits, so it ranges from 0 to 65535. Both servers and keys are hashed onto positions on this ring. To find the server for a key, you walk the ring from the key’s position until you reach the first server. When a server is added or removed, only the keys between that server and its neighbour on the ring move; all other keys keep mapping to the same servers.
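
Most clients can do this for you. With the PHP Memcached extension, for example, you can switch the key distribution to consistent (ketama-style) hashing like this:

$mc = new Memcached();
$mc->setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT);
$mc->setOption(Memcached::OPT_LIBKETAMA_COMPATIBLE, true);   // interoperate with other ketama-compatible clients
$mc->addServers(array(
    array('10.0.0.1', 11211),
    array('10.0.0.2', 11211),
    array('10.0.0.3', 11211),
));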

Failure, or Failover:
What will your client do when a server is unavailable or provides an invalid response?

In the dark days of memcached, the default was to always “failover” by trying the next server in the list. That way, if a server crashes, its keys get reassigned to other instances and everything moves on happily. The catch is that when the “dead” server was only briefly unreachable and comes back, clients can end up reading and writing the same keys on two different servers, silently serving stale and inconsistent data; that is why blind failover is no longer the recommended default.

Managing Connection Objects:
A common first-timer failure is that no matter what you do, you seem to run memcached flat out of connections. Your small server is allocating 50,000 connections to memcached and you have no idea what’s going on.

Be wary of how you manage your connection objects! If you are constantly initializing connection objects every time you wish to contact memcached, odds are good you’re going to leak connections.

Some clients (like the PHP ones) have a less obvious approach to managing how many connections they will open. Continually calling ‘addServer()’ may just leak connections on you, even if you’ve already added that server. Read your client’s documentation to confirm which actions create connections and which do not.
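
A common pattern with the PHP Memcached extension is to use a persistent connection pool and only add servers while the pool is still empty (server address is a placeholder):

$mc = new Memcached('my_pool');            // persistent connection id, reused across requests
if (!count($mc->getServerList())) {        // only add servers the first time this pool is created
    $mc->addServer('127.0.0.1', 11211);
}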

Persistent Storage:
Many users want K/V stores that are able to persist values beyond a restart, or beyond available physical memory. In many cases memcached is not a great fit here; expensive flash memory is needed to keep data access performant. In most common scenarios, disaster can ensue if a server is unavailable and later comes up with old data. Users see stale data, features break, etc.

Reference:
https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/
https://code.google.com/p/memcached/wiki/NewUserInternals
https://github.com/memcached/memcached
