Putting Entropy in the Cloud
I was browsing through Twitter mentions of @adrian_otto and found one posted by Ian Thompson mentioning an article about weak randomness in the cloud. The article suggests that a Cloud Server or instance may have insufficient entropy sources, so different cloud servers may have similar or even identical entropy pools (or, worse yet, identical host keys) when they are created. That would make random number sequences easier to guess, and therefore make it easier to break encryption algorithms that depend on them.
Yes, if you have similar entropy pools, it is easier to break encryption that depends on them. Fortunately, it’s reasonably easy to work around this and make sure your entropy pool is uniquely initialized. You can consult the random(4) manual page for the Linux kernel for information about how to seed your entropy pool with a particular set of data. If you are running an application in the cloud that uses encryption, and you are concerned about the initial state of your entropy pool, use this procedure:
1) Seed your own pool from a long running system that has sufficient entropy in it, rather than relying on what you read from the kernel at startup.
2) Produce a network service that you use to seed your initial entropy pools. This service could be as simple as a set of entropy files that you create at pseudo-random time intervals and discard as you serve them to cloud server instances (as they boot up), so you never serve the same one twice. At boot time, your VM simply connects to wherever you run this service and downloads an input file to seed its entropy pool with. Restrict access to this service so that it’s only available to your own server instances.
3) Make sure that your custom entropy pool initialization takes place prior to starting your encryption software.
4) If you are creating an AMI, or other server image that you plan to clone, be sure that it does not have a host key generated yet. Delete any existing host key and allow your initialization scripts to create a fresh one when the server is created (after the entropy pool has been seeded), rather than making copies of the same one.
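As a minimal sketch of steps 1 through 3, the Python below shows one way the two halves could look. The function names, the seed directory layout, and the idea of one file per seed are assumptions made for illustration, not part of any existing tool. It relies on the documented behavior that writing to /dev/urandom mixes the data into the kernel pool without crediting any entropy.

```python
import os
from pathlib import Path
from typing import Optional


def take_seed_file(seed_dir: str) -> Optional[bytes]:
    """Server side (hypothetical helper): hand out one pre-generated seed.

    The file is deleted as soon as it has been read, so the same seed is
    never served twice. A real service would also need locking if several
    requests can arrive at the same time.
    """
    for path in sorted(Path(seed_dir).iterdir()):
        if path.is_file():
            data = path.read_bytes()
            path.unlink()
            return data
    return None  # the pool of seed files is exhausted


def mix_seed_into_pool(seed: bytes, device: str = "/dev/urandom") -> int:
    """Client side (hypothetical helper): write seed bytes to the device.

    Returns the number of bytes written. This should run at boot, before
    any encryption software or host key generation starts.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        return os.write(fd, seed)
    finally:
        os.close(fd)
```

How the seed travels from the service to the VM (HTTPS, a private network, and so on) is left out on purpose, since that depends on how you restrict access to your own instances.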
If you don’t trust what /dev/random or /dev/urandom emit, you can optionally use OpenSSL with prngd or egd as alternate entropy sources, and potentially feed in your own sensory input data. If you want to go hardcore, you could add environmental noise, such as resistor noise on the microphone input of a sound card, or some other sensory data. There is existing software for doing just that. There are all sorts of possibilities. Among them are a number of hardware RNG solutions, most of which are pretty expensive and are not options for a cloud environment. Random numbers are also offered as a service by various providers.
As Cloud Computing service providers, there are things we can do to pre-initialize your entropy pools for you when a given server instance is created, which would make the procedure above redundant. That still leaves the question of the quality of the RNG available to you on a cloud server.
There are two standard randomness sources that you should know about:
/dev/random = produces actual entropy, if you have some, and blocks otherwise.
/dev/urandom = produces output from whatever entropy is available, regardless of quality, and never blocks.
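To make the difference concrete, here is a small Python sketch that reads bytes from one of these devices and checks the kernel's entropy estimate. The helper names are made up for this example. The /proc/sys/kernel/random/entropy_avail path is the standard Linux location, and the function simply returns None where it does not exist.

```python
from typing import Optional


def read_random_bytes(count: int, device: str = "/dev/urandom") -> bytes:
    """Read up to `count` bytes from a random device.

    With /dev/urandom this returns without waiting for more entropy.
    With /dev/random the same call may block until the kernel has
    gathered enough entropy.
    """
    chunks = []
    remaining = count
    with open(device, "rb", buffering=0) as source:
        while remaining > 0:
            chunk = source.read(remaining)
            if not chunk:
                break  # end of file; only happens with regular files
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)


def entropy_available(
    path: str = "/proc/sys/kernel/random/entropy_avail",
) -> Optional[int]:
    """Return the kernel's entropy estimate in bits, or None if unknown."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None
```

For example, `read_random_bytes(32)` gives you 32 bytes suitable for a key, and `entropy_available()` lets a boot script log how full the pool was at that moment.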
The Linux kernel has a paravirtual entropy driver which provides kernel-side support for the virtual RNG hardware. The kernel compile option CONFIG_HW_RANDOM_VIRTIO enables it, and it can be built as a kernel module. There are drivers that run within the hypervisor host kernel that connect this with the RNG hardware available on the server (if any).
drivers/char/hw_random/amd-rng.ko = H/W RNG driver for AMD chipsets
drivers/char/hw_random/intel-rng.ko = H/W RNG driver for Intel chipsets
drivers/char/hw_random/virtio-rng.ko = VirtIO Random Number Generator support
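If you want to check from inside a guest whether one of these drivers is actually active, the kernel's hw_random framework exposes its state under sysfs. The sketch below assumes the usual /sys/class/misc/hw_random location. The function name is hypothetical, and the driver names mentioned in the usage note are only examples of what a system might report.

```python
from pathlib import Path
from typing import Dict, List, Optional, Union


def hw_rng_status(
    sysfs_dir: str = "/sys/class/misc/hw_random",
) -> Dict[str, Union[Optional[str], List[str]]]:
    """Report the hardware RNG the kernel is using and which are available.

    Returns {"current": None, "available": []} when the hw_random
    framework is not present or no driver is loaded.
    """
    base = Path(sysfs_dir)

    def read_field(name: str) -> Optional[str]:
        try:
            return (base / name).read_text().strip()
        except OSError:
            return None

    current = read_field("rng_current")
    available = read_field("rng_available")
    return {
        "current": current or None,
        "available": available.split() if available else [],
    }
```

On a guest with the VirtIO driver loaded you would expect to see a virtio entry listed, while an empty list suggests the guest has no hardware RNG exposed to it at all.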
How it works is that the hypervisor host (dom0) runs rngd to read data from /dev/hwrandom (using the Intel or AMD modules mentioned above) and feed it into /dev/random. The guest VM (domU) then does the same thing: you run rngd in the guest VM to read from its virtual RNG device and feed that into the guest kernel. rngd can mix data from both /dev/random and /dev/urandom, so you get as much random data as you need in a non-blocking fashion. You can consult the kernel source code to learn more.
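The core of what a daemon like rngd does can be sketched in a few lines of Python. This is not rngd's code, only an illustration of the same idea: read bytes from the hardware device and hand them to the kernel with the RNDADDENTROPY ioctl from linux/random.h, which both mixes the data in and credits entropy. The function names are hypothetical, the call needs root, and crediting 8 bits per byte assumes you fully trust the source, which is a strong assumption.

```python
import fcntl
import struct

# _IOW('R', 0x03, int[2]) as defined in <linux/random.h>
RNDADDENTROPY = 0x40085203


def pack_rand_pool_info(data: bytes, entropy_bits: int) -> bytes:
    """Build a struct rand_pool_info: entropy_count, buf_size, then buf."""
    return struct.pack("ii", entropy_bits, len(data)) + data


def feed_once(source: str = "/dev/hwrandom",
              pool: str = "/dev/random",
              chunk_size: int = 64) -> int:
    """Move one chunk from the hardware RNG into the kernel pool.

    The source path follows the text above; many systems name the
    device /dev/hwrng instead. Returns the number of bytes fed.
    """
    with open(source, "rb", buffering=0) as hardware:
        data = hardware.read(chunk_size)
    with open(pool, "wb", buffering=0) as kernel_pool:
        # Assumption: every bit read is credited as one bit of entropy.
        fcntl.ioctl(kernel_pool, RNDADDENTROPY,
                    pack_rand_pool_info(data, len(data) * 8))
    return len(data)
```

A real daemon would call something like `feed_once()` in a loop, only when the kernel pool runs low, and would check the quality of the hardware data before crediting it.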
What happens if multiple guest VMs are reading this data at the same time using this arrangement? I’m not sure whether it’s possible to deplete the entropy pool of the hypervisor host and produce PRNG patterns that are therefore less random. If one guest VM emptied the entropy pool by aggressively reading from the /dev/hwrandom device, it might cause someone else’s guest VM to get less data. This could be solved by simply enforcing a rate limit on the amount of RNG data each guest VM is allowed to consume. There is further discussion of that as well.
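One common way to build such a limit is a token bucket per guest. The Python below is a sketch of that technique only, and says nothing about how any particular hypervisor implements it. The class name, the guest identifiers, and the numbers in the usage note are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EntropyRateLimiter:
    """Token bucket that caps how many RNG bytes each guest may consume."""

    def __init__(self, bytes_per_second: float, burst_bytes: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = bytes_per_second
        self.burst = burst_bytes
        self.clock = clock
        # guest_id -> (tokens left, time of last update)
        self.buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, guest_id: str, requested_bytes: int) -> int:
        """Return how many of the requested bytes the guest may read now."""
        now = self.clock()
        tokens, last = self.buckets.get(guest_id, (self.burst, now))
        # Refill for the time that has passed, up to the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        granted = int(min(tokens, requested_bytes))
        self.buckets[guest_id] = (tokens - granted, now)
        return granted
```

With `EntropyRateLimiter(bytes_per_second=10, burst_bytes=100)`, a greedy guest can take at most 100 bytes at once and 10 bytes per second after that, while a second guest keeps its own full bucket and is not starved.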
The truth is that for most needs you can have reasonably secure encryption by simply having an ordinary PRNG source like /dev/urandom that’s properly initialized with random data. I suggest that you use that approach in your cloud deployments.
Adrian,
I saw your article on setting up memcached on Cloud Sites using Cloud Servers:
http://www.rackspacecloud.com/blog/2009/07/30/setting-up-memcached-on-cloud-sites/
I have no experience with all the command line stuff, but I spent hours setting up my first Rackspace Cloud Server on Debian 4.0 and figuring out how to install memcached on it. I was able to successfully install memcached on my cloud server, and I uploaded your example.php test script using the hostname of my Cloud Server but I got a “Could not connect” message.
Is there no way to enable memcached directly on Cloud Sites? And why?
Shan,
Thanks for your comment. At the moment, Cloud Sites has client library support for memcached, but does not offer a place to run the memcached server. That’s what the article was about. You can run the server in Cloud Servers and connect to it from Cloud Sites. There are two key reasons why you can’t simply use memcached directly on Cloud Sites today:
1) The current stable release of memcached does not have any user authentication features. There are development versions that have this feature, so this problem will go away soon.
2) Cloud Sites is intended for interpreted code, not long running processes. In order to run memcached, you need to be allowed to run processes that don’t quit after your HTTP request finishes.
So for those of you who are looking for a super easy way to use memcached from Cloud Sites (no CLI tricks), you will have a way in the future when memcached is added as one of the included features of Cloud Sites as part of the base platform. That solution is still a way down the road because of a few other products that will launch before it, but keep an eye on The Rackspace Cloud Blog for announcements that will come out when that service is ready to be launched.
Cheers,
Adrian