Latest Publications

Bandwidth != Network Performance

You might think that if you want faster internet performance, you can simply get a connection to the internet that has higher bandwidth. When you get a “faster” internet connection you may observe faster downloads. But it is usually the reduced latency, not the additional bandwidth, that actually improves interactive web performance. This post explains why.

First of all, let’s review some definitions:

  • Bandwidth: The amount of data that can be passed along a communications channel in a given period of time.
  • Latency: The time it takes for a packet to cross a network connection, from sender to receiver.
  • Speed: Fast and rapid moving, going, traveling, proceeding, or performing; swiftness.
  • Throughput: The quantity of data transmitted by a computer network over a given period of time.

Now, all of these terms are related, and I want to highlight some of the minutiae here:

Bandwidth

The higher the bandwidth is on a network connection, the more data it’s capable of transmitting in a given period of time. Higher bandwidth is better.

Latency

This is very important, because latency effectively limits the amount of bandwidth you can consume if you are using a synchronous data transmission, like a TCP/IP download, where the sender has to wait for acknowledgments before it can send more data. Lower latency is better, and will yield faster speed.
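To make that limit concrete, here is a minimal sketch. It assumes the sender can have at most one window of unacknowledged data in flight per round trip, so throughput can never exceed the window size divided by the round-trip time. The function name `max_tcp_throughput` and the 65,535-byte window (the largest TCP window without window scaling) are assumptions chosen for illustration, not measurements.

```python
def max_tcp_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput in bytes/sec when at most one
    window of data can be in flight per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Assumption: 65,535 bytes, the largest TCP window without window scaling.
WINDOW_BYTES = 65535

for rtt_ms in (10, 72, 200):
    bytes_per_sec = max_tcp_throughput(WINDOW_BYTES, rtt_ms / 1000)
    megabits_per_sec = bytes_per_sec * 8 / 1e6
    print(f"RTT {rtt_ms:>3} ms -> at most {megabits_per_sec:.2f} Mb/s")
```

The ceiling falls as the round-trip time grows, no matter how much bandwidth the link has.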

Throughput

Throughput is another way of expressing speed. The higher the throughput, the faster your network communications will be. Note that your maximum possible throughput is your bandwidth. Actual throughput is equal to or less than your bandwidth.

Speed

If your network is high speed, you should observe high bandwidth, low latency, and high throughput.

Latency and Throughput are Inversely Proportional

For TCP/IP transmissions, the higher your latency is, the lower your throughput will be. Let’s explore why. The most common use of TCP/IP is for the web, which uses the HTTP protocol. HTTP works by making a TCP/IP connection to a remote server, issuing a request for a document, and then receiving the response. The protocol is text based. A simple HTTP transmission is illustrated below.

Client Request:

GET / HTTP/1.1
User-Agent: Wget
Host: www.example.com

Server Response:

HTTP/1.1 200 OK
Server: Apache/2.2.3 (Red Hat)
Last-Modified: Tue, 15 Nov 2005 13:24:10 GMT
ETag: "b300b4-1b6-4059a80bfd280"
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Connection: Keep-Alive
Date: Wed, 18 Nov 2009 22:36:34 GMT
Age: 1010
Content-Length: 438

  Example Web Page

You have reached this web page by typing "example.com",
"example.net",
  or "example.org" into your web browser.

These domain names are reserved for use in documentation and are not available
  for registration. See <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC
  2606</a>, Section 3.
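The same kind of exchange can be sketched with a raw TCP socket. This is only an illustration: `build_request`, `fetch`, and `status_line` are hypothetical helper names, and the `Connection: close` header is my addition (it is not in the request shown above) so that the server closes the connection and the read loop ends.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a plain-text HTTP/1.1 GET request like the one shown above."""
    lines = [
        f"GET {path} HTTP/1.1",
        "User-Agent: Wget",
        f"Host: {host}",
        "Connection: close",  # assumption: ask the server to close when done
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def status_line(raw_response: bytes) -> str:
    """Return the first line of a raw HTTP response, e.g. 'HTTP/1.1 200 OK'."""
    return raw_response.split(b"\r\n", 1)[0].decode("ascii", "replace")


# Usage (needs network access):
# print(status_line(fetch("www.example.com")))
```

Every step in `fetch` (connect, send, receive, close) costs at least one trip across the network, which is where latency comes in.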

Here is a trace of the TCP/IP packets that make up that request:

14:57:47.146665 IP 192.168.144.2.39556 > 192.0.32.10.80: S 3717672264:3717672264(0) win 5840
14:57:47.220092 IP 192.168.144.2.39556 > 192.0.32.10.80: . ack 1 win 183
14:57:47.220309 IP 192.168.144.2.39556 > 192.0.32.10.80: P 1:123(122) ack 1 win 183  (GET Request)
14:57:47.300962 IP 192.0.32.10.80 > 192.168.144.2.39556: P 1:728(727) ack 123 win 4502  (200 OK Response)
14:57:47.300993 IP 192.168.144.2.39556 > 192.0.32.10.80: . ack 728 win 228
14:57:47.302035 IP 192.168.144.2.39556 > 192.0.32.10.80: F 123:123(0) ack 728 win 228
14:57:47.375475 IP 192.0.32.10.80 > 192.168.144.2.39556: . ack 124 win 4502
14:57:47.375499 IP 192.0.32.10.80 > 192.168.144.2.39556: F 728:728(0) ack 124 win 4502
14:57:47.375510 IP 192.168.144.2.39556 > 192.0.32.10.80: . ack 729 win 228

Notice that there are 10 packets in this exchange (the listing above shows nine; the server’s SYN/ACK reply to the first packet is missing from it). It’s a three-way handshake to set up the TCP session, then a round trip to send the data, then two more round trips to close down the connection. Each time the server receives a packet from the client, the connection may wait in the server’s connection queue to be processed, which can further increase the interactive protocol latency. Consider the impact of high latency on a connection like this. Suppose that it takes 0.2 seconds for each round trip. With four round trips, that connection would download 727 bytes in 0.8 seconds. That’s a rate of about 909 bytes/sec. Even if your internet connection is 15 Mb/sec, the bandwidth did not matter. Latency caused the throughput to be low.
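The arithmetic above can be checked with a few lines of Python. This is a small sketch using the numbers from the paragraph; the helper name `effective_throughput` is hypothetical, and the count of four round trips follows the breakdown given there (one for the handshake, one for the request and response, two for the teardown).

```python
def effective_throughput(payload_bytes: int, round_trips: int, rtt_seconds: float) -> float:
    """Bytes per second actually achieved when the transfer time is
    dominated by round trips rather than by link bandwidth."""
    return payload_bytes / (round_trips * rtt_seconds)


PAYLOAD_BYTES = 727        # size of the 200 OK response in the trace
ROUND_TRIPS = 4            # handshake + request/response + two for teardown
RTT_SECONDS = 0.2          # the hypothetical round-trip time from the text
LINK_BITS_PER_SEC = 15_000_000  # the 15 Mb/sec connection from the text

rate = effective_throughput(PAYLOAD_BYTES, ROUND_TRIPS, RTT_SECONDS)
link_bytes_per_sec = LINK_BITS_PER_SEC / 8

print(f"Effective rate: {rate:.0f} bytes/sec")
print(f"Share of link capacity used: {rate / link_bytes_per_sec:.4%}")
```

The link could carry well over a million bytes per second, yet this exchange uses only a tiny fraction of that.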

Now, you might be wondering why we can’t just improve networking technology to make latency lower. We can, but that’s not going to help much, because we are still bounded by the speed of light, among other factors. The speed of light is slow when you consider the distance it has to travel to cross continents on the earth. Let’s look at some math to explain that:

  • The speed of light in vacuum is 299,792,458 m/s.
  • The speed of light in fiber optic cable is ~200,000,000 m/s.
  • The distance from Anaheim, CA to New York is 4,494,898 meters.
  • The one-way latency to New York is 4,494,898 / 200,000,000 = 22.47 ms.
  • The round-trip time between Anaheim, CA and New York is 44.95 ms.
  • The current ping time from Anaheim, CA to New York is 72 ms.
  • Tracing the route to sl-gw33-nyc.sprintlink.net (144.228.243.82)
      1 sl-crs1-ana-0-14-2-0.sprintlink.net (144.232.11.9) 0 msec
        sl-crs2-ana-0-14-2-0.sprintlink.net (144.232.11.11) 0 msec
        sl-crs1-ana-0-14-2-0.sprintlink.net (144.232.11.9) 4 msec
      2 sl-crs2-fw-0-13-3-0.sprintlink.net (144.232.19.197) 28 msec
        sl-crs2-fw-0-9-5-0.sprintlink.net (144.232.20.130) 28 msec
        sl-crs1-fw-0-3-3-0.sprintlink.net (144.232.9.65) 28 msec
      3 sl-crs2-kc-0-0-0-2.sprintlink.net (144.232.19.141) 40 msec
        144.232.20.57 40 msec
        sl-crs1-kc-0-5-5-0.sprintlink.net (144.232.24.9) 40 msec
      4 sl-crs2-chi-0-13-5-0.sprintlink.net (144.232.20.109) 52 msec
        sl-crs1-chi-0-1-0-3.sprintlink.net (144.232.18.214) 56 msec
        sl-crs2-chi-0-15-2-0.sprintlink.net (144.232.24.206) 52 msec
      5 sl-crs1-nyc-0-8-0-3.sprintlink.net (144.232.18.123) 72 msec
        sl-crs2-nyc-0-8-0-1.sprintlink.net (144.232.20.119) 72 msec
        sl-crs1-chi-0-10-3-0.sprintlink.net (144.232.9.148) 72 msec
      6 sl-gw33-nyc-14-0-0.sprintlink.net (144.232.6.56) 72 msec *
        sl-gw33-nyc-15-0-0.sprintlink.net (144.232.6.58) 72 msec
    

The measured 72 ms round-trip time includes all of the switching and routing needed to get the packet through its full round trip. That means that even if all switching and routing were instantaneous, and we had a perfectly straight fiber path between all points on the earth, we could only reduce latency by about 40%. We cannot accelerate the speed of light, so without a significant advance in data transmission technology (perhaps a quantum physics approach) we must accept the speed of light as a performance boundary.
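Here is the same calculation as a short Python sketch, using only the figures quoted above. The helper name `propagation_rtt_ms` is hypothetical, and the fiber speed is the approximate value from the list.

```python
def propagation_rtt_ms(distance_m: float, speed_m_per_s: float) -> float:
    """Round-trip time in milliseconds due to propagation delay alone."""
    one_way_seconds = distance_m / speed_m_per_s
    return 2 * one_way_seconds * 1000


SPEED_IN_FIBER = 200_000_000   # m/s, approximate
DISTANCE_M = 4_494_898         # Anaheim, CA to New York
MEASURED_RTT_MS = 72           # ping time quoted above

rtt_floor_ms = propagation_rtt_ms(DISTANCE_M, SPEED_IN_FIBER)
best_case_reduction = 1 - rtt_floor_ms / MEASURED_RTT_MS

print(f"Propagation-only round trip: {rtt_floor_ms:.2f} ms")
print(f"Best possible latency reduction: {best_case_reduction:.1%}")
```

The reduction works out to a little under 40%, which is where the figure in the paragraph above comes from.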

Making Web Sites Faster

If you’re a web content publisher, you can set up your systems to work around these natural limitations. One way to make interactive web performance faster is to place copies of your data in various geographic locations that are physically closer to your end users. Using a CDN for your media content is one way to do this. You can also make your web server as fast as possible so that your dynamically generated content can be processed as quickly as possible. Using memcached to speed up your web application can help. Also, take a look at some best practices for web developers for good performance.

Put WiFi on your cell phone’s SIM Card!

Have you ever wanted to surf the web from your laptop using the internet connection on your cell phone without connecting any wires, and without the hassle of goofing around with software? Well, guess what: happiness is close at hand!

Today Sagem Orga issued a press release that raised my eyebrows. They have a new SIM card (the identification chip in your GSM cell phone) that has WiFi capability right on the chip. This is exciting, because it would enable otherwise ordinary cell phones to be used as WiFi internet gateways, running both WiFi and 3G data connections at the same time.

This is something that most phones simply cannot do. The ones that can do it require a software program running on the phone to turn it into a router that relays traffic from WiFi clients to the internet through a 3G data connection over the cell phone network. Getting this on a Blackberry, for example, is a huge nuisance, if your service provider supports it at all.

Well, that nuisance may be a thing of the past once the new “SIMFi” technology hits the market. Imagine just plugging the snazzy new card into your phone, joining its WiFi network from your laptop, and accessing the internet from practically anywhere. How cool is that!?!

There has been a discussion on Slashdot about this. One of the interesting comments was about the need for a 2.4 GHz antenna, which can actually fit fine on the SIM card itself, as long as it’s bent around a bit. An obvious question with any WiFi product is “what’s the implication for battery life?” It will definitely be shorter. Hopefully this device will have adjustable transmit power for the WiFi signal so power consumption can be kept to a minimum. After all, your laptop and your cell phone will only be an arm’s length apart when you are using this setup anyway, so range is not a major concern.

Yes, I do love technical gadgets. The thought of where this could go is very exciting. I’ll be the first on the waiting list for this!

CPU Time stolen from a virtual machine?

Those of you studying the vmstat(8) man page may be wondering what the ‘st’ figure is in the CPU column. The manual refers to it as “Time stolen from a virtual machine”. More specifically:

It’s the time the hypervisor scheduled something else to run instead of something within your VM. This might be time for another VM, or for the Hypervisor host itself. If no time were stolen, this time would be used to run your CPU workload or your idle thread.

There is some disagreement circulating about whether the Hypervisor will steal idle time, or only preempted time. In other words, it has been suggested that stolen time counts only the time when your local kernel scheduler within the VM wanted to run something but the Hypervisor made that impossible. I have found that stolen time does in fact count borrowed idle time, where the local scheduler actually had nothing to run. For example, here are some vmstat values from a VM that’s got a very low CPU workload on it:

vmstat -S M 1 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0    121     42     53    460    0    0     0     1    0    1  0  0 89  0 10
 0  0    121     42     53    460    0    0     0    28 1014   39  0  0 90  0 10
 0  0    121     42     53    460    0    0     0     0 1016   36  0  0 91  0  9
 0  0    121     42     53    460    0    0     0     0 1024   32  0  0 93  0  7
 0  0    121     42     53    460    0    0     0     0 1019   40  0  0 91  0  9
 0  0    121     42     53    460    0    0     0     0 1015   32  0  0 90  0 10
 0  0    121     42     53    460    0    0     0     0 1022   34  0  0 92  0  8
 0  0    121     42     53    460    0    0     0     0 1016   36  0  0 91  0  9
 0  0    121     42     53    460    0    0     0     0 1013   34  0  0 92  0  8
 0  0    121     42     53    460    0    0     0     0 1028   43  0  0 93  0  7

As you can see, user time (us), system time (sy), and iowait time (wa) are zero, but idle time (id) is not 100%. This would normally indicate that your system is doing something, but in this case the real idle time is the sum of the id and st columns, which comes to roughly 100% on every line.
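If you want to check that sum yourself, here is a minimal sketch that parses vmstat output pasted in as text and adds the id and st columns. `parse_vmstat` is a hypothetical helper, the sample is the column header plus the first three rows from the output above, and the parser assumes the column layout shown there.

```python
SAMPLE = """\
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0    121     42     53    460    0    0     0     1    0    1  0  0 89  0 10
 0  0    121     42     53    460    0    0     0    28 1014   39  0  0 90  0 10
 0  0    121     42     53    460    0    0     0     0 1016   36  0  0 91  0  9
"""


def parse_vmstat(text: str) -> list[dict[str, int]]:
    """Turn vmstat output into one dict per sample row, keyed by column name."""
    rows = [line.split() for line in text.strip().splitlines()]
    # The column-name row is the one that starts with "r" (run queue).
    header_index = next(i for i, fields in enumerate(rows) if fields and fields[0] == "r")
    names = rows[header_index]
    samples = []
    for fields in rows[header_index + 1:]:
        if len(fields) == len(names):
            samples.append(dict(zip(names, (int(value) for value in fields))))
    return samples


samples = parse_vmstat(SAMPLE)
for sample in samples:
    idle_plus_stolen = sample["id"] + sample["st"]
    print(f"id={sample['id']:>3} st={sample['st']:>3} id+st={idle_plus_stolen}")
```

Because vmstat rounds each column to a whole percentage, the sum can come out a point short of 100 on some lines.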

In this example, I really don’t care that I have a nonzero st column because my workload is basically idle all the time anyway.

If you are on a cloud host where you purchase a small sliver of a server, you should expect to see nonzero values in this column when you run vmstat. If you have a heavy CPU load and need more processing power, you can solve this problem by upgrading to a larger VM server size so that you command a larger portion of the physical host.