HTTP/2

HTTP/2 is "...a replacement for how HTTP is expressed 'on the wire'" [http2.github.io]. It was invented to improve performance for the web, and is based on Google's experimental SPDY protocol (which it has since replaced).

Among its stated goals were the requirements that it use the same protocol semantics (request-response exchanges, headers, status codes, etc.) and traverse the same networks (gateways, proxies, etc.) as HTTP/1.x.

Problems with HTTP/1.x

Head-of-line blocking

Because of HTTP's strict request-response, request-response flow, a subsequent exchange cannot progress until the preceding one has completed. This is called "head-of-line (HoL) blocking", and it becomes a real problem whenever one response is slow to generate or large to transfer: everything queued behind it stalls.
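
The effect can be sketched with a toy model (the timings below are invented for illustration). Even with HTTP/1.1 pipelining, responses must be delivered in request order, so one slow response at the head of the queue delays every response behind it:

```python
# Toy model of head-of-line blocking: responses must complete in
# request order, so each one finishes no earlier than its predecessor.
def completion_times(service_ms):
    done, t = [], 0
    for s in service_ms:
        t += s          # this response can't go out until every
        done.append(t)  # earlier response has been fully sent
    return done

# One slow resource (3000 ms) at the head of the queue delays the
# two fast ones (100 ms each) queued behind it.
print(completion_times([3000, 100, 100]))  # [3000, 3100, 3200]
```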

TCP connection overhead

In the olden days, each request-response exchange happened on its own TCP connection (open(), write("GET ..."), read(...), close()). Unfortunately that open() is a fairly costly operation, especially if latency is involved, and even once it's completed you find yourself throttled by TCP congestion control [slow start]. (It's even worse if HTTPS/TLS is involved, because the initial handshakes and key exchanges there can be very slow.) For high-churn servers you also end up with a lot of TCP ports tied up in TIME_WAIT. This can be partially overcome with Keep-Alive, persistent connections, or sharding; however, each of those workarounds brings costs of its own.
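
As a back-of-the-envelope illustration (the round-trip counts below are simplified assumptions, not measurements), the per-connection cost compounds quickly when every exchange opens a fresh connection:

```python
# Rough cost model, in network round trips, for N sequential
# exchanges where each one opens its own TCP connection.
# (Illustrative numbers only; real costs also include slow start.)
HANDSHAKE_RTT = 1  # TCP SYN / SYN-ACK / ACK
EXCHANGE_RTT = 1   # request out, response back
TLS_RTT = 2        # classic TLS adds roughly two more round trips

def total_rtts(n_requests, tls=False):
    per_request = HANDSHAKE_RTT + EXCHANGE_RTT + (TLS_RTT if tls else 0)
    return n_requests * per_request

print(total_rtts(10))            # 20 round trips
print(total_rtts(10, tls=True))  # 40 round trips
```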

HTTP overhead

Sometimes (and not uncommonly) a request looks like this:

GET /devcon/http2-rfc7540/ HTTP/1.1
Host: xfiles.library.qut.edu.au
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36
DNT: 1
Referer: https://xfiles.library.qut.edu.au/devcon/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: com.silverpop.iMAWebCookie=12345678-abcde-1234-abcd-123456781234; III_EXPT_FILE=aa1122;
  III_SESSION_ID=1aca388da6270e7153258a4ea06cf7be; SESSION_LANGUAGE=eng; ezproxy=xxxxxaaaaa55555;
  _saml_idp_for_int=aHR0cHM6Ly9lc29lLXRzdC5xdXQuZWR1LmF1; _saml_idp="aHR0cHM6Ly9lc29lLnF1dC5lZHUuYXU=";
  __utmz=255014920.1432528569.4.3.utmccn=(referral)|utmcsr=xxx.qut.edu.au|utmcct=/idp/profile/SAML2/POST/SSO|utmcmd=referral;
  spepSession=319954b846de6f70ad1bff65ea9b85d23f037d68-d5f57d6e97b139fbaf0952803ec36fea5e4000c1-1432598840;
  __utma=255014920.1032807906.1432269288.1432598569.1432602655.6; __utmc=255014920; _ga=GA1.3.1037755534.1431397557
If-None-Match: "98c66e-f0c-8e1315c0"
If-Modified-Since: Tue, 26 May 2015 04:28:15 GMT

Total: 1200 bytes (>1k)
Required: 72 bytes

This request is repeated, almost byte-for-byte, for every image, stylesheet, font, etc. in the page.
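
Using the byte counts from the example above, some quick arithmetic shows how the waste adds up across a page (the resource count of 50 is a made-up example):

```python
# ~1200 bytes sent per request, of which only ~72 were required.
SENT_PER_REQUEST = 1200
NEEDED_PER_REQUEST = 72
RESOURCES = 50  # hypothetical page with 50 images/styles/scripts

wasted = (SENT_PER_REQUEST - NEEDED_PER_REQUEST) * RESOURCES
print(wasted)  # 56400 bytes of near-identical header data re-sent
```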

Cache busting

The ideal cost for an HTTP request-response exchange is 0 bytes delivered in 0 seconds. Sounds crazy, but it can be achieved: with caching.

To get around the problems listed above it's common to see two practices: sharding resources across extra hostnames (to dodge per-host connection limits), and concatenating many small resources into one big one (spriting, bundling, inlining).

However, both practices fight the cache: change one icon in a sprite sheet and the whole sheet is invalidated and re-downloaded, and sharded, cache-busting URLs throw away otherwise-valid cached copies.

Solutions with HTTP/2

Multiplexed streams

HTTP/2 supports parallel, multiplexed request-response exchanges on a single connection. This means that HoL blocking at the HTTP level is eliminated, and resources can be requested and delivered as soon as they're known to be needed.
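
A much-simplified sketch of the idea (this ignores real frame headers, flow control, and prioritisation): frames from several streams are interleaved on one connection and reassembled by stream ID at the other end.

```python
# Frames from two streams (IDs 1 and 3) interleaved on one
# connection; the receiver demultiplexes them by stream ID.
frames_on_wire = [
    (1, b"<html>"), (3, b"body{"), (1, b"</html>"), (3, b"}"),
]

def demux(frames):
    streams = {}
    for stream_id, payload in frames:
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams

print(demux(frames_on_wire))
# {1: b'<html></html>', 3: b'body{}'}
```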

Persistent connections

HTTP/2 works over a single, long-lived TCP connection. This means the costs of open() and TCP slow start are amortised over the lifetime of the connection.

Binary format & header compression

HTTP/2 uses a binary packing format (where HTTP/1.x used Good Old ASCII™) and a header compression mechanism that reduces the number of bytes sent over the wire quite a bit. (It was particularly designed with stupid headers, like User-Agent and Cookie, in mind.)
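
The real compression mechanism is HPACK (RFC 7541); the toy sketch below shows only its central idea, that repeated headers are replaced by small indices into a table both ends maintain (the header values here are invented):

```python
# Toy sketch of HPACK's core idea: the first time a header is sent
# it goes as a literal and is added to a shared table; repeats are
# sent as tiny index references. (Real HPACK also has a static
# table and Huffman coding, both ignored here.)
class HeaderTable:
    def __init__(self):
        self.table = {}

    def encode(self, headers):
        out = []
        for header in headers:
            if header in self.table:
                out.append(("index", self.table[header]))  # a few bits
            else:
                self.table[header] = len(self.table) + 1
                out.append(("literal", header))  # the full bytes
        return out

enc = HeaderTable()
request = [("user-agent", "Mozilla/5.0 ..."), ("cookie", "_ga=...")]
first = enc.encode(request)   # all literals
second = enc.encode(request)  # all index references
print(second)  # [('index', 1), ('index', 2)]
```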

Caching works

Because these solutions eliminate the drivers for things like sharding and spriting, caches now work the way they were intended*.

* HTTPS/TLS/MitM/etc. notwithstanding – but that's a topic for another time.

Other benefits of HTTP/2

HTTP/2 also introduces some other goodies, including server push (the server can proactively send resources it knows the client will need), stream prioritisation, and per-stream flow control.

Side-effects of HTTP/2

As I've written elsewhere, all of these changes mean that while HTTP/2 was meant to be a drop-in replacement for HTTP/1.1's transport, realistically we have to rethink how our applications are structured and redesign them to take advantage of what HTTP/2 has to offer. And it's not really practical to offer the exact same service over both protocols (cf. Happy Eyeballs) unless the application is explicitly programmed that way.

Why to use HTTP/2

HTTP/2 was designed for web browsing. It's most useful for pages built from many small resources, and for high-latency links where round trips are expensive.

However, it is computationally more complex (and therefore slower to execute) than HTTP/1.x, although that's usually more than made up for in other efficiencies: being CPU-bound is almost always better than being IO-bound.

Finally, unless you control both the client and the server and can assert a level of surety over every intermediate network device, there's no practical way to use HTTP/2 over cleartext HTTP: all the major browser vendors decided to support HTTP/2 only over HTTPS†. This adds costs to running HTTP/2 (certificates, administrative overheads, additional computation, etc.). If you're already using HTTPS for everything, as we often are, that's not a problem; however, if you're running sites over cleartext HTTP the upgrade path can be quite costly.

† For reasons. The technical reason is complex:

  • If HTTP/2 is meant to be an upgrade for HTTP, then it should still work with the same URLs – and same URLs means same default ports. Which means for 99.999% of URLs on the open web we would have to carry HTTP/2 over ports :80 and :443.
  • Some proxies assume all data flowing through TCP port :80 (or any port labeled "HTTP" in their config) is HTTP/1.x, and those proxies can die in horrible and unpredictable (and sometimes undetectable or undiagnosable) ways if they instead get a stream of apparent binary guff. In most cases those devices have been convinced, over the course of the past two decades, to expect and allow a stream of binary guff on port :443 (or any port labeled "HTTPS") so "smuggling" HTTP/2 inside a :443 TLS stream has much more chance of success.

The non-technical reason is simpler: Google wants HTTPS everywhere.

Update [2017-03-14]

Regarding Problems with HTTP/1.x

There are also lower-level workarounds for things like TCP connection-establishment overhead, such as TCP Fast Open (and I think there's a 0-RTT TLS hack as well, but I don't know much about that). These workarounds don't help with slow start, though, or with any of the other issues listed above.

Regarding Why to use HTTP/2

Another point I forgot to bring up is that HTTP/2 over TLS (i.e. the de facto standard) requires the client and server to negotiate the HTTP version inside the TLS protocol. This requires the use of the ALPN TLS extension.
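
For example, in Python's standard ssl module a client offers h2 via ALPN like this (this sketch only configures the context; no connection is made):

```python
import ssl

# Offer HTTP/2 ("h2") with HTTP/1.1 as the fallback. During the
# TLS handshake the server picks one; after connecting, the chosen
# protocol is reported by sock.selected_alpn_protocol().
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])
print(ssl.HAS_ALPN)  # True wherever the underlying TLS library supports ALPN
```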

ALPN is supported by modern TLS libraries, and HTTP/2 is supported by modern servers, however official support is limited. For example, the versions of Apache httpd available from Red Hat don't have mod_h[ttp]2 compiled in, so you'd have to build your own httpd from source (or use an unsupported repository) to be able to use HTTP/2 with Apache.
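
For reference, on a build of httpd that does include the module, enabling HTTP/2 takes only a couple of directives (the module path varies by build):

```apache
# Load the HTTP/2 module (path varies by distribution/build)
LoadModule http2_module modules/mod_http2.so

# Advertise h2 (HTTP/2 over TLS) ahead of HTTP/1.1
Protocols h2 http/1.1
```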




Matthew Kerwin

License
CC BY-SA 4.0
Tags
development, web