HTTP/2

HTTP/2 is "...a replacement for how HTTP is expressed 'on the wire'" [http2.github.io]. It was invented to improve performance for the web, and is based on Google's experimental SPDY protocol (which it has since replaced).

Among its stated goals were the requirements that it use the same protocol semantics (request-response exchanges, headers, status codes, etc.) and traverse the same networks (gateways, proxies, etc.) as HTTP/1.x.

Problems with HTTP/1.x

Head-of-line blocking

Because of HTTP's strict request-response, request-response flow, a subsequent exchange cannot progress until the preceding one has completed. This is called "head-of-line (HoL) blocking", and it becomes a real problem whenever one response is slow to generate or large to transfer: everything queued behind it stalls.
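
The effect can be sketched with a toy model (the timings below are invented for illustration). Even with HTTP/1.1 pipelining, responses must be delivered in request order, so one slow response at the head of the queue delays every response behind it:

```python
# Toy model of head-of-line blocking: responses must complete in
# request order, so each one finishes no earlier than its predecessor.
def completion_times(service_ms):
    done, t = [], 0
    for s in service_ms:
        t += s          # this response can't go out until every
        done.append(t)  # earlier response has been fully sent
    return done

# One slow resource (3000 ms) at the head of the queue delays the
# two fast ones (100 ms each) queued behind it.
print(completion_times([3000, 100, 100]))  # [3000, 3100, 3200]
```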

TCP connection overhead

In the olden days, each request-response exchange happened on its own TCP connection (open(), write("GET ..."), read(...), close()). Unfortunately that open() is a fairly costly operation, especially if latency is involved, and even once it's completed you find yourself throttled by TCP congestion control [slow start]. (It's even worse if HTTPS/TLS is involved, because the initial handshakes and key exchanges there can be very slow.) For high-churn servers you also end up with a lot of TCP ports tied up in TIME_WAIT. This can be partially overcome with Keep-Alive, persistent connections, or sharding; however, each of those workarounds brings costs of its own.
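
As a back-of-the-envelope illustration (the round-trip counts below are simplified assumptions, not measurements), the per-connection cost compounds quickly when every exchange opens a fresh connection:

```python
# Rough cost model, in network round trips, for N sequential
# exchanges where each one opens its own TCP connection.
# (Illustrative numbers only; real costs also include slow start.)
HANDSHAKE_RTT = 1  # TCP SYN / SYN-ACK / ACK
EXCHANGE_RTT = 1   # request out, response back
TLS_RTT = 2        # classic TLS adds roughly two more round trips

def total_rtts(n_requests, tls=False):
    per_request = HANDSHAKE_RTT + EXCHANGE_RTT + (TLS_RTT if tls else 0)
    return n_requests * per_request

print(total_rtts(10))            # 20 round trips
print(total_rtts(10, tls=True))  # 40 round trips
```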

HTTP overhead

Sometimes (and not uncommonly) a request looks like this:

GET /devcon/http2-rfc7540/ HTTP/1.1
Host: xfiles.library.qut.edu.au
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36
DNT: 1
Referer: https://xfiles.library.qut.edu.au/devcon/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
Cookie: com.silverpop.iMAWebCookie=12345678-abcde-1234-abcd-123456781234; III_EXPT_FILE=aa1122;
  III_SESSION_ID=1aca388da6270e7153258a4ea06cf7be; SESSION_LANGUAGE=eng; ezproxy=xxxxxaaaaa55555;
  _saml_idp_for_int=aHR0cHM6Ly9lc29lLXRzdC5xdXQuZWR1LmF1; _saml_idp="aHR0cHM6Ly9lc29lLnF1dC5lZHUuYXU=";
  __utmz=255014920.1432528569.4.3.utmccn=(referral)|utmcsr=xxx.qut.edu.au|utmcct=/idp/profile/SAML2/POST/SSO|utmcmd=referral;
  spepSession=319954b846de6f70ad1bff65ea9b85d23f037d68-d5f57d6e97b139fbaf0952803ec36fea5e4000c1-1432598840;
  __utma=255014920.1032807906.1432269288.1432598569.1432602655.6; __utmc=255014920; _ga=GA1.3.1037755534.1431397557
If-None-Match: "98c66e-f0c-8e1315c0"
If-Modified-Since: Tue, 26 May 2015 04:28:15 GMT

Total: 1200 bytes (>1k)
Required: 72 bytes

This request is repeated, almost byte-for-byte, for every image, stylesheet, font, etc. in the page.
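
Using the byte counts from the example above, some quick arithmetic shows how the waste adds up across a page (the resource count of 50 is a made-up example):

```python
# ~1200 bytes sent per request, of which only ~72 were required.
SENT_PER_REQUEST = 1200
NEEDED_PER_REQUEST = 72
RESOURCES = 50  # hypothetical page with 50 images/styles/scripts

wasted = (SENT_PER_REQUEST - NEEDED_PER_REQUEST) * RESOURCES
print(wasted)  # 56400 bytes of near-identical header data re-sent
```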

Cache busting

The ideal cost for an HTTP request-response exchange is 0 bytes delivered in 0 seconds. Sounds crazy, but it can be achieved: with caching.

To get around the problems listed above it's common to see two practices: sharding resources across extra hostnames (to dodge per-host connection limits), and concatenating many small resources into one big one (spriting, bundling, inlining).

However, both practices fight the cache: change one icon in a sprite sheet and the whole sheet is invalidated and re-downloaded, and sharded, cache-busting URLs throw away otherwise-valid cached copies.

Solutions with HTTP/2

Multiplexed streams

HTTP/2 supports parallel, multiplexed request-response exchanges on a single connection. This means that HoL blocking at the HTTP level is eliminated, and resources can be requested and delivered as soon as they're known to be needed.
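
A much-simplified sketch of the idea (this ignores real frame headers, flow control, and prioritisation): frames from several streams are interleaved on one connection and reassembled by stream ID at the other end.

```python
# Frames from two streams (IDs 1 and 3) interleaved on one
# connection; the receiver demultiplexes them by stream ID.
frames_on_wire = [
    (1, b"<html>"), (3, b"body{"), (1, b"</html>"), (3, b"}"),
]

def demux(frames):
    streams = {}
    for stream_id, payload in frames:
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams

print(demux(frames_on_wire))
# {1: b'<html></html>', 3: b'body{}'}
```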

Persistent connections

HTTP/2 works over a single, long-lived TCP connection. This means the costs of open() and TCP slow start are amortised over the lifetime of the connection.

Binary format & header compression

HTTP/2 uses a binary packing format (where HTTP/1.x used Good Old ASCII™) and a header compression mechanism that reduces the number of bytes sent over the wire quite a bit. (It was particularly designed with stupid headers, like User-Agent and Cookie, in mind.)
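
The real compression mechanism is HPACK (RFC 7541); the toy sketch below shows only its central idea, that repeated headers are replaced by small indices into a table both ends maintain (the header values here are invented):

```python
# Toy sketch of HPACK's core idea: the first time a header is sent
# it goes as a literal and is added to a shared table; repeats are
# sent as tiny index references. (Real HPACK also has a static
# table and Huffman coding, both ignored here.)
class HeaderTable:
    def __init__(self):
        self.table = {}

    def encode(self, headers):
        out = []
        for header in headers:
            if header in self.table:
                out.append(("index", self.table[header]))  # a few bits
            else:
                self.table[header] = len(self.table) + 1
                out.append(("literal", header))  # the full bytes
        return out

enc = HeaderTable()
request = [("user-agent", "Mozilla/5.0 ..."), ("cookie", "_ga=...")]
first = enc.encode(request)   # all literals
second = enc.encode(request)  # all index references
print(second)  # [('index', 1), ('index', 2)]
```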

Caching works

Because these solutions eliminate the drivers for things like sharding and spriting, caches now work the way they were intended*.

* HTTPS/TLS/MitM/etc. notwithstanding – but that's a topic for another time.

Other benefits of HTTP/2

HTTP/2 also introduces some other goodies, including server push (the server can proactively send resources it knows the client will need), stream prioritisation, and per-stream flow control.

Side-effects of HTTP/2

As I've written elsewhere, all of these changes mean that while HTTP/2 was meant to be a drop-in replacement for HTTP/1.1's transport, realistically we have to rethink how our applications are structured and redesign them to take advantage of what HTTP/2 has to offer. And it's not really practical to offer the exact same service over both protocols (cf. Happy Eyeballs) unless the application is explicitly programmed that way.

Why to use HTTP/2

HTTP/2 was designed for web browsing. It's most useful for pages built from many small resources, and for high-latency links where round trips are expensive.

However, it is computationally more complex (and therefore slower to execute) than HTTP/1.x, although that's usually more than made up for in other efficiencies: being CPU-bound is almost always better than being IO-bound.

Finally, unless you control both the client and the server and can assert a level of surety over every intermediate network device, there's no practical way to use HTTP/2 over cleartext HTTP: all the major browser vendors decided to support HTTP/2 only over HTTPS†. This adds costs to running HTTP/2 (certificates, administrative overheads, additional computation, etc.). If you're already using HTTPS for everything, as we often are, that's not a problem; however, if you're running sites over cleartext HTTP the upgrade path can be quite costly.

† For reasons. The technical reason is complex:

  • If HTTP/2 is meant to be an upgrade for HTTP, then it should still work with the same URLs – and same URLs means same default ports. Which means for 99.999% of URLs on the open web we would have to carry HTTP/2 over ports :80 and :443.
  • Some proxies assume all data flowing through TCP port :80 (or any port labeled "HTTP" in their config) is HTTP/1.x, and those proxies can die in horrible and unpredictable (and sometimes undetectable or undiagnosable) ways if they instead get a stream of apparent binary guff. In most cases those devices have been convinced, over the course of the past two decades, to expect and allow a stream of binary guff on port :443 (or any port labeled "HTTPS") so "smuggling" HTTP/2 inside a :443 TLS stream has much more chance of success.

The non-technical reason is simpler: Google wants HTTPS everywhere.

Update [2017-03-14]

Regarding Problems with HTTP/1.x

There are also lower-level workarounds for things like TCP connection-establishment overhead, such as TCP Fast Open (and I think there's a 0-RTT TLS hack as well, but I don't know much about that). These workarounds don't help with slow start, though, or with any of the other issues listed above.

Regarding Why to use HTTP/2

Another point I forgot to bring up is that HTTP/2 over TLS (i.e. the de facto standard) requires the client and server to negotiate the HTTP version inside the TLS protocol. This requires the use of the ALPN TLS extension.
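
For example, in Python's standard ssl module a client offers h2 via ALPN like this (this sketch only configures the context; no connection is made):

```python
import ssl

# Offer HTTP/2 ("h2") with HTTP/1.1 as the fallback. During the
# TLS handshake the server picks one; after connecting, the chosen
# protocol is reported by sock.selected_alpn_protocol().
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])
print(ssl.HAS_ALPN)  # True wherever the underlying TLS library supports ALPN
```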

ALPN is supported by modern TLS libraries, and HTTP/2 is supported by modern servers, however official support is limited. For example, the versions of Apache httpd available from Red Hat don't have mod_h[ttp]2 compiled in, so you'd have to build your own httpd from source (or use an unsupported repository) to be able to use HTTP/2 with Apache.
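
For reference, on a build of httpd that does include the module, enabling HTTP/2 takes only a couple of directives (the module path varies by build):

```apache
# Load the HTTP/2 module (path varies by distribution/build)
LoadModule http2_module modules/mod_http2.so

# Advertise h2 (HTTP/2 over TLS) ahead of HTTP/1.1
Protocols h2 http/1.1
```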




Matthew Kerwin

License
CC BY-SA 4.0
Tags
development, web