The World Wide Web runs on the Hypertext Transfer Protocol (HTTP) and TCP. That is the current arrangement, at least for now; more about that later. Since the early 1990s, web clients have connected to servers and retrieved information using some version of these two protocols. In this lesson, we will take a close look at HTTP.
There are three significant reasons why we should be familiar with HTTP. First, nearly everyone in the AV industry uses the web daily to receive information. Second, video is often delivered using HTTP. Third, and most important, both HTTP and TCP are undergoing significant modifications. TCP may be approaching end-of-life status, and replacing it could cause significant changes to HTTP and the web itself.
The word “text” appears in the protocol's name because the original version, 0.9, was designed to allow requests and responses to be made using text only. The request/response model was preserved in the first widely adopted version, 1.0, and in its successor, HTTP 1.1. Since 1.1 followed 1.0 by less than a year and is very similar, we will look at it closely. HTTP 1.1 remains the most widely used version.
Clients make requests using a method called “GET” that identifies the resource they need. The web server typically responds with the status “200 OK” and then sends the resource. If the resource is not available or is located somewhere else, the server may respond with the error code “404 Not Found” or with a redirect that indicates the location of the needed resource. One of the problems with version 1.1 is head-of-line (HOL) blocking. On a given connection, responses must be returned in the same order in which the requests were made. HOL blocking occurs when one slow response holds up every response queued behind it, even when those later resources are ready to send.
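To make the request/response exchange concrete, here is a minimal Python sketch of what a GET looks like on the wire and how a client might read the status line that comes back. The function names, the host www.example.com, and the canned responses are assumptions chosen for illustration; nothing here contacts a real server.

```python
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",         # the Host header is mandatory in HTTP/1.1
        "Connection: close",     # ask the server to close after one response
        "",                      # the two empty strings produce the final
        "",                      # CRLF CRLF that ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> Tuple[str, int, str]:
    """Split the first line of a response into version, code, and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    parts = status_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""  # the reason phrase may be absent
    return parts[0], int(parts[1]), reason


if __name__ == "__main__":
    print(build_get_request("www.example.com", "/index.html").decode("ascii"))
    # Canned responses, assumed for illustration (not captured from a server).
    print(parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))
    print(parse_status_line(b"HTTP/1.1 404 Not Found\r\n\r\n"))
```

The point of the sketch is simply that the whole exchange is readable text: a request line plus headers going out, and a status line such as “200 OK” or “404 Not Found” coming back ahead of the resource.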
Nearly a decade ago, Ilya Grigorik noted that a typical web page retrieval involved the client issuing 90 GETs to 15 or more different hosts. Obviously, variable latency could wreak havoc on the order in which the resources arrive and slow the building of the page. Often, each request is made over a separate TCP connection. This means a three-way handshake for each connection and a proper close after each resource is received. Many of these connections will also be preceded by a DNS lookup. All of this is quite inefficient. Attempts were made to correct these problems with techniques such as keep-alives (persistent connections) and pipelining. However, they have had mixed success.
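A rough back-of-the-envelope calculation shows why one connection per request is costly. The Python sketch below is a deliberately simplified model: it assumes requests are issued one after another, that the TCP handshake costs one round trip before data can flow, and that each request/response costs one more. DNS lookups, TLS, connection teardown, and transfer time are ignored, and the 50 ms round-trip time is an assumed figure.

```python
def total_round_trips(requests: int, persistent: bool) -> int:
    """Count round trips for sequential requests under a simplified model."""
    handshake = 1  # assumed: one round trip to set up a TCP connection
    exchange = 1   # assumed: one round trip per request/response pair
    if persistent:
        # One handshake, then every request reuses the same connection.
        return handshake + requests * exchange
    # A fresh connection, and therefore a fresh handshake, for every request.
    return requests * (handshake + exchange)


if __name__ == "__main__":
    rtt_ms = 50  # assumed round-trip time in milliseconds
    for persistent in (False, True):
        trips = total_round_trips(90, persistent)
        label = "persistent" if persistent else "separate"
        print(f"{label}: {trips} round trips, about {trips * rtt_ms} ms")
```

Under these assumptions, 90 requests take 180 round trips (about 9,000 ms) on separate connections versus 91 round trips (about 4,550 ms) on one persistent connection. Real browsers open several connections in parallel, so actual numbers differ, but the handshake overhead is the same in kind.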
HTTP/2 is gradually being deployed. It makes significant changes to the protocol, including:
- Data compression of HTTP headers
- HTTP/2 Server Push
- Multiplexing of requests and responses, which replaces HTTP/1.1 pipelining
- Multiple concurrent requests over a single TCP connection.
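The difference between in-order delivery and carrying several requests independently on one connection can be illustrated with a toy model in Python. It assumes that all requests are sent at time zero, that the server prepares them in parallel, and that transfer time is negligible; the function names and the sample timings are hypothetical.

```python
from itertools import accumulate
from typing import List


def in_order_delivery(ready_ms: List[int]) -> List[int]:
    """HTTP/1.1-style ordering: a response cannot be delivered before all
    earlier responses, so each waits for the slowest one ahead of it."""
    return list(accumulate(ready_ms, max))


def multiplexed_delivery(ready_ms: List[int]) -> List[int]:
    """Independent streams: each response is delivered as soon as it is ready."""
    return list(ready_ms)


if __name__ == "__main__":
    # Assumed times (ms) at which the server has each response ready.
    ready = [300, 20, 40]
    print(in_order_delivery(ready))     # [300, 300, 300]
    print(multiplexed_delivery(ready))  # [300, 20, 40]
```

In this model, the slow first response delays the two fast ones when delivery is strictly ordered, while independent streams let them through at 20 ms and 40 ms. The model covers only ordering at the HTTP level; packet loss within the underlying TCP connection can still stall every stream at once.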
Every major browser supports HTTP/2, but even with widespread support, the problem of TCP performance will remain. We have discussed this difficulty many times. Currently, it is extremely difficult to separate poor TCP performance from poor HTTP performance. However, a growing group of influential researchers is pushing for the adoption of QUIC (originally developed at Google; the name is no longer used as an acronym), which runs over UDP, as the replacement for TCP beneath HTTP. This will create a significantly new environment for internet users and developers.
Phil Hippenstel, EdD, is a regular columnist with AV Technology. He teaches information systems at Penn State Harrisburg.