Building Fast and Robust REST APIs using Conditional HTTP Requests

Conditional HTTP requests are one of HTTP's little-known but widely used features. An intelligent client can determine the status and content of a response without the body actually being transmitted over the wire. With conditional HTTP requests, you can reduce the work done by your server and save bandwidth by checking whether the data the client already holds is still valid.

In conditional HTTP requests, the result of an HTTP request can be changed by comparing the affected resources with the value of a validator.

REST uses HTTP extensively: HTTP verbs in combination with URLs have special implied meanings, and status codes signal errors. Conditional requests can be used to validate the content of a cache or to determine whether the document currently being edited is still valid.

Photo by Caspar Camille Rubin on Unsplash

Making a conditional HTTP request

Properly configured web servers and CDNs instruct browsers to cache most of the static content to avoid re-downloading big chunks of assets, improving page loading speed.

The easiest way to cache static content is to instruct the browser to cache it with the Expires HTTP header. It directs browsers, or an intermediate proxy (CDN edge), to treat the downloaded content as fresh until the given time and to reuse it when a subsequent identical request arrives.

This approach has some drawbacks, such as the difficulty of changing content once browsers have cached it. These can be mitigated with well-known cache-busting techniques, and the benefits outweigh the problems.

Last-Modified, ETags and 304 Not Modified

One way to compromise between unconditional caching and not caching at all is to validate cached content. By supplying a timestamp in the Last-Modified header, you instruct browsers to send an If-Modified-Since header carrying the timestamp of the cached response. If the server still has the same content, it responds with a 304 Not Modified and an empty body, and the browser proceeds to use the content from its cache. If the content has changed, the server sends a 200 OK response with the new body and a new Last-Modified value.
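The server side of this exchange can be sketched in a few lines of Python. This is a minimal illustration, not a full HTTP implementation: the function name `respond_if_modified` and the `(status, headers, body)` return shape are assumptions made for the example, and `last_modified` is assumed to be a timezone-aware UTC `datetime`.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_if_modified(last_modified, if_modified_since, body):
    """Return (status, headers, body) for a GET with an optional If-Modified-Since.

    Hypothetical helper: last_modified must be a timezone-aware UTC datetime.
    """
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # An unparseable date is treated as if the header were absent.
        if since is not None:
            if since.tzinfo is None:
                # Dates without a usable zone are assumed to be UTC in this sketch.
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers, b""  # Not Modified: the client reuses its copy.

    return 200, headers, body
```

A client that echoes back the `Last-Modified` value it received gets a 304 with an empty body until the resource changes.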

Sometimes you don’t want to (or cannot) use timestamps to track the state of a resource. Instead, you can put a state identifier such as a checksum (or any other opaque value) in a header called ETag, which the browser sends back in the If-None-Match header. If the value sent by the browser matches the ETag at the server, the server sends a 304 Not Modified status like before.
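As a rough sketch of ETag validation, the Python below derives the ETag from a SHA-256 checksum of the body and compares it against If-None-Match. The helper names `make_etag` and `respond_if_none_match` are hypothetical, and splitting the header on commas is a simplification that only holds because these ETags never contain a comma.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body: a quoted hex digest."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond_if_none_match(body, if_none_match):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # The cached copy is still current.
    return 200, headers, body
```

Hashing the body is only one option; a version counter or a database row revision works equally well as long as it changes whenever the representation changes.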

Conditional requests in APIs

Most API calls, however, are not cached by default. This is true for calls originating from the browser and server-to-server calls. Every time you repeat a GET request, the server fetches the resource from its data store and transfers the body over the network.

Using the date headers: Expires, Last-Modified and If-Modified-Since

A good example would be an API for new article recommendations. Your client polls the endpoint to check for new items. For every call, your API server fetches the content from the database, serializes it to JSON, and returns the result. The client then compares the response with the previous one to detect changed items.

The recommendations are generated daily by a batch job at midnight, which means that the article recommendations are valid until midnight. You can attach an Expires header to indicate that the content is fresh until then.
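Computing that header value might look like the following Python sketch. The function name `expires_at_next_midnight` is hypothetical, `now` is assumed to be a timezone-aware `datetime`, and `tz` is assumed to be the fixed-offset zone in which the batch job's "midnight" is defined.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_at_next_midnight(now, tz=timezone.utc):
    """Return an Expires header value for the next midnight in the given zone."""
    local_now = now.astimezone(tz)
    next_day = local_now.date() + timedelta(days=1)
    midnight = datetime(next_day.year, next_day.month, next_day.day, tzinfo=tz)
    # HTTP dates are always expressed in GMT, whatever zone the job runs in.
    return format_datetime(midnight.astimezone(timezone.utc), usegmt=True)


# Example: a request served at 15:30 UTC on 5 March 2024.
print(expires_at_next_midnight(datetime(2024, 3, 5, 15, 30, tzinfo=timezone.utc)))
# prints "Wed, 06 Mar 2024 00:00:00 GMT"
```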

In some cases, you may not be able to determine the expiry in advance, such as with the API that returns the content of an article. In this case, you can return a Last-Modified header with the last updated timestamp from your database. For subsequent requests for the same article, the client sends an If-Modified-Since header with the timestamp the server sent earlier. The server responds with a 304 Not Modified if the article has not changed, or with a 200 OK and the new content if it has.

Optimistic Locking with ETags/If-Match or If-Unmodified-Since

Let’s continue with our example. We have an update API which can be called simultaneously when multiple editors are working on the same article. Without any optimistic locking, we will have something called a lost-update problem.

  1. A and B started editing the same article at 9:00 am. Both see identical contents.
  2. A makes some changes and decides to save them at 9:15 am.
  3. B takes a little longer and saves at 9:30 am.

Without optimistic locking, the content of B will overwrite the changes of A, and the updates made by A will be lost.

To prevent this, you can use a version number as the ETag or serve the Last-Modified timestamp. When making the request to update the article, the client also sends an If-Match or If-Unmodified-Since header. The server then checks whether the article has been modified since the value specified in those headers. If there has been an update since 9:00 am, the server returns a 412 Precondition Failed; otherwise it applies the change. On a 412, the client can fetch the latest version and attempt to merge its changes into it.
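The scenario with editors A and B can be replayed with a small in-memory model. This Python sketch is illustrative only: `ArticleStore` is a hypothetical stand-in for the database, the version number doubles as the ETag, and rejecting updates that carry no If-Match at all with 428 Precondition Required is a design choice of the example, not something the scheme demands.

```python
class ArticleStore:
    """In-memory stand-in for the article table (hypothetical)."""

    def __init__(self, content):
        self.content = content
        self.version = 1

    @property
    def etag(self):
        # The version number, quoted, serves as the ETag.
        return f'"{self.version}"'

    def update(self, new_content, if_match):
        """Apply the update only if If-Match equals the current ETag.

        Returns (status, current_etag).
        """
        if if_match is None:
            return 428, self.etag  # Precondition Required: refuse blind writes.
        if if_match != "*" and if_match != self.etag:
            return 412, self.etag  # Precondition Failed: someone saved first.
        self.content = new_content
        self.version += 1
        return 200, self.etag


store = ArticleStore("draft")
seen_by_a = seen_by_b = store.etag         # 9:00 am: both editors load version 1
print(store.update("A's edit", seen_by_a))  # 9:15 am: (200, '"2"')
print(store.update("B's edit", seen_by_b))  # 9:30 am: (412, '"2"'), A's edit survives
```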

Implementing conditional requests

The client

If you are calling APIs from a browser, you do not have to set up anything special. Set the appropriate header, and the browser will do the heavy lifting for caching. If you need finer control, you can adjust the cache behavior using the Fetch API.

For server-side clients, it is a bit more complex. You need to configure a store, such as Redis, to hold each response and its associated metadata. Before sending an HTTP request, you need to fetch the stored validators for the given URL and add them to the request headers. Libraries such as OkHttp contain pluggable providers to easily integrate such a mechanism.
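The bookkeeping a server-side client needs can be sketched as below. This is a simplified illustration: `ConditionalCache` is a hypothetical class, a plain dict stands in for Redis, header names are assumed to arrive with exactly the casing shown, and the actual HTTP call is left to whatever client library you use.

```python
class ConditionalCache:
    """Keeps the last response body and its validators per URL."""

    def __init__(self):
        self._entries = {}  # A dict stands in for an external store such as Redis.

    def request_headers(self, url):
        """Build the conditional headers to attach to the next GET for this URL."""
        entry = self._entries.get(url)
        if entry is None:
            return {}
        headers = {}
        if entry["etag"]:
            headers["If-None-Match"] = entry["etag"]
        if entry["last_modified"]:
            headers["If-Modified-Since"] = entry["last_modified"]
        return headers

    def handle_response(self, url, status, headers, body):
        """Return the body to use, serving the stored copy on a 304."""
        if status == 304 and url in self._entries:
            return self._entries[url]["body"]
        if status == 200:
            self._entries[url] = {
                "etag": headers.get("ETag"),
                "last_modified": headers.get("Last-Modified"),
                "body": body,
            }
        return body
```

Call `request_headers(url)` before each request and pass the result of the request to `handle_response`; on a 304 the caller transparently receives the previously stored body.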

The server

The server must understand these headers and perform the appropriate action. For example, in the API described above, your server can return 304 Not Modified if the “last-modified” timestamp in the database is not newer than the timestamp sent in the If-Modified-Since header.

To prevent “lost updates”, the server can use a version number as the ETag and compare it with If-Match, or read the If-Unmodified-Since header and assert that it is not older than the last-modified value in your database. If the condition is violated, send a 412 Precondition Failed status. The client can then initiate an appropriate conflict-resolution action, such as displaying a diff and letting the user merge the differing versions of the article.
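For the timestamp variant, the precondition check might be sketched like this in Python. The function name `check_if_unmodified_since` is hypothetical, `last_modified` is assumed to be the timezone-aware UTC value read from the database, and ignoring an unparseable header is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def check_if_unmodified_since(header_value, last_modified):
    """Return 412 if the record changed after the client's timestamp, else None."""
    try:
        since = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return None  # Invalid date: the precondition is ignored.
    if since.tzinfo is None:
        # Dates without a usable zone are assumed to be UTC in this sketch.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates carry whole seconds only, so compare at that resolution.
    if last_modified.replace(microsecond=0) > since:
        return 412  # Precondition Failed: the article was saved in the meantime.
    return None


# The article was saved at 9:15 but the client last saw the 9:00 version.
saved_at = datetime(2024, 1, 1, 9, 15, tzinfo=timezone.utc)
print(check_if_unmodified_since("Mon, 01 Jan 2024 09:00:00 GMT", saved_at))  # prints 412
```

Because of the one-second resolution, two saves within the same second cannot be told apart this way, which is one reason a version-based ETag is often the safer validator for writes.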

Conclusion

Leveraging HTTP conditional requests is an often ignored aspect of REST APIs. An adequately designed API can improve performance by reducing unnecessary network traffic. By implementing proper locking semantics, you can ensure APIs perform safe operations. For a moderate-scale service, this can translate to thousands of dollars saved.

Computer Whisperer. Open-source contributor. Find me at https://amitosh.in/