Everything in one go
In the figure above, a client (mobile or desktop) communicates with a server to fetch a page. They exchange several messages, and doing so takes time, which is depicted along the vertical direction. A message starts at a given time, and its line slopes downward because the message needs some physical time to reach its destination. Requests are represented in orange, and responses from the server in blue.
The left panel in the figure shows how a web page is usually fetched: it starts with a .html file. As soon as the browser gets that first resource, at point "A" in the panel, it decides that it needs a new file, "styles.css", and asks the server for it. After the browser analyzes "styles.css", it decides at point "B" that it needs "font.ttf" and asks for it... Since the browser discovers incrementally which resources are needed to assemble the website, several round trips are needed to render the page.
Discovering resources in the browser therefore implies several round trips, and delays the moment when the page can be shown to the user. Web site developers have known this for a long time, and have avoided round trips at all costs. One way to avoid them is bundling: creating relatively large files that contain many smaller files. However, no matter how many tools have been devised to help, bundling remains a burden for web developers, because at development time we are simply more comfortable delimiting files by their function rather than by their size. More often than not, this results in many small files.
Another issue with bundling is that the browser's cache works on individual files. A large file bundling many smaller ones is much more "volatile", in the sense that it changes as soon as any of the smaller ones changes, and then the entire bundle has to be fetched again. In other words, bundling helps with performance for first-time visitors, who are admittedly a majority for most websites, but it is detrimental for returning users, who have to fetch large chunks of the website as soon as a very small piece of functionality changes.
Another possible scenario is shown in the right panel of the figure: as soon as the server notices that the client is requesting "index.html", it sends both this file and most or all of the other files that the client is going to need. This is what we call serving everything in one go, and the technical means to do so have existed since SPDY and are now standardized in HTTP/2, through a mechanism called HTTP/2 PUSH.
However, serving everything in one go brings a few challenges. For starters, conventional web servers are not wired to work in this model, and this is not just a matter of adding functionality, but also of suppressing some of their habitual behaviors that would become problematic in the presence of PUSH.
Then there are some unknowns to consider: for example, what to push and when, and how to take into account the user's previous visits to the site. After all, we don't want to push files that the user already has. Finally, the most important issue of all is the human factor: PUSH should be a way of simplifying developers' lives, not of burdening them with more work, and therefore it should be as automatic as possible.
ShimmerCat is a Web Applications Edge Server (WAES, pronounced as "ways") built from scratch to serve everything in one go. We call the part of the program doing that an Ahead of Time Transfer Engine (ATTE). Being a greenfield implementation, we don't have the many constraints that would come with trying to honor more conventional ways of doing things. For example, ShimmerCat is built around the assumption that it will talk to the client through a TLS connection, and therefore without HTTP proxies and intermediate caches in the middle (TCP proxies and TLS termination are OK, of course). It also assumes that all the virtual hosts it serves are functional and possibly related components of one unique application, not mutually distrusting domains that should be isolated from each other with brick, mortar and steel. This has been used as a guiding principle; in practice, right now all the domains running in a ShimmerCat instance are isolated from each other.
The issue of what to push and when is solved by regarding some resources as special, in the sense that they can start a page fetch and thus prompt the server to push more resources. Furthermore, those fetch roots, or apexes, are constrained to be simple static files, not pages rendered dynamically by a backend. More generally, and for the time being, ShimmerCat only pushes static resources, while using a normal request/response model for dynamically generated content or data. These restrictions are quite often acceptable for complex front-end applications.
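To make the apex mechanism concrete, here is a minimal sketch in Python of the lookup a server could do on each request. The mapping and names below are illustrative assumptions, not ShimmerCat's actual configuration format:

```python
# Hypothetical mapping from fetch apexes (simple static files) to the
# static resources that should be pushed alongside them.
FETCH_SETS = {
    "/index.html": ["/styles.css", "/font.ttf", "/app.js"],
    "/admin/index.html": ["/admin/admin.css"],
}

def resources_to_push(requested_path):
    """Only a request for a fetch apex triggers pushes; any other
    path (e.g. a dynamic API endpoint) follows the normal
    request/response model."""
    return FETCH_SETS.get(requested_path, [])
```

Dynamically generated content never appears as a key in this table, which is exactly the restriction described above.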
Building upon those assumptions, ShimmerCat comes with a laboratory that we call learning mode. In learning mode, the server experiments with HTTP/2 packet delivery times to build a priority model for the resources whose push is triggered by one of the apexes. We call those resources a fetch set. If you want to learn more, read the following subsections.
Some notes about the diagram are given below.
... should be active only in Devlove mode and in "learning mode" to infer preconnect hints.
These are the evolved version of the cookieless "static domains" that application developers are familiar with. They are not that static anymore, since ShimmerCat manages which resources are pushed to the client and when, but they are not dynamic either, since ShimmerCat doesn't alter file contents. Hence the name "electric". ShimmerCat's electric domains only support the GET and HEAD methods (the latter to be implemented soon).
Handles relatively ambiguous URLs that point to fetch apexes, redirecting URLs whose last component neither contains a dot nor ends in a slash to the slash form. We will make this more flexible in the future; right now, we just follow Google's recommendations. The redirect is a 302, and we call its Location header value the canonical form.
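The canonicalization rule just described is simple enough to sketch in a few lines of Python. The function name is our own; it returns the redirect target, or None when the path is already canonical:

```python
def canonicalize(path):
    """Return the slash-terminated canonical form of an apex URL path,
    or None if the path is already canonical: it either ends in a
    slash, or its last component contains a dot (a plain file)."""
    if path.endswith("/"):
        return None
    last_component = path.rsplit("/", 1)[-1]
    if "." in last_component:
        return None
    # Neither a dot nor a trailing slash: 302-redirect to the slash form.
    return path + "/"
```

A server would answer requests for `/blog` with a 302 whose Location header is `/blog/`, while `/styles.css` would be served directly.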
Once a fetch apex is taken to a canonical form, the resolver checks whether there is a corresponding file in the cache. If not, several template pages are tried, as shown in the figure. Right now those template pages are wired to the names shown in the figure, i.e. /.../index.html, but we will make this more flexible in the future. If a template page is found, two paths are possible, depending on whether a consultant is configured. A consultant is a backend HTTP/1.1 application that can say whether the resource at a particular URL exists. The server forwards the original request to the consultant, changing a GET to a HEAD. The consultant can respond with either a 200 or a 404, and the response should contain only headers since it is a HEAD request. When the consultant returns a 200, ShimmerCat sends the contents of the template file, possibly pushing the resources associated with that apex. This way, a single fetch set can be associated with an unlimited number of URLs. When the consultant returns a 404, ShimmerCat tries to find a 404 template page, and if successful it serves it, possibly pushing associated resources. Otherwise a generic 404 is returned.
When no consultant is configured, ShimmerCat behaves as if there were a consultant that approves all requests, as long as a template page is available up the file hierarchy.
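The resolution flow above can be sketched as follows. This is a simplification under stated assumptions: the template store is modeled as a dictionary, the consultant as a plain callable returning a status code (standing in for the real HEAD request to the backend), and the 404 template name `404.html` is hypothetical:

```python
def resolve(url_path, template_store, consultant=None):
    """Walk up the file hierarchy looking for an index.html template
    for the requested apex; consult the backend, if any, about whether
    the URL exists. Returns an (http_status, body) pair."""
    parts = url_path.rstrip("/").split("/")
    while parts:
        candidate = "/".join(parts) + "/index.html"
        if candidate in template_store:
            # No consultant configured: behave as if all requests
            # were approved.
            status = consultant(url_path) if consultant else 200
            if status == 200:
                # A real server would also push the fetch set here.
                return 200, template_store[candidate]
            # Consultant said 404: try a 404 template (name assumed),
            # else fall back to a generic 404.
            not_found_page = "/".join(parts) + "/404.html"
            return 404, template_store.get(not_found_page, "Not Found")
        parts.pop()
    return 404, "Not Found"
```

Note how one template, e.g. `/blog/index.html`, answers for any URL beneath it, which is what lets a single fetch set serve an unlimited number of URLs.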
ShimmerCat's electric domains use a Redis database. This database holds both file contents (gzip-compressed in most cases) and file metadata. It also contains the structure of the fetch sets. All the data in Redis is namespaced with a cache name, and there is only one cache per electric domain (so far). We simply call the data under a common cache name a cache. A cache contains versioned entries, so that clients are pushed only the resources that have changed since their last visit. It makes sense to store only the metadata of old cache versions, while actual file contents are stored only for the latest version of the cache. Selective PUSH will be implemented soon.
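As a rough illustration of how versioned, namespaced entries could support selective push (which, as noted, is not implemented yet), consider this sketch. The key layout and function names are our own inventions, not ShimmerCat's internal format:

```python
def cache_key(cache_name, version, path):
    """Illustrative Redis key scheme: everything namespaced under the
    cache name, with the version folded into the key."""
    return f"{cache_name}:v{version}:{path}"

def changed_since(metadata, client_version):
    """metadata maps each path to the cache version in which it last
    changed. Only paths newer than the client's last-seen version
    would need to be pushed on the next visit."""
    return [p for p, v in sorted(metadata.items()) if v > client_version]
```

Because old versions keep only metadata, a comparison like `changed_since` needs no old file contents: the latest version's contents are enough to serve whatever turns out to be stale.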
Note that although Redis is an in-memory database, it can store substantial amounts of data on disk. It also has nice cluster capabilities.
Fetch sets are sets of files that are pushed in response to a client's intent to fetch the apex of the set. We have represented them as trees; however, they are best represented as a set with a weak ordering, where files coming first should be received first by the browser.
We compute the weak ordering by analyzing the lab data acquired in learning mode.
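One simple way such a weak ordering could be derived from timing data is to rank files by their observed delivery times and group near-ties into tiers. This is a sketch of the idea only; the bucketing threshold and function name are arbitrary assumptions, not ShimmerCat's actual algorithm:

```python
from statistics import mean

def weak_order(timings):
    """timings maps each file path to a list of observed delivery
    times (ms) from learning-mode experiments. Returns a list of
    tiers: files in earlier tiers should reach the browser first;
    files within a tier are unordered relative to each other."""
    ranked = sorted(timings, key=lambda p: mean(timings[p]))
    tiers, last_bucket = [], None
    for path in ranked:
        # Group files whose mean times fall in the same 50 ms bucket
        # (an arbitrary illustrative threshold) into one tier.
        bucket = round(mean(timings[path]) / 50)
        if bucket != last_bucket:
            tiers.append([])
            last_bucket = bucket
        tiers[-1].append(path)
    return tiers
```

The tier structure is exactly a weak ordering: a total order between tiers, no order inside a tier.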
There can be multiple fetch sets per electric domain, of course.
We have an upcoming lab mode, implemented on top of our current developer-facing internal SOCKS5 proxy server, where we will automate deployment of the preconnect hint.