So far, we have considered the scalability of web applications from an architectural point of view. In the next chapters, we will focus on scalability within web architectures, based on the use of concurrency inside web servers, application servers and backend storage systems.
However, there are other factors that influence the scalability and perceived performance of web applications. Therefore, we provide a brief overview of factors relevant to web site setup and client-side web application design. The overview summarizes important strategies outlined in relevant books [All10,Abb11,Abb09] and a blog article from the Yahoo Developer Network.
From a user's perspective, a web application appears scalable when it continues to provide the same service and the same quality of service independent of the number of concurrent users and the load. Ideally, users should not be able to draw any inferences about the actual service load from their experience interacting with the application. That is why consistently low-latency responses are important for the user experience. In practice, low round-trip latencies of single request/response cycles are essential. More complex web applications mitigate the impact of latency by avoiding full page reloads and by using dynamic behaviour, such as asynchronous loading of new content and partial updates of the user interface (cf. AJAX). Sound and reliable application functions and graceful degradation are also crucial for user acceptance.
First and foremost, it is very important to minimize the number of HTTP requests. Network round-trip times, possibly preceded by connection establishment penalties, can dramatically increase latencies and slow down the user experience. Thus, a web application should use no more requests than necessary. A popular approach is the use of CSS sprites or image maps. Both approaches load a single image file containing multiple smaller images. The client then segments the image, and the smaller tiles can be used and rendered independently. This technique makes it possible to deliver all button images and graphical user interface elements as part of a single combined image. Another way of reducing requests is the inlining of external content. For instance, the data URI scheme [Mas98] allows arbitrary content to be embedded as a base64-encoded URI string. In this way, smaller images (e.g. icons) can be inlined directly into an HTML document. Browsers often limit the number of parallel connections to a given host, as recommended in RFC 2616 [Fie99]. So when multiple resources have to be accessed, it can help to provide several domain names for the same server (e.g. static1.example.com, static2.example.com). Resources can then be addressed using different domain names, and clients can open more parallel connections when loading them.
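Two of the techniques above, inlining via data URIs and distributing resources over several host names, can be illustrated with a minimal sketch. The function names, the placeholder icon bytes and the checksum-based host selection are assumptions made for illustration only; they are not taken from the cited sources, and a real deployment would typically rely on its build tooling or templating layer for this.

```python
import base64
import zlib


def to_data_uri(payload: bytes, media_type: str) -> str:
    """Encode raw bytes as a base64 data URI (illustrative helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def shard_host(path: str, shards: int = 2,
               pattern: str = "static{}.example.com") -> str:
    """Map a resource path to one of several host names (illustrative helper)."""
    # A stable checksum always assigns the same path to the same host name,
    # so the URL of a resource does not change between page views.
    index = zlib.crc32(path.encode("utf-8")) % shards
    return pattern.format(index + 1)


# Hypothetical placeholder bytes standing in for the content of a small icon.
icon_bytes = b"GIF89a"

print(f'<img src="{to_data_uri(icon_bytes, "image/gif")}" alt="icon">')
print(f'<img src="http://{shard_host("/img/logo.png")}/img/logo.png" alt="logo">')
```

The deterministic mapping in the second helper is a deliberate choice: if the host name for a resource were picked at random on every page view, the same resource would appear under varying URLs and could no longer be served from the browser cache.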
On-the-fly compression of content can further reduce the transfer size with marginal CPU overhead. This is helpful for text-based formats and especially effective for formats with a verbose syntax (e.g. HTML, XML).
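The effect of compression on verbose markup can be observed with a small experiment. The following sketch uses gzip from the Python standard library; the generated HTML fragment and the helper name are hypothetical, and the ratio achieved for real pages depends on their content.

```python
import gzip


def compression_ratio(content: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (illustrative helper)."""
    compressed = gzip.compress(content, compresslevel=level)
    return len(compressed) / len(content)


# Hypothetical, deliberately repetitive HTML standing in for a real page.
items = "".join(f'<li class="item">Entry {i}</li>' for i in range(200))
html = f"<ul>{items}</ul>".encode("utf-8")

print(f"original size: {len(html)} bytes")
print(f"compressed size: {len(gzip.compress(html))} bytes")
print(f"ratio: {compression_ratio(html):.2f}")
```

In practice, such compression is usually enabled in the web server or a reverse proxy and negotiated with the client through the Accept-Encoding and Content-Encoding headers, rather than being implemented in application code.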