AppHarbor: Horizontal vs. Vertical Scaling

Continuing my experiments with AppHarbor: this time it’s all about scaling.

As a hosting company AppHarbor do a splendid job of simplifying the process of server configuration, with a minimalist approach that appealed to me instantly when I first saw it. But I’m assuming I’m not the only one who’s a little perplexed by the server scaling options with which one can tinker. Having used AppHarbor for a number of years for prototypes and modest SaaS projects, I openly admit that I’ve never had to delve too deeply into the complexities of worker process scaling; the most I’ve done up to now is add an extra worker or two, having made safe assumptions about the definition of “horizontal scaling” and its effects. “Vertical scaling”, however, remained an enigma to me. The question is: why and when would it be useful?

As 2015 is turning out to be a big year for my SaaS rollouts it was time I got myself schooled. A quick email to AppHarbor’s support guys garnered the following explanation:

You’d usually scale out to multiple workers (i.e. instances of your app running on multiple servers) to increase your throughput, parallel processing and/or to increase availability of your application. Vertical scaling is usually beneficial when you need more worker resources per worker – for instance, you might want to process a CPU-intensive workload faster, and 2-4x the CPU resources would help you do that. Vertical scaling is described in more detail in this blog post which also provides common use cases for choosing this over (or in combination with) horizontal scaling.

Having read this explanation and the recommended blog post, and using my own SaaS projects as an example, I have surmised the following:

  • TuitionKit has a mundane computational complexity (mostly regular CRUD operations) but has a large user volume. An ongoing increase in user sign-ups will require more worker processes (horizontal scaling) to cope with the increase in page requests. As the system doesn’t do anything more complex than serving pages and querying databases, horizontal scaling should cover most growth scenarios.
  • RopeWeaver involves serious number crunching for each company customer a set number of times per day. Because these computationally heavy calculations are periodic, they are done on a background worker process, removing the burden from the main web server thread, which does make it easier to manage the scaling. Ultimately most of the work is to be migrated to the database tier and done in T-SQL stored procedures, which will remove the burden from the web server entirely. Careful management of vertical scaling is obviously required. However, RopeWeaver is a decision support system intended for use in factories and warehouses, where most users tend to be logged in for extended periods of time (e.g. an 8-hour shift), so I fear a horizontal scaling aspect will also come into play. RopeWeaver poses a tricky server resource conundrum which I’ll have to experiment with further before I can come to any proper conclusions.
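
To make the distinction between the two projects a little more concrete, here is a rough back-of-the-envelope sketch in Python. It is not anything AppHarbor provides: the function names, the 70% utilisation target and all of the traffic figures are my own hypothetical assumptions, and the vertical-scaling estimate assumes a job speeds up linearly with extra CPU, which is optimistic. The first function covers the TuitionKit-style case (many cheap requests, so add workers); the second covers the RopeWeaver-style case (one heavy job, so make the worker bigger).

```python
import math


def workers_needed(requests_per_second, seconds_per_request, target_utilisation=0.7):
    """Horizontal scaling estimate: how many workers a request-bound app needs.

    requests_per_second * seconds_per_request is the average number of
    requests in flight at once; dividing by a target utilisation
    (a hypothetical 70% by default) leaves headroom for spikes.
    """
    busy_workers = requests_per_second * seconds_per_request
    return max(1, math.ceil(busy_workers / target_utilisation))


def job_duration(cpu_seconds, cpu_multiplier):
    """Vertical scaling estimate: wall-clock time of one CPU-bound job.

    Assumes (optimistically) that the job speeds up linearly with the
    CPU resources given to the worker, e.g. a 2x or 4x worker.
    """
    return cpu_seconds / cpu_multiplier


if __name__ == "__main__":
    # Hypothetical CRUD-style load: 50 page requests/sec at 0.1 sec each.
    print("Workers for request load:", workers_needed(50, 0.1))

    # Hypothetical number-crunching job: 600 CPU-seconds on a 1x worker.
    for multiplier in (1, 2, 4):
        print(f"{multiplier}x worker:", job_duration(600, multiplier), "seconds")
```

The point of keeping the two estimates separate is that they respond to different levers: adding workers does nothing for the duration of a single heavy calculation, and a bigger worker does not by itself add availability. A system with both long-lived user sessions and periodic heavy jobs would need to look at both numbers.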