How we made axonator.com highly scalable

At axonator.com we adopted a highly scalable and highly available architecture right from the beginning. In this blog post I would like to discuss some highlights of that architecture. The details of individual elements will be covered in upcoming posts.

To put things into context, I would like to enumerate the key factors that have driven our architectural decisions:

  1. Axonator is a multi-tenant application
  2. Axonator is a data- and network-intensive platform
  3. Part of the database is relational and part is document-based
  4. The application needs to remain highly available in the event of catastrophic failures
  5. The application needs to scale to a large number of users

The key features of this architecture include, but are not limited to, a robust, object-oriented, highly scalable and highly available design. At Axonator we are a small team of developers who take extreme pride in our architectural designs and world-class development.

Here is the overview in a visual format:

Diagram D1: High-level overview of the Axonator architecture.

Highlights of the Axonator architecture depicted above:

  1. Highly scalable
  2. Object oriented
  3. Highly available

A Brief Intro to Axonator.com

Axonator gives you the control you need to build and deploy your form- and workflow-based mobile applications without coding. Axonator offers a complete, web-based drag-and-drop app development and delivery platform that lets you quickly build data capture and workflow mobile apps at a fraction of the cost of other solutions. Maximize the productivity and reliability of your mobile employees, and minimize disappointed customers and lost revenue.

Axonator is a complete ecosystem for the modern mobile-first enterprise. Whether you’re capturing events, incidents, assets, surveys, and inspections or deploying complex workflows in the field, Axonator is the optimal platform to connect your field employees and data to your decision dashboard.

The high-performance, efficient workflow engine in Axonator handles data-submission traffic smoothly, switching and routing each request to the right user or destination. Companies deploy Axonator to manage the field data capture associated with activities like sales or inspections, and to make their business more responsive, scalable, fast, and secure.

Here are the key components of the Axonator platform:

  1. App Builder: The App Builder allows users to develop apps without coding; 90% of the activities are drag-and-drop. The two main components of the App Builder are:
    1. Form Builder: Lets you build forms with widgets such as text, photo, and barcode/QR-code capture.
    2. Workflow Designer: Lets users define the set of activities a business performs to complete a task such as an inspection, survey, or maintenance job.
  2. Workflow Engine: The high-performance workflow engine executes the workflows designed in the App Builder.
  3. Axonator Host App: All apps built using the Axonator App Builder run inside a host app built for each platform. The host-app architecture allows users to deploy mobile apps without going through Apple’s app submission and approval process, so the apps are instantly available to mobile employees.
    1. Host app for Android
    2. Host app for iOS

Axonator Needs to be Highly Scalable

Axonator allows users to build native mobile apps. These mobile apps are used by mobile employees for activities like surveys, inspections, and other types of data collection. Since a huge amount of data may be collected every minute by a large number of users, Axonator’s architecture needs to be highly scalable to accommodate both current and future usage patterns.

Axonator Needs to be Highly Available

Like any business-critical application, Axonator needs to be available all the time, 24×7, with ideally zero downtime. This is important because businesses rely on its availability to function smoothly.

The Axonator Architecture – How We Made Axonator Scalable and Highly Available

Let’s get to the meat of this post: the actual architecture of Axonator that makes it highly available and scalable. In this section, we will describe in detail the pieces that work together to provide these important properties.

High-level components:

  1. NGINX
  2. Redis
  3. RabbitMQ
  4. Gunicorn
  5. Node.js
  6. Celery
  7. Python/Django
  8. MongoDB
  9. PostgreSQL

Gunicorn

Gunicorn is based on the prefork worker model, which means there is a central master process that manages a set of worker processes. The master never knows anything about individual clients; all requests and responses are handled entirely by the worker processes.

Diagram D2: Gunicorn’s prefork master/worker model.

Gunicorn should only need 4-12 worker processes to manage hundreds or thousands of requests per second.

Note that there is definitely such a thing as too many workers: beyond a certain point, your worker processes will start thrashing system resources and the throughput of the entire system drops.

Using threads instead of processes is a nice way to reduce Gunicorn’s memory footprint while still permitting application upgrades via the reload signal, since the application code is shared among workers but loaded only in the worker processes (unlike the preload setting, which loads the code in the master process).
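To make this concrete, here is a minimal sketch of a Gunicorn configuration file (a gunicorn.conf.py is itself plain Python); the worker and thread counts below are illustrative starting points, not our production values:

    # gunicorn.conf.py -- a minimal sketch; values are illustrative.
    import multiprocessing

    # A common rule of thumb: (2 x CPU cores) + 1 worker processes.
    workers = multiprocessing.cpu_count() * 2 + 1

    # A few threads per worker reduce the memory footprint compared
    # to adding still more processes.
    threads = 4

    # Bind only on localhost; NGINX sits in front as the buffering proxy.
    bind = "127.0.0.1:8000"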

PostgreSQL

PostgreSQL is a much more robust database server than MySQL. Most Django users opt for PostgreSQL, and the Django ORM works better with it than with any other database server.
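For illustration, pointing the Django ORM at PostgreSQL takes only a few lines in settings.py; the database name and credentials below are placeholders, not our actual configuration:

    # settings.py -- hypothetical connection settings for PostgreSQL.
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql_psycopg2",
            "NAME": "axonator",       # placeholder database name
            "USER": "axonator_app",   # placeholder role
            "PASSWORD": "change-me",  # placeholder
            "HOST": "127.0.0.1",
            "PORT": "5432",
        }
    }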

NGINX Reverse Proxy

Proxying is commonly used to distribute load among several servers, to seamlessly serve content from different websites, or to pass requests to application servers over protocols other than HTTP.

Gunicorn’s default synchronous workers assume that requests are handled quickly, and this resource-bound assumption is why we need a buffering proxy in front of a default-configuration Gunicorn. If synchronous workers were exposed directly to the internet, a DoS attack would be trivial: an attacker only needs to create a load that trickles data to the servers. For the inquisitive, Slowloris is an example of this type of attack.

NGINX is an unbelievably fast and lightweight web server. We use it to serve the static files for our Django app.

NGINX is an efficient accelerating proxy for a wide range of HTTP-based applications. Its caching, HTTP connection processing, and offload capabilities significantly amplify application performance, particularly under high load. NGINX Plus extends NGINX with additional load-balancing capabilities such as session persistence, health checks, dynamic reconfiguration of load-balanced server groups, and live activity monitoring.

NGINX is an event-driven server. Every worker process is single-threaded and runs a non-blocking event loop in order to process requests very quickly.
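To make the event-loop model concrete, here is a toy single-threaded, non-blocking echo server using only Python’s standard library; it sketches the pattern an NGINX worker follows, not anything NGINX-specific:

    # One thread multiplexes many sockets through a selector, instead
    # of dedicating a thread or process to each connection.
    import selectors
    import socket

    sel = selectors.DefaultSelector()
    srv = socket.socket()
    srv.bind(("127.0.0.1", 8080))
    srv.listen()
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ)

    while True:
        for key, _ in sel.select():        # wait for readable sockets
            if key.fileobj is srv:         # new client connecting
                conn, _ = srv.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:                          # client sent data
                data = key.fileobj.recv(4096)
                if data:
                    key.fileobj.send(data)  # echo it back
                else:                       # client closed connection
                    sel.unregister(key.fileobj)
                    key.fileobj.close()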

Load balancers help us scale. They make horizontal scaling easier by distributing load across multiple backends. Instead of one very powerful and very expensive server, we can use load balancing to spread the load across five cheap, medium-powered servers.

Load balancers also help us handle failures. Even if a single server has enough power to serve your web application, that server will eventually go down: it will crash, or the CPU will get fried. In the age of cloud computing, servers are fleeting and not long-lived. Load balancers let us stay highly available by providing redundant backends.

Who balances the load to load balancers?

Can you be highly available with multiple backend application servers but only a single load balancer? No: you are back in the same position as before, except that now the load balancer is your single point of failure.

You have to have multiple load balancers. But then who load balances the load balancers?

The answer is DNS. When you have multiple NGINX load balancers for your website, the recommended approach is DNS load balancing: presenting multiple IPs for a single A record.

Diagram D3: DNS load balancing across multiple NGINX load balancers.

One flaw here is that although clients are expected to choose a random IP from the list, poorly implemented clients will always use the first IP presented. Modern DNS providers (like AWS Route 53 and Dynect) randomize the ordering for each DNS request to help mitigate hotspots.
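As a client-side illustration, a well-behaved client can resolve all A records for the load-balanced name and pick one at random instead of always taking the first; the hostname below is a placeholder:

    # Resolve every A record behind a DNS-load-balanced name and pick
    # one at random instead of always using the first entry returned.
    import random
    import socket

    infos = socket.getaddrinfo("www.example.com", 80,
                               proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    server_ip = random.choice(addresses)
    print(server_ip)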

The other flaw is that DNS records are often cached for a long time. As a result, if a load balancer stops functioning, its IP address may still be cached for some users even after you remove it from your DNS entry.

Even worse, users get no indication that something is wrong, just a cryptic error that they could not connect. This is unacceptable even with DNS TTLs as low as 60 seconds.

Thus, to be highly available you need a combination of DNS load balancing and a mechanism to reassign IP addresses. Depending on your provider, you can use something like keepalived to send heartbeats between servers and take over the IP address of a downed server by sending an ARP broadcast.

Initially keepalived did not work on cloud providers like AWS, because they usually filter broadcast traffic from the network for security. Since version 1.2.8, however, keepalived supports unicast, which makes it a better way to build highly available NGINX configurations, even on AWS.

C10K with NGINX

We use NGINX as a load balancer primarily for scalability. You generally have only a handful of load balancers in front of tens or hundreds of application servers, so each load balancer sees a much larger number of connections and requests per second.

Latency and scalability matter even more at the load-balancer tier, so you want to be certain you get these numbers right.

Serving 10,000 simultaneous connections through a web server is known as the “C10K problem”.

NGINX can scale to 10,000 connections without much difficulty, with a bit of tuning of both NGINX and the Linux kernel. In fact, it is quite common to scale it to 100,000 or more connections.

Whether your application can handle that many connections is a different problem altogether.

Node.js and Web Push Notifications

Diagram D4: Node.js handling web push notifications.

Celery/Redis/RabbitMQ

Task queues are used to distribute work across threads or machines. A task, a unit of work, is the task queue’s input. Dedicated worker processes constantly monitor task queues for new work to perform.

Diagram D5: A task queue with dedicated Celery workers.

Celery communicates through messages, usually using a broker to mediate between clients and workers. To launch a task, a client adds a message to the queue, and the broker then delivers it to a worker. A Celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling; a minimal sketch follows the list below.

  • Highly Available
    • Workers and clients automatically retry in the event of connection loss or failure, and some brokers support HA in the form of Master/Master or Master/Slave replication.
  • Fast
    • A single Celery process can handle millions of tasks per minute, with sub-millisecond round-trip latency (using RabbitMQ, py-librabbitmq, and optimized settings).
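Here is the minimal sketch mentioned above of a Celery app wired to RabbitMQ as the broker and Redis as the result backend; the task and the URLs are hypothetical, for illustration only:

    # tasks.py -- a hypothetical Celery task; the broker/backend URLs
    # are placeholders for the RabbitMQ and Redis instances above.
    from celery import Celery

    app = Celery(
        "axonator",                           # hypothetical app name
        broker="amqp://guest@localhost//",    # RabbitMQ
        backend="redis://localhost:6379/0",   # Redis result store
    )

    @app.task
    def process_submission(submission_id):
        """Process one form submission asynchronously."""
        ...

A client can then queue work without blocking: process_submission.delay(42) returns immediately, and a worker started with "celery -A tasks worker" picks the message up from RabbitMQ.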

Diagram D6: Message queueing with RabbitMQ.

Clustering

Multiple RabbitMQ servers on a local network can be clustered together to form a single logical broker.

Federation

RabbitMQ offers a federation model for servers that need to be connected more loosely and unreliably than clustering allows.

Highly Available Queues

Queues can be mirrored across several machines in a cluster, ensuring that your messages are safe even in the case of hardware failure.
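For illustration, publishing a persistent message to a durable queue with the pika client might look like the sketch below; the queue name is hypothetical, and the mirroring itself is configured server-side through an HA policy:

    # Publish a persistent message so it survives a broker restart;
    # mirroring across cluster nodes is set by a server-side policy.
    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    ch = conn.channel()
    ch.queue_declare(queue="submissions", durable=True)
    ch.basic_publish(
        exchange="",
        routing_key="submissions",
        body=b'{"submission_id": 42}',
        properties=pika.BasicProperties(delivery_mode=2),  # persist to disk
    )
    conn.close()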

MongoDB

A distinctively powerful feature of MongoDB is its support for geospatial indexes. This lets you store either GeoJSON or x/y coordinates within documents and then find documents that are near a set of coordinates or within a circle or box.
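Here is a sketch of what that looks like with PyMongo; the database, collection, and field names are illustrative:

    # Create a 2dsphere index on a GeoJSON field, then query for
    # documents near a coordinate (names are placeholders).
    from pymongo import MongoClient, GEOSPHERE

    coll = MongoClient()["axonator"]["submissions"]
    coll.create_index([("location", GEOSPHERE)])

    # Find submissions within roughly 1 km of a point (lon, lat).
    nearby = coll.find({
        "location": {
            "$near": {
                "$geometry": {"type": "Point",
                              "coordinates": [73.8567, 18.5204]},
                "$maxDistance": 1000,   # metres
            }
        }
    })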

While replication can help performance to some extent (by isolating long-running queries on secondaries and reducing latency for some types of queries), its main purpose is to provide high availability. The primary method for scaling a MongoDB cluster is sharding. Combining replication with sharding is the prescribed way to achieve both scaling and high availability.

Diagram D7: MongoDB replication and sharding.

Scaling PostgreSQL

Database servers can work together to allow a second server to take over quickly if the primary server fails (high availability), or to allow several computers to serve the same data (load balancing). Ideally, database servers could work together seamlessly. Web servers serving static web pages can be combined quite easily by merely load-balancing web requests to multiple machines. In fact, read-only database servers can be combined relatively easily too. Unfortunately, most database servers have a read/write mix of requests, and read/write servers are much harder to combine. This is because though read-only data needs to be placed on each server only once, a write to any server has to be propagated to all servers so that future read requests to those servers return consistent results.

This synchronization problem is the fundamental difficulty for servers working together. Because there is no single solution that eliminates the impact of the sync problem for all use cases, there are multiple solutions. Each solution addresses this problem in a different way, and minimizes its impact for a specific workload.


Some solutions deal with synchronization by permitting only one server to modify the data. Servers that can modify data are called read-write, primary, or master servers. Servers that track changes in the master are called slave or standby servers. A standby server that cannot accept connections until it is promoted to a master server is called a warm standby server, while one that can accept connections and serve read-only queries is called a hot standby server.

Some solutions are synchronous, meaning a data-modifying transaction is not considered committed until all servers have committed it. This guarantees that a failover will not lose any data and that all load-balanced servers will return consistent results regardless of which server is queried. In contrast, asynchronous solutions allow some lag between the time of a commit and its propagation to the other servers, opening the possibility that some transactions might be lost in the switch to a backup server and that load-balanced servers might return slightly stale results. Asynchronous replication is used when synchronous would be too slow.
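As an illustration of keeping an eye on a streaming setup, you can query the primary’s pg_stat_replication view to see each standby and its replication mode; the connection details in this sketch are placeholders:

    # Monitor streaming replication from the primary with psycopg2;
    # host, database, and credentials are placeholders.
    import psycopg2

    conn = psycopg2.connect(host="primary.example.com", dbname="axonator",
                            user="monitor", password="change-me")
    with conn.cursor() as cur:
        cur.execute("SELECT client_addr, state, sync_state"
                    "  FROM pg_stat_replication")
        for client_addr, state, sync_state in cur.fetchall():
            # 'sync' marks a synchronous standby; 'async' standbys
            # may lag slightly behind the primary.
            print(client_addr, state, sync_state)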

Solutions can also be categorized by their granularity. Some can deal only with an entire database server, while others allow control at the per-table or per-database level. Performance must be considered in any choice; there is usually a trade-off between functionality and performance. For example, a fully synchronous solution over a slow network might cut performance by more than half, while an asynchronous one might have only a minimal performance impact.

Mobile App considerations

Axonator provides native host apps for each platform it supports. As of this writing, it supports the two most popular platforms: iOS and Android smartphones and tablets. The host apps are high-performance native apps, available as free downloads on the Apple App Store and Google Play Store.

Premises:

  1. User Experience: Unlike many other enterprise app developers, we consider enterprise app users to be human beings, just like consumer app users
  2. Performance: Mobile apps need to deliver high performance in terms of response times and smoothness of the user interface

These considerations have led us to the following implementation choices:

  1. Native Technology: The apps built using the Axonator App Builder are native. The alternative HTML5 and hybrid approaches were rejected for the following reasons:
    1. HTML5 and hybrid apps are sluggish: they use JavaScript for application logic, and JavaScript, being an interpreted language, is slow to execute. The effect is even more pronounced on CPU- and memory-strapped mobile devices.
    2. Their user experience is extremely poor in terms of UI interaction and response time, which is frustrating to any user. The argument that this is fine for enterprise customers is fraught with flaws.
    3. Native apps provide better control over the scarce memory on smartphones and tablets.
  2. JSON over XML API: We ditched an XML-based API in favor of JSON for performance reasons. We also use HTTP compression to save bandwidth, which leads to faster synchronization of data between the mobile apps and the server (see the sketch below).
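To illustrate the last point, a sync request from a host app boils down to something like the sketch below; the endpoint and payload are hypothetical:

    # A hypothetical sync call: JSON body, gzip-compressed response.
    import requests

    resp = requests.post(
        "https://axonator.com/api/v1/sync",   # hypothetical endpoint
        json={"records": []},                 # form submissions to upload
        headers={"Accept-Encoding": "gzip"},  # requests sends this by default
    )
    data = resp.json()                        # decompressed transparently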

Key Takeaways

Axonator is highly scalable and highly available, besides being robust in functionality and maintainability, which is very important for our developers as it allows them to quickly roll out feature upgrades. Availability matters because modern mobile-first businesses rely on business-critical apps for their day-to-day operations and, ultimately, for their customers’ satisfaction with their service. The key design decision to build the mobile apps with native technology is driven by a simple idea: enterprise users are human beings, just like consumers, so they deserve the great user experience of consumer apps.

Jayesh Kitukale

Jayesh Kitukale is the CEO of Axonator.com. Axonator is a web-based tool that lets non-programmers like you build your own business process automation mobile apps for iOS and Android without writing a single line of code.
