- The week is over. There is a retrospective post here.
- I have posted Tuesday’s notes here, Wednesday’s notes here, Thursday’s notes here, and Friday’s notes here.
I have decided to write a simple PaaS (Platform as a Service), in order to demonstrate the capabilities of my perfectapi API framework for Node.js. I don’t want to compete with other PaaS offerings – this is just a need that my company has – to demonstrate how to simply and easily self-host and scale services developed with perfectapi.
Doing it in one week is a challenge I put to myself, because I have a limited amount of time to devote to this, and because doing it in one week (5 days) can drive a little PR for perfectapi.
So what exactly will I be building? The platform will have the following qualities:
- Ability to deploy multiple Node.js perfectapi-based services, and load balance multiple instances of those services
- Each PaaS platform supports a single domain endpoint – in my case it will be services.perfectapi.com
- Provide the groundwork to automatically scale out as load increases (automatic scale-out itself is not in scope for this week, but manually scaling out will be possible)
- Services all run behind SSL
- Complete code and Amazon AMI images (Ubuntu 11.10) to be provided, so that others can reproduce the PaaS environment easily
- Written in Node.js and using freely available Linux tools where necessary
- No security to speak of (limit access using firewalls or shared secrets)
Day 1 (Monday) – Design
I spent most of Monday in Caribou Coffee, researching and designing on my iPad. Oh, the life I lead.
The design I have consists of several components. At the core is the Service Registry, which stores a list of the services installed in my domain.
Day 2 (Tuesday) – Service Registry
The Service Registry keeps a record for each service instance, including the unique service name and the path on which it will be hosted. So for example, the following:
- “Payment Portal”, /payments, attributes, files
- “Payment Portal”, /payments, attributes, files
- “Mailer”, /email, files
…represents 3 service instances. The first two are instances of the same service, and host on the same “/payments” path. The 3rd is another service. Each instance has its own copy of the files, and has a set of attributes.
In my “services.perfectapi.com” domain, the records above would represent 2 endpoints:
- https://services.perfectapi.com/payments – load balanced across 2 instances
- https://services.perfectapi.com/email – pointing at a single instance
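To make the mapping from instance records to endpoints concrete, here is a minimal sketch. The record shape (`name`, `path`) and the `toEndpoints` helper are assumptions made for illustration, not part of perfectapi.

```javascript
// Hypothetical record shape; the field names are assumptions for illustration.
const records = [
  { name: "Payment Portal", path: "/payments" },
  { name: "Payment Portal", path: "/payments" },
  { name: "Mailer", path: "/email" },
];

// Collapse instance records into one endpoint per path, counting instances.
function toEndpoints(domain, instances) {
  const counts = new Map();
  for (const { path } of instances) {
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return [...counts].map(([path, count]) => ({
    url: `https://${domain}${path}`,
    instances: count,
  }));
}

const endpoints = toEndpoints("services.perfectapi.com", records);
console.log(endpoints);
```

Three records collapse to two endpoints because the two “Payment Portal” instances share the “/payments” path.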
The Service Registry does not create these services, or manage their endpoints, or anything really other than record their existence. It will have the following API commands and queries:
- RegisterInstance(service name, path, attributes, files) – creates a new record of a service instance
- UnregisterInstance(service name, path, [host], [port]) – removes a single matching existing record of a service instance
- GetServiceInfo(service name) – returns array of host, path, port, attributes, files
- ListServices – lists service names
- ListUnclaimedServices – lists service names which do not yet have a port and host specified. May list the same service name multiple times if there are multiple matches.
- ClaimService(service name, path, host, port, attributes) – updates an unclaimed service instance with new details. If there are multiple unclaimed instances, then only one will be updated. If there are no matches, returns an error.
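The register and claim semantics above could look roughly like the following sketch. It keeps records in a plain array purely for illustration; the class name, method names and field names are assumptions, and only a few of the listed commands are shown.

```javascript
// In-memory stand-in for the registry, for illustration only.
class ServiceRegistry {
  constructor() {
    this.instances = [];
  }

  // RegisterInstance: a new record starts unclaimed (no host or port yet).
  registerInstance(name, path, attributes = {}, files = []) {
    this.instances.push({ name, path, attributes, files, host: null, port: null });
  }

  // ListUnclaimedServices: one entry per unclaimed instance, so names may repeat.
  listUnclaimedServices() {
    return this.instances.filter((i) => i.host === null).map((i) => i.name);
  }

  // ClaimService: update exactly one matching unclaimed instance, or throw.
  claimService(name, path, host, port, attributes = {}) {
    const match = this.instances.find(
      (i) => i.name === name && i.path === path && i.host === null
    );
    if (!match) throw new Error(`No unclaimed instance of "${name}" at ${path}`);
    match.host = host;
    match.port = port;
    match.attributes = { ...match.attributes, ...attributes };
    return match;
  }

  // GetServiceInfo: every instance record for a service name.
  getServiceInfo(name) {
    return this.instances.filter((i) => i.name === name);
  }
}

const registry = new ServiceRegistry();
registry.registerInstance("Payment Portal", "/payments");
registry.registerInstance("Payment Portal", "/payments");
registry.claimService("Payment Portal", "/payments", "ABC", 4001);
console.log(registry.listUnclaimedServices());
```

After one claim, a single “Payment Portal” instance is still unclaimed, which is what another server would pick up.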
In version 1, the Service Registry will probably store its data in Redis. In later versions I would like it to use DNS-SD.
Day 3 (Wednesday) – Reverse Proxy
The Service Registry was a single service for a domain. The Reverse Proxy is a single service per machine. The Reverse Proxy monitors the Service Registry, and ensures that all instances on the current machine have a route from outside. So for example, on the current host we may have the following in the service registry:
- “Payment Portal”, /payments, port 4001, host ABC
- “Payment Portal”, /payments, port 4002, host ABC
- “Mailer”, /email, port 4003, host ABC
The Reverse Proxy ensures that all services are accessible on port 80, i.e.
- http://abc/payments – load balanced across 2 instances
- http://abc/email – pointing at a single instance
It also monitors the Service Registry for changes, and when it finds that there is a discrepancy, it updates the proxy configuration to create or remove a route.
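One way to picture the discrepancy check is as a diff between the routes the proxy currently has and the routes the registry implies. The sketch below assumes a route table shaped as path → list of ports; `desiredRoutes` and `diffRoutes` are hypothetical helpers, and actually applying the changes to a proxy is left out.

```javascript
// Build the desired route table for one host: path -> list of local ports.
function desiredRoutes(instances, host) {
  const routes = {};
  for (const i of instances) {
    if (i.host !== host) continue;
    (routes[i.path] = routes[i.path] || []).push(i.port);
  }
  return routes;
}

// Compare current and desired tables; report paths whose backends differ.
function diffRoutes(current, desired) {
  const same = (a = [], b = []) =>
    a.length === b.length && [...a].sort().join() === [...b].sort().join();
  const paths = new Set([...Object.keys(current), ...Object.keys(desired)]);
  const changes = [];
  for (const path of paths) {
    if (!(path in desired)) {
      changes.push({ action: "remove", path });
    } else if (!same(current[path], desired[path])) {
      changes.push({ action: "set", path, ports: desired[path] });
    }
  }
  return changes;
}

const registryRecords = [
  { name: "Payment Portal", path: "/payments", port: 4001, host: "ABC" },
  { name: "Payment Portal", path: "/payments", port: 4002, host: "ABC" },
  { name: "Mailer", path: "/email", port: 4003, host: "ABC" },
];
// Made-up current state: one stale route and one missing backend.
const current = { "/payments": [4001], "/old": [4009] };
const changes = diffRoutes(current, desiredRoutes(registryRecords, "ABC"));
console.log(changes);
```

Here the proxy would add port 4002 to “/payments”, drop the stale “/old” route, and create “/email”.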
In addition to routing traffic, the reverse proxy exposes an endpoint with load information for the machine it is installed on. It exposes two methods:
- GetLoad – returns a number indicating the current 5-minute load average. 1.0 or higher indicates that the server load is at or above capacity
- GetCulprit – returns the name of the service instance that is most likely the culprit, i.e. the instance that is using the most CPU. (I/O is ignored because Node.js processes should not be I/O-intensive, and network is ignored because it is unlikely to be a limitation in a cloud environment.)
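A rough sketch of the two methods follows. It uses Node's `os.loadavg()`, which returns the 1-, 5- and 15-minute load averages. Dividing by the CPU count so that 1.0 means "at capacity" is an assumption here, and the `samples` list of per-instance CPU readings is hypothetical (collecting those readings is not shown).

```javascript
const os = require("os");

// GetLoad: 5-minute load average per CPU, so 1.0 means "at capacity".
// os.loadavg() returns [1min, 5min, 15min]; it is all zeros on Windows.
function getLoad(loadavg = os.loadavg(), cpuCount = os.cpus().length) {
  return loadavg[1] / Math.max(1, cpuCount);
}

// GetCulprit: name of the instance using the most CPU.
// `samples` is a hypothetical list of { name, cpuPercent } readings.
function getCulprit(samples) {
  if (samples.length === 0) return null;
  return samples.reduce((worst, s) =>
    s.cpuPercent > worst.cpuPercent ? s : worst
  ).name;
}

console.log(getLoad());
console.log(
  getCulprit([
    { name: "Payment Portal", cpuPercent: 12 },
    { name: "Mailer", cpuPercent: 71 },
  ])
);
```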
Day 4 & 5 (Thursday & Friday) – Hosting Server
Like the Reverse Proxy, the Hosting Server runs a single instance per machine. It contains the bulk of the PaaS functionality. Specifically, it is responsible for:
- installing new app instances, on request (copying files, launching the process, ensuring it remains running).
- removing app instances, on request
- updating the Service Registry when apps are added or removed
The expected API methods are:
- PublishApp – publishes a new app
- DeleteApp – deletes an existing app
- UpdateApp – combination of a delete and a publish (publish new, then delete old)
- GetAppDetails – returns the attributes of an app
- ListApps – lists all apps
- ListAppInstances – lists the instances for an app
The Publish workflow goes something like this:
- Write the new instance(s) to the Service Registry (with empty host and port)
- Each Hosting Server polls the Service Registry every minute
- If an unclaimed service instance is found, and this server meets the service instance’s requirements, and this server’s load is low enough, then claim it (write host and port to the Service Registry).
The definition of “this server’s load is low enough” is that a call to the Unix “uptime” command returns a 5-minute load average of less than 1.0 per CPU of the system. This criterion should work on both Amazon EC2 Instances and Rackspace Cloud Servers.
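The load test could be sketched as below. This assumes the usual Linux `uptime` format ending in `load average: a, b, c`; the function names and the sample line are made up for illustration.

```javascript
// Extract the 5-minute load average from a line of `uptime` output.
// Assumes the Linux format, e.g. "... load average: 0.52, 0.58, 0.59".
function fiveMinuteLoad(uptimeOutput) {
  const match = /load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)/.exec(
    uptimeOutput
  );
  if (!match) throw new Error("Could not find load averages in uptime output");
  return parseFloat(match[2]); // second number is the 5-minute average
}

// The server may claim an instance only if load per CPU is below 1.0.
function loadIsLowEnough(uptimeOutput, cpuCount) {
  return fiveMinuteLoad(uptimeOutput) / cpuCount < 1.0;
}

// Made-up sample line for illustration.
const sample =
  " 14:02:11 up 10 days,  3:04,  1 user,  load average: 0.52, 1.58, 0.59";
console.log(loadIsLowEnough(sample, 2)); // 1.58 / 2 = 0.79, so true
console.log(loadIsLowEnough(sample, 1)); // 1.58 / 1 = 1.58, so false
```

On a real server the input could come from `require("child_process").execSync("uptime").toString()`, with `require("os").cpus().length` as the CPU count.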
Day 5 (Friday) – SSL Termination & Load Balancing
Another Reverse Proxy. This one is a Linux package called “pound”, which can do SSL termination (host the https) as well as load balancing amongst our servers. It works off of a static configuration file. The file will not need to change as long as the list of servers remains static.
Auto Scaling Server (Future)
This is the part I will not build this week. In order to scale out, we need to:
- detect the need to scale – using GetLoad API call on the Reverse Proxies.
- determine which service needs scaling – using GetCulprit API call on the overloaded Reverse Proxy
- update the Service Registry with new instances
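Since this part is not built yet, the following is only a sketch of how the three steps might fit together. The `hosts` readings stand in for the results of GetLoad and GetCulprit on each Reverse Proxy, and `planScaleOut` is a hypothetical helper; registering the new instances is not shown.

```javascript
// Decide which services need another instance.
// `hosts` is a hypothetical list of { host, load, culprit } readings, where
// load and culprit stand in for the results of GetLoad and GetCulprit.
function planScaleOut(hosts, threshold = 1.0) {
  const services = new Set();
  for (const h of hosts) {
    // A load of 1.0 or higher means the machine is at or above capacity.
    if (h.load >= threshold && h.culprit) services.add(h.culprit);
  }
  return [...services];
}

const toScale = planScaleOut([
  { host: "ABC", load: 1.4, culprit: "Payment Portal" },
  { host: "DEF", load: 0.3, culprit: "Mailer" },
  { host: "GHI", load: 1.1, culprit: "Payment Portal" },
]);
console.log(toScale);
```

Each name returned would then get a new unclaimed record in the Service Registry, to be picked up by a server with spare capacity.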
That takes care of scaling out within the existing machines in our cloud. Once we exhaust the capacity of the existing machines, we have to start using third-party services, such as SCALR or Amazon’s Auto Scaling. In that case, it probably makes sense to move the SSL termination and load balancing functions off our simple “pound” package.