This is a continuation of previous posts:
- Writing a PaaS (using Node.js) in 1 week – Monday
- Writing a PaaS (using Node.js) in 1 week – Tuesday
Well, Wednesday is done. The reverse proxy is written, and the code is at https://github.com/perfectapi/node-paas-machine-proxy.
As a reminder, the purpose of the reverse proxy was to provide a single IP + host endpoint to each machine (host) on which multiple services are running. We can run multiple services on that host, or multiple instances of the same service. The reverse proxy provides routing and round-robin load balancing between the services. Each service runs on a unique path, e.g. http://host/service1, http://host/service2.
A higher level proxy will provide routing between machines and SSL termination, providing endpoints like https://mydomain.com/service1.
The reverse proxy must also automatically update its configuration when it detects a change in the Service Registry, and expose load information for the host (so that we can know when it is over capacity).
How did it go?
I was torn on whether to use one of the Node.js reverse proxies (bouncy, node-http-proxy) for the routing, or to use a more mature Linux package. In the end I decided to go with haproxy, mainly because it can detect when a particular endpoint is not responding and switch that endpoint off. This should prove useful when problems occur, or when services are being upgraded or scaled down.
It also supports hot-switching of the proxy configuration, which is awesome.
Once the decision was made, configuring haproxy was not very hard. I had to take the time to read its configuration manual (a gigantic text file, argh). OK, I just skimmed it.
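To make the routing concrete, here is a minimal sketch of how a haproxy configuration for path-based routing with round-robin balancing could be generated from a list of services, plus the arguments for a hot reload. This is not the code from the repository: the `services` shape, the function names, and the file paths are all hypothetical. The haproxy directives themselves (`path_beg`, `use_backend`, `balance roundrobin`, `check`) and the `-f`, `-p`, `-D` and `-sf` command-line flags are standard.

```javascript
// Build a minimal haproxy config from a list of services.
// ASSUMPTION: the { name, instances: [{ host, port }] } shape is made up
// for this sketch; it is not the Service Registry's actual format.
function buildHaproxyConfig(services, listenPort) {
  const lines = [
    'defaults',
    '  mode http',
    '  timeout connect 5s',
    '  timeout client 30s',
    '  timeout server 30s',
    '',
    'frontend machine_proxy',
    '  bind *:' + listenPort,
  ];

  // One routing rule per service: requests whose path begins with
  // /<name> go to the backend of the same name.
  for (const svc of services) {
    lines.push('  use_backend ' + svc.name + ' if { path_beg /' + svc.name + ' }');
  }

  // One backend per service, balancing round-robin across its instances.
  // "check" turns on health checks, so a dead instance is taken out of rotation.
  for (const svc of services) {
    lines.push('', 'backend ' + svc.name, '  balance roundrobin');
    svc.instances.forEach((inst, i) => {
      lines.push('  server ' + svc.name + '_' + i + ' ' + inst.host + ':' + inst.port + ' check');
    });
  }

  return lines.join('\n') + '\n';
}

// Arguments for starting a new haproxy that takes over from the old one.
// -sf asks the listed old processes to finish their current connections
// and exit once the new process is up.
function haproxyReloadArgs(configPath, pidFilePath, oldPids) {
  const args = ['-f', configPath, '-p', pidFilePath, '-D'];
  if (oldPids.length > 0) {
    args.push('-sf', ...oldPids.map(String));
  }
  return args;
}
```

The resulting arguments could be passed to something like `child_process.spawn('haproxy', args)` after writing the config to disk. Note that `path_beg /service1` would also match `/service10`, so a real configuration may need a stricter match.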
Testing
I did less testing on this component than I did on the service registry. I think it's one of those things where you can only find the issues once you start loading up services and seeing what happens.
I did do enough testing to know that the code does what I want; it's just a matter of figuring out whether what I want will work in practice. So I'll do that on Thursday/Friday, and hopefully it will work itself out.
Determining Load
Determining the load was fairly easy. I use a combination of the node-provided
os.loadavg()
method, and calling
ps -A u
to find the process with the highest CPU percentage.
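As a rough illustration of that approach (not the repository's actual code), the sketch below combines `os.loadavg()` with a parse of the `ps -A u` output. The function names `busiestProcess` and `getLoad` are hypothetical, and the parser assumes the usual header row containing `PID`, `%CPU` and `COMMAND` columns.

```javascript
const os = require('os');
const { execFile } = require('child_process');

// Parse the text output of `ps -A u` and return the process with the
// highest %CPU, or null if no rows could be parsed.
// ASSUMPTION: the first line is a header naming the PID, %CPU and COMMAND columns.
function busiestProcess(psOutput) {
  const lines = psOutput.trim().split('\n');
  const header = lines[0].trim().split(/\s+/);
  const cpuCol = header.indexOf('%CPU');
  const pidCol = header.indexOf('PID');
  const cmdCol = header.indexOf('COMMAND');

  let top = null;
  for (const line of lines.slice(1)) {
    const cols = line.trim().split(/\s+/);
    const cpu = parseFloat(cols[cpuCol]);
    if (Number.isNaN(cpu)) continue; // skip rows we cannot parse
    if (top === null || cpu > top.cpu) {
      // COMMAND is the last column and may itself contain spaces.
      top = { pid: Number(cols[pidCol]), cpu, command: cols.slice(cmdCol).join(' ') };
    }
  }
  return top;
}

// Gather the 1, 5 and 15 minute load averages plus the busiest process.
function getLoad(callback) {
  execFile('ps', ['-A', 'u'], (err, stdout) => {
    if (err) return callback(err);
    callback(null, { loadAverage: os.loadavg(), busiest: busiestProcess(stdout) });
  });
}
```

Calling `getLoad((err, load) => console.log(err || load))` would print the current figures. Keep in mind that `os.loadavg()` is only meaningful on Unix-like systems, which fits here since `ps` is being used anyway.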