3rd-Eden

Apr 08

Finding balance

When building a real-time service it’s vital to have a high-performance, scalable proxy that actually works with WebSockets. There are many flavors, but which one is actually the best tool for the job in terms of raw performance?

The following technologies were tested: Node’s http-proxy, Nginx, HAProxy and stud (the latter for SSL offloading in front of HAProxy).

Three different, separate servers were used for testing. All of these servers are hosted at Joyent.

  1. Proxy, a 512MB Ubuntu server. This is the server where all the proxy servers are installed. Image: sdc:jpc:ubuntu-12.04:2.4.0
  2. WebSocketServer, a 512MB Node.js smart machine that ran our WebSocket echo server. The server is written in Node.js and spread across multiple cores using the cluster module. Image: sdc:sdc:nodejs:1.4.0
  3. Thor, another 512MB Node.js smart machine with the same specs as above. This is the server where we generated the load from. Thor is a WebSocket load generation tool that we’ve developed. It’s released as open source and available at http://github.com/observing/thor

Configuring the Proxy server

The Proxy server was just a clean, bare bones Ubuntu 12.04 server. These are the steps that were taken to configure it and install all the dependencies. To ensure that everything is up to date, run:

apt-get upgrade

The following dependencies were installed on the system:

apt-get install git build-essential libssl-dev libev-dev

Node.js

Node.js is required for the http-proxy. While it runs on the latest Node.js version, these tests were executed under 0.8.19 to ensure compatibility of all dependencies. It was installed from GitHub:

git clone git://github.com/joyent/node.git
cd node
git checkout v0.8.19
./configure
make
make install

This also installs the npm binary on the system so we can install the dependencies of this project. Run npm install . in the root of this repository and the http-proxy and all its dependencies are installed automatically.
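The proxy itself is started from a small script (http-proxy.js in this repository). A minimal sketch of what such a script could look like, assuming the node-http-proxy 0.x API and a made-up back end address, is:

var httpProxy = require('http-proxy');

// createServer(port, host) proxies regular HTTP traffic as well as
// WebSocket upgrade requests to the given back end
httpProxy
  .createServer(8080, 'webserver.example.com')
  .listen(80);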

Nginx

Nginx is already a widely deployed server. It supports proxying to different back-end servers, but it did not support WebSockets until support was recently added to the development branch. Therefore we installed the latest development version and compiled it from source:

wget http://nginx.org/download/nginx-1.3.15.tar.gz
tar xzvf nginx-1.3.15.tar.gz
cd nginx-1.3.15
./configure --with-http_spdy_module --with-http_ssl_module --pid-path=/var/run/nginx.pid --conf-path=/etc/nginx/nginx.conf --sbin-path=/usr/local/sbin --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --without-http_rewrite_module

As you can see from the options above, we’ve included SSL and SPDY and configured some other settings. This yielded the following configuration summary:

```
Configuration summary
  + PCRE library is not used
  + using system OpenSSL library
  + md5: using OpenSSL library
  + sha1: using OpenSSL library
  + using system zlib library

  nginx path prefix: "/usr/local/nginx"
  nginx binary file: "/usr/local/sbin"
  nginx configuration prefix: "/etc/nginx"
  nginx configuration file: "/etc/nginx/nginx.conf"
  nginx pid file: "/var/run/nginx.pid"
  nginx error log file: "/var/log/nginx/error.log"
  nginx http access log file: "/var/log/nginx/access.log"
  nginx http client request body temporary files: "client_body_temp"
  nginx http proxy temporary files: "proxy_temp"
  nginx http fastcgi temporary files: "fastcgi_temp"
  nginx http uwsgi temporary files: "uwsgi_temp"
  nginx http scgi temporary files: "scgi_temp"
```

After this it’s just a simple make away:

make
make install

HAProxy

HAProxy was already able to proxy WebSockets in TCP mode, but it’s now also possible to do so in HTTP mode. HAProxy also got support for HTTPS termination. So again, we need to install the development branch:

wget http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev18.tar.gz
tar xzvf haproxy-1.5-dev18.tar.gz
cd haproxy-1.5-dev18
make TARGET=linux26 USE_OPENSSL=1
make install
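The haproxy.cfg used in the benchmark lives in the repository; a bare bones sketch of a WebSocket capable configuration in HTTP mode, with assumed addresses and timeouts, would look something like:

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # keep upgraded WebSocket tunnels open
    timeout tunnel  1h

frontend proxy
    bind *:8080
    default_backend websockets

backend websockets
    server ws1 webserver.example.com:8080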

Stud

While HAProxy is capable of terminating SSL, it’s common practice to have stud in front of HAProxy for SSL offloading. So this is something we want to test as well.

git clone git://github.com/bumptech/stud.git
cd stud
make
make install
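Stud is later started with a stud.conf (see the Running section below); a minimal sketch of such a config, with assumed addresses and certificate path, would be along these lines:

# terminate SSL on port 443 and forward plain TCP to the proxy behind it
frontend = "[*]:443"
backend  = "[127.0.0.1]:8080"
pem-file = "/etc/ssl/private/server.pem"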

Now that everything is installed we need to install the configuration files. For Nginx you can copy & paste the nginx.conf from the root of this repository to /etc/nginx/nginx.conf. All the other proxies can be configured on the fly.
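The full nginx.conf is in the repository; the part that makes WebSocket proxying work in the nginx 1.3 development branch is the explicit HTTP/1.1 Upgrade handshake, which looks roughly like this (back end address assumed):

http {
  upstream websockets {
    server webserver.example.com:8080;
  }

  server {
    listen 80;

    location / {
      proxy_pass http://websockets;
      # WebSockets require HTTP/1.1 and forwarding the
      # Upgrade/Connection headers explicitly
      proxy_http_version 1.1;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
    }
  }
}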

Kernel tuning

After all the proxies are installed we need to do some socket tuning. This information was generously stolen from the internets:

vim /etc/sysctl.conf

And set the following values:

```
# General gigabit tuning:
net.core.somaxconn = 16384
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_syncookies = 1

# This gives the kernel more memory for TCP,
# which you need with many (100k+) open socket connections
net.ipv4.tcp_mem = 50576 64768 98152
net.core.netdev_max_backlog = 2500
```
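To apply the new values without rebooting, reload them with:

sysctl -p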

Benchmarking

Two different tests were executed:

  1. Load testing the proxies without SSL. This purely tests the performance of WebSocket proxying.
  2. Load testing the proxies with SSL. Nobody should be running unsecured WebSockets, as they have really bad connectivity in browsers, but this adds the overhead of SSL termination to the proxy.

In addition to these two tests, we also tested with different numbers of connections.

To keep the results equal across runs: before each test the WebSocketServer is restarted and the Proxy re-initiated. Thor then hammers the Proxy server with x connections at a concurrency of 100. For each established connection one single UTF-8 message is sent and received. After the message is received the connection is closed.
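A 10k connection run then boils down to an invocation along these lines (flags as in Thor’s README; the proxy host here is a placeholder):

thor --amount 10000 --concurrent 100 --messages 1 ws://proxy.example.com:8080/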

Running

Stud

stud --config stud.conf

HAProxy

haproxy -f ./haproxy.cfg

Nginx

nginx

http-proxy

FLAVOR=http node http-proxy.js

WebSocketServer

FLAVOR=http node index.js

Results

The http-proxy lives up to its name: it proxies requests and does it quite fast. But as it’s built on top of Node.js it’s quite heavy on memory. A bare Node process already starts with 12MB of memory; for the 10K connections it took around 70MB. The overhead of the HTTP proxy was about 5 seconds compared to the control test. The HTTPS test was the slowest of all, but that was expected as Node.js sucks hairy monkey balls at SSL. Not to mention that SSL will grind your event loop to a halt when it’s under severe stress.

I had high hopes for Nginx and it did not let me down. It had a peak memory of 10MB and it was really fast. The first time I tested Nginx it had horrible performance; Node was even faster at SSL than Nginx. I felt like a failure, I genuinely sucked at configuring Nginx. But after some quick tips from friends it turned out to be a one line change in the config: I had the wrong ciphers configured. After some quick tweaking and a confirmation using openssl s_client -connect server:port it was all good and used RC4 by default, which is really fast.

Up next was HAProxy. It has the same performance profile as Nginx but is lighter on memory: it only required 7MB. The biggest difference was when we tested with HTTPS. It was really slow and nowhere near the performance of Nginx. Hopefully this will be resolved, as it’s a development branch we are testing. When we put stud in front of it, it gets closer to the performance of Nginx.

Conclusions

http-proxy is a great, flexible proxy, really easy to extend and build upon. If you deploy it in production I advise running stud in front of it to take care of the SSL offloading.

nginx and haproxy were really close; the difference is almost not significant enough to say that one is faster or better than the other. But if you look at it from an operations standpoint, it’s easier to deploy and manage a single nginx server instead of stud and haproxy.

HTTP

| Proxy      | Connections | Handshaken (mean) | Latency (mean) | Total    |
|------------|-------------|-------------------|----------------|----------|
| http-proxy | 10k         | 293 ms            | 44 ms          | 30168 ms |
| nginx      | 10k         | 252 ms            | 16 ms          | 28433 ms |
| haproxy    | 10k         | 209 ms            | 18 ms          | 26974 ms |
| control    | 10k         | 189 ms            | 16 ms          | 25310 ms |

HTTPS

| Proxy          | Connections | Handshaken (mean) | Latency (mean) | Total     |
|----------------|-------------|-------------------|----------------|-----------|
| http-proxy     | 10k         | 679 ms            | 62 ms          | 68670 ms  |
| nginx          | 10k         | 470 ms            | 30 ms          | 50180 ms  |
| haproxy        | 10k         | 968 ms            | 55 ms          | 102037 ms |
| haproxy + stud | 10k         | 492 ms            | 42 ms          | 52403 ms  |
| control        | 10k         | 703 ms            | 65 ms          | 71500 ms  |

All test results are available at:

https://github.com/observing/balancerbattle/tree/master/results

Sep 04

Lessons learned from entering Node Knock Out

As the second NKO reaches its end, I would like to share my experiences and the lessons I learned from entering the contest in 2010 and 2011. I hope this helps some people in the future avoid, or at least learn from, the countless mistakes I have made.

Sleep is for the weak

Producing code for 48h is a really long run. It will feel as if your eyeballs are shrinking to little peas, but you have a project to finish. So unless you are 100% positive that you have already done enough work to earn the privilege of sleep, I wouldn’t close my eyes at all. Yes, 48h is long, but if you are 1 hour short at the end of the competition because you felt like you needed a power nap of 4 hours, I will be more than happy to tell you: I told you so. Don’t forget that only ~50% of all the teams actually make it to the judging/voting round.

Select a database that matches the needs of your project

There are tons of SQL and NoSQL databases available, but not every database is suitable for the needs of your project. Do you have a lot of read or write requests, do you want to maintain a history of documents, or should it just be really fucking fast? Deciding this up front allows you to fully focus on coding up your project during the competition.

Choose a database that can be hosted by a service

In 2010 I was ambitious enough to install MongoDB locally on my server. The installation was a pain: I was used to a different Linux distribution and was getting odd compilation errors all over the place. Once I finally managed to get it running smoothly, it completely crashed and killed the database file.

I lost a total of 6 hours because I was too stubborn to use a hosted service. This year I decided to use a hosted service right away, because I could really use those 6 hours.

Use your production database as development database

This is probably a more personal thing, but I noticed that when I deployed to production my site stopped working properly. I had forgotten to create indexes on the production database; I did add these on the development database. This cost too much time to debug. In a normal development process you would always use a snapshot of the database, but you’ve only got 48 hours here, no time to debug these kinds of silly issues.

Abstract your database connections if needed

Both years I decided to go with MongoDB because it suited my needs best, but the amount of code that you actually need to write to fetch data from the database is immense. You could use an ORM like Mongoose instead, or create your own abstraction.

// `mongo` is the node-mongodb-native driver and `conf` the project's own
// config module; both requires are implied by the original snippet (assumed)
var mongo = require('mongodb')
  , conf = require('./config');

/**
 * Fetches a working mongodb connection.
 *
 * @param {String} collection Collection name
 * @param {Function} fn Callback
 * @api private
 */

exports.allocate = function (collection, fn) {
  /**
   * Fetches the correct collection.
   *
   * @param {Error} err Error from the connection attempt
   * @param {MongoDB} db Database
   * @api private
   */

  function collect (err, db) {
    db.collection(collection, function collected (err, col) {
      if (err) return fn.apply(fn, arguments);

      // invoke the callback with the collection bound as `this`
      fn.call(col, err, db);
    });
  }

  // fast case
  var stream = exports.allocate.stream;
  if (stream) return collect.call(stream, null, stream);

  /**
   * Called once the database opens.
   *
   * @param {Error} err Connection error
   * @param {MongoDB} stream Mongodb connection stream
   * @api private
   */

  function open (err, stream) {
    if (err) return fn.call(fn, err, null);

    exports.allocate.stream = stream;

    // handle uncaught errors
    stream.on('error', function uncaughtError (err) {
      exports.allocate.stream.close();
      exports.allocate.stream = null;

      console.error('(mongodb) uncaught error', err.message, err.stack);
    });

    // start the whole collect thingy
    collect.apply(stream, arguments);
  }

  mongo.connect(conf.db, {
      auto_reconnect: true // auto reconnect broken connections
    , poolSize: 10         // connection pool size
  }, open);
};

I use that small allocate function to clean up the way I needed to connect with the database. This way I could mock up database driven functions really fast without creating an async mustache }}}}});}}});.

// `objectid` is the driver's ObjectID constructor (assumed to be in scope
// in the original; required here so the snippet is self-contained)
var objectid = require('mongodb').ObjectID;

exports.exists = function (observer, fn) {
  exports.allocate('account', function (err, db) {
    if (err) return fn.call(exports, err);

    this.findOne({ _id: new objectid(observer)}, fn.bind(exports));
  });
};

Research the npm modules that you want to use and how to make them work

When I had the idea for my application I started thinking about the technologies I needed to make it work. Using Socket.IO was a no brainer for me because it’s simply awesome and I’ve spent a lot of time patching bugs and making it super stable.

Socket.IO provides a great range of features that make it really easy to set up a real time application. To my surprise I was actually able to leverage a lot of this functionality: everything that I had to create last year, like authorization and rooms, is now baked into the core. So I suggest you start sniffing around in the modules you are about to use, to find hidden gems.

Pick your hosting before you get started

The hosting providers are usually announced a few days before the contest, so you have enough time to look around and find out what they offer for your application. Pick a host that suits the needs of your project. Some hosts require more installation work, but they will also provide a greater range of flexibility. If you want to build a real-time application, make sure your host supports Web Sockets. Yes, there are still hosts (Heroku) that don’t support Web Sockets, while others don’t allow you to deploy code on anything other than port 80. So pick wisely; you might be able to swap during the contest, but don’t place your bets on it.

When in doubt, choose the host you know

If you have no idea which one to pick, go with the host you are most comfortable with. For me this was Joyent for both years; SmartOS is a really well designed operating system with DTrace integration. Plus they have cloud analytics, which allows you to put DTrace probes in your server and see its pain points in real time. Which is pretty sweet.

Expect hosting and deployment failures

Even if you already have experience with the hosting provider, it doesn’t guarantee anything. I can’t tell you if you will be getting big or minor failures, but they are going to happen, so you’d better prepare for them.

The first time I entered I forgot that ISPs usually do network maintenance and updates during the night, because not a lot of people are surfing the web then. I was disconnected from the Internet for one hour. Normally this doesn’t really matter, but this was right before the end of the competition. I didn’t have the ability to tether my iPhone’s Internet connection, so I could only continue working offline and hope that my Internet connection came back eventually.

This year I made sure my iPhone was jailbroken and able to tether its 3G connection. I experienced about 3 Internet drops, but none occurred during the competition. Still, it’s nice to know you actually have a backup.

Have a back-up idea

If you are going to enter as a team, make sure you have a back-up idea for when your team members get ill or find other excuses for not entering the competition with you. If you don’t have a back-up idea, drop some less important features so that your project is still doable in the 48h time frame.

Haters gonna hate

It doesn’t matter how good your idea or execution is, people are going to dislike your entry. You should just ignore it; haters gonna hate.

Have fun

Remember it’s just a contest; have fun and enjoy your project. You have done good ;).

Last but not least, voting

While getting contestant and judge votes is pretty awesome, don’t forget that the public vote also counts for 20% of your total score.

Speaking of voting: if you haven’t done so already, voting for my solo application is open until 6 Sept. 2011.

I will try to keep this list as up to date as possible and add missing tips if I remember them.

Why I want to win so badly

It’s only been one weekend since the Node Knock Out finished. One might expect that coding a full application in 48 hours using a complete Node.js stack is the hard part, but boy.. was I wrong. The hardest part is actually winning the Node Knock Out.

Entering Node Knockout

This is the second time I have entered the Node Knock Out competition, an amazing contest that allows you to showcase your skills and creativity by making the most spectacular web application you can build in 48 hours. Last year I entered with a heat mapping web service and became second in the solo category. I lost by just 0.3 points to the first solo contestant, @mape, and boy, that sucks really hard, being so close to a first place. So I made up my mind back then that I would enter again.

http://dl.dropbox.com/u/1381492/nko/exampleclick.png

Preparation

I still remembered from 2010 that games seemed to be a hot topic (the winners for overall and solo were both games). But I really do not like to waste my time building games; I would rather spend my time building something that is useful for me and hopefully for others.

So I decided to create a real time user tracking service that allows you to see the browser interactions of your users replayed on your screen: if they click on a button that animates, you will see a ghost mouse click that exact button and trigger that exact animation. It replays all common user interactions such as typing, scrolling, mouse movements, clicks, navigation and JavaScript errors.

I wanted to record all this information so you can easily see how your users are interacting with your site and decide whether changes to the layout need to be made. When a JavaScript error happens during a session, you would actually know about it and see the steps the user made, so you can easily reproduce the bug and fix it.

http://dl.dropbox.com/u/1381492/nko/exampleerror.png

About 2 weeks before the start of NKO (Node Knock Out) I started with the preparations for the project: I created mind maps, started sketching the design, and researched which modules I was going to need, how my database should be structured, what client side modules I wanted to use and which browsers I was going to support.

One of the biggest lessons I learned from last year is to go directly with a hosted database service and not mess with installations during the competition. I lost about 6 hours on this last year because MongoDB kept failing.

So I had everything planned and created one big to-do list in the most optimal order I could think of, so I wouldn’t be wasting time on features or design aspects that wouldn’t be needed.

Getting started

I had just got home from an 8 hour drive (back from vacation) and after a quick power nap I was ready to get started. My first step was the database binding layers and a small CMS, so I could update and add content without having to update my views (this is something I do for all the applications I build, so my git project isn’t cluttered with pointless typo commits). After a few hours or so this was done and I was quite happy about it. Planning this in advance definitely helped out.

As the hours passed, more and more messages started streaming into the IRC channel saying that Socket.IO was broken on no.de and that nobody could get Web Sockets to work. I decided to just continue with my work in the hope that Joyent would fix it.

After a day or so I had it running stable enough to deploy on my no.de host. It worked, but as others had stated, Web Sockets were horribly broken. After some debugging I found out that the upgrade event was never fired on the servers. After a quick chat with the Joyent folks this was fixed right away.

Finished

After staying awake for more than 72 hours, it was time to deploy the app for the last time and keep my fingers crossed that it would survive for the rest of the voting. And luckily it’s still up (it had some ups and downs thanks to broken Joyent proxies, but it’s mostly up now).

After a day or so the votes started streaming in. I was overwhelmed by the positive reactions of the judges and contestants.

I actually want to win, really badly

When I started in the competition I entered just because I really liked my idea and would love to have another shot at the solo category. But then a lot of users told me they loved the service and that it’s pure gold, and I realized I could actually try to make a small start-up out of it.

Winning would allow me to use the rewards from the competition to continue development, finish the application the way I envisioned it, and make it a really cheap but powerful tool for usability testers and web site owners.

But winning ain’t easy

That is something I realize now. I have gotten amazing judge and contestant votes, but I was nowhere near the first spot, held by another solo contestant. So I wondered why. I noticed that almost all my scores were higher than his project’s, except popularity.

And that is why I need you

Popularity can only be increased using the vote button. So if you have a Facebook account, please go to http://observer.no.de/vote and press the vote button. Alternatively you can also press the vote button here in the post (if it shows up).

It does not post anything to your Facebook wall, it just uses Facebook to record the votes.

If you want to know more about my application and see how it works, register an account at http://observer.no.de/ or check out the demo at http://observer.no.de/demo

May 27

This week in Node

Welcome to another epic episode of This week in Node, where I blog about the latest news and updates in the Node.js community.

Node 0.4.8

A new stable Node has been released, just one day after I posted This week in Node #20. This release brings more performance for encrypted SSL connections, achieved by disabling compression in OpenSSL. This improves memory consumption and speed. If you want to apply compression, it should now be done in “userland”.

Full list of fixes

Download the latest version from the Node.js site http://nodejs.org/dist/node-v0.4.8.tar.gz

New Core team members

Just announced today: 2 new core team members (Bert and Felix) have been added to Node. Bert Belder (@piscisaureus) has contributed countless bug fixes for Windows, not only fixing bugs but also implementing new features on Windows. He has been working a lot on libuv over the last couple of weeks / months, and it’s being integrated into Node as we speak.

Felix (@felixge) has been working on and patching a lot of different I/O streams, which are one of the most important parts of Node.js. But not only did he work on streams, he also fixed and worked on a lot of other important parts of Node.js.

So congratulations to the both of them, and thanks for all the bugs that you guys have fixed and will fix in the future!

May 24

Theoretical Node.js Real time Performance

Benchmarking Node.js seems to be a hot topic lately; not a week passes without a benchmark being posted on Hacker News. All these benchmarks highlight different parts and usages of Node.js. On some of them Node.js performs well and on others it sucks, badly. Both types of benchmarks are useful, as they show other users what Node.js is capable of. But they also show us, developers, what needs to be worked on and what could potentially become a bottleneck in our application stack. Either way, it’s a win/win situation.

Real time frameworks

When people start building real time applications they always want the fastest, best and most bad-ass system available, as it needs to handle a large amount of concurrently connected users. There isn’t much known about the performance of Node.js when handling real time connections. You can create a real time application in any language and stack you want, but developers usually prefer libraries and frameworks that do the heavy lifting for them. The frameworks that they usually consider are Twisted or Tornado (evented servers written in Python), Netty (the Java powered asynchronous event framework) or Node.js using Socket.IO.

Because I spend a lot of my free time working with and on Socket.IO, I started wondering what the limits would be for a Node.js based real time system. Can it handle 20k, 50k, 100k or even 500k connected users? What is the limit, and when will it break?

Felix Geisendörfer recently tweeted that Node.js can allocate 1.6 million concurrent HTTP server requests through a single socket before running out of memory, resulting in a 1.53kB memory allocation per request. This was done by flushing a huge buffer of GET /\n\n over one HTTP connection. 1.6 million is quite impressive, but then again Node.js is known for its first class, high quality HTTP parser (thanks Ry!).

Measuring connected sockets

It’s nice to know that your server can handle 1.6 million concurrent requests over one single connection, but the possibility of this occurring in real life is 0.0%. A more realistic test would be to measure how many concurrently connected sockets a single Node.js process can handle, and how much memory one single connection allocates. Creating a benchmark that generates 1.6 million concurrently connected sockets would require me to buy a lot of IP addresses to distribute the requests over, as we are tied to ~64k ephemeral ports per IP address. So instead of generating the actual load of 1.6 million sockets, I decided to calculate the theoretical performance of a single Node process.

I started coding up a small script that allows me to create a bunch of real time connections:

/**
 * Dependencies
 */
var http = require('http')
  , agent = http.getAgent('127.0.0.1', 8080);

// Agents default to a maximum of 5 sockets, we need more
agent.maxSockets = Infinity;

// keep opening connections until we have enough
for(var i = 0; i < 4000; i++){
  http.get({
    agent: agent
  , path: '/'
  , port: 8080
  , host: '127.0.0.1'
  });

  console.log("Client connected: " + i);
}

While this doesn’t follow any best practices for a proper HTTP benchmark, it’s enough for this particular test, as we are not testing the processing performance of Node.js here but its HTTP / TCP socket limits. Now that the simple benchmark script was ready, I started building the script that handles the incoming requests. To make it more realistic I made sure that both the request and the response object were stored in the Node.js process’s memory, so we could communicate with the connected sockets like you would normally do with a Comet / real time server. I decided to test two different storage backends for the request objects. First, Array storage:

/**
 * Dependencies
 */
var http = require('http')
  , host = 'localhost'
  , port = '8080';

// the stuff
var connections = []
  , i = 0;

// process title so I can find it back more easily
process.title = "connection";
var server = http.createServer(function(req, res){
  connections[connections.length] = {req: req, res:res};

  res.writeHead(200);
  res.write('ping');

  console.log('established connections: ' + ++i);
});

/**
 * Send messages to the stored connections
 *
 * @api public
 */
function message(){

  var i = connections.length
    , connection;

  while(i--){
    connection = connections[i];
    connection.res.write('wut');
    console.log('pew');
  }
};

// spam each minute
setInterval(function(){
  message()
}, 1000 * 60 );

server.listen(port);
console.log('listening on 8080');

And another server instance that uses an Object to store the requests and responses:

/**
 * Dependencies
 */
var http = require('http')
  , host = 'localhost'
  , port = '8080';

// the stuff
var connections = {}
  , i = 0;

// process title so I can find it back more easily
process.title = "connection";
var server = http.createServer(function(req, res){
  connections[Object.keys(connections).length] = {req: req, res:res};

  res.writeHead(200);
  res.write('ping');

  console.log('established connections: ' + ++i);
});

/**
 * Send messages to the stored connections
 *
 * @api public
 */
function message(){

  var arr = Object.keys(connections)
    , i = arr.length
    , connection;

  while(i--){
    connection = connections[arr[i]];
    connection.res.write('wut');
    console.log('pew');
  }
};

// spam each minute
setInterval(function(){
  message();
}, 1000 * 60 );

server.listen(port);
console.log('listening on 8080');

Running the benchmark

I started testing the Array based storage first, as V8 is known for its high performing arrays. I started up the Node server, waited a while until the server was idle and fired off the simulation script. I got a peak memory of 43.5MB while it was connecting all the sockets; after a few seconds (the garbage collector kicked in?) the memory dropped back to 28.7MB RSS. Messaging the server gave it a small spike of memory, but that was expected. I re-ran the test 10x to confirm the findings, and they produced the following averages:

Array:

Up next was the Object based store. Surprisingly, it came really close to the Array based storage. It used a bit more memory, but for some use cases it would be worth it to store the responses/requests in an object, because you can do easy key/client lookups. I re-ran the test 10x to confirm the findings, and they produced the following averages:

Object:

OMG what does it mean??

Now that we have these stats we can calculate the theoretical limits of one single process. We know that a single process is limited to a V8 heap of 1.7GB; when you get near that limit your process already starts to die. We had a start-up RSS of 12.6MB and the server’s memory stabilized at 28.7MB, so for the 4000 connections it spent 16.1MB; that’s 16.1MB / 4000 = 4.025kB for each connection. Filled up with stabilized connections, the server could reach a total of 1.7GB / 4.025kB ≈ 422,000 connections. These findings come really close to the 512,000 concurrent Web Socket connections on Groovy++, which took 2.5GB of Java heap according to that article.
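A quick back of the envelope check of that arithmetic (decimal units, numbers taken from the test above):

var startupMB = 12.6;  // RSS right after start up
var stableMB  = 28.7;  // RSS once all sockets were connected
var conns     = 4000;  // connections in the test

var perConnKB = (stableMB - startupMB) * 1000 / conns; // 4.025 kB
var heapKB    = 1.7 * 1000 * 1000;                     // V8 heap limit

console.log(Math.round(heapKB / perConnKB));           // ~422360 connections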

I’m impressed. Node.js is the awesome sauce that you should be eating when you want to take a bite out of the real time web.

This article is translated to:

May 20

This week in Node

Welcome to another episode of This week in Node, where I write about the latest updates in the Node.js repository and exciting updates that happen in user land.

Socket.IO 0.6.18

This week we released Socket.IO 0.6.18, which will hopefully be our last maintenance release before the much anticipated 0.7 release. While I could dedicate a whole new post to the changes made in 0.6.18, I will only touch the surface of the most noticeable changes.

JSDoc

There is no such thing as too much documentation, and with Socket.IO this is definitely the case. The main documentation for Socket.IO, which can be found on the website and in the GitHub README.md, is sufficient for most users to get started, but it doesn’t cover the details of how it actually works. Some users stated they wanted more detailed API documentation, so we started annotating the source code. We used the JSDoc syntax, which is based on the more widely known JavaDoc syntax. This allows us to easily generate API documentation and annotated source code on the fly using a JSDoc compatible parser like dox. We haven’t pushed out any documentation yet, but you can generate it yourself with the following command:

dox --title "socket.io" socket.io.js > api.html

Reconnection support

We already announced this in the 0.6.17 release, but unfortunately we noticed that we didn’t update the client library inside socket.io-node. This time we are 100% sure that it’s available! Socket.IO is configured to automatically re-establish the connection with the server when the current connection is dropped by a transport. To prevent creating a DOS attack on the server we have implemented an exponential back off algorithm: basically it multiplies the reconnection timeout by 2 every time it attempts to reconnect to the server. The reconnection is attempted with the same transport that was used before the connection got dropped. When the reconnection reaches the maximum allowed number of attempts (set to 10 by default) it will do one last attempt and try out all enabled transports.
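In simplified form the back off logic boils down to something like this (a sketch, not the actual Socket.IO source; connect and tryAllTransports stand in for the real internals, and the initial delay is illustrative):

var delay = 500;       // initial reconnection timeout in ms (illustrative)
var attempts = 0;
var maxAttempts = 10;  // Socket.IO's default

function reconnect () {
  if (++attempts > maxAttempts) return tryAllTransports(); // last resort

  setTimeout(connect, delay);
  delay *= 2; // exponential back off: 500, 1000, 2000, ...
}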

Giving feedback to the user is always a vital part of building a real time application, as users expect it to just work. We have implemented a couple of new events to help you notify your users of the reconnection process. When the reconnection has started we emit the reconnecting event and supply it with the reconnection duration for the next reconnection attempt.

// construct a socket.io instance
var socket = new io.Socket();
socket.connect();

// listen for the reconnecting event
socket.on('reconnecting', function(type, attempt){
  console.log('Attempting to reconnect to the server. This is attempt #' + attempt);
});

To know when the reconnection is successful you can listen for the reconnect event. This event receives the transport type that was used to connect and how many attempts it took to reconnect successfully.

// handle successful reconnects
socket.on('reconnect', function(type, attempts){
  console.log('It only took ' + attempts + ' attempts to reconnect using ' + type);
});

When the reconnection fails we emit the reconnect_failed event. We could keep attempting to reconnect to the server, but at this point there must be something seriously wrong.

// handle failures
socket.on('reconnect_failed', function(){
  console.error('Server fail');
  window.location.reload(); // use any custom logic here.
});

Annoying loading spinners

Webkit based browsers suffered from an annoying issue: if you create an asynchronous long polling XHR request before the document’s resources are fully loaded, the browser keeps triggering its loading indicators. We solved this by connecting after the onload event has fired. This is the same trick that was applied to iPhone and Android devices, but it’s now extended to all Webkit based browsers. This will probably make a lot of users happy.

Flashsocket updated to the latest version

The flashsocket fallback has been updated to the latest build. This removes the FLABridge.js dependency, as the ExternalInterface is being used instead. This is great news, as FLABridge.js originates from the stone age, leaks memory like a basket and is full of code that follows the worst practices on the Internet. Because of this important change inside the flashsocket, we decided to test it under Opera again to see if it’s more stable than the previous builds. Luckily this was the case, and we have enabled the flashsocket for Opera! The upgrade also solves a lot of issues that users were having with cross port connections.

Removal of introduced globals

With commit 0a1a93b4846b0 a lot of variables got exposed to the global scope. We have identified this issue and have taken measures to prevent it from happening again, starting with the introduction of a test suite that I have begun to prototype and which will hopefully be ready for the 0.7 release, so stay tuned for more information about that.

For more information see the announcement on the Socket.IO google group.

node-http-proxy 5.3

More exciting news for those of us who build real time applications: with the 5.3 update of Nodejitsu’s node-http-proxy we can finally proxy Web Socket requests! This has been a plague ever since Web Sockets were added to browsers; most HTTP proxies do not support the HTTP/1.1 Upgrade handshake. HAProxy was the only proxy available that could successfully proxy Web Sockets, but only in TCP mode, making HTTP header inspection and the like impossible. The best thing about node-http-proxy is that it only requires one line of code to get started with proxying your requests:

// proxy all requests from port 80 to port 8080, including Web Socket requests.
require('node-http-proxy').createServer(8080, 'localhost').listen(80);

So thank the Nodejitsu folks for the hard work by downloading the latest code now.

Buffer floats and doubles

Commit e505a1215c5 lands support for reading and writing floats and doubles on the Buffer object. These changes will make it a lot easier to handle binary data inside Node.js and JavaScript. JavaScript has already started implementing typed arrays, but Buffer objects are much more useful for Node at this point, as they add a lot less to the total heap size of V8 and therefore allow you a lot more memory than the 1.7GB restriction on 64 bit machines (this is a V8 restriction). Buffer objects are also sent to the sockets much faster, because it’s easier for Node to get a pointer to the raw data.

UPDATE: Thanks to a correction from @mraleph: Node’s Buffers are actually implemented on top of the External Array API of V8. They have a different syntax, but they are basically the same thing.
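The new methods come in little and big endian flavors. A quick illustration (values made up):

var buf = new Buffer(12);         // 4 bytes for the float, 8 for the double

buf.writeFloatLE(0.5, 0);         // little endian float at offset 0
buf.writeDoubleLE(Math.PI, 4);    // little endian double at offset 4

console.log(buf.readFloatLE(0));  // 0.5
console.log(buf.readDoubleLE(4)); // 3.141592653589793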

Documentation

There is still a lot of code in the Node.js core that isn’t documented, but thankfully this is being worked on. Commit 68d840b47d1 adds documentation for the Readline module. This module allows you to read user input from stdin, which will allow even more command line based applications to be built on top of Node. The DNS module’s documentation received information on the resolveNs and resolveCname methods with commit 56aa2fd4c3d.
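For those who haven’t played with Readline yet, usage is along these lines; a minimal sketch (the createInterface signature has changed between Node versions, this uses the object form):

var readline = require('readline');

var rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout
});

rl.question('Name your favorite Node module: ', function (answer) {
  console.log('Good choice: ' + answer);
  rl.close();
});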

May 13

This week in Node

This week in Node is a weekly blog post that focuses on the latest changes and updates in the Node.js repository and in npm modules.

Forking and IPC

Commits 9e26dab150e and 337c48db5fe landed a new method in the child_process module. The fork method allows you to spawn a new Node.js process and pass messages between your current Node.js process and the newly created process over an Inter Process Communication (IPC) channel.

So why is this an important change? This is the first step towards proper, and hopefully first class, support for multi-process Node.js. When you are doing a lot of CPU heavy processing it can take a while before your function is done and ready to return to the event loop again. Ideally you would use a job server such as Gearman or Resque for heavy processing, but that is not always an option. While your application is busy processing data, it limits the number of concurrent requests your application can handle; but you can now offload all the processing to a different Node.js process and be notified once the work is done.

The API is quite simple and straightforward:

var cp = require('child_process')
  , n = cp.fork(__dirname + '/process.js');

n.on('message', function(m) {
  console.log('PARENT got message:', m);
});

n.send({ hello: 'world' });

And inside process.js:

process.on('message', function(m) {
  // illustrative function ;) 
  hardcoreProcessing(m, function(err, res){
    process.send(res);
  })
});

Wait, what? No Web Worker API

That was the first thing that came to my mind: why don’t we just use the Web Worker API for this? It was designed for stuff like this and it’s already known to some developers; even the Node.js website states:

But what about multiple-processor concurrency? Aren’t threads necessary to scale programs to multi-core computers? Processes are necessary to scale to multi-core computers, not memory-sharing threads. The fundamentals of scalable systems are fast networking and non-blocking design—the rest is message passing. In future versions, Node will be able to fork new processes (using the Web Workers API ) which fits well into the current design.

I don’t know how far in the future we currently are, but with the current release it’s actually quite easy to emulate with a few simple lines of code:

if (process.env.NODE_CHANNEL_FD){
  (function(){
    var onmessage;

    // create onmessage api
    Object.defineProperty(GLOBAL, 'onmessage', {
      get: function(){ return onmessage }
    , set: function(fn){
        onmessage = fn;
        process.on('message', function(msg){
          onmessage({data:msg});
        });
      }
    });

    // all we need to do is proxy the process.send
    Object.defineProperty(GLOBAL, 'postMessage', {
      get: function(){ return process.send }
    });
  })()
}

Adding those lines of code allows you to use onmessage and postMessage as specified in the Web Worker API. But nonetheless, this forking and IPC between different Node instances is a more than welcome addition to the Node core. See the full .fork API here.
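With that shim loaded in the forked child, the child script itself reads like a browser Web Worker (a hypothetical worker.js):

// worker.js -- relies on the onmessage/postMessage shim shown above
onmessage = function (event) {
  // echo the message straight back to the parent process
  postMessage({ echo: event.data });
};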

Libuv

Integration of libuv (formerly known as liboio) into Node.js has started. Libuv provides high-concurrency, high-performance I/O on all operating systems. This will allow Node.js to be deployed on Windows without using the Cygwin or MinGW Unix emulation layers. Because these are emulation layers, the performance has always been really poor; libuv changes that by using native Windows APIs.

For more information about libuv, check out the GitHub repository and Ry’s NodeConf 2011 slides.

Hello! I’m from the Internet

Hello, I’m from the Internet and I would like to introduce myself. I’m a web designer, web developer, front end engineer, a Node.js user, father, JavaScript pirate and overall performance whore. I love to find out what makes the Internet tick and how to make it tick more efficiently. I started out as a web designer and educated myself to learn more about programming and especially JavaScript.

I started learning JavaScript by using the Adobe Spry framework. It didn’t provide any API sugar for JavaScript like jQuery does, but it made it dead simple to create powerful applications with a few lines of code. I was amazed by the fact that it was so easy to write powerful JavaScript applications and wanted to learn more. I started reading through the documentation, examples, articles and best practices. They provided me with a little satisfaction, but I wanted to know how it all worked, so I started digging deeper and studying the source code of the framework. It taught me a really valuable lesson: you can read all the documentation that you want, but that doesn’t tell you how a function actually works. Not long after that I got contacted by Scott Fegette from the Adobe Dreamweaver product team; they had noticed my activity and passion for Spry and asked me to become an Adobe Community Expert for Spry.

Ever since, it feels like I have been on a roller coaster. I have had the privilege to talk with amazing developers, build new widgets and web services for Spry, and get in contact with the main Spry framework developers, Donald Booth and Kin Blas. I went to Don’s session at Adobe MAX 2008, and in 2009 I got the opportunity to speak at Adobe MAX 2009 in LA.

Spry opened my eyes and showed me the power and beauty of JavaScript, but when Ryan Dahl announced Node.js at JSConf it opened my eyes even further. After the somewhat embarrassing failure of Aptana Jaxer to bring JavaScript and the DOM to the server, it seemed like Node.js really had potential. In 2010 I entered the Node Knock Out (NKO), a 48 hour coding competition for Node.js based applications. During these 48 hours I built Speedo, a real time usability SaaS which allowed you to create heat maps of your website users’ activity while they were interacting with your interface. During the judging I received an overall score of 8.44 and became second in the solo category.

I now spend my days writing and contributing to Node.js modules such as Node-Memcached, Socket.IO and much, much more.

So hello! I’m Arnout Kazemier, and I’ll try to share my knowledge, tips, tricks and rants with the rest of the world.