
Why I want to win so badly

It’s only been one weekend since Node Knockout finished. One might expect that coding a full application in 48 hours on a complete Node.js stack is the hard part, but boy.. was I wrong. The hardest part is actually winning Node Knockout.

Entering Node Knockout

This is the second time I have entered the Node Knockout competition. It’s an amazing contest that lets you showcase your skills and creativity by building the most spectacular web application you can in 48 hours. Last year I entered with a heat mapping web service and came second in the solo category. I lost by just 0.3 points to the first-placed solo contestant @mape, and boy, that sucks really hard, being so close to first place. So I made up my mind back then that I would enter again.

http://dl.dropbox.com/u/1381492/nko/exampleclick.png

Preparation

I still remembered from 2010 that games seemed to be a hot topic (the winners of both the overall and solo categories were games). But I really don’t like spending my time building games; I would rather spend it building something that is useful for me and hopefully for others.

So I decided to create a real time user tracking service that lets you watch your users’ browser interactions replayed on your own screen: if a user clicks a button that animates, you see a ghost mouse click that exact button and trigger that exact animation. It replays all common user interactions such as typing, scrolling, mouse movements, clicks, navigation and JavaScript errors.

I wanted to record all this information so you can easily see how your users are interacting with your site and whether changes to the layout are needed. When a JavaScript error happens during a session, you actually know about it and can see the steps the user took, so you can easily reproduce the bug and fix it.

http://dl.dropbox.com/u/1381492/nko/exampleerror.png

About two weeks before the start of NKO (Node Knockout) I started preparing the project: I created mind maps, started sketching the design, and researched which modules I was going to need, how my database should be structured, which client side modules I wanted to use and which browsers I was going to support.

One of the biggest lessons I learned from last year is to go directly with a hosted database service and not mess with installations during the competition. I lost about 6 hours on this last year because MongoDB kept failing.

So I had everything planned and created one big to-do list in the most optimal order I could think of, so I wouldn’t waste time on features or design aspects that wouldn’t be needed.

Getting started

I had just gotten home from an 8 hour drive (back from vacation) and after a quick power nap I was ready to get started. My first step was to build the database binding layers and implement a small CMS so I could update and add content without having to touch my views (this is something I do for all the applications I build so my git history isn’t cluttered with pointless typo commits). After a few hours this was done and I was quite happy about it. Planning this far in advance definitely helped.

As the hours passed, more and more messages started streaming into the IRC channel saying that Socket.IO was broken on no.de and that nobody could get Web Sockets to work. I decided to just continue with my work in the hope that Joyent would fix it.

After a day or so I had it running stable enough to deploy on my no.de host. It worked, but as others had stated, Web Sockets were horribly broken. After some debugging I found out that the upgrade event was never fired on the servers. After a quick chat with the Joyent folks this was fixed right away.

Finished

After staying awake for more than 72 hours, it was time to deploy the app for the last time and keep my fingers crossed that it would survive for the rest of the voting. Luckily it’s still up (it had some ups and downs thanks to broken Joyent proxies, but it’s mostly up now).

After a day or so the votes started streaming in.. I was overwhelmed by the positive reactions of the judges and contestants.

I actually want to win, really badly

When I entered the competition, I did so simply because I really liked my idea and wanted another shot at the solo category. But then so many users told me they loved the service, and that it was pure gold, that I realized I could actually try to turn it into a small start up.

Winning would allow me to use the rewards from the competition to continue development, finish the application the way I envisioned it, and make it a really cheap but powerful tool for usability testers and web site owners.

But winning ain’t easy

That is something I realize now. I have gotten amazing judge and contestant votes, but I was nowhere near the score of the first-placed solo contestant. So I wondered why. I noticed that almost all my scores were higher than his project’s, except popularity.

And that is why I need you

Popularity can only be increased using the vote button. So if you have a Facebook account, please go to http://observer.no.de/vote and press the vote button. Alternatively you can press the vote button here in the post (if it shows up).

It does not post anything to your Facebook wall, it just uses Facebook to record the votes.

If you want to know more about my application and see how it works, register an account at http://observer.no.de/ or check out the demo at http://observer.no.de/demo

Theoretical Node.js Real time Performance

Benchmarking Node.js seems to be a hot topic lately; not a week passes without a benchmark being posted on HackerNews. All these benchmarks highlight different parts and usages of Node.js. On some of them Node.js performs well and on others it sucks, badly. Both types of benchmarks are useful, as they show other users what Node.js is capable of. But they also show us, developers, what needs to be worked on and what could potentially become a bottleneck in our application stack. Either way, it’s a win/win situation.

Real time frameworks

When people start building real time applications they always want the fastest, best and most bad-ass system available, as it needs to handle a large number of concurrently connected users. Not much is known about the performance of Node.js when handling real time connections. You can create a real time application in any language and stack you want, but developers usually prefer libraries and frameworks that do the heavy lifting for them. The frameworks they usually consider are Twisted or Tornado (evented servers written in Python), Netty (the Java powered asynchronous event framework) or Node.js with Socket.IO.

Because I spend a lot of my free time working with and on Socket.IO, I started wondering what the limits of a Node.js based real time system would be. Can it handle 20k, 50k, 100k or even 500k connected users? What is the limit, and when will it break?

Felix Geisendörfer recently tweeted that Node.js can allocate 1.6 million concurrent HTTP server requests through a single socket before running out of memory, which comes down to a 1.53kb memory allocation per request. This was done by flushing a huge buffer of GET /\n\n over one HTTP connection. 1.6 million is quite impressive, but then again Node.js is known for its first class and high quality HTTP parser (thanks Ry!).

Measuring connected sockets

It’s nice to know that your server can handle 1.6 million concurrent requests over one single connection, but the probability of this occurring in real life is 0.0%. A more realistic test is to measure how many concurrently connected sockets a single Node.js process can handle and how much memory one single connection allocates. Creating a benchmark that generates 1.6 million concurrent connections would require me to buy a lot of IP addresses to distribute the requests over, as we are tied to 64k ephemeral ports per IP address. So instead of generating the actual load of 1.6 million sockets I decided to calculate the theoretical performance of a single Node process.
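As a quick back-of-the-envelope sketch (not part of the benchmark itself), the ephemeral port limit translates into a concrete number of source IPs:

```javascript
// roughly 64k ephemeral ports are available per source IP address, so
// generating 1.6 million concurrent sockets would need about this many IPs:
var portsPerIp = 64000;
var sockets = 1600000;
var ipsNeeded = Math.ceil(sockets / portsPerIp);

console.log(ipsNeeded + ' IP addresses'); // 25 IP addresses
```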

I started by coding up a small script that creates a bunch of real time connections:

/**
 * Dependencies
 */
var http = require('http')
  , agent = http.getAgent('127.0.0.1', 8080);

// Agents default to a maximum of 5 sockets, we need more
agent.maxSockets = Infinity;

// keep creating connections until we have enough
for(var i = 0; i < 4000; i++){
  http.get({
    agent: agent
  , path: '/'
  , port: 8080
  , host: '127.0.0.1'
  });

  console.log("Client connected: " + i);
}

While this doesn’t follow any best practices for a proper HTTP benchmark, it’s enough for this particular test: we are not testing the processing performance of Node.js here but its HTTP / TCP socket limits. With the simple benchmark script ready, I started building the script that handles the incoming requests. To make it more realistic I made sure that both the request and the response object were stored in the Node.js process memory, so we could communicate with the connected sockets like you normally would with a Comet / real time server. I decided to test two different storage backends for the request objects. First, Array storage:

/**
 * Dependencies
 */
var http = require('http')
  , host = 'localhost'
  , port = '8080';

// the stuff
var connections = []
  , i = 0;

// process title so I can find it back more easily
process.title = "connection";
var server = http.createServer(function(req, res){
  connections[connections.length] = {req: req, res:res};

  res.writeHead(200);
  res.write('ping');

  console.log('established connections: ' + ++i);
});

/**
 * Send messages to the stored connections
 *
 * @api public
 */
function message(){

  var i = connections.length
    , connection;

  while(i--){
    connection = connections[i];
    connection.res.write('wut');
    console.log('pew');
  }
};

// spam each minute
setInterval(function(){
  message()
}, 1000 * 60 );

server.listen(port);
console.log('listening on 8080');

And another server instance that uses an Object to store the requests and responses:

/**
 * Dependencies
 */
var http = require('http')
  , host = 'localhost'
  , port = '8080';

// the stuff
var connections = {}
  , i = 0;

// process title so I can find it back more easily
process.title = "connection";
var server = http.createServer(function(req, res){
  connections[Object.keys(connections).length] = {req: req, res:res};

  res.writeHead(200);
  res.write('ping');

  console.log('established connections: ' + ++i);
});

/**
 * Send messages to the stored connections
 *
 * @api public
 */
function message(){

  var arr = Object.keys(connections)
    , i = arr.length
    , connection;

  while(i--){
    connection = connections[arr[i]];
    connection.res.write('wut');
    console.log('pew');
  }
};

// spam each minute
setInterval(function(){
  message();
}, 1000 * 60 );

server.listen(port);
console.log('listening on 8080');

Running the benchmark

So I started testing the Array based storage first, as V8 is known for its high performing arrays. I started up the Node server, waited a while until the server was idle and fired off the simulation script. I saw a peak memory of 43.5mb while it was connecting all the sockets; after a few seconds (the garbage collector kicked in?) the memory dropped back to 28.7mb RSS. Messaging the server gave it a small spike of memory, but that was expected. I re-ran the test 10x to confirm the findings, which produced the following averages:

Array:

  • Start: 12.6mb
  • Peak: 43.5mb
  • Idle: 28.7mb
  • messaging: 34.8mb / 11% cpu

Up next was the Object based store. Surprisingly, it came really close to the Array based storage. It used a bit more memory, but for some use cases it would be worth storing the responses/requests in an object because you can do easy key/client lookups. I re-ran the test 10x to confirm the findings, which produced the following averages:

Object:

  • Start: 12.6mb
  • Peak: 48.3mb
  • Idle: 28.7mb
  • messaging: 35mb / 10.9% cpu
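To illustrate why the Object store can still be worth the extra memory, here is a minimal sketch of the key/client lookup it enables (the session id and the fake response object are made up for the example):

```javascript
// with an Object store a single client can be addressed directly by key,
// instead of scanning the whole Array of connections
var connections = {};

function add(id, res) {
  connections[id] = { res: res };
}

function messageOne(id, msg) {
  var connection = connections[id];
  if (connection) connection.res.write(msg);
}

// fake response object standing in for a real http.ServerResponse
var out = [];
add('session-42', { write: function (msg) { out.push(msg); } });
messageOne('session-42', 'pew');

console.log(out); // [ 'pew' ]
```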

OMG what does it mean??

Now that we have these stats we can calculate the theoretical limits of one single process. We know that a single process is limited to a V8 heap of about 1.7gb; when you get near that limit your process already starts to die. We had a start up RSS memory of 12.6mb and the server’s memory stabilized at 28.7mb, so the 4000 connections cost 16.1mb; that’s 16.1mb / 4000 = 4.025kb for each connection. Filled up with stabilized connections, the server could reach a total of 1.7gb / 4.025kb = 422,000 connections. These findings come really close to the 512,000 concurrent Web Socket connections on groovy++, which took 2.5gb of Java heap according to that article.
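The arithmetic can be double checked in a few lines (same numbers as in the paragraph, using decimal units as above):

```javascript
// 4000 connections grew the RSS from 12.6mb to 28.7mb
var perConnectionKb = (28.7 - 12.6) * 1000 / 4000; // ~4.025kb per connection

// a 1.7gb V8 heap divided by the per-connection cost
var maxConnections = (1.7 * 1000 * 1000) / perConnectionKb; // ~422,000

console.log(perConnectionKb.toFixed(3)); // 4.025
console.log(Math.round(maxConnections)); // 422360
```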

I’m impressed. Node.js is the awesome sauce you should be eating when you want to take a bite out of the real time web.


This week in Node

Welcome to another episode of This week in Node, where I write about the latest updates in the Node.js repository and exciting updates that happen in user land.

Socket.IO 0.6.18

This week we released Socket.IO 0.6.18, which will hopefully be our last maintenance release before the much anticipated 0.7 release. While I could dedicate a whole post to the changes made in 0.6.18, here I will just touch on the most noticeable ones.

JSDocs

There is no such thing as too much documentation, and with Socket.IO this is definitely the case. The main documentation for Socket.IO, which can be found on the website and in Github’s README.md, is sufficient for most users to get started, but it doesn’t cover the details of how it actually works. Some users stated that they wanted more detailed API documentation, so we started annotating the source code. We used the JSDoc syntax, which is based on the more widely known JavaDoc syntax. This allows us to easily generate API documentation and annotated source code on the fly using a JSDoc compatible parser like dox. We haven’t pushed out any documentation yet, but you can generate it yourself with the following command:

dox --title "socket.io" socket.io.js > api.html

Reconnection support

We already announced this in the 0.6.17 release, but unfortunately we noticed that we hadn’t updated the client library inside socket.io-node. This time we are 100% sure that it’s available! Socket.IO is configured to automatically re-establish the connection with the server when the current connection is dropped by a transport. To prevent clients from creating a DOS attack on the server we have implemented an exponential back off algorithm: basically, it multiplies the reconnection timeout by 2 every time it attempts to reconnect to the server. The reconnection is attempted with the same transport that was used before the connection was dropped. When the reconnection reaches the maximum allowed number of attempts (10 by default) it will do one last attempt and try out all enabled transports.
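The doubling behaviour can be sketched in a few lines (illustrative names and a made-up initial timeout, not Socket.IO’s actual internals):

```javascript
// exponential back off: the reconnection timeout is multiplied by 2
// on every attempt, up to the maximum number of attempts
function backoffSchedule(initialTimeout, maxAttempts) {
  var delays = [];
  var timeout = initialTimeout;

  for (var i = 0; i < maxAttempts; i++) {
    delays.push(timeout);
    timeout *= 2;
  }

  return delays;
}

console.log(backoffSchedule(500, 5)); // [ 500, 1000, 2000, 4000, 8000 ]
```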

Giving feedback to the user is always a vital part of building real time applications, as users always expect them to work. We have implemented a couple of new events to help you notify your users about the reconnection process. When the reconnection has started we emit the reconnecting event and supply it with the reconnection duration for the next reconnection attempt.

// construct a socket.io instance
var socket = new io.Socket();
socket.connect();

// listen for the reconnecting event
socket.on('reconnecting', function(type, attempt){
  console.log('Attempting to reconnect to the server. This is attempt #' + attempt);
});

To know when the reconnection has succeeded you can listen for the reconnect event. This event receives the transport type that was used to reconnect and how many attempts it took to reconnect successfully.

// handle successful reconnects
socket.on('reconnect', function(type, attempts){
  console.log('It only took ' + attempts + ' attempts to reconnect using ' + type);
});

When the reconnection fails we emit the reconnect_failed event. We could keep attempting to reconnect to the server, but at this point there must be something seriously wrong.

// handle failures
socket.on('reconnect_failed', function(){
  console.error('Server fail');
  window.location.reload(); // use any custom logic here.
});

Annoying loading spinners

Webkit based browsers suffered from an annoying issue: if you create an asynchronous long polling XHR request before the document’s resources are fully loaded, the browser keeps showing its loading indicators. We solved this by connecting only after the onload event has fired. This is the same trick that was applied to iPhone and Android devices, but it’s now extended to all Webkit based browsers. This will probably make a lot of users happy.
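The trick boils down to deferring the connect until the load event has fired. A minimal sketch (the fake document object below stands in for the browser, purely for illustration):

```javascript
// defer the callback until the load event has fired, so long polling XHR
// requests don't keep the browser's loading indicators spinning
function connectWhenLoaded(doc, connect) {
  if (doc.readyState === 'complete') {
    connect(); // already loaded, connect right away
  } else {
    doc.addEventListener('load', connect); // wait for onload
  }
}

// simulate a page that has not finished loading yet
var handlers = {};
var fakeDoc = {
  readyState: 'loading',
  addEventListener: function (ev, fn) { handlers[ev] = fn; }
};

var connected = false;
connectWhenLoaded(fakeDoc, function () { connected = true; });
console.log(connected); // false: still waiting for onload

handlers.load();        // fire the fake load event
console.log(connected); // true: connect ran after load
```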

Flashsocket updated to the latest version

The flashsocket fallback has been updated to the latest build. This removes the FLABridge.js dependency, as the ExternalInterface is being used instead. This is great news, as FLABridge.js originates from the stone age, leaks memory like a sieve and is full of code that follows the worst practices on the Internet. Because of this important change inside the flashsocket we decided to test it under Opera again to see if it’s more stable than the previous builds. Luckily this was the case, and we have enabled the flashsocket for Opera! The upgrade also solves a lot of issues that users were having with cross port connections.

Removal of introduced globals

With commit 0a1a93b4846b0 a lot of variables got exposed to the global scope. We have identified this issue and have taken measures to prevent it from happening again. This is done with the introduction of a test suite that I have started to prototype and that will hopefully be ready for the 0.7 release, so stay tuned for more information about that.

For more information see the announcement on the Socket.IO google group.

node-http-proxy 5.3

More exciting news for those of us who build real time applications: with the 5.3 update of Nodejitsu’s node-http-proxy we can finally proxy Web Socket requests! This has been a plague ever since Web Sockets were added to the browsers, as most HTTP proxies do not support the HTTP 1.1 protocol. HAProxy was the only proxy able to successfully proxy Web Sockets, and only in TCP mode, which makes HTTP header inspection and the like impossible. The best thing about node-http-proxy is that it only requires one line of code to get started with proxying your requests:

// proxy all requests from port 80 to port 8080, including Web Socket requests.
require('node-http-proxy').createServer(8080, 'localhost').listen(80);

So thank the Nodejitsu folks for their hard work by downloading the latest code now.

Buffer floats and doubles

Commit e505a1215c5 lands support for reading and writing floats and doubles to the Buffer object. These changes make it a lot easier to handle binary data inside Node.js. JavaScript has already started implementing typed arrays, but Buffer objects are much more useful for Node at this point, as they add a lot less to the total heap size of V8 and therefore let you use much more memory than the 1.7gb V8 heap restriction on 64 bit machines would otherwise allow. Buffer objects are also sent to the sockets much faster, because it’s easier for Node to get a pointer to the raw data.

UPDATE: Thanks to a correction by @mraleph: Node’s Buffers are actually implemented on top of the External Array API of V8. They have a different syntax, but they are basically the same thing.
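For a taste of what this looks like, here is a small sketch using today’s Buffer API (the method names below are the modern Node.js equivalents of what the commit introduced):

```javascript
// allocate 12 bytes: an 8-byte double followed by a 4-byte float
var buf = Buffer.alloc(12);

buf.writeDoubleLE(Math.PI, 0); // write an IEEE 754 double at offset 0
buf.writeFloatLE(1.5, 8);      // write an IEEE 754 float at offset 8

console.log(buf.readDoubleLE(0)); // Math.PI round-trips exactly as a double
console.log(buf.readFloatLE(8));  // 1.5 is exactly representable as a float
```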

Documentation

There is still a lot of code in the Node.js core that isn’t documented, but thankfully this is being worked on. Commit 68d840b47d1 adds documentation for the Readline module, which allows you to read user input from stdin. This will allow even more command line based applications to be built on top of Node. The DNS module received documentation for the resolveNs and resolveCname methods with commit 56aa2fd4c3d.
