node.js

Benjamin Erb, Michael Müller | 2012-02-13

Single-threaded and event-driven
JavaScript outside your browser

Background

Michael Müller
Student @ Ulm University
Computer Science and Media

Interests:
unix, web technologies, creative coding, ubiquitous computing

Contact:

www: micha.elmueller.net
twitter: @cmichi

Background

Benjamin Erb
Student @ Ulm University
Computer Science and Media

Interests:
web technologies, scalable architectures, distributed systems and mobile/ubiquitous computing

Contact:

www: benjamin-erb.de
twitter: @b_erb
g+: b.erb.io/+

Shameless Self-promotion

IOException.de
selected nerd stuff by CS students @uulm

UlmAPI.de
datalove ♥ – a local group of open data enthusiasts

Motivation

Let's build a network or distributed application!

Distributed Application – Building Blocks (1)

A program
is the code you write.
A process
is what you get when you run a program.
A message
is used to communicate between processes.
A packet
is a fragment of a message that might travel on a wire.
A protocol
is a formal description of message formats and the rules that two processes must follow in order to exchange those messages.

Distributed Application – Building Blocks (2)

A network
is the infrastructure that connects different machines via communication links.
A component
can be a process or any piece of hardware required to run a process, support communications between processes, store data, etc.
A distributed system
is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate together to perform a single or small set of related tasks.

Challenge:
Handling I/O

Challenge:
Concurrency + State

Challenge:
Scalability

Modeling Activities in Programming Languages

  • synchronous vs. asynchronous
  • blocking vs. non-blocking
  • procedural vs. event-based
Image: Karlina - Carla Sedini (cc-by-nc-sa 2.0)

node.js

scalable network applications using
an event-driven, non-blocking I/O model

…written in JavaScript

JavaScript #101

The Basics

Random facts

  • Brendan Eich, 1995
  • ECMAScript 5.1 (June 2011)

  • script language
  • multi-paradigm language:
    functional, procedural, object-oriented

Types: Simple Types

  • Numbers
  • Strings
  • Booleans
  • Function
  • Object

Types: Simple Types

  • Numbers
  • Strings
  • Booleans
  • Null
  • Undefined
  • Object
    • Function
    • Array
    • Date
    • RegExp
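
The type list above can be checked with `typeof` and a few standard helpers. This is a minimal sketch; the variable names are chosen for illustration only.

```javascript
// Simple types report their own name via typeof ...
var simple = [typeof 42, typeof 'node', typeof true, typeof undefined];

// ... except null, which reports 'object' (a long-standing language quirk).
var nullType = typeof null;

// Functions, arrays, dates and regular expressions are all objects.
var fnType = typeof function () {};        // functions get their own typeof result
var isArray = Array.isArray([1, 2, 4]);    // typeof [] would only say 'object'
var isDate = new Date() instanceof Date;
var regexTag = Object.prototype.toString.call(/ab+c/);

console.log(simple, nullType, fnType, isArray, isDate, regexTag);
```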

Prototypes

  • Lookup chain ("Inheritance")
Array.prototype.inArray = function(value) {
	for (var index in this) { /* ... */ }
}

[1,2,4].inArray(3);

delete song.artist	

hasOwnProperty();
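
How the lookup chain, `delete` and `hasOwnProperty()` interact can be sketched as follows. The `base` and `song` objects here are hypothetical and only serve the illustration.

```javascript
// Hypothetical objects for illustration only.
var base = { artist: 'unknown' };
var song = Object.create(base);    // base becomes the prototype of song
song.artist = 'Rick Astley';       // own property shadows the inherited one

var before = song.artist;                        // found on song itself
var ownBefore = song.hasOwnProperty('artist');   // true
delete song.artist;                              // removes only the own property
var after = song.artist;                         // lookup now continues on the prototype
var ownAfter = song.hasOwnProperty('artist');    // false

console.log(before, ownBefore, after, ownAfter);
```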

Object Literals

    var song = {
      "never": "gonna",
      "give" : "you",
      "up": {
        "never": 2,
        "gonna": true,
        "let"  : [2, 4, 6],
        "you"  : function() { return 1; },
        "down" : null
      }
    }
    

Functions

Functions as first-class citizens
  • Variables, argument, return values
  • Composable at run-time
  • Identity: independent from name
http://en.wikipedia.org/wiki/First-class_citizen
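
A short sketch of what "first-class" means in practice: functions are stored in variables, passed as arguments, returned as results, and composed at run-time. The helper `compose` and the function names are assumptions made for this example.

```javascript
// compose takes two functions and returns a new one, built at run-time.
function compose(f, g) {
  return function (x) {
    return f(g(x));
  };
}

var addOne = function (x) { return x + 1; };
var double = function (x) { return x * 2; };

// First add one, then double the result.
var addOneThenDouble = compose(double, addOne);

console.log(addOneThenDouble(3)); // 8
```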

Callback Functions

var callback = function(result) {
  //...
};

doItAsynchronously(foo, bar, callback);

JavaScript intrinsically supports event-oriented programming!

Interesting Usages

Anonymous functions

setTimeout( function() { /* ... */ }, 1000);

Functions as variables

var foo = function() {}
foo();
execute(foo);

Interesting Language Usages

Closures

var foo = function() {
  var value = 42;

  return {
    // the inner function closes over the local variable `value`
    getValue: function() { return value; }
  };
};

var obj = foo();
obj.getValue(); // 42

Cascading

getBox()
  .setPosition(30, 40)
  .setBorder(1)
  .setVisible(true)

The Bad Parts

  • global Object
  • scoping
  • eval(), this, typeof
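
The scoping item can be illustrated with two small functions. This is a sketch with made-up function names; it only shows that `var` is scoped to the enclosing function and that declarations are hoisted.

```javascript
function loopScope() {
  for (var i = 0; i < 3; i++) { /* ... */ }
  return i;                   // still visible here: `var` is function-scoped
}

function hoisting() {
  var seen = typeof later;    // no ReferenceError: the declaration is hoisted
  var later = 1;
  return seen;
}

console.log(loopScope(), hoisting()); // 3 'undefined'
```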

The Good Parts

  • Functions as first class objects
    (callback functions, closure)
  • Prototypal inheritance
  • Object literal & Array literal, RegEx syntax

JSON

http://memegenerator.net/instance/12711521

JSON

light-weight data interchange

JSON

  • JavaScript Object Notation
  • light-weight data format

  • Objects, Arrays, Strings, Numbers, Booleans, null

JSON Types

// Objects
{ "foo": "bar", "year": 2012 }

// Arrays
[ "a", "b", "c" ]

// Strings
"john doe"

// Numbers
42 2e+6 2.34

// Boolean
true

// Null
null

JSON

{
  "Some": "String",
  "Large": 2e+6,
  "Small": 42,

  "An Object": {
    "An Array": [ "Un", "Dos", "Tres" ],
    "Empty Array": [],
    "Nothing": null
  }
}					

JSON Usage in JavaScript

// Serialize an object to a JSON string
JSON.stringify(  {two: 2, three: 3}  );
// Parse a JSON string into an object
var foo = '{"one":1, "two":"2"}';
var bar = JSON.parse(foo);
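
A few edge cases of the two calls above, as a hedged sketch: values that have no JSON representation are dropped, and JSON is stricter than JavaScript object literals. All variable names are illustrative.

```javascript
// Functions and undefined values are not part of JSON and are dropped.
var text = JSON.stringify({ two: 2, fn: function () {}, nothing: undefined });

// Unquoted keys are valid in a JavaScript object literal, but not in JSON.
var parseError = null;
try {
  JSON.parse('{ foo: "bar" }');
} catch (e) {
  parseError = e;
}

// A round trip yields an equal, but distinct, object.
var original = { one: 1, list: ['a', 'b'] };
var copy = JSON.parse(JSON.stringify(original));

console.log(text, parseError instanceof SyntaxError, copy.list.length, copy === original);
```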

node.js Basics

The Basics

The Creator: Ryan Dahl

Image: bumi (cc-by-sa 2.0)

The Idea

To provide a purely evented, non-blocking infrastructure to script highly concurrent programs.
http://s3.amazonaws.com/four.livejournal/20091117/jsconf.pdf

But why JavaScript?

  • it had no API for I/O at all
  • it supports callbacks, closures etc.
  • it embraces an event-driven style

The Project

  • single-threaded event loop
  • completely non-blocking and asynchronous
  • esp. no (implicit) blocking I/O
  • simple concurrency model

Event Loop?

Event-driven Programming

  • well-known: GUIs
  • events, event handlers and callbacks
  • event queue and single-threaded event loop
  • explicit task management
  • no call stacks
Well, in node everything runs in parallel, except your code.
@felixge
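
The single-threaded event loop can be observed with a timer of 0 ms. In this sketch (the `order` array is just a helper for the example), the callback is queued as an event and only runs after the currently executing code has finished.

```javascript
var order = [];

// The callback is queued; it cannot interrupt the code that is running now.
setTimeout(function () {
  order.push('timer callback');
  console.log(order.join(' -> '));
}, 0);

order.push('main script start');
order.push('main script end');
```

Even with a delay of zero, the output lists both main script entries before the timer callback.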

???

Event-driven Programming: Event Handling

Okay, but JavaScript is slow, right?!

Google v8

  • part of Google Chrome web browser
  • compiles JavaScript to native machine code before execution
  • advanced features (e.g. inline caching)
  • sufficiently fast

node.js: Overview

  • built on Google's v8
  • written in C++ and JavaScript
  • libraries that wrap POSIX interfaces (async!)
  • some dedicated libraries (e.g. http-parser)
  • signaling via epoll, kqueue, /dev/poll or select

Reminder: Blocking Calls

SQL Query with Java

Statement s = conn.createStatement();
// executeQuery() blocks the calling thread until the database has answered
ResultSet rs = s.executeQuery("SELECT id FROM users");
// use result set

Non-Blocking Calls

Pseudo example in asynchronous JavaScript

db.executeStatement("SELECT id FROM users", function(resultSet) {
	// use result
});
//...
  • Does not block!
  • Callback function is passed to call
  • The callback is executed once the operation has finished
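
The pseudo example can be made runnable with a stand-in for the database. The `db` object below is hypothetical and simulates latency with a timer; it is not a real driver API.

```javascript
// Hypothetical stand-in for a database driver; not a real library.
var db = {
  executeStatement: function (sql, callback) {
    // Simulate the round trip to the database with a 10 ms timer.
    setTimeout(function () {
      callback([{ id: 1 }, { id: 2 }]);
    }, 10);
  }
};

var log = [];

db.executeStatement('SELECT id FROM users', function (resultSet) {
  log.push('got ' + resultSet.length + ' rows');
  console.log(log.join(', then '));
});

// This line runs immediately; the call above did not block.
log.push('statement issued');
```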

Comparison: File I/O – Sync/Async

var fs = require('fs');

// Async: the callback runs once the result is available
var callback = function(err, stats){
	if (err) throw err;
	console.log(stats);
};
fs.stat("/etc/passwd", callback);

var fs = require('fs');

// Sync: Don't try this at home...
var stats = fs.statSync("/etc/passwd");
console.log(stats);
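
Asynchronous calls in node report failures through the first callback argument instead of throwing. A small sketch of that convention using `fs.stat`; the second path is a made-up name that is assumed not to exist.

```javascript
var fs = require('fs');
var results = {};

// The current directory exists, so this call should succeed.
fs.stat('.', function (err, stats) {
  results.ok = !err && stats.isDirectory();
  console.log('current directory found:', results.ok);
});

// Hypothetical path that is assumed not to exist.
fs.stat('./no-such-file-for-this-example', function (err, stats) {
  results.code = err ? err.code : null;
  console.log('error code:', results.code);
});
```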

Access Latencies

RAM Access: 250 Cycles
Harddisk I/O: 41,000,000 Cycles
Network I/O: 240,000,000 Cycles

ZZZZzzzzzZZZZ

What happens in the meantime?

Concurrency: Multithreading

  • multiple flows of execution
  • OS schedules threads/processes
  • Thread is waiting for I/O? => switch

Multithreading: Beware…

  • context switching is not for free
  • execution stacks cost memory
  • Also, multithreading is hard. (Yes it is!)

What about C10k?

10,000 Connections = 10,000 Threads?

Oops!

We're fucked.

Just use a single thread!

Ermh, are you nuts?

Remember…

Well, in node everything runs in parallel, except your code.
@felixge

node scales!

…at least for I/O-bound operations

node does not scale!

…for CPU-bound operations.
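
Why CPU-bound work hurts can be shown with a timer and a busy loop. In this sketch, the timer is due after 10 ms, but its callback has to wait until the single thread is free again. The 100 ms busy loop is an arbitrary stand-in for real computation.

```javascript
var start = Date.now();
var timerDelay = null;

// Ask for a callback after 10 ms ...
setTimeout(function () {
  timerDelay = Date.now() - start;
  console.log('timer fired after ' + timerDelay + ' ms');
}, 10);

// ... then keep the single thread busy for about 100 ms.
while (Date.now() - start < 100) { /* CPU-bound busy work */ }
```

The callback fires only after the loop has finished, so every other connection served by the same process would have been stalled for that time as well.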

Programming with node.js

Building Applications

The Tools

  • Read-Eval-Print Loop (REPL)
  • Node Package Manager

TCP Echo Server

var net = require('net');

var server = net.createServer(function (socket) {
  socket.write("Echo server\r\n");
  socket.pipe(socket);
});

server.listen(1337, "127.0.0.1");
http://nodejs.org/
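
To try an echo server without a separate terminal, a client can be attached in the same process. This is a sketch under a few assumptions: it listens on port 0 so the operating system picks a free port, omits the greeting line, and shuts down after one message.

```javascript
var net = require('net');

var server = net.createServer(function (socket) {
  socket.pipe(socket); // echo every chunk back to the sender
});

var received = '';

// Port 0 lets the OS pick a free port (an assumption made for this sketch).
server.listen(0, '127.0.0.1', function () {
  var port = server.address().port;
  var client = net.connect(port, '127.0.0.1', function () {
    client.write('hello node\r\n');
  });

  client.setEncoding('utf8');
  client.on('data', function (chunk) {
    received += chunk;
    if (received.indexOf('\n') !== -1) {
      console.log('echoed: ' + received.trim());
      client.end();     // close the connection ...
      server.close();   // ... and stop accepting new ones
    }
  });
});
```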

Live Demo

Enhancing the example

Structure: CommonJS Module System

// config.js
exports.port = 1337;

// echo-server.js
var config = require('./config.js');
server.listen(config.port, 'localhost');

Concurrency

  • good: there is only one thread
  • bad: what about multicore?

Utilizing Multiple Cores

  • run multiple instances (you don't have state, do you?)
  • use built-ins: process, cluster, child_process
  • if necessary, use a message queue (e.g. ØMQ)
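
One way to use a second core is to hand work to another process via the built-in `child_process` module. A minimal sketch: the computation `6 * 7` is a placeholder for real CPU-bound work, and the child is simply another node instance started with `-e`.

```javascript
var childProcess = require('child_process');
var result = null;

// Start a second node process; process.execPath is the running node binary.
childProcess.execFile(process.execPath, ['-e', 'console.log(6 * 7)'], function (err, stdout) {
  if (err) throw err;
  result = parseInt(stdout, 10);
  console.log('child computed: ' + result);
});

// The parent's event loop stays free while the child is working.
```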

socket.io

Socket.IO aims to make realtime apps possible in every browser and mobile device […]
http://socket.io/

socket.io

// Server
var io = require('socket.io').listen(80);

io.sockets.on('connection', function (socket) {
  socket.emit('news', { hello: 'world' });
  socket.on('my other event', function (data) {
    console.log(data);
  });
});
// Client 
<script src="/socket.io/socket.io.js"></script>
<script>
  var socket = io.connect('http://localhost');
  socket.on('news', function (data) {
    console.log(data);
    socket.emit('my other event', { my: 'data' });
  });
</script>
http://socket.io/

request

Request is designed to be the simplest way possible to make http calls. It supports HTTPS and follows redirects by default.
https://github.com/mikeal/request

request

var request = require('request');

var options = {
  url: 'http://foo/login',
  form: {'email': 'john@doe.de', 'password' : 'mypw'}
};

request.post(options, function (err, res, body) {
  request.get('http://foo/orders', function (err, resp, body) {
    //...
  });
});
  • Uses cookies globally by default
  • Scraping, crawling, automation of tasks

jQuery, jsDOM

  • Using "client-side" libraries on the server

  • $.each(), …
  • Selectors
  • Good tool for Scraping tasks

Your daily reddit

var jsdom = require('jsdom');

jsdom.env({
  html: 'http://reddit.com/',
  scripts: ['http://code.jquery.com/jquery-1.7.1.min.js'],

  done: function(errors, window) {
      // ?
  }
});

Your daily reddit

var jsdom = require('jsdom');

jsdom.env({
  html    : 'http://reddit.com/',
  scripts : ['http://code.jquery.com/jquery-1.7.1.min.js'],

  done: function(errors, window) {
	
    var cnt = 0;
    window.$("a.title").each(function() {
      if (cnt++ < 3) console.log(window.$(this).text());
    });    

  }
});

node_pcap

  • Bindings for libpcap
  • Decoding, Analyzing, Printing packets
pcap_session.on('packet', function (raw_packet) {
  var packet = pcap.decode.packet(raw_packet);
  console.log(packet.link.ip.tcp.dport);
});
tcp_tracker.on('start', function() {});
tcp_tracker.on('end', function() {});
https://github.com/mranney/node_pcap

Other interesting libraries

  • express
  • coffee-script
  • jade
  • dnode
  • … various database connectors

node.js is no Silver Bullet

Told you so.

Event Loop = Inversion of Control

  • flow of execution !== code sequence
  • sequence of async code === callback chaining

Welcome to the Callback Hell!

doA(foo, function (err, a) {
    doB(bar, function (err, b) {
        doC(bla, function (err, c) {
            doD(4, function (err, d) {
                doE(d, function (err, e) {
                    doF(e++, function (err, f) {
                        //THIS IS SPA…GHETTI!
                    });
                });
            });
        });
    });
});

Countermeasures

Continuation-Passing Style

Example: Identity Function
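// direct style: the result is returned to the caller
function id(x) {
  return x;
}
// continuation-passing style: the result is handed to the continuation cp
function id(x,cp) {
  cp(x);
}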

Chain the calls and pass a final continuation callback for completion:

fn2( fn1( 123 ))  //regular chaining
fn1(123, fn2)     //CPS style
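
A slightly larger sketch of the same idea, with two steps chained in both styles. The functions `addOne` and `square` and their CPS variants are invented for this example.

```javascript
// Direct style: results travel back through return values.
function addOne(x) { return x + 1; }
function square(x) { return x * x; }
var direct = square(addOne(2));

// CPS: every function receives a continuation and "returns" by calling it.
function addOneCps(x, cont) { cont(x + 1); }
function squareCps(x, cont) { cont(x * x); }

var viaCps = null;
addOneCps(2, function (sum) {
  squareCps(sum, function (result) {
    viaCps = result;
  });
});

console.log(direct, viaCps); // 9 9
```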

Countermeasures

Dedicated Libraries (e.g. async)

var async = require('async');

async.waterfall([
    function(callback){
        callback(null, 'one', 'two');
    },
    function(arg1, arg2, callback){
        callback(null, 'three');
    },
    function(arg1, callback){
        // arg1 now equals 'three'
        callback(null, 'done');
    }
], function (err, result) {
   // result now equals 'done'    
});				
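
To show what such a helper does under the hood, here is a hand-written `waterfall` function. It is a simplified sketch for illustration, not the implementation used by the async library: each task receives the results of the previous one plus a callback, and the first error skips straight to the final callback.

```javascript
// Simplified sketch of a waterfall helper; not the async library's code.
function waterfall(tasks, done) {
  var index = 0;

  function next(err) {
    // Stop on the first error or when all tasks have run.
    if (err || index === tasks.length) {
      return done.apply(null, arguments);
    }
    // Pass the previous results on, followed by the callback.
    var args = Array.prototype.slice.call(arguments, 1);
    args.push(next);
    tasks[index++].apply(null, args);
  }

  next(null);
}

var finalResult = null;

waterfall([
  function (callback) {
    // Simulated asynchronous step.
    setTimeout(function () { callback(null, 'one', 'two'); }, 5);
  },
  function (arg1, arg2, callback) {
    callback(null, arg1 + '+' + arg2);
  }
], function (err, result) {
  finalResult = err ? 'failed' : result;
  console.log(finalResult);
});
```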

Other Caveats

  • never use blocking calls
  • don't ever run CPU-bound tasks on the event loop
  • node.js still changes fast!
  • mixed maturity of available modules

Conclusion

let's wrap it up

Usage scenarios

  • Web
    • Real Time Web
    • Webservices (e.g. REST + JSON)
  • Network applications
  • Rapid Prototyping

Companies Using node.js in Production



Reading Recommendations – Books

Douglas Crockford
JavaScript: The Good Parts
2008, O'Reilly Media, 170 pages
ISBN: 978-0-596-51774-8

David Flanagan
JavaScript: The Definitive Guide
2011, O'Reilly Media, 1098 pages
ISBN: 978-0-596-80552-4

Thank you!

Questions?

Backup Slides