Streaming JSON parser

So I found something I don't like about NodeJS: it has a low memory limit. The V8 heap is capped at around 2 GB, even on a 64-bit system. If you don't believe me, try parsing a JSON file larger than 1 GB – it's not going to work. You end up with a nice out-of-memory exception.
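
Here's a minimal sketch of the naive approach that blows up (big.json stands in for any multi-gigabyte file):

var fs = require('fs');

// Reading the whole file into one string and parsing it in one shot
// allocates everything at once – on a large enough file this dies with
// an out-of-memory (or string length) error.
var data = JSON.parse(fs.readFileSync('big.json', 'utf8'));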

So how do you parse a very large JSON file? The same way you do everything else in NodeJS – async. You stream the file and wait for events. The jsonparse (https://github.com/creationix/jsonparse/) module is a huge help!


var Parser = require('jsonparse');
var fs = require('fs');

var p = new Parser();

// onValue fires every time the parser completes a value – strings,
// numbers, and fully assembled objects – so we can inspect each one
// as it streams by. Loose equality (==) lets a numeric 12345 match too.
p.onValue = function (value) {
  if (value.StatementNumber && value.StatementNumber == "12345") {
    console.log("Found me");
  }
};

// Feed the file to the parser chunk by chunk instead of loading
// the whole thing into memory.
var stream = fs.createReadStream('data.json');
stream.on('data', function (buffer) {
  p.write(buffer);
});
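
The read stream's 'end' event is a handy place to confirm the whole file has been scanned:

stream.on('end', function () {
  console.log('Finished scanning data.json');
});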

NodeJS – more than just a totally awesome web app platform.

So I needed to create a monitoring app that would hit a RESTful interface that required authentication. The only way to hit this RESTful service was to POST to a login page and then manage the authentication cookies for the remainder of the session. And I needed to get this done fast.

This is why I love NodeJS. NodeJS is more than a totally awesome web app platform; it makes for a great client application too.

NodeJS for the command line

The following is the basic code for kicking off an https request. The options argument contains the host, path, method, and any headers (including cookies). The data argument is for any data that needs to be POSTed with the request. (Make sure you set the appropriate headers for the data, like Content-Type.)


var https = require('https');

var cookie = "";
function processRequest(options, data, done){
  var req = https.request(options, function(res) {
    // The cookies arrive in the response headers, which are available
    // as soon as the response starts – capture them for the next request.
    cookie = processResponse(res);
    res.on('data', function(chunk) {
      // Drain the body so the response can finish.
    });
    if(done){
      res.on('end', done);
    }
  });
  req.on('error', function(e) {
    console.error(e);
  });
  if(data){
    // POST data gets written here –
    // not included in the options like jQuery.
    req.write(data);
  }
  req.end();
}

The magic is in the processResponse function. This is where we capture the cookie and store it for the next request.


function processResponse(res){
  var cookies = null;
  if(res.headers && res.headers["set-cookie"]){
    // set-cookie is an array of strings. Collapse it into a single
    // Cookie header value and strip the attributes (path, HttpOnly)
    // that the server does not want echoed back.
    cookies = res.headers["set-cookie"].join(";");
    cookies = cookies.replace(/path=\/; HttpOnly;/g, "");
    cookies = cookies.replace(/ path=\/; HttpOnly/g, "");
    cookies = cookies.trim();
    cookies = cookies + " otherCookie=oatmealraisen";
    return cookies;
  }
}
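
To make that concrete, here is what the transformation looks like with invented values:

// res.headers["set-cookie"] as sent by the server (values made up):
//   [ '.ASPXAUTH=abc123; path=/; HttpOnly',
//     'sessionId=xyz; path=/; HttpOnly' ]
// processResponse(res) returns a single Cookie header value:
//   '.ASPXAUTH=abc123; sessionId=xyz; otherCookie=oatmealraisen'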

Finally, we can create the options objects and pass them to the processRequest function. Because the requests are asynchronous, the summary request has to wait until the login response (and its cookie) has come back.


var loginData = "UserName=XXXXXXXXX&Password=XXXXXXXX&RememberMe=false";

function login(){
  var headers = { 'Content-Type': 'application/x-www-form-urlencoded', 'Content-Length': loginData.length };
  return { hostname: 'ACME.com', port: 443, path: '/LogOn', method: 'POST', headers: headers };
}

function dataSummary(cookies){
  var headers = { Cookie: cookies };
  return { hostname: 'ACME.com', port: 443, path: '/DataSummary', method: 'GET', headers: headers };
}

// Log in first, then request the summary once the cookie has been captured.
processRequest(login(), loginData, function(){
  processRequest(dataSummary(cookie));
});

NodeJS as an https client. The next step is to hook up a send-email function – but that is for another blog.

I am using NodeJS for a lot more than creating web apps

I am an ASP.NET MVC developer. I love .NET and C# – great products. But sometimes you just need something lightweight. I find myself using NodeJS to create those one-off projects for diagnostics and automation.

My web apps tend to be very JSON-heavy. In fact, we produce tons of JSON every week! NodeJS makes for a great interactive JSON inspection utility.

var fs = require("fs");
// A synchronous read is fine here – this is a one-off inspection, not a server.
var data = JSON.parse(fs.readFileSync("data.json", "UTF-8"));

Now you are free to investigate the data object. This is particularly helpful when you are using the REPL.
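
For example, a quick REPL session looks like this (StatementNumber is just a placeholder for whatever your data actually contains):

$ node
> var fs = require("fs");
> var data = JSON.parse(fs.readFileSync("data.json", "UTF-8"));
> Object.keys(data)        // list the top-level properties
> data.StatementNumber     // drill into a specific field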