JavaScript and Maps (in that order)

NPM as a build system

EDIT: In case I wasn’t clear enough: when you use npm to run a script with npm run foo or npm test, it makes sure all of your locally installed module binaries are in your path, so if you only have uglify and mocha installed locally you can still write "test": "uglifyjs ./lib && mocha test/*.js"

The reasons that you often don’t need grunt have been done to death, so I don’t feel any need to lend my voice except to agree. I’ve also seen Make thrown around as another alternative. Make is a fine program on *nix systems, but for node.js it’s missing a couple of things:

  1. it is global: it doesn’t know about your locally installed modules, so you have to specify ./node_modules/uglify-js/bin/uglifyjs instead of just uglifyjs.
  2. it is a pain to use on windows.

Now before we turn this into a windows/*nix flame war, let me point out that I do not develop on windows, I despise developing on windows, and the only time I ever voluntarily touch windows machines is when I’m helping my fiancée fix her laptop. That being said, I work with people who use windows, and I’d like to continue working with them, as I have no interest in doing the heroic work they do with our legacy .NET apps.

Many people starting out with node are also on windows. While we may later be able to seduce them over to the *nix side, for now they are learning on the machine they have.

So back to your build step needing to work on windows: keep in mind that not every node project on windows is simply going to get deployed to a linux box, so simply using vagrant is not the answer. I’ve used iisnode to replace pieces of old asp apps, and I’ve used node to compile static assets that are then pushed somewhere.

And yes, it is theoretically possible to get make installed on windows, but not easy; most of the suggestions I’ve seen contain either sourceforge links or references to Cygwin. These are not things that sane people want to do.

What instead?

For the 99% of projects that are a build step and some tests, you can use NPM as a build system: just put the commands in your package.json. When you run a script that way, npm makes sure your node_modules/.bin folder is in your PATH, so if you use mocha to test you can just add

"test": "mocha -R nyan test/test.js"

to the scripts object in your package.json, and npm test will run the tests in test/test.js. If you like tape you can do

"test": "node test/test.js | tspec"

jshinting can be done with

"lint": "jshint lib/*",

or you can combine it with your testing setup

"test": "jshint lib/* && node test/test.js | tspec"

both of these will look to .jshintrc for the rules, no need to specify them in a config.

karma can be run this way easily and testling can as well with browserify

"test": "browserify test/tests.js | testling"

For building, browserify excels at being used from a package.json script, but if you must use require.js, instead of putting the overly verbose and totally unnecessary 30 lines of config options to do something simple in with the build stuff, you can put them in a build.js file and then go

"build": "r.js -o build.js"

but if you don’t like having to specify a separate build.js file, you can just put the config in your package.json instead (I’ll get to the syntax in a minute), though to be fair, since I often have a couple of steps here, I usually just separate it out into a couple of scripts.

With browserify you can specify other things in the package.json as well. For instance the “browser” key defines a map where you can exclude or substitute things from the build. When you specify the substitutions here instead of in the grunt file, other people browserifying a library that uses your lib as a dep also get the substitution applied. In other words, you need to be doing this anyway, so why do it twice?
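
For instance, a package.json might contain something like this (the module names and paths here are hypothetical):

"browser": {
  "jquery": "./vendor/jquery.js",
  "fs": false
}

which swaps in your bundled jquery whenever something requires it, and stubs out fs entirely.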

browserify-shim is a very helpful library in this respect (so helpful that grunt-browserify includes it but fails to mention it, causing a confusing blending of the options in the docs).
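
A minimal sketch of using it (assuming a non-CommonJS ./vendor/jquery.js that creates a global $): in your package.json you register the shim as a transform and describe what each file exports,

"browserify": {
  "transform": ["browserify-shim"]
},
"browserify-shim": {
  "./vendor/jquery.js": "$"
}

after which require('./vendor/jquery.js') hands you the $ global like a normal module.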

If things get too busy we can just launch a shell script with

"build": "./build.sh"

over in build.sh the same rules apply: it will work on windows if you stick to node commands (and git), so you stay cross platform. You can also call node scripts, aka

"build": "node build.js"

In some of the examples I used | and && to separate commands on a single line, and > to output to a file. In shell commands, && means the same thing as in JavaScript: it’s a lazy and operator, so command1 && command2 means run command1 and, if it doesn’t fail, run command2.

The other operators relate to pipes/stdin/stdout. Commands can read from stdin and write to stdout, and in node those are both streams, available at process.stdin and process.stdout, though another thing that outputs to stdout in node is console.log. By default, unless it is piped anywhere, your console will print stdout, and that’s why programs like browserify will print your bundle on the screen if you forget the -o option.
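
To make that concrete, here is a minimal sketch of a node command that reads stdin and writes stdout (the file name upper.js is made up):

// upper.js: read stdin, uppercase it, write it back out to stdout
process.stdin.on('data', function (chunk) {
  process.stdout.write(String(chunk).toUpperCase());
});

so echo hi | node upper.js prints HI.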

The | command (pipe) takes the output from one command’s stdout and pipes it to the next command’s stdin, so when you want to browserify a file and give it to testling, you can do

browserify . | testling

which takes the output of browserifying the current directory (the dot) and pipes it to testling.

or if you want to take the ugly tap output from tape and make it look nice, use

node test/test.js | tspec

which pipes it to tap-spec, or you could use

node test/test.js | tnyan

to pipe it to tap-nyan, because testing should be fun.

the > operator means write stdout to a file, so you’ll sometimes see people write

browserify . > bundle.js

which means write the bundle to bundle.js (yes I know you could also write -o)

These can be combined with pipes as well, so to browserify and minify you can write

browserify . | uglifyjs -mc > dist/script.js

which browserifies it and then pipes the output to uglifyjs to minify it with the mangle and compress options.

N.B. uglifyjs is the command line program, but it’s in NPM as uglify-js so you need to do npm install --save-dev uglify-js

The < operator reads the specified file as stdin so the following, while stupid, would work

browserify . > temp.js && uglifyjs -mc < temp.js > dist/script.js

in which we write our browserify output to a temp file and then read it into stdin for uglify. A non-contrived example would involve something like istanbul, which always writes code coverage to a file and prints test results to stdout, so if you wanted to run coverage on a mocha script and then pipe it to coveralls, then

istanbul cover ./node_modules/mocha/bin/_mocha --report lcovonly -- tests/*.mocha.js -R spec -t 5000 && coveralls < ./coverage/lcov.info

would do it

N.B. ./node_modules/mocha/bin/_mocha is the path to the real mocha binary; the mocha in your path (that we would get if we ran mocha) is a command line parser that will prevent istanbul from working right.

You will sometimes see people writing the above as

istanbul ... spec -t 5000 && cat ./coverage/lcov.info | coveralls

those people are wrong: cat is for concatenating files, not for reading them into stdout for you, and besides, cat isn’t going to be available on windows.

If you want to watch your bundle and have it rebuild when you change anything, try watchify

"watch": "watchify . -o ./dist/bundle"

note you need to specify the output file with -o, you can’t use > because … it throws an error for some reason.

WINDOWS!

There is no rm -rf on windows, so instead use rimraf, which will remove a directory and all the stuff in it recursively, for example

rimraf ./foo

will remove ./foo, and ./foo/bar, and ./foo/bar/baz and anything that might be in them.

N.B. Don’t be an idiot with rimraf: if you’re careless you can delete everything, especially on windows.

To do batch copies à la grunt-copy you can use copyfiles, which I wrote expressly as a replacement for grunt-copy. So to move all the JavaScript files from the vendor and src folders to dist

copyfiles ./vendor/*.js ./src/*.js dist

but if you want them without the outer vendor and src folders

copyfiles -f ./vendor/*.js ./src/*.js dist

and if you want to do a globstar

copyfiles -f './vendor/**/*.js' './src/**/*.js' dist

for operating system consistency you need the quotes around globstars, and for once it’s not windows that is the problem, it’s OSX with its out of date bash.

Conclusion

You probably don’t need to use any build system for many node projects because npm can act as a build system, and for the love of god don’t use make, because that’s practically holding up a sign telling .NET programmers to continue writing .NET programs. We want them publishing all the things to NPM.

discuss on reddit


JavaScript style mistakes I see a lot

  • Edit: removed recursive reference, it was incorrect (my bad) but you should probably avoid recursive references in general.
  • Edit2: discuss on reddit
  • Edit3: forgot hoisting was such a hot button issue, switching to the hopefully less controversial statement that declarations should be your default

A quick rundown of a few very common JavaScript style issues that, while not outright errors, you should probably avoid.

Function expressions, not on objects

This is when you write

var name = function () {};

instead of

function name(){}

the bottom version should be your default, to be used unless you have a reason to do it the other way. Some of the benefits include

  • hoisting (though you might passionately disagree); see the sketch after this list
  • the reference to the function’s name inside the function can’t be changed.
  • because you have to pick one for consistency, and I’ve never seen a project choose the top version.
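
A quick sketch of the hoisting difference (function names made up):

// function declarations are hoisted with their bodies, so this works
main();
function main() { console.log('hello'); }

// with an expression only the var is hoisted, not the value, so main2
// is still undefined when it is called and this throws a TypeError
main2();
var main2 = function () { console.log('hello'); };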

using module.exports to export an object

That is when you write

module.exports = {foo: function(){}, bar: function(){}}

instead of

exports.foo = function (){};

exports.bar = function (){};

the difference is that the bottom syntax can handle circular dependencies and the top can’t; they are otherwise identical.
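
A sketch of why, with two made-up modules that require each other:

// a.js
exports.foo = function () {};
var b = require('./b'); // b.js runs here, before a.js has finished

// b.js
var a = require('./a'); // gets a's partially filled-in exports object
// a.js keeps adding to that same object, so a.foo is there and works;
// had a.js replaced module.exports with a fresh object, we'd be stuck
// holding the old empty one
exports.bar = function () { return a.foo; };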

making me guess your exports key.

That is, if you only export one thing from your module, doing

exports.someRandomName = thing;

as opposed to

module.exports = thing

unless you have a circular dependency you have to deal with, making me write

var yourThing = require('yourModule').someRandomName;

is going to cause me to judge you.

PouchDB 2.0.0

PouchDB 2.0.0 is out and has a lot of great stuff. First off you’ll notice breaking changes; they are:

  • allDbs has died in a fire. This optional, disabled-by-default feature complicated the db creation process to no end and was the source of many race conditions in testing. If you were one of the 3 people using it, fear not: while it was previously impossible to get this behavior any other way, it now is possible thanks to another new feature.
  • PouchDB now uses prototype for inheritance, no more of that copying methods from object to object. This does mean that var get = db.get will blow up, so write var get = db.get.bind(db) instead.
  • plugins now take advantage of the prototype, meaning you can add them after you create a database.
  • For the indexedDB and webSQL adapters the schema has been updated to give correct doc counts; your db should migrate fine.
  • for the leveldb adapter, instead of a folder with 4 leveldbs and an annoying ‘.uuid’ file, it is now a single leveldb with 4 sublevels and no ‘.uuid’ file. This will also transfer your data over the first time you open an old style db in the new version.
  • these last 2 mean that v1 PouchDBs will open in v2 but not the other way around. Also v1 ones will become v2 ones as soon as they are opened in the new release so don’t switch back and forth.

Non breaking changes:

  • leveldb based PouchDB creation is now totally async with no fs.
  • all the methods return promises (except the replication ones), in addition to taking callbacks; see the sketch after this list. If you hate promises then the only change you’ll notice is that we more consistently return errors instead of throwing. We use the superfast bluebird in node, and in the browser we default to native promises and fall back to the supersmall lie.
  • it works in web workers now, including in chrome where you have access to indexedDB inside a worker.
  • we prevent overly enthusiastic caches in IE from preventing requests from reaching CouchDB
  • we no longer screw around with your options object, so you can reuse it for multiple calls.
  • if you pass emit as the second argument to your temp query function, map reduce no longer evals it.
  • a whole host of things to make replication much faster
  • a whole lot of tweaks to the webSQL adapter thanks to Nolan.
  • the PouchDB constructor is now an event emitter and you can listen for create and destroy events.
  • If somebody has shat all over your JavaScript environment by adding enumerable properties to Object.prototype, PouchDB has you covered.
  • Put sugar: you can specify _id and _rev as the 2nd and 3rd arguments to put, i.e. instead of db.put({some: 'stuff', _id: 'myId', _rev: '1-rev'}) you can do db.put({some: 'stuff'}, 'myId', '1-rev'). When you are creating a document you can omit the rev (though if you omit the _id and do include the rev, you’ll create a document with that rev).
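
As for the promise support, both styles now work; a quick sketch (db name and doc id made up):

var db = new PouchDB('example');

// callback style
db.get('myId', function (err, doc) {
  if (err) { return console.error(err); }
  console.log(doc);
});

// promise style
db.get('myId').then(function (doc) {
  console.log(doc);
}).catch(function (err) {
  console.error(err);
});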

Some fixes that make developing much easier include

  • qUnit has been totally replaced with mocha and chai; if you’ve ever had the misfortune of using qUnit in node, or with async tests, or both, you’ll know why this is so awesome.
  • all the tmp test dbs are in their own folder which is deleted afterwards, no more mess after failed tests.

Grab it from npm, component or bower, or download the regular or minified version.

Dear TC-39, can you please take another look at your module syntax

Dear TC-39,

You’ve been doing some fantastic work recently; all the new features slated for ES6 look amazing. Arrow functions and the function parameter sugar do such a good job at making functions just that much more fun to use. Set, Map, and WeakMap are great and really going to help for data centric apps (like all the GeoSpatial stuff I do). Generators, well, I’m already using them and they are fantastic. Promises are already in browsers, which is so fast it’s unbelievable. All these great things you’ve done make this just so much harder. But the module syntax you folks are proposing for ES6: I think you should take a fresh look at it.

Most of the discussion we’ve had on the module system has been framed poorly: it has been people like me, who it rubs the wrong way, attacking it, and people like you, who spent a lot of time developing it, defending it. But really we’re all on the same side; we want the best possible module system that we are able to actually get into browsers in a realistic time frame. More specifically, a module system that does static analysis (finds the dependencies without running the code), so that we can avoid the significantly added layer of complexity that comes with dynamically loading modules.

I originally was going to write a whole thing about the flaws in the system you are proposing; I even made a flow chart to show how complicated the export syntax is, and a whole list of links to code that wouldn’t be writable in the new syntax. But then it occurred to me that it is hard to tell if you folks writing the spec really even think this syntax is any better than what is currently out there, because if you do, you’re doing a very bad job of selling it to us in the node community. Static analysis of dependencies and imports with browserify, or with the CommonJS wrapper in require.js (define(function(require, exports, module){...});), shows that in practice there is no need for restrictions on the location of imports to successfully find them in static analysis, and that people trying to conditionally load modules in a situation where they can’t isn’t a problem in the wild.

There are sources which describe ES6 modules as a compromise between CommonJS and AMD modules, but again, that isn’t really the case any more, as using the CommonJS wrapper has become more prevalent in AMD code, to the point that most debates about CommonJS vs AMD have much more to do with tooling and deployment strategy. In many ways ES6 modules are redefining the one thing about modules that the AMD and CommonJS communities agree on: syntax.

Which brings us to the root issue: TC-39 hasn’t sold the node and AMD communities on its new module system. Coming from node and AMD systems, this one is more complicated with significantly less functionality, specifically

  • Lack of import statements (as opposed to declarations).
  • Syntax that is still geared towards exporting a bag of values as opposed to a thing.
  • Generally high level of complexity of import export syntax.

The people in the node community giving you grief about this syntax aren’t just bikeshedding; they are people building production apps with a module system that they like, and you haven’t convinced them that you have something better. And that’s just sad, because the node community is your biggest fans. We love generators, we are using block scoping in the wild, and we have some very strong feelings about promises, much of which is positive.

So TC-39, that’s what I’m asking you: take a hard look at your module syntax and ask yourselves

  • Are we taking advantage of the experience that the community has gained building production systems with CommonJS and AMD?
  • Are we taking advantage of the examples offered by more modern build tools like browserify and require.js?
  • Are there real and non-hypothetical advantages gained from restricting imports to declarations?
  • Have we taken into account other features of ES6 like destructuring in designing these modules?
  • If you weren’t involved in the spec writing process, would the export syntax make as much sense?

From an outside perspective it feels like TC-39 is suffering from Not Invented Here syndrome. If committee members don’t think that this is the case, then I think a lot of people would appreciate it if you tried to sell us on the advantages of this syntax as compared to current tools. And if members don’t feel that way, I don’t think I’m the only one who would prefer modules done right in ES7 to settling in ES6.

Edit:

  • Comments on reddit
  • Also in case it’s not clear, I’m not advocating just dropping node modules into the JavaScript spec, just that it would be nice to have a similar level of features.

Allow me to reintroduce you to Proj4js

tl;dr Proj4js used to be really bad, now it’s better and you should use it and help build it.

Background

A while ago I decided to write a program to parse shapefiles in the browser into geojson. Despite the shapefile’s reputation as a perversely bizarre format (it’s binary and mixed endian), the part that ended up being the trickiest had nothing to do with the file itself. The problem was with the communities that produce shapefiles and consume geojson. Geojson almost exclusively records coordinates in unprojected latitude, longitude notation (WGS84 to be exact, though somebody did give me NAD83 once). Shapefiles tend to store coordinates in a projected coordinate reference system.

The problem I faced was: how could I get shapefiles in any of the thousands of projections out there to just work? There was a program called Proj4js, but it was in a sorry state; the math was all there, but the interface was cumbersome to say the least.

I (and quite a few others) have spent quite a bit of time fixing it up and I want to reintroduce you to this very helpful library.

Getting it

You can install it with npm, bower, component, jam, or just stick the script in your browser. It’s built with browserify and exports a UMD that should work in any module system (Dojo might give you some trouble still, but that is an upstream problem in browserify which will be fixed next time we build).

Usage

The global it exports, if you drop in the script, is proj4 (which is also the name of the npm, bower, and jam packages, though in component it’s proj4js/proj4js as it goes by repo), and proj4 is a function which will be all you need 90% of the time. You use it like so

proj4(fromProj, toProj, coords);

It’s that simple, a few notes:

  • fromProj and toProj can be either wkt strings or proj strings; note that there is no need to create a new Proj object or anything.
  • The coordinates can be in 2 formats, [x,y] or {x: x, y: y}; if you pass in an array you get one back, and if you pass in an object, you get that back instead. No point class needed.
  • If you only give one projection then it’s assumed you want to project from wgs84.
  • If you don’t pass in a coordinate, you get back an object with forward and inverse methods.
  • You can’t just pass in an EPSG code; there is no realistic way to store all of them easily. Most of the craziness of the old version related to trying to let people use arbitrary EPSG codes.

So if you want to convert a latlng to a projected coordinate,

var newCoords = proj4(wkt/projString, latlng);

If you want to avoid having to reparse the string each time

var trans = proj4(wkt/projString);

var newCoords = trans.forward(latlng);

var newLatLng = trans.inverse(xy);

Bottom line

Proj4 works for every projection I’ve been able to test against. Find a problem? Open an issue, or even better, open a pull request (the code is still ugly, but not as bad as it used to be thanks to browserify).

Guide to arguing about promises

If you’re using JavaScript you may have noticed that opinions about promises are much like hands: most people have one, many people have several. Here is your cheat sheet so you can pretend to care while just using the version that has shipped in Chrome and Firefox.

Promises vs Callbacks

The original argument, centering around what the default API in node should be. This is actually a rehash of an old argument, and promises never had a chance. They weren’t performant the first time around, and by the time they were, the ship had sailed on the API.

Bottom Line: always accept a callback, though feel free to return a promise too.

errors, foot guns, and jQuery

Using throw in a synchronous program when you have a problem is like sending up a flare, you get a stack trace, you can pause your script in dev tools, it’s great.

Using throw in async code is like sending up a grenade, you have no idea where it’s going to hit and it’s probably going to be unhelpful at best.

The mainstream promise implementation catches all errors and passes them to the second function you give to then. It also has a catch method, which is sugar for then(null, something). Many people believe that this is bad because it means that unless you explicitly handle an error it gets silenced, especially if the error is in your error handling code.
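
In other words (handler names made up):

somePromise.catch(onError);
// is exactly the same as
somePromise.then(null, onError);

and if onError itself throws, that new error is silenced too unless you attach yet another handler.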

jQuery famously subscribes to this and insists on waiting until 2 browsers ship the other kind of promises before even starting to change their promises.

A compromise suggestion that gets thrown around is a done method that rethrows errors. But you’re still requiring explicit action for errors not to be swallowed, and we are still throwing asynchronously.

Most importantly, most of this ignores the fact that in async code it’s often OK to ignore errors, because an error can handle itself, often by the callback not being called; and while the current setup is probably not the best, it is likely the least worst.

Bottom Line: you can’t remove all the foot guns from programming, because sometimes you need to rocket jump.

monad and chain

Promises have had resolve and reject methods to create already fulfilled promises for quite a while. The spec added a cast method which I always assumed was meant to coerce jQuery promises to real promises.

Somebody on es-discuss pointed out that resolve and cast do essentially the same thing. Like 50 comments later I bothered to read the thread, and what emerged was a group unhappy with the current spec.

Currently then is never called with a promise, always a value. In other words, in the current model async is like pregnancy: all or nothing. This is a benefit of promises because it lets you deal with nested async code in a much simpler manner.

There is another proposed promise method called chain, which unwraps a promise, but only one layer. In other words, if you fulfilled a promise with a promise, then would give you the value but chain would give you the inner promise.
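
A sketch of the difference (chain here is the proposal, not something you can run today):

Promise.resolve(Promise.resolve(5)).then(function (value) {
  // then keeps unwrapping, so value === 5, never a promise
});

// Promise.resolve(Promise.resolve(5)).chain(function (value) {
//   with chain, value would be the inner promise, unwrapped one layer
// });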

An argued use case for this (bear in mind I’m writing this on my cell phone while waiting for laundry, so this is entirely from memory) is when you want to act differently based on whether something contains a promise or a value, like a storage object that returns promises but can store values or promises. You might want to act differently if what you get back (inside the promise) is also a promise, which you can’t do with just then.

The main issue with this is that it’s trying to treat promises as a little bit async, which misses the point of the current implementation, where you would treat all promises you got back from the storage thing as similarly async and get on with your life.

The reason that cast/resolve relates is that while they act the same for then, they would act differently for chain. But in all honesty this argument relates more to the long running monad argument, which boils down to whether it is a good idea for promises to be implemented as a concept that nobody who understands it is able to explain, or whether a simpler model is better for them in the long run.

Bottom Line: do you care that promises aren’t monads?

Block Scoping JavaScript: no Automatic Curly Bracket Insertion

Riddle me this JavaScript user, when you write

if(something) doStuff();

what is that equivalent to? If you’re like me you might guess that, much like how semicolons are inserted by the parser, curly brackets are inserted as well, meaning that the above statement would be equivalent to

if(something){doStuff();}

But no, from this recent thread on esdiscuss it would seem that the above statement is equivalent to

void(something && doStuff());

Now currently these 2 methods of parsing the statement (for the purposes of our example) have identical results, but with ES6 and block scoped variables there will be a difference:

if(something){let x = 5;}

is not the same as

void(something && let x = 5);//not valid

If for no other reason than that the lower statement isn’t valid, as let x = 5; is a declaration, not a statement. But hopefully you get my point that

if(something) let x = 5;

would declare x in the outer scope and not, as one might expect, do nothing.

While this might be a bit confusing at first, what it really means is that, with a few exceptions, block scope and curly braces are synonymous and will always do exactly what they look like they are doing. HOORAY!!!

*the exception is the top part of a for loop (but not a while loop, as you can’t declare anything in its head), in that in for(let x in y){}, x is part of the inner scope.
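
A quick sketch of that exception:

for (let x in {a: 1, b: 2}) {
  console.log(x); // x is visible here, inside the loop body
}
// console.log(x); // would be a ReferenceError, x is scoped to the loop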

Exporting a function in ES6

Using generators via co to put Massachusetts case law into CouchDB

I’ve been hearing good things about generators for a while and was looking for a project to use them on; then someone posted a huge amount of case law and I had one. Maybe six months ago I scraped the Massachusetts General Court’s laws (think state legislature, but we’re a commonwealth so the name has to be ridiculous; we don’t have a DMV either, but I digress). So adding case law seemed the next logical step. I had used python last time, but that ended up being a poor choice, so this time I was going to do it in straight up JavaScript (node.js) to get all the laws into a database (CouchDB).

The cases were in individual xml files organized into folders to specify types, leaving 4 folders of court cases (department of industrial accidents, appellate court opinions, district court appellate division opinions, superior court opinions) with a total of 101,057 files (with apparently random filenames).

The steps I needed to take were, for each of those 4 types:

  1. get a listing of all the files in the folder, and for each file
  2. read its contents
  3. convert the xml to json
  4. get it into CouchDB

You will also notice that each of these is an asynchronous operation, which is where generators come in. I used co for flow control, which allows you to use generators to write asynchronous code in a synchronous way, which means that the main loop of my code was (more or less) this shape; a sketch, assuming promise-returning readdir, readFile, and parseXml helpers and a PouchDB instance named db:
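
var co = require('co');

// co 3.x style: co(...) returns a function you invoke to start things off
co(function* () {
  var types = ['dia', 'appeals', 'district', 'superior']; // made-up folder names
  for (var i = 0; i < types.length; i++) {
    var files = yield readdir(types[i]);
    for (var j = 0; j < files.length; j++) {
      var xml = yield readFile(types[i] + '/' + files[j], 'utf8');
      var doc = yield parseXml(xml); // assumes the parsed doc has an _id
      yield db.put(doc);
    }
  }
})();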

If you’re curious, here is the full code.

Your first reaction should be: well, so what, you’ve successfully made an async program synchronous. The key is that while it may be synchronous, it is still non-blocking, which means that if this was part of a bigger program (like a server) the program wouldn’t be idle while this part was waiting around.

That is also the case in this example: you’ll notice that I am using PouchDB to put the documents into CouchDB, and that PouchDB can act both as a client to a remote CouchDB instance and as a local instance (on node, using LevelDB). Local PouchDB instances are a lot faster than remote ones, so what I did was put the documents into a local one and have it replicate in the background to the remote one. Since the replication is also asynchronous, but slower than the main task of putting the stuff into the local database, whenever we are waiting around for files to be read or stuff to be put into LevelDB, PouchDB is moving more local stuff remotely.
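
Setting that up looks roughly like this (the remote URL is made up):

var PouchDB = require('pouchdb');
var local = new PouchDB('caselaw');

// push docs to the remote CouchDB in the background while we keep writing locally
local.replicate.to('http://example.com:5984/caselaw', {continuous: true});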

(Yes, I could have also removed the outer co and had it do each of the 4 folders at the same time, but that would have used up a lot more memory, and I was having this run in the background while I was doing other stuff.)

I haven’t made a site with the case law yet, but if you want it, it’s in a publicly accessible CouchDB instance at caselaw.calvinmetcalf.com (or if you just want to see it all, here is the link to all of the documents).

Best puzzle ever