JavaScript and Maps (in that order)

You probably don’t want to use the module pattern.

Every so often I see code (and this includes my old code) which uses the ‘module pattern’. This is when, instead of using a constructor to make an object with a prototype, you just create an object, add methods to it, and return it.
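A minimal sketch of what that tends to look like. The names makeCounter, count and increment are made up for illustration; they are not from any particular library.

```javascript
// Module pattern: build a plain object, attach methods, return it.
// makeCounter, count and increment are hypothetical names.
function makeCounter(start) {
  var counter = {};
  counter.count = start || 0;
  counter.increment = function () {
    counter.count += 1;
    return counter.count;
  };
  return counter;
}

var first = makeCounter(5);
var second = makeCounter();
first.increment();

// Every call builds a fresh copy of every method,
// and instanceof has nothing to check against.
var sharesMethods = first.increment === second.increment;
var isInstance = first instanceof makeCounter;
console.log(first.count, sharesMethods, isInstance);
```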

You probably don’t want to do this.  Why? Because the benefits don’t outweigh the costs. 

Benefits to the module pattern:

  • Conceptually simpler, especially if you aren’t familiar with how JavaScript inheritance works.

Benefits to constructor pattern:

  • Faster, JS engines have optimized it.
  • Uses less memory. 
  • instanceof operator works.
  • Can inherit with it easily.

Two other differences, which I’m calling a wash, are

  • Monkey patching, which is modifying a prototype method to change existing behavior. Some people like it, but I believe it is usually an awful idea. You can do it with constructors.
  • Data hiding, which is having truly ‘private’ data. This can sorta be done with constructors, but it breaks inheritability. I usually consider this bad, but some people like it.

Lest the feelings of anyone using the module pattern get hurt, I will point out that I myself have used the module pattern extensively in libraries. The good news is that it is easy to switch. Step one: have your module object be ‘this’ instead of an empty object, and delete the return statement.

There is no step 2. Well, that’s a lie: step 2 would be to go find all the places in your code that call it and add the new keyword. There are 2 other options: capitalize the new constructor and then make a lowercase factory function that calls it with new, or just use the instanceof trick, where the constructor checks whether ‘this’ is an instance of itself and, if not, calls itself with new.

I usually use the latter, as I find the factory pattern to be more complex with its multiple functions.
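Here is roughly what a converted module could look like with the instanceof trick. It is a sketch, and Counter, count and increment are illustrative names.

```javascript
// Counter, count and increment are hypothetical names.
function Counter(start) {
  // Called without `new`, `this` is not a Counter, so call again with `new`.
  if (!(this instanceof Counter)) {
    return new Counter(start);
  }
  var self = this; // this used to be: var counter = {};
  self.count = start || 0;
  self.increment = function () {
    self.count += 1;
    return self.count;
  };
  // the old `return counter;` is gone
}

var withNew = new Counter(1);
var withoutNew = Counter(1); // still works, at the cost of an extra call

console.log(withNew instanceof Counter, withoutNew instanceof Counter);
```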

Bear in mind that calling it without new adds another frame to the stack, so should be avoided except when necessary.

If you want to ‘hide data’ in your constructor you can do it here, using variables declared inside the constructor body, but bear in mind that we haven’t yet realized most of the benefits that come from constructors.
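One way the data hiding could be done, assuming a made-up Account constructor: the variable lives in the constructor's closure, so only methods defined inside the body can see it.

```javascript
// Account, deposit and getBalance are hypothetical names.
function Account(initial) {
  if (!(this instanceof Account)) {
    return new Account(initial);
  }
  var balance = initial || 0; // private: not a property of `this`
  this.deposit = function (amount) {
    balance += amount;
    return balance;
  };
  this.getBalance = function () {
    return balance;
  };
}

var account = new Account(10);
account.deposit(5);
console.log(account.getBalance()); // 15
console.log(account.balance);      // undefined, the variable is not exposed
```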

To get the most benefits out of using prototypes etc you need to actually use prototypes by moving your method definitions outside of the body of the constructor. 
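A sketch of the same kind of object with its methods moved onto the prototype (ProtoCounter and its methods are illustrative names):

```javascript
// ProtoCounter, increment and incrementLater are hypothetical names.
function ProtoCounter(start) {
  if (!(this instanceof ProtoCounter)) {
    return new ProtoCounter(start);
  }
  this.count = start || 0;
}

// Defined once and shared by every instance.
ProtoCounter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};

ProtoCounter.prototype.incrementLater = function (callback) {
  var self = this; // has to be captured in each method that needs it
  setTimeout(function () {
    callback(self.increment());
  }, 0);
};

var p = new ProtoCounter();
var q = new ProtoCounter(10);
console.log(p.increment === q.increment); // true, one shared function
```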

Note that you can’t hide data this way, and if you need to alias ‘this’ so it can be used in callbacks, you have to do it in each method; you can’t do it once. But all the functions only get defined once, so it is significantly faster.

Note: for the love of God, we are not talking about AMD and CommonJS modules. We are talking about an object construction pattern which happens to have the same word in its name and which exists in the wild, sometimes wrapped in an IIFE.

regex + eval = crazy delicious

A while back someone (who shall remain anonymous because I have made this type of category mistake before) opened an issue for an ajax library I have, complaining about the security risks of using eval as a fallback to parse JSON in browsers that don’t support JSON.parse. What was notable was that the library also supports JSONP, which the issue made no note of.

So why exactly does eval (and friends like Function and setTimeout with a string) get so much FUD? It’s not just the security: preventing eval doesn’t do much good when any script can add arbitrary script tags. I mean, you’re closing the barn doors after the cows have left if you’re preventing the Function constructor on a page that has jQuery (with its $.jsonp).

This is not to say that there are no security issues. There are, but you can’t deal with them by just avoiding eval. Content Security Policies (CSP) are the way to do that; they can prevent pretty much everything I talk about here, except maybe the blob worker, though I’m not sure. But they are beyond the scope of this article.

One reason has to do with historical misuse of eval. Take a look at this snippet, which is honest-to-god in-the-wild code

This file (it is downloaded with a bunch of files which start with ‘dnn’, so someone is going to figure it out: it’s this script, from this page, part of a Dot Net Nuke CMS site) uses eval nearly 340 times for string concatenation. 340 times they create a new instance of the virtual machine in order … in order to do nothing. The script would act exactly the same without the eval (except it would be faster).

Brendan Eich mentioned how back in the day, people wouldn’t realize you could do obj[key] and instead would do eval(“obj.”+key).
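To make the comparison concrete, here is a small sketch with a made-up settings object. Both lookups return the same thing, but only one of them needs eval.

```javascript
// settings and key are hypothetical names.
var settings = { color: 'red', 'font-size': 12 };
var key = 'color';

var viaEval = eval('settings.' + key); // works, but builds and parses new code
var viaBrackets = settings[key];       // plain property lookup, no eval needed

// Bracket access also copes with keys that are not valid identifiers;
// eval('settings.font-size') would be parsed as a subtraction instead.
var fontSize = settings['font-size'];

console.log(viaEval, viaBrackets, fontSize);
```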

This is the reason eval is ‘evil’: unless you actually know what you are doing, you’re more likely to shoot yourself in the foot than anything else, so for all intents and purposes you should consider it evil and move along.

eval and friends

While eval is well known, when we talk about eval we are actually talking about a couple of things. First you have the Function constructor: new Function(arg1, arg2, string) creates a new function with the last argument as the body and the other arguments as the function’s arguments. In essence it is eval applied to a function expression built from those strings.
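A rough sketch of that equivalence, with one difference worth knowing: a function built by the Function constructor is created in the global scope, while a direct eval can see local variables. The names add, addViaEval and scopeDemo are just for illustration.

```javascript
var add = new Function('a', 'b', 'return a + b;');

// Roughly the same thing spelled with eval.
var addViaEval = eval('(function (a, b) { return a + b; })');

function scopeDemo() {
  var local = 'only visible inside scopeDemo';
  var viaEval = eval('(function () { return typeof local; })');
  var viaFunction = new Function('return typeof local;');
  // eval closes over `local`; the Function constructor does not.
  return [viaEval(), viaFunction()];
}

console.log(add(2, 3), addViaEval(2, 3), scopeDemo());
```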

So you’re running eval on the body of it. The other ‘classic’ evalesque functions are setInterval and setTimeout, when you give them a string as the first argument instead of a function. Unlike the Function constructor, which occasionally has some legit uses, passing a string to setTimeout and setInterval doesn’t, ever.

There are several other JavaScript things which, for security purposes, may as well be the same as eval. The first is to execute a script via the DOM, either by inserting a script tag into the DOM and setting its src, loading an iframe which has a script tag in it, setting the onload property of an image, or a myriad of other ways. The next is loading a web worker, either from a url or from a string via a blob url. The last is the web worker’s importScripts function, which loads and executes a script.

Using the DOM isn’t as bad as the previous methods performance-wise, as it doesn’t create a new VM, and when creating a worker a new VM is the point, so it’s a legit trade-off. Workers are somewhat less of a risk, as they have no access to the DOM and can only message the main thread, not execute anything on it. On the other hand, workers can do ajax requests, can load scripts via the importScripts function, and can run eval, so security-wise there are trade-offs.

When eval isn’t evil

Until the ES6 Proxy object, eval was the only way to metaprogram. For instance, in CouchDB you write a function in the database which calls a function called ‘emit’ with the result.

The only way to implement this in JavaScript for PouchDB, and have the emit inside that function refer to the emit we want rather than the emit its closure would usually refer to, is to use eval.

Eval and its friends, mostly blob workers, are used extensively by Catiline, a library which creates a new API on top of web workers that makes them easier to use in libraries and also provides fallbacks for browsers that don’t support workers. The way to write a function in the current scope but run it inside a web worker is to use the toString method of functions and then run regexes on the result before opening it in the worker (or an iframe as a fallback).
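A sketch of the general idea, not PouchDB’s actual code: take the source of the user’s map function and re-evaluate it somewhere an emit of our choosing is in scope. The helper runMap and the sample documents are made up for the example.

```javascript
// runMap, userMap and the docs are hypothetical; this is not PouchDB's implementation.
function runMap(mapFn, docs) {
  var rows = [];
  function emit(key, value) {
    rows.push({ key: key, value: value });
  }
  // Rebuild the function from its source with our `emit` as a parameter in scope.
  var rebound = new Function('emit', 'return (' + mapFn.toString() + ');')(emit);
  docs.forEach(function (doc) {
    rebound(doc);
  });
  return rows;
}

// Written CouchDB style: `emit` is a free variable here.
var userMap = function (doc) {
  if (doc.type) {
    emit(doc.type, 1);
  }
};

var rows = runMap(userMap, [{ type: 'a' }, { name: 'no type' }, { type: 'b' }]);
console.log(rows);
```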

The Function constructor can be used to make very fast templates, as the template is compiled once into a function which concats strings and nothing else.
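A toy version of such a template compiler, assuming a made-up {{name}} placeholder syntax. Real template libraries do a lot more (escaping, for one), so treat this as a sketch of the technique only.

```javascript
// compile and the {{name}} syntax are assumptions made for this example.
function compile(template) {
  // split with a capture group: even indexes are literal text, odd are names
  var parts = template.split(/\{\{(\w+)\}\}/);
  var body = parts.map(function (part, i) {
    return i % 2 ? 'data[' + JSON.stringify(part) + ']' : JSON.stringify(part);
  }).join(' + ');
  // The result is a function that only concatenates strings.
  return new Function('data', 'return ' + body + ';');
}

var greet = compile('Hello {{name}}, you are {{age}}!');
var greeting = greet({ name: 'Ada', age: 36 });
console.log(greeting); // Hello Ada, you are 36!
```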

Axel Rauschmayer points out that eval can be very effective in carefully controlled offline situations to make config files much more expressive.

Thomas Gratier pointed me to an article by Nick Zakas on using eval in a CSS parser.

Regex + eval as a combination can allow metaprogramming as powerful as macros in other languages. Of course these are unhygienic as hell (but this never stopped certain LISPs) and have some major performance penalties, so while you can do amazing things with regex + eval, you can also shoot yourself in the foot.

In other words, if you don’t know what you’re doing, you should just go ahead and consider eval evil, but if you do, and you’re careful, you can do some amazing things.

Immediately Invoked Function Expressions

Note: if you aren’t familiar with what an IIFE is, skip to the bottom it should be obvious where.

For some reason I have always written my IIFEs as ‘(function(){})();’. I’ve noticed that others use ‘(function(){}());’ and decided to investigate. I posted a query on Twitter, which @rauschma helpfully retweeted, and here is what I learned.

(fn)() and (fn()) are for all intents and purposes the same; the few differences (hat tip Ben Menoza) are extremely superficial.

You can also use !function(){}(), void function(){}(), or ~function(){}() for the same purpose. ~function(){}() is just silly. void function(){}() requires using void, which I can think of no good reason to ever use. !function(){}() might be a good idea in code not intended to be human readable (e.g. code produced by a minifier), but it is not very clear when used in human readable code, for the same reason ~array.indexOf() is bad.

It’s probably a better idea to use (fn()) over (fn)() because 

  1. (fn)() looks like dog balls, which in context means that there is stuff related to it that is hanging outside of it (this is not a problem specific to canine testicles)
  2. If you are in a situation where function(){}() works (like assigning to a variable or using as an argument) then you can just remove the outer parentheses instead of removing inner ones. 

Edit:

It was pointed out to me that there is a second type of IIFE: you can use var foo = new function(){}; to create a new object (a singleton, in the jargon). You don’t actually have to assign it to a variable, but if you don’t, it’s no different than a regular IIFE. This works as written because the new operator allows you to omit parentheses if they aren’t needed (aka new Constructor(); is the same as new Constructor;), so new function(){}; is the same as new function(){}(); I didn’t cover this because:

  1. I’d never heard of this pattern
  2. It serves a different purpose. 
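For the curious, a small sketch of that new function(){} form (registry and add are made-up names). The anonymous function is used as a one-off constructor, so the result is an object rather than the function’s return value.

```javascript
// registry and add are hypothetical names.
var registry = new function () {
  var items = []; // private to this one object
  this.add = function (item) {
    items.push(item);
    return items.length;
  };
};

console.log(typeof registry);    // object
console.log(registry.add('x'));  // 1
```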

IIFE?

An Immediately Invoked Function Expression (IIFE) is a term in JavaScript for when you want to define a function and then immediately invoke it. Due to parsing rules, if you try to write function(){}() you will get an error. This is because if function is the first thing the parser sees, it assumes you’re writing function myFunctionName(){} and is then disappointed when it sees a ‘(’. For this same reason you can’t type function(){} into the console; you’ll get the same error. If you’re curious, the technical explanation for the parse error is that the parser is expecting a function declaration, which requires a name, but is getting what was meant to be a function expression.
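The forms discussed in this post, side by side. This is only a sketch to show that they all run; the calls array is there so the result can be inspected.

```javascript
var calls = [];

(function () { calls.push('(fn)()'); })();
(function () { calls.push('(fn())'); }());
!function () { calls.push('!fn()'); }();
void function () { calls.push('void fn()'); }();

// Already in expression position, so no wrapper is needed.
var answer = function () { return 42; }();

// The bare form is parsed as a declaration with a missing name.
var bareFormFails = false;
try {
  new Function('function(){}()');
} catch (err) {
  bareFormFails = err instanceof SyntaxError;
}

console.log(calls, answer, bareFormFails);
```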

Browserify/NPM: Package Manager Odyssey Part IV

Part 4 in a however-long-I-end-up-making-it series; see the intro, or the previous ones on Jam and Component.

Today we are talking about a very different type of client side package manager. The reason it’s different than the previous two is that it isn’t a client side package manager, it’s a server side one: NPM, which you have probably heard of. The only reason we are talking about it is because of the program substack wrote called browserify.

Browserify can end up being used a couple of different ways. You could use it as a drop-in replacement for component to build a library; the API is almost identical. To create a standalone file ‘foo.js’ in directory ‘dist’, with an entry point of file ‘bar.js’, as a standalone UMD bundle called ‘baz’, in component it’s

component build -o dist -n foo -s baz (you specify bar.js in the component.json)

in browserify you do:

browserify -o dist/foo.js -s baz bar.js

The main difference is that browserify doesn’t double as a package manager, so all your dependencies are handled by npm. This is a big plus, as npm has a significant percentage of all JavaScript libraries, even ones that are client side only. This means you can do npm install backbone --save and then just require it in your app.

For trickier libraries like jquery, where you can’t do npm install for them (the jquery in npm is not the real one), you can do bower install jquery and then use the ‘require’ option when you build: ‘--require ./bower_components/jquery/jquery.js:jquery’ allows you to do ‘var $ = require('jquery');’.

The other thing browserify does is add shims for the node standard library, meaning you can use, say, path to manipulate path strings, or crypto to hash things, or zlib to unzip stuff. As long as a module doesn’t try to do anything that the browser doesn’t allow (e.g. use the fs library) you can require it, meaning you can require modules even if their author didn’t opt in.

One of the reasons this series has taken so long is that I have been building real honest to goodness libraries and apps with the different things I’m covering. With browserify I built a library for parsing ESRI file geodatabases and was nicely able to make a node library, a browser library, and an app.

I’m also in the process of converting proj4js to browserify, and I converted the dev branch of my Massachusetts law viewer as well (which I’ll be speaking about at CouchDB conf btw, if you want to come you can save $25 with this link). I feel like I say this in each blog post, but I’m going to be making my apps with this going forward for the following reasons:

  • Being able to use almost all of the modules in npm, without the author having to explicitly design them for this package manager, means that you have access to an order of magnitude more modules. The no opt-in for modules is really the killer feature in many ways.
  • The ability for libraries like curl to asynchronously load CommonJS modules really is the final nail in the coffin for AMD. CommonJS is the clear winner (unless TC39 decides to come to its senses and bring something besides Python’s module system to the browser).
  • Browserify also allows complex transforms for those who use CoffeeScript or other such things.

Next time (whenever that is) I’ll probably be looking into bower by way of yeoman, a package manager setup that doesn’t actually compete with npm/browserify.

Proxies in ES6 something.*()

I had an interesting problem today involving catiline, my library for web workers: how do I fake console in all its various forms in the worker? This is easy for log, warn, etc., as it just sends back the method names, but there are quite a few of them and they aren’t universally supported in different browsers. In the end I just hard coded a list of the ones I’d go with, but what I really wanted was to be able to create something that allows me to do console.* = function…

As it turns out you can do this with proxies in ES6, and that’s just the beginning.
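A sketch of the console.* idea using the current Proxy API. In a real worker the handler would post a message to the main thread; here a plain array called sent stands in for that, and both sent and fakeConsole are made-up names.

```javascript
// sent and fakeConsole are hypothetical names; `sent` stands in for postMessage.
var sent = [];

var fakeConsole = new Proxy({}, {
  // Called for every property lookup, whatever the name is.
  get: function (target, name) {
    return function () {
      sent.push({ method: name, args: Array.prototype.slice.call(arguments) });
    };
  }
});

fakeConsole.log('hello');
fakeConsole.table([1, 2]);
fakeConsole.someMethodThatDoesNotExistYet('still fine');

console.log(sent.map(function (entry) { return entry.method; }));
```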

A couple notes: that will only work in Firefox, as V8 has an older version of Proxy (edit: this shims the new Proxy via the old Proxy). Also, I seem to remember you don’t need to use the new keyword for proxies, but Firefox throws an error if you omit it.

The way it works is that we are intercepting the call to look up the key and then building the function for that key. There is no reason we couldn’t return a value in certain cases, for instance with a handler which makes the method name lowercase and tries to return a value if it’s all caps.

Now all of this only uses the ‘get’ option, the full list is here
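One possible reading of the lowercase/all-caps idea, as a sketch. The constants object and the shouty proxy are invented for the example: an all-caps name that matches a lowercase key gets a value back, anything else gets a function.

```javascript
// constants and shouty are hypothetical names.
var constants = { pi: 3.14159 };

var shouty = new Proxy(constants, {
  get: function (target, name) {
    if (typeof name !== 'string') {
      return target[name]; // leave symbol lookups alone
    }
    var lower = name.toLowerCase();
    if (name === name.toUpperCase() && lower in target) {
      return target[lower]; // all caps: hand back a value
    }
    return function () {    // anything else: build a method
      return lower + ' called';
    };
  }
});

console.log(shouty.PI);      // 3.14159
console.log(shouty.Hello()); // hello called
```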

If you’re into Aspect Oriented Programming then this is going to be your dream come true, as you can intercept function calls before they touch the function, for example by wrapping the call in a try statement to make sure it runs.
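A sketch of the try-wrapping idea using the apply trap. The function risky and the choice to return null on failure are assumptions made for the example.

```javascript
// risky and safe are hypothetical names; returning null on failure is an arbitrary choice.
function risky(n) {
  if (n < 0) {
    throw new Error('negative input');
  }
  return Math.sqrt(n);
}

var safe = new Proxy(risky, {
  // Runs instead of the function whenever the proxy is called.
  apply: function (target, thisArg, args) {
    try {
      return target.apply(thisArg, args);
    } catch (err) {
      return null;
    }
  }
});

console.log(safe(9));  // 3
console.log(safe(-1)); // null
```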

Some more links:

Component: Package Manager Odyssey Part III

See the intro and the previous post on jam

Component is a package manager by TJ Holowaychuk, best known for the Express web server, the mocha testing framework and like 40 other things. Component has you write your modules using regular node style commonjs modules and then adds the AMD boilerplate automatically.

It uses github as its package registry, so instead of asking for a package by name, you request it by author/name. Component specific details are stored in a component.json file.

Building and writing libraries with component is just heavenly after coming from the requirejs world. With requirejs everything is extremely thoroughly documented, but you still can’t figure out how to do a single thing because it’s like a textbook with no index. Component, on the other hand, doesn’t have much documentation, but you don’t need it because it has sensible defaults. In total my build has 3 options: name, because I want the out file to be something other than ‘build’; out, because I want it in a directory not named build; and standalone, because I want to give it the boilerplate to make it work in node, amd, and in a browser. In another stroke of just being user friendly, standalone takes a value, which is the name to use in the browser.

The main downside of component is the exact same downside jam had: insufficient traction and interoperability. This is even more frustrating for component because it doesn’t have a database. If it was to use the package.json like jam and to assume some sensible defaults, then a significant percentage of packages on github would be usable in component without the author doing anything.

The author/packagename binomial nomenclature is also kind of annoying. If node was like this, then to install TJ’s package ‘express’, instead of just npm install express I’d need to remember who wrote it, then I’d have to remember the author’s username, which in TJ Holowaychuk’s case is visionmedia, unless he decides at some point to transfer it to an org. You can see the problem. Later we’ll look at bower, which has a system that solves the problem of using github as a database while also having easy names.

Unlike jam, when you install dependencies you don’t automatically get a usable requirejs file; you need to compile your app.

Overall the build tools of component are fantastic but the actual package manager less so, and it doesn’t make throwing together an app quite as easy as jam.

If someone was to combine the build tool from component with jam’s autoconfigured require file and use of package.json, and it knew to look on github to actually download the file…that would be the package manager I’d want to use.

Package Manager Odyssey Part II: JAM So Close But Not There Yet

You can read part I here.

Jam is a package manager that works only with AMD style modules and is built around being consumed by amd style apps. Adding it to a project is super easy. It just uses a jam key in the package.json, with keys inside the jam object overriding the values in the global namespace.

This means that if the file pointed to by the ‘main’ key in your package already works as an AMD module and that file needs no dependencies, you don’t have to do anything to your package.json to publish it.

Once you install jam (in npm as jamjs NOT jam) and add some dependencies to the jam object in your package.json, you can then just use jam install and it makes a jam folder that, in addition to your dependencies, has a file called require.js and a file called require.config.js, which are a full copy of require.js with all the dependencies configured and just the config respectively. Meaning all I need to do is point a script tag to jam/require.js and then require what I need. Simple, and it even works in a web worker; you can see the repo and the results.

The first major downside is the small number of packages in the repository. AMD isn’t universally popular, and a lot of the modules in there are shimmed things put in by other people and may or may not be up to date; it isn’t a coincidence that the two libraries I’m using from jam I put there myself. It also doesn’t help that you have to manually publish your module to jam every single time, in contrast to component and bower, which do not require that. Even more annoying is its refusal to remember my login.

You may also notice that the two modules I grab from jam I don’t build with jam. That’s because building things with jam has been an exercise in frustration: jam’s compiler is a thin wrapper around r.js, and r.js does not believe in ‘just working’. These are the options I pass to r.js to get a umd module for proj4

For comparison I have yet to figure out how to do this for jam, while component (which we’ll deal with next time) needs a single option to do that (well technically 3, one for output folder, one for output name, one for umd). 

Overall I feel like Jam is exactly what I want for a package manager for making apps, but its lack of packages means I’m probably going to have to use npm for some of my dependencies unless I put them into jam myself.

For libraries jam inherits all the issues that plague rjs, requirejs and amd modules in general, namely a lack of sensible defaults and a general set of assumptions that work fine if everyone is using AMD modules but more or less explode into an awkward mix of shims and crying in the real world. If I have to use AMD modules to put together a library I’m probably going to just use npm, because I can then write my own config that will also work with curl in addition to requirejs.

Next up will be component, a library which more or less decided to make the exact opposite choice Jam did at every step along the line.

2013: A client side package manager odyssey

NPM dominates the server side JavaScript environment, and for good reason: NPM is one of the draws of node because it is so easy and simple to use. The attempts at recreating this for client side development have led to a couple of problems, exacerbated by the lack of a standardized module system (until very recently).

Node and NPM use commonjs modules, which work well for that environment. The main issue in the browser is that they can’t just be dropped into a script tag and used as is. There are a couple of solutions

  • Use a program to compile your modules into a program.
  • Wrap your modules in a function

Good thing that JavaScript developers are a mature group of individuals who can work together to come up with a great solution together … just kidding it’s a shit show with 4 different package managers, they are:

  1. browserify by substack, which makes node modules from npm work in the browser through dark sorcery
  2. bower, from twitter, which is framework, builder, and language agnostic. It’s more or less a glorified version of wget (that was going to be a curl joke but that would have been confusing)
  3. jam by caolan, a newer repo built around amd modules
  4. component by TJ Holowaychuk
  5. Volo as suggested on twitter

More in this series

Sets in ES6

Sets from the ES6 spec have been implemented in browsers for a while now but without iteration they have had limited utility. 

Iteration of collections has landed in Firefox Nightly (note: check on canary) so we can do stuff with them now.

Sets can be thought of as Maps without keys, or (sorta) un-ordered arrays. Previously you could somewhat create a set of strings and numbers by using an object where all the values were true.

The use case of a Set is when you want an array but don’t care about order and you don’t want duplicates.

Now remember when I said it was sorta unordered? That’s because the enumeration order is predictable: values are always enumerated in the order that they were added. So what? Well, also bear in mind that, unlike array iteration, you can add things to the set while you are looping over it, and if they are new to the set they will definitely be iterated.
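A quick sketch of the old object trick next to a real Set (seen and realSet are illustrative names). The object version turns every key into a string, which is one reason it only somewhat works.

```javascript
// seen and realSet are hypothetical names.
var seen = {};
seen['apple'] = true;
seen[42] = true;
seen['42'] = true; // same key as the number 42 once it is stringified

var realSet = new Set();
realSet.add('apple');
realSet.add(42);
realSet.add('42'); // a different value from the number 42

console.log(Object.keys(seen).length, realSet.size); // 2 3
```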

A few notes:

  • Adding the same object a second time is ignored.
  • Objects are mutable and compared by reference, not value, so if you have an array ‘a’, add it to a set and then do a.pop(), then if you add it again nothing will happen; it’s already in there.
  • If you delete and re-add an object, its position in the order is now the most recent, not its old position.
  • This means if you add in object a, then object b, then a again, then c, then delete and re-add a, and add in e, the order is b,c,a,e. The order is not a,b,c,e (back where it was) or b,a,c,e (the adds are somehow cumulative)
  • A PHP or Haskell programmer WILL get an angry article about the above mentioned fact onto the front page of HN in about 6 months, I expect a link to this post.
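The ordering rules from the notes above, as a runnable sketch. The objects a, b, c and e only carry a name so the order can be printed.

```javascript
var a = { name: 'a' };
var b = { name: 'b' };
var c = { name: 'c' };
var e = { name: 'e' };

var set = new Set();
set.add(a);
set.add(b);
set.add(a);    // ignored, already present
set.add(c);
set.delete(a);
set.add(a);    // goes to the end, not back to its old slot
set.add(e);

var order = Array.from(set).map(function (item) { return item.name; }).join(',');
console.log(order); // b,c,a,e

// Values added while iterating are visited too.
var visited = [];
var numbers = new Set([1]);
numbers.forEach(function (n) {
  visited.push(n);
  if (n < 3) {
    numbers.add(n + 1);
  }
});
console.log(visited.join(',')); // 1,2,3

// Membership is by reference, so mutating an array does not make it new.
var list = [1, 2, 3];
var lists = new Set([list]);
list.pop();
lists.add(list);
console.log(lists.size); // 1
```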