The Latest in Mongoose 2.x

Mongoose is now at version 2.6.0 so I thought I’d share a few features that have been added in recent weeks that you may have missed in the release notes.

Default Path Selection

If your schema has paths that tend to contain large amounts of data, say images stored as Buffers that are rarely needed in your application, Mongoose now provides the select schema option which allows you to exclude that field from all of your queries by default.

new Schema({ img: { type: Buffer, select: false }})

Now, when you query your collection, the img field will not be included.

Products.findById(id, cb);
function cb (err, product) {
  if (err) return console.error(err); // handle err
  console.log(product.img) // undefined
}

To get the img back you may select the field normally:

Products.findById(id).select('img name etc').exec(cb);
function cb (err, product) {
  if (err) return console.error(err); // handle err
  console.log(product.img) // binary data
}
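
One way to picture this behavior is as a default projection that an explicit selection overrides. The sketch below is a toy model in plain JavaScript, not how Mongoose implements it; defaultProjection and the shape of paths are made up for illustration.

```javascript
// Toy model only: `paths` mimics a schema definition object and
// `defaultProjection` is a hypothetical helper, not a Mongoose API.
function defaultProjection (paths, selected) {
  // an explicit selection always wins
  if (selected && selected.length) {
    var fields = {};
    selected.forEach(function (name) { fields[name] = 1; });
    return fields;
  }

  // otherwise exclude every path declared with select: false
  var excluded = {};
  Object.keys(paths).forEach(function (name) {
    if (paths[name] && paths[name].select === false) excluded[name] = 0;
  });
  return excluded;
}

var paths = { name: String, img: { type: Buffer, select: false } };

console.log(defaultProjection(paths));                  // { img: 0 }
console.log(defaultProjection(paths, ['img', 'name'])); // { img: 1, name: 1 }
```

The point of the sketch is the precedence: the schema only supplies a default, and anything we select by hand replaces it.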

Getter/Setter SchemaType Introspection

Getters and setters are now passed their parent SchemaType giving them the ability to tailor logic based on options specified in your schema. For example, say you want to set the default title when a special value is passed:

// create a schema with a default title
var schema = new Schema({ title: {
    type: String
  , default: 'Please enter a title'
  , set: setter 
}})

// our setter that returns the default value if
// '-reset' is passed
function setter (val, type) {
  if ('-reset' == val)
    return type.getDefault(this);
  return val;
}

// generate our model
var Model = db.model('IntrospectionExample', schema);

// create an instance
var m = new Model({ title: 'Avengers' });

// see it in action
console.log(m.title) // 'Avengers'
m.title = '-reset';
console.log(m.title) // 'Please enter a title' :D

The same is true for getters.
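
To make the getter case a bit more concrete without needing a database, here is a small stand-in written in plain JavaScript. ToyType is not a Mongoose class and maxLength is a made-up option name; the only thing the sketch tries to show is a getter reading an option from the type it is handed as its second argument.

```javascript
// Toy stand-in for a SchemaType. It only keeps the options it was
// declared with; everything else is simplified for illustration.
function ToyType (options) {
  this.options = options;
  this.getters = [];
}

ToyType.prototype.get = function (fn) {
  this.getters.push(fn);
  return this;
};

// each getter receives the current value and the type itself
ToyType.prototype.applyGetters = function (value, scope) {
  var type = this;
  return this.getters.reduce(function (val, fn) {
    return fn.call(scope, val, type);
  }, value);
};

// a getter that tailors its output to an option declared on the type
// (`maxLength` is a hypothetical option)
function truncate (val, type) {
  var max = type.options.maxLength;
  if ('string' == typeof val && val.length > max)
    return val.slice(0, max) + '...';
  return val;
}

var title = new ToyType({ type: String, maxLength: 8, get: truncate });
title.get(title.options.get);

console.log(title.applyGetters('Avengers Assemble')); // 'Avengers...'
console.log(title.applyGetters('Thor'));              // 'Thor'
```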

Tailable Queries

Queries now have the tailable option which lets us tail a collection much like the Unix tail -f command.

var stream = Model.find().tailable().stream();

Tailable queries can only be used on capped collections, can only return documents in their natural order, and never use indexes. Unless the cursor dies, a tailable QueryStream will remain open and receive documents as they are inserted into the collection. Check out this gist for an example.
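
If capped collections are new to you, the following toy model may help. It is plain JavaScript with an in-memory array, so it says nothing about how MongoDB or Mongoose implement cursors; ToyCappedCollection and its methods are invented for this sketch. It only illustrates the three properties above: fixed size, natural (insertion) order, and a listener that keeps receiving documents as they arrive.

```javascript
var EventEmitter = require('events').EventEmitter;

// Toy in-memory stand-in for a capped collection: fixed size,
// insertion ("natural") order, oldest documents dropped first.
function ToyCappedCollection (max) {
  EventEmitter.call(this);
  this.max = max;
  this.docs = [];
}
ToyCappedCollection.prototype = Object.create(EventEmitter.prototype);

ToyCappedCollection.prototype.insert = function (doc) {
  this.docs.push(doc);
  if (this.docs.length > this.max) this.docs.shift();
  this.emit('insert', doc);
};

// replay what is already stored, then keep listening for new inserts
ToyCappedCollection.prototype.tail = function (onData) {
  this.docs.forEach(function (doc) { onData(doc); });
  this.on('insert', onData);
};

var log = new ToyCappedCollection(2);
log.insert({ n: 1 });
log.insert({ n: 2 });
log.insert({ n: 3 }); // { n: 1 } is dropped

var seen = [];
log.tail(function (doc) { seen.push(doc.n); });
log.insert({ n: 4 });

console.log(seen); // [ 2, 3, 4 ]
```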

You Rock

Much of what happens in Mongoose is driven directly by your code and suggestions. If you have thoughts or ideas about Mongoose, stop by our Google Group and chime in, or if you want to really make a difference, write a bug fix and open a pull request!

The Future

Features for the 2.x branch are winding down (but not bugfixes) while we move closer towards a 3.0 release. Stay tuned for an upcoming post about the Mongoose roadmap and our vision for the future of Node and MongoDB.

CharlotteJS Turns Two

This month's CharlotteJS meetup marks its two-year anniversary! How far have we come? Here are some stats:

Most active week of membership joins: March 11, 2010 (32 joined)

Biggest meetup - JS 101: July 2011 (46 attendees)

All in all, it's been a blast. Here's to the next two years!

Durable Writes in Mongoose

By default, Mongoose sends the safe: true option with every call to MongoDB. This, we feel, is the smart default because for most of our applications we want to know when any write fails (if you were unaware, MongoDB does not report errors by default, a practice which only persists due to backwards compat reasons).

Knowing when errors occur is good, but in our case at LearnBoost where we run replica sets, we really want to know whether our data persisted to two or more replicas.

MongoDB is nice in that it lets us configure just how durable each write is. We first started out by globally defining our durability settings as follows:

var options = { db: {}};
options.db.strict = {
    j: 1 // the write must commit to the journal
  , w: 2 // the write must propagate to at least 2 nodes
  , wtimeout: 10000 // timeout after 10 seconds
}
mongoose.createSetConnection(uri, options);

The intention was to take advantage of node-mongodb-native’s global db options so we didn’t need to configure this for every model, which doesn’t work with Mongoose. It fails because when a document is saved, the safe option of the model’s schema overwrites the global db options, setting it back to true. Not obvious at all.

What does work is setting this using a global mongoose plugin:

mongoose.plugin(function (schema) {
  schema.options.safe = {
      j: 1            // the write must commit to the journal
    , w: numNodes     // the write must propagate to at least numNodes nodes (2 for us)
    , wtimeout: 10000 // timeout after 10 seconds
  };
})

Which works perfectly. Each schema will have this plugin applied. Now instead of true being passed, our custom safe option is passed and we didn’t really need our app to be strict anyway.

Yes, another option is to just copy this manually into every schema we have, but our goal was to set these options “globally”.

So there is definitely room for improvement in Mongoose here. I like the way node-mongodb-native does it, where connection options are overridden by collection options, which are in turn overridden by per-document update settings. There’s actually no way to pass options on a per-document basis with Mongoose unless you are using Model.update.
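
The layering I have in mind can be sketched in a few lines. This is only an illustration of the precedence order; resolveSafe is a hypothetical helper and is not part of Mongoose or node-mongodb-native.

```javascript
// Hypothetical helper: picks the most specific write setting that was
// provided. Not part of Mongoose or node-mongodb-native.
function resolveSafe (connectionOpts, collectionOpts, operationOpts) {
  var levels = [operationOpts, collectionOpts, connectionOpts];
  for (var i = 0; i < levels.length; i++) {
    if (levels[i] && levels[i].safe !== undefined) return levels[i].safe;
  }
  return false; // nothing configured anywhere
}

var connection = { safe: { j: 1, w: 2, wtimeout: 10000 } };
var collection = {};
var fastWrite  = { safe: false };

console.log(resolveSafe(connection, collection));
// { j: 1, w: 2, wtimeout: 10000 }
console.log(resolveSafe(connection, collection, fastWrite));
// false
```

With something like this, the durable settings live in one place and a single write can still opt out when it needs to.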

Hopefully this post helps you save a little time trying to get your Mongoose apps configured properly.

MongoDB sharding with Mongoose

Mongoose 2.5.3 has support for sharding. So what do we need to do to enable it? Not much. Just include the shardkey in the schema options and Mongoose does the rest.

var LoginSchema = new Schema({ 
    at: Date
  , _user: ObjectId
}, { shardkey: { at: 1, _user: 1 }})

// Boom

How does it work? During calls to doc.save() Mongoose checks for the schema shardkey. If it exists it grabs the appropriate values and includes them in the where clause. These values must be included or the save() will fail and an error will be returned.
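
Roughly speaking, the where clause is built along these lines. This is a simplified sketch in plain JavaScript, not the code Mongoose runs; saveConditions is a hypothetical helper and the ids are placeholder strings.

```javascript
// Hypothetical helper showing the idea; Mongoose's internals differ.
function saveConditions (doc, shardkey) {
  var where = { _id: doc._id };
  Object.keys(shardkey || {}).forEach(function (path) {
    if (doc[path] === undefined)
      throw new Error('missing shard key value: ' + path);
    where[path] = doc[path];
  });
  return where;
}

var shardkey = { at: 1, _user: 1 };
var login = { _id: 'id1', at: new Date(0), _user: 'user1', ip: '127.0.0.1' };

console.log(Object.keys(saveConditions(login, shardkey)));
// [ '_id', 'at', '_user' ]

try {
  saveConditions({ _id: 'id2', at: new Date(0) }, shardkey);
} catch (err) {
  console.log(err.message); // missing shard key value: _user
}
```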

Note that setting up and configuring your shards is not handled by Mongoose. You still need to set this up first, then add your shard key to your schemas.

As always, if you find any bugs please report them here or bring any questions to our Google group.

Mongoose 2.5.3

Mongoose 2.5.3 is now available with some new shiny features and bug-fixes.

Sharding

First up is beta support for MongoDB sharding. If you run a sharded configuration, Mongoose no longer leaves you shivering in the cold. Check out the blog post for details on setting this up.

doc.isSelected(path)

Next is doc.isSelected(path) which tells us if the given path was selected in our original query. For example, say we are finding a Pet and only return its name:

Pet.findById(id).select('name').exec(function (err, pet) {
  if (err) return console.error(err);
  pet.isSelected('name') // true
  pet.isSelected('type') // false
})

This is helpful, for example, in situations where hooks might need knowledge of field selection before setting a new value to avoid overwriting a value that may already exist. Without knowledge of whether or not the field was selected, we might inadvertently overwrite the value in the database.

PetSchema.post('init', function () {
  if (null === this.likes && this.isSelected('likes')) {
    // old data in the db had a null set. safely fix it here
    this.likes = [];
  }
})
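
For the curious, the check itself boils down to comparing a path against the field selection used for the query. Below is a toy version in plain JavaScript under simplifying assumptions: it ignores nested paths and any special handling of _id, and it is a standalone function rather than the document method.

```javascript
// Toy version: `fields` is the selection used for the query, e.g.
// { name: 1 } (inclusive) or { img: 0 } (exclusive). Nested paths
// are ignored to keep the sketch short.
function isSelected (fields, path) {
  var names = Object.keys(fields || {});
  if (!names.length) return true; // no selection: everything came back

  var inclusive = !!fields[names[0]];
  return inclusive ? path in fields : !(path in fields);
}

console.log(isSelected({ name: 1 }, 'name')); // true
console.log(isSelected({ name: 1 }, 'type')); // false
console.log(isSelected({ img: 0 }, 'name'));  // true
console.log(isSelected(undefined, 'type'));   // true
```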

Query.equals()

We now have Query.equals() which is a little sugar for Model.where(path, val). The following are equivalent:

Pet.where('name', 'banana').exec(fn);
Pet.where('name').equals('banana').exec(fn);

Updated API docs

The API docs were also tweaked a little so you can Ctrl+F and search them quickly. (You may need to hard refresh to see the changes.)

Bug fixes

Several bug-fixes were also included. Check out the project history for full details.

You

As always, if you find any bugs please report them here or bring any questions to our Google group. Thanks for all the bug reports and pull requests everyone!

GraphicsMagick on Heroku with Nodejs

I wanted to play around with Nodejs on Heroku so I set up a little project using my gm module to do image manipulation for the gm homepage.

The idea was simple. When a button is clicked we’ll stream the results of GraphicsMagick to the browser for display in an image tag.

It turns out that these free Heroku instances don’t have GraphicsMagick installed, but they do have ImageMagick. Excellent. Since version 1.x of my gm module we now have support for ImageMagick too.

var gm = require('gm')
  , imageMagick = gm.subClass({ imageMagick: true });

Now we just use imageMagick where we’d normally use gm and everything “just works”.

app.get('/', function (req, res, next) {
  imageMagick('/path/to/img.jpg')
  .autoOrient()
  .flip()
  .stream('png', function (err, stdout) {
    if (err) return next(err);
    // expire one week from now; HTTP dates should be sent as UTC strings
    res.setHeader('Expires', new Date(Date.now() + 604800000).toUTCString());
    res.setHeader('Content-Type', 'image/png');
    stdout.pipe(res);
  });
});

The full heroku image manipulation project on the gm homepage is available here.

The Mongoose Query Chain

One of the nice things about the query api in Mongoose is that Model.find(), Model.where(), Model.count(), and other Model methods return an instance of a Query. This is nice because we can then continue manipulating that query until we are ready to execute it.

// suppose our app only cares about events that happened
// in the past day.
Event
.find({ type: 'click' })
.where('age').$gt(oneDayAgo)
.exec(function (err, clicks) {
  ...
})

As our application grows the need arises for custom “packaged” queries. Let’s say the above query is strewn throughout the app so we refactor that into our Event model.

EventSchema.statics.findClicks = function (cb) {
  Event
  .find({ type: 'click' })
  .where('age').$gt(oneDayAgo)
  .exec(cb);
}

and in our routes we now have

Event.findClicks(function (err, clicks) { ...

A bit better. Everywhere we had that query will now use this method. But wait. What if we need to further tweak our query params? Instead of always executing our query in the model method, let’s return the query and keep execution optional.

EventSchema.statics.findClicks = function (cb) {
  var q = Event.find({ type: 'click' }).where('age').$gt(oneDayAgo);
  if ('function' == typeof cb) return q.exec(cb);
  return q;
}

There we go. Now we only execute the query if a function is passed into findClicks, otherwise we return the query for further customization.

Say we need to narrow those clicks further by retrieving only the “right click” events? We can extend the query returned in our findClicks helper:

EventSchema.statics.findRightClicks = function (cb) {
  return this.findClicks().find({ meta: 'right' }, cb);
}

This time instead of manually checking if cb is a function to decide if we should execute or not, we let the query itself figure that out for us by passing the cb as the second argument to find().

That’s it. By returning queries wherever possible we gain a much more flexible and reusable api.
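
If you want to play with the pattern without a database, here is a toy query builder in plain JavaScript. ToyQuery is invented for this sketch and filters an in-memory array synchronously, so it is nothing like the real Query internally; it only demonstrates "return the query unless a callback is given".

```javascript
// Toy query builder, only meant to show the pattern of returning the
// query unless a callback is given. It filters an in-memory array.
function ToyQuery (docs) {
  this.docs = docs;
  this.conditions = {};
}

ToyQuery.prototype.find = function (conditions, cb) {
  for (var key in conditions) this.conditions[key] = conditions[key];
  return 'function' == typeof cb ? this.exec(cb) : this;
};

ToyQuery.prototype.exec = function (cb) {
  var conditions = this.conditions;
  var matches = this.docs.filter(function (doc) {
    return Object.keys(conditions).every(function (key) {
      return doc[key] === conditions[key];
    });
  });
  cb(null, matches);
};

var events = [
  { type: 'click', meta: 'left' },
  { type: 'click', meta: 'right' },
  { type: 'scroll', meta: 'down' }
];

function findClicks (cb) {
  return new ToyQuery(events).find({ type: 'click' }, cb);
}

function findRightClicks (cb) {
  return findClicks().find({ meta: 'right' }, cb);
}

findClicks(function (err, clicks) {
  console.log(clicks.length); // 2
});

findRightClicks(function (err, clicks) {
  console.log(clicks.length); // 1
});
```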

Strict Mongoose Schemas

A common Mongoose issue as of late is that by default it saves properties that were not explicitly defined in our schema.

var ThingSchema = new Schema({ name: String })
  , Thing = db.model('Thing', ThingSchema)

var thing = new Thing({ iAmNotInTheSchema: true })
thing.save(); // saves "iAmNotInTheSchema" to the db!

This is a pretty surprising default, if not downright stupid. Why have a schema if it's not strictly enforced? Since Mongoose 2.4.8 (thanks to _nw_) we now have a `strict` option available to schemas that corrects this behavior. The option is passed as the second constructor argument:

var ThingSchema = new Schema({ name: String }, { strict: true });

Any property not explicitly defined in the schema will no longer be saved. Expect this to be the default behavior in 3.0.

Instances

We can also override this behavior at the instance level when that is desirable:

var strict = new Thing({ name: 'Mongoose' }, true); 
// forces this instance to be strict

var lax = new Thing({ name: 'Mongoose' }, false);
// forces this instance to not be strict
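
Conceptually, strict mode is just a filter over the keys we hand to the constructor. The sketch below shows that idea in plain JavaScript; applyStrict is a hypothetical helper, handles top-level keys only, and is not how Mongoose implements the option.

```javascript
// Hypothetical helper: keeps only the keys declared in `definition`
// when `strict` is on. Nested paths are left out for brevity.
function applyStrict (definition, data, strict) {
  if (!strict) return data;
  var out = {};
  Object.keys(data).forEach(function (key) {
    if (key in definition) out[key] = data[key];
  });
  return out;
}

var definition = { name: String };
var input = { name: 'Mongoose', iAmNotInTheSchema: true };

console.log(applyStrict(definition, input, true));
// { name: 'Mongoose' }
console.log(applyStrict(definition, input, false));
// { name: 'Mongoose', iAmNotInTheSchema: true }
```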

Short But Sweet

That's it for this time. The docs on mongoosejs.com have been updated with these details and hopefully you find this useful.

Mongoose 2.4.0

Mongoose 2.4.0 is now available and comes with a few enhancements.

ReadStreams

First off we've now implemented a ReadStream api for queries. This means you can do things like

Model.find().stream().pipe(writeStream)

or just add listeners the way you do with other streams:

var stream = Model.where('name', /^a/i).stream();

stream.on('data', function (doc) {
  // do something with the doc
})

stream.on('error', function (err) {
  // handle the error
})

stream.on('close', function () {
  // all done
})

Check out the QueryStream docs for more examples and how it compares to the existing Query#each method which will likely soon be deprecated.

Debug Print mode

We finally have an option to print all queries to your console to assist in development. To enable this option, set the debug option of mongoose to true. This is a global setting that affects all connections.

var goose = require('mongoose');
goose.set('debug', true);

Note: this is not recommended in production since it uses blocking writes. To get around this you are free to set your own custom logging function and process the arguments however you'd like.

var goose = require('mongoose'); 
goose.set('debug', function (collectionName, method, query, doc, options) {
  // options is optional and may be undefined
  //...
});
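
As an example of what such a function might do, the sketch below formats each call into a single line and queues it in memory, leaving it up to you when and where to flush the queue. debugLogger and queue are names made up for this example; the only thing taken from Mongoose is the argument list shown above.

```javascript
// Hypothetical formatter with the same arguments as the custom debug
// function above. It queues lines in memory so the caller decides
// when and where to write them.
var queue = [];

function debugLogger (collectionName, method, query, doc, options) {
  var args = [query, doc, options]
    .filter(function (arg) { return arg !== undefined; })
    .map(function (arg) { return JSON.stringify(arg); });

  queue.push(collectionName + '.' + method + '(' + args.join(', ') + ')');
}

debugLogger('pets', 'update', { name: 'banana' }, { $set: { type: 'fruit' } });
debugLogger('pets', 'findOne', { name: 'banana' }, undefined, { limit: 1 });

console.log(queue[0]); // pets.update({"name":"banana"}, {"$set":{"type":"fruit"}})
console.log(queue[1]); // pets.findOne({"name":"banana"}, {"limit":1})
```

It would then be registered with goose.set('debug', debugLogger).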

$unset support

New in 2.4.0 is transparent support for the $unset command. Previously it was only possible to $unset a document key by using Model.update. Now when we set a key to undefined, it will be passed to MongoDB as an $unset. This not only gives us support for $unset but actually fixes incorrect behavior we had before: when setting a key to undefined and saving, the next time we retrieved that document, the key would be set to null. With this change we can retain the undefined value despite MongoDB not actually "knowing" what undefined is.

thing.name = undefined;

thing.save(function (err) {
  Thing.findById(id, function (err, thing) {
    console.log(thing.name) // undefined
  })
})
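
The translation from "set to undefined" to $unset can be pictured like this. buildDelta is a hypothetical helper written for this post, working on a flat map of changed paths; the real logic inside Mongoose is more involved.

```javascript
// Hypothetical helper: turns a map of changed paths into an update
// document. Paths set to undefined go to $unset, the rest to $set.
function buildDelta (changes) {
  var delta = {};
  Object.keys(changes).forEach(function (path) {
    var op = changes[path] === undefined ? '$unset' : '$set';
    delta[op] = delta[op] || {};
    delta[op][path] = op === '$unset' ? 1 : changes[path];
  });
  return delta;
}

console.log(JSON.stringify(buildDelta({ name: undefined, age: 3 })));
// {"$unset":{"name":1},"$set":{"age":3}}
```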

Windows support

We've updated the mongodb-native driver which gets us Windows support and several bugfixes. One important fix to mention impacts performance when running on node 0.4. This fix prevents a 100% CPU issue from occurring but causes the Mongoose test suite to run an avg of ~18% slower. This impacts node 0.4 only. Performance on Node 0.6 is not impacted.

Other stuff

Many query improvements have been made such as better geoSpatial support along with a few bugfixes.

We also now have a dedicated page on mongoosejs.com for Queries (about time).

As always, if you come across any bugs please report them here. And if you want to be really awesome, fork Mongoose and send a pull request with a fix!

 

Getting started with Mongoose and Nodejs

In this post we'll talk about getting started with Mongoose, an object modeling tool for MongoDB and Nodejs.

Install

We're going to assume that you have both MongoDB and npm installed for this post. Once we have those, let's install Mongoose:

$ npm install mongoose

Hurray! Now we can simply require mongoose like any other npm package.

var mongoose = require('mongoose');

Schema definition

Though MongoDB is a schema-less database we often want some level of control over what goes in and out of our database collections. We're confident that we're going to be the next Netflix so we'll need a Movie schema and a Rating schema. Each Movie is allowed to have multiple Ratings.

var Schema = mongoose.Schema;
var RatingSchema = new Schema({
    stars    : { type: Number, required: true }
  , comment  : { type: String, trim: true }
  , createdAt: { type: Date, default: Date.now }
});

So far we've created a Rating schema with a stars property of type Number, a comment property of type String, and a createdAt property of type Date. Whenever we set the stars property it will automatically be cast as a Number. Note also that we specified required which means validation will fail if an attempt is made to save a rating without setting the number of stars. Likewise, whenever we set the comment property it will first be cast as a String before being set, and since whitespace around comments is very uncool, we use the built-in trim setter.
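
To get a feel for what casting, trimming and the required check amount to, here is a hand-rolled approximation in plain JavaScript. castRating and validateRating are made-up names, and real Mongoose casting is stricter than Number() and String(), so treat this purely as a mental model.

```javascript
// Toy version of what happens when a rating is set and validated.
// `castRating` and `validateRating` are made-up names.
function castRating (input) {
  var rating = {};
  if (input.stars !== undefined) rating.stars = Number(input.stars);
  if (input.comment !== undefined) rating.comment = String(input.comment).trim();
  rating.createdAt = input.createdAt ? new Date(input.createdAt) : new Date();
  return rating;
}

function validateRating (rating) {
  if (rating.stars === undefined) return new Error('stars is required');
  return null;
}

var rating = castRating({ stars: '4', comment: '  loved it  ' });

console.log(rating.stars);           // 4
console.log(rating.comment);         // 'loved it'
console.log(validateRating(rating)); // null
console.log(validateRating(castRating({ comment: 'no stars' })).message);
// 'stars is required'
```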

Now that we're happy with our Rating model we'll use it within our Movie model. Each movie should have name, director, year, and ratings properties.

var MovieSchema = new Schema({
    name    : { type: String, trim: true, index: true }
  , ratings : [RatingSchema]
  , director: Schema.ObjectId
  , year    : Number
});

Here we see that ratings is set to an array of Rating schemas. This means that we'll be storing Ratings as subdocuments on each Movie document. A subdocument is simply a document nested within another.

You might have noticed the index option we added to the name property. This tells Mongoose to make sure MongoDB has an index on this field.

We've also defined director as an ObjectId. ObjectIds are the default primary key type MongoDB creates for you on each document. We'll use this as a foreign key field, storing the document ObjectId of another imaginary Person document which we'll leave out for brevity.

TIP: Note that we needed to declare the subdocument Rating schema before using it within our Movie schema definition for everything to work properly.

This is what a movie might look like within the mongo shell:

{
  name: "Inception",
  year: 2010,
  ratings: [
    { stars: 8.9, comment: "I fell asleep during this movie, and yeah, you've heard this joke before" },
    { stars: 9.3 }
  ],
  director: ObjectId("asdfasdfasdf")
}

Now that we've finished our schemas we're ready to create our movie model.

var Movie = mongoose.model('Movie', MovieSchema);

And that's it! Everything is all set with the exception of being able to actually talk to MongoDB. So let's create a connection.

mongoose.connect('mongodb://localhost/nodeknockout');
var db = mongoose.connection;
db.on('open', function () {
  // now we can start talking
});

Now we're ready to create a movie and save it.

var super8 = new Movie({ name: "Super 8", director: anObjectId, year: 2011 });

super8.save(function (err) {
  if (err) return console.error(err); // we should handle this
});

Oh, but what about adding ratings?

Movie.findOne({ name: "Super 8" }).where('year', 2011).run(function (err, super8) {
  if (err) return console.error(err); // handle this

  // add a rating
  super8.ratings.push({ stars: 7.7, comment: "it made me happy" });
  super8.save(callback);
});

To look up our movie we used Model.findOne which accepts a where clause as its first argument. We also took advantage of the Query object returned by this method to add some more sugary filtering. Finally, we called the Query's run method to execute it.

We didn't have to do it this way. Instead, we could just pass all of our where params directly as the first argument like so:

Movie.findOne({ name: "Super 8", year: 2011 }, callback);

Though the first example is more verbose it highlights some of the expressive flexibility provided by the Query object returned.

Here are a couple more ways we could write this query:

Movie.where('name', /^Super 8/i).where('year', 2011).limit(1).exec(callback);
Movie.find({ name: "Super 8", year: { $gt: 2010, $lt: 2012 } }, null, { limit: 1 }, callback);

This is all well and good but what if we look up movies by name and year a lot and need the query to be fast? First we'll create a static method on our Movie model:

MovieSchema.statics.byNameAndYear = function (name, year, callback) {
  // this could return multiple results
  return this.find({ name: name, year: year }, callback);
}

We'll also add a compound index on these two fields to give us a performance boost:

MovieSchema.index({ name: 1, year: 1 });

For good measure we'll add a movie instance method to conveniently look up the director:

MovieSchema.methods.findDirector = function (callback) {
  // Person is our imaginary Model we skipped for brevity
  return this.db.model('Person').findById(this.director, callback);
}

Putting it all together:

Movie.byNameAndYear("Super 8", 2011, function (err, movies) {
  if (err) return console.error(err); // handle this
  var movie = movies[0];
  movie.findDirector(function (err, director) {
    if (err) return console.error(err);
    // woot
  })
});

That's it for this post. For more info check out mongoosejs.com, the github README, or the Mongoose test directory to see even more examples.