
Skimbox stores user data in Mongo, the Bon Iver of distributed databases. So far, we're digging Mongo, but it's definitely a mix of guns and roses. If you're using it or thinking about trying it, hopefully, some of the following advice and information will be helpful!

Tips 'n' Tricks

1. Pretty Printing from the Command Line Shell:

When using the shell, the results are often lumped together into a single line. Use the .pretty() method to format nicely, as in:

 > db.emails.find({_id:/^b8037de0-1170/}, {headers:1}).pretty();

(The regex is anchored with ^ so that it matches ids starting with that prefix; a trailing * would only repeat the final 0.)

If you are only looking at one element, indexing it explicitly also gives pretty results:

 > db.emails.find({_id:/^b8037de0-1170/}, {headers:1})[0]

or

 > db.emails.findOne({_id:/^b8037de0-1170/}, {headers:1})
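
The .pretty() method only exists in the shell. In Node or Meteor code, a rough equivalent is JSON.stringify with an indent argument. This is a minimal sketch; the sample document and the prettyPrint helper are made up for illustration and are not part of any Mongo driver.

```javascript
// Hypothetical sample document, shaped like the result of the queries above.
const email = {
  _id: "b8037de0-1170-example",
  headers: { from: "joe@smith.com", subject: "Hello" }
};

// prettyPrint is an illustrative helper, not a driver method.
// The third argument to JSON.stringify is the indent width in spaces.
function prettyPrint(doc) {
  return JSON.stringify(doc, null, 2);
}

console.log(prettyPrint(email));
```

Each nesting level is indented by two spaces, which is close to what the shell prints.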
 

2. Backup The Production Database Locally:

$ mongodump -h <your mongo instance> --port <port> -d Gander -u <user> -p '<password>'

(Enclose the password in single quotes to prevent the shell from interpreting special characters.)

This will create a subdirectory called ./dump that contains the exported database.

3. Restoring a Mongo Dump to a Local Meteor Instance

$  mongorestore -h localhost --port 3002

This assumes that the current directory has a subdirectory ./dump created from a previous backup and that Meteor is running locally.

The production database is named Gander, so that is the database this restore creates. By default, the local Meteor database is named 'meteor'. To get the data into the local database under that name, open the mongo command line tool and copy it (this copies rather than renames, so Gander is left in place):

> db.copyDatabase( "Gander", "meteor" )

There may be a faster way to copy databases between servers, but I haven't tried it yet.

4. Retrieving Specific Subdocument Fields

To retrieve certain fields from the top level of a document, use the field selector:

> db.collection.find({}, {field1:1, field2:1});

If you want just certain fields of a subdocument, use dot notation and put the key in quotes:

> db.collection.find({}, {'field1.subfield1':1, 'field1.subfield2':1, field2:1});
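
To make it clearer what comes back from a dotted field selector, here is a toy model in plain JavaScript. It is only a sketch: the project function and the sample document are hypothetical, it handles plain nested objects only (no arrays, no exclusions), and it is not how the server implements projections.

```javascript
// Toy model of an inclusion projection such as
// {'field1.subfield1': 1, field2: 1}. Illustration only.
function project(doc, fields) {
  const out = {};
  if ("_id" in doc) out._id = doc._id; // Mongo returns _id unless told not to
  for (const path of Object.keys(fields)) {
    if (!fields[path]) continue; // only inclusion (truthy) entries are modeled
    const keys = path.split(".");
    let src = doc;
    let dst = out;
    for (let i = 0; i < keys.length; i++) {
      const key = keys[i];
      if (!(key in src)) break; // path not present in this document
      if (i === keys.length - 1) {
        dst[key] = src[key]; // last segment: copy the value
        break;
      }
      const next = src[key];
      if (next === null || typeof next !== "object") break; // cannot descend
      if (!(key in dst)) dst[key] = {};
      src = next;
      dst = dst[key];
    }
  }
  return out;
}

// Hypothetical document with one subdocument
const doc = {
  _id: 1,
  field1: { subfield1: "a", subfield2: "b", subfield3: "c" },
  field2: "x",
  field3: "y"
};

const result = project(doc, { "field1.subfield1": 1, "field1.subfield2": 1, field2: 1 });
console.log(JSON.stringify(result));
```

The result keeps _id, field2, and only the two requested subfields of field1; subfield3 and field3 are dropped.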

5. Repairing a Corrupt Local Database

Over the weekend, my trusty MacBook shut down due to battery exhaustion. Usually it goes to sleep first, but I guess it ran out of juice. When I rebooted into Ubuntu and started Meteor, I got this nasty message:

[[[[[ ~/Projects/Mahogany ]]]]]
Unexpected mongo exit code 100. Restarting.
Unexpected mongo exit code 100. Restarting.
Unexpected mongo exit code 100. Restarting.
Can't start mongod
MongoDB had an unspecified uncaught exception.

So I basically had a corrupt local database. Riffing off these instructions, I did the following operations specific to the Meteor install of Mongo:

  cd .meteor/local
  rm db/mongod.lock
  /usr/local/meteor/mongodb/bin/mongod --dbpath db --repair --repairpath db1

and all was good again. Ensure that no other instances of Meteor and Mongo are running when you do this procedure.

Gotchas we've found

1. Keys Cannot Contain Periods:

A key cannot contain a period "." or start with a "$" (ref). This is particularly annoying if the key is an email address like "joe@smith.com". This also occurred with folder_name, specifically for the last_uid collection. For Gander, in order to escape the period, I used:

  a = addr.gsub('.','#DOT#') # Ruby encoding
  i = item.replace(/#DOT#/g, '.'); // JavaScript decoding

Unfortunately, I discovered this the hard way: there was no error message when trying to do an update:

   @user_coll.update({"_id" => user['_id']},{"$set" => {'address_book' => vips } })

The update would fail silently and not generate an exception.  
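
The escaping shown above can be wrapped in a pair of helpers so that encoding and decoding always use the same placeholder. This is a sketch in JavaScript only (the original encoding side is Ruby); escapeKey, unescapeKey and the sample address book are hypothetical names. It does not handle keys that start with "$", and a key that already contains the literal text #DOT# will not round-trip.

```javascript
const DOT = "#DOT#"; // same placeholder as in the snippet above

// Replace every period so the string is safe to use as a Mongo key.
function escapeKey(key) {
  return key.split(".").join(DOT);
}

// Reverse the escaping when reading keys back out.
function unescapeKey(key) {
  return key.split(DOT).join(".");
}

// Hypothetical address book keyed by escaped email addresses
const addressBook = {};
addressBook[escapeKey("joe@smith.com")] = { vip: true };

console.log(Object.keys(addressBook));
console.log(Object.keys(addressBook).map(unescapeKey));
```

Checking keys with something like this before calling update is one way to avoid the silent failure described above.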

2. Exercise Caution When Using update

I was testing marking messages for deletion. I wanted to revert the change and retest, so I entered:

   db.emails.update({gander_status:'deleting'}, {gander_status:'gmail'});

Intuitively, you would think that this would simply change the value of the gander_status field. Wrong: it replaces the whole document, deleting all the other fields and leaving only gander_status (and _id, of course). The correct syntax uses $set:

   db.emails.update({gander_status:'deleting'}, {$set: {gander_status:'gmail'}});
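
The difference between the two forms is easier to see on a single document. The following is a toy model in plain JavaScript, not Mongo's code: applyUpdate and isOperatorUpdate are hypothetical helpers, and the model only understands whole-document replacement and a top-level $set.

```javascript
// Toy model of the two update forms applied to one document.
function applyUpdate(doc, update) {
  if ("$set" in update) {
    // Operator form: merge the listed fields into the existing document
    return Object.assign({}, doc, update.$set);
  }
  // No operator: the update document replaces everything except _id
  return Object.assign({ _id: doc._id }, update);
}

// Illustrative guard: true only if every top-level key is an operator.
function isOperatorUpdate(update) {
  const keys = Object.keys(update);
  return keys.length > 0 && keys.every((k) => k.startsWith("$"));
}

// Hypothetical message document
const message = { _id: 42, subject: "Hi", gander_status: "deleting" };

const replaced = applyUpdate(message, { gander_status: "gmail" });
const patched = applyUpdate(message, { $set: { gander_status: "gmail" } });

console.log(replaced); // only _id and gander_status remain
console.log(patched);  // subject survives, gander_status is changed
console.log(isOperatorUpdate({ gander_status: "gmail" })); // false
```

A guard like isOperatorUpdate could be called before any update in application code to catch the replacement form by accident.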

3. Mongo Does Not (Easily) Support SSL

See http://docs.mongodb.org/manual/administration/ssl/ and a relevant discussion at: http://stackoverflow.com/questions/11310299/securing-mongodb-transport-in-the-cloud.

4. Count does not take into account skip and limit

Let's say you have the following code:

  • Example 1 - limit()

var x = Emails.find();

console.log("x=",x.count());

var y = Emails.find({},{limit:20});

console.log("y=",y.count());

You would expect:

y = 20 (if x > 20)

  • Example 2 - skip()

var x = Emails.find();

console.log("x=",x.count());

var y = Emails.find({},{skip:50});

console.log("y=",y.count());

You would expect:

     y = x - 50 (if x > 50)

This is not the case. By default, Mongo's .count() ignores skip and limit, so in both of the above examples x = y: count() returns the count for the entire query. Different drivers (e.g. Mongo shell, Perl, JavaScript) have other means of returning the adjusted count; in the Mongo shell, for example, count(true) or size() applies skip and limit. I have not found a way in Meteor's driver to find the adjusted count.
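
If the driver only gives you the total, the adjusted count can be worked out by hand from the total plus the skip and limit options. A minimal sketch, assuming the total comes from an unadjusted count(); adjustedCount is a hypothetical helper, and it treats a limit of 0 as "no limit".

```javascript
// Number of documents a cursor would actually yield, given the total
// number of matches plus the skip and limit options.
function adjustedCount(total, options) {
  const skip = (options && options.skip) || 0;
  const limit = (options && options.limit) || 0; // 0 means "no limit"
  const remaining = Math.max(0, total - skip);
  return limit > 0 ? Math.min(remaining, limit) : remaining;
}

console.log(adjustedCount(120, { limit: 20 })); // 20
console.log(adjustedCount(120, { skip: 50 }));  // 70
console.log(adjustedCount(30, { skip: 50 }));   // 0
```

With the examples above, this would be called as adjustedCount(x.count(), {limit: 20}) or adjustedCount(x.count(), {skip: 50}).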

Best Practices

Replica Set Configuration

  • Don't use IP addresses
  • Don't use /etc/hosts
  • Use DNS
    • Pick appropriate TTLs 
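
As an illustration of the list above, here is what a replica set config with DNS names might look like, plus a small check that flags members written as raw IP addresses. The set name, the hostnames and the findIpHosts helper are all made up for this sketch; only the overall shape (an _id plus a members array of _id/host pairs) follows the usual config document.

```javascript
// Hypothetical replica set config; the set name and hosts are invented.
const replicaSetConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1.example.com:27017" },
    { _id: 1, host: "mongo2.example.com:27017" },
    { _id: 2, host: "10.0.0.7:27017" } // raw IP: the kind of entry to avoid
  ]
};

// Returns the member hosts that are written as IPv4 addresses.
function findIpHosts(config) {
  const ipv4 = /^\d{1,3}(\.\d{1,3}){3}(:\d+)?$/;
  return config.members.map((m) => m.host).filter((h) => ipv4.test(h));
}

console.log(findIpHosts(replicaSetConfig)); // lists the one raw-IP member
```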

See Also

General:

Schema Design:

Indexing and Performance:

Books: MongoDB: The Definitive Guide. The title says it all.
