Validating MongoDB’s DBRefs

As discussed on this SO question.

Ref Marks The Spot

For various reasons MongoDB doesn’t support joins, but documents can be linked using DBRefs.

For example, we’d like to build a relation between space ships and crew members.
Our ships document looks like:

{
    "_id": "someID"
    "class": "Firefly"
    "name": "Serenity"
}

And our crew document looks like:

{
    "name": "Malcolm 'Mal' Reynolds"
    "ship": DBRef("ships", "someID")
}

We link Capt. Mal to his ship “Serenity” by adding a `ship` field to the crew document; the value of the `ship` is a DBRef object composed of:

  1. The name of the collection which we reference.
  2. The ID of the item we reference.

But what happens when you encounter an inconsistency between references? Links may become invalid, typically in dev and staging environments but this could also happen in your *gasp* production environment!

Solution

There’s no official built-in way to validate DBRefs, but it is easy to manually validate them.
MongoDB is awesome in many ways; one manifest of this awesomeness is the ability to execute commands in the form of Javascripts.

So I wrote a small script – validateDBRefs.js:

//Create a generic function to extract the ID from a document
var returnIdFunc = function(doc) { return doc._id; };

//Map the collection of ships to a collection of ship IDs
var allShipIds  = db.ships.find({}, {_id: 1} ).map(returnIdFunc);

//Find all crew members with ship IDs that don't exist in the allShipIds collection
var crewWithInvalidRefs = db.crew.find({"ship.$id": {$nin: allShipIds}}).map(returnIdFunc);

print("Found the following documents with invalid DBRefs");
var length = crewWithInvalidRefs.length;
for (var i = 0; i < length; i++) {
    print(crewWithInvalidRefs[i]);
}

That can be run with:

mongo DB_NAME validateDBRefs.js

In the given form, the script will output all the crew documents that reference non-existing ships:

Found the following documents with invalid DBRefs

513c4c25589446268f62f487

513c4c26589446268f62f48a

These IDs can now be used to perform validity reporting or even cleaned up as part of an automated maintenance procedure!

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s