Mongo Tips #1: How to Ensure Non-Duplicate Records in a Mongo Collection?

MongoDB is a document-oriented NoSQL database that stores JSON-like documents and is often used for large datasets. If you are having trouble ensuring the uniqueness of a key in a MongoDB collection, I recommend creating a new collection with a unique index and copying everything from the old collection (table) into it. During the copy, the unique index on the new collection checks each record before it is inserted and rejects any record whose key already exists, so the first record with a given key is kept and you end up with a clean, duplicate-free collection.
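To make the mechanism concrete, here is a minimal sketch in plain JavaScript that models what the unique index does during the copy. It does not connect to MongoDB: `copyUnique` is a hypothetical helper, the `Map` stands in for the unique index, and the sample users are made up for illustration.

```javascript
// Plain JavaScript model of the copy step; it does not talk to MongoDB.
// copyUnique is a hypothetical helper; the Map plays the role of the unique index.
function copyUnique(docs, key) {
  const index = new Map(); // key value -> first document seen with that value
  const rejected = [];     // documents a unique index would refuse
  for (const doc of docs) {
    if (index.has(doc[key])) {
      rejected.push(doc);       // analogous to a duplicate key error
    } else {
      index.set(doc[key], doc); // first occurrence wins
    }
  }
  return { kept: [...index.values()], rejected };
}

// Made-up sample data standing in for the old collection.
const oldUsers = [
  { user_id: 1, name: "ann" },
  { user_id: 2, name: "bob" },
  { user_id: 1, name: "ann (later copy)" },
];
const result = copyUnique(oldUsers, "user_id");
console.log(result.kept.length, result.rejected.length); // 2 1
```

The first document seen for each key is the one that survives, which mirrors what happens when the old collection is copied in cursor order.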

Note that there are other methods, but in my experience this is the cleanest and most stable way to clean up a Mongo collection.

Here is the necessary code.

First, create a new collection with a unique index:

// Unique index: any insert whose user_id already exists in newuser is rejected
db.newuser.createIndex( { "user_id": 1 }, { "unique": true } )

Then copy the records from the old collection to the new one:

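db.user.find().forEach(function(doc) {
  // A record whose user_id already exists fails with duplicate key error 11000
  try { db.newuser.insertOne(doc); }
  catch (e) { if (e.code !== 11000) throw e; }  // skip duplicates, rethrow the rest
});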
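If you want to see which keys were duplicated in the old collection, one option is an aggregation that groups by the key and keeps only the groups with more than one record. The sketch below only builds the pipeline as a plain object so it can be inspected without a server; `duplicateKeyPipeline` is a hypothetical helper name, and the shell call in the comment is an assumed usage.

```javascript
// duplicateKeyPipeline is a hypothetical helper, not a MongoDB API.
// It returns an aggregation pipeline that groups documents by `field`
// and keeps only the values that occur more than once.
function duplicateKeyPipeline(field) {
  return [
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } },
  ];
}

// In the mongo shell (assumed usage, requires a running server):
//   db.user.aggregate(duplicateKeyPipeline("user_id"))
console.log(JSON.stringify(duplicateKeyPipeline("user_id")));
```

Each result document would carry the duplicated key value in `_id` and the number of records sharing it in `count`.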

Finally, rename your collections. Do not forget to double-check whether your old collection has other indexes, since they are not copied along with the documents:

db.user.renameCollection("userJul11")
db.newuser.renameCollection("user")
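For the index check mentioned above, a rough sketch of how the old collection's index definitions could be turned into arguments for `createIndex` follows. `indexesToRecreate` is a hypothetical helper, the `email` index in the sample is invented for illustration, and the input is assumed to have the shape returned by `getIndexes()`. Text and other special index types may need manual handling.

```javascript
// indexesToRecreate is a hypothetical helper, not a MongoDB API.
// `specs` is assumed to have the shape returned by db.collection.getIndexes().
function indexesToRecreate(specs) {
  return specs
    .filter((spec) => spec.name !== "_id_") // the _id index is created automatically
    .map((spec) => {
      // Drop fields that describe the existing index rather than configure a new one.
      const { v, ns, key, ...options } = spec;
      return { keys: key, options };
    });
}

// Made-up sample standing in for the output of db.userJul11.getIndexes().
const sample = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { email: 1 }, name: "email_1", sparse: true },
];
console.log(JSON.stringify(indexesToRecreate(sample)));

// In the mongo shell, after the renames (assumed usage):
//   indexesToRecreate(db.userJul11.getIndexes()).forEach(function (ix) {
//     db.user.createIndex(ix.keys, ix.options);
//   });
```

If the old collection already had an index on user_id, skip it, because the new collection has its own unique index on that key.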
