MongoDB text search

NoSQL databases have their good and bad parts. In the case of MongoDB, one of the bad parts was definitely text search. You could fall back on regex queries, but those are very slow compared to what you probably want and need.

What is the new text search?

The feature was introduced in January 2013 and adds a full-text search implemented directly in MongoDB. MongoDB Text Search is currently in beta and supports searching the string content of documents in a collection.

How to use it?

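Only two steps are necessary: enable text search, then build a text index. To enable it, use either of the following commands. The first one is passed when starting the server, the second one is run from the mongo shell against a running server: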

mongod --setParameter textSearchEnabled=true  
db.adminCommand( { setParameter : 1, textSearchEnabled : true } )  

MongoDB is now running with text search enabled. The last step is to tell MongoDB which field it should index for the search.

use exampleDB  
db.exampleCollection.ensureIndex( { exampleField: "text" } );  

MongoDB will now build the index, which can take a few minutes depending on the size of the collection. After that, you are good to go.

db.exampleCollection.runCommand( "text", { search : "any text" } );  
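The second argument of the text command can carry more than the search string. The following is a minimal sketch, assuming the optional `filter`, `project`, `limit` and `language` fields of the 2.4 text command. The helper name `buildTextOptions` and the `lang` field used in the filter are made up for illustration and are not part of the example collection above.

```javascript
// Hypothetical helper: builds the options document for the "text" command.
// Only options that were actually supplied are copied, so unset ones are
// simply left out of the command.
function buildTextOptions(search, extra) {
  var options = { search: search };
  var allowed = ["filter", "project", "limit", "language"];
  extra = extra || {};
  allowed.forEach(function (key) {
    if (extra[key] !== undefined) {
      options[key] = extra[key];
    }
  });
  return options;
}

var options = buildTextOptions("any text", {
  filter: { lang: "en" }, // assumed field name, for illustration only
  limit: 10               // return at most 10 matching documents
});

// In the mongo shell, with text search enabled and the index built:
// db.exampleCollection.runCommand("text", options);
```

As far as I can tell from the documentation, the terms in the search string are combined with OR, so "any text" matches documents containing either word. A phrase can be searched by wrapping it in escaped quotes, and a term can be excluded by prefixing it with a minus sign.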

Performance

I did not run a full benchmark, but I can tell you how the text search improved performance in a project I worked on.
We had a Twitter database with about 7,000,000 tweets in it and needed a well-performing text search. MongoDB was running on an AWS small instance, and a simple query with our first, regex-based search solution took about 2 minutes. With text search enabled, the query time for the first request dropped to roughly 15 seconds. MongoDB also seems to warm up after the first query (most likely because the index is then cached in memory), which resulted in query times of about 1 second for later requests.
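If you want to repeat such a comparison on your own data, a rough wall-clock measurement is enough to see the difference. This is only a sketch: `timeMs` is a hypothetical helper, and the two commented shell calls assume the example collection and field names used earlier in this post.

```javascript
// Hypothetical helper: runs fn once and returns the elapsed wall-clock
// time in milliseconds. Good enough for a rough comparison, not a benchmark.
function timeMs(fn) {
  var start = Date.now();
  fn();
  return Date.now() - start;
}

// In the mongo shell, the two approaches could be timed like this:
//
// var regexMs = timeMs(function () {
//   db.exampleCollection.find({ exampleField: /any text/i }).itcount();
// });
// var textMs = timeMs(function () {
//   db.exampleCollection.runCommand("text", { search: "any text" });
// });
// print("regex: " + regexMs + " ms, text: " + textMs + " ms");
```

Keep in mind that the two queries are not equivalent: the regex matches the literal phrase, while the text search matches stemmed terms. Run each one more than once, since the first request is noticeably slower than the following ones.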

Limitations

  • A MongoDB collection can only have one text index at a time.
  • Text indexes have significant storage requirements and performance costs.
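Since a collection can only hold one text index, searching several fields means putting all of them into that single index. Here is a minimal sketch of how such an index specification could be built. The helper `buildTextIndexSpec` and the field name `otherField` are assumptions for illustration.

```javascript
// Hypothetical helper: turns a list of field names into one text index
// specification, so that a single index covers all of the fields.
function buildTextIndexSpec(fields) {
  var spec = {};
  fields.forEach(function (field) {
    spec[field] = "text";
  });
  return spec;
}

// "otherField" is an assumed second string field in the documents.
var spec = buildTextIndexSpec(["exampleField", "otherField"]);
// spec is now { exampleField: "text", otherField: "text" }

// In the mongo shell:
// db.exampleCollection.ensureIndex(spec);
```

Every additional field makes the index larger, so the storage and performance costs mentioned above grow with each field you add.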

Conclusion

With the text search, MongoDB introduced a very useful feature that will hopefully leave the beta phase soon and gain some more functionality. For now, it is not recommended for use in production.

Thomas Sattlecker
