How to do Text Search in MongoDB?


Text search lets you query string fields for specific words or phrases. In MongoDB, text search relies on two pieces: a text index on the fields to be searched, and the $text query operator.

Text index: A text index supports text search queries on string content. It can include any field whose value is a string or an array of strings. A collection must have a text index before you can run text search queries against it, and it can have at most one text index, although that single index may cover several fields. Text indexes are created with the createIndex() method.

MongoDB has supported text indexes since version 2.4. Text search ignores stop words such as a, an, and the, and uses stemming to reduce the remaining words to their root form, so a search for one form of a word also matches its other forms. At present, MongoDB supports around 15 languages.

Syntax:

db.collectionName.createIndex( { field: "text" } )
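To make the stop-word removal and stemming mentioned above more concrete, here is a deliberately simplified sketch in plain JavaScript. It is not MongoDB's algorithm: the stop-word list, the trailing-"s" rule, and the names STOP_WORDS, naiveStem, and toSearchTerms are all assumptions made for illustration, and the real text index uses language-specific stop-word lists and stemmers.

```javascript
// Toy illustration only: MongoDB's text index uses language-specific
// stop-word lists and stemmers, not this simplified logic.
const STOP_WORDS = new Set(["a", "an", "the", "and", "on", "in", "of", "to"]);

// Naive "stemmer" (an assumption for illustration): strip one trailing "s".
function naiveStem(word) {
  return word.length > 3 && word.endsWith("s") ? word.slice(0, -1) : word;
}

// Lowercase the text, split on non-letters, drop stop words, then stem.
function toSearchTerms(text) {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((word) => word.length > 0 && !STOP_WORDS.has(word))
    .map(naiveStem);
}

console.log(toSearchTerms("Enjoy the MongoDB articles on tutorialspoint"));
// prints [ 'enjoy', 'mongodb', 'article', 'tutorialspoint' ]
console.log(toSearchTerms("Dogs eat cats"));
// prints [ 'dog', 'eat', 'cat' ]
```

The point of the sketch is only that both the indexed text and the search string are reduced to the same normalized terms, which is why a search for one form of a word can match another form.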

Enabling Text Search

Initially, text search was an experimental feature, but starting with version 2.6 it is enabled by default.

Creating Text Index

Consider the following documents in the posts collection, each containing the post text and its tags:

> db.posts.insert([
{
   "post_text": "enjoy the mongodb articles on tutorialspoint",
   "tags": ["mongodb", "tutorialspoint"]
},
{
	"post_text" : "writing tutorials on mongodb",
	"tags" : [ "mongodb", "tutorial" ]
}
])

We will create a text index on the post_text field so that we can search inside our posts' text:

>db.posts.createIndex({post_text:"text"})
{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

Using Text Index

Now that we have created the text index on the post_text field, we will search for all the posts that contain the word tutorialspoint in their text.

> db.posts.find({$text:{$search:"tutorialspoint"}}).pretty()
{
	"_id" : ObjectId("5dd7ce28f1dd4583e7103fe0"),
	"post_text" : "enjoy the mongodb articles on tutorialspoint",
	"tags" : [
		"mongodb",
		"tutorialspoint"
	]
}

Only the first post is returned, because it is the only one whose post_text contains the word tutorialspoint.

Deleting Text Index

To delete an existing text index, first find the name of index using the following query −

>db.posts.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "mydb.posts"
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "post_text_text",
		"ns" : "mydb.posts",
		"weights" : {
			"post_text" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]
>

After getting the name of your index from above query, run the following command. Here, post_text_text is the name of the index.

>db.posts.dropIndex("post_text_text")
{ "nIndexesWas" : 2, "ok" : 1 }
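Looking up the index name by eye works, but the same lookup can be scripted. The following is a minimal sketch, assuming the array returned by getIndexes() has the shape shown earlier; the helper name findTextIndexName and the sample data are hypothetical, and the helper recognises a text index simply by a "text" value in its key document.

```javascript
// Return the name of the first text index in an array shaped like the
// result of getIndexes(), or null if there is none.
// A text index is recognised here by a "text" value in its key document.
function findTextIndexName(indexes) {
  const textIndex = indexes.find((index) =>
    Object.values(index.key).includes("text")
  );
  return textIndex ? textIndex.name : null;
}

// Hypothetical sample input shaped like getIndexes() output.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { _fts: "text", _ftsx: 1 }, name: "post_text_text" },
];

console.log(findTextIndexName(sampleIndexes)); // prints post_text_text
```

In the shell, the result of findTextIndexName(db.posts.getIndexes()) could then be passed to db.posts.dropIndex(), after checking that it is not null.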

Sorting Documents Based on Search Relevance

TextScore

The $text operator assigns a score to each document that contains the search term in the indexed fields. The score indicates how well the document matches the text search query. It is exposed through the { $meta: "textScore" } expression, which can be used both in the projection, to return the score, and in the sort() method, to order results by it. See the $meta projection operator in the MongoDB documentation for more details.

Since we are running a text search, we would also like to know how relevant each returned document is. To do this, we project the score with { $meta: "textScore" } and pass the same expression to sort() to order the documents by textScore. The example below assumes a messages collection with a text index on the subject field.

db.messages.find({$text: {$search: "dogs"}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})

This query returns the following documents:

{ "_id" : ObjectId("6176b68b750fd1447889f942"), "subject" : "Joe owns a dog", "content" : "Dogs are man's best friend", "likes" : 60, "year" : 2015, "language" : "english", "score" : 1.2916666666666665 } 

{ "_id" : ObjectId("6176b69f750fd1447889f943"), "subject" : "Dogs eat cats and dog eats pigeons too", "content" : "Cats are not evil", "likes" : 30, "year" : 2015, "language" : "english", "score" : 1 }

As you can see, the first document has a score of 1.2916666666666665, whereas the second has a score of 1. The search string dogs also matched the word dog, because text search compares word stems rather than exact strings. The query also returned the documents ordered by score, highest first.
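The projection and sort shown above tend to travel together, so they can be wrapped in a small helper. This is only a sketch: searchByRelevance is a hypothetical name, and the collection argument is assumed to be a shell collection object such as db.messages that already has a text index.

```javascript
// Run a text search and return the cursor ordered by relevance.
// `collection` is assumed to be a shell collection (for example db.messages)
// that already has a text index.
function searchByRelevance(collection, searchString) {
  // The same $meta expression is used to project the score and to sort by it.
  const score = { $meta: "textScore" };
  return collection
    .find({ $text: { $search: searchString } }, { score: score })
    .sort({ score: score });
}
```

Calling searchByRelevance(db.messages, "dogs") in the shell is meant to be equivalent to the find() and sort() query above.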

Compound Indexing:

Next, let us try to create a compound text index on the subject and content fields. Run the following command in the mongo shell:

db.messages.createIndex({"subject":"text","content":"text"})

This command will fail. Because the collection already has a text index, attempting to create a second one results in an error message stating that a full-text search index already exists. The reason is that a collection is limited to one text index. As a result, if you want to build a different text index, you must first drop the old one and then create the new one:

db.messages.dropIndex("subject_text")  
db.messages.createIndex({"subject":"text","content":"text"})

After running the commands listed above, try searching for all documents that contain the keyword cat.

db.messages.find({$text: {$search: "cat"}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})

The above query will give the following output:

{ "_id" : ObjectId("6176b69f750fd1447889f943"), "subject" : "Dogs eat cats and dog eats pigeons too", "content" : "Cats are not evil", "likes" : 30, "year" : 2015, "language" : "english", "score" : 1.3333333333333335 }

{ "_id" : ObjectId("6176b6cb750fd1447889f944"), "subject" : "Cats eat rats", "content" : "Rats do not cook food", "likes" : 55, "year" : 2014, "language" : "english", "score" : 0.6666666666666666 }

Indexing the Entire Document

The previous example combined the subject and content fields into a compound text index. At times, however, you may want any text in your documents to be searchable.

Consider, for instance, storing emails as MongoDB documents, where every field, including Sender, Recipient, Subject, and Body, must be searchable. In such cases, you can index every string field in the document by using the $** wildcard specifier.

The command would be as follows (make sure you drop the existing text index before creating a new one):

db.messages.createIndex({"$**":"text"})

This command creates a single text index that covers every string field in our documents.
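The two index shapes discussed so far, a list of named fields and the $** wildcard, differ only in the key document passed to createIndex(). As a rough sketch, a helper can build that document from a list of field names. The name textIndexSpec and the convention that an empty list means "index everything" are assumptions of this example, not MongoDB behaviour.

```javascript
// Build the key document passed to createIndex() for a text index.
// By this helper's own convention, an empty field list means
// "index every string field" and yields the $** wildcard.
function textIndexSpec(fields) {
  if (fields.length === 0) {
    return { "$**": "text" };
  }
  const spec = {};
  for (const field of fields) {
    spec[field] = "text";
  }
  return spec;
}

console.log(JSON.stringify(textIndexSpec(["subject", "content"])));
// prints {"subject":"text","content":"text"}
console.log(JSON.stringify(textIndexSpec([])));
// prints {"$**":"text"}
```

The result can be passed directly to createIndex(), for example db.messages.createIndex(textIndexSpec(["subject", "content"])), once any existing text index has been dropped.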

Implementing Text Search in an Aggregation Pipeline:

Text search is available in the aggregation pipeline through the $text query operator in the $match stage. The following rules apply to text search in the aggregation pipeline:

The $match stage with a $text must be the pipeline’s first stage.

A $text operator may appear only once in the stage.

The $text operator expression cannot appear inside $or or $not expressions.

By default, text search does not return the matching documents in order of their scores. To sort by descending score, use the $meta aggregation expression in the $sort stage.

The $text operator assigns a score to each document that contains the search term in the indexed fields. The score represents the relevance of a document to a given text search query.
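A pipeline that follows the rules above can be sketched as a plain array of stages. The helper names textSearchPipeline and hasLeadingTextMatch are hypothetical, and the check covers only the first rule (the $match stage containing $text comes first).

```javascript
// Build a pipeline whose first stage is the $match with $text, followed by
// a $sort on the text score (highest score first).
function textSearchPipeline(searchString) {
  return [
    { $match: { $text: { $search: searchString } } },
    { $sort: { score: { $meta: "textScore" } } },
  ];
}

// Check that a pipeline starts with a $match stage containing $text.
function hasLeadingTextMatch(pipeline) {
  const first = pipeline[0];
  return Boolean(first && first.$match && first.$match.$text);
}

console.log(hasLeadingTextMatch(textSearchPipeline("dogs"))); // prints true
```

In the shell, the array would be passed to aggregate(), for example db.messages.aggregate(textSearchPipeline("dogs")).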

Examples:

The following examples use a people collection with a text index on the pet field:

use people
db.people.insert({"name":"Ishan","pet":"dog"})
db.people.insert({"name":"Abhresh","pet":"cat"})
db.people.insert({"name":"Madan","pet":"cat"})
db.people.insert({"name":"Sneha","pet":"dog"})
db.people.createIndex({"pet":"text"})

db.people.find().pretty()

Count the number of documents in which the pet value is dog:

db.people.aggregate([{$match:{$text:{$search:"dog"}}},{$group:{_id:null,total:{$sum:1}}}])

Count the number of documents in which the pet value is cat:

db.people.aggregate([{$match:{$text:{$search:"cat"}}},{$group:{_id:null,total:{$sum:1}}}])
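The two counting queries differ only in the search string, so they can be folded into one helper. This is a sketch under stated assumptions: countTextMatches is a hypothetical name, and the collection argument is assumed to be a shell collection such as db.people with a text index on the searched field. It also handles one case the raw query leaves open: when nothing matches, the $group stage emits no document at all.

```javascript
// Count the documents that match a text search.
// `collection` is assumed to be a shell collection (for example db.people)
// with a text index on the searched field.
function countTextMatches(collection, searchString) {
  const results = collection
    .aggregate([
      { $match: { $text: { $search: searchString } } },
      { $group: { _id: null, total: { $sum: 1 } } },
    ])
    .toArray();
  // $group emits no document when nothing matched, so fall back to 0.
  return results.length > 0 ? results[0].total : 0;
}
```

In the shell, countTextMatches(db.people, "cat") would return the number of people whose pet is cat.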

Summary

If you handle string content in MongoDB, you should use full-text search to make your searches more effective and accurate. In this article, we demonstrated how to conduct a basic full-text search on a sample dataset.
