MongoDB is an open-source, document-oriented database. As a NoSQL (that is, non-relational) database, it stores data as documents made up of field-value pairs rather than as rows in tables. You may already know this from my previous articles. So, in this article, we will look at something new: the $group operator in MongoDB. Let’s get going with the topic right away.

group MongoDB

The $group operator is a stage of the aggregation pipeline, and it works together with accumulator operators such as $sum and $avg. It is an important part of the MongoDB query language and helps in performing various data transformations.

So, now that we are ready to dive deep into the $group operator, let’s take a quick look at some of the prerequisites.

Prerequisites
  • You should already have MongoDB installed on your system. For installation, you can check the official MongoDB documentation.
  • Also, you should have some prior knowledge of MongoDB and its shell commands.
  • That’s it! You are now ready to march ahead. 🙂

What is Aggregation?

Aggregation is an operation that processes data to produce a calculated result. It combines values from multiple documents and can perform a variety of operations on the grouped data to produce a single result, such as a sum, a count, an average, a minimum, or a maximum. In MongoDB, aggregation is expressed as a pipeline: a sequence of stages through which the data passes to produce the combined result.

A pipeline is basically a series of stages in which the data is processed, or more precisely, transformed according to the specified criteria. Each stage works independently and receives its input from the previous stage. Always remember that the first stage works on the collection's documents themselves, not on a transformed version of them, so it can make use of indexes.
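As a rough sketch of what such a pipeline looks like in PyMongo, the list below chains a $match stage and a $group stage. The field names (user, title) are assumptions that mirror the sample data used later in this article, and run_pipeline is a hypothetical helper name, not part of PyMongo.

```python
# A pipeline is an ordered list of stages; each stage is a one-key dict.
pipeline = [
    # Stage 1: filter the raw documents (a leading $match can use an index).
    {"$match": {"user": "Shubham"}},
    # Stage 2: group what survived the filter and count documents per title.
    {"$group": {"_id": "$title", "count": {"$sum": 1}}},
]


def run_pipeline(collection):
    """Run the pipeline and return the results as a list.

    `collection` is assumed to be a pymongo Collection object.
    """
    return list(collection.aggregate(pipeline))
```

The order of the list matters: the $group stage only ever sees the documents that the $match stage lets through.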

Now, let’s take a quick look at the $group operator.

$group MongoDB

As the name suggests, the $group operator groups similar data according to a particular expression and combines each group into a single result document. Say, for example, a database holds around 20 employees of an organization, and several of them share the same passion. If we want to count how many people share each passion, the $group operator is a natural fit for the task. Here’s its syntax:

{
  $group: {
    _id: <expression>,
    <field1>: { <accumulator1>: <expression1> },
    ...
  }
}

Here, the “_id” field holds the group-by key: its expression decides which group each document falls into, and every distinct value appears as the “_id” of one output document. Thus, the “_id” field takes the field or fields by which you want to group the documents. The use of _id is mandatory here.

Note: the $group operator here groups the data based on the field provided in the “_id” field.
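To make the role of “_id” more concrete, here is a small sketch of three common shapes it can take, written as PyMongo-style Python dicts. The field names user and title and the output field count are assumptions chosen for illustration.

```python
# Group by one field: one output document per distinct "user" value.
by_user = {"$group": {"_id": "$user", "count": {"$sum": 1}}}

# Group by two fields: the key is a sub-document, so each distinct
# (user, title) combination forms its own group.
by_user_and_title = {
    "$group": {
        "_id": {"user": "$user", "title": "$title"},
        "count": {"$sum": 1},
    }
}

# Group everything together: a constant _id such as None (null in MongoDB)
# yields a single output document covering the whole input.
overall = {"$group": {"_id": None, "count": {"$sum": 1}}}
```

Any of these can be passed as a stage of a pipeline, for example collection.aggregate([by_user]) on a PyMongo collection.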

Some important points to note

  • The “_id” field is mandatory. A constant value such as null puts all input documents into a single group.
  • Every output field other than “_id” must be defined with an accumulator operator such as $sum, $avg, $min, $max, or $push. Other computations belong in a separate stage, for example $project.
  • A single $group stage can compute several accumulator fields at once.
  • The $group stage does not order its output documents. If the order matters, add a $sort stage after it.

The $group operator in MongoDB

Remember, in PyMongo, we mainly use the aggregate method to process records from multiple documents and return the result to the user. It is built on the data processing pipeline and consists of several stages, at the end of which we get the aggregated result.

One of the stages available to the aggregate method is $group. It groups the input documents of the collection according to the identifier expression given by the user, and then applies the accumulator expressions to each group. Only after that are the output documents generated.

The $group operator has two components, namely the “_id” and one or more optional fields. The _id is the expression according to which the documents are grouped, while each optional field contains an accumulator expression that is applied to the documents of a group.
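To see how the two components cooperate, the following plain-Python sketch mimics what a $group stage with a {"$sum": 1} accumulator does. It is only an illustration of the idea, not how MongoDB implements it, and group_and_count is a hypothetical helper name.

```python
from collections import defaultdict


def group_and_count(documents, key):
    """Mimic {"$group": {"_id": "$<key>", "count": {"$sum": 1}}} in plain Python."""
    counts = defaultdict(int)
    for doc in documents:
        # The _id expression picks the group each document belongs to ...
        group_id = doc.get(key)
        # ... and the accumulator ($sum: 1) adds 1 for every document in it.
        counts[group_id] += 1
    # One output document per group, shaped like MongoDB's result.
    return [{"_id": group_id, "count": n} for group_id, n in counts.items()]


docs = [
    {"user": "Shubham", "title": "Python"},
    {"user": "Kavin", "title": "JavaScript"},
    {"user": "Shubham", "title": "C++"},
]
result = group_and_count(docs, "user")
print(result)
# [{'_id': 'Shubham', 'count': 2}, {'_id': 'Kavin', 'count': 1}]
```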

Now, let us see some examples to make this concept clearer.

Example 1:
from pymongo import MongoClient

# Create the client and connect to the local server (host and port)
client = MongoClient("mongodb://localhost:27017/")

# Access the database
mydatabase = client['database']

# Access the collection
mycollection = mydatabase['myTable']

# Start from an empty collection so that re-running the script
# does not fail on the fixed _id values below
mycollection.delete_many({})

user_profiles = [
    {"_id": 101, "user": "Shubham", "title": "Python"},
    {"_id": 102, "user": "Kavin", "title": "JavaScript"},
    {"_id": 103, "user": "Shubham", "title": "C++"},
    {"_id": 104, "user": "Ayush", "title": "MongoDB"},
    {"_id": 105, "user": "Shubham", "title": "R"}
]

mycollection.insert_many(user_profiles)

# Group the documents by user and count the documents in each group
agg_result = mycollection.aggregate([
    {"$group": {"_id": "$user", "tutorial": {"$sum": 1}}}
])

for i in agg_result:
    print(i)

The output looks like this (the order of the groups is not guaranteed and may differ on your machine):

{'_id': 'Shubham', 'tutorial': 3}
{'_id': 'Kavin', 'tutorial': 1}
{'_id': 'Ayush', 'tutorial': 1}

Thus, we see that the documents are grouped by the $user expression, while the “tutorial” field uses the $sum accumulator to count the documents in each group, that is, the number of languages known by each user.

Now, let us see another example.

Example 2:
from pymongo import MongoClient

# Create the client and connect to the local server (host and port)
client = MongoClient("mongodb://localhost:27017/")

# Access the database
mydatabase = client['database']

# Access the collection
mycollection = mydatabase['myTable']

# Start from an empty collection so that documents left over from
# Example 1 do not clash with the _id values below
mycollection.delete_many({})

user_profiles = [
    {"_id": 101, "user": "Shubham", "title": "Python"},
    {"_id": 102, "user": "Kavin", "title": "Python"},
    {"_id": 103, "user": "Shubham", "title": "MongoDB"},
    {"_id": 104, "user": "Ayush", "title": "MongoDB"},
    {"_id": 105, "user": "Shubham", "title": "R"}
]

mycollection.insert_many(user_profiles)

# Group the documents by title and count the documents in each group
agg_result = mycollection.aggregate([
    {"$group": {"_id": "$title", "total": {"$sum": 1}}}
])

for i in agg_result:
    print(i)

Now, the output is (the order of the groups may vary):

{'_id': 'Python', 'total': 2}
{'_id': 'MongoDB', 'total': 2}
{'_id': 'R', 'total': 1}

Thus, in this second example, the documents are grouped by the expression $title, and the output gives the total number of documents for each title.

I hope that these two examples and the theory explained above have made you familiar with the concept and use of the $group operator in MongoDB.
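Since $group does not promise any particular output order, a $sort stage is commonly added after it when a stable order matters. The sketch below also uses the $push accumulator to collect the users behind each title. Here titles_with_users is a hypothetical helper name, and collection is assumed to be a PyMongo collection holding the sample documents of Example 2.

```python
pipeline = [
    {
        "$group": {
            "_id": "$title",
            "total": {"$sum": 1},         # number of documents per title
            "users": {"$push": "$user"},  # list of users per title
        }
    },
    # Largest groups first; ties are broken alphabetically by title.
    {"$sort": {"total": -1, "_id": 1}},
]


def titles_with_users(collection):
    """Return one document per title with its count and its list of users."""
    return list(collection.aggregate(pipeline))
```

With the Example 2 data, the titles should come back in the order MongoDB, Python, R. The order of the names inside each users list should not be relied on unless the documents are sorted before the $group stage.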

Wrapping Up

To wrap things up, in this post we touched upon the basics of MongoDB aggregation and the $group operator, and we saw how the aggregation pipeline processes data in stages. We also worked through two examples to make the concept clearer and more precise. I hope this post was worth your while. On that note, until next time, see ya! Goodbye!
