In this post, I will discuss $lookup, $unwind, $project, and $addFields stages in the MongoDB Aggregation pipeline with examples.
Let's get started
$lookup
$lookup
is used for joining the data from multiple collections. You can think of it as an outer join in SQL.
$lookup
is a little more complicated than other stages, here are the 4 params:
from
: this is the destination collection from which we want data linked to the source collectionlocalField
: this is the field on which comparison takes place in the source collectionforeignField
: this is the field on which comparison takes place in the destination collection (right table of the join)as
: this is the name of the field which will have an array of objects from thefrom
collection if the localField of current collections matches theforeignField
of destination collection.
Example
Let’s say we have 2 collections users and posts in a Twitter type social media app. users collection store the data of users and posts collection have the status updates of users.
Here is the sample data in users
And here is the sample post by a user
As you can see, the 2 collections are linked using the created_by field which is the unique handle for the user in the system.
Now if we want to list the posts along with the user’s firstName
and lastName
. We will need to have $lookup
in this way.
Explanation of the above code Since we need to lists posts data (not users data), so we do db.posts.aggregate (not db.users.aggregate).
So the source is posts collection and users are foreign collections here. Therefore from parameter is users.
localField is the field in source collection (posts) i.e. created_by.
foreignField is the field in the foreign collection (users) i.e. handle. These two fields are under the comparison
as can be anything meaningful which you can choose, where the data from users will be stored.
Here is the summary again
Here is the result
You noticed that the user is not an object, it is an array. If you want it as an object, then you need to use $unwind
.
$unwind
This is used normally along with $lookup
. It will take an array field as an argument (e.g. user in the above example). For each input document and for each element of the array in input, there will be a corresponding output document with array replaced with actual array element.
Sounds like a tongue twister. Let me give you a figure to understand
Let’s continue with our previous example
Now since the user is an array. We may need it in object form, so as to form the next operations.
Here’s what our new pipeline looks like after applying the $unwind
Here is how the response looked like
This was the basic example, there is more to it which you can look in the official documentation
$project
The output for our little example was pretty much okay, cause we didn’t make it too complex. But you know, it is never the case with real-world projects.
There can be a lot of fields that you don’t want to add in the aggregate response (due to security reasons, or simply they are not required).
Similarly, there can be some fields that are needed additionally, maybe some computation of existing fields.
The solution for such scenarios is the $project
.
It simply passes along the requested fields to the output. These fields might be existing or they might be computed fields.
Let’s move on to an example.
If we wanted to list only the post title, the handle of the creator, creator’s first name, and creator’s last name. Then what we need to do?
So what are we doing here?
$project
is essentially creating/composing the object structure of all the output documents.
We needed to post text, so 'text':'$text'
,
Next, we needed a handle (we have essentially, renamed created_by field to handle), so we had 'handle':'$created_by'
,
After that, we created a creator object and we had 2 fields inside that called firstName and lastName. We can refer to the nested fields using '.'. As you can see from $user.firstName
syntax.
Here is how the output looks like
As you can see its much cleaner with fewer fields right? Notice the $
symbol to refer to fields of the input.
We will explore computed fields in the $project
in some other post. Stay Tuned 🙂
$addFields
$addFields
is very similar to $project. The only difference is $addFields simply adds the new fields, it doesn’t remove or change any existing field.
I quote the documentation which says
"The $addFields the stage is equivalent to a $project stage that explicitly specifies all existing fields in the input documents and adds the new fields." - MongoDB Official Documentation
Let’s jump to our example straightaway.
We will keep it very simple. Let’s say we want 1 more field called fullName which will simply join the first name and last name of the creator.
To do this, we need to use the $concat
operator. Here’s how.
As an argument to $addFields, we pass the new field which we want to add to each document of its input (here it is fullName). Inside that, we pass the $concat operator which intakes an array of expressions which are needed to be concatenated. We passed, $user.firstName, space
and $user.lastName.
Here is how the response looks like
Conclusion
In this post we discussed the basics of $lookup, $unwind, $project and $addFields. Let me know if you have any doubts in the comments section.
Please don’t forget to share this with your developer friends.
Happy Coding 🙂
Originally posted on Appsyoda.com