As a database management system, MongoDB stands tall as a powerful NoSQL solution. We developers, often face significant challenges when joining collections in MongoDB.
The process of using the MongoDB lookup multiple fields operation is really not easy and requires many additional steps to structure our data correctly. To synchronize our data with the need, we sometimes have to include operations such as `$match`, `$unwind`, `$group`, and `$map` to structure our data.
However, the complexity of combining these operations can be really stressful and prone to errors. The syntax of MongoDB lookup with conditions with these complicated operations is really not easy and extremely confusing. It's not uncommon to spend considerable time debugging, trying to get everything to work together.
I always get heavy errors with Mongodb complex lookup. I try to fix one part and the other part of my lookup aggregation code breaks.
Recognizing these difficulties, I have developed an ultimate guide to MongoDB $lookup with advanced operations. With this guide, you will have access to detailed explanations and practical examples, covering a wide range of scenarios you might face.
Whether you are filtering nested documents, flattening arrays, grouping results by specific criteria, or transforming data into a desired format, this blog post aims to save you time and effort by traveling from one page to another page for solutions, With this thorough guide, you won't have to look elsewhere for solutions, and I can assure you chatGpt barely helps with the complex scenarios. I tried many times and got humongous errors.
MongoDB Lookup Match
I hope you all have done a simple $match in MongoDB lookup aggregation. But if you have played with complex data structures in MongoDB, then you might be aware of how difficult it is sometimes! I am pasting one simple example of $match with $regexFind.
const checkData = Model.aggregate([
{
$match: {
deletedAt: null,
isGroup: false,
"members.memberId": new ObjectId(params.userId),
"members.status": { $nin: ["rejected", "pending"] },
},
},
{
$lookup: {
from: "users",
localField: "members.memberId",
foreignField: "_id",
as: "friendDetails",
pipeline: [
{
$match: {
$and: [
{
$expr: {
$and: [
{ $eq: ["$deletedAt", null] },
keyword
? {
$or: [
{
$regexFind: {
input: "$email",
regex: keyword,
options: "i",
},
},
{
$regexFind: {
input: "$username",
regex: keyword,
options: "i",
},
},
{
$regexFind: {
input: "$fullname.firstName",
regex: keyword,
options: "i",
},
},
{
$regexFind: {
input: "$fullname.lastName",
regex: keyword,
options: "i",
},
},
],
}
: true,
],
},
},
{ _id: { $ne: new ObjectId(params.userId) } },
],
},
},
{
$project: {
fullname: 1,
username: 1,
email: 1,
totalCoinsCount: 1,
},
},
{
$project: {
friendDetails: { $arrayElemAt: ["$friendDetails", 0] },
members: 1,
},
}
]);
This MongoDB aggregation pipeline retrieves information about friends (users) who are members of a group that the specified userId belongs to. It combines data from two collections, applies filters and projections to refine the output, and ultimately presents a structured document with relevant details about each friend and their membership in the group.
The Role of $match Here
The first `$match` filters documents from the `Primary Model` collection to ensure they are not deleted, are individual user records (`isGroup: false`), and have active memberships for the specified `params.userId` with statuses that are neither "rejected" nor "pending". This initial filtering step focuses on retrieving relevant user data associated with active group memberships.
The second `$match` stage operates within a `$lookup` stage, filtering documents from the `users` collection. It refines the selection based on additional criteria such as ensuring users are not deleted (`deletedAt: null`) and optionally matching a keyword mongodb lookup case insensitiv search across their email, username, or full name fields.
MongoDB Lookup Pipeline Filter
Why do we use $filter? To structure and filter our data as per demand! Let's get a clear understanding of MongoDB $filter here.
const myAggregateStage = [
{
$lookup: {
from: "employee_ratings",
localField: "_id",
foreignField: "employeeScoreId",
as: "ratingDetails",
pipeline: [
{
$match: {
$and: [
{ $expr: { $eq: ["$deletedAt", null] } },
{
companyId: params.companyId
? new ObjectId(params.companyId)
: params.authUser.parentId,
},
],
},
},
{
$project: {
performanceOption: 1,
potentialOption: 1,
},
},
],
},
},
{
$project: {
ratingDetails: {
_id: 1,
performanceOption: {
$map: {
input: "$ratingDetails.performanceOption",
as: "option",
in: {
questionInfo: {
_id: "$$option.questionInfo._id",
question: {
code: "$$option.questionInfo.question.code",
name: "$$option.questionInfo.question.name",
answers: {
$arrayElemAt: [
{
$filter: {
input: "$$option.questionInfo.question.answers",
as: "answer",
cond: {
$eq: ["$$answer._id", "$$option.answerId"],
},
},
},
0,
],
},
},
},
_id: "$$option._id",
},
},
},
potentialOption: {
$map: {
input: "$ratingDetails.potentialOption",
as: "option",
in: {
questionInfo: {
_id: "$$option.questionInfo._id",
question: {
code: "$$option.questionInfo.question.code",
name: "$$option.questionInfo.question.name",
answers: {
$arrayElemAt: [
{
$filter: {
input: "$$option.questionInfo.question.answers",
as: "answer",
cond: {
$eq: ["$$answer._id", "$$option.answerId"],
},
},
},
0,
],
},
},
},
_id: "$$option._id",
},
},
},
},
},
},
];
This aggregation pipeline first looks up related documents from employee_ratings, filters them based on company and deletion status, and then projects and transforms the retrieved data into a structured format (ratingDetails). It focuses on extracting and restructuring performanceOption and potentialOption arrays, ensuring the output contains organized rating details associated with employee scores.
The Role of MongoDB Lookup with Filter Here
Within the $project stage, $filter is applied to the answers array of each question within performanceOption and potentialOption arrays retrieved from the employee_ratings collection. It ensures that only the answer matching $$option.answerId is included in the final output. This means that for each question in performanceOption and potentialOption, $filter selects the corresponding answer where _id matches $$option.answerId.
MongoDB Lookup Pipeline Sort
Websites without sorting functionality is nothing. Let's have a look on easy $sort in the following code!
const myAggregate = Model.aggregate([
{ $match: query },
{ $unwind: "$question.answers" },
{ $sort: { "question.answers.ordering": 1 } },
{
$group: {
_id: "$_id",
question: { $first: "$question" },
answers: { $push: "$question.answers" },
ordering: { $first: "$ordering" },
},
},
{
$project: {
_id: 1,
question: {
code: "$question.code",
name: "$question.name",
answers: "$answers",
},
ordering: 1,
},
},
{ $sort: { _id: 1 } },
]);
This aggregation pipeline performs operations to filter, unwind, sort, group, and reshape documents from the Primary Model collection. It focuses on processing questions and their associated answers, ensuring that the final output is structured and sorted.
The Role of $sort Here
Firstly, after the $unwind stage, $sort is applied to rearrange the documents based on the ordering field within question.answers. This ensures that the answers within each question are processed in ascending order of their specified ordering values.
Secondly, following the $project stage, another $sort stage is applied to order the final output documents by the _id field in ascending order.
MongoDB Lookup let Pipeline Example
We have seen using let variable often. What is it? Let's get a look!
pipeline = [
{
$match: {
deletedAt: null,
},
},
{
$lookup: {
from: "company_rating_options",
let: { optionId: "$optionId" },
pipeline: [
{
$unwind: "$type.options",
},
{
$match: {
$expr: {
$eq: ["$type.options._id", "$$optionId"],
},
},
},
{
$project: {
code: "$type.code",
label: "$type.options.label",
value: "$type.options.value",
_id: "$type.options._id",
},
},
],
as: "optionDetails",
},
},
{
$project: {
optionDetails: { $arrayElemAt: ["$optionDetails", 0] },
},
}
];
checkData = await Model.aggregate(pipeline);
This MongoDB aggregation pipeline enhances documents from the `Primary Model` by appending specific details from the `company_rating_options` collection. It begins by filtering active records where `deletedAt` is `null`. Then, it performs a `$lookup` to fetch related documents based on a matching condition involving `optionId`. The pipeline further refines the joined data by extracting and shaping essential fields (`code`, `label`, `value`, `_id`) into an array field named `optionDetails`.
The Role of MongoDB Lookup with let Here
MongoDB lookup let helps pass the optionId variable from the outer $match stage into the inner stages of the $lookup pipeline. This MongoDB lookup let array variable serves as a dynamic parameter for filtering and matching documents within the company_rating_options collection.
MongoDB Lookup From Multiple Collections
In MongoDB, performing Mongodb lookup of multiple collections for aggregating related data from disparate sources into a single query result. This Mongodb lookup on nested field approach is invaluable in scenarios where complex relationships exist between entities, such as collecting detailed information about related entities like users, products, or transactions.
const myAggregate = await Model.aggregate([
{
$match: query,
},
{
$lookup: {
from: "employee_scores",
localField: "boxNumber",
foreignField: "boxNumber",
as: "employeeDetails",
pipeline: [
{
$lookup: {
from: "users",
localField: "userId",
foreignField: "_id",
as: "employeeData",
pipeline: [
{
$lookup: {
from: "positions",
localField: "positionId",
foreignField: "_id",
as: "positionDetails",
pipeline: [
{ $match: { $expr: { $eq: ["$deletedAt", null] } } },
{
$project: {
title: 1,
},
},
],
},
},
{
$match: {
$and: [
{ $expr: { $eq: ["$deletedAt", null] } },
{
parentId: params.companyId
? new ObjectId(params.companyId)
: params.authUser.parentId,
},
{
employeeStatus: "un-hidden",
},
],
},
},
{
$project: {
firstName: 1,
lastName: 1,
email: 1,
positionDetails: { $arrayElemAt: ["$positionDetails", 0] },
},
},
],
},
},
{
$match: {
$and: [
{ $expr: { $eq: ["$deletedAt", null] } },
{ ratingStatus: "rated" },
],
},
},
{
$project: {
employeeData: { $arrayElemAt: ["$employeeData", 0] },
},
},
],
},
},
{
$project: {
employeeDetails: 1,
},
},
]);
In this Mongodb lookup nested array, I am joining multiple collections here in 3 stages. If you are performing Mongodb lookup on array of objects, the code structure will be the same. And if you are wondering about "$arrayElemAt", then it is nothing but converting an array of 0th index to an object.
We sometimes want Mongodb lookup return object instead of array. That's what it does!
MongoDB Lookup and Unwind
Unwind is very useful, when you have some nested array, and you have to perform some sorting and matching with it.
const myAggregate = Model.aggregate([
{ $match: query },
{ $unwind: "$question.answers" },
{ $sort: { "question.answers.ordering": 1 } },
{
$group: {
_id: "$_id",
question: { $first: "$question" },
answers: { $push: "$question.answers" },
status: { $first: "$status" },
global: { $first: "$global" },
category: { $first: "$category" },
companyId: { $first: "$companyId" },
deletedAt: { $first: "$deletedAt" },
createdBy: { $first: "$createdBy" },
ordering: { $first: "$ordering" },
},
},
{
$project: {
_id: 1,
question: {
code: "$question.code",
name: "$question.name",
description: "$question.description",
wightingPercentage: "$question.wightingPercentage",
answers: "$answers",
fieldType: "$question.fieldType",
},
global: 1,
companyId: 1,
status: 1,
deletedAt: 1,
createdBy: 1,
ordering: 1,
category: 1,
},
}
]);
This MongoDB lookup with condition starts by matching documents in the `Primary Model` collection based on a specified `query`. It then unwinds the `question.answers` array to flatten nested responses.
The pipeline sorts these answers by their `ordering` field in ascending order. Using a MongoDB lookup pipeline group stage, it groups documents by `_id`, retaining the first occurrence of fields like `question`, `status`, `global`, and others.
The Role of $unwind Here
The MongoDB lookup unwind is used to deconstruct the `answers` array nested within each `question` object. This transformation is essential because it allows subsequent stages, such as sorting or grouping, to operate on individual elements of the array rather than the array as a whole.
MongoDB Lookup Merge Objects
The MongoDB $lookup stage with mergeObjects is used to get documents by merging fields from multiple documents collected from a secondary collection into a single document.
const pipeline = [
{
$match: query,
},
{
$project: {
questionAnswers: {
$map: {
input: "$questionAnswers",
as: "qa",
in: {
$mergeObjects: [
"$$qa",
{
totalTimeSpent: {
$cond: {
if: {
$and: [
{ $ne: ["$$qa.submitTime", null] },
{ $ne: ["$$qa.attendTime", null] },
],
},
then: {
$divide: [
{
$subtract: [
{ $toDate: "$$qa.submitTime" },
{ $toDate: "$$qa.attendTime" },
],
},
1000,
],
},
else: null,
},
},
},
],
},
},
},
},
},
{
$unwind: "$questionAnswers",
},
{
$group: {
_id: "$_id",
questionAnswers: { $push: "$questionAnswers" },
totalTimeSpent: { $sum: "$questionAnswers.totalTimeSpent" },
totalQuestionsAnswered: { $sum: 1 },
},
},
{
$project: {
_id: 1,
questionAnswers: 1,
totalTimeSpent: {
$cond: {
if: { $ne: ["$totalTimeSpent", null] },
then: "$totalTimeSpent",
else: 0,
},
},
totalQuestionsAnswered: 1,
},
},
];
const result = await Model.aggregate(pipeline);
This MongoDB aggregation pipeline begins by matching documents in the `Primary Model` collection based on a specified `query`. It then uses the `$project` stage to transform each document by iterating over the `questionAnswers` array using `$map`. Within MongoDB lookup merge objects, it computes `totalTimeSpent` for each `questionAnswer` by calculating the difference in milliseconds between `submitTime` and `attendTime` if both are present, converting the result to seconds. After `$unwind`ing `questionAnswers` for further processing, the pipeline `$group`s documents by `_id`, aggregating `questionAnswers` into an array and calculating `totalTimeSpent` by summing `totalTimeSpent` values across all answers.
The Role of $mergeObjects Here
The mongodb lookup in nested array effectively combines dynamic computations with pre-existing document structures through the use of $mergeObjects, enabling thorough data enrichment and further analysis inside the aggregate framework.
MongoDB Lookup Elemmatch
The $elemMatch is very helpful when you want to filter based on one specific criteria within the complicated nested array.
try {
const {
selectValue = "leaders groupId ordering parentId uuid emailVerified companyName companyEmail companyPhone firstName lastName email noOfEmployees website positionId phone image createdAt updatedAt role status administratorName ratingStartDate ratingEndDate",
} = params;
const select = selectValue && selectValue.replaceAll(",", " ");
const selectProjectParams = convertFieldsToAggregateObject(select, " ");
let query = {
deletedAt: null,
};
const companyRoleOnly = await roleModel
.findOne({
code: "COMPANY-ADMINISTRATOR",
})
.select("code");
query["role"] = {
$elemMatch: {
_id: new ObjectId(companyRole._id),
code: "COMPANY-MANAGER",
},
};
const myAggregateStage = [
{
$match: query,
},
{
$project: {
...selectProjectParams,
},
},
];
const myAggregate = await Model.aggregate(myAggregateStage);
return { status: HttpStatus.OK, message: ApiMessage.SUCCESS, ...result };
} catch (error) {
return errorHandler(error, params);
}
Here MongoDB lookup elemmatch ensures that the `role` field contains an element that matches both the `_id` of a company administrative role (`companyRole._id`) and the code `"COMPANY-MANAGER"`. This operator is crucial as it allows for precise querying within arrays, ensuring that documents are selected only if they contain an array element that satisfies multiple conditions simultaneously.
The Role of `$elemMatch` Here
In this MongoDB lookup pipeline, the `$elemMatch` operator is used within the `$match` stage to filter documents based on nested array elements that meet specific criteria.
I hope you get some advanced ideas of MongoDB lookup with multiple conditions with the above MongoDB lookup pipeline example. If we talk about how fast MongoDB performs? Then I'll say the MongoDB lookup performance can vary based on several factors. And for sure the MongoDB lookup is not slow to perform.
The speed of `$lookup` depends on the size of the collections involved, the complexity of the lookup query, the indexes present, and the hardware resources allocated to MongoDB.
Understanding advanced MongoDB lookup project operations opens up a world of data manipulation and aggregation techniques. In upcoming blog posts, I'll go deeper into how these features can transform how you query and analyze data.