MongoDB: a Fast and Easy Way to Calculate Aggregated Values without Map-Reduce

The MongoDB aggregation framework lets you calculate aggregated values without having to use map-reduce. While map-reduce is a powerful tool, it often proves slow when processing large volumes of data. In this article, I compare map-reduce with the aggregation framework and show the significant benefits of the latter.

Aggregation Framework vs Map-Reduce

The main differences between the Aggregation Framework and Map-Reduce are:

  • declarative syntax, with no need to write code in JavaScript;

  • operations are described as a chain that is applied step by step;

  • expression evaluation;

  • higher performance, because the aggregation framework is implemented in C++ instead of JavaScript;

  • projections of the returned data, so a user can add computed fields, sub-objects, etc.

Framework concepts

The Aggregation Framework provides logic similar to the SQL “GROUP BY” clause. It has two main concepts: pipelines and expressions. A pipeline is a sequence of operators (stages), each of which processes a stream of documents and passes its output to the next one. Expressions compute output values from the fields of the input documents. Some pipeline operators:

  • $match – filters documents using a query predicate, like collection.find({});

  • $project – changes the shape of the result: includes or excludes fields, adds computed values, sub-objects, etc.;

  • $unwind – splits an array field and outputs one document per array element;

  • $sort – sorts documents;

  • $limit – specifies the maximum number of documents to be returned;

  • $skip – skips a specified number of documents.
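As a rough illustration of how the operators listed above chain together, a pipeline is simply an array of stage objects, each holding one operator. The sketch below is not taken from the original project: the field names (status, url, tags, time) are hypothetical and chosen only to show one use of each operator.

```javascript
// Illustrative only: the field names (status, url, tags, time) are hypothetical.
const pipeline = [
  { $match: { status: 200 } },                // keep matching documents, like find({ status: 200 })
  { $project: { url: 1, tags: 1, time: 1 } }, // reshape: keep only these fields (plus _id)
  { $unwind: '$tags' },                       // one output document per element of the tags array
  { $sort: { time: -1 } },                    // newest first
  { $skip: 10 },                              // skip the first 10 documents
  { $limit: 5 },                              // return at most 5 documents
];

// Each stage is an object with exactly one operator key.
const stageNames = pipeline.map((stage) => Object.keys(stage)[0]);
console.log(stageNames.join(' -> '));
```

The order of the stages matters: each one receives the documents produced by the previous stage, so placing $match first reduces the amount of data the later stages have to process.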

Using MongoDB in Node.js: our hands-on experience

MongoDB has drivers for many programming languages and platforms, including Node.js. You can install the Node.js driver by running npm install mongodb.

All MongoDB features are available in the driver. Our task was to aggregate a large collection by three fields in order to build a statistical report. The collection contained about 500k records of web page view statistics. Each document had the following format:

It was necessary to group data by time, IP address and URL. The first version of this logic was implemented using map-reduce:
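A map-reduce job of this kind might look roughly like the sketch below. It is not the original implementation: the document fields (time, ip, url) are assumed from the description above, the sample data is made up, and runMapReduce is a hypothetical in-memory stand-in for the server so that the example can run without a database.

```javascript
// Hypothetical page-view documents; the field names are assumptions.
const pageViews = [
  { time: '2012-07-01', ip: '10.0.0.1', url: '/home' },
  { time: '2012-07-01', ip: '10.0.0.1', url: '/home' },
  { time: '2012-07-01', ip: '10.0.0.2', url: '/about' },
];

// Map: called once per document (bound to `this`); emits a composite key and a count of 1.
function map() {
  emit({ time: this.time, ip: this.ip, url: this.url }, 1);
}

// Reduce: receives a key and all values emitted for it; returns their sum.
function reduce(key, values) {
  return values.reduce((total, v) => total + v, 0);
}

// Tiny in-memory stand-in for the server, for illustration only.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  // MongoDB provides `emit` on the server; here we define it ourselves.
  globalThis.emit = (key, value) => {
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { key, values: [] });
    groups.get(id).values.push(value);
  };
  docs.forEach((doc) => mapFn.call(doc));
  delete globalThis.emit;
  return [...groups.values()].map(({ key, values }) => ({
    _id: key,
    value: reduceFn(key, values),
  }));
}

const mapReduceResult = runMapReduce(pageViews, map, reduce);
console.log(mapReduceResult);
```

In MongoDB itself, the map and reduce functions are JavaScript that the server executes for the documents being processed, which is where much of the overhead comes from.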

Processing 500k records took about 1 minute. This was too slow, so we decided to switch to the aggregation framework available in MongoDB 2.1. The new version of the aggregation logic is presented below:

In this code, we use two pipeline operators: $match and $group. $match filters the required records, and $group aggregates the records by three fields: time, URL and IP. These fields are used as the key because we explicitly specified them in the ‘_id’ field, and the $sum expression calculates the number of records with the same key. The output data has the following form:
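A $match plus $group pipeline of the kind described here could be built as in the following sketch. The field names (time, url, ip), the date-range filter, the count field and the helper name buildPageViewPipeline are assumptions made for illustration, not the original listing.

```javascript
// Builds the two-stage pipeline. Field names (time, url, ip) and the
// date-range filter are assumptions, not the original code.
function buildPageViewPipeline(from, to) {
  return [
    // Stage 1: keep only records in the requested time range.
    { $match: { time: { $gte: from, $lt: to } } },
    // Stage 2: group by the composite key and count the records per key.
    {
      $group: {
        _id: { time: '$time', url: '$url', ip: '$ip' },
        count: { $sum: 1 },
      },
    },
  ];
}

const pageViewPipeline = buildPageViewPipeline(
  new Date('2012-07-01'),
  new Date('2012-08-01')
);
console.log(JSON.stringify(pageViewPipeline, null, 2));

// With a recent Node.js driver, the pipeline would be passed to aggregate(), e.g.:
//   const results = await collection.aggregate(pageViewPipeline).toArray();
// Each result would then have the shape { _id: { time, url, ip }, count: <number> }.
```

Because $sum is given the constant 1, it adds one for every document in a group, which is what produces the per-key record count.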

Result

Using the aggregation framework significantly improved processing performance: the same 500k records are now processed in 3-4 seconds, compared with about 1 minute with map-reduce. The MongoDB aggregation framework is a powerful, simple and lightweight tool that lets you speed up the calculation of aggregated values without using map-reduce.

Chief Product Officer
With a passion for innovation and a keen understanding of market trends, Alexander plays a pivotal role in shaping Magora's product development strategy and ensuring the delivery of cutting-edge solutions to clients.