First, a quick summary: you don't need to worry about the order of `.skip()` and `.limit()` (chained methods), but you do need to worry about the order of `$skip` and `$limit` (aggregation stages).
In MongoDB, you might use helpers like `find` and `updateMany` to perform simple database operations. These helpers use chaining - e.g. `.skip()` and `.limit()`. But for more complex database operations, you might use an aggregation pipeline. Aggregation uses stages - e.g. `$skip` and `$limit`.
Chaining (`.skip()` and `.limit()`) and stages (`$skip` and `$limit`) behave differently, which can mess you up if you don't get them in the right order.
In MongoDB chaining, the order of `.skip()` and `.limit()` is not important.
When you're running a helper like `db.collection.find()` to fetch documents from a single collection, you'll probably use the chained methods `.skip()` and `.limit()` to get paginated data from MongoDB.
This:

```js
db.collection.find({}).limit(20).skip(60);
```

Will give you the same results as this:

```js
db.collection.find({}).skip(60).limit(20);
```

You'll get 20 results, skipping the first 60 (e.g. page 4 if you fetch 20 results per page).
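In other words, the server only cares about the values you pass, not which method you call first. Here's a small sketch (the `page` and `pageSize` names are just for illustration) that computes the skip value from a 1-based page number:

```js
const pageSize = 20;
const page = 4; // 1-based page number

// Skip everything on the previous pages, then take one page of results.
// The .skip()/.limit() call order wouldn't change the outcome here.
db.collection.find({}).skip((page - 1) * pageSize).limit(pageSize);
```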
> When you chain `skip()` and `limit()`, the method chaining order does not affect the results. The server always applies the skip operation based on the sort order before it applies the limit on how many documents to return.
>
> — MongoDB - Using `skip()` with `limit()`
So I got used to the order of `skip()` and `limit()` not being that important. And that all worked great... until I needed an aggregation pipeline.
For more complex database queries, involving things like data transformation, you might need aggregation, and if you're switching from MongoDB CRUD helpers (like `find`), you'll need to get your stages in order!
MongoDB skip and limit order in aggregation
When calling `skip()` and `limit()` on a cursor in MongoDB (as we just covered), the order doesn't matter. Because the server always applies the skip before it applies the limit - even if you put limit first.
But when you’re using an aggregation pipeline, order is important. The server isn't going to put skip and limit in the right order for you anymore. Because when you use aggregation, you're in charge!
Order is important in aggregation because each stage of the aggregation pipeline only gets the documents passed to it from the previous stage.
Let’s say you have 240 results. And you’re returning a page of 60 results to the client.
For a cursor both `.skip(60).limit(60)` and `.limit(60).skip(60)` work the same. You'll skip the first 60 results, and get the next 60. Order doesn't matter.
But in aggregation, if you put `$limit` before `$skip`, and use the following in your aggregation pipeline:

```js
db.collection.aggregate([
  { $limit : 60 },
  { $skip : 60 },
]);
```
You’re going to get no results. Nothing. Zero. Even though you know results are sitting in your database collection waiting to be fetched.
By limiting the results to 60, the `$limit` stage will pass 60 results to the `$skip` stage. And then you tell the pipeline to skip 60 results, so it skips those 60 results it got. So you're skipping all of the results passed from the `$limit` stage to the `$skip` stage. And that's why you get no results! 😬
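If it helps, here's a rough plain-JavaScript analogy (not MongoDB code) of what the pipeline is doing, using `Array.prototype.slice` to stand in for the stages - each call only sees what the previous call returned:

```js
const docs = Array.from({ length: 240 }, (_, i) => i); // 240 pretend documents

// $limit then $skip: keep only the first 60, then skip those same 60 -> nothing left
docs.slice(0, 60).slice(60);  // []

// $skip then $limit: drop the first 60, then keep the next 60 -> 60 documents
docs.slice(60).slice(0, 60);  // [60, 61, ..., 119]
```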
We need to switch those around and put the `$limit` stage after `$skip`.

```js
db.collection.aggregate([
  { $skip : 60 },
  { $limit : 60 },
]);
```
Now you are skipping the first 60 results, and `$skip` will pass the remaining 180 results to the `$limit` stage. `$limit` can then limit those remaining 180 results to 60, and return 60 results as a page of data!

`$skip` before `$limit`.
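One way to avoid getting this wrong is to build the two stages in one place. Here's a minimal sketch (the `paginationStages` helper name is made up) that computes `$skip` from a 1-based page number and always keeps the stages in the right order:

```js
function paginationStages(page, pageSize) {
  return [
    { $skip: (page - 1) * pageSize }, // drop the previous pages first...
    { $limit: pageSize },             // ...then cap the current page
  ];
}

db.collection.aggregate(paginationStages(2, 60)); // skips 60, returns up to 60 documents
```

(As the next section covers, you'd also want a `$sort` stage in front of these.)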
Adding `sort()` for consistency
For simplicity, I've only focused on `skip` and `limit`, but there's another method you will need too - `sort`!

If you're returning paginated data from MongoDB, you really do need to sort. And this applies in both chaining (`.sort()`) and aggregation (`$sort`).
Let's go back to my earlier example:
```js
db.collection.aggregate([
  { $skip : 60 },
  { $limit : 60 },
]);
```
Which 60 documents are we skipping? The first 60? But what are the first 60 documents? We don't know because they are not sorted in any particular order.
So when we come to the next page of the data (`{ $skip : 60*2 }`), the documents might get returned in a different order. We might end up with some of the same documents from the previous page, and some might be missed altogether.
Not cool!
So don't forget to sort your queries! At least on the `_id` field.
```js
db.collection.aggregate([
  { $sort : { _id: 1 } },
  { $skip : 60 },
  { $limit : 60 },
]);
```
Or if you're sorting on another field that isn't unique (for example, a title field sorted alphabetically), add the `_id` field too. This makes sure the sort order stays consistent.
```js
db.collection.aggregate([
  { $sort : { title: 1, _id: 1 } },
  { $skip : 60 },
  { $limit : 60 },
]);
```
Or for chained methods...

```js
db.collection.find({}).sort({ title: 1, _id: 1 }).skip(60).limit(60)
```
Going forward, I'll try to get my `.skip()` and `.limit()` in order on chained methods too, for consistency, and so I don't forget when I need to convert a `find` query to an aggregation pipeline!
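For reference, here's roughly what that conversion looks like in application code. It's a minimal sketch using the MongoDB Node.js driver, assuming `collection` is a driver `Collection` object you've already obtained (and that you're in an async context); both calls should return the same page of documents:

```js
// Chained cursor methods: the .skip()/.limit() call order doesn't matter here
const viaFind = await collection
  .find({})
  .sort({ title: 1, _id: 1 })
  .skip(60)
  .limit(60)
  .toArray();

// Equivalent aggregation pipeline: stage order does matter ($skip before $limit)
const viaAggregate = await collection
  .aggregate([
    { $sort: { title: 1, _id: 1 } },
    { $skip: 60 },
    { $limit: 60 },
  ])
  .toArray();
```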