Cursor Pagination is the FASTEST - But you can't use it if...

Milan Jovanović

Views: 19K

Published: Oct 15, 2024

Comments: 97

@MilanJovanovicTech 8 months ago

Want to master Clean Architecture? Go here: bit.ly/3PupkOJ Want to unlock Modular Monoliths? Go here: bit.ly/3SXlzSt

@arthurasimpson 7 months ago

Another problem is IDs with (unknown) gaps in between. Here, finding the cursor value for a given page becomes even more difficult or even impossible (this is precisely the reason why the database cannot do an index seek for an offset, but has to count rows by scanning). A small addition from me, however: this technique is wonderful for updating or deleting in large tables. You can update/delete rows in blocks by working with TOP and WHERE ID < x in a loop.
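
The batched update/delete idea above can be sketched roughly as follows. This is a minimal illustration, not the commenter's code: it assumes a hypothetical `sales` table in an in-memory SQLite database, and since SQLite has no `TOP`, a `LIMIT` subquery stands in for it. The helper name `delete_in_batches` is made up for the example.

```python
import sqlite3


def delete_in_batches(conn, cutoff_id, batch_size=1000):
    """Delete rows with id < cutoff_id in batches; return the total deleted."""
    total = 0
    while True:
        # SQLite has no TOP, so the batch is picked with a LIMIT subquery.
        cur = conn.execute(
            "DELETE FROM sales WHERE id IN ("
            " SELECT id FROM sales WHERE id < ? ORDER BY id LIMIT ?)",
            (cutoff_id, batch_size),
        )
        conn.commit()  # commit per batch so each transaction stays short
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Hypothetical demo data: 5000 rows with sequential IDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 5001)],
)
conn.commit()

deleted = delete_in_batches(conn, cutoff_id=3001, batch_size=500)
remaining = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(deleted, remaining)
```

Each batch is located through the primary key index, so the work per iteration stays roughly constant regardless of how large the table is.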

@MilanJovanovicTech 7 months ago

Nice addition to the discussion, thanks

@arthurasimpson 7 months ago

@MilanJovanovicTech You're welcome! I discovered you on LinkedIn and saw that the videos are even better 😉

@heischono4917 15 days ago

Thank you very much. It's nice to know that I've done everything right, even without having seen this video beforehand. I am currently working with realtime data, and saving the last ID helps to keep the next query fast. However, I find your example with the fixed value 200400 somewhat misleading. It's completely ok for benchmarks, but in real life I don't even know which ID I have as a starting point at the beginning. The video would have benefited from providing a practical approach to this. I personally know what to do, but @emma-vi's question shows the need for clarification.

@MilanJovanovicTech 15 days ago

Of course, of course... I needed something stable for the benchmark. In a real world example, we'd fetch the first page, and then use the ID of the last record as the cursor.
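
A minimal sketch of that real-world flow, assuming a hypothetical `sales` table in SQLite with an integer primary key (the helper name `fetch_page` is illustrative, not from the video). The first page starts from cursor 0, and each following request passes the ID of the last row it received. The IDs deliberately contain gaps to show that the cursor never has to be computed from a page number. The `ORDER BY` is kept explicit because SQL does not guarantee row order without it.

```python
import sqlite3


def fetch_page(conn, cursor_id=0, page_size=3):
    """Return (rows, next_cursor) for the page that starts after cursor_id."""
    rows = conn.execute(
        "SELECT id, amount FROM sales WHERE id > ? ORDER BY id LIMIT ?",
        (cursor_id, page_size),
    ).fetchall()
    # The cursor for the next request is the ID of the last row just read.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical demo data with gaps between IDs (e.g. after deletions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, 10.0 * i) for i in (1, 2, 5, 9, 10, 14, 20)],
)

pages = []
cursor_id = 0  # initial cursor: smaller than any real ID
while True:
    rows, next_cursor = fetch_page(conn, cursor_id)
    if not rows:
        break
    pages.append([row[0] for row in rows])
    cursor_id = next_cursor

print(pages)
```

In an API, the client would typically hold `next_cursor` and send it back with the next request.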

@emma-vi 8 months ago

Hi there, great videos btw. I am just wondering, how do you know that the cursor = 200400 corresponds to the page that you are trying to access? thanks

@MilanJovanovicTech 8 months ago

Just a setup after checking the DB - helps with benchmarks

@emma-vi 8 months ago

@MilanJovanovicTech But I mean, how can I apply this to a table where I don't know which number corresponds to which page? For example, I have a table that can have logically deleted rows, so the number itself can't be used to calculate a page.

@osman3404 7 months ago

@emma-vi The app or the search screen logic will need to cache the value to use as the cursor.

@emma-vi 7 months ago

@osman3404 But if you add an ordering or a filter, that doesn't work anymore, right? Because the cursor uses a number as an anchor to calculate the next rows.

@phugia963 6 months ago

He just doesn't know it and can't provide you with the answer. In fact, identifying where the cursor is for the correct page is very tricky and sometimes impossible if your ID is a GUID or otherwise non-sequential. Normally, the only situation where I can see us leveraging cursor pagination is infinite scroll, where we would know the exact next cursor position. For normal paging that allows the user to go to any arbitrary page, a cursor won't work! And the way he presented his example is confusing enough for your question to arise :D

@MsGordonbennett 8 months ago

Great video. Thanks Milan

@MilanJovanovicTech 7 months ago

My pleasure!

@mattmarkus4868 8 months ago

Nice! Thank you, perhaps a naive question but you picked an id to use as your cursor. How would you know what to use as your cursor in a real life scenario? Maybe I misunderstood something.

@MilanJovanovicTech 8 months ago

It should be a column that's sortable in creation order

@orterves 8 months ago

I think your cursor example needs an orderBy

@MilanJovanovicTech 8 months ago

The PK is already sorted, so it wouldn't have an effect. But it can help if using a non-indexed column. Or traversing the index in the opposite order.

@dsvechnikov 8 months ago

In theory, you can add a map of IDs corresponding to pages. You will then need to update it periodically and, whenever querying a page, query an offset (the number of records added since the last map update) and add it to the record ID from the map. This adds the ability to go to an arbitrary page (if you use sequential numeric IDs and sort paged records by that ID). It may perform better than offset...limit if the paged table contains many more records.

@MilanJovanovicTech 8 months ago

That would be a mess to maintain for each user and with records being added and removed

@dsvechnikov 8 months ago

@MilanJovanovicTech It sure would be. Not so much if you don't need to maintain maps for different users (all users see the same list of records and therefore have the same pages) but still quite messy. And I completely forgot about the removal of records being a thing... It could be accounted for relatively easily as part of the periodical map updates, but between updates some pages would intersect. OR there could be a list of records removed since last map update which can be used to calculate proper offsets when needed... But anyway, it is indeed a very complicated and messy solution that wouldn't make sense for probably anyone

@ronaldschutte7948 7 months ago

Is there a concept of keeping page IDs in a cache to use them as cursor boundaries to speed up the queries? I would never hard-code an ID like this, but I can imagine collecting these IDs, for instance, every day to cache them.

@MilanJovanovicTech 7 months ago

It's obviously just for demo purposes 😅 Typically you will save this value on the client side

@denm8822 8 months ago

OData has been the best solution for me for about a decade.

@MilanJovanovicTech 8 months ago

I never had a chance to use it

@ConradAkunga 7 months ago

How practical is cursor pagination in a real-life scenario given: 1. You almost always want to allow the data to be sorted by user-defined criteria - order date, order value, etc. 2. Once real-life scenarios such as order deletion/cancellation/reversals start to occur the id becomes very brittle, especially when you add a where clause to the select e.g. you want to page all non-cancelled, non-reversed orders

@MilanJovanovicTech 7 months ago

It's not for general purpose pagination, but fits perfectly for use cases where you need an infinite-scroll solution. Examples could be social media timelines, e-commerce catalogs, e-mail, etc.

@AlexanderRadchenko 8 months ago

Order by will break cursor pagination, for example order by name

@MilanJovanovicTech 8 months ago

Yep, cursor pagination sucks if you need random sorting

@AnatholyBonder 8 months ago

Just add an expression to the arguments to detect the order field and then use the EF method when building the query.

@AnatholyBonder 8 months ago

Ah, and of course make the method generic :)

@drhdev 8 months ago

That won't fix the underlying database call, though, which is the reasoning behind this video. You will need good covering indexes

@bogdanb904 8 months ago

I'm guessing cursor pagination would not work when you throw in filtering.

@MilanJovanovicTech 8 months ago

Nope, works fine with filtering. Doesn't work with random sort orders.
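
As a rough sketch of why filtering is compatible with a cursor: the filter is just one more predicate next to the cursor condition, so cancelled or logically deleted rows are skipped without any page arithmetic. The `orders` table, its `status` column, the index name, and the helper `fetch_active_page` are all assumptions made for this example.

```python
import sqlite3


def fetch_active_page(conn, cursor_id=0, page_size=2):
    """Return (rows, next_cursor) for active orders after cursor_id."""
    rows = conn.execute(
        "SELECT id, status FROM orders"
        " WHERE status = 'active' AND id > ?"
        " ORDER BY id LIMIT ?",
        (cursor_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical demo data: even IDs are cancelled, odd IDs are active.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
# An index that leads with the filter column lets the database seek directly.
conn.execute("CREATE INDEX ix_orders_status_id ON orders (status, id)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(i, "cancelled" if i % 2 == 0 else "active") for i in range(1, 11)],
)

first_page, cursor_id = fetch_active_page(conn)
second_page, _ = fetch_active_page(conn, cursor_id)
print(first_page, second_page)
```

The sort order is still the ID here; it is a change of sort column, not a filter, that requires a different cursor.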

@vigneshveeramani3934 8 months ago

I would like to know about Dapper with Clean Architecture. Can you provide a video on a proper implementation?

@MilanJovanovicTech 8 months ago

I touched on that in a recent CQRS video

@maziyar.m 8 months ago

Wow, God bless you, brother

@MilanJovanovicTech 8 months ago

Thank you

@jeanpatrick2412 5 months ago

Great Video Milan. Is there a way for this to work with Strongly Typed Ids?

@MilanJovanovicTech 5 months ago

Yes, but too cumbersome for my liking. The juice ain't worth the squeeze.

@petropzqi 8 months ago

What if your ID is not auto-incremented or if it's a GUID?

@MilanJovanovicTech 8 months ago

Then you'd need another auto-incrementing column (or sortable column at least) - a good example is a CreatedOnUtc column

@salmanshafiq8151 8 months ago

Wow ❤ Does it work for Guid primary key?

@MilanJovanovicTech 8 months ago

Not really, you need something you can sort on that also matches when the records were created. Otherwise, you might get a different result each time as more records are added/removed from the DB.

@ЗамирЗакиев-т6д 8 months ago

Cursor pagination is faster, but you cannot order the data by any column other than the ID. So if the default ID order is enough, cursor pagination is the better choice, but if the user wants to sort by other columns, offset pagination is better.

@MilanJovanovicTech 8 months ago

Yes

@akilarsath6499 2 months ago

Then which sorting should be used?

@afshin7104 7 months ago

Great video! But what if our Id is a Guid? How do you do cursor pagination then?

@MilanJovanovicTech 7 months ago

You need something else that's sortable to pair it with a GUID. An auto-incrementing integer, CreatedOn column, etc.

@Chris-zb5nm 8 months ago

But why on earth would anybody want to have a pagination that filters by ID? A pagination means: Where(any condition) & OrderBy(any column) & any page & any size

@MilanJovanovicTech 8 months ago

Think about your Gmail inbox. You see the latest 50 emails, and can navigate to the next page, etc. The emails are sorted in the order that they arrive in your inbox - i.e. creation order. Which is exactly what an integer PK gives you - creation order.

@matiasmiraballes9240 8 months ago

@MilanJovanovicTech At which moment do you decide to create/update the cursors in order to fetch the latest 50 emails? On each email arrival, do you delete the cursor for said user, recalculate which ID the cursor should have to be the 51st element, then offset all the records from that point (by deleting and recreating them with Id+1) to make room for the cursor? Yes, you could potentially use a step of 2 or bigger so you always have room for creating cursors, but there is also the point of this cursor being hardcoded in the code. It seems very unwieldy when it comes to updating the cursors.

@antonmartyniuk 8 months ago

@matiasmiraballes9240 You don't need to update a cursor each time. All you need to do is order emails by ID descending and paginate backwards with the cursor: where Id < x
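
A minimal sketch of that backwards pagination, under the assumption of a hypothetical `emails` table with an integer primary key. The first request uses the largest possible ID as its cursor, and a newly arrived email does not shift the page that follows, because the cursor is a value from the data rather than a stored position. `fetch_newest` and `MAX_ID` are names invented for the example.

```python
import sqlite3

MAX_ID = 2**63 - 1  # largest signed 64-bit value, used as the initial cursor


def fetch_newest(conn, before_id=MAX_ID, page_size=2):
    """Return (rows, next_cursor), newest first, for IDs below before_id."""
    rows = conn.execute(
        "SELECT id, subject FROM emails WHERE id < ? ORDER BY id DESC LIMIT ?",
        (before_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


# Hypothetical inbox with five emails.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, subject TEXT)")
conn.executemany(
    "INSERT INTO emails (id, subject) VALUES (?, ?)",
    [(i, f"mail {i}") for i in range(1, 6)],
)

first_page, cursor_id = fetch_newest(conn)

# A new email arrives while the user is still looking at the first page.
conn.execute("INSERT INTO emails (id, subject) VALUES (6, 'mail 6')")

second_page, _ = fetch_newest(conn, before_id=cursor_id)
print([r[0] for r in first_page], [r[0] for r in second_page])
```

With offset pagination, the same insert would have pushed email 4 onto the second page a second time.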

@rouensk 8 months ago

I would not call this cursor pagination - it's just selecting by clustered index. Cursor pagination is a technique that actually uses the database feature CURSOR, by defining a query (where you can actually use ORDER BY with expressions other than just Id) and then FETCHing pages.

@MilanJovanovicTech 8 months ago

Nope, this is cursor/keyset pagination. A DB cursor is something else.

@osman3404 7 months ago

What if the ID used as cursor was a Guid and not a sequential id?

@MilanJovanovicTech 7 months ago

Won't work in that case, you need something that's sortable + grows in "creation order". If you have to use a Guid, you'd need another column to handle the creation order part. A good solution could be a CreatedOn column

@harundurakoglu3414 8 months ago

Where do you find this kind of information?

@MilanJovanovicTech 8 months ago

Research

@Ariel-yv8uw 8 months ago

What theme do you use in your Visual Studio? It's fire Man

@MilanJovanovicTech 8 months ago

It's ReSharper syntax highlighting

8 months ago

But how do you know the cursor initially?

@MilanJovanovicTech 8 months ago

For a number: 0, or max(int)/max(long). Basically, the default value of a cursor is one where any other value is greater or smaller.

@abellima3501 8 months ago

ok, great, but what if it was a different order, for example, an order by price?

@MilanJovanovicTech 8 months ago

Then you'd need an index on that column, and a way to solve duplicates.
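
One common way to "solve duplicates" is to make the cursor a pair: the sort column plus the unique ID as a tie-breaker. The sketch below assumes a hypothetical `products` table with prices stored as integer cents; the index name and the helper `fetch_by_price` are illustrative only.

```python
import sqlite3


def fetch_by_price(conn, cursor=None, page_size=2):
    """Return (rows, next_cursor) ordered by (price, id).

    cursor is the (price, id) pair of the last row already read, or None.
    """
    if cursor is None:
        rows = conn.execute(
            "SELECT id, price FROM products ORDER BY price, id LIMIT ?",
            (page_size,),
        ).fetchall()
    else:
        last_price, last_id = cursor
        rows = conn.execute(
            "SELECT id, price FROM products"
            " WHERE price > ? OR (price = ? AND id > ?)"
            " ORDER BY price, id LIMIT ?",
            (last_price, last_price, last_id, page_size),
        ).fetchall()
    next_cursor = (rows[-1][1], rows[-1][0]) if rows else None
    return rows, next_cursor


# Hypothetical catalog: three products share the same price.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER NOT NULL)")
conn.execute("CREATE INDEX ix_products_price_id ON products (price, id)")
conn.executemany(
    "INSERT INTO products (id, price) VALUES (?, ?)",
    [(1, 1000), (2, 500), (3, 1000), (4, 1000), (5, 700)],
)

pages = []
cursor = None
while True:
    rows, next_cursor = fetch_by_price(conn, cursor)
    if not rows:
        break
    pages.append(rows)
    cursor = next_cursor

print(pages)
```

Without the ID tie-breaker, a page boundary falling between rows with the same price could skip or repeat some of them.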

@drhdev 8 months ago

On the second example, your CountAsync should be before your pagination. If you kept it like it is you don’t need 2 roundtrips lol.

@MilanJovanovicTech 8 months ago

How would that change the number of round trips?

@drhdev 8 months ago

@MilanJovanovicTech Rewatch your video at 4:30. It’s easy to overlook. You wrote CountAsync after ToListAsync and said it takes 2 trips. You accidentally called CountAsync on the paginated query instead of before paging. If you are indeed doing what you are coding, one round trip was enough, but I’m guessing you didn’t mean to call count on the paginated results. For real pagination you should use CountAsync on the entire filtered query set, then call ToListAsync with the pagination.
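
The ordering described here can be illustrated in plain SQL terms. This is only a sketch of the general idea, not the video's EF Core code: the total is counted over the whole filtered set, and only afterwards is the page cut out with `LIMIT`/`OFFSET`. The `sales` table, the `min_amount` filter, and the helper `fetch_offset_page` are assumptions for the example.

```python
import sqlite3


def fetch_offset_page(conn, min_amount, page, page_size):
    """Return (total_count, rows) for a 1-based page of the filtered set."""
    # Round trip 1: count the whole filtered set, before any paging.
    total_count = conn.execute(
        "SELECT COUNT(*) FROM sales WHERE amount >= ?",
        (min_amount,),
    ).fetchone()[0]
    # Round trip 2: fetch just the requested page of that same filtered set.
    rows = conn.execute(
        "SELECT id, amount FROM sales WHERE amount >= ?"
        " ORDER BY id LIMIT ? OFFSET ?",
        (min_amount, page_size, (page - 1) * page_size),
    ).fetchall()
    return total_count, rows


# Hypothetical demo data: ten sales with amounts 10, 20, ..., 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO sales (id, amount) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 11)],
)

total_count, rows = fetch_offset_page(conn, min_amount=30, page=2, page_size=3)
print(total_count, [r[0] for r in rows])
```

Counting the already paginated rows would only ever report the page size, which is available from the returned list without a second query.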

@MohammedHassan-ug4cu 8 months ago

How will this approach work if the Id is a Guid?

@bobek8030 8 months ago

It won't

@MilanJovanovicTech 8 months ago

You need another column you can sort by time of creation, like a CreatedOnUtc column
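
A sketch of how a GUID key might be paired with a creation timestamp, assuming a hypothetical `documents` table whose `created_on_utc` column holds ISO 8601 strings (which sort correctly as text). Because several rows can share a timestamp, the GUID is kept in the cursor as a tie-breaker. The empty-string starting cursor is an assumption that works here because every real value compares greater than it.

```python
import sqlite3
import uuid


def fetch_page(conn, cursor=None, page_size=2):
    """Keyset page ordered by (created_on_utc, id); cursor is that pair."""
    # ("", "") sorts before every real timestamp and GUID string.
    created, last_id = cursor if cursor else ("", "")
    rows = conn.execute(
        "SELECT id, created_on_utc FROM documents"
        " WHERE created_on_utc > ? OR (created_on_utc = ? AND id > ?)"
        " ORDER BY created_on_utc, id LIMIT ?",
        (created, created, last_id, page_size),
    ).fetchall()
    next_cursor = (rows[-1][1], rows[-1][0]) if rows else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id TEXT PRIMARY KEY, created_on_utc TEXT NOT NULL)"
)
conn.execute("CREATE INDEX ix_documents_created_id ON documents (created_on_utc, id)")

# Hypothetical rows: random GUID keys, some sharing a creation timestamp.
timestamps = [
    "2024-10-15T08:00:00Z",
    "2024-10-15T08:00:00Z",
    "2024-10-15T09:30:00Z",
    "2024-10-15T09:30:00Z",
    "2024-10-15T11:45:00Z",
]
inserted = [(str(uuid.uuid4()), ts) for ts in timestamps]
conn.executemany("INSERT INTO documents (id, created_on_utc) VALUES (?, ?)", inserted)

seen = []
cursor = None
while True:
    rows, next_cursor = fetch_page(conn, cursor)
    if not rows:
        break
    seen.extend(rows)
    cursor = next_cursor

print(len(seen))
```

The GUID contributes nothing to the ordering except breaking ties, so pages still follow creation order.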

@ttolst 8 months ago

Interesting that the final version will not work with the UI shown in the thumbnail 😉 If I were in a situation where a primitive, unsortable next-page implementation was needed, my data would not be in a SQL DB anyway.

@MilanJovanovicTech 8 months ago

It would work - so long as you go to the next/prev page only 😁

@Leobraic 8 months ago

What about if the Id is a Guid? Does the cursor approach still work?

@MilanJovanovicTech 8 months ago

It "works" in theory - you can sort a Guid, right? But it's practically useless, because a Guid is random. You want something that is increasing with creation time, like a numeric PK does.

@Leobraic 8 months ago

@MilanJovanovicTech In this scenario, what approach do you suggest when we only have Guids as keys?

@SuperAwdawdawdawd 8 months ago

@Leobraic Sort by creation date, and then use Skip().Take()

@ChuDevMoHon 8 months ago

Is the cursor the ID of the last record in the Sale table?

@mokeev1995 8 months ago

Nope. It's an ID of some specific record in the table (near the end of this table, if I understand Milan correctly).

@MilanJovanovicTech 8 months ago

The ID of the last record that you READ in the current page - and it denotes the start of the next page

@AlexanderLoshkaryov 8 months ago

@MilanJovanovicTech Thanks Milan. Wouldn't it be more readable if "Last()" were used instead of [^1]? But I appreciate the approach shown; from my understanding, it'll do the job when working with arrays.