CRDTs: The Hard Parts

Martin Kleppmann

Подписаться 49 тыс.

Просмотров 66 тыс.

50% 1

Видео Поделиться Скачать Добавить в

Опубликовано:

16 сен 2024

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 38

@jawaffles 4 года назад

As a concrete example of duplication/picking-both-options being a valid resolution, see the merge-conflict behaviour of most any source code version control tool, which inserts conflict markers along with two or more incompatible states. Note this leaves conflict at the user-meaning level, needing user input to resolve, but it resolves the conflict at the level of the version control data structure. This difference between conflict of high level meaning vs. conflict at the data structure level can cause a lot of misunderstandings in casual discussion I think.

@BosonCollider Год назад

There is a VCS that makes this explicit called Pijul, which uses a CRDT to make patches associative & commutative, and goes for exactly this approach where it is very conservative about how to handle conflicting merges (explicitly identifying them and marking them as conflicts) and forces the user to resolve them with a third patch.

@kikikukudu61 Год назад

Great video! Easy to follow, thanks for the hard work! :)

@YG-bx3wk 3 года назад

Such a great video with a detailed explanation! thanks a lot!

@russellbrown8760 4 года назад

Thanks Martin, a really interesting talk, so good to see constant progresss in this field. I have to point out that CRDTs have been in production systems since 2014 at least, so maybe your final summary about them one day being practical and not just research prototypes is a little unfair.

@kleppmann 4 года назад

Good point Russell! I had in mind collaboration software (where the replicas are apps running on end-user devices), where OT is currently still quite popular; of course distributed databases like Riak have been using CRDTs in production for a long time.

@kevinjahns2746 4 года назад

@@kleppmann I pointed out to you before that CRDTs are being used in collaborative software that runs in production. Already in 2016 Pluxbox - a radio management software - integrated the Yjs CRDT for collaborative richtext editing. More recently Nimbus Notes (>1 million users) integrated Yjs as their collaborative engine. Another example is the Teletype CRDT that powers collaborative editing in Atom (since 2017). Dismissing this, and describing the current state of CRDTs as just research prototypes, really doesn't help to make CRDTs more prominent in collaborative software.

@Ondra011 4 года назад

@@kevinjahns2746 I don't think Martin is trying to dismiss the practicality of CRDTs, it seemed more like he was a bit dismayed at the fact that the largest of collaborative editing applications (Google Docs, Office 365) are not using CRDTs. One would think -- or hope -- that Google and Microsoft hire enough talent to solve these hard problems properly and innovate beyond last-century algorithms.

@kevinjahns2746 4 года назад

@@Ondra011 Many people, not just Martin, describe CRDT implementations for shared editing just as prototypes. This is unfortunate and simply not true. Just because Google and the majority of companies are not using them doesn't make them impractical. There is little chance that they are going to rebuild a system that works really well.

@CalebRox 3 года назад

@@kevinjahns2746 What makes CRDT superior to OT for Google Docs-like apps that can already share a central server?

@ruslanfadeev3113 2 года назад

Thought: maybe it's better to design the operations themselves with underlying CRDT structures in mind. For example, if moving a node cannot be done well in all cases, we could give users an on-screen personal clipboard that holds the node that is being moved. So when the user presses Ctrl+X to cut the node, it goes into the clipboard and gets a new ID. When the user presses Ctrl+V, the node is moved from the clipboard to the main graph. This way we're duplicating nodes on concurrent operations, but because of the new element this behavior makes sense to the user because duplication happens during copying into the personal clipboard. And if user A moves the node while user B was in the process of doing so, they can still choose to insert the node somewhere else too. I guess this makes conflict resolution into a part of the editing process.

@AndersonSilva-dg4mg 4 года назад

Thank you very much!

@NikitaNikita-k8q 11 месяцев назад

Great Video! I am adding collaboration to my app and this video helped me a lot.

@hendrikahuis6919 4 года назад

Very interesting and concise talk. Thanks!

@davidmiller252 4 года назад

This would make a really good Computerphile episode. You should contact Brady

@monarodan 3 года назад

It's a bit unfair to say that OT requires a central server without mentioning that this is only done to avoid needing to satisfy the TP2 property of transformation functions.

@artem_r 3 года назад

Thanks for the amazing talk, it was very interesting. I have one question-at the beginning (2:58) there is a table that shows that CRDTs are being used (TomTom GPS, Teletype for Atom, etc), but at the end of the talk you said that CRDTs might become a future of collaborative apps. What does the latter mean? Thanks!

@bernardosulzbach 4 года назад

Great slides.

@dabblewriter 3 года назад

Late to the party, but you could solve the Bacon Milk problem by representing text with an array having each line be a separate entry. This does require additional work for splitting/joining lines, but it seems it would map closer to what the user would expect to happen in a merge.

@vishalsharma-pb4pj 3 года назад

Why not bundle operational transformation in the application itself? So it wouldn't have to go through server each time.?

@MaxPrehl 2 года назад

39:05 The result on Windows from Powershell is: mv : Access to the path 'C:\Users\Max\3D Objects\a' is denied.

@user-gr2wq6qp8c 4 года назад

24:35 would it solve the problem to order branches by session id then time?

@gamenovice4928 Год назад

@23:40 - actually wouldnt the interleaving for typing backwards problem be a big issue for languages that are typed right to left, like arabic or hebrew? That being said you could invert RGA for multi-lingual cases so that the structures are in the right order. But if the string itself is just left to right character sequences, and it's up to the text rendering engine to pick how it's laid out, then what I raise as a question here is likely a non-issue

@asm0di0 4 года назад

Thank you for very interesting talk! I have a question on concurrent move operation in tree from 46:41. Do we really sync all our undo and redo operations to Replica 2? And if yes - then how should it roll back its mv(b, a) operation?

@MikkoRantalainen 4 года назад

Such CRDT system can only implement actions that it can undo. Basically if you originally move something and you have to undo it, you just move it back to previous location. And because *all* changes to this filesystem go using the same CRDT system, it has already undone every change done later so it obviously can undo the move, too. And the "tombstones" mentioned later are just the fact that when you *delete* something, you have to still store it somewhere to be able to undelete perfectly *when* needed.

@asm0di0 4 года назад

@@MikkoRantalainen Thank you for the answer!

@EvgeniyDolzhenko 7 месяцев назад

What I don't quite understand is how is it possible to say "*last* writer wins", how do we get nodes to agree on which one is the last?

@codeelsewhere 4 месяца назад

timestamp

@luckyshaqqq 3 года назад

Thanks Martin, Is the problem with the AutoMerge the same class of problem as the SplitBrain?

@stefan-bayer 9 месяцев назад

Does anyone have insights if Apple implemented CRDTs already in Apple Notes?

@amoghyermalkar5502 4 месяца назад

That's a great question but no way of finding this out :( let me know if you do though!

@nseco 4 года назад

Thanks for this. I am confused about the suggestion to use Lamport clocks at 49:20. Don’t Lamport clocks only provide a partial ordering? The impression I got from the example you give prior to explaining the data structure seem to imply the need for a total order.

@spudwaffle 4 года назад

Nuno Seco see 1:01:50

@bojidarmarinovbg 4 года назад

1:06:42 isn't it possible to encode/decode individual UTF8 characters without storing length alongside them? Referring to the Wikipedia article, en.wikipedia.org/wiki/UTF-8#Encoding, it should be possible to find the number of bytes in the variable-width character by looking only at the significant ones in the first byte.

@Ondra011 4 года назад

sure, but you're also interested in the actual range that the edit added/removed.

@kleppmann 3 года назад

That is sufficient for splitting the byte sequence into code points, but a unicode "character" (technically, a grapheme cluster) may consist of multiple code points. For example, a letter plus an accent, or an emoji plus a skintone modifier. Storing the length separately allows us to encode such situations where a sequence of multiple code points get treated as a single character.