Can someone clarify the shard key optimization donor items demo for me? How does taking the middle key from each of the 100 lists ensure an even distribution of keys across 100 partitions?
It is pretty simple. If you pick 100 values at random and use 100 partitions, some of the partitions will be empty while others will hold more of the data. In the test conducted here, he has 100k items, and if you don't spread them across the 100 partitions (about 1k each), you will hit the 1000 WCU limit per partition. So what he has done is make sure the data is distributed evenly. Since the number of partitions is fixed (100), he takes, for each partition, a value between 1 and 10k that falls inside that partition. If we use these values as the key for the GSI, the data is distributed evenly.
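To make the "some partitions will be empty" point concrete, here is a small simulation. It is only a sketch: DynamoDB's real hash function is internal, so MD5 and the equal-range split in the hypothetical `partition_for` helper are stand-ins I am assuming for illustration, and the `order#0` to `order#99` keys are the naive scheme mentioned elsewhere in this thread.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 100   # assumed number of physical partitions
HASH_SPACE = 2 ** 128  # MD5 output range; a stand-in for DynamoDB's internal hash


def partition_for(key: str) -> int:
    """Map a key to a partition by splitting the hash space into equal ranges."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest(), "big")
    return h * NUM_PARTITIONS // HASH_SPACE


# Naive shard keys: order#0 .. order#99, one per intended partition.
naive_keys = [f"order#{i}" for i in range(NUM_PARTITIONS)]

# Count how many keys land in each partition.
hits = Counter(partition_for(k) for k in naive_keys)

empty = NUM_PARTITIONS - len(hits)
busiest = max(hits.values())
print(f"{empty} partitions got no key, busiest partition got {busiest} keys")
```

With 100 keys hashed into 100 ranges, it is practically certain that several ranges get no key at all while others get two or more, which is the hot-partition problem described above.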
@khaledgaber4643 @ricardospear922 As far as I understand it: instead of using keys like order#0-99, you insert 10,000 bogus items (donor items). This is primarily to fill out the partition space, and since the set is big enough, it's going to fill it evenly enough. Then you do a parallel scan (set up so that it roughly approximates the number of physical partitions you have). Then you take the results and pick the keys that sit in the middle of the lists (100 lists in this case, since there are 100 physical partitions). Instead of assigning keys like order#0-99, you can now choose a key from this list (preferably just pop the last one) for your data entries. What I am failing to understand is why not just use another UUID and prefix it with something at the start to spread things out more evenly. Well, the moral of the story is: if you are going to create a GSI, make sure the partition key is evenly distributed. This is just one trick for it :D
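The donor-item steps above can be sketched offline like this. It is a simulation under assumptions, not the demo's actual code: MD5 stands in for DynamoDB's internal hash, each simulated scan segment is assumed to line up exactly with one physical partition, and `key_hash`, `partition_for` and `middle_key` are hypothetical helpers. My guess at why the middle key is used: it sits furthest from the edges of its segment's hash range, so it should still land in its own partition even if segments and partitions do not line up perfectly.

```python
import hashlib
import random
from collections import defaultdict

NUM_PARTITIONS = 100   # assumed number of physical partitions / scan segments
NUM_DONORS = 10_000    # bogus donor items, as in the comment above
HASH_SPACE = 2 ** 128  # MD5 output range; a stand-in for DynamoDB's internal hash


def key_hash(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


def partition_for(key: str) -> int:
    """Equal-width hash ranges, one per partition (an assumption)."""
    return key_hash(key) * NUM_PARTITIONS // HASH_SPACE


# Step 1: insert donor items with random keys (seeded so the run is repeatable).
rng = random.Random(0)
donors = [f"donor#{rng.getrandbits(64):016x}" for _ in range(NUM_DONORS)]

# Step 2: stand-in for the parallel scan: one list of keys per segment.
segments = defaultdict(list)
for key in donors:
    segments[partition_for(key)].append(key)


# Step 3: take the key in the middle of each segment's hash range.
def middle_key(keys):
    ordered = sorted(keys, key=key_hash)
    return ordered[len(ordered) // 2]


shard_keys = [middle_key(segments[p]) for p in sorted(segments)]

# Step 4: check that the chosen keys cover distinct partitions.
covered = {partition_for(k) for k in shard_keys}
print(len(shard_keys), "shard keys cover", len(covered), "partitions")
```

New items would then be given one of the `shard_keys` as their GSI partition key instead of order#0-99, so each key is known to sit in a different partition rather than relying on 100 hashes happening to spread out.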