Great talk. I think the TL;DR is:
- Don't use rate limits.
- Capacity *planning* is hard. Make it dynamic.
- Dynamic capacity planning can be done with AIMD: en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease
- It's OK to tell your clients you're overloaded; they are the ones who are obliged to respect back pressure.
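For anyone unfamiliar, AIMD is the same scheme TCP uses for congestion control, applied here to a dynamic concurrency limit. A minimal sketch in Python (the variable names, factors, and overload signal are my illustrative assumptions, not from the talk):

```python
# AIMD: grow the limit slowly while things succeed,
# cut it sharply on any overload signal.
def aimd_update(limit, overloaded, add=1, mult=0.5, floor=1):
    """Return the next concurrency limit.

    limit      -- current allowed in-flight requests
    overloaded -- True if we just saw a rejection/timeout
    add        -- additive increase per healthy interval (assumed value)
    mult       -- multiplicative decrease factor (assumed value)
    """
    if overloaded:
        return max(floor, int(limit * mult))  # back off fast
    return limit + add                        # probe for more capacity slowly

limit = 10
limit = aimd_update(limit, overloaded=False)  # -> 11
limit = aimd_update(limit, overloaded=True)   # -> 5
```

The asymmetry is the point: capacity is probed for gently but surrendered immediately, so the system oscillates just under the overload point instead of camping on it.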
You're still rate limiting; you're just pushing that job onto the clients. But how will they know how long to wait after receiving a NOPE before trying again? If they retry too quickly, they'll just keep getting NOPEd. If they retry too slowly, there may be stranded capacity again.

Instead, what if we simply sent back a "wait this long before your next request" header with every response? The wait period could be zero if the server is below capacity, but if it's at capacity we calculate a conservative estimate in milliseconds for how long they should wait (it may be different every time) before making the next request. Simply compare how many requests we completed last second to how many we complete this second, assume demand will be the same next second, and divide the next second up fairly among all the clients. Clients we've never seen before get priority, since they have no known wait time to go by, and we should give them VIP treatment anyway over the clients who have been hammering us for a while with no sign of stopping. Clients who disrespect the wait time get deprioritized, or even NOPEd.

I feel like this would maintain a constant near-100% pressure, and yet clients also know exactly what to expect: if they respect the wait time, they're guaranteed a quick response and no NOPE. If they see a wait value that's too high, they can choose to write the server off as too congested and give up for now. That just leaves more capacity for the rest of the clients. The same happens when you give a client a wait time but they have nothing more to send: some capacity goes unused, and you can account for that in the next second's measurement.
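The per-client wait calculation described above could be sketched like this (the function name, the fairness policy, and the "full second when we have no data" fallback are my assumptions):

```python
def next_wait_ms(completed_last_second, active_clients, seen_before):
    """Conservative per-client wait hint, per the scheme above.

    Assume next second's capacity equals last second's throughput and
    divide it fairly: each client gets one request per fair slot.
    """
    if not seen_before:
        return 0  # new clients get VIP treatment: no wait
    if completed_last_second == 0:
        return 1000  # no throughput data yet; ask for a full second (assumed fallback)
    # milliseconds between requests = 1000 / (fair slots per client per second)
    return int(1000 * active_clients / completed_last_second)

next_wait_ms(200, 50, seen_before=True)   # 4 slots/client/second -> 250 ms
next_wait_ms(10, 40, seen_before=True)    # more clients than slots -> 4000 ms
next_wait_ms(100, 10, seen_before=False)  # new client -> 0 ms
```

Note the hint degrades gracefully: when demand outstrips capacity the wait simply grows past one second, which is exactly the "too congested, give up for now" signal described above.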
It sounds helpful for the server-to-proxy and proxy-to-client hops to use a back-channel signal to advise a "back off", and then to respond by dropping/refusing requests if the advice is not followed.
One thing I'd like to see is how AIMD is configured. How do you decide the backoff factor in the multiplicative decrease and the additive factor in the additive increase?
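There's no single right answer, but TCP's classic choice (add 1 per healthy interval, halve on congestion) is the usual starting point; the Wikipedia article linked above gives the general form. One way to reason about the trade-off is recovery time (a sketch; the parameter values are illustrative, not recommendations):

```python
def recovery_steps(limit, add=1, mult=0.5):
    """Increase steps needed to regain the capacity lost by one decrease.

    A larger `mult` (gentler backoff) or larger `add` recovers faster,
    at the cost of oscillating closer to the overload point.
    """
    lost = limit - int(limit * mult)
    return -(-lost // add)  # ceiling division

recovery_steps(100)            # halved to 50, +1/step -> 50 steps
recovery_steps(100, add=5)     # -> 10 steps
recovery_steps(100, mult=0.9)  # gentler cut, lose only 10 -> 10 steps
```

In practice the decrease factor is sized by how badly a brief overload hurts you (a shared database might warrant 0.5; a stateless service might tolerate 0.9), and the additive step by how quickly you want to reclaim capacity after a transient dip.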
Instead of using `number of concurrent users`, shouldn't we be using the `time taken to serve a request` as the deciding factor for increasing/decreasing incoming requests?
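Latency is indeed a common control signal for this: delay-based congestion control (e.g. TCP Vegas) works the same way. A hedged sketch of driving the AIMD adjustment from serve time instead of an explicit rejection signal (the target threshold and factors are assumptions):

```python
def latency_aimd(limit, observed_ms, target_ms=100, add=1, mult=0.8, floor=1):
    """Adjust a concurrency limit from serve time rather than rejections.

    If requests take longer than target_ms, treat it as congestion and
    back off multiplicatively; otherwise probe upward additively.
    """
    if observed_ms > target_ms:
        return max(floor, int(limit * mult))
    return limit + add

limit = 20
limit = latency_aimd(limit, observed_ms=40)   # fast responses -> 21
limit = latency_aimd(limit, observed_ms=250)  # queueing detected -> 16
```

The appeal is that latency rises *before* the server starts refusing work, so the limiter can react ahead of hard failures; the catch is picking `target_ms`, since normal latency variance can trigger spurious backoff.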