LiteLLM - Dynamic Rate Limiting Demo

<iframe src="https://www.loom.com/embed/1b54b93139ee415d959402cc0629f3f7" frameborder="0" width="1672" height="1254" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>

In this video, I demonstrate how to use the LiteLLM dynamic rate limiter with priority reservations to manage traffic for different use cases. We have a model set to 100 RPM (requests per minute), where the production use case is reserved 90% of the traffic and the development use case gets 10%. I walk through generating a key with priority metadata and show how to run a load test with 100 users to validate the setup. The expected outcome is that the traffic splits according to the defined priorities, with 90 successes for the higher-priority key. I encourage you to implement this feature in your configurations for better traffic management.
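
To make the arithmetic behind the expected result concrete, here is a minimal sketch of a 90/10 split over a 100 RPM budget. It is an illustration only, not LiteLLM's implementation: the priority labels `prod` and `dev`, the helper names `reserved_rpm` and `simulate_minute`, and the simple per-minute counter are all assumptions made for this example.

```python
from collections import Counter

MODEL_RPM = 100  # model-wide limit from the demo
# Hypothetical priority labels; the shares match the 90% / 10% split in the demo.
PRIORITY_RESERVATION = {"prod": 0.9, "dev": 0.1}


def reserved_rpm(model_rpm: int, reservation: dict[str, float]) -> dict[str, int]:
    """Split a model-wide RPM budget by each priority's reserved share."""
    return {name: round(model_rpm * share) for name, share in reservation.items()}


def simulate_minute(requests: list[str], budget: dict[str, int]) -> Counter:
    """Count successes per priority for one minute of requests.

    A request succeeds while its priority still has budget left; otherwise it
    is treated as rejected (a real proxy would typically answer with HTTP 429).
    """
    remaining = dict(budget)
    successes: Counter = Counter()
    for priority in requests:
        if remaining.get(priority, 0) > 0:
            remaining[priority] -= 1
            successes[priority] += 1
    return successes


budget = reserved_rpm(MODEL_RPM, PRIORITY_RESERVATION)
# 100 requests from each priority, interleaved, all within the same minute.
traffic = ["prod", "dev"] * 100
result = simulate_minute(traffic, budget)
print(budget, dict(result))
```

Under these assumptions, the production key is allowed 90 requests in the minute and the development key 10, which is the split the load test in the video is meant to confirm.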