How we cut our server costs by ~20x 💸 @ 250K monthly active users
Several people have been asking why we use Rust, so here's the full story:
Back in July 2023 📅, we were serving 80K monthly active users. Jenni users were sending about 4 requests per second to our servers.
Like many AI startups, we built our initial backend server with Python. Python was a good language to prototype in. It had convenient AI/ML libraries (NumPy, OpenAI SDK) and Jupyter notebooks were great for tinkering.
Our backend server is monolithic and handles many operations on Jenni, including prompting and the RAG pipeline. It's deployed on serverless Cloud Run.
The Problem 🐍
By mid-2024 📅, we had scaled to 250K monthly active users and were serving 10+ requests per second.
As the complexity of our product scaled over time, 2 main problems stood out:
1. High Resource Consumption
Python was consuming a lot of CPU and memory, and it was hard to debug. Sometimes containers took a minute to start.
2. Increased Bugs
One source of bugs was the lack of strictness in Python as a dynamically typed language. For example, you could pass an object into a prompt string, and no compiler would warn you something went wrong.
These silent errors were costly and something we couldn't afford.
And yes, we tried Python type hints and linting. The tooling just didn't hold up for us.
Getting Rusty 🦀
After hearing success stories from friends who tried Rust and surveying other developers' opinions, I decided to prototype a simple API endpoint in Rust.
We deployed this prototype to production and it ran extremely fast. More importantly, the strict compiler ensured that a whole class of errors simply could not happen.
Being 100% confident about something is much more valuable than creating many unit tests that lead to 99% confidence. The best tests are the ones you never had to write.
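To make the "object in a prompt string" bug concrete, here's a minimal sketch of how Rust rules it out. This is not our production code: `Chunk`, `Session`, and `build_prompt` are made-up names for illustration. In Python, an f-string will happily interpolate any object using its default repr. In Rust, `{}` only accepts types that explicitly implement `Display`, so the mistake fails at compile time instead of reaching the model.

```rust
use std::fmt;

// Hypothetical type: a retrieved text chunk that is meant to appear in prompts.
struct Chunk {
    source: String,
    text: String,
}

// Opting in: only types that implement Display can be formatted with `{}`.
impl fmt::Display for Chunk {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}] {}", self.source, self.text)
    }
}

// Hypothetical type that should never end up inside a prompt (no Display impl).
struct Session {
    user_id: u64,
}

// Hypothetical prompt builder: the signature only accepts a Chunk.
fn build_prompt(question: &str, chunk: &Chunk) -> String {
    format!("Context: {}\nQuestion: {}", chunk, question)
}

fn main() {
    let chunk = Chunk {
        source: "notes.txt".to_string(),
        text: "Rust has no garbage collector.".to_string(),
    };
    let session = Session { user_id: 42 };

    println!("{}", build_prompt("Does Rust have a GC?", &chunk));

    // Neither of these lines compiles if uncommented:
    // let bad = format!("Context: {}", session);      // Session does not implement Display
    // let bad = build_prompt("question", &session);   // expected &Chunk, found &Session

    println!("session belongs to user {}", session.user_id);
}
```

The point of the sketch is that there is no unit test to write for this case: the wrong call never builds.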
Outcome
Within a few months, our team got the hang of Rust and was able to rewrite our Python server in Rust.
Results:
• Cost Optimization: Without any further optimization work, our server cost dropped by ~20x.
• Faster Container Starts: Our containers start within ~0.6 seconds - an order of magnitude faster.
• Less Compute Usage: ~60x less compute to serve the same traffic.
Rust helped us scale up by scaling down and doing more with less.
Although Rust does have a learning curve and has fewer libraries than Python, the tradeoffs made sense for our team and scale.
We still prototype in Python and use it for research, but Rust is a clear winner for reliable production code.
Curious: what do you guys use Rust for?