**We’re [Manychat](https://manychat.com/)! 👋🏻 Chat automation for 1M+ businesses
at massive scale, with Python infrastructure that actually handles it without
headaches.**

Bet you know this feeling all too well: deploying an LLM feature and HOPING for
the best in production 🤞🏻

Costs, rate limits, provider outages taking down your entire feature… they used
to haunt us, but we built systems that can handle it all.

We power chat automation for over a million creators and brands, so "the LLM is
down" isn't an acceptable answer, ever. We built infrastructure that expects
LLMs to misbehave:
- Multi-provider routing that fails over from Azure to OpenAI mid-retry.
- Weighted traffic distribution that we can rebalance without deploying code.
- Cooldowns that pull failing backends out of rotation automatically.
- Observability that actually helps: Prometheus metrics, Grafana dashboards,
  OpenTelemetry traces with business context so we know which customer's request
  just broke.

We also monitor our asyncio event loop like hawks, because nothing ruins your
day faster than discovering a blocking call is starving your entire service.

**And we'd love to discuss this and all things Python with like-minded people!**
Come by our booth to talk to the engineers who built this infrastructure. We can
tell you how we handle rate-limit cascades. How we track token costs per AI
agent. What it's like to debug distributed traces when you're trying to figure
out why one specific request took 6 seconds instead of 2. And anything else
you'd like to know about working in such a high-load environment.

Come see us on stage:
- Daria Korsakova, Python Engineer at Manychat, will talk about "Practical
  observability for Python APIs, workers & jobs".
- Sergi Porta, Python Team Lead at Manychat, will talk about "LLM Traffic Spikes:
  Routing, Rate Limits, and Failover in Python".

Read the full technical breakdown of our infrastructure:
https://medium.com/manychat-engineering/how-to-survive-llm-traffic-spikes-in-python-73955ee9426f