API Evangelist

API Evangelist Conversation with Adrian Machado, Staff Software Engineer at Zuplo

with Adrian Machado, Staff Software Engineer at Zuplo
January 30th, 2025

Adrian Machado from Zuplo came by for a talk on API rate limiting, but not just the technical bits. Zuplo is doing a good job of bridging the business and technical aspects of rate limiting at the API gateway layer, and we had a compelling discussion around how planning your rate limits will shape your overall API business plan. Adrian shared some strong thoughts on the need for standardization of API rate limits and how we approach the API access plans and policies we are implementing in regions around the globe. Beyond the gateway layer, Adrian shared more about the Zuplo ethos that API documentation should be free, embodied in their Zudoku offering, something I can definitely get behind. Thanks for the great talk Adrian, come by anytime.

Conversation

Who are you?

Hi, I’m Adrian Machado. I’m a staff engineer at Zuplo. We work in the API management space. Primarily my role is, I guess I could describe it as growth: anything it takes to grow the company. These days that involves a lot of content creation, writing about influential topics that developers or API leaders care about and helping them stay informed and up to date.

What is API rate limiting?

Yeah, so I recently wrote a sort of comprehensive guide to API rate limiting. And as a part of that, I talked about the standards around communicating your API rate limits. So the obvious question is, okay, why should I [00:02:00] be communicating my API rate limits? And that kind of comes from the view where rate limits are purely a security feature of your API. I guess that boils down to the thinking that APIs are just an infrastructure piece and gateways are just an infrastructure piece, right, rather than being part of your product strategy.

So I take the view that rate limits are actually a feature of your platform. You enforce them because they avoid resource starvation, they avoid noisy neighbors affecting other customers. It just improves the overall quality and response times of your APIs when implemented correctly. And they can even play a further role in how you actually go to market for your API, in terms of how you monetize it. Can you offer higher or different rate limits to your higher-tier customers? Enterprises are going to have different needs than your typical self-serve user, right? You probably want to be more restrictive on your free tier with rate limits to avoid people trying to mine Bitcoin or something [00:03:00] with your API. There are so many different abuses that can happen. So depending on your level of trust, you want to have different rate limits.

Depending on the type of API, you want to have something like complex rate limiting. Let’s say you have a file storage API, like Dropbox or something like that: you don’t want to just rate limit on requests, you probably want to be rate limiting on the number of kilobytes of the files that are being uploaded and downloaded, to avoid people using it for streaming movies or something like that, you know what I mean? So it’s not just about avoiding misuse, it’s about guiding developers on how to use your APIs and opening up higher or different rate limits depending on the use case.
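
To make the file storage example concrete, here is a minimal sketch of a limiter that counts kilobytes rather than requests and varies the quota by plan. It is an illustration only: the tier names, the quota numbers, the 60-second window, and the KilobyteLimiter class are all assumptions made for this example, not anything Zuplo or Adrian described.

```python
from dataclasses import dataclass, field

# Hypothetical per-tier quotas: kilobytes allowed per window (illustrative numbers only).
TIER_KB_PER_WINDOW = {"free": 1_000, "business": 50_000, "enterprise": 500_000}
WINDOW_SECONDS = 60  # assumed window length


@dataclass
class KilobyteLimiter:
    """Fixed-window limiter that counts transferred kilobytes, not requests."""

    usage: dict = field(default_factory=dict)  # customer -> (window_start, kb_used)

    def allow(self, customer: str, tier: str, size_kb: int, now: float) -> bool:
        limit = TIER_KB_PER_WINDOW[tier]
        window_start, used = self.usage.get(customer, (now, 0))
        if now - window_start >= WINDOW_SECONDS:
            window_start, used = now, 0  # the previous window expired, start fresh
        if used + size_kb > limit:
            self.usage[customer] = (window_start, used)
            return False  # this transfer would exceed the tier's quota
        self.usage[customer] = (window_start, used + size_kb)
        return True
```

A gateway would call allow() with the upload or download size before serving the transfer and answer with a 429 when it returns False. A fixed window is the simplest choice here; sliding windows or token buckets smooth out bursts at the cost of more bookkeeping.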

How is rate limiting about the business of API?

Yeah, exactly. And when it comes to thinking of API as a product, that term is thrown around a lot, but what I think of it as being is: how do we merge the technical requirements that engineers care about with the dollars and cents that your product managers and business leaders are going to care about? I think rate limiting is an intersection there, because one, it could be a monetization lever when you control them together. But if you’re doing something like complex rate limiting, or maybe dynamic rate limiting depending on the customer, there are engineering [00:05:00] implications there. So the PM may have one idea, like, oh, this is really good, we should try this monetization model, but then you need to have the tooling and infrastructure to support that, right? So finding that right balance is, you know, API as a product: what is your go-to-market strategy and what’s feasible? And I think part of that is how you communicate your rate limits as well.

Does API rate limiting need to be standardized?

Yeah, and there are already some efforts around standardization within the rate limiting space. I’d describe them as being kind of different tiers. There’s a bronze tier, silver, [00:06:00] gold, and platinum.

Starting with the bronze tier, I’d say the fact that you have rate limits is good, and the fact that you send back a 429 response, I think that’s okay. That’s the bare, bare, bare minimum communication that you can have. And maybe within your API docs, whether you’re using Mintlify or whatever, you mention it in one line where you have, oh, these are the rate limits for our API. That’s really restrictive for a business, I’d say, having it in a static documentation format, because it means that if you want to update it, you probably have to create a breaking change for your API. It also makes it hard for you to have different rate limits for different parts of your API. Maybe you offer different APIs that need rate limits on different things, whether it’s concurrent connections or AI tokens, right? So just having it within one section of your docs is restrictive, and not communicating that dynamically, I think, is restrictive.

So we move into the next level of communication, which [00:07:00] involves sending back a more detailed error message, typically using the standard of Problem Details. I forget the RFC for Problem Details now, but essentially it’s just a JSON body that you send back that tells you, okay, what type of error is this? What’s the error code? And here are some details about it, and a trace that you can go and reference within your support tickets, right? So if your customer is running into this, they can manually read the information, like, hey, you hit a 429 because you sent, I don’t know, a thousand requests when your limit is 900 or something like that.

We want to move beyond that, I’d say, beyond where engineers need to be manually inspecting arbitrary JSON bodies to understand how to use your API, especially when it’s not going to be people using APIs a lot of the time going into the future. It might be AI agents that are consuming it, right? Or maybe you want to have tooling that actually adapts to rate limits. So that’s when we move into, I’d say, the gold tier approach. By gold tier, I don’t mean this is [00:08:00] actually that great, it’s just better than what most people do. It’s where you include a Retry-After header, and that header indicates how long you have to wait until you can try that request again, right? Typically this is an amount in seconds, so 3600 means, okay, you have to wait an hour before you can start sending requests again. So if I’m someone who’s writing an API wrapper or an SDK, I would definitely use that property. My code would pick up on that and be like, okay, I’m going to chill out on sending client-side requests until then. That’s going to introduce some front-end latency, but maybe you can have, let’s say, a hook in React that hooks into that and displays a loading screen in between. You know what I mean? I can report the status back to you programmatically, and the fact that this is dynamic now and can be used programmatically, I think that’s very powerful for delivering a good user experience for your end users.
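
The middle tiers described above can be sketched in a few lines. The first function builds a 429 response carrying a Problem Details body (RFC 9457, which obsoletes RFC 7807) together with a Retry-After header; the second is what an SDK or API wrapper might do with it. The function names, the example.com type URI, and the "trace" extension member are assumptions made for illustration, and the client handles only the seconds form of Retry-After, not the HTTP-date form.

```python
import json
from typing import Optional


def build_429(limit: int, sent: int, retry_after_s: int, trace_id: str):
    """Return (status, headers, body) for a rate-limited request."""
    problem = {
        "type": "https://example.com/problems/rate-limit-exceeded",  # hypothetical URI
        "title": "Too Many Requests",
        "status": 429,
        "detail": f"You sent {sent} requests; your limit is {limit}.",
        "trace": trace_id,  # extension member, handy to quote in support tickets
    }
    headers = {
        "Content-Type": "application/problem+json",
        "Retry-After": str(retry_after_s),
    }
    return 429, headers, json.dumps(problem)


def seconds_to_wait(status: int, headers: dict) -> Optional[int]:
    """Client side: how long to pause before retrying, or None if there is no guidance."""
    if status != 429:
        return None
    value = headers.get("Retry-After", "")
    # Retry-After may also be an HTTP-date; this sketch only handles the seconds form.
    return int(value) if value.isdigit() else None
```

A wrapper could sleep for seconds_to_wait(...) before retrying, or pass the number up to the UI so it can show a loading state instead of failing outright.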

What are some concerns around API monetization?

Yeah, I think that making your API as understandable by AI as possible is important. And I think there are some standards emerging around this, around better communication about your API. Some people are of the opinion that we should radically simplify our documentation. There’s the proposed standard of llms.txt, the sort of successor to robots.txt, where it’s literally just a plain text file, stripped of any HTML or whatever, that’s just [00:11:00] pure text on, let’s say, how your API operates. I think this is a very transitional standard, and I don’t think it’s going to stick around for long, in my opinion, because I think agents are going to get smarter, they’re going to be able to render JavaScript, and more importantly, they’re going to need dynamic information rather than just static information.

It’s the same problem as just documenting your rate limits within your docs. Okay, that’s a one-and-done thing, but agents can be interactive. They want to know different, bespoke information. And if you have different rate limits documented in weird ways within an arbitrary text file, it’s going to be hard for them to even understand, right? So Darrel Miller, for example, has created this new standard around rate limiting information that can be dynamically sent to you via headers. They’ll tell you what quotas are being used by your current API call, the time window of the reset, how many items within that quota you have left, whether that’s AI [00:12:00] tokens or kilobytes in file upload or requests, and when that’s going to be reset.

And I think agents will actually be able to use that information to be a smarter consumer when making these calls, whether that’s in terms of picking what plan they’re going to be using (okay, I know what the rate limits are between the enterprise tier and the business tier), or when actually consuming that API as part of their agent workflow, where they can dynamically adapt to that in a better way. Maybe they can write an API wrapper or choose to make calls selectively based on those headers that are coming back. In fact, maybe AI agents will be better consumers of APIs than the current tooling that’s out there right now, when most people are just using Postman to arbitrarily fire off requests. So I think that dynamically communicating your API and documentation on an ongoing basis will put you further ahead in terms of supporting agents.
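
As a rough sketch of how an agent or SDK might adapt to quota headers, the code below reads three fields named RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset and decides how long to hold off. Those names follow earlier versions of the IETF RateLimit header fields draft; later versions changed the syntax, so treat the field names and the integer format as assumptions and check what a given API actually sends. The Quota class and both function names are invented for this example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Quota:
    limit: int      # total units allowed in the window (requests, tokens, kilobytes...)
    remaining: int  # units left in the current window
    reset_s: int    # seconds until the window resets


def read_quota(headers: Mapping[str, str]) -> Optional[Quota]:
    """Parse the three RateLimit-* fields; return None if any is missing or malformed."""
    # Real HTTP header names are case-insensitive; this sketch expects exact casing.
    try:
        return Quota(
            limit=int(headers["RateLimit-Limit"]),
            remaining=int(headers["RateLimit-Remaining"]),
            reset_s=int(headers["RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None


def delay_before_next_call(quota: Optional[Quota], cost: int = 1) -> int:
    """Return seconds to wait before a call costing `cost` units (0 means go now)."""
    if quota is None or quota.remaining >= cost:
        return 0
    return quota.reset_s
```

The cost parameter is there because the quota unit may not be requests: an agent about to upload a large file or spend many AI tokens can check whether the remaining quota covers that specific call before making it.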

What is important to you right now?

So I think what’s cool is the fact that I can take a look at these standards and actually implement them within our platform at Zuplo to actually move API development forward. [00:14:00] I find that there’s a little bit of a divide, where we have had these RFCs coming out every year, whether that’s deprecation and sunset headers or rate limiting headers now, but adoption is primarily limited to enterprises that are API-first, I’d say. So Telnyx is one company that’s adopted one version of these headers, but then the RFC changed, and now I don’t know if they’re going to go and change their APIs accordingly, right? And it’s hard to adopt these. It’s hard to prioritize that from a business standpoint, like, oh, how do I justify this?

So what I try to do at Zuplo whenever we implement something is adapt to these standards in such a way that it’s seamless. When you add, let’s say, your rate limiting policy to Zuplo, we automatically create a Retry-After header for you that automatically gets sent back. You don’t even have to think about it, because we have control of the implementation, which you can configure, obviously, but you don’t have to think about adopting the best standards. We’re opinionated as a platform, to force the [00:15:00] industry to move forward in adopting standards. And the more we adopt standards, as you’ve seen with OpenAPI, the more tooling becomes justified and flourishes, right? Now we have so many other tools that are built off of OpenAPI because, as an industry, we kind of came together and pushed those standards forward.

Can you apply API gateway policies to a single domain?

Yeah, and that’s kind of what I’d like to call the third wave of API management, which I’ll probably talk about in a talk in the future. I think new tooling is coming out now that focuses on developer experience, radically simplifying things that were hard before, and decoupling and unbundling different features. Currently you have these startups, I’d say Speakeasy and Stainless on the SDK side, Fern and Mintlify on the docs side. They’re all picking out different parts that have traditionally been difficult to do and making it easier to solve these issues around API tooling. And as they mature, I think they’ll move upmarket, they’ll be able to better serve enterprises, and [00:17:00] the end result will be a new generation of API tooling, whether they end up consolidating in the future or maybe they stay unbundled, that enables you to have high-quality APIs with a fraction of the effort that it used to take.

Is experimentation at the API gateway important?

Yeah, I agree. And that all comes back to standards and tooling. I think the advantage of a lot of these startups, including ourselves at Zuplo, is our [00:18:00] nimbleness in adopting standards and tooling, whether it’s OpenAPI or Problem Details or OpenTelemetry. There are kind of dinosaurs in the API world, especially the ones owned by the big boxes, that are slow to adopt these. Customers, I think, can understand the benefit of these open standards, like OpenTelemetry, for example, and they’ll choose the startup instead, because the developer experience is that much better and the adherence to new standards and tooling is that much better. That is our competitive advantage compared to the AWS API Gateways of the world, which are almost the default option for most people.

Why should API documentation be free?

Yeah. I think that if we [00:19:00] want to have a better ecosystem, some components are going to have to have good free offerings. And that’s our thinking with Zudoku. We think that the current standard is people using, let’s say, what’s the SmartBear open source version? I forget. Yes, Swagger UI. I see that very often used by hobbyist developers, and it’s been baked into platforms like FastAPI, for example, or frameworks, I should say. But I don’t think anyone loves it. It’s just the fact that it’s the free thing that’s out there and it works. I’m not saying it doesn’t work. It’s out there, but, how would I put it? When you see it, it very much strikes me as, okay, this person did not think very deeply about what the flow is going to be.

And if you’re selling an API as a business, whether you’re a hobbyist or a large company, that’s going to be [00:20:00] the first experience that developers get with you. If they see the free-tier Swagger OpenAPI docs there, they’re going to be like, hey, I think they just code-genned a doc and put it together. They didn’t think too much about it. Whereas when they see, I don’t know, OpenAI’s API documentation, it’s like, okay, this is a professional API. This is the one I’m going to pay for. This is the one I trust. So you can think about your API docs as being the home page for your business. And that’s why we want to raise the bar by introducing Zudoku as a high-quality free version, so we elevate the hobbyists out there, or the startups out there, to have better experiences.

Adrian Machado
Staff Software Engineer at Zuplo

An avid developer, and technology enthusiast. I excel in Web Frontend and Backend development. I've also been dabbling in Machine/Deep Learning lately too! In my spare time, I like learning about politics, world history, business, and fitness/nutrition.