API Evangelist Conversation With Juan Cruz Viotti of Sourcemeta About Schema Registries

with Juan Cruz Viotti, Founder at Sourcemeta
February 18th, 2025

Juan Cruz Viotti of Sourcemeta came by for a conversation about the schema registry solution he recently built. Juan has gone deep into the JSON Schema rabbit hole, deeper than anyone else I've ever met. His validation tooling and schema registry are providing a new foundation for enterprises to get their schema house in order. Juan and I both agree this is the single most foundational thing you can invest in when it comes to API governance. I learn so much every time I talk to Juan and explore his work. We will be working together more on the evolution of his schema registry, and API Evangelist will be leaning more into the JSON Schema educational resources he's contributed to at https://www.learnjsonschema.com/2020-12/.

Conversation

Who are you?

Hello, my name is Juan. Nice to be here.

What do you do?

I do a lot of work on JSON Schema. I’m a member of the JSON Schema TSC (Technical Steering Committee), and nowadays I also have my own company called Sourcemeta that is completely dedicated to the JSON Schema space and to making it easy for companies to use JSON Schema in production.

Why is JSON Schema important?

Going back to the history, I got into the rabbit hole pretty much accidentally. I was working on IoT and looking for a way to do more space-efficient binary serialization, which is completely different from the API world. But I started seeing the potential of JSON Schema, and once I saw that, and the many different areas where JSON Schema is applicable, including APIs, it got me even more excited. Schemas have proven to be the foundational block for so many things, including API governance. So I think it’s just a huge opportunity and a lot of fun.

Why did you build a schema registry?

Yeah, sure. What I’ve seen a lot is that, as you’re saying, OpenAPI is mostly about JSON Schema. It can be thought of as a thin wrapper around JSON Schema for describing APIs. The problem is that a lot of people don’t know JSON Schema, and they might not really know how to operate JSON Schema at scale. For example, if you are writing a very simple OpenAPI specification by yourself for a basic API, then life is kind of easy, right? You don’t really need to think it through a lot. But as your company grows, you have lots of APIs and you’re trying to govern them all. Maybe you have a hundred OpenAPI specs. That’s where the challenge comes in. Even with something as simple as reusability, you see so many people just struggling. And that’s where the schema registry idea came from. Can we make something really simple that at least starts encouraging all of these things, and then build on top of that base to support more advanced use cases?
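To make the reusability point concrete, here is a minimal sketch of the idea behind a registry: schemas are published once under a stable identifier and other schemas point at them with `$ref` instead of copying them. Everything here is an assumption for illustration, not Sourcemeta's registry: the in-memory `REGISTRY`, the `register` and `resolve` helpers, and the `schemas.example.com` URLs are hypothetical, and the sketch ignores keywords that sit next to `$ref` as well as cyclic references.

```python
from typing import Any

# Hypothetical in-memory registry: each schema is stored under its "$id".
REGISTRY: dict[str, dict[str, Any]] = {}


def register(schema: dict[str, Any]) -> None:
    """Publish a schema under its "$id" so other schemas can reference it."""
    REGISTRY[schema["$id"]] = schema


def resolve(schema: Any) -> Any:
    """Return a copy of `schema` with every absolute "$ref" inlined.

    Simplification: an object containing "$ref" is replaced entirely by the
    referenced schema, and reference cycles are not detected.
    """
    if isinstance(schema, list):
        return [resolve(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    if "$ref" in schema:
        return resolve(REGISTRY[schema["$ref"]])
    return {key: resolve(value) for key, value in schema.items()}


# One shared definition of an address...
register({
    "$id": "https://schemas.example.com/address",
    "type": "object",
    "required": ["city"],
    "properties": {"city": {"type": "string"}},
})

# ...reused twice by a customer schema instead of being copied.
customer = {
    "$id": "https://schemas.example.com/customer",
    "type": "object",
    "properties": {
        "billing": {"$ref": "https://schemas.example.com/address"},
        "shipping": {"$ref": "https://schemas.example.com/address"},
    },
}

resolved = resolve(customer)
```

With a hundred OpenAPI documents, the same pattern means a change to the address schema happens in one place rather than in a hundred copies.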

Why does the version of JSON Schema you use matter?

Yeah, that’s a very important one, and it’s pretty interesting. I’ve been doing a little work through my company on ontologies created with JSON Schema, both for data science and for APIs. They face a problem that I think is very common: which version of JSON Schema am I releasing my set of schemas for? That’s often not an easy thing to answer. On one side you would think, just go to the latest, right? That’s kind of the default. But in many cases, if you have a big enough audience, not everybody is on the latest for various reasons, and then you’re thinking, okay, maybe I should support the last two or the last three. The ideal answer to all of this is: do you even need to think about it at all? So what I’m trying to create is tooling that lets you automatically upgrade, and downgrade when possible, schemas across versions. You can release on the version that you want and serve users on different versions on the fly, out of the box, without any thinking. For example, and this is coming to the registry, let’s say you’re deploying your schemas on the registry and you are writing them in draft 4, for whatever reason, but then you have a customer that’s trying to use them on draft 2020-12. You can say, okay, give me this schema, but give it to me as 2020-12, and the registry will automatically, on the fly, upgrade it to the version that you need. So the entire versioning problem kind of disappears. We support all of it.
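As a rough illustration of what an automatic draft 4 to 2020-12 upgrade involves, here is a minimal sketch that handles a few of the well-known keyword changes between those drafts: `id` became `$id`, `definitions` became `$defs`, boolean `exclusiveMinimum`/`exclusiveMaximum` became numeric, and array-form `items` became `prefixItems`. The `upgrade` function is hypothetical and is not Sourcemeta's implementation. It deliberately leaves out other differences, such as `dependencies`, the changed behaviour of keywords next to `$ref`, and `$ref` values that point at other files.

```python
from typing import Any

DRAFT4 = "http://json-schema.org/draft-04/schema#"
DRAFT2020 = "https://json-schema.org/draft/2020-12/schema"

# Draft 4 keywords whose values contain subschemas (partial list for this sketch).
SUBSCHEMAS = {"not", "additionalProperties", "additionalItems", "items"}
SUBSCHEMA_LISTS = {"allOf", "anyOf", "oneOf"}
SUBSCHEMA_MAPS = {"properties", "patternProperties", "definitions"}


def upgrade(schema: Any) -> Any:
    """Return a 2020-12 version of a draft 4 schema (subset of keywords only)."""
    if not isinstance(schema, dict):
        return schema  # for example "additionalProperties": false

    # Recurse only into known subschema locations, so that instance data such
    # as property names or enum values is never rewritten.
    out: dict[str, Any] = {}
    for key, value in schema.items():
        if key in SUBSCHEMA_LISTS or (key == "items" and isinstance(value, list)):
            out[key] = [upgrade(item) for item in value]
        elif key in SUBSCHEMA_MAPS:
            out[key] = {name: upgrade(sub) for name, sub in value.items()}
        elif key in SUBSCHEMAS:
            out[key] = upgrade(value)
        else:
            out[key] = value

    if out.get("$schema") == DRAFT4:
        out["$schema"] = DRAFT2020
    if "id" in out:
        out["$id"] = out.pop("id")
    if "definitions" in out:
        out["$defs"] = out.pop("definitions")

    ref = out.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/definitions/"):
        out["$ref"] = "#/$defs/" + ref[len("#/definitions/"):]

    # Tuple validation: array-form "items" is "prefixItems" in 2020-12, and
    # "additionalItems" takes over the name "items".
    if isinstance(out.get("items"), list):
        out["prefixItems"] = out.pop("items")
        if "additionalItems" in out:
            out["items"] = out.pop("additionalItems")
    else:
        out.pop("additionalItems", None)  # ignored in draft 4 without tuple "items"

    # Draft 4 exclusive bounds are boolean modifiers of minimum/maximum.
    for flag, bound in (("exclusiveMinimum", "minimum"), ("exclusiveMaximum", "maximum")):
        if isinstance(out.get(flag), bool):
            exclusive = out.pop(flag)
            if exclusive and bound in out:
                out[flag] = out.pop(bound)
    return out


draft4_schema = {
    "$schema": DRAFT4,
    "id": "https://schemas.example.com/price",
    "type": "object",
    "properties": {
        "id": {"type": "string"},  # a property that happens to be named "id"
        "amount": {"type": "number", "minimum": 0, "exclusiveMinimum": True},
        "tags": {"$ref": "#/definitions/tags"},
    },
    "definitions": {
        "tags": {"type": "array", "items": [{"type": "string"}], "additionalItems": False}
    },
}

upgraded = upgrade(draft4_schema)
```

Downgrading runs the same mappings in reverse, and it only works when the schema avoids keywords that the older draft has no equivalent for, which is why the conversation says "when possible".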

How does tooling influence how we use JSON Schema?

Yeah. Just to provide a bit more context on that, we looked at schema upgrades, and downgrades in particular, during the last Google Summer of Code session on JSON Schema. One of the projects was exactly about this. We did a lot of work mainly on upgrading, but we also did a huge amount of preliminary work on what downgrading would look like. I was the mentor of that project, and to my surprise, we found out that there was an extremely surprising amount of stuff that we could downgrade, more than you would think. To the point that I’m now changing my mind a bit: downgrading is actually feasible in an enormous number of cases.

What is the relationship between OpenAPI and JSON Schema?

Yes. OpenAPI, as well as many other specifications like AsyncAPI, or Bramo as well, the W3C Web of Things, and so many other specifications you’ve probably never heard of, are building on top of JSON Schema. JSON Schema is the foundational, general-purpose language that allows you to describe and validate JSON data. Very, very general, right? It doesn’t assume anything about the use case. All of these higher-level specifications like OpenAPI take that and, in a way, repurpose JSON Schema for a specific use case. So a good way to think of it is that OpenAPI is a thin wrapper around JSON Schema that lets you use JSON Schema to describe APIs. At the core, 80 to 90 percent of it is actually just JSON Schema. That’s what you’re using.
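To see the "thin wrapper" idea in practice, the sketch below walks a small, made-up OpenAPI 3.1 document and lists every place where the content is plain JSON Schema. The document, the `Orders API`, and the `json_schemas` helper are hypothetical and written for illustration only. The walk treats any key named `schema`, plus the entries under `components.schemas`, as JSON Schema, which is a simplification of the full OpenAPI structure.

```python
from typing import Any, Iterator

# A tiny, hypothetical OpenAPI 3.1 document expressed as a Python dictionary.
openapi_document = {
    "openapi": "3.1.0",
    "info": {"title": "Hypothetical Orders API", "version": "1.0.0"},
    "paths": {
        "/orders": {
            "post": {
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/Order"}
                        }
                    }
                },
                "responses": {
                    "201": {
                        "description": "Created",
                        "content": {
                            "application/json": {
                                "schema": {"$ref": "#/components/schemas/Order"}
                            }
                        },
                    }
                },
            }
        }
    },
    "components": {
        "schemas": {
            "Order": {
                "type": "object",
                "required": ["sku"],
                "properties": {
                    "sku": {"type": "string"},
                    "quantity": {"type": "integer", "minimum": 1},
                },
            }
        }
    },
}


def json_schemas(node: Any, path: str = "#") -> Iterator[tuple[str, Any]]:
    """Yield (JSON Pointer, schema) for each JSON Schema found in the document."""
    if isinstance(node, list):
        for index, item in enumerate(node):
            yield from json_schemas(item, f"{path}/{index}")
        return
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        # Escape "~" and "/" as JSON Pointer requires.
        child = f"{path}/{key.replace('~', '~0').replace('/', '~1')}"
        if key == "schema":
            yield child, value
        elif key == "schemas" and path == "#/components":
            for name, schema in value.items():
                yield f"{child}/{name}", schema
        else:
            yield from json_schemas(value, child)


found = list(json_schemas(openapi_document))
for pointer, schema in found:
    print(pointer, "->", schema)
```

Everything the walk yields could be handed unchanged to a JSON Schema validator. The OpenAPI-specific parts are the paths, operations, and media types wrapped around those schemas.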

What does Sourcemeta do?

Well, the mission of the company is to help people actually use and unleash the power of JSON Schema. As you’re saying, JSON Schema is so foundational, but I feel, and this is a personal thing where I’m scratching my own itch, that people are not able to unleash this power. I saw this power myself when I was working on space-efficient satellite data transfer for IoT. There was a point where it clicked: this is really powerful stuff, and I want everybody else to be able to get that same power I got. And I felt that the ecosystem out there, the tooling and the resources, is not there by far. So the goal of the company is to get there. We have many different products, both free ones and some enterprise ones that we’re building, and we’re trying to cover this entire problem space. Starting with education, we own and maintain the learnjsonschema.com website that a lot of people rely on. We’re shooting a premium video course on JSON Schema for people to learn a lot of this. We’re building a standard library of JSON Schema concepts so that you can more easily assemble things. We have a high-performance validator for some very high-throughput finance use cases. We have a JSON Schema CLI for managing ontologies of schemas. And then we are developing some enterprise offerings, like the schema registry, so you can host your own source of truth at your organization. We’re trying to cover this entire space. People are starting to see, through some of the good tooling that we have, that it gives you the power to do so much more, and so much more easily, with schemas. That’s the core of it.

I’m super thankful you’re here, because we connected at Postman around a lot of this, and then I spent a year rolling out governance.

Where should enterprises start with getting their schema house in order?

Yeah, you cannot start at the moment. All of the challenges that you’re describing are what I’ve seen as well, and I think they split into two groups. First, as you said, is literacy. Most people don’t even know that JSON Schema is a thing, and even the ones who know cannot learn it properly, because there is not a place to actually learn it properly, right? People on the TSC like myself learned it the hard way: banging our heads against the wall, reading the specifications a million times, trying to come up with our own implementations, facing our own problems, and asking a trillion questions. We took the hard path. You cannot expect everybody to go there, and there is no resource at the moment. So that’s one side, and that’s where learnjsonschema.com, something like the video course, and my O’Reilly book come in. The other side is that even if you assume you have the resources to learn JSON Schema and become a pro JSON Schema developer, when you get there you’re going to find that the tooling actually sets you back, because the ecosystem is completely fragmented. You have non-compliant tooling that leads people to write invalid schemas, like we’ve seen with Spectral pretty recently. That’s just another massive mountain you have to climb, and I’m also trying to fix that. Whether you are an end user or a company building an offering on top of JSON Schema or OpenAPI, ideally the best JSON Schema foundation that you can get, you’re going to find it at Sourcemeta.

Why is performance important for JSON Schema validation?

Yes. Performance has been a big one. Lots of people use JSON Schema, or want to use JSON Schema, for validation on production services, for example in API gateways or in API frameworks like Fastify. Everybody using Fastify is using JSON Schema to validate requests. That’s the default that they offer. What you see is that a lot of companies complain that schema validation on their API takes a huge amount of compute, and either they are paying for that compute or they just completely disable validation because it’s too expensive, mainly in high-throughput cases such as finance. That led to a fun research line of mine, a project called Blaze. We’re going to be publishing a paper about it soon, but it’s all built in the open, so you can go take a look. It’s an enterprise-ready, extremely high-performance C++ JSON Schema validator. The paper claims Blaze is the highest-performance validator out there, at least 10x faster than the other ones. It can bring validation times down to the nanosecond range, which has not been seen before in the JSON Schema space. It proves that JSON Schema can be complex, yes, but if you put in the effort and write good tooling, what you can squeeze out of it is enormous.
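The conversation does not describe how Blaze reaches those numbers, so the sketch below only illustrates one general technique for reducing per-request validation cost: compile the schema once into a plain function, then reuse that function for every request instead of re-reading the schema each time. The `compile_schema` function is hypothetical, supports only `type`, `required`, `properties`, and `minimum`, and is not a compliant validator.

```python
from typing import Any, Callable

Check = Callable[[Any], bool]


def _is_number(value: Any) -> bool:
    # In Python, bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _type_check(name: str) -> Check:
    if name == "integer":  # simplification: 1.0 is not accepted as an integer
        return lambda value: isinstance(value, int) and not isinstance(value, bool)
    if name == "number":
        return _is_number
    python_type = {"string": str, "object": dict, "array": list, "boolean": bool}[name]
    return lambda value: isinstance(value, python_type)


def compile_schema(schema: dict[str, Any]) -> Check:
    """Turn a schema into a reusable function; all schema parsing happens here, once."""
    checks: list[Check] = []

    if "type" in schema:
        checks.append(_type_check(schema["type"]))

    if "required" in schema:
        required = tuple(schema["required"])
        checks.append(
            lambda value: not isinstance(value, dict)
            or all(name in value for name in required)
        )

    if "properties" in schema:
        compiled = [(name, compile_schema(sub)) for name, sub in schema["properties"].items()]
        checks.append(
            lambda value: not isinstance(value, dict)
            or all(check(value[name]) for name, check in compiled if name in value)
        )

    if "minimum" in schema:
        minimum = schema["minimum"]
        checks.append(lambda value: not _is_number(value) or value >= minimum)

    return lambda value: all(check(value) for check in checks)


order_schema = {
    "type": "object",
    "required": ["sku"],
    "properties": {
        "sku": {"type": "string"},
        "quantity": {"type": "integer", "minimum": 1},
    },
}

# Compile once at startup, then call the result on every incoming request.
validate_order = compile_schema(order_schema)

print(validate_order({"sku": "A-1", "quantity": 2}))
print(validate_order({"sku": "A-1", "quantity": 0}))
print(validate_order({"quantity": 2}))
```

The design choice is to move all schema interpretation out of the request path, so each validation call only runs the checks that the schema actually needs.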

Where can people find you?

Definitely take a look at sourcemeta.com. That’s the main website where I’m posting all the offerings, and you can even see some hints there of what else I’m working on. Definitely connect with me on LinkedIn too, where I’m posting lots of updates. In general, we’re offering a lot of products, and we’re also offering consulting help. We’re getting involved with a lot of companies, getting into the weeds, and seeing the challenges firsthand, and that’s the inspiration for a lot of our tooling. So that’s another way to get engaged.

Juan Cruz Viotti
Founder at Sourcemeta

Juan is a seasoned Computer Scientist with a strong background in startups, open-source, and academia. As the co-author of Unifying Business, Data, and Code: Designing Data Products with JSON Schema (O’Reilly), he has contributed to shaping best practices in data design. His research at the University of Oxford earned him the 2022 CAR Hoare Prize for groundbreaking work on JSON BinPack, a space-efficient serialization technology that outperformed 13 other methods. Additionally, Juan manages Learn JSON Schema, a widely used reference site, and previously spearheaded the development of Starship, a C++ framework for cross-platform apps at Postman. Juan is available for consulting and always open to discussing innovative projects.