Prevent information drift by generating from code

JSON Schema is the most widely adopted way of describing JSON APIs. As stated on the JSON Schema website:

While JSON is probably the most popular format for exchanging data, JSON Schema is the vocabulary that enables JSON data consistency, validity, and interoperability at scale.

Because it’s so popular, many tools integrate with it, such as Postman for building and using APIs, along with many libraries for generating documentation, integrating with different programming languages, and providing editor support.

However, writing JSON Schema documents can be a pain. The JSON Schema spec is complex, and maintaining a separate JSON Schema document away from the code means the information drifts over time and becomes incomplete or outdated.

That’s why it’s common for the JSON Schema to be generated from the code - the same code that defines the APIs. For example, you can define your user in Go code:

// User is used as a base to provide tests for comments.
type User struct {
	// Unique sequential identifier.
	ID int `json:"id" jsonschema:"required"`
	// Name of the user
	Name string `json:"name"`
}

Then, using a library, you can create the JSON Schema document from that code, including the documentation:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$ref": "#/$defs/User",
  "$defs": {
    "User": {
      "required": ["id"],
      "properties": {
        "id": {
          "type": "integer",
          "description": "Unique sequential identifier."
        },
        "name": {
          "type": "string",
          "description": "Name of the user"
        }
      },
      "additionalProperties": false,
      "type": "object",
      "description": "User is used as a base to provide tests for comments."
    }
  }
}
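
To give a feel for what such a library does under the hood, here is a minimal sketch that builds a schema from a struct using only Go's standard library. It is not how any particular library is implemented: `SchemaFor`, `jsonType`, `hasOption`, and the `description` struct tag are all hypothetical names chosen for this illustration. Reflection cannot see Go comments, so the sketch reads descriptions from a tag instead, whereas real generators usually parse the source files. The output is also simplified (no `$defs` or `$ref`, flat structs only).

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// User mirrors the struct from the article. The `description` tag is a
// hypothetical convention for this sketch only.
type User struct {
	ID   int    `json:"id" jsonschema:"required" description:"Unique sequential identifier."`
	Name string `json:"name" description:"Name of the user"`
}

// jsonType maps a small subset of Go kinds to JSON Schema type names.
// It returns "" for kinds this sketch does not handle.
func jsonType(k reflect.Kind) string {
	switch k {
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return "integer"
	case reflect.Float32, reflect.Float64:
		return "number"
	case reflect.Bool:
		return "boolean"
	case reflect.String:
		return "string"
	}
	return ""
}

// hasOption reports whether a comma-separated tag value contains opt.
func hasOption(tag, opt string) bool {
	for _, part := range strings.Split(tag, ",") {
		if strings.TrimSpace(part) == opt {
			return true
		}
	}
	return false
}

// SchemaFor builds a minimal JSON Schema for a flat struct (or pointer to one).
func SchemaFor(v any) map[string]any {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	props := map[string]any{}
	required := []string{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" {
			continue // skip unexported fields
		}
		// The property name is the part of the json tag before any comma.
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "-" {
			continue
		}
		if name == "" {
			name = f.Name
		}
		prop := map[string]any{}
		if jt := jsonType(f.Type.Kind()); jt != "" {
			prop["type"] = jt
		}
		if d := f.Tag.Get("description"); d != "" {
			prop["description"] = d
		}
		props[name] = prop
		if hasOption(f.Tag.Get("jsonschema"), "required") {
			required = append(required, name)
		}
	}
	return map[string]any{
		"$schema":              "https://json-schema.org/draft/2020-12/schema",
		"type":                 "object",
		"properties":           props,
		"required":             required,
		"additionalProperties": false,
	}
}

func main() {
	out, err := json.MarshalIndent(SchemaFor(User{}), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The point of the sketch is that the struct is the only thing you edit: rename a field or change its type and the schema follows on the next run.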

It’s not just Go. There are libraries for most popular programming languages, including Pydantic for Python.

This has the benefit of maintaining the specification, and the documentation, within the code that defines it. There is only one place to keep it up to date, and changing the code and the documentation can be done in a single pull request, preventing information drift.
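
If the generated schema file is also committed to the repository (an assumption, since some teams only generate it at build time), a small check can catch the case where someone changes the code but forgets to regenerate the schema. The sketch below compares two JSON documents while ignoring whitespace and key order. `SchemasEqual` is a hypothetical helper name, and the inline documents stand in for the committed and freshly generated schemas.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// SchemasEqual reports whether two JSON documents are semantically equal,
// ignoring whitespace and object key order.
func SchemasEqual(a, b []byte) (bool, error) {
	var x, y any
	if err := json.Unmarshal(a, &x); err != nil {
		return false, fmt.Errorf("parsing first schema: %w", err)
	}
	if err := json.Unmarshal(b, &y); err != nil {
		return false, fmt.Errorf("parsing second schema: %w", err)
	}
	return reflect.DeepEqual(x, y), nil
}

func main() {
	// Stand-ins for the committed file and the freshly generated schema.
	committed := []byte(`{"type":"object","required":["id"]}`)
	generated := []byte(`{
	  "required": ["id"],
	  "type": "object"
	}`)

	same, err := SchemasEqual(committed, generated)
	if err != nil {
		panic(err)
	}
	fmt.Println("schemas match:", same)
}
```

Run as part of the test suite, a check like this fails the pull request whenever the code and the committed schema disagree.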

Data contracts can also be defined in code, for all the same reasons. The ecosystem isn’t there to make that easy yet, but it will come.

    Andrew Jones
    Author
    I help data leaders transform their organisation to one where data becomes information - trusted, governed, and federated across the business - and guaranteed with data contracts.