zackoverflow

Write your own Zod from scratch


Table of Contents

  1. Introduction
  2. Primitive types
  3. Complex types: ZodArray<T>
  4. Complex types: ZodObject<T>
  5. Building schemas
  6. Validating schemas
  7. Next steps
  8. Footnotes

Introduction

Zod is a Typescript library where defining a schema gives you both runtime validation and type safety.

As an exercise in type-level gymnastics, let’s make our own Zod from scratch.

Zod works by creating objects that carry around both runtime and compile-time information.

In Zod, when you construct a schema, you create an object that contains information to parse/validate the schema at runtime.

At compile-time, this object carries around a special Zod Typescript type. This can be used with Zod’s infer<...> type to synthesize it back into a Typescript type:

import { z } from "zod";

const User = z.object({
  username: z.string(),
});

// At runtime (when this code is run), the `User` schema
// stores some information to validate objects.
User.parse({ username: "Ludwig" });

// At compile-time, the `User` schema has a special Zod type which
// can be synthesized back into a regular Typescript type with `infer<...>`
type User = z.infer<typeof User>;

I think it’s easier to begin with the compile-time / Typescript side. So we’ll start by creating the special Zod Typescript types and implementing the infer<...> generic type to convert Zod types into Typescript types.

Then, we’ll handle the runtime part. We’ll recreate Zod’s API for creating schemas that can be parsed / validated at runtime, which will also be hooked up to our Typescript types so we get type-safety too!

Let’s start with the compile-time side.

(The complete code for this post can be found here)

Primitive types

Let’s first start by defining some special Zod types. We’ll start with implementing this for 3 basic Typescript types: unknown, string, number.

We need a way to express that a Zod type can be one of unknown, string, or number. In other words, we can say that our Zod types take up one of several forms. This sounds like a good use for Typescript’s union types.

A first attempt might look like this:

type ZodType = unknown | string | number

Unfortunately, every type in Typescript extends unknown (see footnote 1), so this union actually simplifies to just unknown. You can check yourself by pasting this into a Typescript file and hovering over ZodType.
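If you’d rather not hover, here is a small compile-time check of that collapse. The Equals helper is not part of this post’s code; it’s a commonly used type-equality trick that I’m assuming here purely for illustration.

```typescript
// Hypothetical helper (not from the post): resolves to `true` only when
// A and B are exactly the same type.
type Equals<A, B> =
  (<X>() => X extends A ? 1 : 2) extends (<X>() => X extends B ? 1 : 2)
    ? true
    : false;

// The naive union from above.
type NaiveZodType = unknown | string | number;

// This assignment only compiles if NaiveZodType has collapsed to `unknown`.
const collapsedToUnknown: Equals<NaiveZodType, unknown> = true;

// The collapse also means the union accepts values that are neither
// a string nor a number:
const anything: NaiveZodType = { not: "a string or number" };
```

Since the union accepts any value at all, it carries no information we could later turn back into a specific type.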

To get around this, we instead use Typescript’s discriminated unions feature:

type ZodType = { type: 'unknown' } | { type: 'string' } | { type: 'number' }

We choose the discriminant field as type and set its value to the stringified name of each type.

To increase readability, break each variant of the union into its own type:

type ZodType = ZodUnknown | ZodString | ZodNumber
type ZodUnknown = { type: 'unknown' }
type ZodString = { type: 'string' }
type ZodNumber = { type: 'number' }

With this, we can now implement a basic Infer type. Remember that the Infer type should take a ZodType and synthesize it into a regular Typescript type:

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : "invalid type";

This syntax is verbose if you’re not used to it.

This weird chaining of ternary operators (condition ? then : else) is the type-level Typescript way of expressing if { ... } else if { ... } else { ... } statements.

Besides syntax, this code is straightforward. We check if T is ZodUnknown, if so, return the unknown type. Otherwise, check if T is ZodString and return string, and so on.
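If the ternary chain still looks odd, it may help to see the same branching written as an ordinary runtime function. This is only a sketch for comparison: inferredTypeName is a hypothetical name, and nothing later in the post depends on it.

```typescript
type ZodUnknown = { type: "unknown" };
type ZodString = { type: "string" };
type ZodNumber = { type: "number" };
type ZodType = ZodUnknown | ZodString | ZodNumber;

// Hypothetical runtime counterpart of `Infer`: the same branching, but with
// ordinary if / else if / else on the discriminant instead of `extends`.
function inferredTypeName(t: ZodType): string {
  if (t.type === "unknown") {
    return "unknown";
  } else if (t.type === "string") {
    return "string";
  } else if (t.type === "number") {
    return "number";
  } else {
    // Mirrors the final "invalid type" branch of the conditional type.
    return "invalid type";
  }
}
```

Each `T extends X ? A : B` in Infer plays the role of one `if (...) { return A } else { B }` here.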

Try it out:

// unknown
type result1 = Infer<ZodUnknown>
// string
type result2 = Infer<ZodString>
// number
type result3 = Infer<ZodNumber>
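One more thing you can try: passing a union of Zod types. Because T appears bare on the left of extends, Typescript applies the conditional to each member of the union separately. The snippet below is a small sketch of that behavior, with the types repeated so it stands alone; StringOrNumber is just an illustrative name.

```typescript
type ZodUnknown = { type: "unknown" };
type ZodString = { type: "string" };
type ZodNumber = { type: "number" };
type ZodType = ZodUnknown | ZodString | ZodNumber;

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : "invalid type";

// string | number: each member of the union is inferred on its own.
type StringOrNumber = Infer<ZodString | ZodNumber>;

// Both assignments compile, which shows the result accepts both branches.
const fromString: StringOrNumber = "forty-two";
const fromNumber: StringOrNumber = 42;
```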

Complex types: ZodArray<T>

Now, on to types that reference one or more ZodTypes. We’ll call them “complex” types.

Begin by expressing arrays using the discriminated union scheme set up earlier.

type ZodType = ZodUnknown | ZodString | ZodNumber | ZodArray
type ZodArray = /* what do we put here...? */

An array in Typescript (Array<T> or T[]) takes a generic argument so that it can be an array of any type: string, number, etc. Let’s do the same:

type ZodType = ZodUnknown | ZodString | ZodNumber | ZodArray<ZodType>
type ZodArray<T extends ZodType> = { type: 'array', element: T }

Note that T in ZodArray<T> needs to extend ZodType, not just any type, so it works with our Infer function.

However, if you try this, Typescript gives a red squiggly line under ZodType with the error message: Type alias 'ZodType' circularly references itself.

One solution is to instead use interface to define ZodArray<...>:

// error fixed now!
type ZodType = ZodUnknown | ZodString | ZodNumber | ZodArray<ZodType>

interface ZodArray<T extends ZodType> {
  type: "array";
  element: T;
}

This works because in Typescript, interface member type resolution is deferred. This basically means that the members of interface types are evaluated lazily, only when Typescript needs to know about them (see footnote 2).

From now on we’ll use interface to define types to keep the style consistent, and it will also come in handy when we work on the runtime part:

interface ZodUnknown { type: "unknown" };
interface ZodString { type: "string" };
interface ZodNumber { type: "number" };

Now we update the Infer function:

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : T extends ZodArray<infer E> // <-- over here
  ? Array<Infer<E>>
  : "invalid type";

Using the T extends ZodArray<infer E> ? ... : ... allows us to essentially create a type variable, E, which holds the type of the generic argument to ZodArray<...> (the Zod type of the element type of the array). It allows us to extract the arguments of a generic type. In a sense, this is similar to object destructuring in JS/TS.
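If infer is new to you, here it is in isolation on plain Typescript arrays, with no Zod types involved. ElementOf is a hypothetical helper written just for this illustration.

```typescript
// Hypothetical helper, not part of our mini Zod: pull the element type
// out of an array type, or give back `never` if T is not an array.
type ElementOf<T> = T extends Array<infer E> ? E : never;

type A = ElementOf<string[]>; // string
type B = ElementOf<number[][]>; // number[]
type C = ElementOf<boolean>; // never

// These assignments only compile if A and B resolved as described above.
const a: A = "hello";
const b: B = [1, 2, 3];
```

The ZodArray<infer E> branch does the same thing, except that what it extracts is a ZodType rather than a finished Typescript type.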

By doing that, E holds the element type of the ZodArray. Remember that it is a ZodType, so we need to Infer<...> it and wrap it in an Array<...> to turn it back into a Typescript type: Array<Infer<E>>.

Let’s test this out:

// string[]
type shouldBeArrayOfString = Infer<ZodArray<ZodString>>;

It even works for nested ZodArrays!

// string[][]
type nested = Infer<ZodArray<ZodArray<ZodString>>>;

Complex types: ZodObject<T>

Again, we start by filling in our discriminated union scheme:

type ZodType =
  | ZodUnknown
  | ZodString
  | ZodNumber
  | ZodArray<ZodType>
  | ZodObject</* what goes here? */>;

interface ZodObject</* and here? */> {
  type: "object";
  /* and here? */
}

In Zod, you define objects like this:

const User = z.object({
  username: z.string(),
  id: z.number()
});

If we look at the argument to z.object(...), it is an object whose keys are strings and whose values are ZodTypes. We can use the Typescript type Record<string, ZodType> to represent that.

So our ZodObject should look like this:

interface ZodObject<T extends Record<string, ZodType>> {
  type: "object";
  fields: T;
}

How do we go about inferring a Typescript type from our Zod type?

We need to reconstruct the object, preserving the precise string literal types of its keys, but replacing its ZodType values with Typescript types.

In Typescript, you can do this using mapped types:

type InferZodObject<T extends ZodObject<Record<string, ZodType>>> = {
  [Key in keyof T["fields"]]: Infer<T["fields"][Key]>;
};

The [Key in keyof T["fields"]]: part tells Typescript to fill the object with every key from T['fields']. These are the string literal keys we want to preserve.

The Infer<T["fields"][Key]> part tells Typescript to get the value associated with each key in T['fields'], and then Infer<...> that into a Typescript type.
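Here is the same mapped-type pattern on its own, away from Zod, in case the syntax is unfamiliar. Settings and Listify are hypothetical names made up for this sketch.

```typescript
// Hypothetical example type, unrelated to Zod.
type Settings = { theme: string; fontSize: number };

// Keeps every key of T, but wraps each value type in an array.
type Listify<T> = { [Key in keyof T]: Array<T[Key]> };

// { theme: string[]; fontSize: number[] }
type SettingsLog = Listify<Settings>;

// Only compiles if the keys were preserved and the values transformed.
const settingsLog: SettingsLog = {
  theme: ["light", "dark"],
  fontSize: [12, 14],
};
```

InferZodObject has exactly this shape: the keys stay the same, and Infer<...> takes the place of Array<...> as the transformation applied to each value.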

Let’s fit this into the Infer<...> function:

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : T extends ZodArray<infer E>
  ? Array<Infer<E>>
  : T extends ZodObject<Record<string, ZodType>> // <-- over here
  ? InferZodObject<T>
  : "invalid type";

Let’s test this out:

type obj = {
  type: "object";
  fields: {
    hi: ZodString;
    hello: ZodNumber;
  };
};

// type t2 = {
//    hi: string;
//    hello: number;
// }
type t2 = Infer<obj>;

These are all the Zod types we’ll support for now. Next, we’ll move on to the runtime half of Zod: creating the API to declare schemas.

Before we move on, here are the Typescript types you should have:

type ZodType =
  | ZodUnknown
  | ZodString
  | ZodNumber
  | ZodArray<ZodType>
  | ZodObject<Record<string, ZodType>>;

interface ZodUnknown { type: "unknown" };
interface ZodString { type: "string" };
interface ZodNumber { type: "number" };

interface ZodArray<T extends ZodType> {
  type: "array";
  element: T;
}

interface ZodObject<T extends Record<string, ZodType>> {
  type: "object";
  fields: T;
}

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : T extends ZodArray<infer E>
  ? Array<Infer<E>>
  : T extends ZodObject<Record<string, ZodType>>
  ? InferZodObject<T>
  : "invalid type";

type InferZodObject<T extends ZodObject<Record<string, ZodType>>> = {
  [Key in keyof T["fields"]]: Infer<T["fields"][Key]>;
};

Building schemas

Now we move onto the runtime aspect of Zod, building and validating schemas.

To build a schema, we need to expose an API that lets you construct schema objects that have a specific ZodType.

To validate a schema, we need to give those constructed schema objects a .parse(val) method that checks if a value matches the schema.

Let’s tackle the first. Zod’s API is an exposed z object that contains many functions for constructing schemas. Let’s emulate that:

const string = (): ZodString => ({ type: "string" });
const number = (): ZodNumber => ({ type: "number" });
const unknown = (): ZodUnknown => ({ type: "unknown" });
const array = <T extends ZodType>(element: T): ZodArray<T> => ({
  type: "array",
  element,
});
const object = <T extends Record<string, ZodType>>(
  fields: T
): ZodObject<T> => ({
  type: "object",
  fields,
});

export const z = {
  string,
  number,
  unknown,
  array,
  object
};

And that’s it! Now you can create schemas:

const Monster = z.object({
  health: z.number(),
  name: z.string(),
});

type Monster = Infer<typeof Monster>;
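Schemas also compose. The sketch below repeats the types and builders from this post so it can run on its own, then nests an array of objects inside an object. The Party schema and its field names are hypothetical, chosen only for the example.

```typescript
// Types from the previous sections, repeated so this snippet stands alone.
type ZodType =
  | ZodUnknown
  | ZodString
  | ZodNumber
  | ZodArray<ZodType>
  | ZodObject<Record<string, ZodType>>;
interface ZodUnknown { type: "unknown" }
interface ZodString { type: "string" }
interface ZodNumber { type: "number" }
interface ZodArray<T extends ZodType> { type: "array"; element: T }
interface ZodObject<T extends Record<string, ZodType>> { type: "object"; fields: T }

type Infer<T extends ZodType> = T extends ZodUnknown
  ? unknown
  : T extends ZodString
  ? string
  : T extends ZodNumber
  ? number
  : T extends ZodArray<infer E>
  ? Array<Infer<E>>
  : T extends ZodObject<Record<string, ZodType>>
  ? InferZodObject<T>
  : "invalid type";
type InferZodObject<T extends ZodObject<Record<string, ZodType>>> = {
  [Key in keyof T["fields"]]: Infer<T["fields"][Key]>;
};

// The schema builders from this section, minus `unknown` for brevity.
const z = {
  string: (): ZodString => ({ type: "string" }),
  number: (): ZodNumber => ({ type: "number" }),
  array: <T extends ZodType>(element: T): ZodArray<T> => ({ type: "array", element }),
  object: <T extends Record<string, ZodType>>(fields: T): ZodObject<T> => ({
    type: "object",
    fields,
  }),
};

// Hypothetical schema: a party with a leader and a list of monsters.
const Party = z.object({
  leader: z.string(),
  monsters: z.array(z.object({ health: z.number(), label: z.string() })),
});

// { leader: string; monsters: { health: number; label: string }[] }
type Party = Infer<typeof Party>;

// Only compiles if the inferred type matches the shape above.
const party: Party = {
  leader: "Ludwig",
  monsters: [{ health: 100, label: "slime" }],
};
```

At this stage the schema objects are plain data: Party.fields.monsters.element.fields.health is just { type: "number" }.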

Validating schemas

A schema isn’t very useful without the ability to validate against it. Let’s add a .parse(val) function to our schema objects.

To do this, we’re going to modify each ZodType we defined earlier to include this parse function:

interface ZodUnknown {
  type: "unknown";
  parse(val: unknown): unknown;
}

interface ZodString {
  type: "string";
  parse(val: unknown): string;
}

interface ZodNumber {
  type: "number";
  parse(val: unknown): number;
}

interface ZodArray<T extends ZodType> {
  type: "array";
  element: T;
  parse(val: unknown): Array<Infer<T>>;
}

interface ZodObject<T extends Record<string, ZodType>> {
  type: "object";
  fields: T;
  parse(val: unknown): InferZodObject<ZodObject<T>>;
}

Now let’s add the implementation for the primitive types first:

const string = (): ZodString => ({
  type: "string",
  parse(val): string {
    if (typeof val !== "string") throw new Error("Not a string");
    return val;
  },
});

const number = (): ZodNumber => ({
  type: "number",
  parse(val): number {
    if (typeof val !== "number") throw new Error("Not a number");
    return val;
  },
});

const unknown = (): ZodUnknown => ({
  type: "unknown",
  parse(val): unknown {
    return val;
  },
});

Not too bad, just some simple checks.

Now the complex types:

const array = <T extends ZodType>(element: T): ZodArray<T> => ({
  type: "array",
  element,
  parse(val): Array<Infer<T>> {
    if (!Array.isArray(val)) throw new Error("Not an array");

    // Check that each element in `val` can be parsed by `this.element`
    val.forEach((v) => this.element.parse(v));

    return val;
  },
});

const object = <T extends Record<string, ZodType>>(
  fields: T
): ZodObject<T> => ({
  type: "object",
  fields,
  parse(val): InferZodObject<ZodObject<T>> {
    if (typeof val !== "object" || val == null)
      throw new Error("Not an object");

    // Have to type cast here
    const recordVal = val as Record<string, unknown>;

    // Check that each key in `this.fields` is present in `val`, and that
    // its value in `val` parses with the corresponding schema in `this.fields`
    Object.entries(this.fields).forEach(([k, v]) => v.parse(recordVal[k]));

    // Have to do some type casting here too
    return val as InferZodObject<ZodObject<T>>;
  },
});

Not too bad either, but we’re required to do some inelegant typecasting (see footnote 3).

But with that, our Zod implementation is complete! Try it out for yourself:

const ProductSchema = z.object({
  id: z.number(),
  name: z.string(),
  price: z.number(),
});

const product = {
  id: 123,
  name: "Product Name",
  price: 24.99,
};

const parsed = ProductSchema.parse(product)
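Since .parse(val) throws on invalid input, you may want a way to get the failure back as a value instead. Here is one possible sketch. The tryParse helper, the Parser interface, and the ParseResult type are all hypothetical names that aren’t part of the code in this post, and stringSchema is a trimmed copy of our string() constructor so the snippet stands alone.

```typescript
// Minimal stand-in for any schema from this post: all we rely on is `.parse`.
interface Parser<T> {
  parse(val: unknown): T;
}

// Same check as the `string()` constructor above, trimmed for brevity.
const stringSchema: Parser<string> = {
  parse(val) {
    if (typeof val !== "string") throw new Error("Not a string");
    return val;
  },
};

// Hypothetical result type: either the parsed data or an error message.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Hypothetical helper: turn a thrown Error into a returned value.
function tryParse<T>(schema: Parser<T>, val: unknown): ParseResult<T> {
  try {
    return { success: true, data: schema.parse(val) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message };
  }
}

const ok = tryParse(stringSchema, "Ludwig"); // success branch
const bad = tryParse(stringSchema, 42); // failure branch, error is "Not a string"
```

Because every schema we built has a parse method with this shape, the same helper should accept ProductSchema or any other schema from this post.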

Next steps

Congratulations, you’ve now rolled your own mini Zod implementation!

If you play around and try to add more ZodTypes, you may encounter an error that goes like this: “Type instantiation is excessively deep and possibly infinite”.

This is because our implementation does a lot of unnecessary type-level work (see footnote 4). We’ll fix this in the next blog post so the code more closely resembles what the actual Zod code looks like.

Also, next time, we’ll add some more features like:

Stay tuned for part 2! I have an RSS feed here if you use that sort of thing.

Once again, the complete code for this post can be found here

Footnotes

1

unknown is what’s known as a top type.

2

In contrast with an interface, type aliases are evaluated eagerly. This can lead to infinite recursion, so Typescript catches it and instead signals an error. More info here.

3

These unholy and sacrilegious type casts, and the use of the any keyword, might be sounding type-safety alarms in your head. Unfortunately, you’ll have to get used to an any here and there for these kinds of type magic Typescript libraries.

Interestingly, Zod uses as any 69 times in the main file where all of its types are implemented, and 325 times across the whole codebase.

4

I chose to show this way of implementing Zod because I felt like it was a more intuitive way of approaching the problem. The way to solve this “excessively deep and possibly infinite” Typescript error is a little bit more confusing, so it would have been unintuitive to show it to you guys first.