The Code Etymologist is a multi-part series that covers typical software knowledge that deserves to be documented and shared within development teams.
- Preface
- Part 1: Project setup
- Part 2: Coding guidelines
- Part 3: Development workflows
- Part 4: Product requirements
- Part 5: UI Components library
- Part 6: Data structures
- Part 7: Technical decisions
- Closing thoughts
There's a never-ending debate between compile time and runtime. Between types and values. Between static analysis and dynamic analysis. Between type checkers and unit tests. Which side is more valuable? I've switched between the two sides quite a lot. They are both valuable, but in different ways.
Runtime values and unit tests provide information about what things actually are. They don't tell us how things should or could be. Therefore, when we encounter untyped code, we might have questions about specific runtime values.
Let's consider an example of a REST API call that fetches a list of messages for a chat component. The response contains an array of objects with various fields, some of which might raise questions:
// GET: /messages
[
  {
    "id": 8356,
    "content": "good morning, everyone!",
    // ... other fields
    "status": 2, // 1️⃣
    "messageType": "text", // 2️⃣
    "annotation": null // 3️⃣
  }
]
For instance, you might ask yourself:
- what does 2 represent, and what other values could the status field have?
- are there any other types of messages besides "text"?
- when could the annotation field be different than null, and what data type is it?
Types and static analysis, on the other hand, describe the theory: what things should or could be. They describe the contract and rules instead of the execution.
But in JS/TS land, types are erased during compilation and do not exist at runtime, so they provide no actual guarantees. To add insult to injury, there are various levels of type safety in TS, depending on the strictness settings in tsconfig.json. So, any contract we might have at compile time can be completely broken during execution.
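As a minimal sketch of how a compile-time contract can break during execution, consider parsing a payload whose shape differs from its declared type. The payload is hypothetical (an id sent as a string), and the type assertion stands in for whatever typed fetch wrapper a project might use:

```typescript
type Message = { id: number; content: string };

// Hypothetical payload: the backend sent `id` as a string instead of a number.
const raw = '{"id": "8356", "content": "good morning, everyone!"}';

// The assertion satisfies the compiler but performs no check at runtime.
const message = JSON.parse(raw) as Message;

// Statically `message.id` is a number; at runtime it is a string.
const actualType: string = typeof message.id;
console.log(actualType); // "string"
```

Nothing fails here, neither at compile time nor at runtime. The mismatch only surfaces later, wherever the value is first used as a number.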
Also, TypeScript by itself doesn't necessarily help us much. Even if we use types, the response might be typed as:
type MessageListResponse = Array<Message>;

type Message = {
  id: number;
  content: string;
  // ... other fields
  status: number; // 1️⃣
  messageType: string; // 2️⃣
  annotation: Annotation | null; // 3️⃣
};
The above type definition adds some information, but not much:
- we still don't know the possible values for status and what they represent;
- we still don't know the possible values for messageType;
- we know that annotation is nullable, but we still don't know under which conditions it holds an Annotation structure.
The reality is that TypeScript includes a gradual typing system. There are many shades of type definitions. Some are wider, permissive, and encapsulate little information. Others are narrower, stricter, and provide lots of insights.
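The difference between a wide and a narrow definition can be sketched on two of the fields. The type names and the specific literal values below are illustrative assumptions, not taken from a real API:

```typescript
// Hypothetical wide and narrow variants of the same two fields.
type WideFields = { status: number; messageType: string };
type NarrowFields = { status: 0 | 1 | 2; messageType: "text" | "comment" };

// The wide type accepts values that make no sense for a chat message.
const wide: WideFields = { status: 42, messageType: "banana" };

// The narrow type rejects the same kind of value at compile time.
// @ts-expect-error: 42 is not assignable to 0 | 1 | 2
const rejected: NarrowFields = { status: 42, messageType: "text" };

// Only values from the declared sets are accepted.
const narrow: NarrowFields = { status: 2, messageType: "text" };

console.log(wide, rejected, narrow);
```

The narrow variant documents the allowed values in the type itself, which is the kind of information the wide variant leaves the reader to discover elsewhere.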
Now, if we encounter the above situation in real life, we don't have much of a choice but to analyze the code and figure out the possible values of those fields. Alternatively, we could ask our teammates who have been around longer and might recall the details.
So, let's improve the type definitions to convey more information about the data structure.
Enums
Whenever we have to deal with a limited set of values, it's very helpful to define them separately:
const MessageStatus = {
  Unsent: 0,
  Sent: 1,
  Deleted: 2,
} as const;

type MessageStatus = (typeof MessageStatus)[keyof typeof MessageStatus];

type Message = {
  // ... other fields
  status: MessageStatus; // 1️⃣
  messageType: string;
  annotation: Annotation | null;
};
The above declaration adds valuable information, regardless of whether we're familiar with the code. MessageStatus clearly documents the possible values for the status field and what they represent.
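One way such an enum might pay off in consuming code is an exhaustive switch. This is a sketch under assumptions: statusLabel and its label strings are hypothetical, and only the MessageStatus declaration comes from the example above.

```typescript
const MessageStatus = {
  Unsent: 0,
  Sent: 1,
  Deleted: 2,
} as const;

type MessageStatus = (typeof MessageStatus)[keyof typeof MessageStatus];

// Hypothetical helper that maps each status to a display label.
function statusLabel(status: MessageStatus): string {
  switch (status) {
    case MessageStatus.Unsent:
      return "unsent";
    case MessageStatus.Sent:
      return "sent";
    case MessageStatus.Deleted:
      return "deleted";
    default: {
      // If a new status is added and not handled above, this assignment stops compiling.
      const unreachable: never = status;
      throw new Error(`Unknown status: ${String(unreachable)}`);
    }
  }
}

console.log(statusLabel(2)); // "deleted"
```

With this in place, the 2 from the original response is no longer a mystery: it reads as Deleted both in the type and in the code that handles it.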
Literal Unions
An alternative to Enums is to define all possible values as a Union of literal types. This applies especially to string values, because they are self-explanatory and don't require an additional key to explain the value. For instance, we might deal with two types of messages:
type MessageType = "text" | "comment";

type Message = {
  // ... other fields
  status: MessageStatus;
  messageType: MessageType; // 2️⃣
  annotation: Annotation | null;
};
This improvement answers our second question, regarding the possible values for messageType.
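When the same set of values is also needed at runtime, for example to check an arbitrary string, one possible variation is to derive the union from a constant array. This is a sketch, and MESSAGE_TYPES and isMessageType are hypothetical names:

```typescript
// Single list of allowed values, usable both as a type and at runtime.
const MESSAGE_TYPES = ["text", "comment"] as const;

// Resolves to "text" | "comment".
type MessageType = (typeof MESSAGE_TYPES)[number];

// Type guard that narrows an arbitrary string to MessageType.
function isMessageType(value: string): value is MessageType {
  return MESSAGE_TYPES.some((allowed) => allowed === value);
}

console.log(isMessageType("text")); // true
console.log(isMessageType("image")); // false
```

The trade-off is a slightly less direct type declaration in exchange for not having to repeat the literals in a separate runtime check.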
Enums work with both number and string types, while Literal Unions work better with strings.
Discriminated Unions
Whenever we are dealing with different variations of a single object, there's a good chance that their structures are slightly different, even if they have a lot of common fields. In such scenarios, a discriminated union is usually hiding, waiting to be discovered.
Another hint for discriminated unions, even when there's no obvious discriminant, is when we have data structures with a bunch of optional or nullable fields.
We can refactor the Message type as a union of two different types, using the messageType as a discriminant:
type Message = TextMessage | CommentMessage;

type TextMessage = {
  // ... other fields
  messageType: "text";
  status: MessageStatus;
  annotation: null;
};

type CommentMessage = {
  // ... other fields
  messageType: "comment";
  status: MessageStatus;
  annotation: Annotation;
};
With the above discriminated union, we answer the third question. The annotation is not really nullable. Instead, there are two types of Message:
- a TextMessage with no annotation;
- and a CommentMessage that includes some Annotation data.
Discriminated unions provide valuable information about the entities and the relationships between them. Additionally, the TypeScript language service deeply understands them, providing great control flow analysis and type narrowing.
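The narrowing can be sketched with a small function. The shape of Annotation is not given in this article, so the single text field below is an assumption, as is the preview helper; the status field is omitted for brevity.

```typescript
// Assumed shape; the real Annotation structure is not shown in the article.
type Annotation = { text: string };

type TextMessage = { messageType: "text"; content: string; annotation: null };
type CommentMessage = { messageType: "comment"; content: string; annotation: Annotation };
type Message = TextMessage | CommentMessage;

// Hypothetical helper that renders a one-line preview of a message.
function preview(message: Message): string {
  if (message.messageType === "comment") {
    // Narrowed to CommentMessage: annotation is known to be non-null here.
    return `${message.content} (${message.annotation.text})`;
  }
  // Narrowed to TextMessage: annotation is known to be null here.
  return message.content;
}

console.log(preview({ messageType: "text", content: "good morning, everyone!", annotation: null }));
console.log(preview({ messageType: "comment", content: "see this", annotation: { text: "note" } }));
```

No null check on annotation is needed in either branch, because checking the discriminant already tells the compiler which variant it is dealing with.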
Schemas
As we've seen so far, types help a lot during development, at compile time. However, they don't provide 100% guarantees at runtime.
Schema validators fill this gap, providing runtime data parsing and validation. Below, we have the same data structure we worked with so far, but defined as a schema using Zod:
import { z } from "zod";

enum MessageStatus {
  Unsent = 0,
  Sent = 1,
  Deleted = 2,
}

const TextMessage = z.object({
  id: z.number(),
  content: z.string(),
  status: z.nativeEnum(MessageStatus),
  messageType: z.literal("text"),
  annotation: z.null(),
});

const CommentMessage = z.object({
  id: z.number(),
  content: z.string(),
  status: z.nativeEnum(MessageStatus),
  messageType: z.literal("comment"),
  annotation: Annotation, // assumed to be a Zod schema defined elsewhere
});

const Message = z.discriminatedUnion("messageType", [TextMessage, CommentMessage]);
Additionally, modern libraries are powerful enough to infer the TypeScript static type from the schema definition. This enables us to have a single source of truth for a data structure definition.
type Message = z.infer<typeof Message>;
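To make concrete what such a schema does when it parses data, the following dependency-free sketch hand-rolls a comparable check for the text variant only, with the status field left out for brevity. It is a stand-in for illustration, not Zod's implementation, and isRecord and parseTextMessage are hypothetical helper names:

```typescript
type TextMessage = {
  id: number;
  content: string;
  messageType: "text";
  annotation: null;
};

// Checks that a value is a non-null object before its fields are inspected.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns a typed message when every field matches, and throws otherwise.
function parseTextMessage(input: unknown): TextMessage {
  if (
    isRecord(input) &&
    typeof input.id === "number" &&
    typeof input.content === "string" &&
    input.messageType === "text" &&
    input.annotation === null
  ) {
    return { id: input.id, content: input.content, messageType: "text", annotation: null };
  }
  throw new Error("Invalid TextMessage payload");
}

const parsed = parseTextMessage(
  JSON.parse('{"id": 8356, "content": "good morning, everyone!", "messageType": "text", "annotation": null}')
);
console.log(parsed.id); // 8356
```

Writing this by hand for every structure quickly becomes repetitive and drifts away from the static types, which is the gap a schema library with type inference is meant to close.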
Explaining Type
Let's look at a different example of improving a type definition. In the example below, we have a Record type that stores the last read message of each chat room, in order to display which messages are unread. In practice, it maps a chatRoomId to a messageId.
type LastReadMessage = Record<number, number>;
However, the above code doesn't describe what those number types represent. A self-explanatory approach, without using code comments, is to extract the primitive number into two differently named type aliases that describe their nature:
type ChatRoomId = number;
type MessageId = number;
type LastReadMessage = Record<ChatRoomId, MessageId>;
This type definition might seem useless, but it clearly explains the structure of the Record. This refactoring is similar to Martin Fowler's Introduce Explaining Variable, but at the primitive type level.
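The aliases also make function signatures easier to read. The sketch below uses a hypothetical hasUnread helper and assumes that message ids increase over time, which the article does not state. Keep in mind that the aliases are purely documentary: TypeScript treats ChatRoomId and MessageId as interchangeable numbers and will not catch a swapped argument.

```typescript
type ChatRoomId = number;
type MessageId = number;
type LastReadMessage = Record<ChatRoomId, MessageId>;

// Hypothetical helper; assumes newer messages have larger ids.
function hasUnread(
  lastRead: LastReadMessage,
  roomId: ChatRoomId,
  latestMessageId: MessageId
): boolean {
  // A room that was never opened has no entry in the record.
  const lastReadId: MessageId | undefined = lastRead[roomId];
  return lastReadId === undefined || latestMessageId > lastReadId;
}

const lastRead: LastReadMessage = { 1: 8356 };
console.log(hasUnread(lastRead, 1, 8357)); // true
console.log(hasUnread(lastRead, 1, 8356)); // false
```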
Continue reading Part 7: Technical decisions