oven-sh/style-guide

Bun Zig style guide

Note: This style guide is a work-in-progress. Additional sections will be added, and existing sections may be modified.

Naming

Variables, parameters, and fields should be snake_case

The names of variables, parameters, and struct/enum/union fields should be snake_case.

✅️ Correct:

const max_compression_level: usize = 100;
const File = struct {
    cache_index: usize,
    decoded_contents: []const u8,
    pub fn compress(self: *File, compression_level: u32) void { ... }
};

❌️ Incorrect:

const MAX_COMPRESSION_LEVEL: usize = 100; // SCREAMING_SNAKE_CASE not allowed
const File = struct {
    cacheIndex: usize, // fields should not be camelCase
    decodedContents: []const u8,
    pub fn compress(self: *File, compressionLevel: u32) void { ... }
    // parameters should not be camelCase ^
};

Justification: Compliance with Zig’s style guide.

Exception: Variables, parameters, and fields of type type should be TitleCase instead.
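As a minimal sketch of this exception, the following uses hypothetical names (Codec, Element, byteSize, Sample) that are not part of Bun; it only illustrates that anything whose type is type is TitleCase, while ordinary values next to it stay snake_case:

```zig
const std = @import("std");

// All names below are hypothetical and exist only for illustration.
const Codec = struct {
    Element: type, // field of type `type`: TitleCase
    max_len: usize, // ordinary field: snake_case
};

// `Element` is a parameter of type `type`, so it is TitleCase too.
fn byteSize(comptime Element: type, count: usize) usize {
    return @sizeOf(Element) * count;
}

test "type-valued names are TitleCase" {
    const Sample = u32; // variable of type `type`: TitleCase
    const codec: Codec = .{ .Element = Sample, .max_len = 8 };
    try std.testing.expect(byteSize(codec.Element, codec.max_len) == 32);
}
```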

Exception: It may be necessary to use non-compliant names when writing raw bindings to foreign code, like a C library that exposes variables with such names. Any abstractions built on top of the raw bindings should use compliant names, however.

Functions should be camelCase

The names of functions (including methods) should be camelCase.

✅️ Correct:

pub fn appendSlice(comptime T: type, list: *List(T), slice: []const T) !void {
    for (slice) |*item| {
        try list.pushCloned(item);
    }
}

❌️ Incorrect:

pub fn append_slice(comptime T: type, list: *List(T), slice: []const T) !void {
    // ^ functions should not be snake_case
    for (slice) |*item| {
        try list.push_cloned(item); // methods should not be snake_case
    }
}

Justification: Compliance with Zig’s style guide.

Exception: Functions that return a type should be TitleCase instead.
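A short sketch of this exception, using a hypothetical Pair type function: Pair returns a type and is therefore TitleCase, while its method swapped returns a value and stays camelCase.

```zig
const std = @import("std");

// Hypothetical example: `Pair` returns a type, so it is TitleCase.
fn Pair(comptime T: type) type {
    return struct {
        const Self = @This();

        first: T,
        second: T,

        // `swapped` returns a value, not a type, so it stays camelCase.
        pub fn swapped(self: Self) Self {
            return .{ .first = self.second, .second = self.first };
        }
    };
}

test "type functions are TitleCase" {
    const pair: Pair(u8) = .{ .first = 1, .second = 2 };
    const result = pair.swapped();
    try std.testing.expect(result.first == 2 and result.second == 1);
}
```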

Non-namespace types should be TitleCase

Non-namespace structs, unions, enums, and opaques should be TitleCase.

If a struct has no fields and is not instantiated, it is a namespace and should be snake_case.

✅️ Correct:

const network = struct {
    pub const Connection = struct { ... };
    pub const Protocol = enum { tcp, udp };
};

❌️ Incorrect:

const Network = struct { // namespace should be snake_case
    pub const connection = struct { ... }; // struct should be TitleCase
    pub const PROTOCOL = enum { tcp, udp }; // enum should be TitleCase
};

Justification: Compliance with Zig’s style guide.

Type aliases are always TitleCase

All variables of type type should be TitleCase, even if they’re aliases of primitive types.

✅️ Correct:

const Length = usize;
const ByteString = []const u8;
const CpuTemp = f32;

❌️ Incorrect:

const length = usize;
const bytestring = []const u8;
const cpu_temp = f32;

Justification: Compliance with Zig’s style guide.

File namespaces should be snake_case.zig

Files that do not have any top-level fields and are not instantiated are namespaces, and should have names that consist of a snake_case identifier, plus a .zig extension.

Example: Given a file consisting of the following:

pub fn startsWith(str: []const u8, prefix: []const u8) bool { ... }
pub fn endsWith(str: []const u8, suffix: []const u8) bool { ... }
  • ✅️ Correct: File is named string.zig.
  • ❌️ Incorrect: File is named String.zig.

Example: Given a file consisting of the following:

pub const FileCache = struct { ... };
  • ✅️ Correct: File is named file_cache.zig.
  • ❌️ Incorrect: File is named FileCache.zig. The file contains a struct, but is still a namespace. (Consider turning the file into a file struct if you see code like this.)

Example: Given a file consisting of the following:

pub fn ArrayList(comptime T: type) type {
    return struct { ... };
}
  • ✅️ Correct: File is named array_list.zig.
  • ❌️ Incorrect: File is named ArrayList.zig. Despite containing a TitleCase type function, the file itself is still a namespace.

Example: Given a file consisting of the following:

pub fn parseJson(text: []const u8) JsonObject { ... }
  • ✅️ Correct: File is named parse_json.zig.
  • ❌️ Incorrect: File is named parseJson.zig. Filenames should never be camelCase. This file is a namespace and should be snake_case.

Justification: Compliance with Zig’s style guide.

File structs should be TitleCase.zig

Files that have top-level fields or are meant to be instantiated are structs, and should have names that consist of a TitleCase identifier, plus a .zig extension.

Example: Given a file consisting of the following:

const Self = @This();

head: ?*Node,

pub const Node = struct { ... };

pub fn init() Self { ... }
pub fn push(self: *Self, node: *Node) void { ... }
  • ✅️ Correct: File is named LinkedList.zig.
  • ❌️ Incorrect: File is named linked_list.zig.

Justification: Compliance with Zig’s style guide.

Acronyms and initialisms are treated as normal words

In camelCase and TitleCase identifiers, acronyms and initialisms are treated like normal words, and are not rendered in all caps.

  • ✅️ Correct: HttpServer, TcpConnection, HtmlCssBundle
  • ❌️ Incorrect: HTTPServer, TCPConnection, HTMLCSSBundle

Justification: Compliance with Zig’s style guide.

Exception: WebKit typically renders acronyms and initialisms in type names as all caps (e.g., JSONObject). When working with Zig types that directly correspond to WebKit types, it is acceptable to use names that match the WebKit type.

Method receiver should always be named self

The first parameter of methods should always be named self.

✅️ Correct:

pub fn deinit(self: *Self) void { ... }
pub fn length(self: *Self) usize { ... }
pub fn isEmpty(self: *const Self) bool { ... }

❌️ Incorrect:

pub fn deinit(this: *Self) void { ... }
pub fn length(list: *Self) usize { ... }
pub fn isEmpty(this_list: *const Self) bool { ... }

Justification: The Zig standard library uses self much more often than other alternatives like this.

Exception: If a given container already consistently uses this, it is acceptable to continue to use this in new methods added to that container. However, consider updating the whole type to use self.

Don’t repeat @This()

Don’t use @This() multiple times within the same type. Instead, bind @This() to a variable or use the name of the type.

✅️ Correct:

const Self = @This();

key: []const u8,
value: *Node,

const Node = struct {
    next: ?*Node,
    value: u32,

    pub fn init(value: u32) Node { ... }
    pub fn reset(self: *Node) bool { ... }
};

pub fn init() Self { ... }
pub fn deinit(self: *Self) void { ... }

❌️ Incorrect:

key: []const u8,
value: *Node,

const Node = struct {
    next: ?*Node,
    value: u32,

    pub fn init(value: u32) @This() { ... } // @This() used once
    pub fn reset(self: *@This()) bool { ... } // @This() repeated
};

pub fn init() @This() { ... } // @This() used once
pub fn deinit(self: *@This()) void { ... } // @This() repeated

Justification: @This() is visually noisy, especially when combined with type modifiers (as in []*@This()) and it looks like it might have side effects even though it doesn’t. In the presence of nested types, it’s also less clear than using the name of the type.

Always bind @This() to Self or the name of the type

When binding @This() to a variable, always choose the name Self, or the name of the enclosing type, as opposed to alternatives like This.

If the name of the enclosing type is long (e.g., PackageManagerOptions), prefer using Self to reduce verbosity, unless readability would be harmed.

Example: Given a file named ConcurrentQueue.zig:

✅️ Correct (preferred):

const Self = @This();

✅️ Correct:

const ConcurrentQueue = @This();

❌️ Incorrect:

const Queue = @This(); // file is named ConcurrentQueue.zig, not Queue.zig

❌️ Incorrect:

const This = @This();

❌️ Incorrect:

const ThisQueue = @This();

Justification: The Zig standard library uses Self much more often than alternatives like This. The name of the enclosing type is also commonly used and acceptable, as it clearly and unambiguously refers to the type. Using a different name (like Queue in ConcurrentQueue.zig) should be avoided, as it could lead to confusion about which type is being referenced, and it makes searching through the code more difficult.

Method receiver type should not be qualified

The type of the first parameter of a method should not be a qualified type; that is, it should not have any dots, but rather be a bare identifier (optionally with * and const modifiers).

✅️ Correct:

pub fn set(self: *AutoBitSet, index: usize) void { ... }

✅️ Correct:

pub fn set(self: *Self, index: usize) void { ... }

❌️ Incorrect:

pub fn set(self: *bun.collections.AutoBitSet, index: usize) void { ... }

Justification: Using a qualified type makes it unclear that the function is actually a method, especially if self is also not named correctly, and adds unnecessary visual noise.

Private fields should start with #

Bun’s fork of Zig contains support for private fields starting with #, much like JavaScript. Visibility works like private functions: private fields are accessible only within the same file.

✅️ Correct:

#head: *T,
#tail: *T,
#len: usize,

❌️ Incorrect:

head: *T, // unprefixed identifiers are reserved for public fields
_tail: *T, // don't use underscores to mark fields as private
private_len: usize,

Justification: Private fields make it easier to ensure that a type’s invariants aren’t broken, and are a useful tool in achieving encapsulation. A decision was already made to implement private fields using a # prefix, instead of alternatives like underscores, documentation, or the word private. Given that this support now exists, it should be used consistently to prevent fracturing the codebase.

Internal fields should start with _

Fields that are referenced by multiple files cannot be private, but not all such fields are truly public. Fields that are used by multiple files within a container, but should not be used by files outside that container, are called internal, and should be prefixed with a single underscore. Note that the compiler will not enforce these access restrictions.

It is recommended to explicitly document which container may access the field.

Internal fields may be an indication of an abstraction failure, especially when the code allowed to access them consists not of a single container and its descendants, but of multiple disjoint containers. In general, try to avoid internal fields.

✅️ Correct:

/// Don't access this field outside of the current container (`TcpServer`).
_socket: *Socket,

// Connection.zig is allowed to use `_socket`, since it's nested within the
// current container.
const Connection = @import("./TcpServer/Connection.zig");

❌️ Incorrect:

socket: *Socket, // looks like a public field: was that intended?

// Connection.zig uses `socket`, but it's unclear whether other parts of the
// codebase can too.
const Connection = @import("./TcpServer/Connection.zig");

Justification: Internal fields should be readily apparent, since there are restrictions on their use that must be enforced manually. An underscore prefix clearly communicates that a field is internal, and is in line with existing conventions across many languages.

Named blocks with no nested named blocks should be called blk:

Named blocks that do not contain any named blocks should be called blk:.

✅️ Correct:

const avg1 = blk: {
    const pair = self.nextPair();
    break :blk (pair.first + pair.second) / 2;
};
const avg2 = avg2: { // contains a nested named block
    const pair = self.tryNextPair() orelse blk: {
        var reader = self.open();
        defer reader.close();
        break :blk reader.readPair();
    };
    break :avg2 (pair.first + pair.second) / 2;
};

❌️ Incorrect:

const avg1 = avg1: { // no nested named blocks; should be called `blk:`
    const pair = self.nextPair();
    break :avg1 (pair.first + pair.second) / 2;
};
const avg2 = blk: { // should *not* be called `blk:` due to nested named block
    const pair = self.tryNextPair() orelse read_pair: { // should be `blk:`
        var reader = self.open();
        defer reader.close();
        break :read_pair reader.readPair();
    };
    break :blk (pair.first + pair.second) / 2;
};

Justification: Most named blocks are quite small. Giving a small block a long name adds unnecessary noise when the entire definition of the block is visible on the screen. Additionally, using a different name may make it seem like nested blocks are in use, even when they’re not (break :named_block may appear to break out of multiple levels of blocks).

blk: is the most common block name used in the Zig standard library, used much more often than alternatives like brk:.

Exception: If a block is large enough that its entire contents might not fit on one screen (more than about 50 lines), it may be given a different name to aid readability, even if it doesn’t contain any nested named blocks. However, try to avoid using blocks that are this large; split the code into a function instead.

Do not prefix variables with _ to avoid name conflicts

Do not prefix variables or parameters with an underscore in order to avoid name conflicts. Instead, give the variable a descriptive name (ideal), or add an underscore as a suffix.

✅️ Correct (preferred):

const id: usize = getId();
const id_string: []u8 = idToString(id);

✅️ Correct:

const id: usize = getId();
const id_: []u8 = idToString(id);

❌️ Incorrect:

const id: usize = getId();
const _id: []u8 = idToString(id); // prefix mistaken for private/unused var

Justification: Identifiers prefixed with _ may be mistaken for private, internal, or unused variables. The Zig standard library uses underscore suffixes more than prefixes, although it most commonly gives variables descriptive names.

Use descriptive names

Use names that clearly communicate the meaning and purpose of the variable, function, parameter, or field. Do not use names that require the reader to check how the item is used in order to infer its meaning.

Note that “descriptive” is not the same as “long”. Short names can be, and often are, descriptive: i is perfectly descriptive as a loop index, and ptr may be adequately descriptive in a memory allocation library. Changing these names to loop_index and memory_pointer would not be an improvement.

✅️ Correct:

pub fn parse(self: *Self, parser: anytype) ParseError!ast.Node { ... }

❌️ Incorrect:

pub fn do(self: *Self, p: anytype) DoMethodReturnType { ... }
// * `do` is not descriptive
// * `p` is not descriptive, especially because its type is `anytype`
// * `DoMethodReturnType` is long, but not descriptive: it says nothing about
//   what the return type actually *is*

Justification: Descriptive names are a core component of readable, maintainable code. Being unsure of what a variable means or what a function does slows down development and causes bugs.

Avoid excessively long names

Avoid very long identifiers. Identifiers should be as long as is required to communicate all necessary information, and no longer.

✅️ Correct:

  • parseJson()
  • last_insertion: ?usize
  • printDiagnostic()

❌️ Incorrect:

  • parseJavaScriptObjectNotationFileThenReturnAbstractSyntaxTree()
  • index_of_most_recent_hash_map_insertion_otherwise_null: ?usize
  • printUserFacingMessageWithColorUnlessStderrRedirected()

Justification: Very long identifiers make code noisy, hard to read, and hard to write. They typically have low information density and high levels of redundancy. If a name really must be that long to convey all the required information, it indicates a failure of abstraction.

Exception: Extern functions may need long names in order to avoid conflicts. Still, try to make the names as short as possible without losing descriptiveness or risking conflicts.

Avoid redundancy in names

Do not include information in names that is already present in the name of a parent container.

✅️ Correct:

const Server = struct {
    const Connection = struct {
        pub fn init() Connection {
            var self: Connection = .{ ... };
            // `self` has clear meaning within a struct
            registerConnection(&self);
            return self;
        }
    };
};

❌️ Incorrect:

const Server = struct {
    const ServerConnection = struct { // `Server.ServerConnection` is redundant
        // function shouldn't repeat the name of the type it's in
        pub fn initializeServerConnection() ServerConnection {
            var server_connection: ServerConnection = .{ ... };
            registerConnection(&server_connection);
            return server_connection;
        }
    };
};

The same applies to filenames:

✅️ Correct:

src/
├── Parser.zig
└── Parser
    ├── State.zig
    └── Options.zig

❌️ Incorrect:

src/
├── Parser.zig
└── Parser
    ├── ParserState.zig
    └── ParserOptions.zig

Justification: Repeating information already communicated in the name of a parent container adds verbosity without providing any benefit. See also: the Zig style guide.

Semantically different things must have semantically different names

If two items are semantically different—that is, they mean different things or represent different concepts—they must have names that clearly communicate that difference. Their names should not differ solely on the basis of trivialities like underscores or abbreviations.

✅️ Correct:

const FileContent = struct {
    bytes: []u8,
    encoding: Encoding,
};
const ReadResult = ReadError!FileContent;

❌️ Incorrect:

const ReadResult = struct {
    bytes: []u8,
    encoding: Encoding,
};
const ReaderResult = ReadError!ReadResult;
// `ReadResult` and `ReaderResult` sound like the same thing

✅️ Correct:

var total_bytes: usize = 0;
for (items) |*item| {
    const num_slots = item.allocated_slots - item.empty_slots;
    const num_bytes = num_slots * item.bytes_per_slot;
    total_bytes += num_bytes;
}

❌️ Incorrect:

var count: usize = 0;
for (items) |*item| {
    const cnt = item.allocated_slots - item.empty_slots;
    const count_ = cnt * item.bytes_per_slot;
    count += count_;
}
// `count`, `cnt`, and `count_` all sound like the same thing, but
// they represent 3 different concepts

Justification: If the names of two semantically different things communicate the same set of information, the names are necessarily not descriptive.

Formatting

zig fmt and sort-imports.ts are run automatically on all code. This section concerns formatting guidelines not enforced by those programs.

Limit lines to 100 characters

Lines should be limited to a maximum length of 100 characters. If a comma-separated list (e.g., parameters or struct fields) exceeds 100 characters, add a trailing comma to force zig fmt to format each item of the list on a separate line.

Justification: Long lines make it difficult to view and edit files side-by-side. The choice of 100 characters in particular comes from the Zig style guide.

Exception: If a long line would be very difficult to reformat without harming readability, due to, for example, the presence of long string literals that don’t contain line breaks, or very long identifiers (which should typically be avoided), it may remain over the limit. Still, try to minimize the extent to which it exceeds the limit, and consider alternative ways of structuring the code.
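To illustrate the trailing-comma technique, here is a minimal sketch with a hypothetical function, copyRangeWithChecksum. Written on one line, its signature would exceed 100 characters; the trailing comma after the last parameter makes zig fmt keep one parameter per line.

```zig
const std = @import("std");

// Hypothetical function, used only to show the formatting. The trailing comma
// after `bytes_to_copy: usize` is what keeps `zig fmt` from joining the
// parameters back onto a single line.
fn copyRangeWithChecksum(
    destination_buffer: []u8,
    source_buffer: []const u8,
    start_offset: usize,
    bytes_to_copy: usize,
) u32 {
    const source = source_buffer[start_offset..][0..bytes_to_copy];
    @memcpy(destination_buffer[0..bytes_to_copy], source);

    var checksum: u32 = 0;
    for (source) |byte| checksum +%= byte;
    return checksum;
}

test "copies the requested range" {
    var buffer: [4]u8 = undefined;
    const checksum = copyRangeWithChecksum(&buffer, "abcdef", 1, 3);
    try std.testing.expect(std.mem.eql(u8, buffer[0..3], "bcd"));
    try std.testing.expect(checksum == 'b' + 'c' + 'd');
}
```

Removing the trailing comma and running zig fmt again would collapse the parameter list back onto one line, so the comma is the only thing that needs to be maintained by hand.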

Use blank lines sparingly

Use a single blank line to separate container and function definitions. In other cases, use a blank line to separate chunks of related code from one another. Chunks should typically be more than one line; do not turn 10 lines of code into 19 by adding blank lines.

Do not add blank lines between every field of a struct, unless the fields have long doc comments where the addition of whitespace may aid readability.

Do not add blank lines between every declaration in a block of single-line declarations.

Do not add blank lines where indentation or syntactic features already make a clear distinction between blocks of code (e.g., after a closing brace or parenthesis on its own line).

✅️ Correct:

//! NormVec3D.zig: a normalized 3D vector.

const Self = @This(); // ok: blank line after file comment

x: f64, // ok: blank line separates `Self` decl from fields
y: f64,

pub fn init(x: f64, y: f64, z: f64) Self {
    const vec: Vec3D = .{
        .x = x,
        .y = y,
        .z = z,
    };
    return vec.toNormalized();
}

pub fn getZ(self: *const Self) f64 {
    return 1 - self.x - self.y;
}

pub fn addUnnormalized(self: *Self, vec: *const Vec3D) void {
    const x = vec.x;
    const y = vec.y;
    const z = vec.z;
    const len = std.math.sqrt(x * x + y * y + z * z);

    self.x += vec.x; // blank line separates length calculation & mutation
    self.y += vec.y;
    self.x /= 1 + len; // could optionally insert a blank line above this one
    self.y /= 1 + len;
}

❌️ Incorrect:

//! NormVec3D.zig: a normalized 3D vector.

const Self = @This();

x: f64,

y: f64, // unnecessary blank line between fields

pub fn init(x: f64, y: f64, z: f64) Self {
    const vec: Vec3D = .{
        .x = x,
        .y = y,
        .z = z,
    };

    return vec.toNormalized(); // unnecessary blank line after closing brace
}
pub fn getZ(self: *const Self) f64 { // missing blank line between methods
    return 1 - self.x - self.y;
}

pub fn addUnnormalized(self: *Self, vec: *const Vec3D) void {
    const x = vec.x; // excessive blank lines do not aid readability

    const y = vec.y;

    const z = vec.z;

    const len = std.math.sqrt(x * x + y * y + z * z);

    self.x += vec.x;

    self.y += vec.y;

    self.x /= 1 + len;

    self.y /= 1 + len;
}

Justification: When used sparingly, blank lines help the reader distinguish related chunks of code. When used more often, blank lines limit the amount of code that can be viewed at once while providing no benefit to readability. See also: Google C++ style guide.

Avoid Struct{} syntax; use struct literals

Do not use Struct{} syntax; always use struct literals, as in var x: Struct = .{}.

✅️ Correct:

const sorted = sortList(.{ .head = node1 });
const other_list: List = .{ .head = node2 };

❌️ Incorrect:

const sorted = sortList(List{ .head = node1 });
const other_list = List{ .head = node2 };

Justification: Struct literals are considered more idiomatic in Zig and are more common. Zig may even remove Struct{} syntax in a future version.

Note that not all uses of Struct{} can be simply replaced with .{} without making other changes. For example, when passed to a function that takes an anytype parameter, changing to .{} will alter the behavior of the program. In these cases, declare a separate variable:

std.debug.print("{f}\n", .{List{ .head = node }}); // ❌️ Incorrect

const list: List = .{ .head = node }; // ✅️ Correct
std.debug.print("{f}\n", .{list});

Exception: Struct{} syntax is currently the only way to declare and initialize an array with an inferred length, as in const x = [_]u8{ 1, 2 }. As such, it is acceptable in this case. However, if the array is only ever used as a slice, Struct{} syntax can and should be avoided:

const x = &[_]u8{ 1, 2 }; // ❌️ Incorrect
const x: []const u8 = &.{ 1, 2 }; // ✅️ Correct

Prefer decl literals to Struct.init() syntax

Prefer calling init functions with decl literals (.init()) instead of qualified Struct.init() syntax.

In particular, never write var x: Struct = Struct.init(), as this is also redundant.

This guideline also applies to other functions that perform initialization, like initEmpty or fromUnmanaged.

✅️ Correct:

const sorted = sortList(.init(node1));
const other_list: List = .init(node2);
const tree: BalancedTree = .fromSorted(&sorted);

❌️ Incorrect:

const sorted = sortList(List.init(node1));
const other_list = List.init(node2);
const tree: BalancedTree = BalancedTree.fromSorted(&sorted); // especially bad!

Justification: Decl literals are considered more idiomatic in Zig and are more common, and they match struct literals, which are the preferred way of initializing structs.

Exception: In cases where a decl literal cannot be used, like a function that takes an anytype parameter, it is acceptable to use Struct.init() syntax.
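The following sketch shows why the exception exists, assuming a Zig version with decl literal support (0.14 or later). Point, sumPoint, and sumAny are hypothetical names. A decl literal needs a result type to resolve against; a typed parameter provides one, but an anytype parameter does not.

```zig
const std = @import("std");

// Hypothetical type used only for this illustration.
const Point = struct {
    x: i32,
    y: i32,

    pub fn init(x: i32, y: i32) Point {
        return .{ .x = x, .y = y };
    }
};

// The parameter type is `Point`, so `.init(...)` can be resolved at call sites.
fn sumPoint(point: Point) i32 {
    return point.x + point.y;
}

// `anytype` gives the argument no result type, so `.init(...)` has nothing
// to resolve against.
fn sumAny(value: anytype) i32 {
    return value.x + value.y;
}

test "decl literals need a result type" {
    try std.testing.expect(sumPoint(.init(1, 2)) == 3);
    // `sumAny(.init(1, 2))` would not compile; the qualified form is required.
    try std.testing.expect(sumAny(Point.init(1, 2)) == 3);
}
```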

Prefer type coercion without @as casts

When type coercion is needed, prefer to achieve it with an approach other than an explicit @as cast.

✅️ Correct:

const bytes: [*]u8 = @ptrCast(data);
const point: Point = .{ .x = 0, .y = 1 };
const len: u32 = @intCast(list.len());
const str: []const u8 = @ptrCast(chunk);
std.debug.print("{s}\n", .{str});

❌️ Incorrect:

const bytes = @as([*]u8, @ptrCast(data));
const point = @as(Point, .{ .x = 0, .y = 1 });
const len = @as(u32, @intCast(list.len()));
std.debug.print("{s}\n", .{@as([]const u8, @ptrCast(chunk))});

Justification: @as casts are visually noisy, harming readability.

Avoid useless blocks

Avoid putting code in a block when a block is not necessary. Generally, a block is necessary if either of the following is true:

  • The code in the block breaks out of the block.
  • The block prevents name conflicts with the code that exists outside of and after the block. However, consider whether the variables could simply be renamed instead.

If a block’s sole purpose is to provide visual separation for a chunk of code, use blank lines instead.

✅️ Correct:

var x = point.x;
var y = point.y;

const len = math.sqrt(x * x + y * y);
x /= len;
y /= len;

return .init(x, y);

❌️ Incorrect:

var x = point.x;
var y = point.y;
{
    const len = math.sqrt(x * x + y * y);
    x /= len;
    y /= len;
}
return .init(x, y);

Justification: Blocks add nesting to code, which should generally be reduced when possible. Useless blocks do not justify the additional nesting.
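For contrast, here is a sketch of a block that is necessary under the first criterion above, because the code inside it breaks out of the block with a value. The function safeLength is hypothetical.

```zig
const std = @import("std");

// Hypothetical helper: returns the length of the vector (x, y), or 1 for the
// zero vector so that callers can divide by the result.
fn safeLength(x: f64, y: f64) f64 {
    // This block is justified: it produces its value through `break`, and
    // `squared` stays scoped to the computation that needs it.
    const len: f64 = blk: {
        const squared = x * x + y * y;
        if (squared == 0) break :blk 1;
        break :blk @sqrt(squared);
    };
    return len;
}

test "block yields a value" {
    try std.testing.expect(safeLength(3, 4) == 5);
    try std.testing.expect(safeLength(0, 0) == 1);
}
```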

Keep related items together

Keep related declarations and functions together. The more closely related two items are, the closer they should be in the code.

✅️ Correct:

pub fn init() Self { ... } // init and deinit are related
pub fn deinit(self: *Self) void { ... }
pub fn len(self: *const Self) usize { ... } // len and isEmpty are related
pub fn isEmpty(self: *const Self) bool { ... }
pub fn push(self: *Self, item: T) void { ... } // push and pop are related
pub fn pop(self: *Self) T { ... }

❌️ Incorrect:

pub fn push(self: *Self, item: T) void { ... }
pub fn len(self: *const Self) usize { ... }
pub fn deinit(self: *Self) void { ... }
pub fn pop(self: *Self) T { ... }
pub fn init() Self { ... }
pub fn isEmpty(self: *const Self) bool { ... }

Justification: Keeping related items nearby makes it easier to understand the code’s structure and behavior, and reduces the need for searching through the code.

Init functions should be at the top of the container

Functions that initialize a container (most notably init, but also functions like initWithOptions or fromUtf8) should be placed at the top of the container, before any other functions (but not before fields).

✅️ Correct:

pub fn init() Self { ... }
pub fn fromSlice(slice: []T) Self { ... }
pub fn len(self: *const Self) usize { ... }
pub fn isEmpty(self: *const Self) bool { ... }

❌️ Incorrect:

pub fn len(self: *const Self) usize { ... }
pub fn isEmpty(self: *const Self) bool { ... }
pub fn fromSlice(slice: []T) Self { ... }
pub fn init() Self { ... }

Justification: Placing initialization functions near the top of a type’s definition is a common convention in many languages, including Zig.

deinit should immediately follow init functions

If a type has a deinit method, it should be defined immediately after all initialization functions (like init or fromSlice).

✅️ Correct:

pub fn init() Self { ... }
pub fn fromSlice(slice: []T) Self { ... }
pub fn deinit(self: *Self) void { ... }
pub fn len(self: *const Self) usize { ... }
pub fn isEmpty(self: *const Self) bool { ... }

❌️ Incorrect:

pub fn init() Self { ... }
pub fn fromSlice(slice: []T) Self { ... }
pub fn len(self: *const Self) usize { ... }
pub fn isEmpty(self: *const Self) bool { ... }
pub fn deinit(self: *Self) void { ... }

Justification: In Zig, it’s very important to know whether a type has a deinit method, since it will not be automatically called. Placing it near the top of the container, and next to the related initialization functions, reduces the chance that its existence will be missed. Placing it after init rather than before is preferred due to existing Zig convention that generally expects init first.

Do not label blocks or loops unnecessarily

Do not label blocks or loops unless the label is actually needed. In particular:

  • Don’t write labeled blocks that consist of a single break statement; simply use the value provided to the break statement directly.
  • Don’t label loops if every use of a loop label could be replaced by a non-labeled break or continue statement. Loops that are not nested should not typically need labels.

✅️ Correct:

const files = try readDirectory(dir_path);
const match = for (files) |*file| {
    if (matchGlob(file.name, pattern)) {
        break file;
    }
} else null;
return if (match) |file|
    concatPath(dir_path, file.name)
else
    null;

❌️ Incorrect:

const files = blk: { // block is unnecessary
    break :blk try readDirectory(dir_path);
};
const match = loop: for (files) |*file| { // loop label is unnecessary
    if (matchGlob(file.name, pattern)) {
        break :loop file; // could be an unlabeled `break`
    }
} else null;
return if (match) |file| blk: { // block is unnecessary
    break :blk concatPath(dir_path, file.name);
} else null;

Justification: Unnecessary labels add verbosity without any benefit, and also cause the reader to search in vain for uses of the label.
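By contrast, a label is warranted when an inner loop has to exit an outer one. The sketch below uses a hypothetical function, firstCommon, in which an unlabeled break could not express the same control flow:

```zig
const std = @import("std");

// Hypothetical helper: returns the first value in `needles` that also
// appears in `haystack`, or null if there is none.
fn firstCommon(haystack: []const u32, needles: []const u32) ?u32 {
    return outer: for (needles) |needle| {
        for (haystack) |candidate| {
            // An unlabeled `break` would only leave the inner loop, so the
            // label on the outer loop is actually needed here.
            if (candidate == needle) break :outer needle;
        }
    } else null;
}

test "label is needed to exit the outer loop" {
    try std.testing.expect(firstCommon(&.{ 1, 2, 3 }, &.{ 9, 3, 2 }).? == 3);
    try std.testing.expect(firstCommon(&.{ 1, 2, 3 }, &.{ 7, 8 }) == null);
}
```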

Do not use orelse unreachable

orelse unreachable is equivalent to .?; simply use that instead.

✅️ Correct:

const value = hash_map.get(key).?;

❌️ Incorrect:

const value = hash_map.get(key) orelse unreachable;

Justification: orelse unreachable adds verbosity without any benefit, and is less idiomatic.

Avoid unnecessary uses of var

Avoid uses of var that could be made const. The Zig compiler catches some of these cases, but not all; for example, if some_var.someConstMethod() is performed, Zig will allow some_var to be a var, even if someConstMethod takes self by const pointer.

✅️ Correct:

const mutex = try allocator.create(std.Thread.Mutex);
mutex.* = .{};
mutex.lock();
defer mutex.unlock();

❌️ Incorrect:

var mutex = try allocator.create(std.Thread.Mutex);
// `mutex` is a pointer: it has type `*std.Thread.Mutex`. Although we need
// mutable access to the `Mutex` for `lock` and `unlock`, the pointer itself
// doesn't need to be mutable, and should be made `const`.
mutex.* = .{};
mutex.lock();
defer mutex.unlock();

Justification: const variables help catch bugs by preventing unintended mutation. Declaring variables that don’t need to be mutated as var adds surface area for bugs, and may be confusing to readers.

Prefer positive conditions in if-else statements

In an if-else statement, prefer writing the condition as a positive expression instead of a negative one, switching the if and else branches if necessary. In particular:

  • Prefer == to !=.
  • Avoid using a boolean-not expression (!x) as the whole condition.

✅️ Correct:

if (text.len == 0) {
    file.createEmpty();
} else {
    file.create(text);
}

if (task == null) {
    thread_pool.stop();
} else {
    thread_pool.poll();
}

if (is_valid and is_connected) {
    startTransaction();
} else {
    logError();
}

❌️ Incorrect:

if (text.len != 0) {
    file.create(text);
} else {
    file.createEmpty();
}

if (task != null) {
    thread_pool.poll();
} else {
    thread_pool.stop();
}

if (!(is_valid and is_connected)) {
    logError();
} else {
    startTransaction();
}

Exception: Negative conditions may be necessary to reduce nesting. In general, reducing nesting should take precedence.
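As a sketch of this exception, the hypothetical function send below uses a negative condition as an early-return guard. Writing the check positively would require wrapping the rest of the function in an if-else, adding a level of nesting.

```zig
const std = @import("std");

// Hypothetical error set and function, for illustration only.
const SendError = error{ NotConnected, EmptyPayload };

fn send(is_connected: bool, payload: []const u8) SendError!usize {
    // The negative guard returns early and keeps the main path unindented.
    if (!is_connected) return error.NotConnected;
    if (payload.len == 0) return error.EmptyPayload;

    // Main path: pretend the whole payload was sent.
    return payload.len;
}

test "guards return early" {
    try std.testing.expect(try send(true, "abc") == 3);
    try std.testing.expectError(error.NotConnected, send(false, "abc"));
    try std.testing.expectError(error.EmptyPayload, send(true, ""));
}
```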

File organization & imports

sort-imports.ts is run automatically on all code. This section concerns guidelines not enforced by that program.

A file’s children should live in a sibling directory of the same name

When a file grows large, it often makes sense to extract some of the declarations in that file into separate files. For example, a large struct definition in a namespace could be turned into a separate file struct and imported by the namespace.

If a file’s child declarations are split into separate files, those files should exist in a directory of the same name as the parent file, without the .zig extension. The parent file and that directory should be siblings: they should share the same parent directory.

The parent file should then import the child files.

✅️ Correct:

src/
├── Thread.zig
├── Thread/
│   └── Mutex.zig
├── math.zig
└── math/
    └── complex.zig

❌️ Incorrect:

src/
├── Thread/
│   ├── Thread.zig  # parent shouldn't be nested in its own subdirectory
│   └── Mutex.zig
├── math.zig
└── complex.zig  # only imported by math.zig, so it's a child and should
                 # live in a subdirectory called `math`

❌️ Incorrect:

src/
├── Thread.zig
├── thread/  # should be TitleCase (`Thread`) to match Thread.zig
│   └── Mutex.zig
├── math.zig
└── mathlib/  # should be called `math` (or math.zig should be mathlib.zig)
    └── complex.zig

Justification: This convention is used by the Zig standard library; see, for example, Thread.zig and the sibling Thread directory. It also matches how module systems work in other languages, like Rust. Sticking to a single convention makes it easier to navigate and understand the codebase.

Access items from bun instead of importing them directly

Where possible, access items using a chain of declaration accesses, starting with the bun module, instead of importing files directly. The majority of files should only need to be imported once, by their parent.

✅️ Correct:

//! src/allocators/example.zig
const MimallocArena = bun.allocators.MimallocArena;
const ArrayList = bun.collections.ArrayList;
const Owned = bun.ptr.Owned;

❌️ Incorrect:

//! src/allocators/example.zig
const MimallocArena = @import("./MimallocArena.zig");
const ArrayList = @import("../collections.zig").ArrayList;
const Owned = @import("../ptr/owned.zig").Owned;

Even though MimallocArena.zig is in the same directory as the example, still prefer to import it from bun if possible. Also note that the "../ptr/owned.zig" import should be avoided for other reasons as well.

Justification: Accessing items from bun makes it clearer which item is being imported, since imports of the same item will look the same in all cases, instead of being relative to the current directory. Additionally, file imports require special care to ensure that private implementation details aren’t being used outside their intended scope (hence not importing from subdirectories), while declaration accesses starting with bun can rely on the compiler to enforce visibility.

The Zig standard library tends to prefer this style as well: for example, Mutex.zig accesses Thread via std.Thread rather than ../Thread.zig.

Don’t import from subdirectories, except immediate children

Don’t import files from subdirectories of the current directory, except for the current file’s immediate children, which should live in a sibling directory of the same name as the file.

Instead of importing files in subdirectories, the parents of those files should re-export them (using pub const imports), enabling them to be accessed through the parent without being imported directly.

✅️ Correct:

//! current file is src/xml.zig
const ast = @import("./xml/ast.zig"); // ok: direct child
const Entity = ast.Entity;
const Mutex = bun.threading.Mutex;

❌️ Incorrect:

//! current file is src/xml.zig
const ast = @import("./xml/ast.zig");
const Entity = @import("./xml/ast/Entity.zig"); // bad: not a direct child
const Mutex = @import("./threading/Mutex.zig"); // bad: subdirectory

Justification: Items in subdirectories may be private implementation details not intended to be used publicly, and as such, using them may introduce bugs or unintended behavior. If the items are intended to be public, they should be re-exported in the parent container using a pub const import.
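
The re-export chain described above can be sketched in a single file by letting structs stand in for files (a Zig file is itself a struct). The names entity_file, ast_file, and xml_file are hypothetical stand-ins; in real code each would be an @import of the path noted in the comments.

```zig
const std = @import("std");

// Stand-in for src/xml/ast/Entity.zig.
const entity_file = struct {
    name: []const u8,
};

// Stand-in for src/xml/ast.zig: re-exports its direct child.
const ast_file = struct {
    // In real code: pub const Entity = @import("./ast/Entity.zig");
    pub const Entity = entity_file;
};

// Stand-in for src/xml.zig: imports only its direct child, `ast`.
const xml_file = struct {
    // In real code: pub const ast = @import("./xml/ast.zig");
    pub const ast = ast_file;
};

// Other code reaches `Entity` through the chain of re-exports,
// never by importing Entity.zig directly.
const Entity = xml_file.ast.Entity;

fn entityName() []const u8 {
    const entity: Entity = .{ .name = "amp" };
    return entity.name;
}
```

If ast_file did not mark Entity as pub, the access through xml_file.ast.Entity would no longer be part of its public interface, which is the visibility guarantee this rule relies on.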

Imports containing .. must not descend into a subdirectory

Imports that contain .. path components must not descend into any subdirectories; that is, after a .. component, the next component must either be a file, or another .. component. (Imports should also not descend into any directories before a .. component, as this is redundant: ./a/../b.zig is the same as ./b.zig.)

However, even in the case of permitted .. imports, prefer avoiding such imports if the items can be accessed from bun instead. An import containing .. should only be performed if the item is not accessible from bun, and should not be made accessible (due to being intentionally private).

✅️ Correct:

//! current file is src/xml/ast/Entity.zig
const ast = @import("../ast.zig"); // ok if xml.ast is private
const ArrayList = bun.collections.ArrayList;
const Owned = bun.ptr.Owned;

❌️ Incorrect:

//! current file is src/xml/ast/Entity.zig
const ast = @import("../ast.zig");
const Owned = @import("../ptr/owned.zig").Owned; // bad: `ptr` subdirectory
const ArrayList = @import("../collections.zig").ArrayList;
// ^ technically allowed, but avoid, since bun.collections is public

Justification: Imports that first ascend to a parent directory, then descend into a subdirectory are simply another kind of subdirectory import and are bad for the same reason: such items may be private implementation details not intended to be used publicly. If intended to be public, those items should be re-exported by their parents using pub const imports.

Imports should not change the name of the imported item

Items should be imported under their original name; that is, the name of the const variable to which the import is bound should be identical to the item’s original name as determined by the original declaration or filename.

This applies both to “real” imports using @import, as well as chains of declaration accesses like const Owned = bun.ptr.Owned.

✅️ Correct:

const MimallocArena = bun.allocators.MimallocArena;
const ThreadPool = bun.threading.ThreadPool;
const JSValue = bun.jsc.JSValue;

❌️ Incorrect:

const ThreadLocalArena = bun.allocators.MimallocArena;
const WorkPool = bun.threading.ThreadPool;
const Value = bun.jsc.JSValue;

Justification: Keeping names uniform makes it easier for the reader to determine which pieces of code are using the same shared types and namespaces (among other items), and makes it easier to find all uses of a certain type.

Exception: If the existing name lacks semantic information needed in the context in which the item is imported, the item may be imported under a different name that conveys that information. (In this case, the import essentially functions as a type alias.) For example, bun.ptr.TaggedPointer may be imported as ParentPointer in a context where that additional semantic information is valuable.

Exception: Importing under a different name is permissible if it is needed to avoid name conflicts. However, prefer accessing the item using a qualified path like parent.ConflictingName instead of performing a name-changing import like const OtherName = parent.ConflictingName.
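
A small sketch of the qualified-path approach, assuming two hypothetical namespaces (json and xml) that both declare an item named Parser:

```zig
const std = @import("std");

// Hypothetical namespaces that both export a `Parser`.
const json = struct {
    pub const Parser = struct {
        pub const name = "json";
    };
};
const xml = struct {
    pub const Parser = struct {
        pub const name = "xml";
    };
};

fn parserNames() [2][]const u8 {
    // Qualified paths avoid the conflict while keeping the original
    // name `Parser` visible at every use site. No renaming import such
    // as `const JsonParser = json.Parser;` is needed.
    return .{ json.Parser.name, xml.Parser.name };
}
```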

Do not split files into tightly coupled pieces

Do not split a file into multiple smaller files if those smaller files would be tightly coupled. If a file contains two relatively independent types, it may make sense to split those into separate files, especially if those types don’t need to access each other’s private methods and fields, since Zig’s file-based visibility mechanics will then prevent such accesses. Conversely, it rarely makes sense to split individual methods of a type into separate files, since those methods will no longer be able to access the type’s private methods and fields.

Either keep the file intact, or restructure the code so it is less tightly coupled.

✅️ Correct:

#state: State,

pub fn init(text: []const u8) Self { ... }
pub fn deinit(self: *Self) void { ... }
pub fn parse(self: *Self) nodes.Document { ... }

fn parseDoctype(self: *Self) nodes.Doctype { ... }
fn parseEntity(self: *Self) nodes.Entity { ... }
fn parseTag(self: *Self) nodes.Tag { ... }

// ok to define in separate file, especially if large
const State = @import("./Parser/State.zig");

❌️ Incorrect:

#state: State,

pub fn init(options: Options) Self { ... }
pub fn deinit(self: *Self) void { ... }
pub fn parse(self: *Self) nodes.Document { ... }

// BAD: methods should not live in separate files
const parseDoctype = @import("./Parser/parse_doctype.zig").parseDoctype;
const parseEntity = @import("./Parser/parse_entity.zig").parseEntity;
const parseTag = @import("./Parser/parse_tag.zig").parseTag;

const State = @import("./Parser/State.zig");

Justification: Splitting files into tightly coupled pieces makes code harder to read, as it fractures what would otherwise be a cohesive unit of code, forcing the reader to jump back and forth between many different highly interconnected files. This also does not work well with Zig’s file-based visibility system (a method in a separate file cannot call other private methods), leading to many methods and fields unnecessarily being made public.

Reducing nesting

Avoid writing deeply nested code (also called the arrow anti-pattern, or arrow code).

  • Deeply nested code increases mental burden. When reading code nested in five levels of if conditions, the reader must keep track of all the conditions and their order. That context is easily forgotten, leading to, for instance, confusion about which condition an else block applies to, and necessitating constant rereading.

  • Deeply nested code makes it harder to stay within the line length limit.

  • Deeply nested code is often an indication of other issues, like code duplication or a lack of abstraction (especially if the same set of nested conditionals is repeated).

Avoid if-else statements containing a break in control flow

Avoid writing an if-else statement[1] where one of the branches contains a break in control flow, which in this context means any expression of type noreturn, such as return, break, or unreachable.

Instead, perform one of the following steps:

  1. If the if branch contains the noreturn statement, move the code in the else branch after the if statement, and get rid of the else branch.

  2. If the else branch contains the noreturn statement, negate the if condition, thereby swapping the if and else branches, and perform step 1.

  3. If both branches contain a noreturn statement, perform either step 1 or step 2.

If optionals or error unions are involved, it may be necessary to use orelse or catch expressions.
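
As a small sketch of both forms, the hypothetical parsePort function below uses orelse to return early on a null optional and catch to return early on an error, so the rest of the body stays at a single level of indentation. It uses std.fmt.parseInt from the standard library.

```zig
const std = @import("std");

// Hypothetical helper, for illustration only.
fn parsePort(maybe_text: ?[]const u8) ?u16 {
    // Optional: unwrap, or leave the function early.
    const text = maybe_text orelse return null;
    // Error union: unwrap, or leave the function early.
    const port = std.fmt.parseInt(u16, text, 10) catch return null;
    if (port == 0) {
        return null;
    }
    return port;
}
```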

✅️ Correct:

pub fn findUser(db: *Database, name: []const u8) FindUserError!?*User {
    if (!db.isConnected()) {
        return error.NotConnected;
    }
    if (!validateName(name)) {
        return error.InvalidUsername;
    }
    const users = db.getTable(.users) orelse return error.MissingUsersTable;
    if (users.isCached(name)) {
        return users.cache;
    }
    var query = users.startQuery();
    defer query.close();
    return try query.find(name);
}

❌️ Incorrect:

pub fn findUser(db: *Database, name: []const u8) FindUserError!?*User {
    if (db.isConnected()) {
        if (validateName(name)) {
            if (db.getTable(.users)) |users| {
                if (users.isCached(name)) {
                    return users.cache;
                } else {
                    var query = users.startQuery();
                    defer query.close();
                    return try query.find(name);
                }
            } else {
                return error.MissingUsersTable;
            }
        } else {
            return error.InvalidUsername;
        }
    } else {
        return error.NotConnected;
    }
}

Justification: Returning early eliminates one level of nesting for one of the if branches, which is desirable for the reasons described above.

Exception: If both the if and else branches are small and contain little to no further nesting, the if statement may be left as-is.

Avoid if statements that both contain and are followed by a break

Avoid writing an if statement[1] that both contains a break in control flow (in this context, a noreturn expression like return or break) and is also followed, outside the if, by another break in control flow, such as a noreturn expression or the end of a function or loop body.

Instead, negate the condition and perform the second break in control flow up front. Then, what used to be the nested body of the if statement can follow the new, negated if statement without any increased indentation.

If optionals or error unions are involved, it may be necessary to use orelse or catch expressions.

✅️ Correct:

pub fn average(comptime T: type, slice: []const T) ?T {
    if (slice.len == 0) {
        return null; // good: early return
    }
    if (slice.len == 1) {
        return slice[0];
    }
    var total: T = 0;
    for (slice) |item| {
        total += item;
    }
    return total / slice.len;
}

❌️ Incorrect:

pub fn average(comptime T: type, slice: []const T) ?T {
    if (slice.len > 0) { // bad: most of the function is (doubly) nested
        if (slice.len > 1) {
            var total: T = 0;
            for (slice) |item| {
                total += item;
            }
            return total / slice.len;
        }
        return slice[0];
    }
    return null;
}

Exception: If the body of the if statement is small and contains little to no further nesting, it may be left as-is.

Justification: Inverting the conditional and returning early eliminates one level of nesting, which is desirable for the reasons described above.

Avoid nested if statements where an else if chain would suffice

Avoid nested if statements[1] that could be replaced with a chain of else if branches.

A good candidate for this transformation is one where:

  • The first statement inside each if branch is another if statement, except for the innermost if statement.
  • Many of the if statements have else branches. This is not a hard requirement, but if there are no else branches, a better option may be to use an and expression instead.

✅️ Correct:

pub fn addUser(self: *Self, name: []const u8) Error!void {
    if (!self.#options.is_async) {
        try self.addUserSync(name);
    } else if (self.#current_task != null) {
        self.pushTask(.{ .add_user = name });
    } else if (self.#options.threading_enabled) {
        self.spawnThread(.{ .add_user = name });
    } else {
        self.spawnTask(.{ .add_user = name });
    }
    try self.sendMessage(.add_user);
}

❌️ Incorrect:

pub fn addUser(self: *Self, name: []const u8) Error!void {
    if (self.#options.is_async) {
        if (self.#current_task == null) {
            if (self.#options.threading_enabled) {
                self.spawnThread(.{ .add_user = name });
            } else {
                self.spawnTask(.{ .add_user = name });
            }
        } else {
            self.pushTask(.{ .add_user = name });
        }
    } else {
        try self.addUserSync(name);
    }
    try self.sendMessage(.add_user);
}

Justification: Replacing nested if statements with an else if chain can reduce multiple levels of nesting, which is desirable for the reasons described above.

Avoid nested if statements where an and expression would suffice

Avoid nested if statements[1] that could be replaced with a single if statement whose condition is an and expression.

A good candidate for this transformation is one where:

  • The first statement inside each if branch is another if statement, except for the innermost if statement.
  • None of the if statements have else branches, or all of the if statements have identical else branches.
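
The second criterion (identical else branches) can be sketched as follows. User, AccessLevel, and accessLevel are hypothetical names used only for illustration; the nested form being replaced is shown in the comment.

```zig
const std = @import("std");

const User = struct { is_admin: bool, is_active: bool };
const AccessLevel = enum { full, restricted };

fn accessLevel(user: User) AccessLevel {
    // Nested form this replaces; both `else` branches are identical:
    //
    //   if (user.is_admin) {
    //       if (user.is_active) { level = .full; } else { level = .restricted; }
    //   } else {
    //       level = .restricted;
    //   }
    return if (user.is_admin and user.is_active) .full else .restricted;
}
```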

✅️ Correct:

fn updateIfNeeded(self: *Self) void {
    if (self.new_index != self.old_index and
        self.db.isConnected() and
        self.db.mode() != .read_only)
    {
        self.log("updating index");
        self.db.updateIndex(self.new_index);
    }
}

❌️ Incorrect:

fn updateIfNeeded(self: *Self) void {
    if (self.new_index != self.old_index) {
        if (self.db.isConnected()) {
            if (self.db.mode() != .read_only) {
                self.log("updating index");
                self.db.updateIndex(self.new_index);
            }
        }
    }
}

Justification: Replacing nested if statements with a single boolean expression can reduce multiple levels of nesting, which is desirable for the reasons described above.

else blocks should not contain a single if statement

Do not write an else block that contains only an if or if-else statement; simply turn the outer else block into an else if instead.

✅️ Correct:

if (options.blocking) {
    task.run();
} else if (pool.availableCapacity() > 0) {
    pool.schedule(task);
} else {
    queue.push(task);
}

❌️ Incorrect:

if (options.blocking) {
    task.run();
} else {
    if (pool.availableCapacity() > 0) {
        pool.schedule(task);
    } else {
        queue.push(task);
    }
}

Justification: An if statement inside an else block is exactly equivalent to an else if branch, but adds nesting, which is undesirable. Changing to an else if reduces nesting while keeping the current layout and order of the code.

Avoid large nested type definitions

Avoid large nested definitions of types. Consider moving the definition outside the parent container (redeclaring it inside the parent if necessary), or putting it in a separate file.

✅️ Correct:

pub const Task = struct {
    callback: *const fn (*anyopaque) void,
    state: *anyopaque,

    fn spawn(self: *Task, pool: *WorkPool) void { ... }
    fn runSync(self: *Task) void { ... }
    fn cancel(self: *const Task, pool: *WorkPool) void { ... }
};

pub const WorkPool = struct {
    const Self = @This();

    #tasks: ArrayList(Task),

    pub fn init() Self { ... }
    pub fn deinit(self: *Self) void { ... }
    pub fn schedule(self: *Self, task: Task) void { ... }
    pub fn run(self: *Self) void { ... }
};

If WorkPool.Task needs to exist, we can add pub const Task = Task; to WorkPool. To avoid ambiguity, we would also have to update existing uses of Task with Self.Task.

❌️ Incorrect:

pub const WorkPool = struct {
    const Self = @This();

    #tasks: ArrayList(Task),

    pub const Task = struct {
        callback: *const fn (*anyopaque) void,
        state: *anyopaque,

        fn spawn(self: *Task, pool: *WorkPool) void { ... }
        fn runSync(self: *Task) void { ... }
        fn cancel(self: *const Task, pool: *WorkPool) void { ... }
    };

    pub fn init() Self { ... }
    pub fn deinit(self: *Self) void { ... }
    pub fn schedule(self: *Self, task: Task) void { ... }
    pub fn run(self: *Self) void { ... }
};

Justification: Moving large type definitions to an outer scope or separate file reduces the level of nesting for a large amount of code, which is desirable for the reasons described above.

Avoid inline type definitions

Avoid directly using a container definition (e.g., struct {...} or enum {...}) as the type of a parameter, field, or variable. Instead, give the container a name using a const variable declaration, and use that name in place of the definition everywhere else.

✅️ Correct:

const FileInfo = struct {
    content: []const u8,
    mime_type: ?[]const u8 = null,
    permissions: PermissionSet,

    fn mimeType(self: *const FileInfo) []const u8 {
        return if (self.mime_type) |mime|
            mime
        else
            guessMimeType(self.content);
    }
};

pub fn serveFile(self: *Self, info: FileInfo, options: Options) void {
    var task: Task(FileInfo) = .initFile(info);
    defer task.deinit();
    return self.serveTask(&task, options);
}

❌️ Incorrect:

pub fn serveFile(
    self: *Self,
    info: struct {
        content: []const u8,
        mime_type: ?[]const u8 = null,
        permissions: PermissionSet,

        fn mimeType(self: *const @This()) []const u8 {
            return if (self.mime_type) |mime|
                mime
            else
                guessMimeType(self.content);
        }
    },
    options: Options,
) void {
    var task: Task(@TypeOf(info)) = .initFile(info);
    defer task.deinit();
    return self.serveTask(&task, options);
}

Justification: Inline type definitions increase nesting, result in worse error messages, and encourage code duplication or metaprogramming hacks if the type must be referenced in multiple places.

Avoid deeply nested boolean expressions

Avoid complex nested boolean expressions. Instead, try the following strategies:

  1. Extract parts of the expression into separate variables or functions.

  2. Extract the expression as a whole into a separate function, where different parts of the expression can be turned into separate if statements that return early as soon as the whole expression is known to succeed or fail.

  3. Replace the expression with a labeled block containing if statements, or an else if chain or switch statement if possible. This is essentially the same as approach 2, but avoids defining a new function. Prefer approach 2 if the replacement code would be large or substantially nested.
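
Strategy 3 can be sketched as below, assuming a hypothetical countMatches function over byte slices. The labeled block replaces what would otherwise be a single boolean expression combining a length check, an exact-match check, and a substring search.

```zig
const std = @import("std");

// Hypothetical options type, for illustration only.
const SearchOptions = struct { exact: bool = false };

fn countMatches(
    lines: []const []const u8,
    needle: []const u8,
    options: SearchOptions,
) usize {
    var count: usize = 0;
    for (lines) |line| {
        // Each `break :blk` decides the result as soon as it is known.
        const is_match = blk: {
            if (needle.len > line.len) break :blk false;
            if (options.exact) break :blk std.mem.eql(u8, line, needle);
            break :blk std.mem.indexOf(u8, line, needle) != null;
        };
        if (is_match) count += 1;
    }
    return count;
}
```

If the block grew much larger than this, moving it into its own function (strategy 2) would likely be the better choice.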

✅️ Correct:

pub fn findList(
    lists: []List(T),
    query: *const List(T),
    options: SearchOptions,
) ?*List(T) {
    for (lists) |*list| {
        if (listMatches(list, query, options)) return list;
    }
    return null;
}

fn listMatches(
    haystack: *const List(T),
    needle: *const List(T),
    options: SearchOptions,
) bool {
    if (needle.len > haystack.len) return false;
    if (options.exact and needle.len != haystack.len) return false;
    if (findSublistPtr(haystack, needle)) return true;

    var cmp: StructuralComparator = .init();
    defer cmp.deinit();
    return findSublistWithCmp(haystack, needle, cmp.asCmp());
}

❌️ Incorrect:

pub fn findList(
    lists: []List(T),
    query: *const List(T),
    options: SearchOptions,
) ?*List(T) {
    for (lists) |*list| {
        if (query.len <= list.len and
            !(options.exact and query.len != list.len) and
            (findSublistPtr(list, query) or blk: {
                var cmp: StructuralComparator = .init();
                defer cmp.deinit();
                break :blk findSublistWithCmp(list, query, cmp.asCmp());
            }))
        {
            return list;
        }
    }
    return null;
}

Justification: Deeply nested boolean expressions are undesirable for the same reasons as other deeply nested code. They also tend to be unwieldy to work with and difficult to format clearly, particularly when statements must be added to some of the subexpressions.

Avoid large blocks in functions

Avoid writing functions that contain large brace-enclosed blocks, as this causes all the code in the blocks to be indented. Instead, extract the blocks into separate functions.

Justification: Large blocks in functions increase nesting and tend to encourage a lack of abstraction and encapsulation. Moving the code to a separate function reduces nesting and encourages clearer separation between different parts of the program.

✅️ Correct:

pub fn runMain(self: *Self) MainError!void {
    const tasks = self.takeTasks();
    var thread_pool = self.makeThreadPool();
    var results = self.spawnTasks(tasks, &thread_pool) catch |err| {
        self.logError(err, .spawn_tasks);
        return error.SpawnTasksFailed;
    };
    defer results.deinit();
    for (results) |*result| {
        if (result.kind == .background) {
            self.logResult(result);
        } else {
            self.printResult(result);
        }
    }
}

fn spawnTasks(self: *Self, tasks: []Task, pool: *ThreadPool) !ResultSet {
    var handles: ArrayListDefault(Handle) = try .initCapacity(tasks.len);
    defer handles.deinit();
    for (tasks) |*task| {
        const handle = try pool.spawn(task);
        handles.appendAssumeCapacity(handle);
    }

    try pool.run();
    pool.pause();
    var results = pool.takeResults();
    defer results.deinit();

    if (comptime Self.TaskReturnType == void) {
        return .empty;
    }
    var result_set: ResultSet = .init();
    for (results) |*result| {
        try result_set.add(result);
    }
    return result_set;
}

❌️ Incorrect:

pub fn runMain(self: *Self) MainError!void {
    const tasks = self.takeTasks();
    var thread_pool = self.makeThreadPool();

    const results = spawn_tasks: { // large block should be a function
        var handles: ArrayListDefault(Handle) = try .initCapacity(tasks.len);
        defer handles.deinit();
        for (tasks) |*task| {
            const handle = thread_pool.spawn(task) catch |err| {
                break :spawn_tasks err;
            };
            handles.appendAssumeCapacity(handle);
        }

        thread_pool.run() catch |err| {
            break :spawn_tasks err;
        };
        thread_pool.pause();
        var results = thread_pool.takeResults();
        defer results.deinit();

        if (comptime Self.TaskReturnType == void) {
            break :spawn_tasks ResultSet.empty;
        }
        var result_set: ResultSet = .init();
        for (results) |*result| {
            try result_set.add(result);
        }
        break :spawn_tasks result_set;
    } catch |err| {
        self.logError(err, .spawn_tasks);
        return error.SpawnTasksFailed;
    };

    defer results.deinit();
    for (results) |*result| {
        if (result.kind == .background) {
            self.logResult(result);
        } else {
            self.printResult(result);
        }
    }
}

Reducing potential IB

Many operations in Zig can potentially invoke illegal behavior (IB), like .? on an optional. This section discusses strategies to reduce the number of operations that could invoke IB.

Avoid unwrapping an optional after checking for null

Avoid checking if an optional is null and unwrapping it with .? in the case where it isn’t null; use an if statement with a capture or an orelse expression instead.

✅️ Correct:

pub fn updateFirst(self: *Self, id: u32) void {
    const users = self.getUsers() orelse return;
    if (users.getFirst()) |first| {
        first.id = id;
    }
}

❌️ Incorrect:

pub fn updateFirst(self: *Self, id: u32) void {
    const users = self.getUsers();
    if (users == null) {
        return;
    }
    const first = users.?.getFirst(); // bad: unwrap
    if (first != null) {
        first.?.id = id; // bad: unwrap
    }
}

Justification: As code size grows, it may no longer be obvious to the reader that the uses of .? are guaranteed to succeed. This hurts readability and makes it easy for a non-guaranteed use of .? to slip in unnoticed.

Avoid tagged union field accesses after checking active field

Avoid checking whether a particular field of a tagged union is the active one, followed by a direct access of that field; use a capturing switch expression instead.

✅️ Correct:

const num_hardlinks = switch (entry) {
    .file => |*file| file.hardlink_count,
    .dir => 0, // directories don't support hardlinks
};

❌️ Incorrect:

const num_hardlinks = switch (entry) {
    .file => entry.file.hardlink_count, // BAD: union field access
    .dir => 0, // directories don't support hardlinks
};

✅️ Correct:

fn refreshCache(node: *Node) !void {
    const file = switch (node.entry) {
        .file => |*f| f,
        else => return,
    };
    if (file.size() == 0) {
        node.cache.clear();
    } else {
        const text = try file.read();
        node.cache.replace(text);
    }
}

❌️ Incorrect:

fn refreshCache(node: *Node) !void {
    if (node.entry != .file) {
        return;
    }
    if (node.entry.file.size() == 0) { // bad: field access
        node.cache.clear();
    } else {
        const text = try node.entry.file.read(); // bad: field access
        node.cache.replace(text);
    }
}

Justification: As code size grows, it may no longer be obvious to the reader that the field access is guaranteed to succeed. This hurts readability and makes it easy for a non-guaranteed field access to slip in unnoticed.

Avoid @intCast after a manual range check

Avoid checking whether an integer is within the range of a smaller integer, followed by a call to @intCast; use std.math.cast instead.

✅️ Correct:

fn toCompactSlice(comptime T: type, slice: []T) ?CompactSlice(T) {
    const len = std.math.cast(u16, slice.len) orelse return null;
    return .init(slice.ptr, len);
}

❌️ Incorrect:

fn toCompactSlice(comptime T: type, slice: []T) ?CompactSlice(T) {
    if (slice.len > std.math.maxInt(u16)) return null;
    return .init(slice.ptr, @intCast(slice.len));
}

Justification: Performing these steps separately introduces opportunities for bugs, if the range check is incorrect or the wrong integer type is used with @intCast. std.math.cast is guaranteed to perform the right check.
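
A self-contained sketch of the same idea, without the CompactSlice type: std.math.cast returns null when the value does not fit in the target type, so no separate range check is needed. The compactLen function is a hypothetical name.

```zig
const std = @import("std");

// Hypothetical helper, for illustration only.
fn compactLen(len: usize) ?u16 {
    // A hand-written check such as `if (len >= std.math.maxInt(u16))`
    // would be off by one; `std.math.cast` leaves no room for that mistake.
    return std.math.cast(u16, len);
}
```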

Minimize use of untagged unions

Minimize the use of untagged unions; prefer tagged unions when possible. When an untagged union must be used, for example, to enable a more space-efficient representation, define functions that convert the untagged union to and from a tagged counterpart, and perform all processing on the tagged version.

✅️ Correct:

Normally, programs should simply use tagged unions. This example illustrates a case where an untagged union must be used for performance.

pub const Entry = union(enum) {
    file: File,
    dir: Dir,
};

pub const Bundle = struct {
    #entry: UntaggedEntry,
    metadata: packed struct(u64) {
        #entry_tag: EntryTag,
        id: u63,
    },

    // See multi_array_list.zig for how to perform these conversions.
    pub fn getEntry(self: *const Bundle) Entry { ... }
    pub fn setEntry(self: *Bundle, entry: Entry) void { ... }
};

pub fn processBundle(ctx: *Context, bundle: *Bundle) void {
    var entry = bundle.getEntry();
    // process `entry`...
    // good: no need to worry about tag & data getting out of sync
    bundle.setEntry(entry);
}

const EntryTag = std.meta.Tag(Entry);
const UntaggedEntry = blk: {
    var info = @typeInfo(Entry);
    info.@"union".tag_type = null;
    break :blk @Type(info);
};

❌️ Incorrect:

pub const Entry = union {
    file: File,
    dir: Dir,
};

// bad: no tagged version of `Entry` available; all processing must manage tag
// separately, which could easily lead to IB

pub const Bundle = struct {
    entry: Entry,
    metadata: packed struct(u64) {
        is_file: bool,
        id: u63,
    },
};

pub fn processBundle(ctx: *Context, bundle: *Bundle) void {
    // process `bundle.entry`...
    // BAD: need to make sure `bundle.metadata.is_file` is updated; will cause
    // IB if it gets out of sync
}

Justification: Untagged unions require information about the active field to be stored somewhere else. Storing this information separately makes it easy for it to get out of sync; a tagged union keeps all the information in one place.
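
The conversion functions elided as { ... } in the correct example above could look roughly like the following. This is a simplified sketch, not the approach used in multi_array_list.zig: the payload types (u32 and u16) are hypothetical, the untagged union is written out by hand rather than derived with @typeInfo, and the tag is stored in an ordinary field instead of a packed struct.

```zig
const std = @import("std");

// Tagged version: all processing should happen on this type.
const Entry = union(enum) {
    file: u32, // hypothetical payload
    dir: u16, // hypothetical payload
};
const EntryTag = std.meta.Tag(Entry);

// Untagged counterpart, used only for storage.
const UntaggedEntry = union {
    file: u32,
    dir: u16,
};

const Bundle = struct {
    entry: UntaggedEntry,
    tag: EntryTag,

    fn getEntry(self: *const Bundle) Entry {
        // The stored tag selects which untagged field to read.
        return switch (self.tag) {
            .file => .{ .file = self.entry.file },
            .dir => .{ .dir = self.entry.dir },
        };
    }

    fn setEntry(self: *Bundle, entry: Entry) void {
        // Payload and tag are always written together, so they
        // cannot get out of sync.
        switch (entry) {
            .file => |size| self.entry = .{ .file = size },
            .dir => |count| self.entry = .{ .dir = count },
        }
        self.tag = std.meta.activeTag(entry);
    }
};
```

With these two functions as the only code that touches the untagged field, every other part of the program works with Entry and can use capturing switch expressions as usual.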

Avoid storing a pointer and length separately

Avoid separately storing a many-item pointer and a length that describes how many items exist at that address; use a slice instead. If the pointer and length must be stored separately, for example, to enable a more space-efficient representation, define functions that convert the pointer and length to and from a slice, and perform all processing on the slice.

✅️ Correct:

Normally, programs should simply use slices. This example illustrates a case where the pointer and length must be stored separately for performance.

pub const Message = struct {
    data: []const u8,
    owner: Owner,

    pub fn pack(self: *const Message) PackedMessage {
        return .{
            .#ptr = self.data.ptr,
            .#len = @intCast(self.data.len),
            .#owner = self.owner,
        };
    }
};

pub const PackedMessage = struct {
    #ptr: [*]const u8,
    #len: u32,
    #owner: Owner,

    pub fn unpack(self: PackedMessage) Message {
        return .{
            .data = self.#ptr[0..self.#len],
            .owner = self.#owner,
        };
    }
};

pub fn processNextMessage(ctx: *Context) void {
    const packed_msg: PackedMessage = ctx.recv_queue.pop() orelse return;
    const msg: Message = packed_msg.unpack();
    // process `msg`...
    // good: out-of-bounds accesses will panic in debug mode
    // good: `msg.data` slice is easy to work with
}

pub fn sendMessage(ctx: *Context, msg: Message) void {
    // good: function accepts a normal `Message`
    ctx.send_queue.push(msg.pack());
}

❌️ Incorrect:

pub const Message = struct {
    data: [*]const u8,
    len: u32,
    owner: Owner,
};

// bad: no version of `Message` that contains a normal slice

pub fn processNextMessage(ctx: *Context) void {
    const msg: Message = ctx.recv_queue.pop() orelse return;
    // process `msg`...
    // bad: out-of-bounds accesses like msg.data[10] are *unchecked* IB
    // bad: separate pointer and length are harder to work with
}

pub fn sendMessage(ctx: *Context, msg: Message) void {
    // bad: callers have to convert slices to a separate pointer and length
    // before calling this function
    ctx.send_queue.push(msg);
}

Justification: Storing the pointer and length separately makes it easy for them to get out of sync. Additionally, out-of-bounds accesses will become safety-unchecked behavior, instead of safety-checked as with slices.

Avoid repeating the same potential-IB operation

If a potential-IB-causing operation like .? must be used, don’t repeat the same operation on the same data. For example, perform .? once and store the unwrapped result in a variable instead of repeatedly unwrapping.

✅️ Correct:

if (self.side != .server) return;
const name = self.name.?; // server always has a name
log("handling server: {s}", .{name});
const tmp = self.getTempDir(name);
self.reserved_names.add(name);

❌️ Incorrect:

if (self.side != .server) return;
// server always has a name
log("handling server: {s}", .{self.name.?});
const tmp = self.getTempDir(self.name.?); // bad: repeated .? on same value
self.reserved_names.add(self.name.?);

Justification: As code size grows, it may no longer be obvious to the reader that every subsequent use of .? is guaranteed to succeed. This hurts readability and makes it easy for a non-guaranteed use of .? to slip in unnoticed.

Avoid splitting a switch into disjoint pieces

Avoid writing a switch expression that handles a subset of cases, breaking control flow if those cases match, followed by a second switch expression that handles the remaining cases and uses unreachable for the cases that were already handled. Instead, write a single exhaustive switch expression.

If there is a substantial amount of code between the two switch expressions, it may be extracted into a separate function to avoid duplication.

✅️ Correct:

pub const User = union(enum) {
    pending: PendingUser,
    cached: CachedUser,
    remote: RemoteUser,
    root: void,
};

fn getNameFromId(self: *Self, id: u32) ![]u8 {
    var connection = self.openDb();
    defer connection.close();
    const user = connection.queryUser(id) orelse return error.UserNotFound;
    return user.getName();
}

pub fn getName(self: *Self, user: *User) QueryError![]u8 {
    return switch (user.*) {
        .pending => |*pending| pending.getName(),
        .cached => |*cached| cached.getName(),
        .remote => |*remote| try self.getNameFromId(remote.id),
        .root => try self.getNameFromId(0),
    };
}

❌️ Incorrect:

pub const User = union(enum) {
    pending: PendingUser,
    cached: CachedUser,
    remote: RemoteUser,
    root: void,
};

pub fn getName(self: *Self, user: *User) QueryError![]u8 {
    switch (user.*) {
        .pending => |*pending| return pending.getName(),
        .cached => |*cached| return cached.getName(),
        else => {},
    }

    var connection = self.openDb();
    defer connection.close();
    const id = switch (user.*) {
        .remote => |*remote| remote.id,
        .root => 0,
        else => unreachable, // bad!
    };

    // named `queried` because a local may not shadow the `user` parameter
    const queried = connection.queryUser(id) orelse return error.UserNotFound;
    return queried.getName();
}

Justification: A single exhaustive switch avoids the need for unreachable. Splitting the switch into disjoint parts makes it easy to forget one of the cases, which could result in an unreachable being executed.

Try to avoid @fieldParentPtr

Try to avoid using @fieldParentPtr. Instead, pass a pointer to the parent itself, potentially in a wrapper type.

✅️ Correct:

const NamedBlock = struct {
    label: Label,
    body: []Statement,

    pub fn rename(self: *NamedBlock, id: Identifier) void {
        for (self.body) |*stmt| {
            stmt.replaceId(self.label.id, id);
        }
        self.label.id = id;
    }
};

❌️ Incorrect:

const NamedBlock = struct {
    label: Label,
    body: []Statement,
};

const Label = struct {
    id: Identifier,

    pub fn renameBlock(self: *Label, id: Identifier) void {
        // BAD: unchecked IB if `self` is not in a NamedBlock!
        const block: *NamedBlock = @fieldParentPtr("label", self);
        for (block.body) |*stmt| {
            stmt.replaceId(self.id, id);
        }
        self.id = id;
    }
};

Justification: @fieldParentPtr places requirements on the pointer that are not guaranteed by the type system. If accidentally passed a pointer that does not point to the specified parent field, it will immediately invoke safety-unchecked illegal behavior.

@fieldParentPtr also does not reduce the amount of information that must be stored or passed. It requires that 1) the child is in the parent’s immediately addressable memory (e.g., not in a slice), and 2) that the exact location of the child within the parent is known, which means that if the parent contains an array of children, you can’t get a parent pointer without knowing the child’s index. Because of this, most uses of @fieldParentPtr can be replaced by simply passing the parent pointer itself.

Exception: Some implementations of dynamic polymorphism require @fieldParentPtr, like the std.Io.Writer interface. These uses are acceptable, but be careful to ensure that the pointer really does point to the specified parent field, as the compiler cannot check this.

Exception: In rare scenarios, @fieldParentPtr may be necessary to enable certain space-efficient representations. In this case, try to minimize its scope: in particular, keep the types to which it is applied private.
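
As an illustration of the dynamic-polymorphism exception, here is a minimal sketch of an interface recovered through @fieldParentPtr. The names Shape, Square, area_fn, and areaImpl are hypothetical and not taken from Bun or the standard library, and the sketch assumes the two-argument form of @fieldParentPtr (Zig 0.12 or later).

```zig
const std = @import("std");

/// Hypothetical interface: a single function pointer stored in a field.
const Shape = struct {
    area_fn: *const fn (shape: *const Shape) f64,

    pub fn area(shape: *const Shape) f64 {
        return shape.area_fn(shape);
    }
};

/// Hypothetical implementation that embeds the interface as a field.
const Square = struct {
    side: f64,
    shape: Shape = .{ .area_fn = areaImpl },

    fn areaImpl(shape: *const Shape) f64 {
        // Sound only because `areaImpl` is installed exclusively on the
        // `shape` field of a `Square`; the compiler cannot verify this.
        const self: *const Square = @alignCast(@fieldParentPtr("shape", shape));
        return self.side * self.side;
    }
};
```

Keeping areaImpl private limits the places where the pointer requirement could be violated, which is in the spirit of minimizing the scope of @fieldParentPtr.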

Avoid self-references

Avoid writing a type that contains a pointer to a part of itself, like a struct where one field is a pointer to another field. Instead, consider the following alternatives:

  • Use numeric indices instead of pointing to an array element.
  • Use an enum instead of a pointer that could point to one of several fields of the same type.
  • Create a pointer-containing data structure on demand instead of storing it in the type.
  • If the type already heap-allocates some of its data, move additional data into the heap allocation to avoid self-references.
  • Perform a larger restructuring of the type. Keeping clear ownership patterns in mind will likely lead to a reduced need for self-references.

✅️ Correct:

pub fn CircularBuffer(comptime T: type, comptime size: usize) type {
    return struct {
        const Self = @This();

        #storage: [size]T,
        #start: usize,
        #end: usize,

        pub fn init() Self {
            return .{
                .#storage = undefined,
                .#start = 0,
                .#end = 0,
            };
        }

        pub fn push(self: *Self, elem: T) void {
            var end = self.#end;
            if (end == std.math.maxInt(usize)) return;

            self.#storage[end] = elem;
            end += 1;
            if (end == size) {
                end = 0;
            }
            self.#end = if (end == self.#start)
                std.math.maxInt(usize) // buffer full
            else
                end;
        }
    };
}

❌️ Incorrect:

pub fn CircularBuffer(comptime T: type, comptime size: usize) type {
    return struct { // bad: cannot move this type
        const Self = @This();

        #storage: [size]T,
        #start: [*]T,
        #end: ?[*]T,

        // have to initialize in place or pointers will be invalidated
        pub fn init(uninit_self: *Self) void {
            uninit_self.#start = &uninit_self.#storage;
            uninit_self.#end = &uninit_self.#storage;
        }

        pub fn push(self: *Self, elem: T) void {
            var end = self.#end orelse return;
            end[0] = elem; // many-item pointers are indexed, not dereferenced
            end += 1;
            if (end == @as([*]T, &self.#storage) + size) {
                end = &self.#storage; // wrap around to the first element
            }
            self.#end = if (end == self.#start)
                null // buffer full
            else
                end;
        }
    };
}

Justification: If a type that contains self-references is moved, the new value will contain pointers into the old location. If the old value is invalidated (e.g., its memory is freed or overwritten), the pointers in the new value will become dangling, and dereferencing them could cause immediate unchecked illegal behavior.
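
The second alternative listed above (an enum instead of a pointer to one of several fields) can be sketched as follows. Editor, Buffer, and activeBuffer are hypothetical names used only for illustration, and the sketch uses ordinary fields so that it compiles with stock Zig.

```zig
const std = @import("std");

const Buffer = struct {
    len: usize = 0,
};

const Editor = struct {
    primary: Buffer = .{},
    scratch: Buffer = .{},
    // The enum names the active field. A `*Buffer` stored here would be a
    // self-reference that dangles once the struct is copied or moved.
    active: Active = .primary,

    const Active = enum { primary, scratch };

    /// Builds the pointer on demand from the current address of `self`.
    pub fn activeBuffer(self: *Editor) *Buffer {
        return switch (self.active) {
            .primary => &self.primary,
            .scratch => &self.scratch,
        };
    }
};
```

Because the pointer is derived from self at each call, a copy of an Editor refers to its own buffers rather than to those of the value it was copied from.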

Errors

Avoid inferred error sets in public functions

Avoid writing public functions that return an error union with an inferred error set; explicitly define the error set instead.

✅️ Correct:

pub fn push(list: *List, elem: T) Allocator.Error!void { ... }

pub const SaveError = File.OpenError || error{
    ReadOnly,
    DeletionPending,
    EncodingFailed,
};

pub fn saveToDisk(user: *const User, path: []const u8) SaveError!void { ... }

fn encodeUser(user: *const User, file: *File) !void { ... }
// ^ ok: private function may have inferred error set

❌️ Incorrect:

pub fn push(list: *List, elem: T) !void { ... }
// ^ bad: inferred error set in public function

pub fn saveToDisk(user: *const User, path: []const u8) !void { ... }
// ^ bad: inferred error set in public function

fn encodeUser(user: *const User, file: *File) !void { ... }

Justification: A function’s return type is part of its public API, and thus should be clearly documented so callers are aware of it. For functions that could return errors, the best way to document the potential errors is to use an explicit error set. Inferred error sets make it easy for a function’s error set to change unexpectedly (e.g., if a new try expression is added), which can cause undesired behavior in callers. Additionally, functions with inferred error sets are necessarily generic, which can cause issues with recursion and function pointers, and can result in unexpected target-specific behavior.

Exception: If it is necessary to have different error sets depending on the values of generic parameters or the current target, an inferred error set may be acceptable. However, consider whether simplifying and unifying the return type, or defining a named error set whose definition depends on the generic parameters or target, would be more appropriate.
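
One way to define a named error set that depends on a generic parameter is a function that returns the error set type. The following is a sketch under assumptions: MemorySource, FlakySource, ReadByteError, and readByte are hypothetical names, and each source type is assumed to expose an Error declaration.

```zig
const std = @import("std");

/// Hypothetical source that cannot fail.
const MemorySource = struct {
    pub const Error = error{};

    bytes: []const u8,
    pos: usize = 0,

    pub fn next(self: *MemorySource) Error!?u8 {
        if (self.pos == self.bytes.len) return null;
        defer self.pos += 1;
        return self.bytes[self.pos];
    }
};

/// Hypothetical source that always fails.
const FlakySource = struct {
    pub const Error = error{ConnectionReset};

    // The unused parameter is kept so both sources share one signature.
    pub fn next(_: *FlakySource) Error!?u8 {
        return error.ConnectionReset;
    }
};

/// Named error set computed from the generic parameter.
pub fn ReadByteError(comptime Source: type) type {
    return Source.Error || error{EndOfStream};
}

pub fn readByte(comptime Source: type, source: *Source) ReadByteError(Source)!u8 {
    const byte = (try source.next()) orelse return error.EndOfStream;
    return byte;
}
```

The return type of readByte is explicit for every Source, so adding a new try inside the function cannot silently widen its error set.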

Give names to large or repeatedly used error sets

Give names to large or repeatedly used error sets instead of directly using an error{...} expression in a function’s return type.

✅️ Correct:

pub const SaveError = File.OpenError || error{
    ReadOnly,
    DeletionPending,
    EncodingFailed,
};

pub fn saveToDisk(user: *const User, path: []const u8) SaveError!void { ... }

pub fn saveAllToDisk(
    users: []const User,
    dir: []const u8,
) SaveError!void { ... }

❌️ Incorrect:

pub fn saveToDisk(
    user: *const User,
    path: []const u8,
) (File.OpenError || error{
    ReadOnly,
    DeletionPending,
    EncodingFailed,
})!void { ... }

pub fn saveAllToDisk(
    users: []const User,
    dir: []const u8,
) (File.OpenError || error{
    ReadOnly,
    DeletionPending,
    EncodingFailed,
})!void { ... }

Justification: Large or frequently used anonymous error sets add verbosity, which can hurt readability. Additionally, giving a name to an error set may convey useful semantic information about the context in which the errors originated.

Avoid anyerror

Avoid using anyerror. Use explicit error sets instead.

✅️ Correct:

pub const RunError = error{
    CommandNotFound,
    PermissionDenied,
    UnsupportedPlatform,
};

pub fn run(self: *Command) RunError!i32 { ... }

❌️ Incorrect:

pub fn run(self: *Command) anyerror!i32 { ... }

Justification: anyerror prevents the caller from knowing which errors could actually occur, which makes it very hard to ensure that all errors are handled correctly.

catch expressions must capture the error

All uses of catch must capture the error.

✅️ Correct:

buffer.reserve(file.size) catch |err| switch (err) {
    error.OutOfMemory => return self.processStreaming(file),
};

❌️ Incorrect:

buffer.reserve(file.size) catch return self.processStreaming(file);
// bad: error not captured ^

Justification: It is rarely correct to handle all errors in the same way.[2] Blanket catch expressions that don’t capture the error encourage such behavior. Even if a blanket catch does not presently cause issues, it could easily become incorrect if a function’s error set changes.

catch expressions should not ignore the error

catch expressions should inspect the captured error in some way instead of applying blanket behavior to all potential errors. It is highly recommended to accomplish this by performing an exhaustive switch on the error, as this ensures all cases are appropriately handled.

✅️ Correct:

command.run() catch |err| switch (err) { // good: exhaustive switch
    error.CommandNotFound, error.PermissionDenied => {
        return try self.runBackupCommand();
    },
    error.UnsupportedPlatform => {
        self.logPlatformError();
        return null;
    },
};

❌️ Incorrect:

command.run() catch |err| {
    _ = err;
    return try self.runBackupCommand();
};

Justification: Proper error handling typically requires inspecting the error in some way in order to guide behavior; treating all errors the same way is usually not correct. Performing an exhaustive switch on the error is the best way to ensure all potential errors are handled correctly. Depending on the situation, other approaches such as logging the error code or storing the error somewhere to be retrieved later may be acceptable, and still satisfy the requirement of not ignoring the error.

If a function can only produce one specific error, inspecting the error may seem unnecessary. Even so, use an exhaustive switch with a single case: if the function’s error set later grows, the switch stops compiling, which forces the caller to update its error handling instead of silently applying the old behavior to the new errors.
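
As a sketch of the "storing the error somewhere to be retrieved later" approach mentioned above, the following records the captured error in a field. Job, runOnce, last_error, and the two sample commands are hypothetical names chosen for illustration.

```zig
const std = @import("std");

const RunError = error{ CommandNotFound, PermissionDenied };

const Job = struct {
    attempts: u32 = 0,
    last_error: ?RunError = null,

    pub fn runOnce(self: *Job, command: *const fn () RunError!void) void {
        self.attempts += 1;
        command() catch |err| {
            // The error is recorded rather than discarded, so a caller can
            // inspect it later (for example, with an exhaustive switch).
            self.last_error = err;
            return;
        };
        self.last_error = null;
    }
};

fn missingCommand() RunError!void {
    return error.CommandNotFound;
}

fn okCommand() RunError!void {}
```

The type of last_error is the explicit RunError set, so code that later switches on it is still checked for exhaustiveness.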

Do not use handleOom in a function that could return error.OutOfMemory

If error.OutOfMemory is in a function’s error set, do not call bun.handleOom in that function. Instead, propagate the error using try.

✅️ Correct:

fn clone(node: *const Node, allocator: std.mem.Allocator) !Node {
    var children: ArrayList(*Node) = try .initCapacity(
        allocator,
        node.children.len,
    );
    for (node.children) |*child| {
        children.appendAssumeCapacity(try clone(child, allocator));
    }
    return .{
        .name = try allocator.dupe(u8, node.name),
        .children = children,
    };
}

❌️ Incorrect:

fn clone(node: *const Node, allocator: std.mem.Allocator) !Node {
    var children: ArrayList(*Node) = try .initCapacity( // could return OOM
        allocator,
        node.children.len,
    );
    for (node.children) |*child| {
        children.appendAssumeCapacity(try clone(child, allocator));
    }
    return .{
        .name = bun.handleOom(allocator.dupe(u8, node.name)),
        // bad: handleOom also used ^
        .children = children,
    };
}

Justification: Mixing different mechanisms for handling the same error in one function is confusing, and it is worse than either consistent approach (always using handleOom, or always propagating error.OutOfMemory): callers still have to handle out-of-memory errors, yet the function could also panic.

Avoid excessive uses of handleOom

Avoid frequent uses of bun.handleOom within the same function. Instead, define a helper function that returns error.OutOfMemory instead of calling handleOom (i.e., replace all the calls to handleOom with try). Then, make the original function simply call the helper function, wrapped in a single call to handleOom.

✅️ Correct:

pub fn flatten(root: *const Node, out_list: anytype) void {
    bun.handleOom(tryFlatten(root, out_list)); // good: one call to handleOom
}

fn tryFlatten(root: *const Node, out_list: anytype) !void {
    var stack: ArrayListDefault(*const Node) = .init();
    defer stack.deinit();
    try stack.append(root);
    while (true) {
        const node = stack.pop() orelse break;
        try out_list.append(node);
        if (node.left) |left| {
            try stack.append(left);
        }
        if (node.right) |right| {
            try stack.append(right);
        }
    }
}

❌️ Incorrect:

pub fn flatten(root: *const Node, out_list: anytype) void {
    var stack: ArrayListDefault(*const Node) = .init();
    defer stack.deinit();
    bun.handleOom(stack.append(root)); // handleOom #1
    while (true) {
        const node = stack.pop() orelse break;
        bun.handleOom(out_list.append(node)); // handleOom #2
        if (node.left) |left| {
            bun.handleOom(stack.append(left)); // handleOom #3
        }
        if (node.right) |right| {
            bun.handleOom(stack.append(right)); // handleOom #4
        }
    }
}

Justification: Excessive calls to handleOom harm readability. The addition of a non-panicking version of the function may prove generally useful as well.

Functions

See also: Errors

Do not use bool parameters

Do not write a function that takes a parameter of type bool. Instead, define an enum with two fields, choosing names that convey the meaning of the boolean.

✅️ Correct:

const Encoding = enum { utf8, utf16 };

fn upload(server: *Server, bytes: []const u8, encoding: Encoding) void { ... }

❌️ Incorrect:

fn upload(server: *Server, bytes: []const u8, is_utf16: bool) void { ... }

✅️ Correct:

const Level = enum { debug, normal };

fn log(message: []const u8, level: Level) void { ... }

❌️ Incorrect:

fn log(message: []const u8, is_debug: bool) void { ... }

Justification: Calls to functions that have boolean parameters are often hard to understand, as it is typically not clear what true and false mean, requiring the reader to consult the function’s definition. An appropriately named enum makes call sites much more readable.

Exception: If a bool simply represents a plain boolean or bit with no specific meaning (for example, in a BitSet type), it may remain a boolean.
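
The readability difference shows up at the call site. The sketch below uses a hypothetical helper, codeUnitCount, that takes the Encoding enum from the example above.

```zig
const std = @import("std");

const Encoding = enum { utf8, utf16 };

/// Hypothetical helper: number of code units stored in `bytes`.
fn codeUnitCount(bytes: []const u8, encoding: Encoding) usize {
    return switch (encoding) {
        .utf8 => bytes.len,
        .utf16 => bytes.len / 2,
    };
}

fn demo() usize {
    // With a `bool` parameter this call would read
    // `codeUnitCount("h\x00i\x00", true)`, which says nothing about
    // what `true` means.
    return codeUnitCount("h\x00i\x00", .utf16);
}
```

The enum also leaves room for a third encoding later without changing the meaning of existing call sites.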

Do not use inline fn unless semantic inlining is required

Do not mark a function as inline unless semantic inlining (which is distinct from inlining as an optimization) is required. Most functions do not require semantic inlining.

✅️ Correct:

fn reverse(comptime T: type, slice: []T) void {
    for (0..slice.len / 2) |i| {
        const tmp = slice[i];
        slice[i] = slice[slice.len - 1 - i];
        slice[slice.len - 1 - i] = tmp;
    }
}

❌️ Incorrect:

inline fn reverse(comptime T: type, slice: []T) void {
    for (0..slice.len / 2) |i| {
        const tmp = slice[i];
        slice[i] = slice[slice.len - 1 - i];
        slice[slice.len - 1 - i] = tmp;
    }
}

Justification: inline functions can have surprising behavior when semantic inlining is not expected or required, and can drastically increase compile times and code size if used indiscriminately. The Zig language reference recommends against using inline fn when semantic inlining is not required.

Exception: If real profiling measurements indicate that a particular use of inline fn enables a necessary improvement in performance, it may be used in that instance.
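
For contrast, here is a sketch of a case where semantic inlining is required, modeled on the example in the Zig language reference. The names kib and scratchSize are hypothetical.

```zig
const std = @import("std");

// `inline` is semantically required here: because the argument at the call
// site is comptime-known, the result is comptime-known too.
inline fn kib(n: usize) usize {
    return n * 1024;
}

fn scratchSize() usize {
    const size = kib(4);
    // This condition is resolved at compile time only because `kib` is
    // `inline`. Without `inline`, `size` would be a runtime value, the branch
    // would be analyzed, and `@compileError` would always fire.
    if (size > 64 * 1024) {
        @compileError("scratch buffer is too large");
    }
    return size;
}
```

The reverse function in the example above has no such requirement, which is why marking it inline is incorrect.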

Functions should not take a large number of parameters

Avoid writing functions that take an excessive number of parameters. When a function takes many parameters, it is likely that many of the parameters are closely related. Therefore, group the most closely related parameters into one or more structs, and have the function accept those structs instead.

As a general guideline, functions should not have more than 5 parameters.

✅️ Correct:

pub const User = struct {
    name: []const u8,
    address: []const u8,
    email: ?[]const u8,
    phone: ?PhoneNumber,
};

pub fn addUser(
    server: *Server,
    user: User,
    timeout: Timeout,
) AddError!void { ... }

❌️ Incorrect:

pub fn addUser(
    server: *Server,
    name: []const u8,
    phone: ?PhoneNumber,
    timeout: Timeout,
    address: []const u8,
    email: ?[]const u8,
) AddError!void { ... }

Justification: Calls to functions that take a large number of parameters are often hard to understand, as the meaning of the parameters is typically unclear and easily forgotten, requiring the reader to repeatedly consult the function’s definition.

Avoid writing huge functions

Avoid writing functions that take up a very large number of lines of code. Instead, split the function into multiple smaller functions.

As a general guideline, if a function has more than 80 lines, you should consider splitting it up. Functions that have more than 150 lines are in serious need of a refactor, as there are very few functions that legitimately need to be that long.

Justification: Huge functions are difficult to understand and maintain, often requiring the reader to repeatedly jump between different parts of the function that cannot fit on one screen. Additionally, most of the code typically has access to the entire set of local variables even though each piece only needs a small subset, which can easily lead to bugs. Huge functions are also commonly associated with code duplication and nesting.

Avoid unused parameters

Avoid writing functions that take unused parameters. If a function is modified in such a way that a parameter becomes unused, the parameter should be removed and the call sites should be updated.

Justification: It is not obvious which parameters are unused when a function is called, leading to confusion, and perhaps even inefficiency, at call sites.

✅️ Correct:

fn deleteUser(db: *Database, name: []const u8) !void { ... }

❌️ Incorrect:

fn deleteUser(db: *Database, name: []const u8, priority: u32) !void {
    _ = priority;
    ...
}

❌️ Incorrect:

fn deleteUser(db: *Database, name: []const u8, _: u32) !void { ... }

Exception: If a function must have a specific signature (e.g., to maintain ABI compatibility in an extern function, or if it will be used with an API that expects a function pointer of a particular type), it may have unused parameters.
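
A minimal sketch of the function-pointer exception follows. EventHandler, countEvent, and dispatch are hypothetical names; the handler ignores its second parameter only because the pointer type fixes the signature.

```zig
const std = @import("std");

/// Hypothetical callback type that every handler must match.
const EventHandler = *const fn (count: *u32, event_id: u32) void;

// `event_id` is unused, but it cannot be removed without breaking
// compatibility with `EventHandler`.
fn countEvent(count: *u32, _: u32) void {
    count.* += 1;
}

fn dispatch(handler: EventHandler, count: *u32, event_id: u32) void {
    handler(count, event_id);
}
```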

Avoid out-parameters

Avoid writing functions that take out-parameters. An out-parameter is a pointer parameter where:

  • The pointer initially points to potentially uninitialized memory.
  • The function writes an initialized value to the pointer.

Out-parameters serve the same function as return values; therefore, simply use a return value instead. If multiple return values are needed, use a struct (preferred) or tuple.

✅️ Correct:

const File = struct {
    bytes: []u8,
    metadata: Metadata,
};

fn readFile(path: []const u8) !File {
    const fd = try open(path);
    defer close(fd);
    return .{
        .metadata = getMetadata(fd),
        .bytes = try readAll(fd),
    };
}

fn loadUser(name: []const u8) !User {
    const path = try userJsonPath(name);
    defer path.deinit();
    const file = try readFile(path.bytes()); // good: no out-parameter needed
    // ...
}

❌️ Incorrect:

fn readFile(path: []const u8, metadata: *Metadata) ![]u8 {
    const fd = try open(path);
    defer close(fd);
    metadata.* = getMetadata(fd);
    return try readAll(fd);
}

fn loadUser(name: []const u8) !User {
    const path = try userJsonPath(name);
    defer path.deinit();
    var metadata: Metadata = undefined; // bad: need to use undefined variable
    const json = try readFile(path.bytes(), &metadata);
    // ...
}

Justification: Initialization of out-parameters is not enforced by the compiler, which could easily lead to bugs if initialization is skipped in certain code paths.

Exception: If real profiling measurements show that using an out-parameter enables a necessary improvement in performance, it may be used in that instance.

Exception: If a type contains self-references, it may be necessary to use an out-parameter. However, self-references should typically be avoided.

Explicitly mark comptime conditions

If a function contains any compile-time expressions that are used to control branching (e.g., if (some_comptime_expr)), the expression should be explicitly prefixed with comptime.

✅️ Correct:

if (comptime platform == .unix) {
    unixOnlyFunction();
} else {
    slowCrossPlatformFunction();
}

❌️ Incorrect:

if (platform == .unix) {
    unixOnlyFunction();
} else {
    slowCrossPlatformFunction();
}

Justification: Adding comptime ensures that the expression can be evaluated at compile-time, and clearly communicates to the reader that conditional compilation is being used. Without an explicit comptime, if the compiler ever becomes unable to evaluate the expression at compile-time, a potentially very confusing error message could result.

Exception: If a condition happens to be only incidentally comptime (that is, conditional compilation is not required), it does not need to be prefixed. Specifically, “incidentally comptime” means that the condition could be replaced with a runtime boolean variable without causing compilation errors.

Exception: @inComptime() cannot be prefixed with comptime, as that would cause it to incorrectly unconditionally return true. The compiler will error in this case.
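
The following sketch shows a condition that requires conditional compilation, as opposed to one that is only incidentally comptime. Handle, Point, and release are hypothetical names.

```zig
const std = @import("std");

const Handle = struct {
    open: bool = true,

    pub fn deinit(self: *Handle) void {
        self.open = false;
    }
};

const Point = struct {
    x: i32 = 0,
    y: i32 = 0,
};

fn release(comptime T: type, value: *T) void {
    // Conditional compilation is required: for a `T` without `deinit`, the
    // body of this branch would not compile if it were analyzed.
    if (comptime @hasDecl(T, "deinit")) {
        value.deinit();
    }
}
```

Replacing the condition with a runtime bool would break release(Point, ...), so by the definition above the condition is not incidentally comptime and should carry the explicit prefix.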

Declaration visibility

Avoid pub imports

Avoid importing items as pub unless the item really needs to be re-exported.

The most common case for re-exports is when a parent container re-exports some of its children in order to make those children part of the parent’s public API. For example, in the standard library, Thread.zig re-exports Thread/Mutex.zig by importing it as pub. Other kinds of re-exports not involving a parent and its children are possible, but should generally be avoided (for example, array_list.zig should probably not re-export Mutex.zig).

✅️ Correct:

const ArrayListDefault = bun.collections.ArrayListDefault;
const Owned = bun.ptr.Owned;
const Mutex = bun.threading.Mutex;

❌️ Incorrect:

pub const ArrayListDefault = bun.collections.ArrayListDefault;
pub const Owned = bun.ptr.Owned;
pub const Mutex = bun.threading.Mutex;
pub const LinearFifo = bun.LinearFifo; // unused!

Justification: Unused pub imports are not detected by tools like sort-imports.ts. This can easily lead to a situation where files have large numbers of unused imports, which causes confusion by making it difficult to determine which functionality the file actually depends on. Unnecessary re-exports also encourage the practice of importing the same item from many different files, which makes it hard to find the parts of the codebase that use a given item, and contributes to poor abstraction and encapsulation.

Avoid making functions unnecessarily pub

Avoid declaring functions as pub unless they are part of the public API of the container. That is, it should be allowed and expected for any code that has access to the container to call those functions.

If a pub function is not part of the container’s public API, use the following strategies:

If the function is used only in the same file:

Simply remove pub.

If the function must be accessed by child files:

First, consider whether the files from which the function is called are actually tightly coupled pieces that should be merged into one file. In that case, pub can simply be removed.

Otherwise, consider moving the function to a separate file namespace in which it is declared pub. The namespace can be imported by each file that needs to call the function, but is not re-exported by its parent, preventing external use.

For methods, another approach is to make a private type with pub methods, and a public type that wraps the private one and only re-exports the truly public methods. Internal code can use the private type, but external code only has access to the public one.

As a last resort, the function can remain pub but be documented as internal. In this case, it should be prefixed with _.
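
The wrapper approach for methods can be sketched as follows. CounterImpl, Counter, and resetUnchecked are hypothetical names. The sketch uses an ordinary field so that it compiles with stock Zig; in Bun code the wrapped field would presumably be a private # field so that external code cannot reach through it.

```zig
const std = @import("std");

/// Private type: internal code in this file may call every `pub` method.
const CounterImpl = struct {
    count: u32 = 0,

    pub fn increment(self: *CounterImpl) void {
        self.count += 1;
    }

    pub fn get(self: *const CounterImpl) u32 {
        return self.count;
    }

    /// Internal only: skips whatever bookkeeping a real reset would need.
    pub fn resetUnchecked(self: *CounterImpl) void {
        self.count = 0;
    }
};

/// Public type: forwards only the methods that are truly public.
pub const Counter = struct {
    impl: CounterImpl = .{},

    pub fn increment(self: *Counter) void {
        self.impl.increment();
    }

    pub fn get(self: *const Counter) u32 {
        return self.impl.get();
    }
};
```

External code that only sees Counter has no resetUnchecked method to call, while code in the defining file can still use CounterImpl directly.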

✅️ Correct:

#elements: []?Element,

pub fn init(...) Self { ... }

pub fn deinit(self: *Self) void {
    self.deinitRange(0, self.#elements.len);
    bun.default_allocator.free(self.#elements);
    self.* = undefined;
}

pub fn removeRange(self: *Self, start: usize, end: usize) void {
    self.deinitRange(start, end);
    var trailing = self.#elements[end..];
    if (trailing.len > end - start) {
        trailing = trailing[0 .. end - start];
    }
    @memcpy(self.#elements[start..][0..trailing.len], trailing);
    self.clearRange(start, start + trailing.len);
}

fn deinitRange(self: *Self, start: usize, end: usize) void {
    for (self.#elements[start..end]) |*elem| {
        if (elem.*) |*some_elem| {
            some_elem.deinit();
        }
    }
}

fn clearRange(self: *Self, start: usize, end: usize) void {
    @memset(self.#elements[start..end], null);
}

❌️ Incorrect:

#elements: []?Element,

pub fn init(...) Self { ... }

pub fn deinit(self: *Self) void {
    self.deinitRange(0, self.#elements.len);
    bun.default_allocator.free(self.#elements);
    self.* = undefined;
}

pub fn removeRange(self: *Self, start: usize, end: usize) void {
    self.deinitRange(start, end);
    var trailing = self.#elements[end..];
    if (trailing.len > end - start) {
        trailing = trailing[0 .. end - start];
    }
    @memcpy(self.#elements[start..][0..trailing.len], trailing);
    self.clearRange(start, start + trailing.len);
}

// BAD: Method should not be public. It could easily lead to a UAF or
// double-free, e.g., if a user calls `deinit` after this one.
pub fn deinitRange(self: *Self, start: usize, end: usize) void {
    for (self.#elements[start..end]) |*elem| {
        if (elem.*) |*some_elem| {
            some_elem.deinit();
        }
    }
}

// BAD: Method should not be public. It can easily lead to memory leaks
// since it doesn't deinit the elements.
pub fn clearRange(self: *Self, start: usize, end: usize) void {
    @memset(self.#elements[start..end], null);
}

Justification: Unnecessary pub functions make it harder to understand how a container is intended to be used, and harder to check that the container is being used correctly. This is especially problematic if some of those functions could put a type into an invalid state.

Methods in private types should be pub if part of the type’s public API

Even if a type is private, its methods (and functions) should be marked pub if they are part of the type’s public API.

✅️ Correct:

/// Only used in this file.
const FileContents = struct {
    #bytes: []const u8,
    #is_owned: bool,

    pub fn init(...) FileContents { ... }

    pub fn deinit(self: *FileContents) void {
        defer self.* = undefined;
        if (self.#is_owned) bun.default_allocator.free(self.#bytes);
    }

    pub fn takeBytes(self: *FileContents) AllocError![]u8 {
        if (!self.#is_owned) try self.cloneBytes();
        defer self.* = .{ .#bytes = "", .#is_owned = false };
        return @constCast(self.#bytes);
    }

    fn cloneBytes(self: *FileContents) AllocError!void {
        self.#bytes = try bun.default_allocator.dupe(u8, self.#bytes);
        self.#is_owned = true;
    }
};

❌️ Incorrect:

/// Only used in this file.
const FileContents = struct {
    #bytes: []const u8,
    #is_owned: bool,

    fn init(...) FileContents { ... } // should be public

    fn deinit(self: *FileContents) void { // should be public
        defer self.* = undefined;
        if (self.#is_owned) bun.default_allocator.free(self.#bytes);
    }

    fn takeBytes(self: *FileContents) AllocError![]u8 { // should be public
        if (!self.#is_owned) try self.cloneBytes();
        defer self.* = .{ .#bytes = "", .#is_owned = false };
        return @constCast(self.#bytes);
    }

    fn cloneBytes(self: *FileContents) AllocError!void {
        self.#bytes = try bun.default_allocator.dupe(self.#bytes);
        self.#is_owned = true;
    }
};

❌️ Incorrect:

/// Only used in this file.
const FileContents = struct {
    #bytes: []const u8,
    #is_owned: bool,

    pub fn init(...) FileContents { ... }

    pub fn deinit(self: *FileContents) void {
        defer self.* = undefined;
        if (self.#is_owned) bun.default_allocator.free(self.#bytes);
    }

    pub fn takeBytes(self: *FileContents) AllocError![]u8 {
        if (!self.#is_owned) try self.cloneBytes();
        defer self.* = .{ .#bytes = "", .#is_owned = false };
        return @constCast(self.#bytes);
    }

    // This method should *not* be public. It's not part of the type's public
    // API, since it could easily cause a memory leak if used directly (if the
    // bytes are already owned).
    pub fn cloneBytes(self: *FileContents) AllocError!void {
        self.#bytes = try bun.default_allocator.dupe(u8, self.#bytes);
        self.#is_owned = true;
    }
};

Justification: Although not required by the compiler, explicitly marking the public API as pub makes it easier to understand how the type is intended to be used and to ensure the type is used correctly. It also allows the type to easily be made public if necessary.

Invariants

Background

Types are a grouping of:

  • Data (fields).
  • Invariants, which are conditions that are always true about the data.
  • Functions that operate on the data (methods). Crucially, these functions are allowed to assume that the invariants are true.

It is highly recommended to document invariants explicitly.

Invariants always hold for all instances of a given type

A type’s invariants are always assumed to hold for all values [3] (variables, fields, and parameters) of that type.

Example:

/// INVARIANT: start <= end
pub const Range = struct {
    #start: usize,
    #end: usize,

    // We can't allow direct mutable access to #start and #end, because then
    // a user of the type could invalidate its invariants.
    pub fn start(self: Range) usize { return self.#start; }
    pub fn end(self: Range) usize { return self.#end; }

    pub fn len(self: Range) usize {
        // guaranteed not to underflow due to invariants
        return self.#end - self.#start;
    }

    pub fn shiftLeft(self: *Range, n: usize) void {
        const amount = if (n <= self.#start) n else self.#start;
        self.#start -= amount;
        self.#end -= amount; // guaranteed not to underflow due to invariants

        // Important: we must be careful to make sure this function doesn't
        // invalidate the invariants. Subtracting the same value from both
        // #start and #end is guaranteed to maintain them.
    }
};

pub fn getSubstring(str: []const u8, range: Range) ?[]const u8 {
    return if (range.end() > str.len)
        null
    else // guaranteed not to invoke IB due to invariants (start <= end)
        str[range.start()..range.end()];
}

Justification: Invariants are useful precisely because they allow code to safely make assumptions about a type. If some instances of a type are exempt from upholding the invariants, code can no longer make those assumptions, rendering the invariants essentially useless.

Methods should avoid making assumptions not guaranteed by invariants

Methods should avoid making assumptions about a type that are not guaranteed by the invariants.

✅️ Correct:

const SliceDeque = struct {
    // INVARIANTS:
    // * #front <= #back
    // * #front <= #slice.len
    // * #back <= #slice.len

    #slice: []const T,
    #front: usize,
    #back: usize,

    pub fn init(slice: []const T) SliceDeque {
        return .{
            .#slice = slice,
            .#front = 0,
            .#back = slice.len,
        };
    }

    pub fn popFront(self: *SliceDeque) ?T {
        if (self.#front == self.#back) return null;
        defer self.#front += 1;
        return self.#slice[self.#front];
    }

    pub fn popBack(self: *SliceDeque) ?T {
        if (self.#front == self.#back) return null;
        self.#back -= 1;
        return self.#slice[self.#back];
    }
};

❌️ Incorrect:

const SliceDeque = struct {
    // Fields are public, so we can't guarantee invariants.
    slice: []const T,
    front: usize,
    back: usize,

    pub fn init(slice: []const T) SliceDeque {
        return .{
            .slice = slice,
            .front = 0,
            .back = slice.len,
        };
    }

    pub fn popFront(self: *SliceDeque) ?T {
        if (self.front == self.back) return null;
        defer self.front += 1;
        // Bad: If a user modified `back` to be less than `front`, this could
        // read past the end of the slice. We should either update this method
        // to check for that condition, or add `front <= back` to this type's
        // invariants (and make the fields private to enforce it).
        return self.slice[self.front];
    }

    pub fn popBack(self: *SliceDeque) ?T {
        if (self.front == self.back) return null;
        // Bad: If a user modified `back` to be less than `front`, this
        // subtraction could underflow. We should either update this method to
        // check for that condition, or add `front <= back` to this type's
        // invariants (and make the fields private to enforce it).
        self.back -= 1;
        return self.slice[self.back];
    }
};

Justification: It is often very difficult to ensure that assumptions not guaranteed by invariants are upheld. In many cases, it is not obvious that a method even makes such assumptions in the first place. Accordingly, such assumptions may be a frequent source of bugs.

Exception: There are many legitimate exceptions to this, where a method must assume properties that cannot be globally guaranteed as invariants; appendAssumeCapacity is an example. However, all such assumptions must be explicitly documented.
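
As a rough sketch of what a documented assumption can look like, the example below pairs a checked method with an `AssumeCapacity`-style variant. `FixedStack`, `push`, and `pushAssumeCapacity` are hypothetical names invented for illustration, and the fields are left public only because upstream Zig has no private fields; in Bun's fork they would be `#buffer` and `#len`.

```zig
const std = @import("std");

/// Hypothetical fixed-capacity stack, used only for illustration.
/// INVARIANT: len <= buffer.len
const FixedStack = struct {
    // In Bun's fork these would be private (`#buffer`, `#len`).
    buffer: []u8,
    len: usize,

    pub fn init(buffer: []u8) FixedStack {
        return .{ .buffer = buffer, .len = 0 };
    }

    /// Checked variant: relies on nothing beyond the invariants.
    pub fn push(self: *FixedStack, byte: u8) error{Full}!void {
        if (self.len == self.buffer.len) return error.Full;
        self.pushAssumeCapacity(byte);
    }

    /// ASSUMES: self.len < self.buffer.len. This is *not* guaranteed by
    /// the invariants; the caller must ensure there is room.
    pub fn pushAssumeCapacity(self: *FixedStack, byte: u8) void {
        std.debug.assert(self.len < self.buffer.len);
        self.buffer[self.len] = byte;
        self.len += 1;
    }
};

test "push reports a full stack instead of assuming capacity" {
    var storage: [2]u8 = undefined;
    var stack = FixedStack.init(&storage);
    try stack.push(1);
    try stack.push(2);
    try std.testing.expectError(error.Full, stack.push(3));
}
```

The extra assumption is stated in the doc comment and checked with an assertion, so a reader of the call site can tell which variant carries a caller obligation.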

Methods must uphold all invariants

Methods must not invalidate any invariants. If a method invalidates an invariant, then the invariant isn’t really an invariant!

✅️ Correct:

pub const Range = struct {
    // Invariant: #start <= #end
    #start: usize,
    #end: usize,

    pub fn skip10(self: *Range) void {
        if (self.#end - self.#start < 10) {
            self.#start = self.#end;
        } else {
            self.#start += 10;
        }
    }
};

❌️ Incorrect:

pub const Range = struct {
    // Invariant: #start <= #end
    #start: usize,
    #end: usize,

    pub fn skip10(self: *Range) void {
        // BAD: this could make #start greater than #end!
        self.#start += 10;
    }
};

Justification: Invariants are, by definition, always true. A method that fails to uphold an invariant is fundamentally contradictory to the nature of invariants.

Exception: Invariants may be temporarily broken within a method, as long as they are restored before the method returns. However, in this case, be mindful of calling any other methods while the invariants are broken: if those methods expect the invariants to be upheld, this could lead to bugs.
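
A minimal sketch of a method that breaks an invariant partway through and restores it before returning is shown below. `shiftRight` is a hypothetical method, not part of the guide's `Range` example, and the fields are public only because upstream Zig has no private fields.

```zig
const std = @import("std");

/// INVARIANT: start <= end
/// In Bun's fork the fields would be private (`#start`, `#end`).
const Range = struct {
    start: usize,
    end: usize,

    /// Shifts the whole range right by `n` (hypothetical method).
    pub fn shiftRight(self: *Range, n: usize) error{Overflow}!void {
        // Validate before mutating: since start <= end, if `end + n` fits
        // in a usize then `start + n` fits as well.
        const new_end = try std.math.add(usize, self.end, n);

        self.start += n;
        // The invariant may be broken here (start may exceed end), so no
        // other methods are called until it is restored.
        self.end = new_end;

        std.debug.assert(self.start <= self.end); // restored before returning
    }
};

test "shiftRight restores the invariant before returning" {
    var range: Range = .{ .start = 2, .end = 5 };
    try range.shiftRight(10);
    try std.testing.expectEqual(@as(usize, 12), range.start);
    try std.testing.expectEqual(@as(usize, 15), range.end);
}
```

Doing all fallible work before the first mutation means an early return can never leave the value in the broken intermediate state.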

Exception: It is permissible to have private methods that break invariants, but try to avoid this if possible. If such methods are needed, the fact that they break invariants should be explicitly documented.

Public fields cannot break invariants if set to an arbitrary value

If a field is public, it must be permissible to set that field to any arbitrary value [4] without breaking any invariants. If invariants could be broken this way, the field should be private.

✅️ Correct:

const OpenFile = struct {
    // Invariant: #fd is a valid file descriptor.
    #fd: c_int,

    pub fn init(...) Error!OpenFile { ... }
};

❌️ Incorrect:

const OpenFile = struct {
    // Invariant: `fd` is a valid file descriptor.
    fd: c_int, // BAD: anyone could set `fd` to an invalid file descriptor!

    pub fn init(...) Error!OpenFile { ... }
};

Justification: It should not be the responsibility of users of a type to uphold its invariants. If invariants can be invalidated by modifying public fields, it becomes very difficult in a complex codebase to ensure that the invariants aren’t broken, leading to bugs.

Avoid long or complex lists of invariants

Try to avoid writing types that have a long or complex list of invariants. Remember that you get a type’s invariants “for free” simply by having an instance of that type, so consider grouping some of the fields into separate types instead. Also consider whether the use of enums or tagged unions would be appropriate.

✅️ Correct:

const SliceRange = struct {
    #slice: []T,
    #start: usize, // Invariant: <= #slice.len; <= #end
    #end: usize, // Invariant: <= #slice.len
};

// good: no extra invariants required on this type
const ActiveIterator = struct {
    #range1: SliceRange,
    #range2: ?SliceRange,
};

// good: no extra invariants required on this type
const Iterator = union(enum) {
    #active: ActiveIterator,
    #done: void,
};

❌️ Incorrect:

const Iterator = struct {
    #done: bool,

    #slice1: []T,
    // Invariant: #start1 <= #slice1.len
    //   UNLESS #done is true, then #start1 == 0
    // Invariant: #start1 <= #end1
    //   UNLESS #done is true
    #start1: usize,
    // Invariant: #end1 <= #slice1.len
    //   UNLESS #done is true, then #end1 == 0
    #end1: usize,

    #slice2: ?[]T,
    // Invariant: #start2 < #slice2.len
    //   UNLESS #slice2 == null or #done, then #start2 == 0
    // Invariant: #start2 <= #end2
    #start2: usize,
    // Invariant: #end2 < #slice2.len
    //   UNLESS #slice2 == null or #done, then #end2 == 0
    #end2: usize,
};

Justification: Long or complex lists of invariants can make it hard to ensure that the invariants are always maintained.

Working with builtin & standard library types

Normally, users of a type should not be responsible for maintaining its invariants; this is why it’s important to use private fields when appropriate.

However, because private fields are only a feature of Bun’s fork of Zig, builtin and standard library types do not use them. This makes it difficult to ascribe any invariants to them according to the rules outlined above. Additionally, some kinds of invariants, such as pointer validity, are beyond the scope of what the compiler can enforce.

So, as a matter of practicality, some invariants are ascribed to builtin and standard library types, with the caveat that certain kinds of operations on those types are prohibited:

Pointers

  • Pointers are assumed to point to valid memory. Therefore:
    • Do not create or use dangling pointers.
    • Do not set a pointer to a dangling address.
  • By default, pointers are assumed to point to a valid object. Therefore:
    • Do not use pointers to uninitialized memory unless it is documented as acceptable for the pointed-to memory to be uninitialized.

Slices

  • The ptr in a slice is assumed to point to at least len elements. Therefore:
    • Don’t modify len directly; make a subslice or new slice.
    • Don’t modify ptr directly; make a new slice.
  • The statements about pointers apply to slices, since slices contain a pointer.
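
The slice rules above can be sketched as follows. `dropLast` is a hypothetical helper written for this illustration; it shortens a slice by re-slicing rather than by assigning to `len`.

```zig
const std = @import("std");

/// Hypothetical helper: returns `bytes` without its last `n` elements,
/// or an empty slice if `n` is larger than the slice.
fn dropLast(bytes: []const u8, n: usize) []const u8 {
    // Avoid: `var s = bytes; s.len -= n;`, which edits `len` in place.
    // Prefer: slicing, which produces a new slice value and is
    // bounds-checked in safe build modes.
    const keep = bytes.len - @min(n, bytes.len);
    return bytes[0..keep];
}

test "dropLast returns a shorter subslice" {
    try std.testing.expectEqualStrings("hel", dropLast("hello", 2));
}
```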

ArrayList

  • items.ptr in an ArrayList is assumed to point to at least capacity elements. Therefore:
    • Don’t modify the fields of an ArrayList directly. If you want the length or capacity to change, use the appropriate methods. If you want items.ptr to change, create a new ArrayList.
  • The statements about slices apply to items.
  • These points apply to all variants of ArrayList, including Managed and Aligned.

For other types not listed here, consider whether a reasonable person would assume certain conditions about the type are always true. If so, avoid breaking those assumptions by refraining from operations that could do so (e.g., direct field accesses).

Exception: If it is absolutely necessary to have an instance of a builtin/standard type that does not conform to these invariants, it must be explicitly documented as such, and great care must be taken not to expose the instance to any code that is not explicitly documented as being able to handle such broken invariants.

Footnotes

  1. For simplicity, this rule uses “if statement” to refer to both if statements and if expressions, which are distinct entities in Zig’s grammar.

  2. Except for propagating all errors upward, in which case a catch expression would likely not be used.

  3. This applies to all live variables, parameters, and fields. Uninitialized or invalidated values are exempt.

  4. In this case, “arbitrary value” includes only valid instances of the type (e.g., not undefined, unless the field is explicitly documented as permitting undefined).
