Cells, Builders and Slices

Cells, Builders and Slices are low-level primitives of TON Blockchain. The virtual machine of TON Blockchain, TVM, uses cells to represent all data structures in persistent storage, and most in memory.

Cells

A cell is a primitive and a data structure that ordinarily consists of up to $1023$ contiguously laid out bits and up to $4$ references (refs) to other cells. Circular references are forbidden and cannot be created by means of TVM, so cells can be viewed as quadtrees or directed acyclic graphs (DAGs) of themselves. Contract code itself is represented by a tree of cells.

Cells and cell primitives are bit-oriented, not byte-oriented: TVM regards data kept in cells as sequences (strings or streams) of up to $1023$ bits, not bytes. If necessary, contracts are free to use, say, $21$-bit integer fields serialized into TVM cells, thus using fewer persistent storage bytes to represent the same data.

Kinds

While the TVM type Cell refers to all cells, there are different cell kinds with various memory layouts. The one described earlier is commonly referred to as an ordinary (or simple) cell: it is the simplest and most commonly used flavor of cells, which can only contain data. The vast majority of descriptions, guides and references to cells and their usage assume ordinary ones.

Other kinds of cells are collectively called exotic (or special) cells. They sometimes appear in actual representations of blocks and other data structures on TON Blockchain. Their memory layouts and purposes significantly differ from ordinary cells.

Kinds (or subtypes) of all cells are encoded by an integer between $-1$ and $255$. Ordinary cells are encoded by $-1$, while exotic ones can be encoded by any other integer in that range. The subtype of an exotic cell is stored in the first $8$ bits of its data, which means valid exotic cells always have at least $8$ data bits.

TVM currently supports the following exotic cell subtypes:

Pruned branch cell, with subtype encoded as $1$: they represent deleted subtrees of cells.
Library reference cell, with subtype encoded as $2$: they are used for storing libraries, usually in masterchain contexts.
Merkle proof cell, with subtype encoded as $3$: they are used for verifying that certain portions of another cell's tree data belong to the full tree.
Merkle update cell, with subtype encoded as $4$: they always have two references and behave like a Merkle proof for both of them.

💡

Useful links:
Pruned branch cells in TON Docs
Library reference cells in TON Docs
Merkle proof cells in TON Docs
Merkle update cells in TON Docs
Simple proof-verifying example in TON Docs

Levels

Every cell, being a quadtree, has an attribute called level, which is represented by an integer between $0$ and $3$. The level of an ordinary cell is always equal to the maximum of the levels of all its references. In particular, the level of an ordinary cell without references is equal to $0$.

Exotic cells have different rules for determining their level, which are described in TON Docs.

Serialization

Before a cell can be transferred over the network or stored on disk, it must be serialized. There are several common formats, such as standard Cell representation and BoC.

Standard representation

Standard Cell representation is a common serialization format for cells, first described in tvm.pdf. Its algorithm for representing cells as octet (byte) sequences begins with serializing the first $2$ bytes, called descriptors:

Refs descriptor is calculated according to this formula: $r + 8 \cdot k + 32 \cdot l$, where $r$ is the number of references contained in the cell (between $0$ and $4$), $k$ is a flag for the cell kind ($0$ for ordinary and $1$ for exotic), and $l$ is the level of the cell (between $0$ and $3$).
Bits descriptor is calculated according to this formula: $\lfloor\frac{b}{8}\rfloor + \lceil\frac{b}{8}\rceil$, where $b$ is the number of bits in the cell (between $0$ and $1023$).
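As a rough illustration, both descriptor formulas can be written down as plain Tact functions. The names `refsDescriptor()`, `bitsDescriptor()` and `descriptorsExample()` are hypothetical helpers made up for this sketch and are not part of the Tact standard library:

```tact
// Hypothetical helper: refs descriptor, r + 8 * k + 32 * l
fun refsDescriptor(refs: Int, isExotic: Bool, level: Int): Int {
    let k: Int = isExotic ? 1 : 0; // cell kind flag
    return refs + 8 * k + 32 * level;
}

// Hypothetical helper: bits descriptor, floor(b / 8) + ceil(b / 8)
fun bitsDescriptor(bits: Int): Int {
    // For non-negative operands `/` rounds down,
    // so (bits + 7) / 8 gives the rounded-up quotient
    return bits / 8 + (bits + 7) / 8;
}

fun descriptorsExample() {
    // Ordinary cell with 2 references at level 0
    refsDescriptor(2, false, 0); // 2

    // 7 data bits: 0 + 1
    bitsDescriptor(7); // 1

    // 16 data bits: 2 + 2
    bitsDescriptor(16); // 4

    // Completely filled cell with 1023 data bits: 127 + 128
    bitsDescriptor(1023); // 255
}
```

Note that the bits descriptor is odd exactly when the number of data bits is not a multiple of eight, which tells a reader of the serialized form whether the last data byte carries padding.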

Then, the data bits of the cell themselves are serialized as $\lceil\frac{b}{8}\rceil$ $8$-bit octets (bytes). If $b$ is not a multiple of eight, a binary $1$ and up to six binary $0$s are appended to the data bits.

Next, for each referenced cell, $2$ bytes store its depth, i.e. the number of cells between the root of its cell tree (the referenced cell itself) and the deepest cell reachable from it, counting the latter. For example, a cell containing only one reference, which has no further references, would have a depth of $1$, while the referenced cell would have a depth of $0$.

Finally, for every referenced cell, the SHA-256 hash of its standard representation is stored, occupying $32$ bytes per cell and recursively repeating the said algorithm. Note that cyclic cell references are not allowed, so this recursion always ends in a well-defined manner.

To compute the hash of the standard representation of a cell, all the bytes from the steps above are concatenated together and then hashed using SHA-256. This is the algorithm behind the HASHCU and HASHSU instructions of TVM and the respective Cell.hash() and Slice.hash() functions of Tact.
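For a cell without references, the steps above reduce to two descriptor bytes followed by the padded data, which is small enough to check by hand. The sketch below assumes that the `sha256()` function applied to a Slice hashes exactly the data bits of that Slice. The name `hashExample()` is hypothetical:

```tact
fun hashExample() {
    // Ordinary cell with 7 data bits (0101010) and no references
    let cell: Cell = beginCell().storeUint(42, 7).endCell();

    // Its standard representation, byte by byte:
    // 0x00 is the refs descriptor: 0 refs, ordinary, level 0
    // 0x01 is the bits descriptor: floor(7 / 8) + ceil(7 / 8) = 0 + 1
    // 0x55 is the data: 0101010 followed by the padding bit 1
    let repr: Slice = beginCell().storeUint(0x000155, 24).asSlice();

    // Under the assumption stated above, both hashes are expected to match
    let same: Bool = sha256(repr) == cell.hash();
}
```

Since the cell has no references, no depth or hash bytes follow the data in this case.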

Bag of Cells

Bag of Cells, or BoC for short, is a format for serializing and de-serializing cells into byte arrays as described in the boc.tlb TL-B schema.

Read more about BoC in TON Docs: Bag of Cells.

💡

Advanced information on Cell serialization: Canonical Cell Serialization.

Immutability

Cells are read-only and immutable, but there are two major sets of ordinary cell manipulation instructions in TVM:

Cell creation (or serialization) instructions, which are used to construct new cells from previously kept values and cells;
And cell parsing (or deserialization) instructions, which are used to extract or load data previously stored into cells via serialization instructions.

On top of that, there are instructions specific to exotic cells to create them and inspect their values. However, ordinary cell parsing instructions can still be used on exotic ones, in which case they are automatically replaced by ordinary cells during such deserialization attempts.

All cell manipulation instructions require transforming values of Cell type to either Builder or Slice types before such cells can be modified or inspected.

Builders

Builder is a cell manipulation primitive for using cell creation instructions. Builders are immutable just like cells are, and allow constructing new cells from previously kept values and cells. Unlike cells, values of type Builder appear only on the TVM stack and cannot be stored in persistent storage. That means, for example, that persistent storage fields with type Builder would actually be stored as cells under the hood.

The Builder type represents partially composed cells, for which fast operations for appending integers, other cells, references to other cells and many others are defined.
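A minimal sketch of composing a cell step by step might look as follows. The function name `builderExample()` and the stored values are arbitrary choices made for illustration:

```tact
fun builderExample() {
    // Another cell to refer to
    let child: Cell = beginCell().storeUint(1, 1).endCell();

    // A partially composed cell
    let b: Builder = beginCell()
        .storeUint(42, 7) // 7-bit unsigned integer
        .storeBool(true)  // single bit
        .storeRef(child); // reference to another cell

    b.bits(); // 8, number of data bits stored so far
    b.refs(); // 1, number of references stored so far

    // Finalizing the Builder produces a new Cell
    let c: Cell = b.endCell();
}
```

Each `.store…()` call returns a Builder, which is what allows chaining the calls in this way.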

While you may use them for manual construction of the cells, it's strongly recommended to use Structs instead: Construction of cells with Structs.

Slices

Slice is a cell manipulation primitive for using cell parsing instructions. Unlike cells, Slices are mutable and allow extracting or loading data previously stored into cells via serialization instructions. Also unlike cells, values of type Slice appear only on the TVM stack and cannot be stored in persistent storage. That means, for example, that persistent storage fields with type Slice would actually be stored as cells under the hood.

The Slice type represents either the remainder of a partially parsed cell, or a value (subcell) residing inside such a cell and extracted from it by a parsing instruction.
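To illustrate the mutability, here is a small sketch in which every load shortens the remainder of the Slice. The function name `sliceExample()` and the stored values are arbitrary choices made for illustration:

```tact
fun sliceExample() {
    let c: Cell = beginCell().storeUint(42, 7).storeBool(true).endCell();

    // Start parsing: the Slice initially covers the whole cell
    let s: Slice = c.beginParse();
    s.bits(); // 8

    // Loading mutates the Slice, leaving only the unread remainder
    let num: Int = s.loadUint(7); // 42
    s.bits(); // 1

    let flag: Bool = s.loadBool(); // true

    // Nothing is left to parse, so this does not throw
    s.endParse();
}
```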

While you may use them for manual parsing of the cells, it's strongly recommended to use Structs instead: Parsing of cells with Structs.

Serialization types

Similar to serialization options of Int type, Cell, Builder and Slice also have various representations for encoding their values in the following cases:

as storage variables of contracts and traits,
and as fields of Structs and Messages.

contract SerializationExample {
    someCell: Cell as remaining;
    someSlice: Slice as bytes32;
 
    // Constructor function,
    // necessary for this example contract to compile
    init() {
        self.someCell = emptyCell();
        self.someSlice = beginCell().storeUint(42, 256).asSlice();
    }
}

`remaining`

The remaining serialization option can be applied to values of Cell, Builder and Slice types.

It affects the process of constructing and parsing cell values by causing them to be stored and loaded directly rather than as a reference. To draw parallels with cell manipulation instructions, specifying remaining is like using Builder.storeSlice() and Slice.loadSlice() instead of Builder.storeRef() and Slice.loadRef(), which are to be used by default.

In addition, the TL-B representation produced by Tact changes too:

contract SerializationExample {
    // By default
    cRef: Cell;    // ^cell in TL-B
    bRef: Builder; // ^builder in TL-B
    sRef: Slice;   // ^slice in TL-B
 
    // With `remaining`
    cRem: Cell as remaining;    // remainder<cell> in TL-B
    bRem: Builder as remaining; // remainder<builder> in TL-B
    sRem: Slice as remaining;   // remainder<slice> in TL-B
 
    // Constructor function,
    // necessary for this example contract to compile
    init() {
        self.cRef = emptyCell();
        self.bRef = beginCell();
        self.sRef = emptySlice();
        self.cRem = emptyCell();
        self.bRem = beginCell();
        self.sRem = emptySlice();
    }
}

There, ^cell, ^builder and ^slice in TL-B syntax mean a reference to Cell, Builder and Slice values respectively, while the remainder<…> of cell, builder or slice indicates that the given value would be stored as a Slice directly and not as a reference.

Now, to give a real-world example, imagine that you need to notice and react to inbound jetton transfers in your smart contract. The appropriate Message structure for doing so would look something like this:

message(0x7362d09c) JettonTransferNotification {
    queryId: Int as uint64;             // arbitrary request number to prevent replay attacks
    amount: Int as coins;               // amount of jettons transferred
    sender: Address;                    // address of the sender of the jettons
    forwardPayload: Slice as remaining; // optional custom payload
}

And the receiver in the contract would look like this:

receive(msg: JettonTransferNotification) {
    // ... you do you ...
}

Upon receiving a jetton transfer notification message, its cell body is converted into a Slice and then parsed as a JettonTransferNotification Message. At the end of this process, the forwardPayload will have all the remaining data of the original message cell.

Here, it's not possible to violate the jetton standard by placing the forwardPayload: Slice as remaining field in any other position in the JettonTransferNotification Message. That's because Tact prohibits the use of as remaining for any field but the last one in Structs and Messages, to prevent misuse of the contract storage and reduce gas consumption.

💡

Note that a cell serialized via as remaining cannot be optional. That is, specifying something like Cell? as remaining, Builder? as remaining or Slice? as remaining would cause a compilation error.

Also note that specifying remaining for a Cell used as the map value type is considered an error and won't compile.

`bytes32`

💡

To be resolved by #94.

`bytes64`

💡

To be resolved by #94.

Operations

Construct and parse

In Tact, there are at least two ways to construct and parse cells:

Manually, which involves active use of Builder, Slice and relevant methods.
Using Structs, which is a recommended and much more convenient approach.

Manually

Construction via `Builder`	Parsing via `Slice`
`beginCell()`	`Cell.beginParse()`
`.storeUint(42, 7)`	`Slice.loadUint(7)`
`.storeInt(42, 7)`	`Slice.loadInt(7)`
`.storeBool(true)`	`Slice.loadBool()`
`.storeSlice(slice)`	`Slice.loadSlice(slice)`
`.storeCoins(42)`	`Slice.loadCoins()`
`.storeAddress(address)`	`Slice.loadAddress()`
`.storeRef(cell)`	`Slice.loadRef()`
`.endCell()`	`Slice.endParse()`
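Put together, a manual round trip could be sketched as shown below. The function name `manualRoundTrip()` and all stored values are made up for illustration. The important detail is that values have to be loaded in the same order and with the same bit widths as they were stored:

```tact
fun manualRoundTrip() {
    let inner: Cell = beginCell().storeInt(-3, 7).endCell();

    // Construction via Builder
    let c: Cell = beginCell()
        .storeUint(42, 7)
        .storeCoins(1337)
        .storeAddress(myAddress())
        .storeRef(inner)
        .endCell();

    // Parsing via Slice, in the order of construction
    let s: Slice = c.beginParse();
    let num: Int = s.loadUint(7);        // 42
    let coins: Int = s.loadCoins();      // 1337
    let addr: Address = s.loadAddress(); // same as myAddress()
    let innerAgain: Cell = s.loadRef();  // equal to `inner`

    // Make sure that nothing is left unparsed
    s.endParse();
}
```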

Using Structs (recommended)

Structs and Messages are almost like living TL-B schemas, which means that they are essentially TL-B schemas expressed in maintainable, verifiable and user-friendly Tact code.

It is strongly recommended to use them and their methods like Struct.toCell() and Struct.fromCell() instead of manually constructing and parsing cells, as this allows for much more declarative and self-explanatory contracts.

The examples of manual construction and parsing above could be rewritten using Structs, with descriptive names of fields if one so desires:

// First Struct
struct Showcase {
    id: Int as uint8;
    someImportantNumber: Int as int8;
    isThatCool: Bool;
    payload: Slice;
    nanoToncoins: Int as coins;
    wackyTacky: Address;
    jojoRef: Adventure; // another Struct
}
 
// Here it is
struct Adventure {
    bizarre: Bool = true;
    time: Bool = false;
}
 
fun example() {
    // Basics
    let s = Showcase.fromCell(
        Showcase{
            id: 7,
            someImportantNumber: 42,
            isThatCool: true,
            payload: emptySlice(),
            nanoToncoins: 1330 + 7,
            wackyTacky: myAddress(),
            jojoRef: Adventure{ bizarre: true, time: false },
        }.toCell());
    s.isThatCool; // true
}

Note that Tact's auto-layout algorithm is greedy. For example, struct Adventure occupies very little space, so it won't be stored as a reference Cell, but will be provided directly as a Slice.

By using Structs and Messages instead of manual Cell composition and parsing, those details are abstracted away and won't cause any hassle when the optimized layout changes.

💡

Useful links:
Convert serialization
Struct.toCell() in Core library
Struct.fromCell() in Core library
Struct.fromSlice() in Core library
Message.toCell() in Core library
Message.fromCell() in Core library
Message.fromSlice() in Core library

Check if empty

Neither Cell nor Builder can be checked for emptiness directly — one needs to convert them to Slice first.

To check if there are any bits, use Slice.dataEmpty(). To check if there are any references, use Slice.refsEmpty(). And to check both at the same time, use Slice.empty().

To also throw an exit code 9 whenever the Slice isn't completely empty, use Slice.endParse().

// Preparations
let someCell = beginCell().storeUint(42, 7).endCell();
let someBuilder = beginCell().storeRef(someCell);
 
// Obtaining our Slices
let slice1 = someCell.asSlice();
let slice2 = someBuilder.asSlice();
 
// .dataEmpty()
slice1.dataEmpty(); // false
slice2.dataEmpty(); // true
 
// .refsEmpty()
slice1.refsEmpty(); // true
slice2.refsEmpty(); // false
 
// .empty()
slice1.empty(); // false
slice2.empty(); // false
 
// .endParse()
try {
    slice1.endParse();
    slice2.endParse();
} catch (e) {
    e; // 9
}

💡

Useful links:
Cell.asSlice() in Core library
Builder.asSlice() in Core library
Slice.dataEmpty() in Core library
Slice.refsEmpty() in Core library
Slice.empty() in Core library
Slice.endParse() in Core library

Check if equal

Values of type Builder cannot be compared directly using binary equality == or inequality != operators. However, values of type Cell and Slice can.

Direct comparisons:

let a = beginCell().storeUint(123, 8).endCell();
let aSlice = a.asSlice();
 
let b = beginCell().storeUint(123, 8).endCell();
let bSlice = b.asSlice();
 
let areCellsEqual = a == b; // true
let areCellsNotEqual = a != b; // false
 
let areSlicesEqual = aSlice == bSlice; // true
let areSlicesNotEqual = aSlice != bSlice; // false

Note that direct comparison via == or != operators implicitly uses SHA-256 hashes of the standard Cell representation under the hood.

Explicit comparisons using .hash() are also available:

let a = beginCell().storeUint(123, 8).endCell();
let aSlice = a.asSlice();
 
let b = beginCell().storeUint(123, 8).endCell();
let bSlice = b.asSlice();
 
let areCellsEqual = a.hash() == b.hash(); // true
let areCellsNotEqual = a.hash() != b.hash(); // false
 
let areSlicesEqual = aSlice.hash() == bSlice.hash(); // true
let areSlicesNotEqual = aSlice.hash() != bSlice.hash(); // false

💡

Useful links:
Cell.hash() in Core library
Slice.hash() in Core library
== and !=
