Demystifying Codex Protocol: How the P2P Network Works

Decentralised storage networks (DSNs) offer a more durable, censorship-resistant way to store large amounts of data without relying on centralised parties. Protocols like Codex facilitate a peer-to-peer data storage network that is designed to persist data on decentralised infrastructure without being managed by any one party.

Codex expands the traditional DSN design with its Decentralised Durability Engine (DDE) architecture, which builds on the initial concept of a DSN by introducing powerful reliability and durability features, as well as an open marketplace for storage and retrieval.

The Codex network is in an early testnet phase, with users currently able to join and operate Codex nodes on the live test network. As outlined in its roadmap for 2025 and beyond, Codex aims to roll out an incentivised testnet in the second half of 2025 before achieving mainnet readiness closer to the end of that year.

To help you understand how the protocol works, Jessie, Guru, and Ben from the Codex team have recorded a deep dive into the Codex peer-to-peer network, covering everything from bootstrapping to storage incentivisation and verification.

Watch the video below or read further for a breakdown of how the Codex peer-to-peer network works:

Joining the Codex Network

To begin using Codex, a user runs a Codex node on their local machine. When the node starts up, it connects to a known “bootstrap node”, a publicly accessible entry point that helps new participants discover other peers in the network. 

The bootstrap node returns a list of peer nodes in the form of Signed Peer Records (SPRs). With these records, the new node can join the wider Codex peer-to-peer network and then discover and store connection information for further peers.

The Codex node operates using two ports: one for peer discovery (UDP port 8090 by default) and another for listening, or waiting for inbound requests (TCP port 8070 by default).

For full participation in the network, both ports must be reachable from the public internet. This means opening them in any software or hardware firewall and, if the node sits behind a router, forwarding them to the machine running Codex.
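As a rough way to check reachability, the following minimal Python sketch tests whether the TCP listening port accepts connections. It assumes the default TCP port 8070 mentioned above; the function name tcp_port_open is illustrative and not part of Codex. Run it from a machine outside your network against your public address. The UDP discovery port cannot be checked this way, because UDP is connectionless.

```python
import socket


def tcp_port_open(host: str, port: int = 8070, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    8070 is the default Codex listening port named in this article;
    pass a different port if your node is configured otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A result of False from an external machine usually points to a firewall rule or a missing port-forward rather than a problem with the node itself.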

Sharing Files in Altruistic Mode

Codex currently supports a non-incentivised, peer-to-peer sharing mode known as Altruistic Mode, which allows users to share files directly without remuneration or guarantees of persistence. 

Here is how it works:

Users who wish to share a file upload it to their local Codex node using a REST API, typically through a POST request to the /data endpoint.

The file is divided into fixed-size blocks of 64 kilobytes. Each block is then hashed using a cryptographic hash function, and those hashes are recursively combined into a Merkle tree.

The final hash at the top of the tree, called the Merkle root, uniquely represents the file’s contents.
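The chunking and hashing steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Codex implementation: it reads "64 kilobytes" as 64 KiB, uses SHA-256 as the hash function, and duplicates the last node when a tree level has an odd number of entries. The article does not specify any of these details.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumption: "64 kilobytes" read as 64 KiB


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks; the last block may be shorter."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return blocks or [b""]  # an empty file still yields one (empty) block


def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash every block, then combine hashes pairwise until one root remains."""
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd node out
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Because every block hash feeds into the root, changing a single byte anywhere in the file produces a different root.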

This Merkle root, along with metadata such as block size, total file size, and encoding format, is included in a Basic Manifest. The manifest itself is stored as a data block, given a Content Identifier (CID), and saved in the Repo Store, a local storage area on the node. 

Any node in the network can request and download any content for which it has a CID. The request is handled by the Block Exchange Engine, which serves the associated data from the Repo Store.

When another user wishes to retrieve the file, they simply need the CID of the Basic Manifest. Their Codex node fetches the manifest, interprets how to retrieve all of the associated blocks, and reconstructs the original file accordingly.
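The upload and retrieval flow above could look roughly like this from a client application, using only the Python standard library. Only the POST request to the /data endpoint comes from this article. The base URL (port 8080 and the /api/codex/v1 prefix), the download path, and the idea that the upload response body is the CID are assumptions that should be checked against the API documentation.

```python
import urllib.request

# Assumption: address and path prefix of a local node's REST API.
BASE_URL = "http://localhost:8080/api/codex/v1"


def build_upload_request(data: bytes, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request that uploads raw bytes to the /data endpoint."""
    return urllib.request.Request(
        f"{base_url}/data",
        data=data,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_download_request(cid: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request for the content behind a manifest CID.

    Assumption: the path below is illustrative; confirm it in the API docs.
    """
    return urllib.request.Request(f"{base_url}/data/{cid}/network/stream", method="GET")


def send(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send a prepared request to a running node and return the response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a node running, send(build_upload_request(file_bytes)) would return the response body, which under the assumption above is the CID to pass to build_download_request on another node.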

Persistence Mode and Erasure Coding

While altruistic sharing is useful for casual or temporary transfers and protocol testing, many use cases require robust and durable decentralised storage with uptime guarantees. This is where Persistence Mode comes into play. 

This incentivised storage model allows users to pay Codex storage providers to host their files for a specified duration. In return, those providers must offer verifiable proof that the data remains intact. This process is enabled through smart contracts, incentives, the dispersal of data across a decentralised network, proof challenges, and erasure coding.

Storage providers in Codex do not compete via manual bidding. Instead, each provider passively monitors the blockchain for storage requests. Each is pre-configured with rules and constraints, such as minimum price per byte per second, maximum contract length, and maximum collateral.

When a posted storage request matches a provider’s configuration, the provider automatically reserves a slot and stores the required data. The user does not choose providers directly; instead, providers opt in to store files based on predefined conditions. This makes the system scalable and truly decentralised.
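The matching step can be pictured as a simple rule check. The sketch below is hypothetical: the class names, field names, and integer units are invented for illustration, and only the three constraints (minimum price per byte per second, maximum contract length, maximum collateral) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderRules:
    """Hypothetical provider configuration."""
    min_price_per_byte_second: int
    max_duration_seconds: int
    max_collateral: int


@dataclass(frozen=True)
class StorageRequest:
    """Hypothetical view of a storage request observed on chain."""
    price_per_byte_second: int
    duration_seconds: int
    collateral: int


def accepts(rules: ProviderRules, request: StorageRequest) -> bool:
    """Return True when the request satisfies every configured constraint."""
    return (
        request.price_per_byte_second >= rules.min_price_per_byte_second
        and request.duration_seconds <= rules.max_duration_seconds
        and request.collateral <= rules.max_collateral
    )
```

A provider running a check like this needs no manual bidding: any request that passes is eligible for a slot reservation, and any request that fails is ignored.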

The exchange process is similar to Altruistic Mode, with the file being locally uploaded, chunked, and stored. However, for persistence, Codex transforms the file into a more robust structure by applying erasure coding. 

This involves generating additional parity blocks from the original data blocks. These parity blocks allow the system to recover the data even if parts of it become unavailable.
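To make the idea of parity concrete, here is a deliberately simple Python sketch that uses a single XOR parity block. This is not the erasure code Codex uses, which this article does not name; production systems typically rely on codes that tolerate several lost blocks, whereas XOR parity can recover only one. The function names are illustrative.

```python
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR a list of equal-length blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks must have the same length")
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_parity(data_blocks: list[bytes]) -> bytes:
    """Compute one parity block over the data blocks."""
    return xor_blocks(data_blocks)


def recover_missing(blocks: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild at most one missing block (marked None) from the rest plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can recover only one missing block")
    restored = list(blocks)
    if missing:
        present = [block for block in blocks if block is not None]
        restored[missing[0]] = xor_blocks(present + [parity])
    return restored
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, which is why one loss is recoverable.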

The combined original and parity blocks are arranged into a new linear dataset. A new Merkle tree is then constructed, and its root becomes the basis of a Protected Manifest, which includes the erasure coding parameters and other metadata.

In accordance with its design as a Decentralised Durability Engine, Codex then strategically distributes the blocks across multiple storage nodes, ensuring that original and parity blocks for any given data segment are dispersed across a decentralised network and are not stored on the same node. This mitigates single points of failure and bolsters overall durability.
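The placement constraint described above (no two blocks of the same segment on the same node) can be illustrated with a small Python sketch. The rotation scheme and all names here are hypothetical. In Codex itself, providers opt in to slots through the marketplace rather than being assigned by a central function like this one.

```python
def place_segment_blocks(
    num_segments: int, blocks_per_segment: int, nodes: list[str]
) -> dict[tuple[int, int], str]:
    """Map (segment index, block index) to a node, rotating the start per segment.

    Within one segment every block lands on a different node, which requires
    at least as many nodes as there are blocks in a segment.
    """
    if len(nodes) < blocks_per_segment:
        raise ValueError("need at least one node per block in a segment")
    placement = {}
    for segment in range(num_segments):
        for block in range(blocks_per_segment):
            placement[(segment, block)] = nodes[(segment + block) % len(nodes)]
    return placement
```

Because the blocks of a segment occupy consecutive positions in the rotation and there are at least as many nodes as blocks, losing a single node removes at most one block from any segment.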

Verifiable Manifest and Generating Storage Proofs

Codex uses zero-knowledge proofs to confirm that storage providers are hosting the promised data. These allow providers to prove they possess the data without revealing it. This process requires creating a Verifiable Manifest.

At this stage, the blocks are broken down into even smaller pieces called cells, which are 2 kilobytes in size. These cells are hashed using Poseidon2, a hashing function optimised for use in zero-knowledge proofs. Each block becomes a small Merkle tree of cell hashes, with a block root. Blocks are then grouped into slots, each of which also has a Merkle root, called a slot root.

All slot roots are combined into a final Merkle tree, with a single root hash at the top called the Verification Root. This root captures the integrity of the entire erasure-coded dataset; if any of the file or parity data is changed, the root will also change, and verification will fail.
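The three-level structure (cells, blocks, slots) can be sketched as nested Merkle trees. The Python below is an illustration only: it substitutes SHA-256 for Poseidon2, which is not available in the Python standard library, reads "2 kilobytes" as 2 KiB, and duplicates the last node on odd tree levels. None of these choices is confirmed by this article.

```python
import hashlib

CELL_SIZE = 2 * 1024  # assumption: "2 kilobytes" read as 2 KiB


def h(data: bytes) -> bytes:
    """Stand-in hash; Codex uses Poseidon2 here, not SHA-256."""
    return hashlib.sha256(data).digest()


def merkle_root(hashes: list[bytes]) -> bytes:
    """Combine a non-empty list of hashes pairwise until one root remains."""
    if not hashes:
        raise ValueError("cannot build a tree from zero leaves")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd node out
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_root(block: bytes) -> bytes:
    """Split a block into cells, hash each cell, and return the block root."""
    cells = [block[i:i + CELL_SIZE] for i in range(0, len(block), CELL_SIZE)]
    return merkle_root([h(cell) for cell in cells])


def slot_root(blocks: list[bytes]) -> bytes:
    """Root over the block roots of all blocks grouped into one slot."""
    return merkle_root([block_root(block) for block in blocks])


def verification_root(slots: list[list[bytes]]) -> bytes:
    """Root over all slot roots, covering the whole erasure-coded dataset."""
    return merkle_root([slot_root(blocks) for blocks in slots])
```

Flipping one byte in any cell changes that block root, then its slot root, and finally the verification root, which is the property the proofs rely on.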

The Verifiable Manifest includes the verification root, all slot roots, and other relevant metadata. It is stored as a block with its own CID, and that CID is what the smart contract storage request submitted to the blockchain references.

Developer Integration and API Usage

A Codex node provides a REST API which allows user applications to interact with data and storage contracts.

All API documentation for interacting with the Codex network is available at api.codex.storage.

Developers can currently integrate Codex into their applications by operating and interacting with one or multiple nodes in Altruistic Mode on the Non-Incentivised Testnet.

Codex DevRel Guru recently published an example of how Codex can be integrated into a decentralised file-sharing application alongside using Waku for decentralised communications. 

This project leverages two components of the Logos tech stack to deliver a file-sharing application that lets users upload files to Codex, share the CID over Waku, and download files directly from each other’s nodes.

Read more about building on Codex and Waku.

Join the Codex Testnet

Codex is in a Non-Incentivised Testnet phase, open to anyone who wants to run a node and experiment with building applications that leverage durable, decentralised storage.

The Codex Installer CLI application makes it simple to install Codex, run a node, and join the public testnet.

A step-by-step tutorial is available to guide you through the setup with the Codex Storage CLI and explain how to claim the Altruistic Mode and Active Participant roles in the Codex Discord.

Once your node is running, share your feedback to help us improve the client, and follow live statistics on the metrics dashboard.

Follow us on social media, join our Discord, and subscribe to our newsletter to get the latest updates from Codex.