
Decentralised Storage for Virtual Self-Sovereign Territories

Sterlin Lujan

Feb 5, 2024 • 8 min read

Codex is building durable, censorship-resistant storage to achieve greater decentralisation and internet neutrality, and to encourage the development of virtual self-sovereign territories. Without decentralised storage, the internet remains more vulnerable to hacks, outages, censorship, and attacks, and virtual territories would be challenging to deploy.

For the sake of simplicity, we will refer to 'virtual self-sovereign territory' as 'virtual territory.' It is also important to note that virtual territories are a controversial idea. They represent an extinction-level event for unprepared organisations. At the very least, emerging virtual territories would alter the world's political climate by shifting resources, wealth, human capital, and how data is leveraged.

This piece will detail Codex's data storage solution for building virtual territories. First, we will explore Codex's relationship to Logos, the organisation dedicated to developing the decentralised web and creating virtual territories. We will define 'virtual self-sovereign territory' and explain why it is vital for the future of the decentralised web. We will do a deep dive into the problems with data sovereignty and data colonialism. Lastly, we will explain the coming sea change in data management and how Codex's innovation is helping clear a path for virtual territories.

Let us start with how Codex works with Logos.

The Sovereign Stack and Net Neutrality

Logos is the parent organisation of Codex. It maintains a 'technology stack' of three primary projects: Codex, Nomos, and Waku. Codex is the decentralised and durable data storage layer. Waku is the peer-to-peer communications layer. Nomos is the consensus layer.

These projects provide infrastructure for net neutrality and the creation of virtual territories. The term 'net neutrality' is politically loaded. Here, we use it to refer to an internet that no single government, corporation, or organisation controls. Net neutrality also means a distributed and shared internet data ecosystem. Logos is the organisation that is giving life to this new internet landscape. This is how the org describes itself:

'Logos is a collection of learning communities that will govern and sustain the network in the spirit of the original cypherpunks. Together, they form the grassroots movement needed to build the social, economic, and governing institutions that will live on the technology stack.'

The concept of 'social, economic, and governing institutions' is vital to understanding the overarching goal of the Logos organisation. Let us demystify it.

Virtual Self-Sovereign Territories

'Social, economic, and governing institutions' can also be called 'virtual self-sovereign territories.'

A virtual self-sovereign territory is a network of individuals who share a particular set of ideals, values, and goals. These individuals sometimes seek digital and physical secession from legacy institutions and nation-states. Virtual territories may also have a governance mechanism to define the parameters of their members' behaviour.

A similar term, 'network state,' was introduced by former Coinbase CTO Balaji Srinivasan. In his book, 'The Network State,' he defines it as follows:

'A network state is a highly aligned online community with a capacity for collective action that crowdfunds territory around the world and eventually gains diplomatic recognition from pre-existing states.'

The primary difference between 'virtual self-sovereign territory' and a 'network state' is that virtual territory is an umbrella term encompassing ideas like a virtual autonomous zone, network state, or decentralised autonomous state.

Until recently, the idea of a virtual territory or network state was a pipe dream because world governments could shut them down. In other words, they were susceptible to external threats and did not possess antifragile characteristics. To maintain an antifragile scaffolding, these organisations needed a way to decentralise their infrastructure and gain fault tolerance, primarily through consensus, communication, and storage mechanisms.

Codex focuses specifically on the problem of data storage. We will examine that here at length.

Centralised Chokepoints

One of the most significant gaps in creating a virtual territory is the lack of decentralised, censorship-resistant data storage. Any centralised storage server can be taken down by an outage, a hacker, or a government seizure, compromising all the data it holds. This problem represents a serious threat vector for establishing virtual territories because governments and black hat actors could view virtual territories as potential threats or lucrative targets.

Shutting down the data servers that power such a community would be trivial. Cracking their digital defences and absconding with data would also be trivial. A threat actor would not need full access to their server; they could merely compromise the data availability of a virtual territory.

Currently, most user data is stored in vast warehouses owned by Google, Amazon Web Services, Facebook, and other players in the web2 ecosystem. Imagine if a virtual territory's data was stored centrally within a Google data warehouse. That would render that virtual territory open to attack. To understand the risks of current data infrastructure, we must look at 'data sovereignty' and how governments attempt to control data within their territories.

Data Sovereignty

In the modern world, governments care deeply about data integrity within their jurisdictions and their alleged sovereignty over that data. According to an Imperva article, 'Data sovereignty refers to the idea that a country or jurisdiction has the authority and right to govern and control the data generated within its borders. This means that the government can regulate the collection, storage, processing, and distribution of data that originates within its territory.'

According to a PREDIK article, these are some of the ways governments benefit from big data:

Tax information
Security
Defence
Financial fraud reporting
General welfare data
Medical and educational records

Governments must also work closely with tech giants who gatekeep access to this kind of data. In this way, data ownership is mutually maintained by a nation and a tech company. However, this relationship suggests that data can be corrupted or misused by these entities, undermining the rights of the people who helped create the data in the first place. This 'privatisation' of data for potentially malign use cases has been called data colonialism.

Data Colonialism

Data colonialism is a term coined by Nick Couldry and Ulises A. Mejias. It is the idea that world powers currently dominate data to the point where they control vital resources, manage capital accumulation, and can create new social orders on a whim. Defined, data colonialism is 'The process by which governments, non-governmental organisations and corporations claim ownership of and privatise the data produced by their users and citizens.'

An example of the consequences of data colonialism is the notorious Facebook and Cambridge Analytica scandal. Brittany Kaiser blew the whistle on how Cambridge Analytica used Facebook user data in an attempt to influence the 2016 US presidential election. This is an example of data becoming privatised for questionable or disreputable use cases. It also speaks to the idea that data can be weaponised and used to manipulate people.

However, data colonialism does not only imply that superpowers can control and tyrannise the population. It also highlights the current fragile nature of data storage as a whole. Even if one assumes that big data companies are honest and trying to do their best with user data, the vast centralisation of it into repositories represents a significant threat on its own. For instance, hackers and nefarious actors could also attack those troves of data, causing the same problems for virtual territories. In this way, data colonialism is an internet weakness, even if we were to assume the governments and tech companies involved have benevolent intentions.

Residency Gaps, OPSEC, and Decentralisation

In the future, data colonialism can be curtailed by decentralisation efforts. One of the problems that leads to the colonialism of data is its 'residency.' Data residency refers to where data is physically housed, and it matters because governments and corporations have a vested interest in keeping data within arm's reach. Not only do governments hope to control the data, but they also have an interest in protecting it.

This mindset has led to the spread of stringent data regulations. One example is the European GDPR, or the General Data Protection Regulation, which aims to guard data from abuse and misuse by the private and public sectors. However, these laws appear less than impactful due to the proliferation of data honeypots, which makes adequately protecting information challenging and sometimes futile. The security firm Astra has published some alarming data trends and statistics:

'79% of critical infrastructure organizations didn't employ a zero-trust architecture. 45% of the data breaches were cloud-based. 30% of all large data breaches occur in hospitals. Data breaches exposed at least 42 million records between March 2021 and February 2022.'

This is unsettling, but the good news is that decentralisation offers a solution to many of the risks around data aggregation.

For instance, data will no longer be stored on servers at vast data farms. Instead, it will be distributed among many nodes on anonymous or pseudonymous networks. A more distributed data ecosystem changes governance dynamics and how organisations approach data. Without an armamentarium of sensitive data at their disposal, large organisations can no longer use data for malign purposes. Nor do large data caches remain exposed to threat actors through lazy opsec (operational security) and insufficient risk management practices.

Many incumbent organisations in the Web3 ecosystem seek to provide data storage solutions that radically shift 'data residency' and solve the aforementioned problems. Some want to decentralise data, give it back to individual users, or act as middleware to facilitate censorship-resistant data storage. Some known projects with various business models include IPFS (the InterPlanetary File System), Filecoin, Arweave, Storj, and MaidSafe.

These projects are ushering in a fairer data market where users can choose where to store their data. This is a boon for users and citizens of virtual territories because it prevents new, emergent network states from manipulating sensitive information. Data therefore becomes inextricably linked to personal identity, and people can provide or sell it of their own accord. At the very least, data gets broken up across several nodes, creating a more balanced data ecosystem.

Let's examine how Codex accomplishes this task and facilitates the growth of virtual territories.

Codex: Decentralised Storage for Virtual Territories

Through technologies such as erasure coding, Codex will provide the tools to decentralise and protect user data. In this way, hacking or censoring data becomes nontrivial. It also mitigates issues around outages or server shutdowns.

Erasure coding splits data into fragments, adds redundant parity fragments computed from them, and distributes the resulting shards among many nodes. This makes large volumes of information easier to manage and access. More importantly, erasure-coded data can be reassembled even if some nodes lose their chunks, provided enough of the remaining shards survive. In this way, the data gains greater fault tolerance within the network.
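To make the idea concrete, here is a minimal sketch of the simplest possible erasure code: a handful of data shards plus a single XOR parity shard. It is an illustration only, not Codex's implementation. Production systems typically rely on stronger codes such as Reed-Solomon, which tolerate the loss of several shards at once. The function names `encode` and `decode` and the sample message are hypothetical.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size shards and append one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")  # zero-pad the final shard
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def decode(shards: list[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the data when at most one shard (data or parity) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only recover one lost shard")
    shards = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # The XOR of all surviving shards equals the missing shard.
        shards[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and strip the padding.
    return b"".join(shards[:-1])[:original_len]


# Hypothetical usage: each of the five shards would live on a different node.
message = b"citizenship record #42"
stored = encode(message, k=4)
stored[2] = None  # simulate a node that went offline
assert decode(stored, len(message)) == message
```

No single node holds the whole record, yet the record survives the loss of any one node. That is the property the paragraph above describes, in its most basic form.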

In today's world, data is mainly protected by a replication process, where it is copied and stored on servers in the warehouses we mentioned. Of course, this has also compounded the problem of 'data residency', leading to large honeypots of attack-prone servers sitting out in the open. Read Dr Leonardo Bautista's explanation here to learn more about erasure coding and other technical aspects that Codex leverages.
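The trade-off between replication and erasure coding can be put in rough numbers. The sketch below compares storage overhead and the number of lost nodes each scheme tolerates. It assumes an ideal (maximum distance separable) erasure code, such as Reed-Solomon, in which any k of the k + m shards are enough to rebuild the data. The parameters are illustrative and are not Codex's actual settings.

```python
def replication(copies: int) -> tuple[float, int]:
    """Return (storage overhead, tolerated losses) for full-copy replication."""
    return float(copies), copies - 1


def erasure_code(k: int, m: int) -> tuple[float, int]:
    """Return the same figures for an ideal code with k data and m parity shards."""
    return (k + m) / k, m


# Both schemes below survive the loss of two nodes.
print(replication(3))      # (3.0, 2): three times the original size is stored
print(erasure_code(4, 2))  # (1.5, 2): one and a half times the original size
```

Under these assumptions, erasure coding reaches the same tolerance for lost nodes while storing half as much data, and no node ever holds a complete copy.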

This type of decentralised data storage technique both empowers and disempowers virtual territories:

It disempowers virtual territories. One of the biggest challenges with data is that it is prone to corruption, manipulation, and censorship. In this way, distributed and erasure-coded data being largely inaccessible to virtual states is a feature rather than a bug. If a virtual territory had 'data sovereignty', it would also likely lead to the problems of data colonialism we see today.
It empowers virtual territories. It allows them to manage critical data without having to store it in a vast repository along with its replicas. In this way, threat actors who view the virtual territory as a political opponent cannot attack or surveil sensitive data belonging to it.

A virtual territory may be empowered to custody a wide range of such sensitive data, which it would protect through any means available to it. This data includes:

Citizenship identity metadata
Virtual and physical land ownership certificates
Collaboration and communication documents
Images, videos, and other media files containing citizenship information (including non-fungible token image files)
Health, work, and other business contract documents

For any of this data, the virtual territory will offload and distribute it to protect itself and its members or subscriber citizens. Through this process, Codex helps ease the emergence of Web3 and virtual territories.

Conclusion: Innovation Wellspring

Codex, Logos, and many in the crypto ecosystem are creating a more neutral internet landscape. The purpose is undoubtedly to protect privacy, human dignity, and civil liberties. However, it also prepares the internet for a new type of institution: the virtual territory.

The virtual territory would be untenable in an environment where data is easily manipulated and abused. But now that the paradigm is shifting toward more decentralised organisations, a new world of governance and social organisation is emerging. And this wellspring of innovation is heralded by the change in how we manage, protect, and use data. It is only a matter of time before these new institutions become a mainstay of everyday life.