IPLD Alliance

Web3 Data Infra Club

Exploring the Boundaries of Web3 Data Infrastructure.

W3DI GitHub
Data Warehouse

A data warehouse is a type of data management system that is designed to enable and support business intelligence (BI) activities, especially analytics.

Read More
Data Lake

A data lake is a cost-effective storage repository that holds a vast amount of raw data in its native raw format until it is needed for analytics applications.

Read More
Data Lakehouse

A data lakehouse is an open data management architecture that combines the flexibility and cost-efficiency of data lakes with the features of data warehouses.

Read More
Data Mesh

A new paradigm that treats domains as the first-class concern, applies platform thinking to create self-serve data infrastructure, and treats data as a product.

Read More

Web3 Data Infra
From a Middleware Perspective

In the blockchain industry, traditional user scenarios will still emerge, but the middleware layer will become increasingly prominent in a decentralised application architecture.

  • This is because decentralised application architecture leads to fragmented product forms, which limit the attention that developers and users can give to any one product.
  • These scattered product forms will be aggregated in some reasonable way to make them easier for developers and users to use.
  • Dapps will not need to be tied directly to each other; instead, middleware will sit between them and tie them together.
Technical Market Analysis Middleware
Web3 middleware – The bet on infrastructures

Data Mesh On Pando

Zhamak Dehghani introduces Data Mesh, the next-generation data platform, which shifts to a paradigm drawn from modern distributed architecture: it considers domains the first-class concern, applies platform thinking to create self-serve data infrastructure, and treats data as a product.

  • Domain-oriented decentralized data ownership and architecture
  • Data as a product
  • Self-serve data infrastructure as a platform
  • Federated computational governance
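The "data as a product" principle can be sketched as a minimal contract between a domain team and its consumers. All names below (`DataProductMeta`, `DataProduct`, `PaymentsEvents`, the example address) are illustrative assumptions, not part of any Pando API:

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class DataProductMeta:
    domain: str   # owning domain team (domain-oriented ownership)
    version: str  # contract version published to consumers
    address: str  # discoverable address, e.g. a CID or URL

class DataProduct(Protocol):
    """Anything a domain publishes to the mesh: owned, versioned, readable."""
    meta: DataProductMeta
    def read(self) -> list[dict]: ...

class PaymentsEvents:
    """A hypothetical data product owned by a 'payments' domain team."""
    meta = DataProductMeta(domain="payments", version="1.0.0",
                           address="ipfs://example")
    def read(self) -> list[dict]:
        return [{"event": "charge", "amount": 42}]
```

The point of the sketch is that ownership, versioning, and addressability travel with the data itself, rather than living in a central data team's pipeline.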

The InterPlanetary File System (IPFS) is a protocol and network for storing and sharing data in a distributed filesystem. IPFS uses content addressing to identify each unique data resource persisted in the global namespace connecting all participating devices (nodes). A content identifier (CID), the means through which a data resource becomes addressable, is essentially a hash that performs two essential functions, which together provide the premise for building an open data mesh.

  • First, a CID verifies the content of the data resource; it can be thought of as a digital fingerprint.
  • Second, a CID provides the means to find the data in the network (i.e. it is routable). In effect, a CID provides verifiability and addressability in a single name.
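The two properties above can be illustrated with a minimal sketch of how a CIDv1 for raw bytes is assembled (sha2-256 digest, base32-lower multibase). This mirrors the real layout but is simplified, e.g. it hard-codes single-byte varints, which happens to be valid only because all the prefix values here are below 0x80:

```python
import base64
import hashlib

def raw_cid_v1(data: bytes) -> str:
    # multihash = hash-function code (0x12 = sha2-256) + digest length + digest
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, len(digest)]) + digest
    # CIDv1 = version (0x01) + content codec (0x55 = raw) + multihash
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # multibase: 'b' prefix marks base32-lower, padding stripped
    return "b" + base64.b32encode(cid_bytes).decode().lower().rstrip("=")

same_a = raw_cid_v1(b"hello")
same_b = raw_cid_v1(b"hello")   # identical content -> identical CID (verifiability)
other = raw_cid_v1(b"hello!")   # any change -> a different CID
```

Because the name is derived only from the content, any node that stores the bytes can answer a request for that CID, which is what makes the name routable across the network.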
Move Beyond a Monolithic Data Lake to a Distributed Data Mesh
Data Mesh Principles and Logical Architecture

Data Infrastructure Evolution
No Silver Bullet!

These four data design patterns aren’t mutually exclusive — they may co-exist in an enterprise, for instance, with a cross-functional domain team that has its own data lake. However, there is traceable evolution from data warehouse to data lake to data mesh, driven by the need to overcome certain architectural limitations.

Read More

Web3 Data Mesh Reference Architecture

The data mesh is built using a self-service layer on top of the data infrastructure as a platform (Pando), where we find one or more data lakes or object stores; ingestion, transformation, and orchestration engines; and data warehouses and/or data querying services.

  • All
  • Data Layer
  • Computation Layer
  • Domain Layer

Filecoin Data Infra Overview

Filecoin is a token-based data infrastructure protocol that supports a decentralised storage and delivery network. The Retrieval Market facilitates a decentralized, trustless CDN for content-addressed data.

Mind Power of Web3 Data Infra

Web3 Data Infra Mind Powers is a collection of best practices that designers can consider when building Pando Project user experiences & interfaces.

Evolution of Blockchain Components to Off-Chain Models

We’ll be exploring how components such as record keeping (storage) and smart contracts (computation) can be moved off-chain to enable more robust computation and storage without sacrificing security or scalability.

Read More
Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics

Lakehouses can help address several major challenges with data warehouses, including data staleness, reliability, total cost of ownership, data lock-in, and limited use-case support.

Read More
Data Mesh Principles and Logical Architecture

Data mesh is a new paradigm for building the next-generation data platform, founded on four principles: domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance.

Read More
Emerging Architectures for Modern Data Infrastructure

To help data teams stay on top of the changes happening in the industry, we’re publishing in this post an updated set of data infrastructure architectures. They show the current best-in-class stack across both analytic and operational systems, as gathered from numerous operators we spoke with over the last year. Each architectural blueprint includes a summary of what’s changed since the prior version.

Read More
The Composable Web3 Data Network

Financial composability is not the only form of composability. There is an even larger opportunity: data composability. All ledgers, asset ledgers included, must achieve composability. As more data, state, and functions are added to a decentralized ledger, they increase the breadth and depth of the substrate on top of which new applications can be built. Composability is the ultimate network effect.

Read More
SQL is the King

We believe that SQL has become the universal interface for data analysis. As in networking, we have a complex stack, with infrastructure on the bottom and applications on top. What we need is an interface that allows pieces of this stack to communicate with one another, ideally something already standardized in the industry, something that would let us swap layers in and out with minimal friction. That is the power of SQL. Like IP, SQL is a universal interface.

Read More
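The "swap layers with minimal friction" point can be demonstrated with Python's standard-library sqlite3 engine; the SQL text itself is the stable interface, so the same query could run unchanged against a warehouse or a lakehouse query service:

```python
import sqlite3

# An in-memory engine stands in for whatever storage layer sits below.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (domain TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("payments", 3), ("payments", 2), ("identity", 5)],
)

# The application only speaks SQL; the engine underneath is swappable.
rows = conn.execute(
    "SELECT domain, SUM(cnt) FROM events GROUP BY domain ORDER BY domain"
).fetchall()
# rows == [("identity", 5), ("payments", 5)]
```

Replacing sqlite3 with a different engine changes the connection line, not the query, which is exactly the layering property the paragraph above describes.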

Upcoming Events

Retrieval Markets Summit, Lisbon. A day of presentations from Retrieval Markets builders.