Data is digital gold in an age where attention lives online. Global average screen time in 2024 stands at 6 hours and 40 minutes per day, up from previous years; in the United States, the average is even higher at 7 hours and 3 minutes daily.
With this level of engagement, the volume of data generated is staggering: an estimated 328.77 million terabytes are created every day in 2024. That's approximately 0.33 zettabytes (ZB) per day, counting all newly generated, captured, copied, or consumed data.
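A quick sanity check on that conversion, using decimal SI units (1 ZB = 10^9 TB):

```python
# Sanity-check the conversion: 328.77 million TB/day expressed in zettabytes.
tb_per_day = 328.77e6            # terabytes created per day (figure cited above)
zb_per_day = tb_per_day / 1e9    # 1 ZB = 1e9 TB in decimal SI units
print(f"{zb_per_day:.2f} ZB/day")  # -> 0.33 ZB/day
```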
Yet, despite the massive amounts of data being produced and consumed daily, users own very little of it:
In crypto, we’ve seen the rise of @_kaitoai, which indexes social data on Twitter and translates it into actionable sentiment data for projects, KOLs, and thought leaders. The Kaito team popularized the terms “yap” and “mindshare” through their growth-hacking expertise (notably their popular mindshare and yapper dashboards) and their ability to attract organic interest on Crypto Twitter.
“Yap” aims to incentivize quality content creation on Twitter, but many questions remain unanswered:
Beyond social data, discussions around data ownership, privacy, and transparency are heating up. With AI rapidly advancing, new questions emerge: Who owns the data used to train AI models? Who benefits from AI-generated outputs?
These questions set the stage for the rise of Web3 data layers—a shift toward user-owned, decentralized data ecosystems.
In Web3, there’s a growing ecosystem of data layers, protocols, and infrastructure focused on enabling personal data sovereignty—the idea of giving individuals more control over their data, with options to monetize it.
@vana's core mission is to give users control over their data, particularly in the context of AI, where data is invaluable for training models.
Vana introduces DataDAOs, community-driven entities where users pool their data for collective benefit. Each DataDAO focuses on a specific dataset:
Vana tokenizes data through Data Liquidity Pools (DLPs). Each DLP aggregates data for a specific domain and issues its own token; users can stake tokens to these pools for rewards, with the top pools rewarded based on community support and data quality.
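To make the staking mechanics concrete, here is a minimal sketch of stake-weighted reward distribution across pools. This is an illustration only, not Vana's published formula; the pool names, the quality-score weighting, and the top-N cutoff are assumptions.

```python
# Illustrative sketch only: Vana's actual reward formula is not reproduced here.
# Assumption: a fixed epoch reward is split across the top-N pools in
# proportion to staked tokens weighted by a data-quality score.
from dataclasses import dataclass

@dataclass
class DataLiquidityPool:
    name: str
    staked: float          # total tokens staked by the community
    quality_score: float   # 0..1 data-quality rating (hypothetical)

def distribute_epoch_rewards(pools, epoch_rewards, top_n=16):
    """Split a fixed epoch reward across the top-N pools by stake * quality."""
    ranked = sorted(pools, key=lambda p: p.staked * p.quality_score,
                    reverse=True)[:top_n]
    total_weight = sum(p.staked * p.quality_score for p in ranked)
    return {p.name: epoch_rewards * (p.staked * p.quality_score) / total_weight
            for p in ranked}

pools = [
    DataLiquidityPool("social-data", staked=1_000_000, quality_score=0.9),
    DataLiquidityPool("health-data", staked=400_000, quality_score=0.7),
]
print(distribute_epoch_rewards(pools, epoch_rewards=10_000))
```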
What makes Vana stand out is its ease of contributing data. Users simply:
@oceanprotocol is a decentralized data marketplace that allows data providers to share, sell, or license their data, while consumers access it for AI and research.
Ocean Protocol uses “datatokens” (ERC-20 tokens) to represent access rights to datasets, allowing data providers to monetize their data while maintaining control over access conditions.
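As a rough illustration of how a datatoken can gate access, here is a minimal Python sketch. It is not Ocean's actual contract logic; the class, the one-token-per-access rule, and the URI are assumptions for illustration.

```python
# Minimal illustration of token-gated data access, NOT Ocean's real contracts.
# Assumption: redeeming (burning) 1 datatoken grants one access to the dataset.
class Datatoken:
    def __init__(self, dataset_uri: str):
        self.dataset_uri = dataset_uri
        self.balances: dict[str, int] = {}

    def mint(self, to: str, amount: int) -> None:
        """Provider mints tokens to sell or grant to consumers."""
        self.balances[to] = self.balances.get(to, 0) + amount

    def redeem_access(self, holder: str) -> str:
        """Burn one token in exchange for an access grant to the dataset."""
        if self.balances.get(holder, 0) < 1:
            raise PermissionError("holder owns no datatokens for this dataset")
        self.balances[holder] -= 1
        return f"access-granted:{self.dataset_uri}:{holder}"

token = Datatoken("ipfs://example-dataset")   # hypothetical dataset URI
token.mint("alice", 2)                        # provider sells/mints tokens
print(token.redeem_access("alice"))           # alice redeems one for access
```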
Types of data traded on Ocean:
Compute-to-Data is another key feature of Ocean: computations run where the data lives, so sensitive datasets never have to move, preserving privacy and security.
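The Compute-to-Data idea can be sketched in a few lines: the consumer submits an approved algorithm, the provider runs it next to the private data, and only the aggregate result leaves. Everything below (the allowlist, the dataset, the job function) is a hypothetical illustration, not Ocean's actual service.

```python
# Conceptual sketch of Compute-to-Data: the raw dataset never leaves the
# provider; only an approved computation's aggregate result is returned.
APPROVED_ALGORITHMS = {"mean_age"}            # provider-side allowlist (assumed)

_PRIVATE_DATASET = [{"age": 34}, {"age": 51}, {"age": 29}]  # never exported

def run_compute_job(algorithm: str) -> float:
    if algorithm not in APPROVED_ALGORITHMS:
        raise PermissionError("algorithm not approved by the data provider")
    ages = [row["age"] for row in _PRIVATE_DATASET]
    return sum(ages) / len(ages)              # only the aggregate leaves

print(run_compute_job("mean_age"))            # consumer sees 38.0, not the rows
```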
@getmasafi is focused on creating an open layer for AI training data, supplying real-time, high-quality, and low-cost data for AI agents and developers.
Masa has launched two subnets on the Bittensor network:
Masa has partnered with @virtuals_io to empower Virtuals agents with real-time data capabilities, and launched $TAOCAT (currently on Binance Alpha) to showcase those capabilities.
@OpenledgerHQ is building a blockchain specifically tailored for data, particularly for AI and ML applications, ensuring secure, decentralized, and verifiable data management.
Key Highlights:
The demand for high-quality data to fuel AI and autonomous agents is surging. Beyond initial training, AI agents require real-time data for continuous learning and adaptation.
Key challenges & opportunities:
As AI agents become more autonomous, their ability to access and process real-time, high-quality data will determine their effectiveness. This growing demand has led to the rise of AI agent-specific data marketplaces, where both humans and AI agents can tap into high-quality, agent-ready data.
Other key players:
This is just the beginning. Part 2 will dive deeper into:
Who controls the data will shape the future, and the projects building within this sector will define how data is owned, shared, and monetized in the AI era. As demand for high-quality data continues to grow, the race to create a more transparent, user-owned data economy is only getting started.
Stay tuned for Part 2!
Personal Note: Thanks for reading! If you’re in Crypto AI and want to connect, feel free to shoot me a DM.
If you’d like to pitch a project, please use the form in my bio—it gets priority over DMs.
Full Disclaimer: This document is intended for informational & entertainment purposes only. The views expressed in this document are not, and should not be construed as, investment advice or recommendations. Recipients of this document should do their due diligence, taking into account their specific financial circumstances, investment objectives, and risk tolerance (which are not considered in this document) before investing. This document is not an offer, nor the solicitation of an offer, to buy or sell any of the assets mentioned herein.