Generative AI is the most important innovation in recent memory, and its importance keeps growing. It is essentially the product of three elements:
Algorithms + Data + Compute = Intelligence
This means that Data and Compute will likely become two of the world’s most important assets, and access to them will be incredibly important.
Generative AI models are data-hungry. The most significant Generative AI models operate on an Internet’s worth of data, which approximates the sum of all human knowledge.
Crypto is all about giving access to new digital resources around the world and asset-izing things that weren’t assets before via tokens. Grass does this for Data.
Grass gives AI models and apps live access to the entire Internet as a dataset. The data is collected via a network of nodes around the world that contribute their idle Internet bandwidth. Grass has strong initial traction, with over 2.5 million users.[1]
The long-term potential market for Grass is massive and scales with the size of the AI market and its future growth. In the past, only the largest tech giants could gather datasets of this scale. Grass brings new economics to data, driving down costs. This democratizes data access so that it serves not just elite large companies but also the long tail of the AI industry.
AI model training and fine-tuning require enormous amounts of data. Historically, AI model creators have gathered much of that data by scraping websites. This process of scraping has a number of challenges:
Grass aims to solve these problems by creating a federated network of web scrapers. Each individual participating in the Grass network contributes a portion of their unused Internet bandwidth to provide a small amount of scraping from their IP address. Grass then assembles data from each of these nodes to form a combined dataset that’s useful for AI training and fine-tuning. It’s an elegant and fitting use of distributed networks powered by cryptocurrency.
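To make the idea concrete, here is a minimal sketch of how work could be spread across nodes and the results merged into one dataset. It is an illustration under stated assumptions, not Grass’ actual protocol: the function names (`assign_urls`, `assemble_dataset`, `fake_fetch`), the round-robin assignment, and the node IDs are all hypothetical, and the fetch step is simulated rather than performing real network requests.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def assign_urls(urls: List[str], node_ids: List[str]) -> Dict[str, List[str]]:
    """Spread URLs across nodes round-robin so each node does a small share.

    Hypothetical scheduling policy chosen for simplicity.
    """
    assignments: Dict[str, List[str]] = defaultdict(list)
    for i, url in enumerate(urls):
        assignments[node_ids[i % len(node_ids)]].append(url)
    return dict(assignments)


def assemble_dataset(
    assignments: Dict[str, List[str]],
    fetch: Callable[[str, str], str],
) -> Dict[str, str]:
    """Merge each node's results into one dataset keyed by URL.

    If a URL was assigned more than once, the first result is kept.
    """
    dataset: Dict[str, str] = {}
    for node_id, urls in assignments.items():
        for url in urls:
            if url not in dataset:
                dataset[url] = fetch(node_id, url)
    return dataset


def fake_fetch(node_id: str, url: str) -> str:
    """Stand-in for a real request made from the node's own IP address."""
    return f"<content of {url} via {node_id}>"


urls = [f"https://example.com/page/{n}" for n in range(5)]
assignments = assign_urls(urls, ["node-a", "node-b"])
dataset = assemble_dataset(assignments, fake_fetch)
```

In this sketch each node handles only a fraction of the URL list, and the coordinator ends up with a single combined dataset covering all of them.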
There are other business cases for unused Internet bandwidth as well, such as:
Today Grass gathers data using existing hardware (laptops, desktops, etc.). In the future, Grass plans to offer a data-gathering appliance: a custom hardware device dedicated solely to data gathering, which should be more efficient because it is optimized for that one task.
There are several benefits to using a distributed network for data gathering:
One of the trickier things to navigate when scraping data is the relationship with content creators. These include sites such as the NY Times and Reddit, which have started to monetize their data by licensing it to third parties for training AI models. They are naturally protective of the data on their sites, since that data represents a highly lucrative revenue stream for them. Indeed, Reddit has forbidden the use of its developer API for machine learning in order to protect its business model of licensing its data to AI model creators (see Reddit’s terms of service).
What does the future hold for content creators? For user-generated content (UGC), such as Reddit, there’s an argument that the users who created the content own it, rather than the platform. This argument has yet to be fully explored from a legal point of view, and it will be interesting to watch how it develops. If users do indeed own their contributed data, then Grass could represent a hypothetical pathway to help those users monetize it. For example, Grass could reward Reddit contributors themselves for volunteering the data they’ve created on Reddit.
For paid content creators such as the NY Times, content is created by paid writers, so there is no argument for user-owned data. Grass could therefore simply exclude those sites from being scraped. Alternatively, Grass may scale to the point where it becomes feasible for Grass itself to become a customer of those sites and pay licensing fees. Hypothetically, Grass’ customers could pay for data, and Grass could then share revenue back to the content creators, enabling AI model creation on a flexible budget. Another option is that Grass could reach such a scale that it negotiates a bulk licensing deal on behalf of all its customers.
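As a purely illustrative sketch of the revenue-share idea, the arithmetic could look like the following. Everything here is an assumption made for the example: the function name `split_revenue`, the 30% creator share, the pro-rata split by records used, and the site names are hypothetical and do not come from Grass.

```python
from typing import Dict


def split_revenue(
    payment_cents: int,
    creator_share_bps: int,
    records_used: Dict[str, int],
) -> Dict[str, int]:
    """Split the creators' portion of a payment pro rata by records used.

    payment_cents: what a customer paid, in cents (integers avoid float error).
    creator_share_bps: hypothetical share for creators, in basis points.
    records_used: number of records drawn from each content site.
    Amounts are rounded down to whole cents.
    """
    pool = payment_cents * creator_share_bps // 10_000
    total = sum(records_used.values())
    if total == 0:
        return {site: 0 for site in records_used}
    return {site: pool * n // total for site, n in records_used.items()}


# A $1,000 payment with an assumed 30% creator share.
payouts = split_revenue(100_000, 3_000, {"site-a": 750, "site-b": 250})
```

With these assumed inputs, the creator pool is $300, of which site-a receives $225 and site-b receives $75.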
Grass had an extremely impressive launch earlier this year:
As of this writing, the Grass token has had positive price action post-launch (+115%), which is unusual, as most tokens drop in the days and weeks following listing. This likely reflects the team’s smart approach to airdrop distribution, as well as belief in the future potential of Grass. Overall, this is a great start for the network, and we believe it paves the way for many prosperous years to come.
Source: TradingView.
Start contributing your unused Internet bandwidth by connecting your Solana wallet, and earn the Grass token.
Want to use Grass’ datasets for your business, research, or project? Contact the team at discover@grassfoundation.io.
[1] Source: https://www.getgrass.io/.
[2] Source: https://www.theblock.co/post/323805/grass-becomes-most-distributed-solana-airdrop-as-nearly-1-5-million-addresses-claim-tokens.
[3] Source: https://www.getgrass.io/.