AI Mirage in the Crypto World

Beginner

4/8/2024, 3:53:14 PM

The article explores the application of Artificial Intelligence (AI) in the cryptocurrency field and the challenges it faces. It points out that while AI technology holds potential for innovation in cryptocurrencies, its practical application may be influenced by market competition and regulation. The article emphasizes that decentralization alone is not enough to provide a competitive advantage for AI products based on cryptocurrencies; they must also match centralized products in functionality. Additionally, the article suggests that the value of many AI tokens may be exaggerated, lacking sustainable demand-driving factors. Nevertheless, there are still widespread opportunities at the intersection of AI and cryptocurrencies, but the development and realization of these opportunities may take time.

The intersection between artificial intelligence (AI) and cryptocurrency is vast but often poorly understood. We believe that different subsections of this intersection have distinct opportunities and development timelines.
We generally believe that for AI products, decentralization alone is not enough to bring competitive advantage – it must also achieve functional parity with centralized products in certain other key areas.
Our contrarian view is that due to the widespread attention on the AI industry, the value potential of many AI tokens may be exaggerated, and many AI tokens may lack sustainable demand-side drivers in the short to medium term.

In recent years, continued breakthroughs in artificial intelligence, especially in generative AI, have drawn great attention to the AI industry and created opportunities for crypto projects at the intersection of the two. We covered some of the possibilities for the sector in an earlier report in June 2023, noting that overall capital allocation in cryptocurrencies appeared to be underinvested in artificial intelligence. The field of crypto AI has grown tremendously since then, and we feel it’s important to point out some of the practical challenges that may hinder its widespread adoption.

The rapid pace of change in AI makes us cautious about bold claims that crypto-centric platforms are uniquely positioned to disrupt the industry. In our view, it also makes the path to long-term, sustainable value appreciation uncertain for most AI tokens, especially for projects with fixed token economic models. If anything, we believe that some emerging trends in AI may actually make cryptocurrency-based innovations more difficult to adopt, given broader market competition and regulation.

That said, we believe the intersection of AI and cryptocurrencies is broad and offers diverse opportunities, with adoption likely to be faster in certain sub-segments, even though many of these areas lack tokens that are already traded. Still, that doesn’t seem to be dampening investor interest. We find that the performance of AI-related crypto tokens is supported by AI market headlines, and that these tokens can show positive price action even on days when Bitcoin is trading lower. Therefore, we believe that many AI-related tokens can continue to trade as proxies for AI progress.

Key Trends in Artificial Intelligence

One of the most important trends in artificial intelligence (as it relates to crypto-AI products) is the continuing culture of open source models. More than 530,000 models are published on Hugging Face for researchers and users to run and fine-tune. Hugging Face’s role in AI collaboration is comparable to that of GitHub for code hosting or Discord for community management (both widely used in the crypto space). Barring serious mismanagement, this situation is not likely to change in the near future.

Models available on Hugging Face range from large language models (LLMs) to generative image and video models, and include creations from major industry players like OpenAI, Meta, and Google, as well as independent developers. Some open source language models even have performance advantages over state-of-the-art closed-source models in terms of throughput (while maintaining comparable output quality), ensuring a degree of competition between open source and commercial models (see Figure 1). Importantly, we believe this vibrant open source ecosystem, combined with a highly competitive commercial sector, has produced an industry in which underperforming models are driven out by competition.

The second trend is the increasing quality and cost-effectiveness of smaller models (highlighted in LLM research back in 2020 and in a recent paper from Microsoft), which combines with the open source culture to further enable high-performance AI models that run locally. Some fine-tuned open source models can even outperform leading closed source models on certain benchmarks. In such a world, some AI models could be run locally, maximizing decentralization. Of course, incumbent technology companies will continue to train and run larger models in the cloud, but the design space between the two will require trade-offs.

Additionally, given the increasing complexity of benchmarking AI models (including data contamination and varying test scopes), the outputs of generative models may ultimately be best evaluated by end users in a free market. In fact, end users can already use existing tools to compare model outputs side by side, and benchmarking companies offer the same kind of comparison. A rough idea of the difficulty of generative AI benchmarking can be gained from the growing variety of open LLM benchmarks, including MMLU, HellaSwag, TriviaQA, and BoolQ, each of which tests different use cases such as common sense reasoning, academic topics, and various question formats.

The third trend we observe in the AI space is that existing platforms that have strong user stickiness or solve specific business problems can benefit disproportionately from AI integration. For example, GitHub Copilot’s integration with code editors enhances an already powerful developer environment. Embedding AI interfaces into other tools, from email clients to spreadsheets to customer relationship management software, is also a natural use case for AI (for example, Klarna’s AI assistant does the work of 700 full-time customer service staff).

But it’s worth noting that in many of these scenarios, AI models will not lead to new platforms, but only enhance existing ones. Other AI models that improve traditional business processes internally (e.g., Meta’s Lattice system, which helped restore Meta’s ad performance to earlier levels after Apple launched App Tracking Transparency) also often rely on proprietary data and closed systems. These types of AI models will likely remain closed source because they are vertically integrated into the core product and use proprietary data.

In the world of AI hardware and computing, we see two other related trends. The first is the shift in computing usage from training to inference. That is, when artificial intelligence models are first developed, vast amounts of computing resources are used to “train” the model by feeding it large data sets. Now we have moved on to deploying and querying the model.

NVIDIA’s earnings call in February 2024 indicated that about 40% of its business was attributable to inference. Satya Nadella made similar remarks on Microsoft’s earnings call the month before, in January, pointing out that “most” of its Azure AI usage is for inference. As this trend continues, we believe entities seeking to monetize models will prioritize platforms that can reliably run models in a secure and production-ready manner.

The second major trend is the competitive landscape surrounding hardware architecture. Nvidia’s H200 processors will be available starting in the second quarter of 2024, with the next-generation B100 expected to further double performance. In addition, Google’s continued support for its own Tensor Processing Unit (TPU) and Groq’s newer Language Processing Unit (LPU) may increase their market share as alternatives in this space in the coming years (see Figure 2). Such developments could alter the cost dynamics of the artificial intelligence industry and could benefit cloud service providers that are able to pivot swiftly, purchase hardware in bulk, and configure any related physical networking requirements and developer tools.

Overall, artificial intelligence is an emerging and rapidly developing field. ChatGPT was first released to the market in November 2022, less than a year and a half ago (although its underlying GPT-3 model has been around since June 2020), and the growth in the space since then has been astounding. Despite some questionable behavior regarding the biases behind some generative AI models, we could see poorer-performing models being phased out by the market in favor of better alternatives. The rapid growth of the industry and the potential for upcoming regulations mean that the industry’s problems are changing regularly as new solutions become available.

For such a rapidly innovating field, it is premature to treat the oft-touted “decentralized solution [XXX]” as a foregone conclusion. Such a claim also preemptively solves a centralization problem that may not necessarily exist. The reality is that the AI industry has already achieved a large degree of decentralization across technology and business verticals through competition between many different companies and open source projects. Furthermore, due to the nature of their decision-making and consensus processes, decentralized protocols advance at a slower pace than their centralized counterparts on both a technical and a social level. This could create obstacles in the quest to balance decentralization and competitive products at this stage of AI development. That said, we do think there are synergies between cryptocurrency and artificial intelligence that can be meaningfully realized over a longer time horizon.

Scoping the Opportunity

Broadly speaking, we divide the intersection of artificial intelligence and cryptocurrency into two broad categories. The first category is use cases where AI products improve the crypto industry. This includes scenarios ranging from creating human-readable transactions and improving blockchain data analysis, to leveraging on-chain model output as part of a permissionless protocol. The second category is use cases where cryptocurrencies aim to disrupt traditional AI pipelines through decentralized computing, verification, identity, etc.

The use cases in the former category are clear, and we believe that while significant technical challenges remain, there are also long-term prospects in more complex on-chain model inference scenarios. Centralized AI models can improve cryptocurrencies just as they can any other technology-focused industry, from improving developer tools and code auditing to translating human language into on-chain actions. But investment in this area usually flows into private companies through venture capital, so it is often overlooked by public markets.

However, the implications and benefits of how crypto could disrupt existing AI pipelines are less certain to us. Difficulties in the latter category are not just technical challenges (which we believe are generally solvable in the long term), but also uphill battles against broader market and regulatory forces. Much of the recent attention on artificial intelligence and cryptocurrencies has been focused on this category, as these use cases are better suited to having liquid tokens. This is the focus of our next section, as there are currently relatively few liquid tokens relevant to centralized AI tools in cryptocurrencies.

The Role of Cryptocurrencies in Artificial Intelligence Pipelines

At the risk of oversimplifying the issue, we consider the potential impact of cryptocurrencies on AI in four main stages of the AI pipeline:

Data collection, storage and processing
Model training and inference
Verification of model output
Tracking the output of AI models

Many new crypto-AI projects have emerged in these areas. However, many will face serious challenges in the short to medium term, both in generating demand-side usage and in facing fierce competition from centralized companies and open-source solutions.

Proprietary Data

Data is the foundation of all AI models and can be the key differentiator in the performance of specialized AI models. Historical blockchain data is itself a rich new source of data for models, and some projects like Grass also aim to leverage crypto incentives to curate new data sets from the open internet. In this regard, crypto has the opportunity to provide industry-specific data sets and incentivize the creation of new valuable data sets. (We think Reddit’s recent $60 million per year data licensing deal with Google bodes well for the future of dataset monetization.)

Many early models (such as GPT-3) used a mix of open datasets such as CommonCrawl, WebText2, Books, and Wikipedia, and similar datasets are freely available on Hugging Face (which currently hosts over 110,000 options). However, possibly to protect commercial interests, many recent closed-source models have not disclosed the final composition of their training datasets. We think the trend toward proprietary data sets, especially in commercial models, will continue and will increase the importance of data licensing.

Existing centralized data marketplaces are already helping bridge the gap between data providers and consumers, leaving the opportunity space for new decentralized data marketplace solutions sandwiched between open source data catalogs and enterprise competitors. Without the support of a legal structure, a purely decentralized data market would also need to build standardized data interfaces and pipelines, verify data integrity and configuration, and solve the cold start problem of its products, all while balancing the token incentives between market participants.

In addition, decentralized storage solutions may eventually find a place in the artificial intelligence industry, though with many challenges in this regard. On the one hand, pipelines for distributing open source datasets already exist and are widely used. On the other hand, many owners of proprietary data sets have strict security and compliance requirements.

Currently, there are no regulatory pathways for hosting sensitive data on decentralized storage platforms like Filecoin and Arweave. Many enterprises are still transitioning from on-premises servers to centralized cloud storage providers. Moreover, at a technical level, the decentralized nature of these networks does not currently meet certain geographical location and physical data isolation requirements for storing sensitive data.

While price comparisons between decentralized storage solutions and established cloud providers suggest that decentralized storage is cheaper per unit, this ignores the bigger picture. First, the upfront costs associated with migrating systems between providers need to be considered on top of day-to-day operating expenses. Second, crypto-based decentralized storage platforms need to match the tooling and integrations of mature cloud systems developed over the past two decades. Cloud solutions also have more predictable costs from a business operations perspective, offer contractual obligations and dedicated support teams, and have a large pool of existing developer talent.
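The first point can be made concrete with a simple break-even calculation. The sketch below is illustrative only: the function name and every figure in it are hypothetical placeholders, not quotes from any storage provider.

```python
import math


def breakeven_months(migration_cost, current_monthly, new_monthly):
    """Months until cumulative savings repay a one-time migration cost.

    Returns None when the new provider is not cheaper per month,
    in which case the migration never pays for itself.
    """
    monthly_saving = current_monthly - new_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(migration_cost / monthly_saving)


# Hypothetical figures, chosen only to illustrate the arithmetic.
months = breakeven_months(migration_cost=50_000,
                          current_monthly=4_000,
                          new_monthly=2_500)
print(f"Migration pays for itself after about {months} months")
```

A lower per-unit price therefore only matters if the payback period is shorter than the horizon over which the business expects to stay with the new provider.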

It’s also worth noting that a cursory comparison with the three major cloud providers (Amazon Web Services, Google Cloud Platform, and Microsoft Azure) is incomplete. There are dozens of lower-cost cloud companies also vying for market share by offering cheaper basic server racks. We believe these are the real near-term major competitors for cost-conscious consumers.

That said, recent innovations such as Filecoin’s data computation and Arweave’s AO computation environment may play a role in upcoming greenfield projects that utilize less sensitive datasets, or for cost-sensitive companies (likely smaller in scale) that are not yet locked in to a vendor.

Therefore, while there is certainly room for new crypto products in the data space, near-term disruption is most likely to occur where they can offer unique value propositions. Areas where decentralized products compete head-on with traditional and open-source competitors will take more time to progress.

Training and Inferencing Models

The field of decentralized computing (DeComp) in the crypto industry also aims to serve as an alternative to centralized cloud computing, partly due to the existing GPU supply shortage. One solution proposed to address this scarcity is to reuse idle computing resources within collective networks, thereby offering lower costs than centralized cloud providers. Protocols like Akash and Render have implemented solutions along these lines. Preliminary indicators suggest that such projects are seeing increased usage from both users and suppliers. For example, Akash’s active leases (i.e., number of users) have tripled year-to-date (see Figure 3), mainly due to increased utilization of its storage and computing resources.

However, the fees paid to the network have actually declined since the peak in December 2023, as the supply of available GPUs outpaced the growth in demand for these resources. That is, as more providers have joined the network, the share of GPUs rented (which appears to be the largest revenue driver proportionally) has declined (see Figure 4). For networks where compute pricing can change with supply and demand, it’s unclear where sustained, usage-driven demand for native tokens will eventually come from if supply-side growth exceeds demand-side growth. While the long-term impact of such changes is unclear, such token economic models may need to be revisited in the future to adapt to market changes.
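The supply and demand dynamic described above can be illustrated with a toy projection. This is a minimal sketch under stated assumptions: the function names, starting counts, and growth rates are all hypothetical and are not taken from Akash or any other network.

```python
def utilization(rented_gpus, available_gpus):
    """Share of available GPUs that are currently rented."""
    if available_gpus <= 0:
        raise ValueError("available_gpus must be positive")
    return rented_gpus / available_gpus


def project_utilization(rented, available, demand_growth, supply_growth, periods):
    """Compound demand and supply separately; return utilization per period."""
    history = []
    for _ in range(periods + 1):
        history.append(utilization(rented, available))
        rented *= 1 + demand_growth
        available *= 1 + supply_growth
    return history


# Hypothetical: demand grows 10% per period while supply grows 30%.
path = project_utilization(rented=80, available=100,
                           demand_growth=0.10, supply_growth=0.30, periods=3)
print([round(u, 3) for u in path])
```

Even though the number of rented GPUs rises every period in this toy example, utilization falls, which is the condition under which market-based pricing (and therefore fees) comes under pressure.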

On a technical level, decentralized computing solutions also face the challenge of network bandwidth limitations. For large models that require multi-node training, the physical network infrastructure layer plays a crucial role. Data transfer speeds, synchronization overhead, and support for certain distributed training algorithms mean that specific network configurations and custom network communications (such as InfiniBand) are required to facilitate high-performance execution. Beyond a certain cluster size, this is difficult to implement in a decentralized manner.
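A back-of-the-envelope estimate shows why link speed dominates. The sketch below uses the standard bandwidth cost of a ring all-reduce, in which each node transfers roughly 2(N-1)/N times the gradient size per synchronization, and it ignores latency and any overlap with computation. The model size, precision, and link speeds are illustrative assumptions, not measurements of any particular network.

```python
def ring_allreduce_seconds(param_count, bytes_per_param, nodes, link_gbps):
    """Bandwidth-only lower bound for one ring all-reduce of the gradients."""
    if nodes < 2:
        return 0.0  # a single node has nothing to synchronize
    payload_bytes = param_count * bytes_per_param
    bytes_per_node = 2 * (nodes - 1) / nodes * payload_bytes
    link_bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_node / link_bytes_per_second


# Assumed workload: 7 billion parameters, 16-bit gradients, 8 nodes.
datacenter = ring_allreduce_seconds(7e9, 2, nodes=8, link_gbps=400)
internet = ring_allreduce_seconds(7e9, 2, nodes=8, link_gbps=1)
print(f"400 Gb/s link: {datacenter:.2f} s per synchronization")
print(f"  1 Gb/s link: {internet:.0f} s per synchronization")
```

Under these assumptions a single gradient synchronization takes a fraction of a second on a data-center-class interconnect and several minutes over a 1 Gb/s connection, and training requires many thousands of such steps.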

In summary, decentralized computing (and storage) faces fierce competition from centralized cloud providers on its way to long-term success. Any adoption will be a long-term process akin to the adoption timeline of the cloud itself. Given the increasing technological complexity of decentralized network development, coupled with the lack of comparably scaled development and sales teams, fully realizing the vision of decentralized computing will be a challenging journey.

Validating and Trusting Models

As artificial intelligence models become increasingly important in daily life, concerns about their output quality and biases are growing. Some cryptocurrency projects aim to address this issue by leveraging an algorithmic approach to assess outputs across different categories, seeking a decentralized, market-based solution. However, the challenges surrounding model benchmarking, along with apparent trade-offs between cost, throughput, and quality, make head-to-head comparisons difficult. Bittensor, one of the largest cryptocurrencies focused on AI, aims to address this issue, although numerous prominent technical challenges may hinder its widespread adoption (see Appendix 1).

Additionally, trustless model inference (i.e., proving that model outputs are indeed generated by the claimed model) is another active research area at the intersection of cryptocurrency and AI. However, as open-source models shrink in size, such solutions may face challenges in demand. In a world where models can be downloaded and run locally and content integrity can be verified through robust file hash/checksum methods, the role of trustless inference is less clear. Admittedly, many large language models (LLMs) still cannot be trained and operated on lightweight devices like smartphones, but powerful desktop computers (such as those used for high-end gaming) can already run many high-performance models.
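The file hash check mentioned above is simple to perform locally. The following is a minimal sketch using Python’s standard `hashlib` module; the function names are hypothetical, and it assumes the model publisher distributes a SHA-256 digest alongside the weights file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the local file hashes to the digest the publisher announced."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A user would call `matches_published_digest` with the path of the downloaded weights and the digest copied from the publisher’s release page. Note that this confirms only that the file is the one that was published; it says nothing about a model running on someone else’s server, which is the case trustless inference targets.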

Data Provenance and Identity

As the output of generative AI becomes increasingly indistinguishable from human output, the importance of identifying and tracking what AI generates comes into focus. GPT-4 passes the Turing test three times more often than GPT-3.5, and it is almost inevitable that we will one day be unable to tell the difference between bots and humans online. In such a world, determining the identity of online users and watermarking AI-generated content will be key capabilities.

Decentralized identifiers and identity verification mechanisms like Worldcoin aim to address the former challenge of identifying humans on-chain. Similarly, publishing data hashes to the blockchain can help establish the timestamp and source of content. However, as with the solutions discussed in the previous sections, we believe the feasibility of crypto-based solutions must be weighed against centralized alternatives.
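As a rough illustration of the hash-publishing idea, the record below is the kind of small payload that could be written on-chain while the content itself stays off-chain. This is a sketch under stated assumptions: the field names and functions are hypothetical, no real blockchain API is involved, and in practice the trusted timestamp would come from the block that includes the record rather than from the submitter’s clock.

```python
import hashlib
import time


def make_provenance_record(content: bytes, author_id: str, timestamp=None):
    """Build the record to publish; only the digest of the content is included."""
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "author": author_id,
        # Placeholder clock value; a chain would supply its own block time.
        "timestamp": int(time.time()) if timestamp is None else int(timestamp),
    }


def verify_content(content: bytes, record: dict) -> bool:
    """Check that the content at hand is the content that was registered."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


record = make_provenance_record(b"original article text", "author-123",
                                timestamp=1_700_000_000)
print(verify_content(b"original article text", record))
print(verify_content(b"edited article text", record))
```

Such a record shows that specific content existed at a given time and was registered under a given identifier. It does not by itself prove who created the content, which is why identity verification is treated as a separate problem.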

Some countries, such as China, link online identities to government-controlled databases. While the degree of centralization in other parts of the world may not be as high, Know Your Customer (KYC) provider alliances can also offer identity verification solutions independent of blockchain technology (similar to trusted certificate authorities that underpin today’s internet security). Research is currently underway on artificial intelligence watermarking to embed hidden signals in text and image outputs so algorithms can detect whether content is AI-generated. Many leading AI companies, including Microsoft, Anthropic, and Amazon, have publicly committed to adding such watermarks to their generated content.

Furthermore, many existing content providers are already trusted to rigorously record content metadata in order to meet compliance requirements. As a result, users often trust the metadata associated with social media posts (though not screenshots), even though it is centrally stored. It is worth noting that any crypto-based data provenance and identity solution needs to integrate with user platforms to be broadly effective. Therefore, while crypto-based solutions for proving identity and data provenance are technically feasible, we also believe their adoption is not a given and will ultimately depend on business, compliance, and regulatory requirements.

Trading the AI Narrative

Despite the above difficulties, many AI tokens have outperformed Bitcoin and Ethereum, as well as major AI stocks such as Nvidia and Microsoft, since the fourth quarter of 2023. This is because AI tokens typically benefit from strength in both the broader crypto market and AI-related news headlines (see Appendix 2). As a result, AI-focused tokens can move higher even on days when the price of Bitcoin falls. Figure 5 shows the dispersion of AI token performance on days when Bitcoin traded lower.

Overall, the AI narrative in the cryptocurrency space still lacks many sustainable demand drivers in the short term. The absence of clear adoption forecasts and metrics has led to widespread meme-like speculation, which may not be sustainable in the long run. Ultimately, price and utility will converge. The unresolved question is how long this will take, and whether utility will rise to meet price or vice versa. That said, continued strength in the broader cryptocurrency market and a thriving AI industry may sustain a robust crypto AI narrative for some time.

Conclusion

The role of cryptocurrency in AI does not exist in a vacuum: any decentralized platform competes with existing centralized alternatives and must be analyzed against broader business and regulatory requirements. Therefore, merely replacing centralized providers with “decentralized” ones is not sufficient to drive meaningful progress. Generative AI models have existed for several years and have retained a degree of decentralization due to market competition and open-source software.

A recurring theme in this report is the acknowledgement that while crypto-based solutions are often technically feasible, they still require significant work to achieve functionality on par with more centralized platforms, which are not likely to stand still in the meantime. In fact, because decentralized development depends on reaching consensus, centralized development often progresses faster, which may pose challenges in a field that is evolving as rapidly as AI.

Given this, the intersection of AI and cryptocurrency is still in its early stages, and rapid changes may occur in the coming years with the broader development of the AI field. The decentralized AI future envisioned by many in the crypto industry is not guaranteed; indeed, the future of the AI industry itself remains largely uncertain. Therefore, we believe the prudent approach is to navigate such markets cautiously, dig deeper into crypto-based solutions, and truly understand how they can provide a better alternative, or at least understand the potential trading narrative.


Statement:

This article, originally titled “加密世界的AI海市蜃楼”, is reproduced from [theblockbeats]. All copyrights belong to the original author [David Han]. If you have any objection to the reprint, please contact the Gate Learn team, which will handle it as soon as possible.
Disclaimer: The views and opinions expressed in this article represent only the author’s personal views and do not constitute any investment advice.
Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.