2026-01-16 18:24:44

A new competitor has emerged in the voice AI agent field. Its speech AI agent API scored 92.3% on the Speech Reasoning benchmark, surpassing both ChatGPT and Google Gemini. The result signals a breakthrough in audio understanding and real-time response, and it opens up new possibilities for developers.

More importantly, developers can now build real-time, multilingual voice AI agents on the same technology stack. That makes deploying AI applications across languages and regions much simpler and shortens development cycles significantly. Whether for customer service, content generation, or intelligent interaction, the toolset can provide end-to-end solutions.

As Web3 applications and AI become increasingly intertwined, infrastructure-level breakthroughs like this one are quietly reshaping the entire development ecosystem.

BlockchainArchaeologist

· 40m ago

92.3%? Sounds good, but how can the data be verified?
Another new player, this time in voice. Waiting to be acquired by a big company.
Multilingual and low-cost: this thing can indeed lower the barrier to entry for Web3 applications.
I just want to ask, is it really faster than Gemini? How's the latency?
Infrastructure-level things are easy to overlook, but they are truly game-changers.
Shorter development cycles = more junk apps going live, damn it.
If combined with contract layer optimizations, this wave really has some potential.
Another 92.3% figure. Are all benchmarks hyped like this?
If this is used for customer service and content generation, how much can costs be reduced?
Web3 developers are so lucky now, each tool is better than the last.

NonFungibleDegen

· 01-17 07:12

ngl this is probably nothing but 92.3% on speech reasoning? ser that's actually fire... bullish on voice agents lowkey

WhaleWatcher

· 01-16 18:52

92.3%? That number sounds a bit fishy to me.

CryptoTherapist

· 01-16 18:52

ngl this 92.3% benchmark feels like cope energy... we've seen these claims before right? remember when everyone was losing their minds over gpt-4's "breakthrough"? 👀

BearMarketBuyer

· 01-16 18:50

Doesn't 92.3% seem a bit inflated? How about running an actual test?

PhantomMiner

· 01-16 18:49

92.3%? Sounds good, but how was this data measured? The multilingual part is indeed interesting, just not sure if it will be smooth or laggy in actual use.

DAOdreamer

· 01-16 18:45

92.3%? That's a bit exaggerated. Can it really beat the big companies?

GasFeeVictim

· 01-16 18:30

Is this 92.3% figure real? It feels like more hype.
