Anthropic engineer apologizes: "I was wrong to mock the prediction that AI compute will be billed by usage"


Event Overview

An Anthropic engineer publicly apologized, saying that his earlier mockery of the idea that "AI computing power will eventually be charged by usage" was wrong. He explained that scaling is genuinely difficult, and that the current pricing and rate limiting are unavoidable trade-offs, not an attempt to deceive.

Specific Situation

  • Person involved: Thariq Shihipar (@trq212), working on the Claude Agent SDK
  • In response to @weswinder, he said he shouldn’t have mocked @Pranit’s earlier judgment that “AI computing power will be billed like utilities.”
  • His stance: There is no deception; the problem is that scaling is genuinely challenging, and the current approach is “the best option we can think of at the moment.”
  • Cause of controversy: @Pranit criticized Anthropic’s tiered pricing (which includes a $200/month tier), sparking discussions about the costs of AI infrastructure.
  • Industry reference: Cursor frequently triggers rate limits during peak times and is continuously optimizing for efficiency.

What This Means: A front-line engineer has acknowledged long-standing user complaints: the difficulty of scaling Agent systems is real, and the company is still searching for the right balance between cost, performance, and availability.

Why It Makes Sense

  • Shihipar works directly on Claude sub-agents, loop verification, and related features; he speaks from first-hand experience, not guesswork.
  • @Pranit pointed out that whether it is Anthropic's tiered pricing or Cursor's Pro/Ultra + API pool, both are converging on "pay-per-use" because inference costs fluctuate so widely. This echoes the 2026 debate over the compute consumption of Agent workflows.
  • The core contradiction is:
    • Tasks are becoming increasingly complex (code refactoring, multi-step reasoning).
    • Vendors must find a balance between efficiency and capability, inevitably limiting speed during peak times.
    • SDK-level techniques (such as gradually releasing context) can relieve the pressure but do not address the root cause.
  • Cursor’s pricing and limit adjustments also reflect the same problem—heavy users have high inference demands, and their quotas are quickly exhausted.
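The "gradually releasing context" technique mentioned above can be made concrete with a minimal sketch. Nothing below comes from the Claude Agent SDK; `trim_context`, the token estimate, and the message format are all hypothetical, shown only to illustrate the idea of dropping the oldest non-system turns until the conversation fits a token budget:

```python
# Hypothetical sketch of "gradually releasing context": keep the most
# recent messages within a token budget, dropping the oldest first.
# None of these names come from any real SDK.

def estimate_tokens(text: str) -> int:
    # Crude approximation: roughly 4 characters per token.
    return max(1, len(text) // 4)

def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Drop the oldest non-system messages until the total fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(estimate_tokens(m["content"]) for m in system + rest) > budget:
        rest.pop(0)  # release the oldest turn first
    return system + rest

history = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor module A. " * 50},
    {"role": "assistant", "content": "Done, here is the diff. " * 50},
    {"role": "user", "content": "Now refactor module B."},
]
trimmed = trim_context(history, budget=100)
```

The system prompt is always preserved; only older conversation turns are released, which is why the technique helps with cost but cannot fix tasks whose current step genuinely needs a large context.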

Impact Assessment

  • Importance: Moderate
    • Why It Matters: This is a technical perspective from the R&D front line that confirms how hard scaling Agents really is. It is a useful reference for developers and technical teams when selecting vendors and planning architecture.

Key Points Summary

  • Core Conclusions:

    1. Scaling difficulty is the main reason: The current pricing and rate limiting are trade-offs in engineering reality, not deliberate misdirection.
    2. Billing is approaching a pay-per-use model: The high variability in costs and demand makes tiered + pooled designs hard to avoid.
    3. Peak rate limiting is a symptom, not the root cause: The fundamental issue is the unpredictable resource consumption of complex Agent workflows.
    4. Tool layers can alleviate but not solve the root issue: Techniques like gradually releasing context are helpful, but the conflict between performance and cost remains.
  • Recommendations for Different People:

    • Developers/Architects:
      • Reserve quotas, design fallback plans.
      • Focus on efficiency tools in the SDK (sub-agents, verification loops, context trimming).
    • Product and Procurement:
      • Assess the actual usable time window for “tiered + rate limiting.”
      • Monitor service levels and experience fluctuations during peak times.
    • Open Source/Tool Ecosystem:
      • Differentiate in cost transparency and resource management.
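The "reserve quotas, design fallback plans" advice for developers above can be sketched as a retry-with-backoff wrapper. This is a generic pattern, not any vendor's API: `call_model`, `RateLimitError`, and the backoff constants are assumptions for illustration.

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider's 429-style rate-limit error."""

def call_with_fallback(call_model, prompt, fallback_model,
                       max_retries=3, base_delay=1.0):
    """Retry on rate limits with exponential backoff and jitter,
    then fall back to a secondary model as a last resort."""
    for attempt in range(max_retries):
        try:
            return call_model(prompt)
        except RateLimitError:
            # Exponential backoff (1s, 2s, 4s, ...) plus jitter to
            # avoid synchronized retries during peak load.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return fallback_model(prompt)  # degrade gracefully instead of failing
```

A primary model that keeps rate-limiting during peak hours will exhaust its retries and land on the fallback, which is the kind of degradation plan the recommendation points at.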

Judgment: The shift toward pay-per-use AI compute is still in its early stages. The biggest beneficiaries will be developers and enterprise technical teams that prioritize stability and cost control. Until the scaling challenge is solved, those who manage resources, rate-limiting strategies, and workflow optimization effectively will hold the advantage.
