24/7NewsPaper
Decrypt (decrypt.co)

China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude

TL;DR

Xiaomi's MiMo-V2.5-Pro-UltraSpeed generated over 1,000 tokens per second on a 1-trillion-parameter model running on standard 8-GPU hardware, outpacing models like ChatGPT and Claude in output speed. The speedup comes from FP4 quantization and DFlash speculative decoding. A limited API trial is available from June 9 to 23.

  • Xiaomi's MiMo-V2.5-Pro-UltraSpeed achieved over 1,000 tokens per second on a 1-trillion-parameter model using standard 8-GPU hardware.
  • The speed was achieved through FP4 quantization and DFlash speculative decoding techniques.
  • A limited API trial for the model runs from June 9 to 23, priced at 3× standard rates.
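
The figures above invite a quick back-of-the-envelope check. The sketch below is illustrative arithmetic only, not anything published by Xiaomi: it assumes weights are the only memory cost (ignoring activations, KV cache, and quantization scale factors), assumes the weights are split evenly across the 8 GPUs, and treats the headline's "15X" as a simple ratio against 1,000 tokens per second. The function and variable names are made up for this example.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: counts raw weight bits only; real deployments also
    need room for activations, KV cache, and quantization metadata.
    """
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 1e12   # 1-trillion-parameter model, as stated in the summary
NUM_GPUS = 8    # "standard 8-GPU hardware", as stated in the summary

fp16_total_gb = weight_memory_gb(PARAMS, 16)  # 16-bit baseline, for comparison
fp4_total_gb = weight_memory_gb(PARAMS, 4)    # 4-bit (FP4) weights
fp4_per_gpu_gb = fp4_total_gb / NUM_GPUS      # assumes an even split

# Baseline speed implied by the headline's "15X" claim.
# This is derived from the headline, not a measured number.
implied_baseline_tps = 1000 / 15

print(f"16-bit weights: {fp16_total_gb:.0f} GB total")
print(f"FP4 weights:    {fp4_total_gb:.0f} GB total, {fp4_per_gpu_gb:.1f} GB per GPU")
print(f"Implied baseline: about {implied_baseline_tps:.0f} tokens/s")
```

Under these assumptions, 4-bit weights take a quarter of the space of 16-bit weights, which is one reason quantization helps a model of this size fit on a single 8-GPU node.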
