The new Mercury 2 AI model uses diffusion reasoning to generate 1,000 tokens per second; it runs about 5x faster than Haiku, speed limits are ...
It's cheap to copy already built models from their outputs, but likely still expensive to train new models that push the boundaries. It is becoming increasingly clear that AI ...
Anthropic has unveiled Claude 3.7 Sonnet, a notable addition to its lineup of large language models (LLMs), building on the foundation of Claude 3.5 Sonnet. Marketed as the first hybrid reasoning ...
Cory Benfield discusses the evolution of ...
Nvidia researchers developed dynamic memory sparsification (DMS), a technique that compresses the KV cache in large language models by up to 8x while maintaining reasoning accuracy — and it can be ...
These speed gains are substantial. At 256K context lengths, Qwen 3.5 decodes 19 times faster than Qwen3-Max and 7.2 times ...
A marriage of formal methods and LLMs seeks to harness the strengths of both.
OpenAI finally unveiled its rumored “Strawberry” AI language model on Thursday, claiming significant improvements in what it calls “reasoning” and problem-solving capabilities over previous large ...
by Anthony Diamond on Dec 26, 2024 at 8 ...