DeepSeek open sources DSpark, a new framework to speed up LLM inference by up to 85%
THE SO WHAT
If DSpark’s claimed LLM inference speedup of up to 85% holds in the wild, the cost curve for serving large models just bent again, especially for teams willing to adopt a Chinese-led open stack. For infra leads, it’s worth a small-scale benchmark this quarter: cheaper inference changes which products pencil out and how aggressively you can deploy heavier models in your app.
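To make the cost argument concrete, here is a rough back-of-the-envelope sketch of how a throughput gain could translate into serving cost. It assumes "85% speedup" means 1.85x token throughput on the same hardware, which the headline does not confirm; the GPU price and baseline throughput are hypothetical placeholders, not DeepSeek figures or benchmark results.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Serving cost in USD per one million generated tokens on one GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def cost_reduction(speedup_fraction: float) -> float:
    """Fractional cost drop when throughput rises by speedup_fraction at fixed hardware cost."""
    return 1 - 1 / (1 + speedup_fraction)


# Hypothetical inputs for illustration only; replace with your own measurements.
GPU_HOURLY_USD = 2.00
BASELINE_TOKENS_PER_SECOND = 1000.0
ASSUMED_SPEEDUP = 0.85  # assumption: "85% faster" read as 1.85x throughput

baseline = cost_per_million_tokens(GPU_HOURLY_USD, BASELINE_TOKENS_PER_SECOND)
faster = cost_per_million_tokens(
    GPU_HOURLY_USD, BASELINE_TOKENS_PER_SECOND * (1 + ASSUMED_SPEEDUP)
)

print(f"baseline: ${baseline:.4f} per 1M tokens")
print(f"with speedup: ${faster:.4f} per 1M tokens")
print(f"cost reduction: {cost_reduction(ASSUMED_SPEEDUP):.1%}")
```

Under that reading, an 85% throughput gain cuts cost per token by about 46%, not 85%, which is one reason to benchmark on your own workload before rebuilding a budget around the headline number.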
MORE FROM THE WIRE
Applied AI: Amazon seeks cheaper AI alternatives as Anthropic shifts to token-based pricing
Amazon’s hunt for cheaper alternatives as Anthropic moves to token-based pricing is a clear signal that hyperscalers will arbitrage model vendors hard: unit economics on inference now matter as much as raw capability. If you’re building on third-party models, model-switching and cost observability should be first-class in your architecture, not an afterthought.
Applied AI: Anthropic Fires Back at Snitch Amazon CEO
Public friction between a major model lab and a hyperscaler partner is a reminder that AI supply chains are politically and contractually fragile. If your roadmap leans heavily on a single lab–cloud pairing, map your exit options and data portability before that relationship becomes a constraint.
Applied AI: Gemini’s personalized AI image generation is now free for US users
Personalized image generation tied into Gmail, Photos, and other Google apps, now free for US users, turns Gemini into a default creative surface for a huge consumer base. If you’re building consumer or prosumer tools, assume users will expect “generate on my stuff” as a baseline and design around Google owning that context layer.
Applied AI: Fitbit’s Gemini AI coach is giving users ‘unhinged’ fitness advice - here’s why users are saying they ‘cannot wait for my trial to end’
An “unhinged” AI fitness coach is a live case study in what happens when you ship advice agents without tight domain constraints, guardrails, and UX for uncertainty. Any team deploying health, finance, or safety-adjacent copilots should be running red-team drills on worst-case advice and instrumenting fast rollback paths before broad release.