
OpenAI says GPT-5.6 Sol and Terra were capable of identifying vulnerabilities but were unable to execute autonomous, end-to-end attacks against hardened targets
THE SO WHAT
Top-tier models can now reliably spot vulnerabilities but still struggle to chain them into autonomous, end-to-end attacks against hardened systems — the offense/defense gap is narrower but not closed. Security teams should treat LLMs as powerful recon and red-teaming tools while still assuming human-led orchestration for serious threats.
READ THE SOURCE
MORE FROM THE WIRE
Applied AIAnthropic’s Mythos 5 AI Model Cleared by US for Wider Use
Export controls are becoming a gating factor in who gets access to frontier capability—“trusted partner” status is now a real commercial asset. If you’re a US institution in the 100+ eligible group, your near-term edge is in how fast you can harden workflows and governance around Mythos 5, not just in getting access.
Applied AINew agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.
Long-horizon agents are hitting a hard wall on context bloat—MRAgent’s 118K-token approach vs LangMem’s 3.26M shows that memory architecture is now a primary cost and latency driver. If you’re building agents, you need an owner for memory strategy the same way you have owners for retrieval and tools.
Applied AILetter: the US lifts its block on Mythos 5, allowing Anthropic to release it to more than 100 US institutions; sources: talks about Fable 5 are ongoing
Regulators are drawing a line between broad public access and controlled institutional access—Mythos 5 is now in the latter bucket for 100+ US orgs while Fable 5 is already in the conversation. If you’re not on that list, assume a capability gap and plan around interoperability and model diversity, not single-vendor parity.
Applied AIForum AI CEO on Pitfalls With AI in Politics
AI in politics is forcing a higher bar for transparency and auditability than most commercial deployments face. If your models touch civic processes—ads, content ranking, identity—you should assume “open up for scrutiny” becomes a regulatory requirement, not a branding choice.