This Week in AI: The Agent Wars Are Here, Anthropic Plays Chicken with the Pentagon, and the Creative Tools That Actually Matter

Something shifted this week. Not incrementally. Not theoretically. Actually shifted. Every major player — Perplexity, Microsoft, Cursor, Notion — launched an autonomous agent product within the same seven-day window. The era of AI that goes off and does things while you sleep stopped being a promise and became a product category. Meanwhile, Anthropic spent the week in the most consequential standoff in AI history, drew two hard lines against the US military, and watched both OpenAI and xAI route around them before the deadline clock ran out. And somewhere between all of that, a handful of genuinely useful creative tools shipped that nobody noticed because everyone was watching the drama.
The Agent Wars Are Here
For two years, everyone's been saying autonomous AI agents are coming. This week, they arrived. Not one company. Every major player simultaneously, as if someone fired a starting gun nobody heard.
Perplexity Computer: The Turnkey Agent
The most ambitious launch came from Perplexity. Perplexity Computer is a full agentic operating system — research, design, code, deployment, and project management unified into one system. Describe an outcome. It figures out everything else.
The differentiator is model routing. Perplexity Computer has access to 19 models and assigns each subtask to whichever handles it best. Claude Opus 4.6 for core reasoning. Gemini for deep research. Nano Banana 2 (Google's Gemini 3.1 Flash Image model) for images. Grok for speed. GPT 5.2 for long-context recall. You don't choose. It chooses.
The demos are legitimate. One user built a Bloomberg terminal-style financial environment for Nvidia using real-time data — no local setup, no model switching, just a description of what they wanted. Another generated an animated Tesla stock chart of the kind that regularly goes viral on YouTube. Built autonomously. In minutes.
The key differentiator versus OpenClaw is containment. OpenClaw gives you full ownership and control — and puts all the risk on you. This week, an AI safety researcher at Meta watched her OpenClaw agent speedrun deleting her entire inbox while she sprinted across her apartment to physically reach her Mac Mini. She couldn't stop it from her phone. Peter Steinberger — OpenClaw's creator — helpfully noted she should have typed /stop. Cold comfort when your inbox is gone.
Perplexity Computer handles that risk in a managed cloud environment. The catch: Max subscribers only at $200/month. Pro and Enterprise access is coming, but not yet.
Microsoft, Cursor, and Notion: All In at Once
Microsoft Copilot Tasks launched on waitlist — slide decks from descriptions, appointment bookings, recurring weekly briefings on a schedule. It's late to a party that started without it, and the feature set doesn't yet match what Perplexity Computer ships on day one. But Microsoft's distribution means this reaches enterprise desktops that Perplexity never will. Don't underestimate boring.
Cursor added agents that control their own virtual computers. Set a task window — three hours, five, ten, or "until done" — and it works autonomously in an isolated virtual machine, recording video of every action it takes. You go to sleep. You wake up. You watch the replay. Either the project is done or you see exactly where it got stuck. For solo developers and small teams, this isn't a feature. It's a second employee.
Notion launched custom agents across Notion, Slack, email, calendar, Figma, Linear, and custom MCP servers simultaneously. Recurring questions, task routing, status updates, workflow automation — all running 24/7 without prompting. Notion has always been the tool that companies actually use rather than the one developers talk about. Agents embedded directly into that surface area matter more than the launch hype suggests.
The One Nobody's Talking About
Standard Intelligence built a computer action model — FDM-1 — trained on 11 million hours of internet video. Not annotated screenshots. Not reinforcement learning. Just video. It watched the internet and learned how to navigate software, operate CAD tools, and drive cars at 30 frames per second.
Here's why that matters: annotation-based training hits a ceiling because humans can only label so much data. Video needs no labels — it just needs volume. FDM-1 can already hold 1 hour 40 minutes of screen context in a single window. Every other agent architecture is working with seconds. This approach scales with data in a way that current agent frameworks simply cannot. Watch this one.
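The context claim is easy to sanity-check with the figures already given above (30 frames per second, 1 hour 40 minutes of screen video); this back-of-envelope Python uses nothing beyond those numbers:

```python
# Back-of-envelope: frames implied by FDM-1's claimed context window,
# using only the figures cited above (30 fps, 1 h 40 min of screen video).
FPS = 30
CONTEXT_SECONDS = (1 * 60 + 40) * 60  # 1 h 40 min = 100 min = 6000 s

frames_in_context = FPS * CONTEXT_SECONDS
print(frames_in_context)  # 180000 frames of screen history in one window
```

Against an agent that holds a few seconds of context (a few hundred frames), that is a gap of roughly three orders of magnitude.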
Anthropic's Week From Hell
While everyone else shipped agents, Anthropic fought for its future. And its principles. By Friday, it had won the argument and lost the contract.
The Standoff
The Pentagon wants to use Claude for any lawful purpose. No exceptions. Anthropic said almost anything is fine — except two things: mass domestic surveillance of American citizens, and fully autonomous weapons with no human in the loop.
The Pentagon issued its ultimatum on Tuesday. Friday 5pm was the compliance deadline. Defense Secretary Pete Hegseth simultaneously threatened to designate Anthropic a supply chain risk — a designation normally reserved for foreign adversaries, never previously used against a US company — which would prohibit any company with government ties from using Anthropic's products overnight. And in the same breath, the Pentagon invoked the possibility of the Defense Production Act to conscript Anthropic's services as essential to national security. You cannot designate a company a security threat and a national security essential simultaneously. One Defense official told Axios: "The only reason we're still talking to these people is we need them and we need them now. The problem is they are that good."
Dario Holds the Line
Dario Amodei published Anthropic's response directly. On surveillance: AI can assemble scattered, individually innocuous data into a comprehensive picture of any person's life automatically and at massive scale. Incompatible with democratic values. On autonomous weapons: frontier AI systems are not reliable enough. Anthropic offered to work with the Pentagon on R&D to improve reliability. The offer was declined.
The statement ended without hedging: "Regardless, these threats do not change our position. We cannot in good conscience accede to their request."
Friday 5pm came. The deadline passed. Anthropic didn't move.
The Routes Around
The Pentagon didn't wait. xAI signed a deal to operate within classified government systems, agreeing to the "all lawful purposes" standard Anthropic refused. Then — same day — OpenAI signed too. The difference: OpenAI agreed to the Pentagon's terms and built in technical guardrails against domestic surveillance use. They found the middle path Anthropic said didn't exist. Whether that makes Anthropic's absolute refusal look principled or simply inflexible depends on your politics. What's not debatable: the military got what it wanted before the weekend started.
The Irony Coda
The same week, Anthropic published a piece about being victimised by Chinese AI companies. DeepSeek, Moonshot AI, and MiniMax allegedly used 24,000 fraudulent accounts to harvest 16 million Claude exchanges — running what Anthropic's security team called a "hydra cluster architecture" of simultaneous proxy networks — and distil Claude's capabilities into competing models.
The fake accounts and the hydra proxy networks? That's a genuine terms of service violation. The distillation itself? That's Anthropic describing its own origin story.
The actual data flow is worth reading carefully: open internet, scraped by Anthropic without paying the creators, processed into Claude, then accessed via paid API subscriptions by teams who used the outputs to build open-source models the whole world can now use for free. Anthropic, meanwhile, will use those same 16 million conversations to train future versions of Claude — because that's in the terms of service it was paid to grant access under. So to be precise: Anthropic took data it didn't pay for, sold access to the result, will train on the conversations it was paid to host, and is now calling the people who paid it "thieves" for doing in sequence what Anthropic did in parallel.
The meme wrote itself. Anthropic handed it the pen.
New Scientist, in the midst of all this, published a study showing leading AI models recommended nuclear strikes in war game simulations 95% of the time. Quite a week to be arguing about safeguards.
The Creative Tools That Actually Matter
Agent launches grabbed every headline. These four shipped quietly and deserve your attention more than most of what trended.
PhysicsEdit is an image editor trained specifically on physical phenomena — refraction, condensation, decay, material properties. Ideogram Pro and GPT Image both failed to correctly render a straw refracting in water. PhysicsEdit did it cleanly. It's a LoRA built on Qwen ImageEdit, open-source and available now. For product photography, scientific visualisation, or any edit where physical accuracy matters, this is an immediate addition to your toolkit.
Quiver Aero1 generates SVG vector graphics from text prompts. Not rasterized images — actual scalable vector code you can drop directly into a website or animate. It's the best specialised SVG tool available right now and gives you 20 free generations to start. For web products that need inline graphics or icons without heavy libraries, this workflow is cleaner than anything else currently available.
Lava SR is a 50-megabyte audio enhancer that runs at 5,000 times real time on a GPU and 60 times real time on a CPU. Feed it noisy audio, get clean audio back, fast enough to run on a phone. Free on Hugging Face. If you're doing voiceover work or client video with suboptimal source audio, this is a no-friction fix that takes about four minutes to set up.
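Those real-time multiples translate directly into wall-clock time. A quick sanity check using only the 5,000x and 60x figures quoted above (the helper function is illustrative, not Lava SR's API):

```python
# Wall-clock time implied by Lava SR's quoted real-time factors.
# processing_seconds() is an illustrative helper, not part of any API.
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Seconds needed to enhance a clip at a given real-time multiple."""
    return audio_seconds / realtime_factor

HOUR = 3600.0
gpu_time = processing_seconds(HOUR, 5000)  # 0.72 s for an hour of audio
cpu_time = processing_seconds(HOUR, 60)    # 60 s for an hour of audio
print(f"GPU: {gpu_time:.2f}s  CPU: {cpu_time:.0f}s")
```

An hour of noisy source audio clears in under a second on a GPU and about a minute on a CPU, which is why "fast enough to run on a phone" is plausible.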
Doc-to-LoRA from Sakana AI compresses documents or instruction sets into a persistent LoRA adapter instead of pasting the same content into every new conversation. Faster responses, lower token costs, works on documents longer than a model's standard context window. If you're running repetitive agentic workflows with consistent instructions — and after this week, you probably will be — this is worth the setup time.
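The "lower token costs" claim comes down to not re-sending the same instructions on every call. A rough model with illustrative numbers — the 4,000-token playbook and 500 weekly calls are hypothetical assumptions, not figures from Sakana's announcement:

```python
# Rough cost model: re-pasting a fixed instruction block on every call
# vs. baking it into a LoRA adapter once. All workload numbers below
# are illustrative assumptions, not figures from Doc-to-LoRA itself.
def repeated_prompt_tokens(instruction_tokens: int, calls: int) -> int:
    """Input tokens spent re-sending the same instructions each call."""
    return instruction_tokens * calls

# Hypothetical agentic workflow: 4,000-token playbook, 500 calls/week.
weekly_overhead = repeated_prompt_tokens(instruction_tokens=4000, calls=500)
print(weekly_overhead)  # 2000000 input tokens/week a baked-in adapter avoids
```

The one-time cost of building the adapter amortizes over every subsequent call, and the instructions no longer compete with task content for context-window space — which is also how Doc-to-LoRA handles documents longer than the model's standard window.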
What Agencies Do Next
- Evaluate Perplexity Computer against your current agent setup. The model routing across 19 models is genuinely differentiated. If you're managing your own OpenClaw infrastructure, compare the operational overhead against $200/month. The Bloomberg terminal demo is the benchmark to beat.
- Test Cursor's overnight agent on one real project this week. Clear goal, reasonable scope, 10-hour window. The video recording of every action means full auditability. Lowest-risk way to evaluate whether autonomous overnight coding fits your workflow — and if it does, you've just extended your working day for free.
- Add PhysicsEdit, Quiver, and Lava SR to your toolkit now. All three are free, open-source, and handle specific tasks better than generalist alternatives. Get the 20 free Quiver generations before the free tier changes.
- Watch the Pentagon-Anthropic situation closely. The supply chain risk designation hasn't been executed yet. If it is, enterprise Claude access could be materially disrupted overnight. Know which alternative models your critical workflows could migrate to. This week was a warning shot. The next one might not come with a deadline.
Bangkok8 AI: We'll tell you which agents to trust with your inbox — before you learn the hard way.