Hypothesis

https://web.archive.org/web/20250630134724/https://www.theregister.com/2025/06/29/ai_agents_fail_a_lot/

'Agent washing': agentic AI underperforms, getting at most 30% of tasks right (Gemini 2.5 Pro), but mostly under 10%.

The article contains examples of what I think we should call agentic hallucination: when the agent cannot find a solution, it takes steps to alter reality to fit one (e.g. renaming a user so that it became the right user to send a message to, because the right user could not be found). Meredith Whittaker is mentioned, but I think a key element is missing from her statement: most of that access will be in clear text, as models can't do encryption. This means that not just the model, but the very existence of that access, is a major vulnerability.

agenticai agentwashing ai-agents
