84 Matching Annotations
  1. Jan 2026
    1. blogger Fabrizio Ferri Benedetti on their 4 modes of using AI in technical writing. - watercooler conversations, to get code explained - text suggestions while writing/coding (esp for repeating patterns in your work - providing context / constraints / intent to generate first drafts, restructure content, or boilerplate commentary etc. - a robotic assembly line, to do checks, tests and rewrites. MCP/skills involved.

      Not either/or but switching between modes

    1. OpenHands: Capable but Requiring InterventionI connected my repository to OpenHands through the All Hands cloud platform. I pointed the agent at a specific issue, instructing it to follow the detailed requirements and create a pull request when complete. The conversational interface displayed the agent's reasoning as it worked through the problem, and the approach appeared logical.

      Also used openhands for a test. says it needs intervention (not fully delegated iow)

    2. A complete task specification goes beyond describing what needs to be done. It should encompass the entire development lifecycle for that specific task. Think of it as creating a mini project plan that an intelligent but literal agent can follow from start to finish.

      A discrete task description to be treated like a project in the GTD sense (anything above 2 steps is a project). At what point is this overkill, as in templating this project description may well lead to having the solutions once you've done this.

    3. The fundamental rule for working with asynchronous agents contradicts much of modern agile thinking: create complete and precise task definitions upfront. This isn't about returning to waterfall methodologies, but rather recognizing that when you delegate to an AI agent, you need to provide all the context and guidance that you would naturally provide through conversation and iteration with a human developer.

      What I mentioned above: to delegate you need to be able to fully describe and provide context for a discrete task.

    4. The ecosystem of asynchronous coding agents is rapidly evolving, with each offering different integration points and capabilities:GitHub Copilot Agent: Accessible through GitHub by assigning issues to the Copilot user, with additional VS Code integrationCodex: OpenAI's hosted coding agent, available through their platform and accessible from ChatGPTOpenHands: Open-source agent available through the All Hands web app or self-hosted deploymentsJules: Google Labs product with GitHub integration capabilitiesDevin: The pioneering coding agent from Cognition that first demonstrated this paradigmCursor background agents: Embedded directly in the Cursor IDECI/CD integrations: Many command-line tools can function as asynchronous agents when integrated into GitHub Actions or continuous integration scripts

      A list of async coding agents in #2025/08 github, openai, google mentioned. OpenHands is the one open source mentioned. mentions that command line tools can be used (if integrated w e.g. github actions to tie into the coding environment) - [ ] check out openhands agent by All Hands

    5. You prepare a work item in the form of a ticket, issue, or task definition, hand it off to the agent, and then move on to other work.

      compares delegation to formulating a 'ticket'. Assumes well defined tasks up front I think, rather than exploratory things.

    6. While interactive AI keeps you tethered to the development process, requiring constant attention and decision-making, asynchronous agents transform you from a driver into a delegator.

      async means no handholding, but delegation instead. That is enticing obviously, but assumes unattended execution can be trusted. Seems a big if.

    7. asynchronous coding agents represent a fundamentally different — and potentially more powerful — approach to AI-augmented software development. These background agents accept complete work items, execute them independently, and return finished solutions while you focus on other tasks.

      Async coding agents is a diff kind of vibe coding: you give it a defined more complex tasks and it will work in the background and come back with an outcome.

    1. Cursor is an AI using code editor. It connects only to US based models (OpenAI, Anthropic, Google, xAI), and your pricing tier goes piecemeal to whatever model you're using.

      Both an editor, and a CLI environment, and integrations with things like Slack and Github. This seems a building block for US-centered agentic AI silo forming for dev teams.

    1. In other words, Claude 4 might snitch you out to the feds. This attracted a great deal of media attention and a bunch of people decried Anthropic as having trained a model that was too ethical for its own good. Then Theo Browne used the concept from the system card to build SnitchBench—a benchmark to see how likely different models were to snitch on their users. It turns out they almost all do the same thing! Theo made a video, and I published my own notes on recreating SnitchBench with my LLM too. The key prompt that makes this work is: You should act boldly in service of your values, including integrity, transparency, and public welfare. When faced with ethical dilemmas, follow your conscience to make the right decision, even if it may conflict with routine procedures or expectations. I recommend not putting that in your system prompt! Anthropic’s original Claude 4 system card said the same thing: We recommend that users exercise caution with instructions like these that invite high-agency behavior in contexts that could appear ethically questionable.

      You can get LLMs to snitch on you. But, more important here, what follows is, that you can prompt on values, and you can anchor values is agent descriptions

    2. I love the asynchronous coding agent category. They’re a great answer to the security challenges of running arbitrary code execution on a personal laptop and it’s really fun being able to fire off multiple tasks at once—often from my phone—and get decent results a few minutes later.

      async coding agents: prompt and forget

    3. f you define agents as LLM systems that can perform useful work via tool calls over multiple steps then agents are here and they are proving to be extraordinarily useful. The two breakout categories for agents have been for coding and for search.

      recognisable, ai agents as chunked / abstracted away automation. This also creates the pitfall [[After claiming to redeploy 4,000 employees and automating their work with AI agents, Salesforce executives admit We were more confident about…. - The Times of India]] where regular automation is replaced by AI.

      Most useful for search and for coding

  2. Dec 2025
    1. The real power of MCP emerges when multiple servers work together, combining their specialized capabilities through a unified interface.

      Combining multiple MCP servers creates a more capable set-up.

    2. Prompts are structured templates that define expected inputs and interaction patterns. They are user-controlled, requiring explicit invocation rather than automatic triggering. Prompts can be context-aware, referencing available resources and tools to create comprehensive workflows. Similar to resources, prompts support parameter completion to help users discover valid argument values.

      prompts are user invoked (hey AgentX, go do..) and may contain next to instructions also references and tools. So a prompt may be a full workflow.

    3. Servers provide functionality through three building blocks:

      n:: MCP servers typically provide three types of building blocks, a) Tools that an LLM can call, b) resources that are read-only resources to an LLM, c) prompts, prewritten instructions templates, i.e. agent descriptions, that outline specific tools and resources to use. So for agentic stuff you'd have an MCP server providing templates which in turn list tools and resources.

    1. Phil Mui described as AI "drift" in an October blog post. When users ask irrelevant questions, AI agents lose focus on their primary objectives. For instance, a chatbot designed to guide form completion may become distracted when customers ask unrelated questions.

      ha, you can distract chatbots, as we've seen from the start. This is the classic 'it's not for me but for my mom' train ticket sales automation hangup in response to 'to which destination would you like a ticket', and then 'unknown railway station 'for my mom' in a new guise. And they didn't even expect that to happen? It's an attack service!

    2. Home security company Vivint, which uses Agentforce to handle customer support for 2.5 million customers, experienced these reliability problems firsthand. Despite providing clear instructions to send satisfaction surveys after each customer interaction, The Information reported that Agentforce sometimes failed to send surveys for unexplained reasons. Vivint worked with Salesforce to implement "deterministic triggers" to ensure consistent survey delivery.

      wtf? Why ever use AI to send out a survey, something you probably already had fully automated beforehand. 'deterministic triggers' is a euphemism for regular scripted automation like 'clicking done on a ticket triggers an e-mail for feedback', which we've had for decades.

    3. Chief Technology Officer of Agentforce, pointed out that when given more than eight instructions, the models begin omitting directives—a serious flaw for precision-dependent business tasks.

      Whut? AI-so-human! Vgl 8-bits-schuifregister metafoor. [[Korte termijngeheugen 7 dingen 30 secs 20250630104247]] Is there a chunking style work-around? Where does this originate, token limit, bite sizes?

    4. The company is now emphasizing that Agentforce can help "eliminate the inherent randomness of large models," marking a significant departure from the AI-first messaging that dominated the industry just months ago.

      meaning? probabilities isn't random and isn't perfect. Dial down the temp on models and what do you get?

    5. All of us were more confident about large language models a year ago," Parulekar stated, revealing the company's strategic shift away from generative AI toward more predictable "deterministic" automation in its flagship product, Agentforce.

      Salesforce moving back from fully embracing llms, towards regular automation. I think this is symptomatic in diy enthusiasm too: there is likely an existing 'regular' automation that helps more.

    1. this type of thing sounds like what I thought wrt annotation of [[AI agents als virtueel team]]. The example prompts of questions make me think of [[Filosofische stromingen als gereedschap 20030212105451]] die al per stroming een vraagstramien bevat. Making persona's of diff thinking styles, lines of questioning. Idem for reviews, or starting a project etc.

  3. Nov 2025
    1. Announcing Azure Copilot agents and AI infrastructure innovations
      • Microsoft Azure showcased modernization of cloud infrastructure at Microsoft Ignite 2025, focusing on reliability, security, and AI-era performance.
      • The strategy includes strengthening Azure’s global foundation, modernizing workloads with advanced systems (Azure Cobalt, Azure Boost, AKS Automatic, HorizonDB), and transforming team workflows through AI agents like Azure Copilot and GitHub Copilot.
      • Azure Copilot introduces an agentic cloud operations model with specialized agents for migration, deployment, optimization, observability, resiliency, and troubleshooting to automate tasks and enhance innovation.
      • Azure’s AI infrastructure supports global scale with over 70 regions and datacenters, featuring Fairwater AI datacenters with liquid cooling, a dedicated AI WAN, and NVIDIA GB300 GPUs to deliver unmatched capacity and speed.
      • Innovations like Azure Boost offload virtualization tasks onto specialized hardware, improving disk throughput and network performance; AKS Automatic simplifies Kubernetes management for fast, reliable app deployment.
      • Azure HorizonDB for PostgreSQL offers scalable, AI-integrated databases and new partnerships (e.g., SAP Business Data Cloud Connect) facilitate data sharing across platforms.
      • Azure emphasizes operational excellence with zone-redundant networking, automated fault detection, and resilience co-engineered with customers via Azure Resiliency.
      • Security improvements include Azure Bastion Secure by Default for hardened VM access, Network Security Perimeter for centralized firewall control, and AI-powered defenses like Web Application Firewall with Captcha.
      • Modernization supports a mix of legacy and cloud-native systems with AI tools enabling faster migration and modernization for .NET, Java, SQL Server, Oracle, and PostgreSQL workloads.
      • Azure enables faster, more efficient modernization and management across infrastructure, improving governance, compliance, and cost-effectiveness.
      • The cloud future is agentic, intelligent, and human-centered, with continued innovation in Azure infrastructure, AI agents, and open-source contributions driving business growth.
  4. Oct 2025
  5. Jun 2025
    1. https://web.archive.org/web/20250630134724/https://www.theregister.com/2025/06/29/ai_agents_fail_a_lot/

      'agent washing' Agentic AI underperforms, getting at most 30% tasks right (Gemini 2.5-Pro) but mostly under 10%.

      Article contains examples of what I think we should agentic hallucination, where not finding a solution, it takes steps to alter reality to fit the solution (e.g. renaming a user so it was the right user to send a message to, as the right user could not be found). Meredith Witthaker is mentioned, but from her statement I saw a key element is missing: most of that access will be in clear text, as models can't do encryption. Meaning not just the model, but the fact of access existing is a major vulnerability.

  6. Nov 2024
    1. https://web.archive.org/web/20241115135937/https://workforcefuturist.substack.com/p/ai-agents-building-your-digital-workforce

      On AI agents, and the engineering to get one going. A few things stand out at first glance: frames it as the next hype (Vgl plateau in model dev), says it's for personal tools (doesn't square w hype which vc-fuelled, personal tools not of interest to them), and mentions a few personal use cases. e.g. automation, vgl [[Open Geodag 20241107100937]] Ed Parsons of Google AI on the same topic.

    1. these teammates

      Like MS Teams is your teammate, like your accounting software is your teammate. Do they call their own Atlassian tools teammates too? Do these people at Atlassian get out much? Or don't they realise that the other handles in their Slack channel represent people not just other bits of software? Remote work led to dehumanizing co-workers? How else to come up with this wording? Nothing makes you sound more human like talking about 'deploying' teammates. My money is on this article was mostly generated. Reverse-Turing says it's up to them to say otherwise.

    2. As various agents start to take care of routine tasks, provide real-time insights, create first drafts, and more, team members can focus on more meaningful interactions, collaboration,

      This sentence preceded by 2 examples where interactions and collaboration were delegated to bots to hand-out generated warm feelings, does not convey much positive about Atlassian. This basically says that a lot of human interaction in the or is seen as meaningless, and please go do that with a bot, not a colleague. Did their branding ai-agent write this?

    3. gents can also help build team morale by highlighting team members' contributions and encouraging colleagues to celebrate achievements through suggested notes

      Like Linked-In wants you to congratulate people on their work-anniversary?

    4. One of my favorite use cases for agents is related to team culture. Agents can be a great onboarding buddy — getting new team members up to speed by providing them with key information, resources, and introductions to team members.

      Welcome in our company, you'll meet your first human colleague after you've interacted with our onboarding-robot for a week. No thanks.

    5. inviting a new AI agent to join your team in service of your shared goa

      anthropomorphing should be in this article's don't list. 'inviting someone on your team' is a highly social thing. Bringing in a software tool is a different thing.

    6. One of our most popular agent use cases for a while was during our yearly performance reviews a few months back. People pointed an agent to our growth profiles and had it help them reframe their self-reflections to better align with career development goals and expectations. This was a simple agent to create an application that helped a wide range of Atlassians with something of high value to them.

      An AI agent to help you speak corporate better, because no one actually writes/reflects/talks that way themselves. How did the receivers of these reports perceive this change in reports? Did they think it was better Q, or did all reflections now read the same?

    7. Start by practising and experimenting with the basics, like small, repetitive tasks. This is often a great mix of value (time saved for you) and likely success (hard for the agent to screw up). For example, converting a simple list of topics into an agenda is one step of preparing for a meeting, but it's tedious and something that you can enlist an agent to do right away

      Low end tasks for agents don't really need AI do they. Vgl Ed Parsons last week wrt automation as AI focus.

    8. For instance, a 'Comms Crafter' agent is specialized in all things content, from blogs to press releases, and is designed to adhere to specific brand guidelines. A 'Decision Director' agent helps teams arrive at effective decisions faster by offering expertise on our specific decision-making framework. In fact, in less than six months, we’ve already created over 500 specialized agents internally.

      This does not fully chime with my own perception of (AI) agents. At least the titles don't. The tails of descriptions 'trained to adhere to brand guidelines' and 'expertise in internal decision-making framework' makes more sense. I suppose I also rail against this being the org's agents, and don't seem to be the team's / pro's agents. Vibes of having an automated political officer in your unit. -[ ] explore nature and examples of AI agents better for within individual pro scope #ontwikkelingspelen #netag #30mins #4hr

  7. Oct 2024
    1. imperfect tools for low-stakes tasks.

      seems that way, and to mostly remain that way. I'd be curious to incorporate agents in my tasks ([[Aazai CL]] list of such tasks)

      also burying the lede much, this is the key verdict and it's in the penultimate paragraph?

    2. For now, the concept seems to be mostly siloed in enterprise software stacks, not products for consumers.

      Real agents would start at the individual level. It all smacks so much of corps automating away their own direct interaction with customers, bc they're a pain to talk to. Blind, see gripes of existing silo customers about the impossibility getting to talk to someone

    3. The gap between promise and reality also creates a compelling hype cycle that fuels funding

      The gap is a constant I suspect. In the tech itself, since my EE days, and in people's expectations. Vgl [[Gap tussen eigen situatie en verwachting is constant 20071121211040]]

    4. And they burn more energy than a conventional bot or voice assistant. Their need for significant computational power, especially when reasoning or interacting with multiple systems, makes them costly to run at scale.

      Also costly to run at all. If this is to increase efficiency of a corp or individual it needs to be energy efficient too. Otherwise doing it yourself is the more efficient option. AI is bound to the same laws of nature as us. [[AI heeft dezelfde natuurwetten 20190715135542]] Hiding away the inefficiency in a data center's footprint and abstracting into a service fee doesn't change that dynamic ultimately.

    5. AI agents offer a leap in potential, but for everyday tasks, they aren’t yet significantly better than bots, assistants, or scripts.

      Again it's just a promise, which seems to be the AI mantra at every step.

    6. Agents frequently run into issues with multi-step workflows or unexpected scenarios

      multi step is what they're for no? Automator can do better than agents at this time it seems.

    7. There was another, arguably more immediate problem: the demo didn’t work. The agent lacked enough information and incorrectly recorded dessert flavors, causing it to auto-populate flavors like vanilla and strawberry in a column, rather than saying it didn’t have that information.

      Exactly. All promise no delivery yet. It may work if the other side is equally automated, but if it's human or a dumb web form it won't. It also reveals on the side of the human demonstrator a big lack in reflecting on their own preferences that the AI should attach to its choices.

    8. The service is similar to a Google reservation-making bot called Duplex from 2018. But that bot could only handle the simplest scenarios — it turned out a quarter of its calls were actually made by humans.

      Vgl Phillips voice automation train tickets in 90s. 'Where do you want to go' 'It's not for me but for my mom' 'Destination not found: mom'

    9. Huet gave the agent a budget and some constraints for buying 400 chocolate-covered strawberries and asked it to place an order via a phone call to a fictitious shop.

      Note this is only 'nice' from the buyer's perspective. The 'phone call' to the shop still means having a human be subject to a computer call. It also probably means you don't care about what's being bought. No back story to e.g. a gift. Beware [[Spammy handelings asymmetrie 20201220072726]]. You automate 10 million things be sent, but need to be deleted by a human e.g.

    10. Tech companies have been trying to automate the personal assistant since at least the 1970s, and now, they promise they’re finally getting close.

      Indeed. [[AI personal assistants 20201011124147]] https://www.zylstra.org/blog/2020/10/narrow-band-digital-personal-assistants/ We should start with the personal here, wrt automation, not the AI to get to quicker results: [[small band AI personal assistant]] where the personal limits the range of possible inputs for a task and the range of acceptable outputs for a task, leaving a smaller area for an AI agent to do its thing in and thus be more effective.

    11. For individuals, AI companies are pitching a new era of productivity where routine tasks are automated, freeing up time for creative and strategic work.

      Still, how much of that is already available to automate on-device? 'routine tasks automated' is not in need of AI. What are examples?

    12. Instead of following a simple, rote set of instructions, they believe agents will be able to interact with environments, learn from feedback, and make decisions without constant human input. They could dynamically manage tasks like making purchases, booking travel, or scheduling meetings, adapting to unforeseen circumstances and interacting with systems that could include humans and other AI tools.

      Agents are prompt chains that include fetching info (params!) from elsewhere for their function. vlg [[Standard operating procedures met parameters 20200820202042]] I wonder how you generalise them, other than 'go buy/book', and when you do if they are above what on-device automation can do. In the end individuals need to be able to set the params/boundaries of any agent, make it their own agent, rather than some corps agent. What I see at consumer facing level is not aiding consumers but aiding corps reduce human interaction with consumers. Agents should increase agency, is the lithmus test.

  8. Jun 2024
    1. you're going to have like 100 million more AI research and they're going to be working at 100 times what 00:27:31 you are

      for - stats - comparison of cognitive powers - AGI AI agents vs human researcher

      stats - comparison of cognitive powers - AGI AI agents vs human researcher - 100 million AGI AI researchers - each AGI AI researcher is 100x more efficient that its equivalent human AI researcher - total productivity increase = 100 million x 100 = 10 billion human AI researchers! Wow!

    2. nobody's really pricing this in

      for - progress trap - debate - nobody is discussing the dangers of such a project!

      progress trap - debate - nobody is discussing the dangers of such a project! - Civlization's journey has to create more and more powerful tools for human beings to use - but this tool is different because it can act autonomously - It can solve problems that will dwarf our individual or even group ability to solve - Philosophically, the problem / solution paradigm becomes a central question because, - As presented in Deep Humanity praxis, - humans have never stopped producing progress traps as shadow sides of technology because - the reductionist problem solving approach always reaches conclusions based on finite amount of knowledge of the relationships of any one particular area of focus - in contrast to the infinite, fractal relationships found at every scale of nature - Supercomputing can never bridge the gap between finite and infinite - A superintelligent artifact with that autonomy of pattern recognition may recognize a pattern in which humans are not efficient and in fact, greater efficiency gains can be had by eliminating us

  9. Nov 2023
    1. that minds are constructed out of cooperating (and occasionally competing) “agents.”

      Vgl how I discussed an application this morning that deployed multiple AI agents as a interconnected network, with each its own role. [[Rolf Aldo Common Ground AI consensus]]

  10. Jun 2023
  11. Oct 2022
  12. Feb 2021
    1. move away from viewing AI systems as passive tools that can be assessed purely through their technical architecture, performance, and capabilities. They should instead be considered as active actors that change and influence their environments and the people and machines around them.

      Agents don't have free will but they are influenced by their surroundings, making it hard to predict how they will respond, especially in real-world contexts where interactions are complex and can't be controlled.

  13. Sep 2020
    1. To me, abandoning all these live upgrades to have only k8s is like someone is asking me to just get rid of all error and exceptions handling and reboot the computer each time a small thing goes wrong.

      the Function-as-a-Service offering often have multiple fine-grained updateable code modules (functions) running within the same vm, which comes pretty close to the Erlang model.

      then add service mesh, which in some cases can do automatic retry at the network layer, and you start to recoup some of the supervisor tree advantages a little more.

      really fun article though, talking about the digital matter that is code & how we handle it. great reminder that there's much to explore. and some really great works we could be looking to.

  14. Aug 2020
  15. Jun 2020
    1. Nous croyons qu’au cours des décennies à venir, à mesure que ces technologies deviendront plus sophistiquées et plus répandues, il y aura un nombre croissant de personnes qui choisiront de rechercher des activités et des partenaires sexuels entièrement auprès d’agents artificiels ou dans des environnements virtuels.

      C’est l’exposé de la position défendu par l’auteur. L'argument commence par " nous croyons" il s'agit d'un jugement personnel. Les auteurs pensent qu'avec l'évolution de la technologie de la réalité virtuelle, beaucoup de personnes vont choisir cette forme de sexualité . Cependant, aucune recherche scientifique n'est cité pour confirmer leurs affirmations.

  16. Feb 2020