Executing a sales report analysis for Q1 2026, focusing on revenue trends and product performance via desktop tools.

Dear friends,

If you haven’t already, I encourage you to experiment with using AI agents not just to chat but to actually do work for you on your desktop. Beyond chatting, desktop agents can read and edit local files, read and send messages, and provide scheduled deliverables like a daily news summary. While there's nothing wrong with copy-pasting output from web-based chatbots to your desktop or dragging and dropping files into chatbots to give them context, desktop agents can gain context more efficiently and also take actions directly.

The main way such an agent is built involves creating a set of tools (function calls) for tasks such as file access, web search/web fetch, messaging app integration, and so on; providing these tools to a frontier LLM; and setting up permissions and guardrails. Then you prompt the LLM and let it pick when to use what tool to move forward on a task. The software that wraps around the LLM to implement a desired agentic system is called the agent harness, and it enables the LLM to drive the key loop that decides what to do next at each step.
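To make the loop concrete, here is a minimal sketch of a harness, not how OpenCoworker or aisuite implements it. Every name in it (`Tool`, `ToolCall`, `run_agent`, `scripted_model`) is hypothetical, the "LLM" is a scripted stand-in so the example runs without an API key, and the file system is an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names for illustration only; not OpenCoworker's or aisuite's API.

@dataclass
class Tool:
    name: str
    func: Callable[..., str]
    needs_approval: bool = False  # guardrail: ask the user before running


@dataclass
class ToolCall:
    name: str
    args: dict


def run_agent(model, tools, task, approve, max_steps=10):
    """Harness loop: ask the model what to do next, run the tool, repeat."""
    registry = {t.name: t for t in tools}
    history = [("user", task)]
    for _ in range(max_steps):
        decision = model(history)  # a ToolCall, or a string holding the final answer
        if isinstance(decision, str):
            return decision
        tool = registry.get(decision.name)
        if tool is None:
            result = f"error: unknown tool {decision.name!r}"
        elif tool.needs_approval and not approve(decision):
            result = "error: permission denied by user"
        else:
            result = tool.func(**decision.args)
        # Feed the tool result back so the model can choose its next step.
        history.append(("tool", f"{decision.name} -> {result}"))
    return "stopped: step limit reached"


# In-memory stand-in for local files, so the sketch touches nothing on disk.
FILES = {"notes.txt": "buy milk"}


def read_file(path):
    return FILES.get(path, "error: file not found")


def send_message(to, body):
    return f"sent to {to}"


TOOLS = [
    Tool("read_file", read_file),
    Tool("send_message", send_message, needs_approval=True),
]


def scripted_model(history):
    """Stand-in for a frontier LLM: read one file, then give a final answer."""
    tool_results = [text for role, text in history if role == "tool"]
    if not tool_results:
        return ToolCall("read_file", {"path": "notes.txt"})
    return f"Summary based on: {tool_results[-1]}"


answer = run_agent(scripted_model, TOOLS, "Summarize my notes",
                   approve=lambda call: False)
print(answer)
```

In a real harness, `scripted_model` would be replaced by a call to an LLM that receives the tool descriptions and the history, and `approve` would prompt the user. The step limit is one simple guardrail against a model that never decides it is done.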

So far, most practical agentic AI workflows (except for coding agents) have not relied on the LLM to this extent to decide what to do next. Instead, they have relied more on developer-specified workflows to deliver higher reliability. But in the past few months, frontier LLMs have advanced sufficiently for this style of harness design to provide an important, if still not entirely reliable, alternative.

CLI (command-line interface) coding agents (like Claude Code, Codex CLI, Antigravity CLI, and OpenCode) have been the main type of agent that uses an LLM to drive the next action. But there’s also value in non-CLI agents with easy-to-use interfaces. More precisely, consumers currently interact with AI systems through three key interfaces: (i) chat interfaces (like the web version of ChatGPT), (ii) coding CLI tools, and (iii) desktop agents that can carry out tasks.

I do not use existing commercial desktop agents for highly confidential tasks, since I’m uncomfortable with some of their data-retention policies, which are often buried in obscure legalese and might change overnight with a new model (as we just saw with Anthropic’s Fable release). Also, a small misstep may have unexpected legal consequences, such as losing legal privilege over confidential documents.

In light of these concerns, my collaborators Rohit Prasad, Devika Verma, and I have been working on a free, open-source alternative: OpenCoworker. We put this project together while extending aisuite to support agent harnesses. If you’re interested in learning more about agent harnesses, you might enjoy checking out the code.

Using OpenCoworker requires your own API key from OpenAI, Anthropic, Google, or another provider, or you can run a local model using Ollama so nothing ever leaves your machine. Some of the data integrations, such as email, are still difficult to set up (comparable in difficulty to what users of other open-source projects such as OpenClaw or Hermes Agent might have experienced). OpenCoworker saves its memory on your computer, and you can choose an LLM provider with a zero data-retention policy, local inference, or other options depending on your privacy requirements.

My teams have been experimenting with OpenCoworker for a wide range of tasks like messaging automation, creating documents, and workflow automation. This is a work in progress, and I hope the open-source community will ensure that there is a viable, open desktop agent option that is comparable or superior to the closed ones. We are working to make OpenCoworker easier to use, and we welcome contributions as well as feedback!

Keep building!

Andrew 
