Engineering

Beyond the chatbot: AI agents that actually do things

Dušan Dević

April 22, 2026

For two years, the canonical AI product has been a text box that returns text. You type a question. The model types an answer. You read it, copy something out, and go execute the actual task somewhere else — in a different app, with a different login, often by retyping what the model just told you. The model never touched the world.

That era is ending fast. The shift, technically, is small — function calling, sometimes called tool use — but the user-facing implications are large. An agent that can call functions can, in principle, do anything those functions do. And the functions can be anything: place an order, run a query, schedule a meeting, deposit funds, dim the lights, page the on-call engineer.
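To make the mechanism concrete, here is a minimal sketch of the dispatch step: the model emits a structured request naming a function and its arguments, and ordinary code runs it. The JSON shape, the registry, and the tool names dimLights and pageOnCall are assumptions made for illustration, not any particular vendor's wire format.

```python
import json

# Registry mapping the name the model uses to the Python function behind it.
TOOLS = {}


def tool(name):
    """Register a function under the name the model is allowed to call."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("dimLights")  # hypothetical tool name
def dim_lights(room: str, level: int) -> str:
    return f"{room} lights set to {level}%"


@tool("pageOnCall")  # hypothetical tool name
def page_on_call(team: str, reason: str) -> str:
    return f"paged {team} on-call: {reason}"


def dispatch(model_output: str) -> str:
    """Run the tool call the model proposed, or refuse if the tool is unknown."""
    call = json.loads(model_output)  # assumed shape: {"name": ..., "arguments": {...}}
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"unknown tool: {call['name']}"
    return fn(**call["arguments"])
```

The model never executes anything itself. It can only name an entry in TOOLS, so the registry is the full list of things the agent is able to do.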

A worked example

Consider the difference between these two interactions. In the old world, a customer types into your support chat: "I need to reschedule my appointment from Friday to Monday." The bot replies: "I am sorry, I cannot reschedule for you. Please call our office." The user gives up on the chat, calls the number, waits on hold for fourteen minutes, finally talks to someone, and reschedules.

In the new world, the same message hits the same chat box. The agent calls lookupAppointment(user_id), sees the Friday slot, calls findAvailability(provider, "monday"), picks a comparable time, calls rescheduleAppointment(id, slot), calls sendConfirmation(user_id), and replies: "Done. Moved to Monday at 14:30. Confirmation sent." Total time: 8 seconds. No support staff in the loop.
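The chain of calls in that exchange can be sketched as plain Python, with each result feeding the next call. This is an illustration under stated assumptions: the data is an in-memory stand-in for a real booking backend, the sequence is hard-coded rather than chosen by a model, the signatures are simplified, and "comparable time" is interpreted as the open slot closest in time of day to the old one.

```python
from dataclasses import dataclass


@dataclass
class Appointment:
    id: str
    user_id: str
    provider: str
    slot: str  # "<day> <HH:MM>", e.g. "friday 14:00"


# Hypothetical in-memory data standing in for a real booking backend.
APPOINTMENTS = {"u1": Appointment("a1", "u1", "dr-lee", "friday 14:00")}
OPEN_SLOTS = {"dr-lee": ["monday 09:00", "monday 14:30", "tuesday 10:00"]}
OUTBOX = []  # confirmations "sent" so far


def lookup_appointment(user_id):
    return APPOINTMENTS[user_id]


def find_availability(provider, day):
    return [slot for slot in OPEN_SLOTS[provider] if slot.startswith(day + " ")]


def reschedule_appointment(appointment, slot):
    OPEN_SLOTS[appointment.provider].remove(slot)
    appointment.slot = slot
    return appointment


def send_confirmation(user_id):
    OUTBOX.append((user_id, APPOINTMENTS[user_id].slot))


def minutes(slot):
    """Minutes after midnight for a "<day> <HH:MM>" slot."""
    hours, mins = slot.split()[1].split(":")
    return int(hours) * 60 + int(mins)


def handle_reschedule(user_id, day):
    current = lookup_appointment(user_id)
    options = find_availability(current.provider, day)
    if not options:
        return f"No openings on {day.capitalize()}. Handing off to a human."
    # Assumption: "comparable" means closest time of day to the old slot.
    best = min(options, key=lambda slot: abs(minutes(slot) - minutes(current.slot)))
    reschedule_appointment(current, best)
    send_confirmation(user_id)
    new_day, new_time = best.split()
    return f"Done. Moved to {new_day.capitalize()} at {new_time}. Confirmation sent."
```

Calling handle_reschedule("u1", "monday") on this sample data moves the Friday 14:00 appointment to the Monday 14:30 opening and returns the reply quoted above.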

That is not science fiction. We have been shipping versions of this for a year — for clients in commerce, betting, healthcare front-desk, and B2B scheduling. The technology is, frankly, ready. Most products have not caught up.

What an agent can plausibly do today

The list grows every month, but the production-ready set is already big enough to be transformational:

Place orders and process payments inside a chat or voice interface.
Manage bets, slips, and account funding for licensed operators.
Schedule, reschedule, and cancel appointments across calendars and CRM systems.
File support tickets, escalate to humans, and triage based on policy.
Control smart-home and IoT devices through standard APIs.
Query operational data, draft reports, and email them to the right people.
Trigger internal workflows — provisioning, refunds, exception handling.
Hand off cleanly to humans when the agent is unsure, with full context.

Why this is harder than it looks

The bad version of this category is everywhere — a chatbot that hallucinates an order it did not place, or worse, places one it should not have. Building an agent that does not embarrass your brand requires more engineering than people give it credit for. A few things matter disproportionately.

Typed tools, not free-form actions

Every tool the agent can call has a strict, validated schema. The model proposes arguments; your code validates them. If the model wants to charge €4,000 to a card that has never been charged more than €40, that fails a check before the call ever reaches the payment gateway.
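A minimal sketch of that two-stage check, using only the standard library. The field names, the ToolArgumentError type, and the "ten times the largest previous charge" policy are assumptions chosen for illustration; a real system would pick its own schema and limits.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


class ToolArgumentError(ValueError):
    """Raised when model-proposed arguments fail validation."""


@dataclass(frozen=True)
class ChargeCardArgs:
    card_id: str
    amount_eur: Decimal


def parse_charge_args(raw: dict) -> ChargeCardArgs:
    """Stage 1: shape and type checks on what the model proposed."""
    if set(raw) != {"card_id", "amount_eur"}:
        raise ToolArgumentError(f"unexpected fields: {sorted(raw)}")
    if not isinstance(raw["card_id"], str) or not raw["card_id"]:
        raise ToolArgumentError("card_id must be a non-empty string")
    try:
        amount = Decimal(str(raw["amount_eur"]))
    except InvalidOperation:
        raise ToolArgumentError("amount_eur is not a number") from None
    if not amount.is_finite() or amount <= 0:
        raise ToolArgumentError("amount_eur must be a positive, finite amount")
    return ChargeCardArgs(card_id=raw["card_id"], amount_eur=amount)


# Assumed policy: reject anything above 10x the card's largest previous charge.
MAX_MULTIPLE_OF_HISTORY = Decimal("10")


def check_against_history(args: ChargeCardArgs, largest_previous_charge: Decimal) -> None:
    """Stage 2: business-rule check, run before the payment gateway is called."""
    limit = largest_previous_charge * MAX_MULTIPLE_OF_HISTORY
    if args.amount_eur > limit:
        raise ToolArgumentError(
            f"€{args.amount_eur} exceeds the €{limit} limit for card {args.card_id}"
        )
```

With a largest previous charge of €40, a proposed €35 charge passes both stages, while the €4,000 charge from the example is rejected in stage 2 and never reaches the gateway.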

Confirmations, with teeth

For high-stakes actions (money moving, irreversible changes), the user explicitly confirms in the same conversation. Not a generic "are you sure?", but a structured restatement: "I will place a bet of €20 on Galileo Star at 9/2 to win. Confirm?" One mistyped instruction should not drain an account.
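One way to sketch such a gate: the restatement is generated from the same structured fields that will be executed, and only an explicit yes releases the action. The class names, the accepted replies, and the one-pending-action-per-conversation rule are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BetProposal:
    stake_eur: str
    selection: str
    odds: str
    market: str

    def restatement(self) -> str:
        # Built from the exact fields that would be executed, so the user
        # confirms what will happen rather than a paraphrase of it.
        return (
            f"I will place a bet of €{self.stake_eur} on {self.selection} "
            f"at {self.odds} {self.market}. Confirm?"
        )


# Assumed set of replies that count as an explicit yes.
AFFIRMATIVE = {"yes", "confirm", "confirmed"}


class ConfirmationGate:
    """Holds at most one pending high-stakes action per conversation."""

    def __init__(self) -> None:
        self._pending: Optional[BetProposal] = None

    def propose(self, proposal: BetProposal) -> str:
        self._pending = proposal
        return proposal.restatement()

    def resolve(self, user_reply: str) -> Optional[BetProposal]:
        # Any reply consumes the pending action; only an explicit yes releases it.
        proposal, self._pending = self._pending, None
        if proposal is not None and user_reply.strip().lower() in AFFIRMATIVE:
            return proposal
        return None
```

An ambiguous reply such as "ok maybe" cancels the pending action instead of executing it, and a second "yes" cannot replay a bet that was already released.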

Audit and reversibility

Every tool call goes into a log. Every consequential action has an undo path, or a clearly documented "this cannot be undone." When the regulator, the customer, or your own ops team asks what happened, the answer is one query away.
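A minimal sketch of such a log using SQLite from the Python standard library. The table layout, column names, and the convention of storing the name of the undoing tool next to each call are assumptions for illustration, not a description of any specific production schema.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path=":memory:"):
    """Open (or create) the audit log; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tool_calls (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               at TEXT NOT NULL,
               conversation_id TEXT NOT NULL,
               tool TEXT NOT NULL,
               arguments TEXT NOT NULL,
               result TEXT NOT NULL,
               undo_tool TEXT  -- NULL means there is no undo path
           )"""
    )
    return conn


def record_call(conn, conversation_id, tool, arguments, result, undo_tool=None):
    """Append one tool call to the log and return its row id."""
    cur = conn.execute(
        "INSERT INTO tool_calls (at, conversation_id, tool, arguments, result, undo_tool) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            conversation_id,
            tool,
            json.dumps(arguments, sort_keys=True),
            json.dumps(result, sort_keys=True),
            undo_tool,
        ),
    )
    conn.commit()
    return cur.lastrowid


def what_happened(conn, conversation_id):
    """The single query: every call in a conversation, in order, with its undo path."""
    rows = conn.execute(
        "SELECT tool, arguments, undo_tool FROM tool_calls "
        "WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    ).fetchall()
    return [
        (tool, json.loads(arguments), undo_tool or "this cannot be undone")
        for tool, arguments, undo_tool in rows
    ]
```

Recording the undo path at write time means the log answers two questions at once: what the agent did, and how to reverse it if that is possible.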

Evaluation that runs every release

AI products without eval sets are rumors. We treat agent behavior the same way we treat any other production system: a regression suite of representative conversations runs on every change, and we ship only when the numbers do not get worse.
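A release gate of that kind can be sketched in a few lines. Everything here is an assumption for illustration: the agent is a keyword stub standing in for a real model, a case passes only if the exact expected tool sequence is produced, escalateToHuman is a hypothetical tool name, and the baseline is whatever the previous release scored.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    name: str
    user_message: str
    expected_tools: tuple  # tool names, in the order the agent should call them


def stub_agent(message: str) -> list:
    """Stand-in for the real agent: returns the tool names it would call."""
    if "reschedule" in message.lower():
        return ["lookupAppointment", "findAvailability",
                "rescheduleAppointment", "sendConfirmation"]
    return ["escalateToHuman"]  # hypothetical fallback tool


CASES = [
    EvalCase(
        "reschedule",
        "I need to reschedule my appointment from Friday to Monday.",
        ("lookupAppointment", "findAvailability",
         "rescheduleAppointment", "sendConfirmation"),
    ),
    EvalCase("out of scope", "Can you write me a poem?", ("escalateToHuman",)),
]


def pass_rate(agent: Callable[[str], Sequence[str]], cases: Sequence[EvalCase]) -> float:
    """Fraction of cases where the agent produced exactly the expected tool calls."""
    passed = sum(1 for case in cases if tuple(agent(case.user_message)) == case.expected_tools)
    return passed / len(cases)


def release_gate(agent, cases, baseline: float) -> bool:
    """Ship only if the pass rate is at least the previous release's."""
    return pass_rate(agent, cases) >= baseline
```

In practice the suite would hold many more conversations and score more than tool order, but the gate itself stays this simple: compare against the last release and refuse to ship a regression.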

The strategic move

For most product companies, the agent is not a replacement for the existing app. It is a new front door. The user still has buttons and forms when they want them. They also have a single conversation that knows their account, their history, and their intent — and can do anything the buttons could do, plus a few things the buttons could not.

Anywhere your user fills a form today, an agent can fill it in for them — and actually click submit.

That is the headline. The mechanics underneath are boring infrastructure: typed tools, a planning loop, a context store, an evaluation pipeline. The boring part is what separates a demo from a product. We have spent the last year shipping the boring part. If you have a workflow trapped behind a form, we would love to see it.

#AI
#agents
#product

// about the author

Dušan Dević

Founder · DeltaDigit. We design, build, and operate production software for ambitious teams across the EU and US.
