Codex used my own app on the first try
An experiment with computer use and an internal app with no docs, where the agent completed tasks on the first try, and what that means for integration and UX.
Notes from a small experiment with computer use, Proton VPN, and an app I'd built that no model had ever seen.
Computer use, the ability for an AI agent to operate software the way a human would by clicking and typing on the screen, isn't available in the Czech Republic yet. Codex's version of it hasn't rolled out here.
So I did the obvious thing. Proton VPN, US exit node, signed in.
I wasn't trying to break a rule. I wanted to test something specific: can an agent operate an app it has never seen before, with no documentation, no walkthrough, and no API shortcut?
So I gave it one of my own.
The test
I picked an internal tool I'd built — the kind of app most agencies have lying around. Custom UI, our own conventions, no public docs, nothing the model could have been trained on.
The instructions were simple: create a few tasks, add some teammates, set them up properly.
That's it. No "click the button in the top right." No "use the menu under settings." Just the goal.
What happened
It worked. First try.
The agent found the navigation. Identified the right screens. Clicked through the forms. Filled in the fields. Saved. Moved to the next task.
It didn't ask questions. It didn't get stuck on hover states or misread a button. It moved through the app like a person who had been onboarded a week ago and now knew the lay of the land.
The whole sequence took less time than it would have taken me to record a Loom walkthrough explaining how to do it.
Why this matters
A year ago, "AI agents using your software" was a demo. Curated. Staged. Often hardcoded around a specific flow.
This wasn't curated. The app was mine. The goal was concrete. And nothing was set up for it.
Two things I keep thinking about.
The integration question just changed shape.
For years, the answer to "how do we connect tool A to tool B" has been "find or build an API, wire it up, maintain it forever." When an API exists, that's still the best option.
But now there's a second answer: point an agent at the UI and tell it what you want. For internal tools, legacy systems, vendors without proper APIs — this is suddenly viable. Not optimal. But viable. And that's a category that didn't exist before.
UX is about to have two audiences.
Every interface choice we make — button labels, menu structures, form layouts — has been judged by humans. From now on, agents are part of that audience too.
Clear labels help both. Hidden affordances hurt both. The difference is that an agent will tell you, dryly, that it couldn't find the "Add member" button because it was buried inside an unmarked icon menu.
Agent-friendly UX is just good UX, slightly stricter.
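To make "slightly stricter" concrete, here is a minimal sketch of a check for the unmarked-icon-button problem described above. It assumes the app is rendered as HTML, and it assumes that a missing text label or accessible name is a reasonable proxy for what an agent (or a screen reader) would struggle with. The class and function names are hypothetical, and this is not how Codex itself perceives a screen.

```python
from html.parser import HTMLParser


class UnlabeledButtonFinder(HTMLParser):
    """Hypothetical helper: collects <button> elements with no readable name."""

    def __init__(self):
        super().__init__()
        self.unlabeled = []  # (line, column) of each button with no name
        self._open = []      # stack of [position, has_name] for open buttons

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "button":
            # aria-label or title counts as a name even without visible text.
            named = any((attrs.get(key) or "").strip()
                        for key in ("aria-label", "title"))
            self._open.append([self.getpos(), named])
        elif tag == "img" and self._open and (attrs.get("alt") or "").strip():
            # An image with alt text names the button that contains it.
            self._open[-1][1] = True

    def handle_data(self, data):
        # Any visible, non-whitespace text inside the button names it.
        if self._open and data.strip():
            self._open[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "button" and self._open:
            position, named = self._open.pop()
            if not named:
                self.unlabeled.append(position)


def find_unlabeled_buttons(html):
    """Return (line, column) positions of buttons with no text or label."""
    finder = UnlabeledButtonFinder()
    finder.feed(html)
    finder.close()
    return finder.unlabeled
```

Run against a saved copy of a page, this returns the positions of buttons that contain only an icon. It is deliberately narrow: it ignores links, custom clickable divs, and names supplied through aria-labelledby, so treat an empty result as a weak signal rather than proof that the UI is agent-friendly.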
What this means for the work we do
A lot of automation work is plumbing — connecting systems that don't want to be connected. That work doesn't go away. But the surface area of what's automatable is expanding fast.
The new question for any client process isn't only "is there an API?" It's "can an agent operate this software well enough to do the job?" For more and more workflows, the answer is yes.
That changes the cost-benefit math on a lot of automation projects that were previously too brittle or too expensive to ship. Internal admin panels, vendor portals, legacy ERPs, that one app the team uses for one specific thing — all of them just became more reachable.
Closing
I'm not going to oversell a single afternoon's experiment. There are still failure modes — long sessions, complex state, anything that requires real judgment over many steps. Computer use is not a finished product.
But the floor has clearly moved. An agent navigated an app it had never seen, on the first try, in under a minute.
It was the first time in a while that something AI-related made me sit back and actually pay attention.