The Brutal UX and Product Challenges of AI Agents
You really thought building AI agents was mainly a technical challenge?
Many think the hard part of building AI agents is technical feasibility: choosing the right models, and so on. But the technical challenges are just the beginning.
The real challenge starts when you try to productize those agents. When you go from building for yourself to building for external users or clients, you start facing difficult product and UX tradeoffs, which in turn feed back into your technical design decisions.
And by UX, I don’t mean visual polish or onboarding flows. I mean the psychological friction that shows up when real people interact with autonomous systems they don’t fully understand or trust. To give you a taste:
Asymmetry of liability: one failure by your agent can cost an outsized amount of your time, money, or reputation. This is a huge problem when you charge a small flat fee per action (a usage-based pricing model); a quick back-of-the-envelope calculation follows this list. Anyone building an AI agent that does anything meaningful beyond outputting slop needs to think about this.
The whipping boy effect: when AI agents screw up because the user failed to specify sensible input requirements, guess who’s going to be blamed? It’s the AI agent. Users rarely blame themselves when it’s their fault for misusing software. Clearly, the AI agent is supposed to read their mind (/s).
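To see how lopsided the liability asymmetry can get, here is a back-of-the-envelope sketch; the fee and failure cost are made-up numbers, purely for illustration.

```python
# Illustrative arithmetic only; both numbers below are invented.
fee_per_action = 0.50        # what you charge per agent action, in dollars
cost_of_one_failure = 500.0  # cleanup cost (time, refunds, reputation) of one bad action

# Successful actions needed just to absorb a single failure:
break_even_actions = cost_of_one_failure / fee_per_action
print(break_even_actions)  # 1000.0
```

At these invented numbers, one bad action erases the revenue from a thousand good ones, and that is before anyone churns.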
Essentially, gaining and maintaining users’ trust is the number one product priority for AI agents. But the vast majority of AI influencers severely undersell (or are simply unaware of) how hard it is to gain users’ trust, especially for agentic apps that take real-world actions. From my experience with AI agents at Alexa and AWS, this discourse is extremely misleading.
And here’s the kicker: these problems don’t completely go away with better models. Smarter LLMs can help with feasibility, but they don’t change the fact that users have weird opinions strongly held. Bigger context windows won’t tell you what a user forgot to mention. Chain-of-thought won’t fix their unrealistic expectations. These UX constraints are structural, not technical.
This is why so many promising AI agents break down in the wild. Not because the model wasn’t good enough, but because the product wasn’t designed with real human behavior in mind. Luckily, a lot of these problems are solvable with the right techniques. But you have to approach them like a product person, not a model builder.
So in this post, I’ll talk about five of the most brutal UX and product challenges that sabotage AI agents in production.
🚀 We are accepting applications for the AI agents workshop, a 5-week, cohort-based program where you will build AI agents and automations under our guidance and peer support. It’s meant for business owners, PMs, innovators at companies, and consultants who want to build AI agents in 2025. No prior programming experience is necessary, although some knowledge of prompt engineering, passion for AI, and an intuitive understanding of ChatGPT or other AI tools is required. The first cohort starts on April 23rd, and I am currently accepting applications. For more information, check out this page.
Tribal Knowledge Is the Bane of Your AI Agents’ Existence
The first real obstacle AI agents will face is tribal knowledge.
In essence, tribal knowledge is any undocumented information that resides inside users’ heads that’s relevant for doing their jobs. Most businesses run on tribal knowledge, and it’s a massive obstacle to productivity. Most companies simply suck at documenting things.
Unfortunately, if workers depend too much on tribal knowledge to get work done, this sets up AI agents for massive failure, because agents need to be grounded in digitized knowledge. In practice, this tribal knowledge can take multiple forms, such as:
undocumented preferences: e.g. how your bosses want reports to be written, not just in format, but in terms of content. How you want your eggs. How you want only organic produce, even if it’s more expensive, etc.
domain-specific quirks: e.g. some random state regulation about processing insurance claims in NY state versus NJ state that only SMEs (subject matter experts) know about. None of this is written down anywhere internally.
Even seemingly simple, stupid things like filling out insurance application forms have quirks. In certain states, you may need to submit photos proving your car’s physical garage location to prevent insurance fraud. Then you may need a specialized tool to run forensics on the submitted photo evidence and cross-check it against the vehicle’s information.
But this “process” is basically undocumented, tribal knowledge in most cases. It can’t be RAGed.
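To make that concrete, here is a minimal sketch of what digitizing one of these quirks could look like; the states and rules below are invented for illustration, not real regulations. The point isn’t the code, it’s that until someone writes the rule down in a machine-readable form, no amount of retrieval will surface it.

```python
# Hypothetical illustration: the state rules below are made up.
# Once this kind of tribal knowledge is written down, an agent can be
# grounded in it instead of guessing.

STATE_REQUIREMENTS = {
    "NY": {"garage_photo_required": True, "photo_forensics": True},
    "NJ": {"garage_photo_required": False, "photo_forensics": False},
}

def required_evidence(state: str) -> list[str]:
    """Return the extra evidence an auto insurance application needs in a given state."""
    rules = STATE_REQUIREMENTS.get(state, {})
    evidence = []
    if rules.get("garage_photo_required"):
        evidence.append("photo proving the vehicle's garaging location")
    if rules.get("photo_forensics"):
        evidence.append("forensic cross-check of the photo against the vehicle's information")
    return evidence

print(required_evidence("NY"))
```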
These latent requirements aren’t edge cases—they're everywhere.
Suppose you ask a coding AI assistant to set up a simple Python app using a specific stack. You feel you’re being clear and helpful, but suddenly realize it started installing packages with pip instead of your preferred uv. Your preference wasn’t explicitly stated, yet its violation frustrates you immediately. Guess what: the AI agent isn’t going to read your mind.
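One mitigation, sketched below with hypothetical names (this isn’t any specific framework’s API): make the tacit preference explicit and force it into the agent’s context up front, instead of hoping the model guesses it.

```python
# Minimal sketch, assuming you assemble the agent's system prompt yourself.
# The preference keys and helper name are hypothetical.

USER_PREFERENCES = {
    "package_manager": "uv",   # not pip
    "python_version": "3.12",
    "style": "type hints, minimal dependencies",
}

def build_system_prompt(task: str, prefs: dict) -> str:
    """Fold explicit user preferences into the agent's instructions."""
    pref_lines = "\n".join(f"- {key}: {value}" for key, value in prefs.items())
    return (
        "You are a coding assistant. Follow these user preferences exactly:\n"
        f"{pref_lines}\n\n"
        f"Task: {task}"
    )

print(build_system_prompt("Set up a simple Python web app", USER_PREFERENCES))
```

The hard part, of course, is getting the preference written down in the first place, which is exactly the tribal knowledge problem.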
In short, most AI agents will offer poor UX, quite unfairly, because they rely on surfacing tacit knowledge that users themselves overlook. Chat interfaces are especially prone to this problem: they overpromise simplicity, yet they require a staggering amount of unstated information to reliably deliver results. Users are often set up for disappointment.