A Strategic Guide for Enterprise Leaders
AI AGENTS
This guide embodies our experience at Barnacle Labs building agents at scale for large enterprises. It's for anyone exploring AI agents who wants to cut through the noise and find real opportunities.
Our team has over a decade of experience delivering AI solutions, led by the former European CTO for IBM Watson. We believe in AI for breakthroughs, not buzzwords—so we created this document to distil our practical experience of building agents in production, not theory.
We'll email you the AI Agents white paper (PDF) and the Agent Opportunity Workbook to run a 60–90 minute workshop with your team.
THE OPPORTUNITY
Agents move us from AI that answers questions to AI that researches, reasons, and acts.
This unlocks two types of work:
Work that resisted automation — too open-ended or too judgement-heavy for traditional software.
Work we never bothered with — every business has a backlog of “nice to haves”: reviewing every supplier contract for risk, writing personalised outreach for 50,000 prospects, monitoring 200 journals a day. Valuable work, but tedious, judgment-heavy, and high-volume—so we never did it, because tech couldn’t handle it and humans were too expensive.
Studies suggest the economic potential of automating this work is significant: MIT's Iceberg Index estimates it could represent 11.7% of the US labour market, or approximately $1.2 trillion of economic activity. McKinsey estimate the global opportunity at $2.9 trillion.
The hard part is imagination. It's difficult to spot work you've never thought about doing. You need to look at what you do today and ask how it might be done differently.
CUT THROUGH THE NOISE
Marketing promises "autonomous digital employees." Academic papers offer impenetrable taxonomies. Neither helps you get started.
Reality check: Agents work best on specific, well-defined tasks—ones you do often enough to be annoying or costly.
"Can I replace my finance department with an agent?" Not yet. Focus on automating tasks, not roles.
But the frontier moves fast. Google's co-scientist discovers drug candidates. Gamma generates presentations. 19pine handles refunds autonomously. These are well-funded, productised bets—more than simple agents. They show what's possible when conviction and investment are high.
WHAT ARE AGENTS
Everyone uses the word "agent." They often mean different things.
An agent is AI that acts. It doesn't just answer — it reasons, uses tools, pursues goals.
Give an agent "research this competitor" and it searches the web, queries APIs, tries alternative sources, compiles a summary. No human tells it how. It works it out.
The key distinction: Agents decide what to do next. They adapt to what they find. Traditional automation follows rigid scripts.
WORKFLOWS VS AGENTS
The industry conflates two different things. Some products marketed as "agents" are workflows with an LLM bolted on.
Workflows: Human designs the steps. Tool executes them, maybe using AI on some steps. Edge case? It breaks. Workflows can't reason.
Agents: AI designs the steps. Give it a goal and tools. It figures out the path—and adapts when things change. Agents handle problems that need judgement, or are too open-ended for pre-defined workflows.
Both are useful. But conflating them means buying workflow tools expecting agent capabilities.
WHAT MAKES AN AGENT
Every agent has the same core capability—a loop that continues until the task is done. This is The Agentic Loop:
1. Observe — Receive input: a task, context, or feedback from the last action.
2. Think — Reason about current state. What's done? What's next? Which tool?
3. Act — Call a tool: search the web, query a database, write a file, send an email.
4. Repeat — Result feeds back. Agent observes the outcome and decides what's next.
This loop is what makes agents different. An agent might spend minutes exploring a topic, making multiple AI calls, before reaching a conclusion. It can reason about problems in surprisingly sophisticated ways. No loop? Not a true agent.
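The four steps above can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the reasoning step is stubbed out (a real agent would call an LLM there and parse its chosen action), and `search_web` is a hypothetical tool.

```python
def search_web(query):
    # Hypothetical tool: a real agent would call a search API here.
    return f"results for '{query}'"

TOOLS = {"search_web": search_web}

def think(task, history):
    # Stub for the Think step. A real implementation would send the task
    # and history to an LLM and parse the action it chooses.
    if not history:
        return {"tool": "search_web", "args": {"query": task}}
    return {"tool": "finish",
            "summary": f"Summary based on {len(history)} observation(s)."}

def run_agent(task, max_steps=10):
    history = []                       # Observe: results gathered so far
    for _ in range(max_steps):
        action = think(task, history)  # Think: decide the next step
        if action["tool"] == "finish":
            return action["summary"]
        result = TOOLS[action["tool"]](**action["args"])  # Act: call a tool
        history.append(result)         # Repeat: feed the outcome back in
    return "Stopped: step budget exhausted."

print(run_agent("research this competitor"))
```

Note the `max_steps` cap: even this toy version needs a budget, because the loop, not the human, decides when the task is done.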
Tools let agents interact with the world. Without tools, an agent can only think. With tools, it can act. Your existing APIs can become tools—but often need simplifying first.
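A sketch of what "simplifying an API into a tool" can mean in practice. The CRM endpoint and field names below are invented for illustration; the point is the shape: a narrow function with a plain-language description the model can reason about, returning only the fields the agent needs rather than the full API response.

```python
def lookup_customer(email: str) -> dict:
    """Return name, plan, and open ticket count for a customer email."""
    # In production this would call your CRM API; here the response is
    # hard-coded. Note how internal fields are stripped before returning.
    record = {"name": "Ada Example", "plan": "enterprise",
              "open_tickets": 2, "internal_billing_codes": ["X17", "Q9"]}
    return {k: record[k] for k in ("name", "plan", "open_tickets")}

# The description an agent framework exposes to the model (exact schema
# format varies by framework):
LOOKUP_CUSTOMER_TOOL = {
    "name": "lookup_customer",
    "description": "Look up a customer's name, plan, and open ticket count by email.",
    "parameters": {"email": "string"},
}

print(lookup_customer("ada@example.com"))
```

A narrow, well-described tool like this is easier for the model to use correctly than a raw endpoint with dozens of parameters and a deeply nested response.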
IDENTIFYING OPPORTUNITIES
Ask your team:
- "What do you wish you never had to do again?"
- "What do you wish you could do, but always assumed you can't?"
The answers will reveal agent opportunities — triaging leads, initial contract reviews, monitoring publications, qualifying inquiries, compiling reports.
The best candidates:
- Skilled people on low-value work — Lawyers reviewing standard contracts. Analysts compiling weekly reports.
- Frequent enough to matter — Daily or weekly tasks, not annual reviews.
- Requires judgement — Some reasoning needed, not just if/then rules.
- Data is accessible — Information lives in systems you can connect to.
- Stakes are manageable — Agent drafts, human approves. Errors are fixable.
CASE STUDY EXAMPLES
The following examples have been collated from our work at Barnacle Labs and from public case studies:
Lead Generation: 15 hours of work becomes 15 minutes of review time.
Legal Research: A week of a junior's work becomes an hour of review.
Brief Writing: A day of work turns into a few minutes of review.
Customer Support at Scale: 100,000 emails eliminated annually.
Research Monitoring: Researchers stay current without spending hours scanning journals.
Drug Discovery (Google's Co-Scientist): Novel drug candidates for acute myeloid leukaemia and liver fibrosis, identified by the agent and now being tested in labs.
Contract and Document Analysis: Hours of manual review reduced to minutes.
Customer Due Diligence: A radical acceleration in the pace at which complex due diligence can be run.
Ontology Generation: Work that was never done, because it was too expensive to do with humans, becomes possible.
Newsletter Generation: 6 hours of manual work transformed into 30 minutes of review.
Agent Opportunity Workbook
Turn this guide into a concrete shortlist of agent candidates in your organisation.
We've put together a simple workbook we use with enterprise teams:
- How to prompt your team to generate ideas
- A scoring sheet for potential use cases
- A one-page summary template for execs
Drop your email below and we'll send the workbook and the white paper together.
KEY RISKS
Every risk below is manageable — but not automatically. The difference between an agent that works and one that embarrasses you is in the implementation details. These are the risks we've learned how to design around.
- Security — Agents with tool access can cause damage. Mitigation: Read-only where possible. Approval gates for writes. Log everything.
- Hallucination — Agents can invent facts and find 'evidence' to support them. Mitigation: Human review on outputs. Citation requirements. Verification steps.
- Cost — Complex tasks trigger 50+ API calls. Mitigation: Monitor spending. Set budgets. Optimise prompts.
- Bias — Models inherit training data biases. Mitigation: Test for bias in your domain. Human oversight on sensitive decisions.
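Two of the mitigations above, approval gates for writes and spend budgets, can be sketched as a thin wrapper around tool calls. The cost figures and the approval callback below are placeholders; real systems route approvals to a person or a policy engine.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

class BudgetExceeded(Exception):
    pass

class GuardedAgent:
    def __init__(self, budget_usd, approve):
        self.spent = 0.0
        self.budget = budget_usd
        self.approve = approve  # callback: human/policy approval for writes

    def call_tool(self, name, is_write, est_cost_usd, fn, *args):
        # Cost mitigation: refuse calls that would blow the budget.
        if self.spent + est_cost_usd > self.budget:
            raise BudgetExceeded(f"{self.spent + est_cost_usd:.2f} > {self.budget:.2f}")
        # Security mitigation: writes need approval; reads pass through.
        if is_write and not self.approve(name, args):
            log.info("write blocked pending approval: %s", name)
            return None
        self.spent += est_cost_usd
        # Log everything: tool, arguments, cost, running total.
        log.info("tool=%s args=%s cost=%.2f total=%.2f",
                 name, args, est_cost_usd, self.spent)
        return fn(*args)

agent = GuardedAgent(budget_usd=1.00, approve=lambda name, args: False)
result = agent.call_tool("send_email", True, 0.02,
                         lambda to: f"sent to {to}", "x@example.com")
print(result)  # None: the write was blocked by the approval gate
```

The useful property of this shape is that every tool call, whatever the agent decides to do, passes through one audited, budgeted chokepoint.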
GETTING STARTED
Forget roadmaps, maturity models, and six-month planning phases. Agents reward speed. The only way to learn what works is to build one.
- Pick a specific problem. Not "improve customer service" but "summarise support tickets and suggest responses".
- Choose your technology. Pick a model provider (OpenAI, Anthropic, Google) and a framework. Treat the framework as scaffolding—you'll likely change it. Skip Small Language Models for now. Prove what agents can do first; optimise later.
- Build tools. Tools are the foundation. Good design means better agent behaviour.
- Build evals. Your agent is only as good as the tests that prove it works.
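A minimal sketch of what an eval harness can look like, using the ticket-summarisation example above. The agent is stubbed and the pass criteria are illustrative assumptions; real evals would use your own cases and richer checks (LLM-as-judge, citation checks, and so on).

```python
def agent(ticket: str) -> str:
    # Stand-in for the real agent under test.
    return f"Summary: {ticket[:40]}"

# Golden cases: input plus the terms a correct output must mention.
EVAL_CASES = [
    {"input": "Customer cannot log in after password reset",
     "must_contain": ["log in"]},
    {"input": "Invoice shows wrong VAT rate for March",
     "must_contain": ["VAT"]},
]

def run_evals():
    passed = 0
    for case in EVAL_CASES:
        output = agent(case["input"])
        if all(term in output for term in case["must_contain"]):
            passed += 1
    return passed, len(EVAL_CASES)

print(run_evals())
```

Run this on every prompt or model change; a dropping pass rate tells you a "small tweak" broke something before your users do.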
SCALING CONSIDERATIONS
One agent works. How do you build ten more? Or fifty?
Scaling isn't just replication. It needs systematic approaches that a single prototype didn't require.
- Reusable tools — Don't let every team start from scratch. Build a common approach to tools.
- Standardised infrastructure — Multiple teams and agents handling daily tasks need shared infrastructure.
- Governance — Security policies, cost tracking, audit logging, approval workflows.
- Evaluation automation — Evals prove agents work. Standardise how you run them so teams decide on data, not gut feel.
- Small language models — For high-volume or privacy-sensitive tasks, smaller fine-tuned models can cut costs and increase privacy—but require up-front investment.
BE BOLD
The real risk isn't failure—it's timid pilots that prove modest value and kill momentum.
A cautious prototype saving a few hours a month won't capture attention. It won't justify what comes next.
Agent technology improves rapidly. What's impossible today will be routine in twelve months. Build your first agent to prove what's possible now—but plan for a future where agents handle work you haven't yet imagined automating.
The organisations that win will move boldly while competitors are still planning.
Ready to explore agents?
Schedule a 30-minute discovery call with our team. We'll explore your situation and share what we've seen work.
Not ready for a call yet?
Stay in the loop as we publish more agent research, architectures, and case studies.