A Strategic Guide for Enterprise Leaders
AI AGENTS
This guide embodies our experience at Barnacle Labs building agents at scale for large enterprises. It's for anyone exploring AI agents who wants to cut through the noise and find real opportunities.
Our team has over a decade of experience delivering AI solutions, led by the former European CTO for IBM Watson. We believe in AI for breakthroughs, not buzzwords—so we created this document to distil our practical experience of building agents in production, not theory.
We'll email you the AI Agents white paper (PDF) and the Agent Opportunity Workbook to run a 60–90 minute workshop with your team.
THE OPPORTUNITY
Agents move us from AI that answers questions to AI that researches, reasons, and acts.
This unlocks two types of work:
Work that resisted automation — too open-ended or too judgement-heavy for traditional software.
Work we never bothered with — every business has a backlog of “nice to haves”: reviewing every supplier contract for risk, writing personalised outreach for 50,000 prospects, monitoring 200 journals a day. Valuable work, but tedious, judgment-heavy, and high-volume—so we never did it, because tech couldn’t handle it and humans were too expensive.
Studies suggest the economic potential of automating this work is significant: MIT's Iceberg Index estimates it could represent 11.7% of the US labour market, or approximately $1.2 trillion of economic activity. McKinsey estimate the global opportunity at $2.9 trillion.
The hard part is imagination. It's difficult to spot work you've never thought about doing. You need to look at what you do today and ask how it might be done differently.
CUT THROUGH THE NOISE
Marketing promises "autonomous digital employees." Academic papers offer impenetrable taxonomies. Neither helps you get started.
Reality check: Agents work best on specific, well-defined tasks—ones you do often enough to be annoying or costly.
"Can I replace my finance department with an agent?" Not yet. Focus on automating tasks, not roles.
But the frontier moves fast. Google's co-scientist discovers drug candidates. Gamma generates presentations. 19pine handles refunds autonomously. These are well-funded, productised bets—more than simple agents. They show what's possible when conviction and investment are high.
WHAT ARE AGENTS
Everyone uses the word "agent." They often mean different things.
An agent is AI that acts. It doesn't just answer — it reasons, uses tools, pursues goals.
Give an agent "research this competitor" and it searches the web, queries APIs, tries alternative sources, compiles a summary. No human tells it how. It works it out.
The key distinction: Agents decide what to do next. They adapt to what they find. Traditional automation follows rigid scripts.
WORKFLOWS VS AGENTS
The industry conflates two different things. Some products marketed as "agents" are workflows with an LLM bolted on.
Workflows: Human designs the steps. Tool executes them, maybe using AI on some steps. Edge case? It breaks. Workflows can't reason.
Agents: AI designs the steps. Give it a goal and tools. It figures out the path—and adapts when things change. Agents handle problems that need judgement, or are too open-ended for pre-defined workflows.
Both are useful. But conflating them means buying workflow tools expecting agent capabilities.
WHAT MAKES AN AGENT
Every agent has the same core capability—a loop that continues until the task is done. This is The Agentic Loop:
1. Observe — Receive input: a task, context, or feedback from the last action.
2. Think — Reason about current state. What's done? What's next? Which tool?
3. Act — Call a tool: search the web, query a database, write a file, send an email.
4. Repeat — Result feeds back. Agent observes the outcome and decides what's next.
This loop is what makes agents different. An agent might spend minutes exploring a topic, making multiple AI calls, before reaching a conclusion. It can reason about problems in surprisingly sophisticated ways. No loop? Not a true agent.
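The four steps above can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the reasoning step is stubbed out (a real agent would call an LLM there and parse its chosen action), and `search_web` is a hypothetical tool.

```python
def search_web(query):
    # Hypothetical tool: a real agent would call a search API here.
    return f"results for '{query}'"

TOOLS = {"search_web": search_web}

def think(task, history):
    # Stub for the Think step. A real implementation would send the task
    # and history to an LLM and parse the action it chooses.
    if not history:
        return {"tool": "search_web", "args": {"query": task}}
    return {"tool": "finish",
            "summary": f"Summary based on {len(history)} observation(s)."}

def run_agent(task, max_steps=10):
    history = []                       # Observe: results gathered so far
    for _ in range(max_steps):
        action = think(task, history)  # Think: decide the next step
        if action["tool"] == "finish":
            return action["summary"]
        result = TOOLS[action["tool"]](**action["args"])  # Act: call a tool
        history.append(result)         # Repeat: feed the outcome back in
    return "Stopped: step budget exhausted."

print(run_agent("research this competitor"))
```

Note the `max_steps` cap: even this toy version needs a budget, because the loop, not the human, decides when the task is done.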
Tools let agents interact with the world. Without tools, an agent can only think. With tools, it can act. Your existing APIs can become tools—but often need simplifying first.
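A sketch of what "simplifying an API into a tool" can mean in practice. The CRM endpoint and field names below are invented for illustration; the point is the shape: a narrow function with a plain-language description the model can reason about, returning only the fields the agent needs rather than the full API response.

```python
def lookup_customer(email: str) -> dict:
    """Return name, plan, and open ticket count for a customer email."""
    # In production this would call your CRM API; here the response is
    # hard-coded. Note how internal fields are stripped before returning.
    record = {"name": "Ada Example", "plan": "enterprise",
              "open_tickets": 2, "internal_billing_codes": ["X17", "Q9"]}
    return {k: record[k] for k in ("name", "plan", "open_tickets")}

# The description an agent framework exposes to the model (exact schema
# format varies by framework):
LOOKUP_CUSTOMER_TOOL = {
    "name": "lookup_customer",
    "description": "Look up a customer's name, plan, and open ticket count by email.",
    "parameters": {"email": "string"},
}

print(lookup_customer("ada@example.com"))
```

A narrow, well-described tool like this is easier for the model to use correctly than a raw endpoint with dozens of parameters and a deeply nested response.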
IDENTIFYING OPPORTUNITIES
Ask your team:
- "What do you wish you never had to do again?"
- "What do you wish you could do, but always assumed you can't?"
The answers will reveal agent opportunities — triaging leads, initial contract reviews, monitoring publications, qualifying inquiries, compiling reports.
The best candidates:
- Skilled people on low-value work — Lawyers reviewing standard contracts. Analysts compiling weekly reports.
- Frequent enough to matter — Daily or weekly tasks, not annual reviews.
- Requires judgement — Some reasoning needed, not just if/then rules.
- Data is accessible — Information lives in systems you can connect to.
- Stakes are manageable — Agent drafts, human approves. Errors are fixable.
CASE STUDY EXAMPLES
The following examples have been collated from our work at Barnacle Labs and from public case studies:
Lead Generation: 15 hours of work becomes 15 minutes of review time.
Legal Research: A week of a junior's work becomes an hour of review.
Brief Writing: A day of work turns into a few minutes of review.
Customer Support at Scale: 100,000 emails eliminated annually.
Research Monitoring: Researchers stay current without spending hours scanning journals.
Drug Discovery (Google's Co-Scientist): Novel drug candidates for acute myeloid leukaemia and liver fibrosis, identified by the agent and now being tested in labs.
Contract and Document Analysis: Hours of manual review reduced to minutes.
Customer Due Diligence: A radical acceleration in the pace at which complex due diligence can be run.
Ontology Generation: Work that was never done, because it was too expensive to do with humans, becomes possible.
Newsletter Generation: 6 hours of manual work transformed into 30 minutes of review.
Agent Opportunity Workbook
Turn this guide into a concrete shortlist of agent candidates in your organisation.
We've put together a simple workbook we use with enterprise teams:
- How to prompt your team to generate ideas
- A scoring sheet for potential use cases
- A one-page summary template for execs
Drop your email below and we'll send the workbook and the white paper together.
KEY RISKS
Every risk below is manageable — but not automatically. The difference between an agent that works and one that embarrasses you is in the implementation details. These are the risks we've learned how to design around.
- Security — Agents with tool access can cause damage. Mitigation: Read-only where possible. Approval gates for writes. Log everything.
- Hallucination — Agents can invent facts and find 'evidence' to support them. Mitigation: Human review on outputs. Citation requirements. Verification steps.
- Cost — Complex tasks trigger 50+ API calls. Mitigation: Monitor spending. Set budgets. Optimise prompts.
- Bias — Models inherit training data biases. Mitigation: Test for bias in your domain. Human oversight on sensitive decisions.
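Two of the mitigations above, approval gates for writes and spend budgets, can be sketched as a thin wrapper around tool calls. The cost figures and the approval callback below are placeholders; real systems route approvals to a person or a policy engine.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

class BudgetExceeded(Exception):
    pass

class GuardedAgent:
    def __init__(self, budget_usd, approve):
        self.spent = 0.0
        self.budget = budget_usd
        self.approve = approve  # callback: human/policy approval for writes

    def call_tool(self, name, is_write, est_cost_usd, fn, *args):
        # Cost mitigation: refuse calls that would blow the budget.
        if self.spent + est_cost_usd > self.budget:
            raise BudgetExceeded(f"{self.spent + est_cost_usd:.2f} > {self.budget:.2f}")
        # Security mitigation: writes need approval; reads pass through.
        if is_write and not self.approve(name, args):
            log.info("write blocked pending approval: %s", name)
            return None
        self.spent += est_cost_usd
        # Log everything: tool, arguments, cost, running total.
        log.info("tool=%s args=%s cost=%.2f total=%.2f",
                 name, args, est_cost_usd, self.spent)
        return fn(*args)

agent = GuardedAgent(budget_usd=1.00, approve=lambda name, args: False)
result = agent.call_tool("send_email", True, 0.02,
                         lambda to: f"sent to {to}", "x@example.com")
print(result)  # None: the write was blocked by the approval gate
```

The useful property of this shape is that every tool call, whatever the agent decides to do, passes through one audited, budgeted chokepoint.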
GETTING STARTED
Forget roadmaps, maturity models, and six-month planning phases. Agents reward speed. The only way to learn what works is to build one.
- Pick a specific problem. Not "improve customer service" but "summarise support tickets and suggest responses".
- Choose your technology. Pick a model provider (OpenAI, Anthropic, Google) and a framework. Treat the framework as scaffolding—you'll likely change it. Skip Small Language Models for now. Prove what agents can do first; optimise later.
- Build tools. Tools are the foundation. Good design means better agent behaviour.
- Build evals. Your agent is only as good as the tests that prove it works.
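A minimal sketch of what an eval harness can look like, using the ticket-summarisation example above. The agent is stubbed and the pass criteria are illustrative assumptions; real evals would use your own cases and richer checks (LLM-as-judge, citation checks, and so on).

```python
def agent(ticket: str) -> str:
    # Stand-in for the real agent under test.
    return f"Summary: {ticket[:40]}"

# Golden cases: input plus the terms a correct output must mention.
EVAL_CASES = [
    {"input": "Customer cannot log in after password reset",
     "must_contain": ["log in"]},
    {"input": "Invoice shows wrong VAT rate for March",
     "must_contain": ["VAT"]},
]

def run_evals():
    passed = 0
    for case in EVAL_CASES:
        output = agent(case["input"])
        if all(term in output for term in case["must_contain"]):
            passed += 1
    return passed, len(EVAL_CASES)

print(run_evals())
```

Run this on every prompt or model change; a dropping pass rate tells you a "small tweak" broke something before your users do.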
SCALING CONSIDERATIONS
One agent works. How do you build ten more? Or fifty?
Scaling isn't just replication. It needs systematic approaches that a single prototype didn't require.
- Reusable tools — Don't let every team start from scratch. Build a common approach to tools.
- Standardised infrastructure — Multiple teams and agents handling daily tasks need shared infrastructure.
- Governance — Security policies, cost tracking, audit logging, approval workflows.
- Evaluation automation — Evals prove agents work. Standardise how you run them so teams decide on data, not gut feel.
- Small language models — For high-volume or privacy-sensitive tasks, smaller fine-tuned models can cut costs and increase privacy—but require up-front investment.
BE BOLD
The real risk isn't failure—it's timid pilots that prove modest value and kill momentum.
A cautious prototype saving a few hours a month won't capture attention. It won't justify what comes next.
Agent technology improves rapidly. What's impossible today will be routine in twelve months. Build your first agent to prove what's possible now—but plan for a future where agents handle work you haven't yet imagined automating.
The organisations that win will move boldly while competitors are still planning.
Ready to explore agents?
Schedule a 30-minute discovery call with our team. We'll explore your situation and share what we've seen work.
Not ready for a call yet?
Stay in the loop as we publish more agent research, architectures, and case studies.