
From AI Pilots to Revenue Impact: What MIT Research Reveals About Closing the Operational AI Gap


Most revenue teams aren't failing at AI because they picked the wrong model. They're failing because they skipped the operational foundations that make AI work in the first place.

That's the central finding from a research study Celigo commissioned with MIT Technology Review — and it's the thread that ran through every minute of this RevOps Co-op webinar. Brian Knaupe, VP of GTM Operations at Celigo, was joined by Sandeep Gaddam, Director of AI Operations and Development, and Youssef Zouhairi, Senior Solutions Consultant — with Matthew Volm moderating. Together, they walked through what separates the organizations scaling AI from those perpetually stuck in pilot purgatory, and then demonstrated exactly how Celigo's own GTM team has put those lessons into production.

The Research Finding Every RevOps Team Needs to Hear

The MIT study surveyed more than 500 technology executives and drew a clear line between companies scaling AI and companies still running experiments. The finding that jumped out: 90% of organizations with AI workflows in production relied on integration platforms. Not a specific model. Not a particular vendor. Integration infrastructure.

The implications for RevOps operators are direct. The gap isn't an AI gap. It's an operational gap.

"The problems typically aren't the AI themselves, it's the operational gap. It's the data silos, undefined processes, and lack of governance." — Brian Knaupe, VP of GTM Operations at Celigo

If that framing sounds familiar, it should. Anyone who has sat through an executive meeting where three people cite three different dashboards and arrive at three different numbers has already lived this problem. AI didn't create the underlying issue; it just amplifies it. This aligns with a broader truth that RevOps teams have been wrestling with for years: why most revenue stacks aren't ready for AI — and what that's costing you.

The second research finding is equally important: successful AI implementations draw on more than one data source. Context, it turns out, is the decisive variable — not model sophistication.

Process Documentation Is the Prerequisite, Not an Afterthought

Before any agent, any automation, any AI feature can deliver reliable results, the underlying process has to be documented. This sounds obvious. In practice, it's the step most teams skip.

Brian shared a principle he picked up early at Celigo from his colleague Sandeep: "If you could do it manually and write it down or record yourself on a camera, we can automate it." That constraint is actually a gift. It forces a clarity that most processes never receive.

"Write it down and then build it. If you can document it, you can automate it — you can add AI to it." — Brian Knaupe

The research reinforced this: AI success correlates directly with process maturity. Organizations that had already defined, measured, and instrumented their processes were able to layer automation, and then AI, on top of existing foundations. Organizations without that groundwork found themselves in a loop, deploying tools to solve problems they couldn't fully describe.

For RevOps operators, this is not new territory. The process of building process is already central to the function. What's new is that the stakes for skipping it are now much higher. Undefined process doesn't just produce inconsistent reporting — it produces confidently wrong AI outputs.

Context Is King: Why More AI Isn't Smarter AI

One of the more counterintuitive takeaways from the session is that stacking more AI tools doesn't compound the value — it often dilutes it. Most organizations are running AI out of individual SaaS platforms, each operating with a narrow slice of context. The sum of those tools is not a coherent intelligence layer.

Brian demonstrated this with a practical thought experiment: take an open opportunity from your CRM, pull all the call transcripts, drop them into your large language model (LLM) of choice, and ask whether you're going to win the deal. Then reload the conversation with your sales process documentation, deal qualification framework, and any other institutionalized process artifacts — and ask a series of smaller, specific questions before repeating the big one.

The answer changes. Often dramatically.

"More AI isn't necessarily smarter AI. The combination of context, small questions, provides a better answer in the end." — Brian Knaupe

This principle governs how Celigo designs its internal agentic workflows: start with the right data sources, define what context the AI actually needs, ask precise questions, and synthesize the responses — rather than asking one overloaded question and hoping for a reliable result. For more on how teams are building this kind of data-first approach, Episode 50: Thinking of AI? Think Data First covers the underlying strategy in depth.

The Customer 360 Agent: Turning Prep Time Into Pipeline Intelligence

The first live demonstration illustrated how Celigo's own sales team operationalized the context principle in a workflow they use daily. Before any customer call, a sales rep traditionally needs to comb through the CRM, Gong transcripts, ERP records, Zendesk support tickets, and Snowflake usage data — a process that could consume 30 minutes to an hour of preparation time.

Celigo built a Slack-native agent to collapse that work into seconds. A rep types a slash command (/account summary [account name]), the agent queries across all connected systems, synthesizes the data, and returns a branded PDF via Google Drive link — ready before the call starts.
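The fan-out-and-synthesize pattern behind that command can be sketched in a few lines. This is an illustrative stand-in, not Celigo's actual connectors: the source names, fields, and values here are invented for the example, and each stub function represents one narrow query against a connected system.

```python
# Hypothetical stand-ins for the systems a /account summary command would query.
def query_crm(account):     return {"stage": "Negotiation", "owner": "A. Rep"}
def query_support(account): return {"open_tickets": 2}
def query_usage(account):   return {"monthly_runs": 1450}

SOURCES = {"crm": query_crm, "support": query_support, "usage": query_usage}

def account_summary(account: str) -> dict:
    """Fan out one narrow query per system, then merge into a single context object."""
    context = {"account": account}
    for name, fetch in SOURCES.items():
        context[name] = fetch(account)
    return context

summary = account_summary("Acme Chocolatier")
```

The synthesis step (turning the merged context into a branded PDF) would sit downstream of this; the key design point is that each system is queried with a small, specific request rather than one giant prompt.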

What makes this more than a time-saving trick is what the AI extracts from unstructured data. Youssef walked through the output sections:

  • Business overview and stakeholder map — including contacts mentioned in transcripts who were never entered into the CRM
  • Financial summary — pulled from ERP records
  • Support ticket context — Zendesk ticket history with full comment threads distilled into relevant signals
  • Red flags and risks — extracted directly from call transcripts, capturing complaints or concerns that reps may not have logged
  • Strategic priorities — inferred from conversation patterns, surfacing what the customer actually wants to discuss
  • Growth opportunity predictions — based on the full data picture, not just current contract scope

"There is no risk now of forgetting the important notes in our CRMs — we can extract this information immediately and directly from the transcripts." — Youssef Zouhairi, Senior Solutions Consultant at Celigo

The CRM augmentation aspect is particularly relevant for RevOps operators thinking about data management and CRM data quality. Rather than relying on rep discipline to update records after every call, the agent pulls signal from transcripts and pushes it back into Salesforce automatically — reducing the dependency on human data entry without eliminating the human judgment that determines what matters.

Building Governance Into AI: The Guardrails Architecture

As Celigo expanded its use of AI agents — including customer-facing deployments — the team built an explicit governance layer into every workflow. Youssef demonstrated this through a customer-facing chatbot example: a commerce experience ("Acme Chocolatier") where an AI agent handles order tracking, returns, refunds, and product selection.

The guardrails architecture operates as a filter on every agent response before it reaches the end user. The components include:

  • PII detection and masking — identifies personally identifiable information such as credit card numbers or personal names and masks them before delivery
  • Content moderation — flags responses that fall into specified risk categories so a human can review before they're sent
  • Custom policy enforcement — accepts an organization's own compliance policies as input, and evaluates every agent response against them

"You can't be serious if you don't have some governance around it. So Celigo makes it really easy to do that." — Youssef Zouhairi

The hallucination question came up directly from the audience, and the answer the team gave was structural rather than theoretical: break large questions into smaller parts, constrain the context deliberately (send only the relevant data, not everything you have), build the guardrails layer, and keep a human in the loop at critical output points — particularly during early deployment. This mirrors the broader principle from the MIT research: the organizations winning at AI aren't trusting models blindly, they're building verification into the workflow architecture itself.

The Outbound Prospecting Engine: Agentic AI at Scale

Sandeep walked through the most technically ambitious example of the session: a multi-agent outbound prospecting system built entirely on Celigo, designed to help the GTM team identify and reach qualified prospects with hyper-personalized outreach — without manual research at any step.

The system chains four agents in sequence:

1. Lookalike Prospects Agent — Given a won customer, this agent searches the public web to identify companies with similar business models, revenue profiles, and market segments. In the demo, it returned 36 comparable companies for a single input account.

2. Enrichment Agent — Takes the lookalike list and enriches each company with firmographic data: location, employee count, estimated annual recurring revenue (ARR), and funding stage. The enriched data is then filtered through Celigo's internal ideal customer profile (ICP) criteria to produce a prioritized target list.

3. Get Contacts Agent — For each prioritized prospect, this agent queries ZoomInfo and LinkedIn Sales Navigator to surface the right contacts within the buying group, then validates email deliverability through tools like ZeroBounce. Output: the top five contacts per prospect, by role, with verified contact information.

4. Prospecting Email Agent — The most compositional of the four: it calls the contacts agent, a tech stack agent (which reads job postings and uses tools like BuiltWith to map the prospect's technology landscape), and a cohort use cases agent (which matches the prospect's profile to Celigo's existing customers in similar industries). The LLM then generates role-specific, personalized outreach emails — with messaging tailored to a VP of Operations versus an IT team member versus an integration architect.

"The goal of this agent is to invoke different agents, gather the information, and based on the role of the person we are trying to reach out to — understanding their pain point, using our own internal knowledge base — we generate hyper-personalized emails." — Sandeep Gaddam, Director of AI Operations and Development at Celigo

The completed emails are staged in Gong for rep review. The human-in-the-loop checkpoint is the final send decision — not the research, not the writing, not the qualification. The automated version of this workflow runs in the background continuously: when a deal is won, the lookalike research triggers automatically, contacts are enriched on a daily cycle, and reps start each morning with a reviewed-and-ready outbound queue. This kind of AI-assisted approach is part of a broader shift in how cold prospecting motions are being rebuilt across RevOps teams.

When Not to Build an Agent: The Deterministic vs. Agentic Decision

One of the most practically useful moments in the session was a direct challenge to the instinct to reach for autonomous agents as the default answer. Both Youssef and Matthew pushed back on this — clearly and without hedging.

Autonomous agents are more expensive to run (each step requires additional LLM calls), harder to control, and often unnecessary for processes that are already well-defined. The smarter default, Youssef argued, is to build deterministic automation with a "flavor of agentic" — using the AI where it genuinely adds value and keeping the rest of the workflow in a conventional, rule-based process.

"Not everything needs to be built as an autonomous agent. A lot of business processes out there can be built as deterministic with just a flavor of agentic." — Youssef Zouhairi

Matthew reinforced this from the operator side:

"Just because you can build an agent to do a thing doesn't mean that you should, because sometimes the good old-fashioned deterministic workflow is actually the best way to do things." — Matthew Volm

The accounts receivable example Youssef referenced earlier in the session is a clean illustration of this principle: 50% of AR inquiries were generic enough to automate. Rather than building a fully autonomous agent, the team built a workflow that drafts a response, moves it to a review queue, and waits for a human to click send. The AI handles the drafting; the person handles the judgment. One LLM call, not many — and the human stays accountable for what goes out. This distinction between where automation fits and where it doesn't is something every RevOps team navigating AI tool decisions needs to build into their evaluation framework.

Brian's deal classification example lands the same point from a different angle. The team started with a single complex prompt asking the AI to classify deals. It didn't work reliably. The breakthrough came from decomposing the problem: identify the three variables that drive deal classification, ask three specific questions, collect three answers, then synthesize. The approach that looked simpler — one big question — was actually harder for the model. The approach that looked more complex — three small questions — produced consistent, reliable results.

Key Takeaways for RevOps Teams

  • Process documentation is the prerequisite. If you can't write it down or record yourself doing it manually, you can't automate it and you can't add AI to it. Document first, build second.
  • Context beats model sophistication. More data sources don't automatically produce better AI outputs. Deliberate, structured context — selecting the right data and asking precise questions — is what drives reliable results.
  • Integrate before you automate. The MIT research finding is unambiguous: organizations scaling AI are running integration platforms that connect their data sources. Siloed tools produce siloed intelligence.
  • Build governance in, not on. Guardrails, PII detection, and human-in-the-loop checkpoints aren't optional additions — they're part of the architecture from day one, especially for customer-facing deployments.
  • Deterministic workflows with a flavor of agentic often outperform fully autonomous agents. Evaluate each process individually. Autonomous agents are more expensive per workflow run and harder to control. Use them only where genuine unpredictability requires them.
  • Human in the loop is not a failure mode. Keeping a person at the final send or approval step — especially during early deployment — is a feature of well-designed AI workflows, not a sign that automation isn't working.

The message from this session is clear: the operational foundations that RevOps teams have always been responsible for — clean data, defined processes, connected systems, clear governance — are now the exact foundations that determine AI success. The tools have changed. The discipline hasn't.

To learn more about how Celigo helps RevOps and GTM teams automate workflows and deploy AI across the revenue stack, visit celigo.com. You can also access the MIT Technology Review research report referenced throughout this session via the QR code shared during the webinar or by reaching out to the Celigo team directly.

Looking for more great content?

Check out our blog, join our community and subscribe to our YouTube Channel for more insights.

And be sure to check out the Celigo website for more resources on AI-powered workflow automation and integration platforms.
