Deploying agentic AI at enterprise scale with Amazon Bedrock AgentCore

  • Blog
  • 13 minute read
  • April 10, 2026

Joe Williams

Principal, Advisory, Cloud Engineering, Data & Analytics, PwC US

Santhosh Chittor Sundaram

Sr. Manager, Advisory, Cloud Engineering, Data & Analytics, PwC US

Yash Munsadwala

Manager, Advisory, Cloud Engineering, Data & Analytics, PwC US

99%

Reduction in time-to-insight when moving from engineering queues to self-service agent workflows

PwC client implementation. Based on an anonymized enterprise deployment using Amazon Bedrock AgentCore
Minutes

Time required for analysts to generate SQL queries previously handled by engineering teams

PwC analysis. Observed during production deployment

The corporate landscape is littered with compelling AI demos that never make it to production.

The reason? As business leaders look to unleash agentic AI, they tend to overweight autonomy—systems that plan, decide, and act end-to-end with minimal human involvement. These examples work well in controlled environments, but then they collide with the real world: enterprise settings where governance, traceability, and system boundaries are non-negotiable.

Automation encounters a different problem. It executes predefined steps reliably and at scale. But it was not designed to coordinate tasks and processes across systems, applications, databases, and context.

The result? A gap that creates bottlenecks and execution problems. While most enterprise workflows appear automated, they depend on human coordination behind the scenes. Someone must bridge tools, interpret outputs, and determine next steps. The problem isn't execution; it's orchestration.

Agentic systems address this challenge by redefining the human role rather than eliminating it. People still oversee decision-making. What changes is how work takes place: agents interpret intent, route tasks across systems, and assemble outcomes that previously required multiple handoffs.

The goal is not to increase autonomy; it's to eliminate unnecessary coordination while preserving strong governance and control. As an organization builds AI agents into workflows, it's critical to recognize that this isn't a theoretical issue: success hinges on work actually getting done.

What follows is an actual deployment. PwC built a multi-agent system on Amazon Bedrock AgentCore for a global enterprise. This case study looks at the challenges, the architecture, how decisions unfolded, and the outcomes that followed.  

Understanding the challenge

The company had invested heavily in cloud infrastructure and data platforms. But the financial analysts (FinOps) responsible for analyzing and optimizing enterprise cloud spend still could not advance from a question to an answer without help.

The instinct was to solve the problem with fully autonomous agents. However, the analysts worked inside tightly governed financial systems where every query touched sensitive cost data and every recommendation fed executive budget decisions. An autonomous system operating without oversight wasn’t viable. What the company needed wasn’t more autonomy—it was less dependency: a system that could handle coordination and reasoning across tools and workflows without displacing human judgment.

These cloud financial analysts understood cost structures, optimization levers, and business context well. But acting on this information required SQL they couldn’t write and datasets they couldn’t locate independently. This led to long waits and diminished productivity. Essentially, every question required a handoff, and every handoff introduced a delay.

The SQL bottleneck was particularly problematic. The underlying data was already structured, accessible, and available by query. But without the ability to write SQL, it was largely unusable for more advanced AI and automation. Data requests followed the same path: a FinOps analyst identified a question, pushed it to the data engineering team, and waited for that team to write the query, run it against the relevant warehouse tables, and return results. Sometimes this took place within a day, but it could take much longer, depending on the current workload.

This was not a data quality problem or an access problem. It was a dependency problem. Analysts with deep domain expertise were structurally reliant on a separate team for every quantitative question they needed to answer. And that bottleneck compounded across every workflow downstream: reporting cycles, optimization reviews, and executive briefings. Everyone waited on a query that had not yet been written.

The problem wasn’t engineering expertise; it was the time required to handle the engineering tasks. 

Why traditional automation isn’t enough

The company had already tried automating individual steps. Manual reports had given way to scheduled reports; repetitive tasks had been replaced by scripted data pulls. But answering a basic question like "What is the current status of this optimization initiative?" required knowing which system holds the data, how to interpret findings, and how to present the information in a practical and useful way. Rule-based automation cannot tackle this level of complexity.

"The bottleneck was never an individual task. It was everything required to connect one task to the next: knowing where to look, interpreting what came back, and deciding what to do with it. That is not automation. That is reasoning."

The multi-agent solution

PwC designed a multi-agent system on Amazon Bedrock AgentCore, with each agent handling a specific workflow that had previously required manual intervention or an engineering dependency. Rather than building a single general-purpose assistant, the system breaks the problem down into four specialist agents coordinated by a Supervisor Agent. Each addresses a distinct domain, requires independent governance, and is deployed as its own runtime.

SQL Generator Agent

The SQL Generator Agent allows FinOps analysts to convert natural language questions into secure, reviewable SQL queries for BigQuery. It bridges the gap between business intent and data access, removing the need for engineering teams to manage routine data retrieval. An analyst submits a plain-language request, and the agent generates a draft query grounded in approved dataset schemas. The agent also functions as a dataset discovery tool: when an analyst is unsure which dataset to use, it interprets the intent behind the question, searches the approved dataset catalog using metadata-driven matching, and recommends the most relevant source before generating the query. Strong guardrails enforce SELECT-only patterns, a 50-row result limit, and schema restrictions that prevent queries against unauthorized sources. The agent never executes SQL. It generates the query and hands it to the analyst for review. One click opens Superset SQL Lab with the SQL loaded and ready to run. What previously took 8+ hours in an engineering queue now takes place in minutes.
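
The guardrail pattern described above can be sketched in a few lines of Python. This is a simplified illustration, not PwC's implementation: the dataset names and regex rules here are assumptions, and a production validator would parse SQL properly rather than pattern-match.

```python
import re

# Hypothetical approved catalog; the real system loads version-controlled schemas
ALLOWED_DATASETS = {"finops.billing", "finops.usage"}
MAX_ROWS = 50

def validate_sql(sql: str) -> list[str]:
    """Return guardrail violations; an empty list means the draft query passes."""
    violations = []
    stripped = sql.strip().rstrip(";")
    # SELECT-only: reject anything that does not begin as a SELECT statement
    if not re.match(r"(?is)select\b", stripped):
        violations.append("only SELECT statements are allowed")
    # Reject mutating keywords anywhere in the query
    if re.search(r"(?i)\b(insert|update|delete|drop|alter|create|merge|grant)\b", stripped):
        violations.append("mutating keywords are not allowed")
    # Schema restriction: every referenced table must be in the approved catalog
    for table in re.findall(r"(?i)\b(?:from|join)\s+([\w.]+)", stripped):
        if table.lower() not in ALLOWED_DATASETS:
            violations.append(f"unauthorized dataset: {table}")
    # Enforce the 50-row result limit
    match = re.search(r"(?i)\blimit\s+(\d+)$", stripped)
    if match is None or int(match.group(1)) > MAX_ROWS:
        violations.append(f"query must end with LIMIT <= {MAX_ROWS}")
    return violations
```

Because the agent only drafts SQL and never executes it, a check like this can run before the query ever reaches the analyst's review step.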

Optimization Tracking Agent

The Optimization Tracking Agent provides real-time visibility into the full lifecycle of cloud cost optimization initiatives. A project management platform tracks optimization data and mirrors it in BigQuery, where the agent automatically checks current status. Each initiative moves through defined lifecycle stages: identified, approved, in progress, implemented, validated, and benefits realized. FinOps analysts can ask for the status of a specific initiative, filter by team or workstream, or generate a full report across all active optimizations. The agent does not approve, implement, or modify any optimization. Existing review and approval workflows remain untouched, and the organization can trace and audit all actions and status changes.
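
The read-only rollup the agent produces might look like the following sketch. The stage names come from the article; the data shape is an assumption.

```python
from collections import Counter

# Lifecycle stages, in order, as described above
STAGES = ["identified", "approved", "in progress", "implemented",
          "validated", "benefits realized"]

def status_report(initiatives: list[dict]) -> dict[str, int]:
    """Count active optimization initiatives per lifecycle stage.
    Read-only: the function reports state but never changes it."""
    counts = Counter(item["stage"] for item in initiatives)
    return {stage: counts.get(stage, 0) for stage in STAGES}
```

Keeping the agent strictly on the query side of the mirror means existing approval workflows stay authoritative.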

Chart Summarization Agent

The Chart Summarization Agent transforms raw chart data into executive-ready narrative summaries on demand. A FinOps analyst exports a chart image and its underlying raw data, uploads both to the agent, and receives a structured summary covering key trends, notable changes, primary cost drivers, and any issue that warrants further attention. The agent uses a vision-capable LLM to interpret the chart image alongside the underlying raw data and deliver a complete analysis. With this output, the FinOps analyst has a strong starting point for business reviews or leadership reports, and the agent reduces the time required to build summaries from scratch. What's more, the system only generates summaries when an analyst explicitly requests one; they are never published or posted automatically.
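
A minimal sketch of how the image and its raw data might be paired in a single multimodal request, using the Amazon Bedrock Converse message shape. The model ID and prompt wording are assumptions, not the production values.

```python
def build_summary_request(chart_png: bytes, csv_text: str) -> dict:
    """Assemble a Converse-style request pairing a chart image with its raw data."""
    return {
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model choice
        "messages": [{
            "role": "user",
            "content": [
                # Image block lets the vision-capable model read the chart itself
                {"image": {"format": "png", "source": {"bytes": chart_png}}},
                {"text": "Raw data backing the chart:\n" + csv_text},
                {"text": "Summarize key trends, notable changes, primary cost "
                         "drivers, and any issue that warrants attention."},
            ],
        }],
    }

# In production (sketch): boto3.client("bedrock-runtime").converse(**request)
```

Supplying both modalities lets the model cross-check what the chart shows against the exact numbers behind it.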

Information Technology Service Management (ITSM) Agent

The ITSM Agent is a unified conversational IT assistant that pulls information from Jira, Confluence, and ServiceNow within a single interface. Analysts can retrieve Jira ticket details and assignments, look up incident records and change management requests, search knowledge base articles and runbooks, and pull up documentation across multiple repositories, all through natural language. Instead of navigating multiple systems to answer a support question or locate a runbook, the agent consolidates these queries into one interaction. For operational teams, this eliminates tool-hopping.

Why not a single agent?

The company considered a single-agent approach, but it was clear that consolidating all capabilities into one agent greatly increased risk—and the blast radius of failures. It also would have limited the organization’s ability to scale and evolve individual workflows independently. The multi-agent design preserves isolation while enabling targeted improvements. It also supports additional capabilities without disrupting existing ones.

What makes this system so powerful is that it isn't just another chatbot layered atop enterprise data, and it isn't a one-size-fits-all large language model handling every task. It also isn't a collection of scripts automating predefined workflows. Instead, AI works as a coordinated system of specialized agents, each responsible for a narrow and well-defined capability. This level of structured orchestration delivers a critical advantage: decisions emerge from coordinated components, not from a single generated response.

How the agents work together to magnify gains

Four design principles guided every architectural decision in this implementation.

  • Supervisor-agent coordination: This framework decomposes complex workflows into purpose-built specialist agents, each owning a single domain, and coordinates them through a Supervisor Agent that handles routing, intent classification, and response assembly. This makes each agent independently testable, deployable, and governable. The Supervisor Agent serves as the control plane of the system: it performs intent classification, routes requests to the appropriate specialist agent, and manages state across multi-turn interactions. By maintaining context and coordinating responses, it supports seamless transitions between workflows, allowing users to move across domains without re-establishing context.
  • Governance by default: With guardrails in place, agents do not execute, approve, or modify anything autonomously. Every consequential action requires human review. SQL generation is grounded in approved, version-controlled schemas. Dataset discovery returns only curated sources.
  • Observability from day one: Every interaction is fully traceable. A unique trace ID links each step in the process—the routing decision, agent execution, tool calls, and model responses. This way, if there’s a slowdown or something goes wrong, there’s a complete chain of causality.
  • Independent deployments: Each agent runs in its own AgentCore Runtime, and tools deploy on an EKS pod. A failure in one component stays contained. It’s possible to add a new capability with a new container—without any need to restructure the platform.
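
A toy sketch of the supervisor's control-plane role: classify intent, pick a specialist, and stamp a trace ID that follows the request end-to-end. The keyword rules here stand in for the LLM-based intent classification a production supervisor would use; the agent names mirror the four specialists described above.

```python
import uuid

# Toy keyword rules; the production supervisor classifies intent with an LLM
KEYWORDS = {
    "sql_generator": ["query", "sql", "spend"],
    "optimization_tracking": ["optimization", "initiative"],
    "chart_summarization": ["chart", "summarize"],
    "itsm": ["ticket", "incident", "runbook"],
}

def route(request: str) -> dict:
    """Classify intent, choose a specialist agent, and attach a trace ID that
    links every downstream step (agent execution, tool calls, model responses)."""
    trace_id = str(uuid.uuid4())
    text = request.lower()
    for agent, words in KEYWORDS.items():
        if any(word in text for word in words):
            return {"agent": agent, "trace_id": trace_id}
    # No match: fall back to the supervisor asking a clarifying question
    return {"agent": "supervisor_fallback", "trace_id": trace_id}
```

Stamping the trace ID at the routing step, before any agent runs, is what makes the chain of causality complete when debugging later.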

This architecture involves deliberate trade-offs. With capabilities distributed across multiple agents, system complexity increases and observability requirements grow, particularly for debugging and diagnosing issues. Thoughtful design is critical for managing the runtime latency that misaligned network configurations can introduce. Additionally, strict governance and guardrails are essential to ensure secure operation. The payoff is a system that scales, isolates failures, and remains manageable over time, without sacrificing control.

How it works: agent architecture and end-to-end flow

At runtime, orchestration takes place through a supervisor-led flow that routes intent, coordinates agents, and assembles responses. The architecture below shows how a single user request flows through the platform end-to-end.

Figure 1: Multi-agent architecture on Amazon Bedrock AgentCore

A user submits a natural language request through the client’s conversational user interface.

The request resolves via Amazon Route 53 DNS and hits an Elastic Load Balancer that distributes traffic across the Amazon EKS cluster. This layer handles SSL termination, health checks, and availability zone distribution. The result is a platform that remains responsive under concurrent analyst load.

The request reaches the application layer on Amazon EKS, which calls invokeAgentRuntime() via the Amazon Bedrock AgentCore API. This is the entry point into the agent runtime, passing the user query and session context to the Supervisor Agent. The conversation history is managed separately through AgentCore Memory on the Supervisor and specialist agents.
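A sketch of what the invokeAgentRuntime() entry point receives. The ARN is a placeholder and the JSON prompt payload shape is an assumption based on the description above; in production the parameters would be passed to boto3's bedrock-agentcore client.

```python
import json
import uuid

def build_invoke_kwargs(user_query: str, session_id: str = "") -> dict:
    """Assemble parameters for invokeAgentRuntime(); the runtime session ID is
    what ties the call to the AgentCore Memory-managed conversation history."""
    return {
        # Placeholder ARN: the real value identifies the Supervisor Agent runtime
        "agentRuntimeArn": "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/supervisor",
        "runtimeSessionId": session_id or str(uuid.uuid4()),
        "payload": json.dumps({"prompt": user_query}).encode("utf-8"),
    }

# Production call (sketch): boto3.client("bedrock-agentcore").invoke_agent_runtime(**kwargs)
```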

The Supervisor Agent uses a LangGraph state graph to classify user intent and initiate the correct agent node. It maintains conversation state across the graph, so multi-turn interactions and follow-up questions carry full context. This means a FinOps analyst can ask a cost question, get the SQL result, and then say: “Now show me the optimization status for that team,” without re-explaining anything.
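That state carry-over can be illustrated with a toy stand-in for the graph state. Entity resolution here is just a dictionary lookup and a string replace; the real system persists state through LangGraph and AgentCore Memory.

```python
class ConversationState:
    """Toy stand-in for LangGraph graph state: entities captured on one turn
    stay available to later turns, so follow-ups resolve without re-explaining."""

    def __init__(self):
        self.entities = {}   # e.g. {"team": "platform"}
        self.history = []

    def update(self, message: str, **entities: str) -> str:
        self.entities.update(entities)
        # Resolve a follow-up reference like "that team" from remembered context
        resolved = message.replace("that team", self.entities.get("team", "that team"))
        self.history.append(resolved)
        return resolved
```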

The selected specialist agent (SQL Generator, Chart Summarization, Optimization, or ITSM) executes within its own isolated AgentCore Runtime instance. Each runtime manages its own prompt context, tool permissions, and guardrail boundaries. Isolating agents at the runtime level ensures that a misconfiguration in one agent doesn't affect another.

The agent invokes tool functions running as dedicated EKS pods. For example, the SQL Generator Agent calls get_schemas_tool(), generate_sql_tool(), validate_sql_tool(), and create_superset_deeplink_tool(), among others. Each agent has its own tool set scoped to its domain. Tools are stateless and independently deployable, so updating a tool does not require redeploying the agent that uses it.
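As an illustration of how small a stateless tool can be, a sketch of create_superset_deeplink_tool() might do little more than URL-encode the approved query. The URL path and parameter name are assumptions, not Superset's documented deeplink scheme.

```python
from urllib.parse import urlencode

def create_superset_deeplink_tool(sql: str,
                                  base_url: str = "https://superset.example.com") -> str:
    """Stateless tool sketch: build a SQL Lab link with the generated query
    preloaded, so one click opens it ready for the analyst to review and run."""
    return f"{base_url}/sqllab/?{urlencode({'sql': sql})}"
```

Because the tool holds no state, it can be redeployed on its EKS pod without touching the agent that calls it.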

Tools connect to external systems via authenticated, read-only integrations: Google BigQuery (the client's existing cloud data warehouse) for cost data, Jira for ticket tracking, Confluence for documentation, ServiceNow for IT support, and Superset for BI visualization. All connections are one-way pulls: agents read from these systems but never write back, preserving existing workflows and data integrity.

Agents query the Data and Intelligence Layer for retrieval and context. Amazon Aurora handles structured metadata lookups and operational data. Amazon S3 stores unstructured content including uploaded chart images and CSV exports. Amazon OpenSearch powers vector-based semantic retrieval against the dataset catalog, enabling agents to match natural language questions to approved datasets. Amazon Bedrock provides LLM inference and embedding generation across all agents.
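The vector-based dataset matching can be sketched as a nearest-neighbor lookup. In production, OpenSearch performs the k-NN search and Amazon Bedrock generates the embeddings; the two-dimensional toy vectors and dataset names below are assumptions for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def discover_dataset(query_vec: list[float],
                     catalog: dict[str, list[float]]) -> str:
    """Return the approved dataset whose embedding best matches the question.
    Only curated catalog entries are candidates, per the governance guardrails."""
    return max(catalog, key=lambda name: cosine(query_vec, catalog[name]))
```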

Amazon CloudWatch and CloudTrail provide infrastructure-level monitoring, logging, and audit trails. CloudWatch captures runtime metrics and alerts. CloudTrail records every API call for compliance and forensic analysis.

AgentCore Memory and AgentCore Observability run as shared services across the entire system. Memory operates in two tiers: short-term memory maintains session state so agents achieve continuity within a conversation, while long-term memory retains user preferences, prior queries, and interaction history across sessions. Observability spans every layer. Combined with CloudWatch for infrastructure metrics and CloudTrail for API-level audit logging, the system delivers complete visibility from the moment a user submits a request to the final response.

Delivering business outcomes that matter

Here are some of the benefits the company realized:

  • Self-service analytics: A single analyst query that previously required 8+ hours in an engineering queue now occurs in minutes, a reduction of over 99% in time-to-insight. The dependency on engineering teams for routine data access has been entirely eliminated.

  • Faster reporting cycles: There’s no longer a need to produce chart summaries manually. Agents draft this information on demand—while boosting reporting quality and reducing cycle time in business reviews.

  • Optimization visibility: There’s no need to manually build an optimization report. At any moment, analysts can view the entire lifecycle of every initiative, from identification to benefits.  

  • Reduced tool-hopping: Because the ITSM Agent consolidates Jira, Confluence, and ServiceNow into a single conversational interface, there’s a reduced need for context-switching and minimal support queue dependency across operational teams.

Building and deploying agentic systems at enterprise scale also yielded a distinct set of lessons:

  • Start with narrowly scoped agents and grow with success. Clear objectives and boundaries simplify design, testing, and governance.
  • Invest in observability early. In distributed systems, effective debugging hinges on end-to-end traceability.
  • Guardrails are non-negotiable for secure operations. An enterprise should constrain agent behavior through schemas, permissions, and validation.
  • Streaming doesn’t reduce execution time—but it does give users real-time visibility into progress. This, in turn, improves perceived responsiveness.

The blueprint for this project was straightforward: decompose complex workflows into specialized agents with a defined scope and built-in guardrails, coordinate them through a supervisor, and run the entire system on infrastructure that handles memory, monitoring, and runtime execution out of the box.

PwC collaborated with the company to build a framework that allows AI agents to advise, retrieve, and generate data, but never execute autonomously. The narrow scope gave cautious users the confidence to adopt quickly. It also laid down governance and observability infrastructure—so that when the organization is ready to expand AI agent autonomy, the technical and trust foundation exists.

With deep expertise in enterprise AI, and the ability to plug in AWS technical capabilities like Amazon Bedrock AgentCore, PwC helps organizations evolve from fragmented, manual workflows to production-grade agentic systems that boast strong governance, observability, and scalability from the start.

Analysts receive answers in seconds instead of days. Leadership gains visibility without waiting for reports. And engineering teams focus on higher-value problems once they are freed from the drudgery of handling routine queries. No less important: this architecture is not static. As new workflows emerge, an organization can add agents without disrupting existing capabilities. This fuels a continuous expansion of enterprise intelligence—along with the ability to scale agentic systems alongside rapidly evolving business needs.  

Disclaimer: Client details, proprietary tooling, and identifying business context have been anonymized in accordance with confidentiality obligations. Architecture descriptions reflect general patterns based on a real production implementation, kept at a conceptual level consistent with AWS Well-Architected best practices.
