Create a cloud-native, enterprise-grade architecture diagram for a multi-tenant healthcare AI chatbot assistant deployed on AWS. The system supports multiple hospitals (tenants), with doctors and nurses as end users, and is accessed via web and mobile applications.

High-Level Requirements
- The architecture must clearly show multi-tenancy, AI agent orchestration, LLMOps observability, and secure cloud infrastructure
- Use official AWS service icons
- Group components logically using boundaries, layers, and annotations
- Show the request flow from users → AI agents → retrieval → LLM → response

User & Application Layer
- Web application (doctors and nurses)
- Mobile application (doctors and nurses)
- Requests routed through an API Gateway / Ingress

Kubernetes & Compute Layer
- AWS EKS cluster
  - Namespace per environment
- Tenant-isolated AI service pods
  - One AI service pod per hospital (tenant)
  - Each pod is identified by an Org ID / Tenant ID
- Amazon ECR
  - Stores Docker images for the AI services
  - CI/CD pushes images to ECR; EKS pulls images from ECR

AI Application & Agent Layer (Inside Each Tenant Pod)
Inside each tenant-specific AI service pod, include the following multi-agent system:
- Orchestrator Agent
  - Entry point for all AI requests
  - Coordinates the agent execution flow
- Supervisor Agent
  - Controls execution order
  - Applies guardrails and safety checks
- Query Classifier Agent
  - Classifies user intent (clinical, policy, summarization, etc.)
- Query Optimizer Agent
  - Refines and optimizes prompts
  - Prepares queries for retrieval and the LLM
- Retriever Agent
  - Retrieves embeddings from the vector database
- LangGraph Client
  - Used for AI agent orchestration and state transitions

Tooling & Retrieval Layer
- MCP Server (Model Context Protocol)
  - Deployed sidecar-style alongside each AI service pod
  - Exposes retrieval tools
  - Acts as a tool provider to the LangGraph agents
- Amazon OpenSearch (vector database)
  - Stores embeddings per tenant in tenant-specific indices
  - Queried via the MCP Retriever Tool

Data & Storage Layer
- Amazon RDS (PostgreSQL)
  - One database per tenant (hospital), selected by Org ID
  - Acts as the single source of truth for application data
- Amazon S3
  - Stores uploaded documents for knowledge ingestion
  - Source for the vector embedding pipelines

LLMOps, Observability & Monitoring
- Arize Phoenix (LLMOps / agent observability)
  - Tracks agent execution traces, prompt versions, token usage, latency, and errors
  - Connected to the LangGraph execution flow
- Logging & Metrics
  - Centralized logs
  - AI request tracing
  - Per-tenant observability

Security & Governance Layer
- AWS IAM Roles for Service Accounts (IRSA)
  - Pod-level permissions
  - Access control for S3, OpenSearch, and RDS
- AWS Secrets Manager
  - Stores API keys, credentials, and other secrets
  - Injects them securely into pods
- HIPAA & PHI compliance controls
  - Encryption at rest and in transit
  - Tenant isolation boundaries clearly marked

Data & Request Flow (show with arrows)
1. Doctor/nurse submits a query via the web or mobile app
2. Request enters EKS via the Ingress / API Gateway
3. Request is routed to the tenant-specific AI service pod
4. Orchestrator Agent invokes the downstream agents
5. Retriever Agent calls the MCP tool
6. MCP Server queries the OpenSearch vector DB
7. Context is returned to LangGraph
8. LLM generates a grounded response
9. Supervisor Agent validates the response
10. Response is returned to the user
11. Traces and metrics are sent to Arize Phoenix

Diagram Style Instructions
- Use a layered architecture
- Use clear tenant boundaries
- Label all arrows with action verbs (e.g., “Retrieve Embeddings”, “Invoke Tool”, “Generate Response”)
- Keep the diagram clean, professional, and interview-ready
- Optimize for clarity over decoration
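To make the multi-tenancy boundary concrete, here is a minimal Python sketch of how a request's Org ID might resolve to tenant-isolated resources (EKS namespace, per-tenant RDS database, tenant-specific OpenSearch index). Everything here — the `TenantConfig` fields, the registry contents, the DSN strings — is a hypothetical illustration, not part of the diagram spec.

```python
# Illustrative tenant-routing sketch: resolve a request's Org ID to
# tenant-scoped resources. All names are hypothetical placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class TenantConfig:
    org_id: str
    namespace: str         # EKS namespace hosting the tenant's AI service pod
    rds_dsn: str           # per-tenant PostgreSQL database
    opensearch_index: str  # tenant-specific vector index

# In production this registry would live in a config store or control plane.
TENANT_REGISTRY = {
    "hospital-a": TenantConfig("hospital-a", "tenant-hospital-a",
                               "postgresql://rds/hospital_a", "embeddings-hospital-a"),
    "hospital-b": TenantConfig("hospital-b", "tenant-hospital-b",
                               "postgresql://rds/hospital_b", "embeddings-hospital-b"),
}

def resolve_tenant(org_id: str) -> TenantConfig:
    """Route a request to its tenant-isolated resources, failing closed."""
    try:
        return TENANT_REGISTRY[org_id]
    except KeyError:
        # Unknown tenants are rejected rather than falling back to a default,
        # which preserves the tenant isolation boundary.
        raise PermissionError(f"Unknown tenant: {org_id}")
```

Failing closed on an unknown Org ID matters here: a default fallback database or index would silently break the HIPAA tenant-isolation guarantee the diagram is supposed to highlight.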
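The data and request flow above can also be sketched as a plain state-passing pipeline. This is deliberately not real LangGraph code — each agent is a stand-in function over a shared state dict, and the retriever and LLM calls are stubbed — so that the control flow (orchestrate → classify → optimize → retrieve → generate → supervise) is visible end to end.

```python
# Plain-Python sketch of the multi-agent request flow. Agent bodies are
# stand-ins; in the real system these would be LangGraph nodes calling
# the MCP retriever tool and an LLM.

def classify(state):   # Query Classifier Agent: tag the user intent
    state["intent"] = "clinical" if "patient" in state["query"].lower() else "policy"
    return state

def optimize(state):   # Query Optimizer Agent: refine the prompt
    state["prompt"] = f"[{state['intent']}] {state['query']}"
    return state

def retrieve(state):   # Retriever Agent (stand-in for the MCP/OpenSearch call)
    state["context"] = [f"doc-for-{state['intent']}"]
    return state

def generate(state):   # LLM call (stubbed): ground the answer in context
    state["draft"] = f"Answer using {state['context'][0]}"
    return state

def supervise(state):  # Supervisor Agent: guardrail check before returning
    state["response"] = state["draft"] if state["draft"].startswith("Answer") else "BLOCKED"
    return state

def orchestrate(query: str) -> dict:
    """Orchestrator Agent: run the agents in order over shared state."""
    state = {"query": query}
    for agent in (classify, optimize, retrieve, generate, supervise):
        state = agent(state)
    return state
```

In the diagram, each of these functions maps to one labeled box inside the tenant pod, and each hand-off maps to one labeled arrow.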
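Finally, a sketch of how the MCP Retriever Tool might shape its tenant-scoped k-NN query for Amazon OpenSearch. Only the request body is built here; the actual OpenSearch client call is omitted. The vector field name `embedding` and the `embeddings-<org_id>` index naming scheme are assumptions for illustration, not part of the spec.

```python
# Build a tenant-scoped OpenSearch k-NN request. Tenant isolation comes
# from addressing a per-tenant index; the body follows OpenSearch's
# k-NN query shape. Field/index names are illustrative assumptions.

def build_knn_query(org_id: str, query_vector: list[float], k: int = 5) -> tuple[str, dict]:
    """Return the tenant-specific index name and an OpenSearch k-NN body."""
    index = f"embeddings-{org_id}"  # tenant isolation via per-tenant indices
    body = {
        "size": k,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": k}}},
    }
    return index, body
```

Keeping the Org ID out of the query body and in the index name is what lets the diagram draw a hard boundary: a pod holding tenant A's IRSA role simply has no permission path to tenant B's index.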