Create a cloud-native, enterprise-grade architecture diagram for a multi-tenant healthcare AI chatbot assistant deployed on AWS. The system supports multiple hospitals (tenants), with doctors and nurses as end users, and is accessed via web and mobile applications.

High-Level Requirements
- The architecture must clearly show multi-tenancy, AI agent orchestration, LLMOps observability, and secure cloud infrastructure
- Use official AWS service icons
- Group components logically using boundaries, layers, and annotations
- Show the request flow from users → AI agents → retrieval → LLM → response

User & Application Layer
- Web application (doctors and nurses)
- Mobile application (doctors and nurses)
- Requests routed through an API Gateway / Ingress

Kubernetes & Compute Layer
- AWS EKS cluster
  - Namespace per environment
- Tenant-isolated AI service pods
  - One AI service pod per hospital (tenant)
  - Each pod is identified by an Org ID / Tenant ID
- Amazon ECR
  - Stores Docker images for the AI services
  - CI/CD pushes images to ECR; EKS pulls images from ECR

AI Application & Agent Layer (Inside Each Tenant Pod)
Inside each tenant-specific AI service pod, include the following multi-agent system:
- Orchestrator Agent
  - Entry point for all AI requests
  - Coordinates the agent execution flow
- Supervisor Agent
  - Controls execution order
  - Applies guardrails and safety checks
- Query Classifier Agent
  - Classifies user intent (clinical, policy, summarization, etc.)
- Query Optimizer Agent
  - Refines and optimizes prompts
  - Prepares queries for retrieval and the LLM
- Retriever Agent
  - Retrieves embeddings from the vector database
- LangGraph Client
  - Used for AI agent orchestration and state transitions

Tooling & Retrieval Layer
- MCP Server (Model Context Protocol)
  - Deployed sidecar-style alongside each AI service pod
  - Exposes retrieval tools
  - Acts as a tool provider to the LangGraph agents
- Amazon OpenSearch (vector database)
  - Stores embeddings per tenant in tenant-specific indices
  - Queried via the MCP Retriever Tool

Data & Storage Layer
- Amazon RDS (PostgreSQL)
  - One database per tenant (hospital), selected by Org ID
  - Acts as the single source of truth for application data
- Amazon S3
  - Stores uploaded documents for knowledge ingestion
  - Source for the vector embedding pipelines

LLMOps, Observability & Monitoring
- Arize Phoenix (LLMOps / agent observability)
  - Tracks agent execution traces, prompt versions, token usage, latency, and errors
  - Connected to the LangGraph execution flow
- Logging & Metrics
  - Centralized logs
  - AI request tracing
  - Per-tenant observability

Security & Governance Layer
- AWS IAM Roles for Service Accounts (IRSA)
  - Pod-level permissions
  - Access control for S3, OpenSearch, and RDS
- AWS Secrets Manager
  - Stores API keys, credentials, and other secrets
  - Injects them securely into pods
- HIPAA & PHI compliance controls
  - Encryption at rest and in transit
  - Tenant isolation boundaries clearly marked

Data & Request Flow (show with arrows)
1. Doctor/nurse submits a query via the web or mobile app
2. Request enters EKS via the Ingress / API Gateway
3. Request is routed to the tenant-specific AI service pod
4. Orchestrator Agent invokes the downstream agents
5. Retriever Agent calls the MCP tool
6. MCP Server queries the OpenSearch vector DB
7. Context is returned to LangGraph
8. LLM generates a grounded response
9. Supervisor Agent validates the response
10. Response is returned to the user
11. Traces and metrics are sent to Arize Phoenix

Diagram Style Instructions
- Use a layered architecture
- Use clear tenant boundaries
- Label all arrows with action verbs (e.g., “Retrieve Embeddings”, “Invoke Tool”, “Generate Response”)
- Keep the diagram clean, professional, and interview-ready
- Optimize for clarity over decoration
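To make the multi-tenancy boundary concrete, here is a minimal Python sketch of how a request's Org ID might resolve to tenant-isolated resources (EKS namespace, per-tenant RDS database, tenant-specific OpenSearch index). Everything here — the `TenantConfig` fields, the registry contents, the DSN strings — is a hypothetical illustration, not part of the diagram spec.

```python
# Illustrative tenant-routing sketch: resolve a request's Org ID to
# tenant-scoped resources. All names are hypothetical placeholders.
from dataclasses import dataclass

@dataclass(frozen=True)
class TenantConfig:
    org_id: str
    namespace: str         # EKS namespace hosting the tenant's AI service pod
    rds_dsn: str           # per-tenant PostgreSQL database
    opensearch_index: str  # tenant-specific vector index

# In production this registry would live in a config store or control plane.
TENANT_REGISTRY = {
    "hospital-a": TenantConfig("hospital-a", "tenant-hospital-a",
                               "postgresql://rds/hospital_a", "embeddings-hospital-a"),
    "hospital-b": TenantConfig("hospital-b", "tenant-hospital-b",
                               "postgresql://rds/hospital_b", "embeddings-hospital-b"),
}

def resolve_tenant(org_id: str) -> TenantConfig:
    """Route a request to its tenant-isolated resources, failing closed."""
    try:
        return TENANT_REGISTRY[org_id]
    except KeyError:
        # Unknown tenants are rejected rather than falling back to a default,
        # which preserves the tenant isolation boundary.
        raise PermissionError(f"Unknown tenant: {org_id}")
```

Failing closed on an unknown Org ID matters here: a default fallback database or index would silently break the HIPAA tenant-isolation guarantee the diagram is supposed to highlight.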
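The data and request flow above can also be sketched as a plain state-passing pipeline. This is deliberately not real LangGraph code — each agent is a stand-in function over a shared state dict, and the retriever and LLM calls are stubbed — so that the control flow (orchestrate → classify → optimize → retrieve → generate → supervise) is visible end to end.

```python
# Plain-Python sketch of the multi-agent request flow. Agent bodies are
# stand-ins; in the real system these would be LangGraph nodes calling
# the MCP retriever tool and an LLM.

def classify(state):   # Query Classifier Agent: tag the user intent
    state["intent"] = "clinical" if "patient" in state["query"].lower() else "policy"
    return state

def optimize(state):   # Query Optimizer Agent: refine the prompt
    state["prompt"] = f"[{state['intent']}] {state['query']}"
    return state

def retrieve(state):   # Retriever Agent (stand-in for the MCP/OpenSearch call)
    state["context"] = [f"doc-for-{state['intent']}"]
    return state

def generate(state):   # LLM call (stubbed): ground the answer in context
    state["draft"] = f"Answer using {state['context'][0]}"
    return state

def supervise(state):  # Supervisor Agent: guardrail check before returning
    state["response"] = state["draft"] if state["draft"].startswith("Answer") else "BLOCKED"
    return state

def orchestrate(query: str) -> dict:
    """Orchestrator Agent: run the agents in order over shared state."""
    state = {"query": query}
    for agent in (classify, optimize, retrieve, generate, supervise):
        state = agent(state)
    return state
```

In the diagram, each of these functions maps to one labeled box inside the tenant pod, and each hand-off maps to one labeled arrow.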
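Finally, a sketch of how the MCP Retriever Tool might shape its tenant-scoped k-NN query for Amazon OpenSearch. Only the request body is built here; the actual OpenSearch client call is omitted. The vector field name `embedding` and the `embeddings-<org_id>` index naming scheme are assumptions for illustration, not part of the spec.

```python
# Build a tenant-scoped OpenSearch k-NN request. Tenant isolation comes
# from addressing a per-tenant index; the body follows OpenSearch's
# k-NN query shape. Field/index names are illustrative assumptions.

def build_knn_query(org_id: str, query_vector: list[float], k: int = 5) -> tuple[str, dict]:
    """Return the tenant-specific index name and an OpenSearch k-NN body."""
    index = f"embeddings-{org_id}"  # tenant isolation via per-tenant indices
    body = {
        "size": k,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": k}}},
    }
    return index, body
```

Keeping the Org ID out of the query body and in the index name is what lets the diagram draw a hard boundary: a pod holding tenant A's IRSA role simply has no permission path to tenant B's index.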