The Data Platform Built for Agents: Inside Auraa’s Architecture
Databricks Synced Tables didn't hold up for 35+ governance tables. Here's how a single Governance Writer gave Auraa fast reads without a second master.
There’s a pattern that plays out at nearly every large organization building a modern data platform. The first year is spent standing up infrastructure. The second year is spent rewriting pipelines that weren’t built to scale. By the third year, the teams that were supposed to be using the platform are still waiting for the data engineering team to finish the backlog.
It’s not a talent problem. It’s an architectural problem.
The platforms we’ve been building were designed for human operators: engineers who write SQL, configure schedulers, and click through dashboards. AI assistants get added later as productivity boosters - copilots that suggest syntax, generate documentation, and recommend optimizations. But they can’t act, because the platform doesn’t give them anything they can reliably invoke. A chatbot that generates a SQL query still requires a human to run it, check the output, and decide what happens next.
Covasant built Auraa to answer a different question: What would a data platform look like if AI agents were the primary consumer from day one?
The Core Insight: Users Prompt, Agents Plan, Tools Execute
Auraa’s architecture follows a clean hierarchy. Users express intent in natural language. Agents - specifically AIVARA, Auraa’s orchestration agent - interpret that intent, discover the right capabilities, and compose a workflow. Tools execute the work deterministically.
This hierarchy has a critical consequence: all business logic lives in tools, not in agents. Tools have typed inputs, typed outputs, and auditable side effects. Agents focus on planning and sequencing. If an agent makes a bad plan, the result is a failed tool call - not a corrupted dataset. The blast radius of any bad decision is minimal and inspectable.
Every capability in Auraa is a tool. Ingestion, data quality, transformation, scoring, infrastructure provisioning, governance enforcement - each is a discoverable function with a machine-readable contract, registered in NEXARA, Auraa’s runtime capability directory. When a new tool is registered, every agent that queries the registry can discover and use it. No code changes. No redeployment.
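To make the idea of a discoverable tool with a machine-readable contract concrete, here is a minimal in-memory sketch. It is not Auraa's or NEXARA's actual API: `ToolContract`, `ToolRegistry`, and the `count_nulls` tool are hypothetical names chosen for illustration. The sketch only shows the shape of the pattern: a tool is registered once with typed inputs and a typed output, an agent discovers it by querying the registry, and a badly planned call fails at the contract boundary before any work runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ToolContract:
    """Machine-readable description of a tool (hypothetical shape)."""
    name: str
    description: str
    input_types: Dict[str, type]
    output_type: type


class ToolRegistry:
    """In-memory stand-in for a runtime capability directory."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[ToolContract, Callable]] = {}

    def register(self, contract: ToolContract, fn: Callable) -> None:
        # Registering a tool makes it visible to every later discover() call.
        self._tools[contract.name] = (contract, fn)

    def discover(self, keyword: str) -> List[ToolContract]:
        # Agents search contracts at runtime instead of hardcoding tool names.
        kw = keyword.lower()
        return [c for c, _ in self._tools.values() if kw in c.description.lower()]

    def invoke(self, name: str, **kwargs):
        contract, fn = self._tools[name]
        # A bad plan fails here, before the tool touches any data.
        if set(kwargs) != set(contract.input_types):
            raise TypeError(f"{name}: expected arguments {sorted(contract.input_types)}")
        for arg, expected in contract.input_types.items():
            if not isinstance(kwargs[arg], expected):
                raise TypeError(f"{name}: '{arg}' must be {expected.__name__}")
        result = fn(**kwargs)
        if not isinstance(result, contract.output_type):
            raise TypeError(f"{name}: output must be {contract.output_type.__name__}")
        return result


def count_nulls(rows: list, column: str) -> int:
    """Example deterministic tool: count rows where `column` is None."""
    return sum(1 for row in rows if row.get(column) is None)


registry = ToolRegistry()
registry.register(
    ToolContract(
        name="count_nulls",
        description="Data quality: count null values in a column",
        input_types={"rows": list, "column": str},
        output_type=int,
    ),
    count_nulls,
)
```

An agent in this sketch would call `registry.discover("quality")` to find the tool and `registry.invoke("count_nulls", rows=..., column="id")` to run it; adding a second tool requires only another `register` call.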
Twelve Components, One Coherent Platform
Auraa decomposes the data lifecycle into twelve named components. Each owns a specific slice of responsibility. Each exposes its capabilities as tools.
- INFRAIA provisions Unity Catalog tenants automatically. When a new client is onboarded, INFRAIA creates their catalog, schemas, volumes, and grants - without a ticket or a runbook.
- Foundra orchestrates platform initialization across seventy discrete phases and manages asynchronous tenant onboarding via DeltaBus. When a new tenant is requested, Foundra publishes a command and returns a task ID immediately. The onboarding completes in the background; progress is queryable in real time.
- AION discovers source schemas and generates machine-readable ingestion manifests. It never moves data - it designs how data should be moved, creating a clean separation between design and execution.
- DataDuct executes AION’s designs. Its engine architecture handles JDBC databases, SAP systems, and flat files. It publishes events at every lifecycle boundary so downstream components can react without polling.
- AICURA profiles tables, proposes quality rules with statistical heuristics and LLM assistance, and enforces them across ingestion engines consistently. The same rule - “this column must not be null” - is expressed in DataDuct specs, Lakeflow pipeline expectations, and dbt tests automatically.
- AITERA builds the Unified Data Model. Raw source data becomes canonical Silver entities - invoices, vendors, payments, GL entries - that downstream applications consume without needing to understand source system schemas.
- LEXARA (in development) will centralize business rules and scoring. Fraud patterns, risk indicators, compliance thresholds - registered once, evaluated against UDM entities, reused across every application.
- AIREKA handles information retrieval: vector search, documentation discovery, and explainability narratives for scored entities.
- AIGENE manages identity, access, and governance. Every tool invocation is authorized before execution. Grant policies are data-driven. Governance is not a gate humans remember to check - it is a structural property of how tools execute.
- NEXARA is the capability directory that makes everything discoverable. Agents query it at runtime. New tools become available to every agent the moment they’re registered.
- AIVARA is the agent-facing interface. It interprets prompts, discovers tools via NEXARA, validates plans via AIGENE, and orchestrates execution.
- AURA_CORE is the foundational library that all components depend on. It defines thirty-five abstract interfaces - the contract surface that makes every component independently testable and replaceable.
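The AICURA entry above describes one quality rule being expressed in several engines. A rough sketch of that idea follows, assuming a single rule object rendered into three targets. The `NotNullRule` class and the three renderer functions are hypothetical, and the ingestion-spec dictionary is an invented shape, not DataDuct's real format. The Lakeflow output uses the `CONSTRAINT ... EXPECT (...)` SQL expectation form, and the dbt output uses the built-in `not_null` generic test as it would appear under a column entry in a schema file (newer dbt versions also accept the key `data_tests`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotNullRule:
    """One logical rule: `column` in `table` must not be null."""
    table: str
    column: str

    @property
    def name(self) -> str:
        return f"{self.column}_not_null"


def to_lakeflow_constraint(rule: NotNullRule) -> str:
    # SQL expectation clause for a pipeline table definition.
    return f"CONSTRAINT {rule.name} EXPECT ({rule.column} IS NOT NULL)"


def to_dbt_column(rule: NotNullRule) -> dict:
    # Column entry for a dbt schema file, using the built-in not_null test.
    return {"name": rule.column, "tests": ["not_null"]}


def to_ingestion_spec(rule: NotNullRule) -> dict:
    # Hypothetical spec shape; the real DataDuct format is not shown in this article.
    return {"table": rule.table, "check": "not_null", "column": rule.column}


rule = NotNullRule(table="invoices", column="vendor_id")
lakeflow_sql = to_lakeflow_constraint(rule)
dbt_column = to_dbt_column(rule)
ingestion_spec = to_ingestion_spec(rule)
```

The design point is that the rule is defined once and each engine gets a derived rendering, so the three enforcement points cannot drift apart.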
DeltaBus: Why There’s No Kafka
Components in Auraa don’t call each other directly. They communicate through DeltaBus - an event bus built entirely on Delta Lake tables with Change Data Feed.
This is a structural choice, not a cost optimization (though the cost difference is substantial: ~$50/month vs. $2,000–15,000/month for managed message queue services at comparable volumes). Direct inter-component calls create tight coupling. Event-driven communication means each component can operate, fail, and recover independently. DeltaBus events are SQL-queryable, permanently retained, and governed by Unity Catalog - the same model that governs every other data asset in the workspace.
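The mechanics of a table-backed event bus can be sketched without a cluster. The example below is an in-memory stand-in, not DeltaBus itself: `InMemoryEventTable`, `Consumer`, and the topic names are hypothetical. Each publish plays the role of a table commit with an increasing version, and each consumer remembers the next version it has not yet read. On Databricks, the equivalent read would presumably use Delta's Change Data Feed reader options (`readChangeFeed` with a `startingVersion`) against the event table.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Event:
    version: int            # stands in for the table commit version
    topic: str
    payload: Dict[str, Any]


class InMemoryEventTable:
    """Append-only event log; nothing is ever updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    @property
    def next_version(self) -> int:
        return len(self._events)

    def publish(self, topic: str, payload: Dict[str, Any]) -> int:
        version = self.next_version
        self._events.append(Event(version, topic, dict(payload)))
        return version

    def read_changes(self, starting_version: int, topic: str) -> List[Event]:
        # Analogous to reading a change feed from a starting version.
        return [e for e in self._events[starting_version:] if e.topic == topic]


class Consumer:
    """Tracks its own offset, so it can fail and resume independently."""

    def __init__(self, table: InMemoryEventTable, topic: str) -> None:
        self.table = table
        self.topic = topic
        self.next_version = 0

    def fetch_new(self) -> List[Event]:
        upto = self.table.next_version
        events = self.table.read_changes(self.next_version, self.topic)
        self.next_version = upto  # skip past events on other topics as well
        return events


bus = InMemoryEventTable()
quality = Consumer(bus, "ingestion.completed")
bus.publish("ingestion.completed", {"table": "invoices"})
bus.publish("quality.failed", {"table": "vendors"})
```

Because producers only append and consumers only track an offset, neither side holds a reference to the other, which is the decoupling the section describes.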
Governance as a Structural Property
AIGENE doesn’t participate in workflows as an audit step after the fact. It participates at design time (can this agent use this tool on this tenant’s data?) and at runtime (is this principal authorized for this operation?).
Grant policies are stored in versioned Delta tables seeded from declarative configuration. There are no hardcoded fallbacks. If a required registry entry is missing, the system raises an explicit error rather than defaulting to an insecure state. Governance is not something the platform does - it is something the platform is.
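The "no hardcoded fallbacks" behaviour can be illustrated with a small fail-closed check. This is a sketch under assumptions, not AIGENE's implementation: the policy table is modelled as a plain dictionary keyed by tool and tenant, and `MissingPolicyError`, `is_authorized`, and the sample principals are hypothetical. The point it shows is the distinction between a policy that denies a principal (a normal `False`) and a policy that does not exist (an explicit error, never a silent allow or deny).

```python
from typing import Dict, FrozenSet, Tuple

PolicyTable = Dict[Tuple[str, str], FrozenSet[str]]


class MissingPolicyError(LookupError):
    """Raised when no grant policy is registered for a tool/tenant pair."""


def is_authorized(policies: PolicyTable, principal: str, tool: str, tenant: str) -> bool:
    key = (tool, tenant)
    if key not in policies:
        # No default: a missing registry entry is a configuration error.
        raise MissingPolicyError(f"no grant policy for tool={tool!r}, tenant={tenant!r}")
    return principal in policies[key]


# Hypothetical policy rows, as they might be loaded from a versioned table.
policies: PolicyTable = {
    ("ingest_table", "tenant_a"): frozenset({"agent:orchestrator"}),
}
```

Here `is_authorized(policies, "agent:orchestrator", "ingest_table", "tenant_a")` returns `True`, an unknown principal gets `False`, and a request for a tenant with no policy row raises `MissingPolicyError` instead of falling back to a default.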
What This Means in Practice
kona AI for Databricks, Covasant’s fraud detection and risk analytics solution, is the first production application built on Auraa. Its workflow follows the natural data flow: INFRAIA provisions the tenant, AION discovers SAP FI/AP/GL tables, DataDuct ingests to Bronze, AICURA enforces quality rules, AITERA builds UDM entities, scoring logic runs against those entities, and Gold data products flow to analysts and APIs.
The components are reusable. The next application that needs to ingest from SAP, enforce data quality, and build a semantic model inherits all of this. It provides the domain-specific configuration - which sources, which rules, which UDM entities - and the platform handles the rest.
That’s the compounding effect of a platform-centric architecture. The investment in solving a problem once reduces the cost of every problem after it.
