AI Engineering

The Data Platform Built for Agents: Inside Auraa’s Architecture

Databricks Synced Tables didn't hold up for 35+ governance tables. Here's how a single Governance Writer gave Auraa fast reads without a second master.

Alan Dennis

Jun 8, 2026

There’s a pattern that plays out at nearly every large organization building a modern data platform. The first year is spent standing up infrastructure. The second year is spent rewriting pipelines that weren’t built to scale. By the third year, the teams that were supposed to be using the platform are still waiting for the data engineering team to finish the backlog.

It’s not a talent problem. It’s an architectural problem.

The platforms we’ve been building were designed for human operators: engineers who write SQL, configure schedulers, and click through dashboards. AI assistants get added later as productivity boosters - copilots that suggest syntax, generate documentation, recommend optimizations. However, they can’t act. The platform doesn’t give them anything they can reliably invoke. A chatbot that generates a SQL query still requires a human to run it, check the output, and decide what happens next.

Covasant built Auraa to answer a different question: What would a data platform look like if AI agents were the primary consumer from day one?

The Core Insight: Users Prompt, Agents Plan, Tools Execute

Auraa’s architecture follows a clean hierarchy. Users express intent in natural language. Agents - specifically AIVARA, Auraa’s orchestration agent - interpret that intent, discover the right capabilities, and compose a workflow. Tools execute the work deterministically.

This hierarchy has a critical consequence: all business logic lives in tools, not in agents. Tools have typed inputs, typed outputs, and auditable side effects. Agents focus on planning and sequencing. If an agent makes a bad plan, the result is a failed tool call - not a corrupted dataset. The blast radius of any bad decision is minimal and inspectable.

Every capability in Auraa is a tool. Ingestion, data quality, transformation, scoring, infrastructure provisioning, governance enforcement - each is a discoverable function with a machine-readable contract, registered in NEXARA, Auraa’s runtime capability directory. When a new tool is registered, every agent that queries the registry can discover and use it. No code changes. No redeployment.
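
To make the idea of a discoverable tool with a machine-readable contract concrete, here is a minimal in-memory sketch. It is not NEXARA's actual API: `ToolContract`, `ToolRegistry`, and the `count_nulls` tool are hypothetical names invented for illustration, and a real registry would be persistent and far richer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ToolContract:
    """Machine-readable description of a tool (field names are hypothetical)."""
    name: str
    description: str
    input_types: Dict[str, type]
    output_type: type


class ToolRegistry:
    """In-memory stand-in for a runtime capability directory."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[ToolContract, Callable]] = {}

    def register(self, contract: ToolContract, fn: Callable) -> None:
        # Once registered, the tool is visible to every caller of discover().
        self._tools[contract.name] = (contract, fn)

    def discover(self, keyword: str) -> List[ToolContract]:
        kw = keyword.lower()
        return [c for c, _ in self._tools.values() if kw in c.description.lower()]

    def invoke(self, name: str, **kwargs):
        contract, fn = self._tools[name]
        # Validate the call against the contract before any work happens,
        # so a bad plan surfaces as a failed call.
        if set(kwargs) != set(contract.input_types):
            raise TypeError(f"{name} expects inputs {sorted(contract.input_types)}")
        for key, expected in contract.input_types.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{name}.{key} must be {expected.__name__}")
        result = fn(**kwargs)
        if not isinstance(result, contract.output_type):
            raise TypeError(f"{name} returned an unexpected output type")
        return result


def count_nulls(values: list) -> int:
    """Example tool: count missing values in a column."""
    return sum(1 for v in values if v is None)


registry = ToolRegistry()
registry.register(
    ToolContract(
        name="count_nulls",
        description="Data quality: count null values in a column",
        input_types={"values": list},
        output_type=int,
    ),
    count_nulls,
)
```

An agent in this sketch would call `registry.discover("quality")` to find candidate tools and then `registry.invoke("count_nulls", values=[...])`. A call with the wrong argument type raises `TypeError` before the tool body runs.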

Twelve Components, One Coherent Platform

Auraa decomposes the data lifecycle into twelve named components. Each owns a specific slice of responsibility. Each exposes its capabilities as tools.

INFRAIA provisions Unity Catalog tenants automatically. When a new client is onboarded, INFRAIA creates their catalog, schemas, volumes, and grants - without a ticket or a runbook.
Foundra orchestrates platform initialization across seventy discrete phases and manages asynchronous tenant onboarding via DeltaBus. When a new tenant is requested, Foundra publishes a command and returns a task ID immediately. The onboarding completes in the background; progress is queryable in real time.
AION discovers source schemas and generates machine-readable ingestion manifests. It never moves data - it designs how data should be moved, creating a clean separation between design and execution.
DataDuct executes AION’s designs. Its engine architecture handles JDBC databases, SAP systems, and flat files. It publishes events at every lifecycle boundary so downstream components can react without polling.
AICURA profiles tables, proposes quality rules with statistical heuristics and LLM assistance, and enforces them across ingestion engines consistently. The same rule - “this column must not be null” - is expressed in DataDuct specs, Lakeflow pipeline expectations, and dbt tests automatically.
AITERA builds the Unified Data Model. Raw source data becomes canonical Silver entities - invoices, vendors, payments, GL entries - that downstream applications consume without needing to understand source system schemas.
LEXARA (in development) will centralize business rules and scoring. Fraud patterns, risk indicators, compliance thresholds - registered once, evaluated against UDM entities, reused across every application.
AIREKA handles information retrieval: vector search, documentation discovery, and explainability narratives for scored entities.
AIGENE manages identity, access, and governance. Every tool invocation is authorized before execution. Grant policies are data-driven. Governance is not a gate humans remember to check - it is a structural property of how tools execute.
NEXARA is the capability directory that makes everything discoverable. Agents query it at runtime. New tools become available to every agent the moment they’re registered.
AIVARA is the agent-facing interface. It interprets prompts, discovers tools via NEXARA, validates plans via AIGENE, and orchestrates execution.
AURA_CORE is the foundational library that all components depend on. It defines thirty-five abstract interfaces - the contract surface that makes every component independently testable and replaceable.
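
The AICURA entry above says one rule is expressed consistently across several engines. The sketch below shows one way that could look, assuming a single rule object that renders itself into each target. The class and method names are hypothetical, and the layout of the ingestion spec is invented. The SQL string follows the `CONSTRAINT ... EXPECT (...)` form used by Lakeflow pipeline expectations, and the dbt fragment uses dbt's built-in `not_null` test under the `data_tests` key (older dbt projects use `tests`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotNullRule:
    """One declarative rule: the given column must not be null."""
    table: str
    column: str

    def to_sql_expectation(self) -> str:
        # Expectation clause in the style of Lakeflow pipeline SQL.
        return f"CONSTRAINT {self.column}_not_null EXPECT ({self.column} IS NOT NULL)"

    def to_dbt_test(self) -> dict:
        # Structure of a dbt properties file, as a dict ready to dump to YAML.
        return {
            "models": [
                {
                    "name": self.table,
                    "columns": [{"name": self.column, "data_tests": ["not_null"]}],
                }
            ]
        }

    def to_ingestion_spec(self) -> dict:
        # Hypothetical spec layout; the article does not describe the real one.
        return {
            "table": self.table,
            "checks": [{"column": self.column, "type": "not_null"}],
        }


rule = NotNullRule(table="invoices", column="vendor_id")
```

Keeping the rule as data and generating each engine's syntax from it is what prevents the three definitions from drifting apart.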

DeltaBus: Why There’s No Kafka

Components in Auraa don’t call each other directly. They communicate through DeltaBus - an event bus built entirely on Delta Lake tables with Change Data Feed.

This is a structural choice, not a cost optimization (though the cost difference is substantial: roughly $50/month versus $2,000 to $15,000/month for managed message queue services at comparable volumes). Direct inter-component calls create tight coupling. Event-driven communication means each component can operate, fail, and recover independently. DeltaBus events are SQL-queryable, permanently retained, and governed by Unity Catalog - the same model that governs every other data asset in the workspace.
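
The mechanics of a table-backed event bus can be sketched without Spark. The example below is an in-memory stand-in, not DeltaBus itself: a Python list plays the role of the Delta table, an integer version plays the role of the commit version that Change Data Feed exposes, and every class and topic name is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Event:
    version: int            # stands in for a table commit version
    topic: str
    payload: Dict[str, Any]


class InMemoryEventLog:
    """Append-only log; events are never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def publish(self, topic: str, payload: Dict[str, Any]) -> int:
        version = len(self._events) + 1
        self._events.append(Event(version, topic, dict(payload)))
        return version

    def read_since(self, topic: str, after_version: int) -> List[Event]:
        # Equivalent in spirit to reading changes after a starting version.
        return [e for e in self._events
                if e.topic == topic and e.version > after_version]


class Consumer:
    """Each consumer tracks its own checkpoint, so producers never call it."""

    def __init__(self, log: InMemoryEventLog, topic: str) -> None:
        self.log = log
        self.topic = topic
        self.checkpoint = 0

    def next_batch(self) -> List[Event]:
        events = self.log.read_since(self.topic, self.checkpoint)
        if events:
            self.checkpoint = events[-1].version
        return events


log = InMemoryEventLog()
quality = Consumer(log, "ingestion.completed")

log.publish("ingestion.completed", {"table": "bronze.invoices"})
log.publish("tenant.requested", {"tenant": "acme"})
first = quality.next_batch()

log.publish("ingestion.completed", {"table": "bronze.vendors"})
second = quality.next_batch()
```

Because the log is retained and each consumer owns its checkpoint, a component that was down can resume from where it stopped, and a newly added component can replay history from version zero.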

Governance as a Structural Property

AIGENE doesn’t participate in workflows as an audit step after the fact. It participates at design time (can this agent use this tool on this tenant’s data?) and at runtime (is this principal authorized for this operation?).

Grant policies are stored in versioned Delta tables seeded from declarative configuration. There are no hardcoded fallbacks. If a required registry entry is missing, the system raises an explicit error rather than defaulting to an insecure state. Governance is not something the platform does - it is something the platform is.
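
The fail-closed behavior described here can be sketched in a few lines. This is an illustration under assumptions, not AIGENE's implementation: the policy table is a plain dictionary keyed by a hypothetical `(tool, tenant)` pair, and the exception and function names are invented.

```python
from typing import Callable, Dict, FrozenSet, Tuple

PolicyKey = Tuple[str, str]  # (tool name, tenant); a hypothetical key shape


class MissingPolicyError(LookupError):
    """No registry entry exists for this tool and tenant."""


class NotAuthorizedError(PermissionError):
    """An entry exists, but the principal is not granted access."""


class GrantPolicies:
    def __init__(self, rows: Dict[PolicyKey, FrozenSet[str]]) -> None:
        self._rows = dict(rows)

    def authorize(self, principal: str, tool: str, tenant: str) -> None:
        key = (tool, tenant)
        # No default branch: a missing entry is an explicit error,
        # never an implicit allow.
        if key not in self._rows:
            raise MissingPolicyError(f"no grant policy registered for {key}")
        if principal not in self._rows[key]:
            raise NotAuthorizedError(f"{principal} may not run {tool} on {tenant}")


def run_tool(policies: GrantPolicies, principal: str, tool: str,
             tenant: str, fn: Callable, **kwargs):
    # Authorization happens before execution, not as an audit afterwards.
    policies.authorize(principal, tool, tenant)
    return fn(**kwargs)


policies = GrantPolicies({("ingest", "acme"): frozenset({"agent-1"})})
```

With these policies, `run_tool(policies, "agent-1", "ingest", "acme", fn)` executes `fn`, a different principal raises `NotAuthorizedError`, and any tool or tenant without an entry raises `MissingPolicyError`.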

What This Means in Practice

kona AI for Databricks, Covasant’s fraud detection and risk analytics solution, is the first production application built on Auraa. Its workflow follows the natural data flow: INFRAIA provisions the tenant, AION discovers SAP FI/AP/GL tables, DataDuct ingests to Bronze, AICURA enforces quality rules, AITERA builds UDM entities, scoring logic runs against those entities, and Gold data products flow to analysts and APIs.

The components are reusable. The next application that needs to ingest from SAP, enforce data quality, and build a semantic model inherits all of this. It provides the domain-specific configuration - which sources, which rules, which UDM entities - and the platform handles the rest.

That’s the compounding effect of a platform-centric architecture. The investment in solving a problem once reduces the cost of every problem after it.

Read the full technical whitepaper.

Auraa: An Agentic Data Activation Platform for Databricks covers the complete component specifications, implementation patterns, and the full replatforming methodology behind the architecture described here.

Frequently asked questions

How do we get our data ready for AI agents on Databricks without building everything by hand?

Auraa automates the work of mapping source systems into Databricks so the data is ready for AI agents, instead of having engineers hand-code each connector. In a manual approach, every source system takes a small team several weeks to map, test, and validate, which is why large data backlogs stretch into years. Auraa is built natively on Databricks and uses a registry of pre-built tools to discover schemas, ingest, enforce quality, and build canonical entities, so the first working version of a tenant lands in weeks rather than quarters.

We already use Databricks. What does Auraa add on top of it?

Auraa adds the agent-driven ingestion and data-readiness layer that Databricks itself does not provide. Databricks gives you the lakehouse, Unity Catalog, and Delta Lake; it does not decide how your specific source systems get mapped in or kept clean. Auraa sits on top and handles that: it provisions Unity Catalog tenants automatically, ingests from JDBC databases, SAP, and flat files, profiles tables, and builds a Unified Data Model of canonical entities like invoices, vendors, and payments that downstream applications and agents can consume.

Is Auraa only for Databricks, or does it work with Snowflake too?

Auraa is built natively on Databricks and is intended for teams already on Databricks. Its architecture depends on two Databricks primitives: Delta Lake tables with Change Data Feed for its event bus, and Unity Catalog for governance. It is therefore not positioned as a Snowflake product. If you have already invested in Databricks, that native dependency is what lets changes in the platform get reflected quickly and keeps everything under one governance model.

Our data engineering team is months behind on pipeline work. How does an agent-driven approach change that?

An agent-driven data platform compresses the backlog by treating every capability as a discoverable tool that agents can compose, rather than work an engineer codes from scratch. In Auraa, a user expresses intent in plain language, an orchestration agent interprets it and assembles a workflow, and tools execute the work deterministically. Because all business logic lives in the tools, a bad agent decision produces a failed tool call rather than a corrupted dataset, so the team supervises and verifies instead of writing every pipeline by hand.

How does Auraa keep data governed while still moving fast?

Auraa enforces governance as a structural property of how tools run, not as a manual review step after the fact. Its governance component checks at design time whether an agent is allowed to use a given tool on a given tenant's data, and again at runtime whether the principal is authorized for the operation. Grant policies live in versioned Delta tables with no hardcoded fallbacks, so if a required entry is missing the system raises an explicit error instead of defaulting to an insecure state.

Why does Auraa use Delta Lake tables for messaging instead of Kafka?

Auraa's components communicate through an event bus built entirely on Delta Lake tables with Change Data Feed, rather than a managed message queue. This is an architectural choice, not just a cost decision, though the cost gap is large: roughly $50 a month versus $2,000 to $15,000 a month for managed queue services at comparable volume. Event-driven communication lets each component operate, fail, and recover independently, and because the events are Delta tables, they are SQL-queryable, retained, and governed by Unity Catalog like every other asset.

How long does it actually take to onboard a new source or a new tenant?

Auraa is designed so a new tenant or source comes online without a ticket or a manual runbook. During tenant provisioning, the catalog, schemas, volumes, and grants are created automatically. Onboarding runs asynchronously in the background: the request returns a task ID immediately, and you can use it to track progress in real time. The same pre-built tooling is reused for the next application, so each new source that needs the same ingestion, quality, and modeling work inherits the platform's existing capabilities instead of starting over.
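
The request-then-track pattern can be sketched as follows. This is a simplified, single-process illustration with hypothetical names. The worker is advanced by explicit calls so the example stays deterministic, whereas the article describes commands travelling over DeltaBus to a background process.

```python
import uuid
from collections import deque
from typing import Deque, Dict

# Resources created during provisioning, in the order the article lists them.
STEPS = ("catalog", "schemas", "volumes", "grants")


class OnboardingService:
    def __init__(self) -> None:
        self._queue: Deque[str] = deque()
        self._status: Dict[str, dict] = {}

    def request_onboarding(self, tenant: str) -> str:
        """Record the command and return a task ID without doing the work."""
        task_id = str(uuid.uuid4())
        self._status[task_id] = {
            "tenant": tenant, "state": "PENDING", "completed_steps": [],
        }
        self._queue.append(task_id)
        return task_id

    def status(self, task_id: str) -> dict:
        record = self._status[task_id]
        # Return a copy so callers cannot mutate internal state.
        return {**record, "completed_steps": list(record["completed_steps"])}

    def work_one_step(self) -> bool:
        """Advance the oldest unfinished task by one step; False when idle."""
        if not self._queue:
            return False
        record = self._status[self._queue[0]]
        record["state"] = "RUNNING"
        record["completed_steps"].append(STEPS[len(record["completed_steps"])])
        if len(record["completed_steps"]) == len(STEPS):
            record["state"] = "COMPLETED"
            self._queue.popleft()
        return True


service = OnboardingService()
task_id = service.request_onboarding("acme")
before = service.status(task_id)      # queryable immediately

service.work_one_step()
service.work_one_step()
midway = service.status(task_id)      # partial progress is visible

while service.work_one_step():
    pass
after = service.status(task_id)
```

The caller never blocks on provisioning: it holds only the task ID and reads `state` and `completed_steps` whenever it wants an update.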

We have built data lakes before that nobody could reuse. What makes this reusable across projects?

Auraa is structured so the investment in solving one data problem lowers the cost of every problem after it. Each of its components owns one slice of the lifecycle (provisioning, discovery, ingestion, quality, modeling, retrieval, governance) and exposes its work as tools registered in a runtime directory. When a new application needs to ingest from the same systems, enforce the same quality rules, and build the same kind of semantic model, it supplies only the domain-specific configuration and inherits the rest, which is the compounding effect of a platform-centric design.

How is Auraa different from just adding an AI copilot to our existing data stack?

Auraa treats AI agents as the primary consumer of the platform, whereas a copilot is added to a platform built for human operators. A copilot can suggest a SQL query, but a human still has to run it, check the output, and decide what happens next, so it cannot reliably act. Auraa exposes every capability as a tool with typed inputs, typed outputs, and a machine-readable contract, so agents can discover and invoke them directly, and any new tool becomes available to every agent the moment it is registered.
