Data dictionary or enterprise ontology: knowing your data vs. understanding it

The terms get used interchangeably. They should not.

In most data conversations, "data dictionary" and "ontology" get used as if they describe roughly the same thing. They do not. The difference between them is the difference between knowing what data you have and understanding what it means — and that distinction sits at the heart of why so many AI and analytics initiatives quietly underdeliver.

If you only ever get to the dictionary, you have documentation. If you get to the ontology, you have understanding. Most organisations never make the second jump.

The core insight: A data dictionary tells you what a field is. An ontology tells you what it means in relation to everything else. That is the difference between documentation and understanding — and the difference between AI that hallucinates and AI that works.

What a data dictionary actually is

A data dictionary is a structured catalogue of the data assets inside a system or an estate. For each field, it typically captures the name, the data type, a description, sometimes a quality score, and sometimes a sensitivity classification. Done well, it gives a team a shared reference point for what exists.

That is genuinely useful. It is also limited. A dictionary describes fields in isolation. It will tell you that customer_id is a string in the CRM, an integer in the order management system, and a GUID in the finance ledger. It will not tell you that those three fields all refer to the same customer, or that the relationship between them is what makes a single view of revenue possible.
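To make that limitation concrete, here is a minimal Python sketch of the three customer_id entries as a dictionary might hold them. The DictionaryEntry class, the system names, and the descriptions are illustrative assumptions, not the schema of any particular catalogue tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One row of a hypothetical data dictionary: a single field in a single system."""
    system: str
    field: str
    data_type: str
    description: str
    quality_score: Optional[float] = None   # optional, as in many dictionaries
    sensitivity: Optional[str] = None       # optional classification label


# Illustrative entries only; the descriptions are made up for this sketch.
entries = [
    DictionaryEntry("crm", "customer_id", "string", "Customer identifier in the CRM"),
    DictionaryEntry("order_management", "customer_id", "integer", "Customer key on each order"),
    DictionaryEntry("finance_ledger", "customer_id", "guid", "Customer reference on ledger lines"),
]


def find_by_field(name: str) -> list:
    """Return every entry whose field name matches. This is a name match, nothing more."""
    return [e for e in entries if e.field == name]


matches = find_by_field("customer_id")
for e in matches:
    print(f"{e.system}.{e.field}: {e.data_type}")
```

The lookup finds three rows that happen to share a name. Nothing in the structure records that they identify the same customer, or how a value in one system maps to a value in another.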

Dictionaries answer the question "what do we have?" They do not answer the question "what does it mean across the business?"

What an ontology adds

An ontology is a structured model of business entities and the relationships between them. It does not just list fields. It maps how the things those fields describe relate to one another, across systems.

An ontology says: this entity is a Customer. A Customer has Orders. An Order has Line Items. A Line Item references a Product. A Product is fulfilled from a Warehouse. Each of those entities is assembled from fields that live in different systems, and the ontology is what makes them addressable as a single concept.
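The chain above can be sketched as a small directed graph. This is a minimal illustration under stated assumptions: the Entity and Ontology classes are hypothetical, and every source field other than the three customer_id fields mentioned earlier is invented for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A business concept plus the (system, field) pairs it is assembled from."""
    name: str
    source_fields: list[tuple[str, str]] = field(default_factory=list)


@dataclass
class Ontology:
    entities: dict[str, Entity] = field(default_factory=dict)
    # Each relationship is a (subject, predicate, object) triple.
    relationships: list[tuple[str, str, str]] = field(default_factory=list)

    def add_entity(self, name, source_fields):
        self.entities[name] = Entity(name, list(source_fields))

    def relate(self, subject, predicate, obj):
        if subject not in self.entities or obj not in self.entities:
            raise KeyError("both entities must be defined before relating them")
        self.relationships.append((subject, predicate, obj))

    def path(self, start, end):
        """Breadth-first search along relationship direction; None if no route exists."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            route = queue.popleft()
            if route[-1] == end:
                return route
            for subject, _, obj in self.relationships:
                if subject == route[-1] and obj not in seen:
                    seen.add(obj)
                    queue.append(route + [obj])
        return None


onto = Ontology()
# The Customer fields come from the text; the rest are hypothetical placeholders.
onto.add_entity("Customer", [("crm", "customer_id"),
                             ("order_management", "customer_id"),
                             ("finance_ledger", "customer_id")])
onto.add_entity("Order", [("order_management", "order_id")])
onto.add_entity("LineItem", [("order_management", "line_id")])
onto.add_entity("Product", [("catalogue", "sku")])
onto.add_entity("Warehouse", [("logistics", "warehouse_code")])

onto.relate("Customer", "has", "Order")
onto.relate("Order", "has", "LineItem")
onto.relate("LineItem", "references", "Product")
onto.relate("Product", "fulfilled_from", "Warehouse")

route = onto.path("Customer", "Warehouse")
print(" -> ".join(route))
```

The point of the sketch is that Customer is one addressable concept backed by three differently typed fields, and that the route from a Customer to a Warehouse is something the model can work out rather than something an analyst has to remember.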

The practical effect is significant. Once an ontology exists, you can ask questions that cross systems, such as "which customers placed orders in Q3 that have not yet been invoiced?", without anyone having to write a join across four databases by hand. The ontology already knows how those entities connect.
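As a rough illustration of what answering that question involves, the sketch below resolves the Q3 example over three in-memory sources. All records, identifiers, and the customer_identity mapping are made up, Q3 is assumed to mean calendar Q3 of 2024, and a real estate would query live systems rather than Python lists.

```python
from datetime import date

# Hypothetical extracts from three systems, each with its own identifier style.
crm_customers = [
    {"customer_id": "C-001", "name": "Acme"},
    {"customer_id": "C-002", "name": "Globex"},
]
orders = [
    {"order_id": 1, "customer_id": 101, "placed_on": date(2024, 8, 14)},
    {"order_id": 2, "customer_id": 102, "placed_on": date(2024, 9, 2)},
    {"order_id": 3, "customer_id": 101, "placed_on": date(2024, 11, 5)},
]
invoices = [
    {"invoice_id": "INV-9", "order_id": 2},
]

# The piece an ontology contributes: how an order-system customer key
# maps to the CRM customer. (Assumed mapping, for illustration.)
customer_identity = {101: "C-001", 102: "C-002"}


def customers_with_uninvoiced_orders(start: date, end: date) -> list:
    """Names of customers with an order placed in [start, end] that has no invoice."""
    invoiced_order_ids = {inv["order_id"] for inv in invoices}
    names_by_crm_id = {c["customer_id"]: c["name"] for c in crm_customers}
    found = set()
    for order in orders:
        in_window = start <= order["placed_on"] <= end
        if in_window and order["order_id"] not in invoiced_order_ids:
            crm_id = customer_identity[order["customer_id"]]
            found.add(names_by_crm_id[crm_id])
    return sorted(found)


q3_result = customers_with_uninvoiced_orders(date(2024, 7, 1), date(2024, 9, 30))
print(q3_result)
```

Without the identity mapping, the integer keys in the order system cannot be tied back to CRM customers at all. With it, the cross-system question reduces to a few lookups.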

Why this matters for AI

An LLM, on its own, has no idea what your business means by a "customer" or an "order." It can guess, but the guesses are exactly the kind of confident wrongness that gives enterprise AI a bad reputation. Hallucinations are not a model problem so much as a grounding problem.

An ontology fixes the grounding problem. When an AI agent has a structured model of the business entities and their relationships, it can answer cross-system questions with provenance — it can show which fields, in which systems, the answer was assembled from. That is the difference between a chatbot that sounds plausible and an agent that produces auditable answers leadership can act on.
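One way to picture an answer with provenance is a result object that carries the fields it was assembled from. The sketch below is an assumption about how that could be structured, not a description of any specific product: FieldRef, GroundedAnswer, and the ENTITY_FIELDS mapping are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRef:
    """A pointer to one field in one system."""
    system: str
    field: str


@dataclass(frozen=True)
class GroundedAnswer:
    """An answer value together with the fields that were used to produce it."""
    value: object
    provenance: tuple


# Hypothetical slice of an ontology: which fields back each business entity.
ENTITY_FIELDS = {
    "Customer": (FieldRef("crm", "customer_id"), FieldRef("crm", "name")),
    "Order": (FieldRef("order_management", "order_id"),
              FieldRef("order_management", "customer_id")),
    "Invoice": (FieldRef("finance_ledger", "order_id"),),
}


def ground(value, entities_used):
    """Attach the backing fields of every entity an answer relied on."""
    refs = []
    for entity in entities_used:
        if entity not in ENTITY_FIELDS:
            # Refuse to answer from an entity the ontology does not define.
            raise KeyError(f"unknown entity: {entity}")
        refs.extend(ENTITY_FIELDS[entity])
    return GroundedAnswer(value=value, provenance=tuple(refs))


def audit_trail(answer):
    """Render provenance as 'system.field' strings for a reviewer."""
    return [f"{ref.system}.{ref.field}" for ref in answer.provenance]


answer = ground(["Acme"], ["Customer", "Order"])
trail = audit_trail(answer)
print(answer.value, trail)
```

The design choice worth noting is the refusal path: if an answer would depend on an entity the ontology does not define, the sketch raises an error instead of guessing, which is the behaviour that separates a grounded agent from a plausible-sounding one.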

This is also why the ontology has become the single most important missing layer in most enterprise AI projects. Teams skip past it because building one by hand takes months. So they ground their AI on a dictionary, or worse, on raw schema, and then wonder why the answers do not hold up.

Why organisations rarely build one

Ontologies are not new. The reason most enterprises do not have one is that, until recently, building one was a multi-quarter project requiring senior data architects, subject matter experts, and an enormous amount of cross-system documentation. The work is conceptually unglamorous, takes a long time, and almost never has a clear ROI on day one. So it gets deferred. And deferred. And deferred.

Then an AI initiative kicks off, the absence of the ontology becomes the bottleneck, and the choice is between waiting nine months to build one or shipping AI on a foundation that everyone privately knows is shaky.

Most organisations choose to ship. That is part of why so many enterprise AI projects struggle.

How Sidekick handles this

Sidekick builds both layers automatically. It scans every connected source, produces a data dictionary with business descriptions, quality scores, and sensitivity classifications for every field — and then goes further. It identifies the business entities that the fields describe, maps the relationships between them, and assembles a governed enterprise ontology across the entire estate.

That ontology is then queryable in plain English. A user can ask a cross-system business question through Microsoft Teams and get an answer that has been assembled from the right fields in the right systems, with full lineage for anyone who needs to audit how it was produced.

What used to be a nine-month build becomes a continuously updated artefact that lives alongside the data estate as it changes.

The takeaway

If your organisation is investing in AI or analytics and the conversation is still focused on building a data dictionary, the conversation is half-finished. The dictionary is necessary. It is not sufficient.

The ontology is what makes the data investable — what makes cross-system questions answerable, what makes AI grounded in something real, and what makes governance actually enforceable. The organisations that move fastest on AI are the ones that have it. The rest are still trying to build it by hand.

Ready to get started?

See your estate mapped as an ontology

Most organisations have never seen their data estate represented as a map of business entities and relationships. Talk to the Sidekick Lab team about a Proof of Value engagement and see what your ontology actually looks like.