DataHub
datahubproject.io
The Context Platform for AI Agents
Data & Analytics
data-catalog, metadata-management, data-governance, data-lineage, open-source, ai-context, data-observability

About
DataHub is an open-source data catalog and context management platform that helps teams discover, understand, and govern their data assets. It provides a unified context graph with lineage, data quality, and governance capabilities designed to serve both human data teams and AI agents. Available as a self-hosted open-source solution or a fully managed cloud offering, it is trusted by over 3,000 organizations.
Problem
Data teams and AI agents lack a unified, trusted source of context (lineage, quality, governance, and semantic definitions), which makes it hard to discover and rely on data assets.
For
Data engineers, data teams, and AI agent developers at mid-to-large organizations
How it works
DataHub ingests metadata from structured data, unstructured documents, business apps, and semantic models into a unified context graph, then exposes it via MCP, APIs, and SDKs so both humans and AI agents can query and act on it.
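To make the idea of a unified context graph concrete, here is a minimal, self-contained sketch in Python. It is not DataHub's API or data model: the `ContextGraph` class, its methods, and every asset name are hypothetical, chosen only to illustrate how metadata from different kinds of sources could sit in one graph and answer a lineage question.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class ContextGraph:
    """Toy in-memory metadata graph (hypothetical; not DataHub's data model)."""

    assets: dict = field(default_factory=dict)  # asset id -> metadata dict
    downstream: dict = field(default_factory=lambda: defaultdict(set))

    def ingest(self, asset_id, source, **metadata):
        # Record an asset together with the kind of source it came from.
        self.assets[asset_id] = {"source": source, **metadata}

    def add_lineage(self, upstream, downstream):
        # Directed edge: `downstream` is derived from `upstream`.
        self.downstream[upstream].add(downstream)

    def impacted_by(self, asset_id):
        # Breadth-first walk over lineage edges to collect everything downstream.
        seen, queue = set(), deque([asset_id])
        while queue:
            for nxt in self.downstream.get(queue.popleft(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return sorted(seen)


# Hypothetical assets, one per source type named in the description above.
graph = ContextGraph()
graph.ingest("warehouse.orders", source="structured data", owner="data-eng")
graph.ingest("docs.orders_runbook", source="unstructured document")
graph.ingest("bi.revenue_dashboard", source="business app")
graph.ingest("metrics.net_revenue", source="semantic model")
graph.add_lineage("warehouse.orders", "metrics.net_revenue")
graph.add_lineage("metrics.net_revenue", "bi.revenue_dashboard")

print(graph.impacted_by("warehouse.orders"))
# → ['bi.revenue_dashboard', 'metrics.net_revenue']
```

In a real deployment this kind of question would go through the platform's own MCP server, APIs, or SDKs rather than an in-memory structure; the sketch only shows the shape of the query.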
Business model
freemium
Status
launched
Company
DataHub Project