Kreuzberg
kreuzberg.dev
Document intelligence for AI engineering workflows.
AI Tools
document-extraction, ocr, rag, api, embeddings, ai-pipeline, developer-tools

About
Kreuzberg is a document intelligence API that extracts structured, machine-readable content from PDFs, images, Office files, and 90+ other formats for use in AI pipelines. It offers OCR, layout detection, table extraction, semantic chunking, embeddings, and LLM-powered extraction in a single API call. Available as a managed cloud service or self-hosted, it is built on a high-performance Rust core for millisecond-speed processing.
Problem
Extracting structured, machine-readable content from diverse document formats is slow and operationally complex, creating bottlenecks in AI pipelines.
For
AI engineers and developers building document intelligence pipelines, RAG systems, or AI agents
How it works
Users send documents via API, SDK, CLI, or Docker; Kreuzberg processes them with OCR, layout detection, table extraction, and optional LLM-powered schema extraction, then returns a structured JSON response or delivers results via webhook.
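The request-and-response flow described above can be sketched in Python. This is only an illustration and not Kreuzberg's actual API: the option names (`ocr`, `tables`, `schema`, `webhook_url`), the response fields (`content`, `tables`, `extracted`), and the helper functions are all assumptions invented for the example, and no network call is made.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


def build_options(
    ocr: bool = True,
    tables: bool = True,
    schema: Optional[Dict[str, str]] = None,
    webhook_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble hypothetical processing options for one document.

    The keys are illustrative placeholders, not documented Kreuzberg parameters.
    """
    options: Dict[str, Any] = {"ocr": ocr, "tables": tables}
    if schema is not None:
        # Optional LLM-powered schema extraction.
        options["schema"] = schema
    if webhook_url is not None:
        # Deliver results asynchronously instead of in the response body.
        options["webhook_url"] = webhook_url
    return options


@dataclass
class ExtractionResult:
    """Assumed shape of the structured JSON response."""

    text: str
    tables: List[Any] = field(default_factory=list)
    extracted: Optional[Dict[str, Any]] = None


def parse_result(payload: str) -> ExtractionResult:
    """Parse a JSON response body (or webhook payload) into a typed result."""
    data = json.loads(payload)
    return ExtractionResult(
        text=data.get("content", ""),
        tables=data.get("tables", []),
        extracted=data.get("extracted"),
    )


# A made-up response body standing in for what the service might return.
SAMPLE_RESPONSE = json.dumps(
    {
        "content": "Invoice 1001\nTotal: 250.00",
        "tables": [[["Item", "Price"], ["Widget", "250.00"]]],
        "extracted": {"invoice_number": "1001"},
    }
)

options = build_options(schema={"invoice_number": "string"})
result = parse_result(SAMPLE_RESPONSE)
```

Parsing the response into a small typed object keeps downstream pipeline code (chunking, embedding, indexing) independent of the raw JSON layout. The same parser could handle a webhook payload if it carries the same fields.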
Business model
freemium
Status
launched