Lamin
lamin.ai
Open data lakehouse for biology, context and memory at scale
Data & Analytics, bioinformatics, data-lakehouse, open-source, data-lineage, scientific-data, biology, python

About
Lamin is an open-source data platform for biology research teams. It provides a lineage-native lakehouse that supports biological file formats, registries, and ontologies, and it lets users query, trace, and validate datasets and models with a single line of code. The platform integrates with relational metadata stores and supports major cloud providers without vendor lock-in.
Problem
Biological research teams lack a unified, traceable way to manage datasets, models, and metadata at scale across diverse infrastructure.
For
Biology research teams and bioinformatics engineers
How it works
Lamin wraps cloud storage (S3, GCP, Azure) and databases (Postgres, SQLite) with a lineage-tracking layer and a Python/R API. That layer records data provenance, enforces schemas, and maps assets into a queryable lakehouse that supports biological formats such as AnnData and Zarr.
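To make the idea of a lineage-tracking layer over a relational metadata store concrete, here is a minimal sketch using only the Python standard library. It is not Lamin's API or schema: the table layout and the functions `open_registry`, `register`, and `trace` are hypothetical names invented for illustration, and the file keys and contents are made-up sample data.

```python
import hashlib
import sqlite3


def open_registry(path=":memory:"):
    """Open a toy metadata store (hypothetical schema, not Lamin's)."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS artifact (
            id INTEGER PRIMARY KEY,
            key TEXT NOT NULL,
            sha256 TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS lineage (
            output_id INTEGER NOT NULL REFERENCES artifact(id),
            input_id INTEGER NOT NULL REFERENCES artifact(id),
            transform TEXT NOT NULL
        );
        """
    )
    return db


def register(db, key, content, inputs=(), transform="manual"):
    """Record an artifact by content hash plus the inputs it was derived from."""
    digest = hashlib.sha256(content).hexdigest()
    row = db.execute(
        "SELECT id FROM artifact WHERE sha256 = ?", (digest,)
    ).fetchone()
    if row:
        # Identical content is already registered; reuse its id.
        return row[0]
    cur = db.execute(
        "INSERT INTO artifact (key, sha256) VALUES (?, ?)", (key, digest)
    )
    artifact_id = cur.lastrowid
    db.executemany(
        "INSERT INTO lineage VALUES (?, ?, ?)",
        [(artifact_id, input_id, transform) for input_id in inputs],
    )
    db.commit()
    return artifact_id


def trace(db, artifact_id):
    """Return the keys of all upstream artifacts, nearest ancestor first."""
    rows = db.execute(
        """
        WITH RECURSIVE upstream(id, depth) AS (
            SELECT input_id, 1 FROM lineage WHERE output_id = ?
            UNION
            SELECT l.input_id, u.depth + 1
            FROM lineage AS l JOIN upstream AS u ON l.output_id = u.id
        )
        SELECT a.key FROM upstream AS u JOIN artifact AS a ON a.id = u.id
        ORDER BY u.depth, a.id
        """,
        (artifact_id,),
    ).fetchall()
    return [r[0] for r in rows]


# Sample pipeline: raw data -> normalized table -> trained model.
db = open_registry()
raw = register(db, "raw/counts.csv", b"gene,count\nTP53,12\n")
norm = register(
    db, "processed/normalized.csv", b"gene,value\nTP53,0.5\n",
    inputs=[raw], transform="normalize.py",
)
model = register(
    db, "models/classifier.bin", b"\x00weights",
    inputs=[norm], transform="train.py",
)
lineage = trace(db, model)
print(lineage)
```

In this sketch the provenance lives entirely in the database, so the files themselves could sit in any storage backend, which is one way to read the "no vendor lock-in" claim. A real system would also need schema validation and format-aware readers, which are omitted here.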
Business model
open-source
Status
launched