Overview
Data lineage is the map of where your data comes from, where it goes, and how it transforms along the way. ORCA’s lineage view shows your tables, files, warehouses, and dashboards as nodes in a graph, with edges representing the relationships between them. Lineage answers two questions that nothing else in the platform can:

- “If I change this column, what breaks downstream?” Impact analysis traces every node that depends on a given source.
- “Where did this number come from?” Upstream traversal walks the chain back to the original raw data.
How nodes are created
ORCA’s lineage graph is populated from three sources, in order of preference:

| Source | When it runs | What it creates |
|---|---|---|
| Auto-detection from foreign keys | When you connect a PostgreSQL data source | One node per table, one edge per FK constraint |
| Auto-detection from analyzed files | When a job completes | One node per file, edges to its parent data source |
| Manual entry | Anytime, in edit mode | Custom nodes (dashboards, BI tools, downstream consumers) |
Auto-detection from PostgreSQL
When you connect a PostgreSQL source, ORCA scans information_schema for foreign key constraints and creates lineage edges automatically. The detection is capped at 500 FK relationships per database to keep large schemas responsive.
You can re-trigger detection at any time from the source detail page in Sources.
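To make the detection step concrete, here is a minimal sketch of how foreign keys could be read from information_schema and turned into nodes and edges. ORCA’s actual query and internal data model are not documented here, so everything in this block is an assumption: the FK_QUERY text is a commonly used information_schema join, and FkRow, build_graph, and the edge direction (referencing table to referenced table) are hypothetical names and choices made for illustration.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

# A commonly used information_schema join for listing foreign keys.
# Assumption: ORCA's real query may differ. Composite foreign keys can
# yield more than one row per constraint with this join.
FK_QUERY = """
SELECT tc.table_schema, tc.table_name, kcu.column_name,
       ccu.table_schema AS ref_schema,
       ccu.table_name   AS ref_table,
       ccu.column_name  AS ref_column
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
  ON kcu.constraint_name = tc.constraint_name
 AND kcu.constraint_schema = tc.constraint_schema
JOIN information_schema.constraint_column_usage AS ccu
  ON ccu.constraint_name = tc.constraint_name
 AND ccu.constraint_schema = tc.constraint_schema
WHERE tc.constraint_type = 'FOREIGN KEY'
"""

MAX_FK_RELATIONSHIPS = 500  # cap stated in the text above


class FkRow(NamedTuple):
    """One result row of FK_QUERY (hypothetical helper type)."""
    schema: str
    table: str
    column: str
    ref_schema: str
    ref_table: str
    ref_column: str


Edge = Tuple[str, str, str, str]  # (from_table, to_table, from_column, to_column)


def build_graph(
    rows: Iterable[FkRow], limit: int = MAX_FK_RELATIONSHIPS
) -> Tuple[Set[str], List[Edge]]:
    """Turn FK rows into table nodes and edges, stopping at the cap."""
    nodes: Set[str] = set()
    edges: List[Edge] = []
    for row in rows:
        if len(edges) >= limit:
            break  # ignore relationships beyond the cap
        child = f"{row.schema}.{row.table}"
        parent = f"{row.ref_schema}.{row.ref_table}"
        nodes.update((child, parent))
        edges.append((child, parent, row.column, row.ref_column))
    return nodes, edges
```

This sketch only creates nodes for tables that take part in a foreign key. A full implementation would also list every table so that tables without constraints still appear as nodes. Running FK_QUERY requires a PostgreSQL driver, which is left out to keep the example self-contained.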
Working with the graph
Open Lineage from the sidebar.
Filter by source
Use the source dropdown in the toolbar to scope the graph to a single warehouse or data source. This is useful when you have many connected systems.
Click a node
The detail panel shows node type, schema, database, owning data source, and the raw metadata payload.
Run impact analysis
From any node detail panel, click Impact analysis. ORCA traverses the graph downstream up to 10 levels deep and highlights every node that would be affected by a change to the selected one.
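The downstream traversal described above can be sketched as a depth-limited breadth-first search. This is an illustration under stated assumptions, not ORCA’s implementation: the function name impact_analysis, the (source, target) edge tuples, and the returned depth map are all hypothetical. Only the 10-level cap comes from the text.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Tuple

MAX_DEPTH = 10  # depth cap stated in the text above


def impact_analysis(
    edges: Iterable[Tuple[str, str]], start: str, max_depth: int = MAX_DEPTH
) -> Dict[str, int]:
    """Return every node downstream of `start`, mapped to its distance in levels."""
    # Build an adjacency list: node -> nodes that directly depend on it.
    downstream: Dict[str, List[str]] = {}
    for source, target in edges:
        downstream.setdefault(source, []).append(target)

    affected: Dict[str, int] = {}
    queue: Deque[Tuple[str, int]] = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth cap
        for child in downstream.get(node, []):
            # Skip nodes already reached so that cycles terminate.
            if child != start and child not in affected:
                affected[child] = depth + 1
                queue.append((child, depth + 1))
    return affected


edges = [("raw", "staging"), ("staging", "mart"), ("mart", "dashboard")]
print(impact_analysis(edges, "raw"))
# {'staging': 1, 'mart': 2, 'dashboard': 3}
```

Passing the edges with source and target swapped turns the same function into an upstream walk, which answers the "where did this number come from?" question.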
Keyboard shortcuts
| Key | Action |
|---|---|
| f | Fit the graph to the viewport |
| Esc | Close any open panel or modal |
Manual nodes and edges
Some parts of your data flow live outside ORCA, such as Tableau dashboards, downstream microservices, and ML pipelines. You can model these by hand.
Add a node
Click Add node. Pick a node type (table, file, view, dashboard, model), give it a name, and optionally link it to a data source.
Manual nodes are tagged detected_by: manual so you can distinguish them from auto-detected ones.
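As a rough illustration of how the detected_by marker can be used, the sketch below models a node and filters for the manual ones. The node types and the detected_by field come from the text. The LineageNode class, its other field names, and the "foreign_key" value used for an auto-detected node are hypothetical and may not match ORCA’s schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Node types listed in the text above.
NODE_TYPES = {"table", "file", "view", "dashboard", "model"}


@dataclass(frozen=True)
class LineageNode:
    """Hypothetical in-memory shape of a lineage node."""
    name: str
    node_type: str
    detected_by: str = "manual"        # manual nodes carry this marker
    data_source: Optional[str] = None  # optional link to a data source

    def __post_init__(self) -> None:
        if self.node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.node_type!r}")


def manual_nodes(nodes: Iterable[LineageNode]) -> List[LineageNode]:
    """Keep only the nodes that were added by hand."""
    return [n for n in nodes if n.detected_by == "manual"]


nodes = [
    # "foreign_key" is an assumed marker for an auto-detected node.
    LineageNode("orders", "table", detected_by="foreign_key", data_source="warehouse"),
    LineageNode("Revenue dashboard", "dashboard"),
]
print([n.name for n in manual_nodes(nodes)])
# ['Revenue dashboard']
```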
API reference
Lineage is fully accessible via the REST API. See API endpoints for the complete schema.
Write operations (POST, PATCH, DELETE) require the admin role and are rate-limited to 20 requests per minute.
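A client that issues many POST, PATCH, or DELETE calls can pace itself to stay under the 20 requests per minute limit. The sketch below is one way to do that on the client side with a sliding window. It is an assumption that a sliding window matches how the server counts requests, and WriteRateLimiter is a hypothetical helper, not part of any ORCA SDK. No endpoint paths are shown because they are defined in the API endpoints reference.

```python
import time
from collections import deque
from typing import Callable, Deque


class WriteRateLimiter:
    """Client-side sliding-window limiter (hypothetical helper)."""

    def __init__(
        self,
        max_requests: int = 20,        # limit stated in the text above
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent writes

    def wait_time(self) -> float:
        """Seconds to wait before the next write is allowed (0.0 if none)."""
        now = self._clock()
        # Forget writes that have fallen out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def record(self) -> None:
        """Call right after sending a write request."""
        self._sent.append(self._clock())
```

Before each write, sleep for limiter.wait_time() seconds, send the request, then call limiter.record(). The injectable clock exists only so the limiter can be tested without real delays.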
Limitations
- Foreign-key auto-detection currently supports PostgreSQL only. BigQuery and Snowflake nodes must be added manually or through file ingestion.
- Impact analysis depth is capped at 10 levels to prevent runaway queries on highly connected graphs.
- Column-level lineage mappings are stored on edges but only auto-populated for FK-detected edges. Manual edges support column mappings via the API.
What’s next?
- Connect a PostgreSQL source to auto-populate your graph
- Pair lineage with data contracts to understand which contracts protect upstream nodes
- Use knowledge graph for entity-level (rather than table-level) relationships