
Overview

A correction pipeline is a saved sequence of cleanup steps — null fills, format standardization, deduplication, custom transformations — that you can apply to a file in one shot and reuse across other files later. If auto-remediation is “fix this one file now” and the Fix Inbox is “approve AI-suggested fixes as they appear,” correction pipelines are the missing third piece: the cleanup recipes you’ve already decided work, ready to run on demand.

When to use a pipeline

Use a pipeline when:
  • You have a recurring file format (monthly transaction exports, weekly sales dumps) that always needs the same cleanup
  • You’ve manually fixed the same set of issues on three or more files and want to stop repeating yourself
  • You want to apply the same correction sequence to a different file and be able to roll it back
If the cleanup is one-off, use auto-remediation instead. Pipelines have higher setup cost in exchange for repeatability.

Pipeline anatomy

Every pipeline consists of:
Field | Description
Name | Human-readable label (e.g. “Monthly orders cleanup”)
File ID | The file the pipeline was authored against (used for the column schema)
Status | active (runnable) or archived
Steps | Ordered list of correction actions
Version | Auto-incremented on every edit so you can roll back
Each step has:
  • A correction type (null fill, format fix, dedupe, regex replace, custom)
  • Configuration (which column, which strategy, parameters)
  • Status — proposed (suggested by AI, awaiting approval), active (approved, will run), applied (already run), rejected
  • Source — auto (engine-generated) or user (manually added)
  • Confidence score (only for AI-generated steps)
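The fields above can be pictured as a small data model. This is only an illustrative sketch: the class names, attribute names, and the `runnable_steps` helper are assumptions made for this example, not the product's actual schema, and the correction-type spellings (`null_fill`, `dedupe`) are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    correction_type: str                # e.g. "null_fill", "dedupe" (spellings assumed)
    config: dict                        # which column, which strategy, parameters
    status: str = "proposed"            # proposed | active | applied | rejected
    source: str = "user"                # auto | user
    confidence: Optional[float] = None  # only set for AI-generated steps


@dataclass
class Pipeline:
    name: str
    file_id: str
    status: str = "active"              # active | archived
    steps: list = field(default_factory=list)
    version: int = 1

    def runnable_steps(self):
        """Return approved steps; an archived pipeline has none to run."""
        if self.status != "active":
            return []
        return [s for s in self.steps if s.status == "active"]


pipeline = Pipeline(
    name="Monthly orders cleanup",
    file_id="file-123",  # hypothetical ID
    steps=[
        Step("null_fill", {"column": "amount", "strategy": "median"},
             status="active", source="auto", confidence=0.94),
        Step("dedupe", {"columns": ["order_id"]}, status="proposed",
             source="auto", confidence=0.71),
        Step("regex_replace", {"column": "phone"}, status="rejected"),
    ],
)

runnable = [s.correction_type for s in pipeline.runnable_steps()]
```

In this sketch only the approved `null_fill` step would run; the proposed and rejected steps stay in the pipeline but are skipped.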

Creating a pipeline

There are two ways to create a pipeline: from the Fix Inbox, or from scratch with the Correction Wizard.

From the Fix Inbox

The fastest path. The Fix Inbox already groups AI-suggested fixes by file into pipeline-shaped objects. Approve the steps you want, optionally add manual ones, and save.
1. Open Fix Inbox

Navigate to Fix Inbox in the sidebar. Each file with proposed fixes appears as a draft pipeline.

2. Review and approve steps

Inspect each proposed step’s before/after preview. Approve the ones you trust, reject the ones you don’t.

3. Save as named pipeline

Click Save as pipeline and give it a name. The pipeline is now reusable across other files.

From scratch (Correction Wizard)

For files where you want to author the steps by hand:
1. Open the Correction Wizard

Navigate to Corrections in the sidebar and click New pipeline.

2. Pick a source file

Select the file whose schema the pipeline should match.

3. Add steps

Add corrections one at a time. The wizard suggests strategies based on detected issues, but you can override every choice.

4. Preview and save

Run the preview on the source file to verify the result, then save.

Running a pipeline

Once a pipeline is saved, you can apply it to any file with a compatible schema.
# Preview the pipeline against a file (no changes written)
POST /api/v1/corrections/{pipeline_id}/preview
{
  "file_id": "8fa1..."
}

# Apply the pipeline (creates a new remediated file)
POST /api/v1/corrections/{pipeline_id}/apply
{
  "file_id": "8fa1..."
}
In the UI: open the pipeline, click Apply to file, pick the target file, review the preview, and confirm. The original file is never modified — applying a pipeline always produces a new remediated copy. If you re-run the same pipeline on the same source, you get a new copy each time (the previous one is preserved).
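For scripted use, the preview-then-apply flow might look like the following minimal sketch. It only builds the two requests and does not send them. The host `api.example.com`, the bearer-token header, and the IDs are placeholders assumed for illustration; only the endpoint paths and the `file_id` body field come from the requests shown above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host


def build_request(pipeline_id, action, file_id, token):
    """Build (but do not send) a preview or apply request for a pipeline."""
    if action not in ("preview", "apply"):
        raise ValueError(f"unsupported action: {action}")
    url = f"{BASE_URL}/api/v1/corrections/{pipeline_id}/{action}"
    body = json.dumps({"file_id": file_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )


# Preview first (no changes written), then apply once the preview looks right.
preview_req = build_request("pipe-42", "preview", "file-123", token="YOUR_TOKEN")
apply_req = build_request("pipe-42", "apply", "file-123", token="YOUR_TOKEN")

# To actually send a request:
# with urllib.request.urlopen(preview_req) as resp:
#     result = json.load(resp)
```

Keeping the request construction separate from the network call makes it easy to log or inspect exactly what will be sent before anything is applied.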

Versioning and rollback

Every edit to a pipeline creates a new version. The pipeline detail page shows the version history with timestamps and a diff between versions. To roll back:
POST /api/v1/corrections/{pipeline_id}/revert
{
  "to_version": 3
}
Reverting creates a new version that matches the target version’s steps — it doesn’t delete intermediate history. The audit trail is preserved.
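The append-only behavior of a revert can be modeled in a few lines. This is a toy simulation of the semantics described above, not the service's implementation; the `history` structure and the step names are invented for the example.

```python
def revert(history, to_version):
    """Return a new history with one extra version copying the target's steps."""
    target = next((v for v in history if v["version"] == to_version), None)
    if target is None:
        raise ValueError(f"version {to_version} not found")
    new_version = {
        "version": history[-1]["version"] + 1,  # revert is itself a new edit
        "steps": list(target["steps"]),         # copy, so versions stay independent
    }
    return history + [new_version]              # earlier versions are untouched


history = [
    {"version": 1, "steps": ["null_fill"]},
    {"version": 2, "steps": ["null_fill", "dedupe"]},
    {"version": 3, "steps": ["null_fill", "dedupe", "regex_replace"]},
]

after = revert(history, to_version=2)
```

Here `after` holds four versions: the original three plus a version 4 whose steps match version 2.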

Bulk approve and reject

For pipelines with many proposed steps (common after the AI suggester runs), bulk actions speed things up:
# Approve every step above a confidence threshold
POST /api/v1/corrections/{pipeline_id}/approve-all-high-confidence
{
  "min_confidence": 0.9
}

# Bulk action on a selection ("action" is "approve", "reject", or "apply")
POST /api/v1/corrections/{pipeline_id}/bulk
{
  "step_ids": ["...", "...", "..."],
  "action": "approve"
}
Bulk actions respect the workspace trust threshold — anything below the threshold is skipped even when you ask for “all high-confidence.”
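One way to reason about how the two thresholds interact is sketched below. It assumes the effective cutoff is the higher of `min_confidence` and the workspace trust threshold, and that the comparison is inclusive; both are assumptions for illustration, as are the function name and the step dictionaries.

```python
def steps_to_approve(steps, min_confidence, workspace_threshold):
    """Return IDs of proposed steps that clear both thresholds."""
    cutoff = max(min_confidence, workspace_threshold)  # assumed combination rule
    return [
        s["id"]
        for s in steps
        if s["status"] == "proposed"
        and s["confidence"] is not None   # manually added steps carry no score
        and s["confidence"] >= cutoff     # inclusive comparison is an assumption
    ]


steps = [
    {"id": "s1", "status": "proposed", "confidence": 0.97},
    {"id": "s2", "status": "proposed", "confidence": 0.92},
    {"id": "s3", "status": "proposed", "confidence": 0.85},
    {"id": "s4", "status": "active", "confidence": None},
]

approved = steps_to_approve(steps, min_confidence=0.9, workspace_threshold=0.95)
```

With a workspace threshold of 0.95, step `s2` is skipped even though it clears the requested `min_confidence` of 0.9.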

API reference

# List all pipelines for the org
GET /api/v1/corrections

# Get a specific pipeline with its steps
GET /api/v1/corrections/{pipeline_id}

# Edit a single step
POST /api/v1/corrections/{pipeline_id}/steps/{step_id}/edit
{
  "config": { ... }
}

# Approve / reject a single step
POST /api/v1/corrections/{pipeline_id}/steps/{step_id}/approve
POST /api/v1/corrections/{pipeline_id}/steps/{step_id}/reject

# Pending step count across all pipelines
GET /api/v1/corrections/pending-count

Tips

  • Name pipelines after the file pattern, not the issue. “Monthly orders cleanup” beats “Null fills” — when you have ten pipelines, you’ll search by what they apply to, not what they do.
  • Start by promoting Fix Inbox approvals. The fastest way to your first pipeline is to clean up one representative file in the Fix Inbox and save the result.
  • Use preview liberally. Pipelines are reusable, so a bad step is repeated on every file the pipeline touches. Always preview against a fresh file before a scheduled apply.
  • Pair with contracts. Once a pipeline reliably produces clean output, define a contract that enforces the post-cleanup state — so any future file that needs the same pipeline trips a contract violation if you forget to run it.
  • Archive, don’t delete. Setting status = archived keeps the version history available for audit while removing the pipeline from active lists.
