Overview
Your organisation profile is a small set of preferences and learned context that ORCA uses to tailor its AI behavior to your data. It’s how ORCA knows that your team calls revenue “GMV,” prefers conservative auto-corrections, and shouldn’t bother re-suggesting fixes that were rejected last week. The profile is shared across everyone in your organisation. There’s exactly one profile per org, created automatically the first time someone in your org runs a job.
What the profile stores
| Field | Description |
|---|---|
| Data personality | High-level characterization of your typical datasets — domain, scale, common column types |
| Correction preferences | Per-issue-type preferences (e.g. “always median for null fills, never mode”) |
| Recurring issues | Issues ORCA has seen on your data multiple times across files |
| Learned rules | Quality rules promoted from accepted suggestions |
| Terminology | Your team’s vocabulary mapping (“GMV” → revenue, “MAU” → active users) |
| Trust level | conservative, balanced, or aggressive — controls auto-apply confidence thresholds |
| Correction stats | Counts of proposed / approved / rejected corrections, used to calibrate trust |
Trust levels
The trust level is the single most impactful setting on the profile because it controls how aggressively ORCA auto-applies fixes.
| Level | Auto-apply threshold | Behavior |
|---|---|---|
| Conservative (default) | 0.95 | Only the most confident fixes are pre-checked. Most fixes require explicit approval. |
| Balanced | 0.90 | Pre-checks high-confidence fixes. Good middle ground for established workspaces. |
| Aggressive | 0.85 | Pre-checks more fixes. Use only after you’ve validated the suggester’s accuracy on your data. |
ORCA defaults to conservative and recommends staying there for the first month. As your correction stats accumulate, and as long as you accept more suggestions than you reject, you can safely move up.
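As a rough illustration of how the thresholds in the table gate pre-checking, here is a minimal Python sketch. The names `AUTO_APPLY_THRESHOLDS` and `is_prechecked` are hypothetical, and treating "at or above the threshold" as the cut-off is an assumption rather than documented ORCA behavior.

```python
# Threshold values are copied from the trust level table.
# The dictionary name and the >= comparison are illustrative assumptions.
AUTO_APPLY_THRESHOLDS = {
    "conservative": 0.95,
    "balanced": 0.90,
    "aggressive": 0.85,
}


def is_prechecked(confidence: float, trust_level: str = "conservative") -> bool:
    """Return True if a fix with this confidence would be pre-checked."""
    return confidence >= AUTO_APPLY_THRESHOLDS[trust_level]
```

Under these assumptions, a fix with confidence 0.92 would be pre-checked at balanced but would still need explicit approval at conservative.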
Terminology mapping
Most teams have their own vocabulary for standard data concepts:
| Standard term | Your team might call it |
|---|---|
| Revenue | GMV, gross sales, top-line, ARR |
| Customer | Account, user, member, client |
| Active user | MAU, DAU, engaged user |
| Churn | Attrition, cancellation, deactivation |
With a terminology mapping in place, ORCA can:
- Classify ambiguous columns correctly. “GMV” is no longer an “unknown three-letter acronym”; it’s revenue.
- Use your language in explanations. AI Insights and chat responses use your terms, not the generic ones.
- Match contracts. Contract rules can target the standardized concept and still match your real column names.
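Conceptually, a terminology mapping behaves like a lookup from your team's terms to standard concepts. The sketch below is only an illustration of that idea: the `TERMINOLOGY` dictionary, the `standard_concept` function, and the normalization step (lowercasing, underscores to spaces) are all hypothetical and say nothing about how ORCA stores or matches terms internally.

```python
from typing import Dict, Optional

# Team term -> standard concept, using entries from the example table.
# Keys are lowercase so lookups can be case-insensitive.
TERMINOLOGY: Dict[str, str] = {
    "gmv": "revenue",
    "gross sales": "revenue",
    "account": "customer",
    "mau": "active user",
    "attrition": "churn",
}


def standard_concept(
    column_name: str, terminology: Dict[str, str] = TERMINOLOGY
) -> Optional[str]:
    """Map a column name to a standard concept, or None if no term matches."""
    # Assumed normalization: trim, lowercase, treat underscores as spaces.
    normalized = column_name.strip().lower().replace("_", " ")
    return terminology.get(normalized)
```

With this mapping, a column named `GMV` or `gross_sales` resolves to revenue, while an unmapped name such as `sku` returns `None` and would stay unclassified.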
Viewing and editing
Open Settings → Org profile (admin only). The page shows:
- Current trust level with a slider to adjust
- Correction stats (how many proposed, approved, rejected to date)
- Data personality summary (auto-detected from your jobs)
- Terminology mapping table
- Learned rules list with timestamps
API
How learning works
ORCA updates the profile in two ways:
- Implicit learning: every time you accept or reject a suggestion, the corresponding correction type’s confidence is adjusted. After enough accepts of a given pattern, ORCA promotes it to a learned rule and starts auto-suggesting it on similar columns.
- Explicit configuration: you edit the profile directly via the UI or API.
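The implicit learning loop can be pictured with a small Python sketch. Everything here is hypothetical: the `ProfileSketch` class, the `PROMOTION_THRESHOLD` of 3 (the text only says "enough accepts"), and the use of a simple accept rate as a stand-in for the adjusted confidence.

```python
from collections import Counter

# Hypothetical value; the documentation does not state how many accepts are needed.
PROMOTION_THRESHOLD = 3


class ProfileSketch:
    """Toy model of implicit learning from accepted and rejected suggestions."""

    def __init__(self) -> None:
        self.accepts = Counter()
        self.rejects = Counter()
        self.learned_rules = set()

    def record(self, pattern: str, accepted: bool) -> None:
        """Record one decision and promote the pattern once it has enough accepts."""
        if accepted:
            self.accepts[pattern] += 1
        else:
            self.rejects[pattern] += 1
        if self.accepts[pattern] >= PROMOTION_THRESHOLD:
            self.learned_rules.add(pattern)

    def accept_rate(self, pattern: str) -> float:
        """Share of decisions on this pattern that were accepts (0.0 if none yet)."""
        total = self.accepts[pattern] + self.rejects[pattern]
        return self.accepts[pattern] / total if total else 0.0
```

In this toy model, accepting the same pattern three times adds it to `learned_rules`; rejections lower its accept rate but never promote it.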
Learned rules can be cleared with DELETE /api/v1/org-profile/learned-rules. This is occasionally useful when an upstream data change makes prior learnings stale.
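A call to that endpoint might be prepared along these lines with Python's standard library. Only the method and path come from this page; the host `orca.example.com`, the bearer-token header, and the helper name `build_reset_request` are assumptions, so check your own deployment's base URL and authentication scheme before using anything like this.

```python
import urllib.request

BASE_URL = "https://orca.example.com"  # hypothetical host
API_TOKEN = "YOUR_TOKEN"  # hypothetical credential; auth scheme is assumed


def build_reset_request(
    base_url: str = BASE_URL, token: str = API_TOKEN
) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for the learned-rules endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/org-profile/learned-rules",
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )


# To actually send it:
#   with urllib.request.urlopen(build_reset_request()) as response:
#       print(response.status)
```

Building the request separately from sending it makes it easy to inspect the URL and headers first, which is worth doing for a destructive call.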
Tips
- Start at conservative trust. Move up only after you’ve reviewed at least 50 suggestions and your accept rate is above 80%.
- Add terminology early. Even five entries make a noticeable difference in classification accuracy on your specific data.
- Audit learned rules quarterly. What was useful six months ago may not match current data. Remove stale rules to keep suggestions sharp.
- Don’t share trust levels across very different teams. If you have two business units with very different data practices, consider separate orgs rather than averaging the trust level.
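The first tip's rule of thumb (at least 50 reviewed suggestions and an accept rate above 80%) can be written as a quick check against your correction stats. This is a sketch only: the function name `ready_to_raise_trust` is hypothetical, and it assumes "reviewed" means approved plus rejected corrections.

```python
def ready_to_raise_trust(
    approved: int,
    rejected: int,
    min_reviewed: int = 50,
    min_accept_rate: float = 0.80,
) -> bool:
    """Apply the rule of thumb: enough reviews and an accept rate above the bar."""
    reviewed = approved + rejected  # assumed definition of "reviewed"
    if reviewed < min_reviewed:
        return False
    return approved / reviewed > min_accept_rate
```

For example, 45 approvals and 5 rejections (50 reviewed, 90% accepted) would pass, while 30 approvals and 5 rejections would not, because only 35 suggestions have been reviewed.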
What’s next?
- Fix Inbox — see how the trust threshold drives the suggestion queue
- Chat — terminology mapping affects chat responses too
- Classification — how the data personality and terminology improve column classification