Overview
ORCA’s AI chat lets you ask questions about your data quality, classification results, and readiness scores in plain language. Instead of navigating dashboards and filtering tables, you can ask directly: “Which columns have the most null values?” or “Why did the readiness score drop?” The chat understands your organisation’s data context: it knows your files, columns, quality issues, and historical trends.
What you can ask
Simple questions (1 token)
Factual lookups and status checks answered from your data:
- “How many files have been scanned?”
- “What’s the quality score for customers.csv?”
- “List all columns with GDPR flags”
- “Show me the latest job status”
Complex questions (3 tokens)
Questions that require reasoning, comparison, or diagnosis:
- “Why does the revenue column have so many outliers?”
- “Compare quality scores between last month and this month”
- “Which files should I prioritize for remediation?”
- “Explain the relationship between these two tables”
- “What’s causing the readiness score to drop?”
Action requests (5 tokens)
Requests that trigger operations:
- “Generate a GDPR compliance report”
- “Fix the null values in the email column”
- “Create a data contract for the orders table”
Query routing
Every query is automatically classified to determine the right model and token cost:
| Query type | Model | Token cost | Use case |
|---|---|---|---|
| Simple | Gemini Flash | 1 token | Lookups, status checks, listing data |
| Complex | Claude Sonnet | 3 tokens | Reasoning, diagnosis, strategy, comparison |
| Action | Claude Sonnet | 5 tokens | Fix, generate, apply operations |
| Follow-up | Gemini Flash | 1 token | Continuing a conversation thread |
| Off-scope | — | 0 tokens | Rejected (not about data quality) |
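
The routing table can be read as a simple lookup from query type to model and token cost. The sketch below is illustrative only: the `QueryType` enum, `ROUTING` map, and both helper functions are hypothetical names rather than part of ORCA, and the classification step itself is not modelled (the query type is assumed to be already known).

```python
from enum import Enum


class QueryType(Enum):
    SIMPLE = "simple"
    COMPLEX = "complex"
    ACTION = "action"
    FOLLOW_UP = "follow_up"
    OFF_SCOPE = "off_scope"


# (model, token cost) per query type, copied from the routing table.
# Off-scope queries are rejected, so no model is assigned.
ROUTING = {
    QueryType.SIMPLE: ("Gemini Flash", 1),
    QueryType.COMPLEX: ("Claude Sonnet", 3),
    QueryType.ACTION: ("Claude Sonnet", 5),
    QueryType.FOLLOW_UP: ("Gemini Flash", 1),
    QueryType.OFF_SCOPE: (None, 0),
}


def route(query_type):
    """Return the (model, token_cost) pair for an already-classified query."""
    return ROUTING[query_type]


def conversation_cost(query_types):
    """Total tokens consumed by a sequence of classified queries."""
    return sum(ROUTING[q][1] for q in query_types)


thread = [QueryType.COMPLEX, QueryType.FOLLOW_UP, QueryType.FOLLOW_UP, QueryType.ACTION]
total = conversation_cost(thread)
```

Under these assumptions, a thread with one complex question, two follow-ups, and one action request would cost 3 + 1 + 1 + 5 tokens.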
SQL-of-Thought reasoning
For complex queries, ORCA uses structured multi-step reasoning to build accurate answers:
- Parse the question and identify what data is needed
- Assemble context from quality results, classifications, and historical scores
- Reason through the evidence step by step
- Synthesize a clear, actionable answer
Contextual questions
Throughout the web app, you’ll find “Ask About This” buttons on quality scores, dimension breakdowns, anomaly alerts, and column details. These buttons pre-fill the chat with relevant context, so the AI already knows what you’re looking at. For example, clicking “Ask About This” on a low completeness score sends a query like:
“The completeness dimension scored 62%. What columns are driving this down and how can I improve it?”
The AI receives the dimension context, the file’s quality results, and your organisation’s profile, so the answer is specific to your situation rather than generic advice.
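As a rough illustration of how a pre-filled query like the one above could be assembled, the sketch below fills a template from a dimension name and score. The function name `build_contextual_query` is hypothetical, and the additional context the AI receives (the file’s quality results and the organisation’s profile) is not modelled here.

```python
def build_contextual_query(dimension: str, score_percent: int) -> str:
    """Return pre-filled chat text for a low-scoring quality dimension."""
    return (
        f"The {dimension} dimension scored {score_percent}%. "
        "What columns are driving this down and how can I improve it?"
    )


# Reproduces the example query quoted in this section.
query = build_contextual_query("completeness", 62)
```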
Token costs
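Chat queries consume tokens from your monthly bucket. The last three columns show how many queries of each type the monthly allowance covers if it is spent on that type alone:
| Plan | Monthly tokens | Simple queries | Complex queries | Action queries |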
|---|---|---|---|---|
| Free | 50 | 50 | 16 | 10 |
| Pro | 5,000 | 5,000 | 1,666 | 1,000 |
| Enterprise | 25,000 | 25,000 | 8,333 | 5,000 |
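
The per-plan query counts follow from integer division of the monthly allowance by the per-query cost (for example, 50 tokens at 3 tokens per complex query covers 16 queries, with 2 tokens left over). A minimal sketch of that arithmetic, using hypothetical helper names `max_queries` and `remaining_after`:

```python
# Monthly allowances and per-query costs, copied from the tables in this page.
MONTHLY_TOKENS = {"Free": 50, "Pro": 5_000, "Enterprise": 25_000}
COST = {"simple": 1, "complex": 3, "action": 5}


def max_queries(plan, kind):
    """Whole queries of one kind that a plan's monthly allowance covers."""
    return MONTHLY_TOKENS[plan] // COST[kind]


def remaining_after(plan, usage):
    """Tokens left after a mix of queries, e.g. {"simple": 10, "action": 2}."""
    spent = sum(COST[kind] * count for kind, count in usage.items())
    return MONTHLY_TOKENS[plan] - spent


left = remaining_after("Free", {"simple": 10, "complex": 5, "action": 2})
```

In practice usage is a mix of query types, so `remaining_after` is usually the more useful of the two for planning a month.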
Tips for better answers
Be specific about columns and files
Instead of “How’s my data quality?”, ask “What are the top 3 quality issues in customers_march.csv?” — the AI can give you a concrete, actionable answer.
Reference column names
“Why does the annual_revenue column have outliers?” gets a better response than “Why are there outliers?” because the AI can look up the specific column’s statistics and value distribution.
Ask follow-ups
The chat maintains conversation history (last 10 messages). After getting an answer, ask follow-up questions to drill deeper: “Which specific rows are affected?” or “What would happen if I applied winsorization?”
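A rolling window of the last 10 messages can be pictured as a bounded queue, where adding an eleventh message drops the oldest. The sketch below is an illustration of that idea, not ORCA’s implementation; the `ConversationHistory` class and its message format are assumptions.

```python
from collections import deque

HISTORY_LIMIT = 10  # "last 10 messages", as stated above


class ConversationHistory:
    """Keeps only the most recent messages of a conversation."""

    def __init__(self, limit=HISTORY_LIMIT):
        # A deque with maxlen discards the oldest item when full.
        self._messages = deque(maxlen=limit)

    def add(self, role, text):
        self._messages.append({"role": role, "text": text})

    def window(self):
        """Messages currently retained, oldest first."""
        return list(self._messages)


history = ConversationHistory()
for i in range(12):
    history.add("user", f"message {i}")
```

One practical consequence: in a long thread, restate the file or column name occasionally, since early messages may no longer be in the window.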
Use contextual buttons
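Click an “Ask About This” button on a quality score, dimension breakdown, anomaly alert, or column detail to pre-fill the chat with relevant context, so the AI already knows what you’re looking at.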
Keep it about your data
The chat is scoped to data quality topics. Questions outside this scope (general knowledge, coding help, unrelated topics) are politely declined and cost 0 tokens.
Chat via the API
You can also use the chat programmatically:
- Create a conversation
- Send a message
- List conversations
Security
All chat inputs pass through a security gate that sanitizes queries before they reach the AI model. The system:
- Scrubs any PII from user messages before logging
- Validates AI responses for safety
- Rejects prompt injection attempts
- Rate-limits queries per user
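To make two of these checks concrete, the sketch below shows one common way to redact email addresses before logging and to apply a sliding-window rate limit per user. It is an assumption-laden illustration: ORCA’s actual PII rules, rate limits, and function names are not documented here, so `scrub_for_logging`, `RateLimiter`, the email pattern, and the limit of 2 queries per 60 seconds are all hypothetical.

```python
import re
import time
from collections import defaultdict, deque

# Simplified email pattern; real PII scrubbing would cover many more types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_for_logging(message):
    """Replace email addresses with a placeholder before the message is logged."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", message)


class RateLimiter:
    """Sliding-window limiter: at most max_queries per window_seconds per user."""

    def __init__(self, max_queries, window_seconds):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)

    def allow(self, user_id, now=None):
        now = time.monotonic() if now is None else now
        stamps = self._timestamps[user_id]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_queries:
            return False
        stamps.append(now)
        return True


limiter = RateLimiter(max_queries=2, window_seconds=60)  # hypothetical limits
logged = scrub_for_logging("Contact jane.doe@example.com about orders")
```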
Next steps
Classification
Learn how ORCA classifies your columns
AI readiness
Understand the scoring methodology the chat references