Overview
The overall AI Readiness score tells you whether your data is generally fit for ML. Use-case readiness goes one level deeper: it tells you whether your data is fit for a specific ML use case, and lists exactly what is blocking you and what to fix. It answers questions like:
- “Is my customer table ready for churn prediction?”
- “Can I use this transactions file for fraud detection?”
- “Do I have enough history for time-series forecasting?”
For each use case, you get:
- A readiness percentage (0–100%)
- A list of blockers — hard requirements that fail
- A list of warnings — soft requirements that don’t fail outright but should be improved
- A natural-language summary of what to do next
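As a rough illustration of the four outputs listed above, the result can be pictured as a small record. This is a minimal sketch: the class name `UseCaseReadiness`, its field names, the `headline` helper, and the sample values are all hypothetical and are not taken from ORCA's actual API or output format.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseReadiness:
    """Hypothetical container mirroring the four outputs listed above."""
    use_case: str
    readiness_pct: float                                 # 0-100
    blockers: list[str] = field(default_factory=list)    # failed hard requirements
    warnings: list[str] = field(default_factory=list)    # unmet soft requirements
    summary: str = ""                                    # what to do next

    def __post_init__(self):
        # Reject percentages outside the documented 0-100 range.
        if not 0 <= self.readiness_pct <= 100:
            raise ValueError("readiness_pct must be between 0 and 100")


def headline(result):
    """Build a one-line status string from a result."""
    return (f"{result.use_case}: {result.readiness_pct:.0f}% ready, "
            f"{len(result.blockers)} blocker(s), {len(result.warnings)} warning(s)")


# Made-up values, for illustration only.
example = UseCaseReadiness(
    use_case="churn_prediction",
    readiness_pct=30,
    blockers=["missing required datetime column"],
    warnings=["no numeric column (recommended)"],
    summary="Add a datetime column before modelling.",
)
print(headline(example))
```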
Supported use cases
ORCA ships with eight predefined use-case profiles:
| Use case | Predicts | What the profile looks for |
|---|---|---|
| Classification | Categorical outcomes (spam, churn, fraud) | Sufficient rows and columns, low average null rate |
| Regression | Continuous values (price, revenue, duration) | Numeric columns with low null rate and meaningful spread |
| Churn prediction | Which customers will leave | Identifier and datetime columns, plus enough historical depth |
| Time series forecasting | Future values from historical patterns | Datetime column, enough historical points, stable distribution |
| NLP / text classification | Document sentiment, topic, intent | Text columns with sufficient length and variety |
| Anomaly detection | Outlier detection | Stable baseline data |
| Clustering / segmentation | Natural groupings | Numeric features and low null rate |
| Recommendation system | Item recommendations | User-item interaction history |
How scoring works
For each use case, ORCA evaluates your dataset against the profile’s requirements:
Hard requirements (blockers)
Things that must be true. If a use case needs a minimum number of rows and your file falls short, that’s a blocker. Each blocker reduces readiness substantially.
Soft requirements (warnings)
Things that should be true. If recommended categories include datetime and your file has none, that’s a warning. Warnings reduce readiness but don’t block it.
Required column types
Some use cases need specific semantic types. Churn prediction needs an identifier and a datetime column. Time series needs a datetime. If they’re missing, that’s a blocker.
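The split between blockers and warnings described so far can be sketched as follows. This is an illustration under stated assumptions, not ORCA's implementation: the `Profile` class, its field names, the `check_requirements` function, the 1,000-row minimum, and the recommended numeric type are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical use-case profile; field names are illustrative only."""
    name: str
    min_rows: int                                               # hard requirement
    required_types: set[str] = field(default_factory=set)      # missing -> blocker
    recommended_types: set[str] = field(default_factory=set)   # missing -> warning


def check_requirements(profile, row_count, column_types):
    """Return (blockers, warnings) for a dataset summary.

    column_types maps column name -> semantic type,
    for example {"signup_date": "datetime"}.
    """
    present = set(column_types.values())
    blockers = []
    warnings = []
    if row_count < profile.min_rows:
        blockers.append(f"needs at least {profile.min_rows} rows, found {row_count}")
    for semantic_type in sorted(profile.required_types - present):
        blockers.append(f"missing required {semantic_type} column")
    for semantic_type in sorted(profile.recommended_types - present):
        warnings.append(f"no {semantic_type} column (recommended)")
    return blockers, warnings


# Assumed values: the text says churn prediction needs an identifier and a
# datetime column, but the row minimum and recommended type are made up.
churn = Profile(
    name="churn_prediction",
    min_rows=1000,
    required_types={"identifier", "datetime"},
    recommended_types={"numeric"},
)
blockers, warnings = check_requirements(
    churn,
    row_count=400,
    column_types={"customer_id": "identifier", "plan": "categorical"},
)
print(blockers)
print(warnings)
```

In this sketch the dataset has an identifier but no datetime column and too few rows, so it collects two blockers and one warning.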
Quality dimension thresholds
Each use case sets a target on the 7 readiness dimensions — for example, a churn-prediction profile cares more about completeness than a clustering profile does. Falling short on a dimension lowers readiness for that specific use case.
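To make the per-use-case targets concrete, the sketch below compares one set of dimension scores against two profiles. The target values, the 0 to 1 scale, the `TARGETS` table, and the `dimension_shortfalls` function are assumptions for illustration. Only completeness is named in the text; "uniqueness" stands in as a placeholder for another dimension.

```python
# Hypothetical per-use-case targets on a 0-1 scale. Only "completeness" is
# named in the text above; "uniqueness" is a placeholder dimension.
TARGETS = {
    "churn_prediction": {"completeness": 0.95, "uniqueness": 0.90},
    "clustering": {"completeness": 0.80, "uniqueness": 0.90},
}


def dimension_shortfalls(use_case, scores):
    """Return {dimension: gap} for each dimension below the use case's target."""
    gaps = {}
    for dimension, target in TARGETS[use_case].items():
        gap = target - scores.get(dimension, 0.0)
        if gap > 0:
            gaps[dimension] = round(gap, 4)
    return gaps


# The same dataset scores, evaluated against both profiles.
scores = {"completeness": 0.85, "uniqueness": 0.97}
for use_case in TARGETS:
    print(use_case, dimension_shortfalls(use_case, scores))
```

With these made-up numbers, the same completeness score falls short of the stricter churn target but clears the clustering one, which is the asymmetry the paragraph above describes.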
Reading a result
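A use-case readiness result contains the readiness percentage, the list of blockers, the list of warnings, and the natural-language summary of what to do next.
Where to find it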
In the web app:
- AI Readiness page → click any file → scroll to the Use-case readiness matrix
Why this matters
Most data quality tools tell you that your data has problems. They don’t tell you which problems matter for what you’re actually trying to do. A 5% null rate on phone_number is a dealbreaker for an SMS marketing model and irrelevant for a price forecasting model. Use-case readiness encodes that asymmetry. When a stakeholder asks “can we build a churn model with this data?” you can answer with a number and a list of fixes, not a hand-wave.
Tips
- Check use-case readiness before the data science project starts. If churn prediction reports 30% readiness with 12 blockers, that’s a planning conversation, not a Sprint 14 surprise.
- Use blockers as a backlog. Each blocker is one ticket. Fix them in order, re-run, watch the percentage climb.
- Combine with contracts. Once you reach 100% on a use case, define a contract that enforces those exact requirements so future data drift doesn’t silently break your model.
What’s next?
- AI Readiness — the overall 7-dimension score this builds on
- Auto-remediation — fix the blockers automatically when possible
- Reports — generate a PDF of use-case readiness for stakeholders