Overview

The AI readiness score is a weighted composite score (0-100) that measures how suitable a dataset is for AI/ML applications. It evaluates 7 quality dimensions, each contributing to the overall score based on its relative importance for AI/ML workloads.
Looking for the full methodology, standards alignment, and known limits? See the AI Readiness methodology page — it documents what we measure, what we deliberately don’t measure, and how we map to NIST AI RMF v1.0, ISO/IEC 25012, and ISO/IEC 5259.

The 7 dimensions

Each dimension targets a different category of AI/ML risk. Together they form a single composite score per file.
  • Completeness — proportion of non-null values across all columns.
  • Consistency — format uniformity and pattern adherence within columns.
  • Referential integrity — absence of orphaned cross-column references and severely malformed values.
  • Compliance — GDPR readiness and sensitive data handling.
  • Uniqueness — absence of duplicate rows and redundant records.
  • Schema quality — column naming conventions, type consistency, structural integrity.
  • Stability — data drift and distribution changes (when historical data is available).
Completeness: one of the most heavily weighted dimensions. It measures the proportion of non-null values across all columns. Datasets with substantial missing values in critical columns score significantly lower than those with complete data.
To improve it, use the remediation engine to impute null values with median (numeric), mode (categorical), or forward-fill (datetime) strategies.
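As an illustration of the completeness measure and the three imputation strategies named above, here is a minimal sketch. It is not the remediation engine's implementation: the function names `completeness` and `impute` are hypothetical, rows are assumed to be plain dictionaries, and only `None` is treated as null.

```python
from statistics import median, mode


def completeness(rows, columns):
    """Share of non-null cells across the given columns, from 0.0 to 1.0."""
    total = len(rows) * len(columns)
    if total == 0:
        return 1.0  # assumption: an empty dataset has nothing missing
    filled = sum(1 for row in rows for col in columns if row.get(col) is not None)
    return filled / total


def impute(rows, column, strategy):
    """Return copies of rows with nulls in `column` filled.

    strategy is "median" (numeric), "mode" (categorical), or "ffill"
    (carry the previous non-null value forward, useful for ordered data).
    """
    if strategy not in ("median", "mode", "ffill"):
        raise ValueError(f"unknown strategy: {strategy}")
    present = [row[column] for row in rows if row.get(column) is not None]
    if not present:
        return [dict(row) for row in rows]  # no observed value to fill from
    fill = None
    if strategy == "median":
        fill = median(present)
    elif strategy == "mode":
        fill = mode(present)
    out, last_seen = [], None
    for row in rows:
        value = row.get(column)
        if value is None:
            # Leading nulls stay None under "ffill" until a value is seen.
            value = last_seen if strategy == "ffill" else fill
        else:
            last_seen = value
        out.append({**row, column: value})
    return out
```

Returning new rows rather than mutating the input keeps the original data available for a before/after comparison of the completeness value.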
Consistency: evaluates whether values within a column follow consistent formats. Typical problems are mixed date formats (2024-01-15 vs 01/15/2024), inconsistent casing, and varying phone number formats.
To improve it, apply format standardization fixes to normalize dates to ISO 8601, lowercase emails, and strip whitespace.
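The standardization fixes described above could look roughly like the following sketch. The helper names and the `DATE_FORMATS` list are assumptions made for illustration. In particular, slash-separated dates are read as month/day/year here, which is a choice the caller has to confirm for their own data.

```python
from datetime import datetime

# Hypothetical list of input formats to try, in order.
# "%m/%d/%Y" assumes US-style month-first dates.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def to_iso_date(value):
    """Return the date as an ISO 8601 string (YYYY-MM-DD), or None if unparseable."""
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_email(value):
    """Strip surrounding whitespace and lowercase the address."""
    return value.strip().lower()
```

Returning `None` for unrecognized dates, instead of guessing, leaves those values visible for manual review.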
Referential integrity: detects orphaned cross-column references (values pointing at records that no longer exist) and severely malformed values that break the column’s expected format. It maps to ISO/IEC 25012’s credibility characteristic rather than the standard’s accuracy characteristic, which would require comparison against a true reference value that the platform does not have.
To improve it, resolve orphaned references and remediate critical- and high-severity format violations before training. This dimension was renamed from “Accuracy” in methodology v1.0 (2026-04-27) so that the name matches what is actually measured.
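To make "orphaned cross-column reference" concrete, the sketch below finds rows whose reference column points at a key that does not exist in the same file. The function name `find_orphans` and the column names in the usage example are hypothetical, and the platform's actual detection logic is not documented here.

```python
def find_orphans(rows, ref_column, key_column):
    """Return rows whose `ref_column` value matches no `key_column` value.

    Null references are treated as "no reference" and are not reported.
    """
    known_keys = {row[key_column] for row in rows if row.get(key_column) is not None}
    return [
        row
        for row in rows
        if row.get(ref_column) is not None and row[ref_column] not in known_keys
    ]
```

For example, with an `id` column and a `manager_id` column that refers back to `id`, a row with `manager_id` 9 is reported when no row has `id` 9.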
Compliance: checks for GDPR-sensitive columns (emails, phone numbers, national IDs, health data) and evaluates whether appropriate handling is in place. The check takes the GDPR toggle setting and the data retention policy into account.
To improve it, review GDPR-flagged columns and enable appropriate data handling policies in Settings.
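One simple way to pre-screen a file for columns that are likely to be flagged is a pattern match over the values. The sketch below is a rough heuristic under stated assumptions, not the platform's detector: the regular expressions, the 0.8 threshold, and the function name are all hypothetical, and the phone pattern will also match other long digit strings.

```python
import re

# Deliberately loose, illustrative patterns (assumptions, not a standard).
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}


def flag_sensitive_columns(rows, threshold=0.8):
    """Map column name to a suspected category when enough values match."""
    flagged = {}
    columns = sorted({col for row in rows for col in row})
    for col in columns:
        values = [str(row[col]).strip() for row in rows if row.get(col) is not None]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            share = sum(1 for v in values if pattern.match(v)) / len(values)
            if share >= threshold:
                flagged[col] = label
                break  # first matching category wins
    return flagged
```

A screen like this only narrows down which columns to review. National IDs and health data need domain-specific checks that a generic pattern cannot provide.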
Uniqueness: measures the proportion of unique rows. Exact duplicate rows reduce this score, and near-duplicates may also be flagged.
To improve it, apply deduplication to remove exact duplicate rows.
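The proportion of unique rows and exact-duplicate removal can be sketched as follows. The function names are hypothetical, values are assumed to be hashable, and near-duplicate detection is out of scope for this example.

```python
def _row_key(row):
    """Order-independent, hashable fingerprint of a row."""
    return tuple(sorted(row.items()))


def uniqueness(rows):
    """Share of distinct rows among all rows, from 0.0 to 1.0."""
    if not rows:
        return 1.0  # assumption: an empty dataset has no duplicates
    return len({_row_key(row) for row in rows}) / len(rows)


def deduplicate(rows):
    """Drop exact duplicates, keeping the first occurrence of each row."""
    seen, out = set(), []
    for row in rows:
        key = _row_key(row)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out
```

Keeping the first occurrence preserves the original row order, which matters when the file is time-ordered.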
Schema quality: evaluates column naming conventions (consistency, descriptiveness), type detection accuracy, and structural integrity (for example, mixed types within a single column).
To improve it, rename ambiguous columns and ensure consistent data types per column.
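Two of the checks mentioned above, mixed types within a column and inconsistent naming, are easy to approximate locally. This sketch assumes snake_case as the target naming convention, which the source does not prescribe, and both function names are hypothetical.

```python
import re

# Assumed convention for illustration: lowercase snake_case names.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def mixed_type_columns(rows):
    """Map column name to its sorted Python type names when more than one occurs."""
    types = {}
    for row in rows:
        for col, value in row.items():
            if value is not None:  # nulls do not count as a separate type
                types.setdefault(col, set()).add(type(value).__name__)
    return {col: sorted(names) for col, names in types.items() if len(names) > 1}


def non_snake_case(columns):
    """Return the column names that do not follow snake_case."""
    return [col for col in columns if not SNAKE_CASE.match(col)]
```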
Stability: when historical scores exist, this dimension measures how much the data distribution has changed between scans. Data that does not drift unexpectedly scores higher. On the first scan, stability defaults to a neutral baseline.
To track drift over time, establish regular scan schedules.
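The source does not say which drift statistic is used. As one common option, the sketch below computes the Population Stability Index (PSI) between the category distributions of two scans. The function name and the `epsilon` floor are assumptions, and the result is a raw drift value, not a stability score.

```python
import math
from collections import Counter


def psi(baseline, current, epsilon=1e-6):
    """Population Stability Index between two samples of categorical values.

    0.0 means identical distributions; larger values mean more drift.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    base_counts, cur_counts = Counter(baseline), Counter(current)
    total = 0.0
    for category in set(base_counts) | set(cur_counts):
        # Floor the shares so a category missing from one scan
        # does not cause a division by zero or log(0).
        p = max(base_counts[category] / len(baseline), epsilon)
        q = max(cur_counts[category] / len(current), epsilon)
        total += (q - p) * math.log(q / p)
    return total
```

Numeric columns would need to be binned first. The bin edges should come from the baseline scan so that both scans are compared on the same grid.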

How dimensions combine

The seven dimensions are combined into a single 0-100 score per file using a weighted average. Weights reflect AI/ML use-case priorities. For example, completeness and compliance carry more weight than uniqueness, because missing values and unhandled PII do more damage to AI training than duplicate records.
Weights are not fixed across all use cases. ORCA carries a small set of named profiles (classification, regression, NLP, time-series, anomaly detection) that re-balance the weights to reflect what matters most for that family of model. A time-series profile gives more weight to stability than a classification profile does, because temporal drift dominates time-series risk.
The exact weight values, the thresholds each dimension applies internally, and the per-issue penalty schedules are implementation details we do not publish. For the full methodology, standards alignment, and limits, see the AI Readiness methodology page.
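The weighted average with per-profile weights can be sketched as below. Because the real weights are not published, every number in `PROFILES` is invented for illustration. They only respect the orderings stated above: completeness and compliance outweigh uniqueness, and stability weighs more in the time-series profile.

```python
# Hypothetical weights, NOT the platform's values.
PROFILES = {
    "classification": {
        "completeness": 0.25, "compliance": 0.20, "consistency": 0.15,
        "referential_integrity": 0.15, "schema_quality": 0.10,
        "uniqueness": 0.10, "stability": 0.05,
    },
    "time_series": {
        "completeness": 0.20, "compliance": 0.15, "consistency": 0.15,
        "referential_integrity": 0.10, "schema_quality": 0.10,
        "uniqueness": 0.05, "stability": 0.25,
    },
}


def composite_score(dimension_scores, weights):
    """Weighted average of 0-100 dimension scores.

    Dividing by the weight total keeps the result on the 0-100 scale
    even if the weights do not sum exactly to 1.
    """
    total_weight = sum(weights.values())
    weighted = sum(dimension_scores[name] * w for name, w in weights.items())
    return weighted / total_weight
```

With these illustrative weights, a file that scores 90 on every dimension except stability (40) gets 87.5 under the classification profile and 77.5 under the time-series profile, which shows how the same data can rate differently per profile.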

Grade scale

Grade | Score range | Meaning
A | 90-100 | Excellent: production-ready for AI/ML use cases
B | 75-89 | Good: eligible for an assessment report; some improvements recommended
C | 60-74 | Fair: significant issues need addressing before training
D | 40-59 | Poor: major remediation required
F | 0-39 | Failing: fundamental data quality problems
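The grade scale translates directly into a lookup on the lower bound of each range. The function name `grade` is hypothetical. The table lists whole-number ranges, so the handling of fractional scores here (89.5 falls into B) is an assumption.

```python
# Lower bound of each grade, highest first, taken from the grade scale.
GRADE_FLOORS = (("A", 90), ("B", 75), ("C", 60), ("D", 40))


def grade(score):
    """Map a 0-100 score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for letter, floor in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"
```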

Assessment

Datasets scoring at Grade B or higher are eligible for an AI readiness assessment report. Assessment reports include:
  • Overall score and grade at time of issuance
  • Per-dimension breakdown
  • SHA-256 verification hash for authenticity
  • Public verification URL
Assessment reports can be shared with stakeholders, included in data governance documentation, or used to demonstrate data quality compliance.
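A SHA-256 verification hash lets a recipient confirm that a report has not been altered. The source does not document which bytes are hashed, so the sketch below assumes a canonical JSON serialization of the report fields. The field names in the usage note and both function names are hypothetical.

```python
import hashlib
import hmac
import json


def report_hash(report):
    """SHA-256 hex digest of a canonical JSON form of the report.

    Sorted keys and fixed separators make the digest independent of
    key order and whitespace (an assumed canonicalization).
    """
    canonical = json.dumps(report, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(report, expected_hash):
    """True if the report still matches the published hash."""
    return hmac.compare_digest(report_hash(report), expected_hash)
```

For example, hashing `{"score": 82, "grade": "B"}` at issuance and calling `verify` later with a report whose score was edited to 92 returns `False`.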

Use-case readiness

Beyond the overall score, ORCA evaluates readiness for 8 specific ML use cases:
  1. Churn prediction — requires customer identifiers, temporal data, engagement metrics
  2. Fraud detection — requires transaction data, amounts, timestamps, categorical flags
  3. Recommendation engine — requires user-item interactions, ratings or implicit signals
  4. Customer segmentation — requires demographic and behavioral features
  5. Demand forecasting — requires time series data, quantities, seasonal indicators
  6. Sentiment analysis — requires text fields with sufficient length and variety
  7. Price optimization — requires pricing data, competitor info, demand signals
  8. Risk scoring — requires financial data, credit indicators, outcome labels
Each use case has specific data requirements. The Use Cases tab shows which requirements are met and which gaps need to be addressed.
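The met-versus-gap view described above amounts to a set comparison between what a use case requires and what the dataset provides. In the sketch below, the role labels, the two `REQUIREMENTS` entries, and the function name are hypothetical stand-ins loosely based on the list above. How ORCA detects which roles a file contains is not covered.

```python
# Hypothetical requirement sets for two of the eight use cases.
REQUIREMENTS = {
    "churn_prediction": {"customer_id", "timestamp", "engagement_metric"},
    "demand_forecasting": {"timestamp", "quantity", "seasonal_indicator"},
}


def check_use_case(use_case, detected_roles):
    """Split a use case's requirements into those met and those missing."""
    required = REQUIREMENTS[use_case]
    met = required & set(detected_roles)
    return {"met": sorted(met), "gaps": sorted(required - met)}
```

A dataset with `customer_id` and `timestamp` columns but no engagement data would report `engagement_metric` as the single gap for churn prediction.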