Verifying your data with data inspectors

When a tabular data file is ingested on data.world, it undergoes a series of inspections that validate both its structure and its content. If issues are found, the file is flagged so that you can take corrective action.

Locating the warning flag

The warning flag appears on the Dataset Details page, beneath the file name.

You can also find it in the dataset or project workspace, listed as Inspections in the About This File section.

Types of indicators

  • WARNINGS - Indicated by a Yellow Triangle: These are the most common indicators. They either alert you to potential issues that might impede querying, or notify you that sensitive data was detected (e.g., social security numbers, phone numbers, email addresses).

  • SEVERE ERRORS - Indicated by a Red Circle: These rare indicators show that an ingestion error occurred and resulted in data loss. Possible causes include a corrupted original file, data type mismatches, and value mismatches in specified linked data classes.

For more details, refer to Data inspections messages.
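
If you want to catch likely sensitive-data warnings before uploading, a local pre-check can help. The following is a minimal sketch in Python, not data.world's own inspector: the function name `find_sensitive_columns` and both regular expressions are illustrative assumptions, and the real inspections may use different rules.

```python
import csv
import io
import re

# Illustrative patterns only. These are assumptions, not data.world's detection rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive_columns(csv_text):
    """Return {column name: sorted list of pattern names matched in that column}."""
    hits = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            # Skip missing cells and overflow fields, which are not plain strings.
            if not isinstance(value, str):
                continue
            for name, pattern in PATTERNS.items():
                if pattern.search(value):
                    hits.setdefault(column, set()).add(name)
    return {column: sorted(names) for column, names in hits.items()}


# Hypothetical sample data, used only to show the shape of the result.
sample = "name,contact,tax_id\nAda,ada@example.com,123-45-6789\nBob,n/a,\n"
print(find_sensitive_columns(sample))
```

A column that appears in the result is a candidate for review before ingestion; a column that does not appear may still be flagged by the actual inspections.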

Reviewing warnings and errors

Clicking a warning or error indicator opens a window listing all issues captured by the data inspector. Review these issues to decide whether to address them, or click Dismiss if notifications are unnecessary.

Important

Dismissed warnings do not reappear, even if the file is deleted and reimported or updated. To see all warnings again, ingest the file under a different name.

Correcting issues

  • For files initially added directly: Download, correct, and re-upload the file.

  • For files synchronized from external services (for example, cloud storage): Update the file in the source system and click Sync Now to synchronize changes.

    Changes to the data dictionary can also generate errors. For instance, converting a String column to Integer after ingestion triggers a red flag because of data type mismatches.
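
Before changing a column's type in the data dictionary, it can be useful to check locally which values would not convert. The sketch below assumes a CSV copy of the file and uses Python's built-in `int()` as a stand-in for the Integer type; the helper name `non_integer_values` is hypothetical, and data.world's own parsing rules may differ.

```python
import csv
import io


def non_integer_values(csv_text, column):
    """Return (line_number, value) pairs in `column` that do not parse as an integer."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        value = (row.get(column) or "").strip()
        if value == "":
            # Assumption: empty cells count as missing values, not as mismatches.
            continue
        try:
            int(value)
        except ValueError:
            # reader.line_num is the line of the source text that was just read.
            problems.append((reader.line_num, value))
    return problems


# Hypothetical sample data: two values in "quantity" are not integers.
sample = "id,quantity\n1,10\n2,ten\n3,7.5\n4,\n"
print(non_integer_values(sample, "quantity"))
```

An empty result suggests the column is a reasonable candidate for the Integer type; any listed values are worth correcting in the source file first.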