
Verify your data

When you upload data files (CSV, XLSX, JSON, etc.) to data.world, we convert any readable data to a graph database behind the scenes. Regardless of the original format, the data becomes instantly previewable as a table and queryable within data.world. More on that later, but because of how we process these files, we can also infer data types, compute high-level stats about the data, and flag any potential issues in it.

After data.world processes your data file, the data inspector automatically attempts to identify and display any potential issues, including blank cells, duplicate rows, and numeric values or string lengths that fall far outside the typical range for their field (measured in standard deviations).
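To get a feel for what these checks look for, here is a minimal Python sketch you could run on a local CSV before uploading. It is not data.world's implementation: the function name `find_potential_issues`, the `z_threshold` parameter, and the choice of a z-score test (distance from the mean in standard deviations) are all assumptions made for illustration. It covers blank cells, duplicate rows, and numeric outliers only, and assumes a header row.

```python
import csv
import io
import statistics


def find_potential_issues(csv_text, z_threshold=3.0):
    """Hypothetical local check; NOT data.world's actual inspector logic."""
    issues = {"blank_cells": [], "duplicate_rows": [], "numeric_outliers": []}
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return issues
    header, records = rows[0], rows[1:]

    seen = set()
    for i, record in enumerate(records):
        # Blank cells: empty or whitespace-only values.
        for field, value in zip(header, record):
            if not value.strip():
                issues["blank_cells"].append((i, field))
        # Duplicate rows: an exact repeat of an earlier record.
        key = tuple(record)
        if key in seen:
            issues["duplicate_rows"].append(i)
        else:
            seen.add(key)

    # Numeric outliers: values more than z_threshold standard
    # deviations away from the mean of their column.
    for col, field in enumerate(header):
        numeric = []
        for i, record in enumerate(records):
            try:
                numeric.append((i, float(record[col])))
            except (ValueError, IndexError):
                continue  # non-numeric, blank, or missing cell
        if len(numeric) < 2:
            continue
        values = [v for _, v in numeric]
        mean = statistics.mean(values)
        spread = statistics.pstdev(values)
        if spread == 0:
            continue  # all values identical, nothing can be an outlier
        for i, v in numeric:
            if abs(v - mean) / spread > z_threshold:
                issues["numeric_outliers"].append((i, field, v))
    return issues
```

Each entry in the result identifies a record by its zero-based position after the header row. On small files a single extreme value cannot reach a z-score of 3, so a lower `z_threshold` such as 2.0 is more useful there.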

The number of potential issues is flagged by the orange exclamation mark at the top of the data preview:

360011756233-mceclip3.png

Select the link to open the data inspector, then review the suggestions. If you feel the highlighted issues are not relevant, select Dismiss to prevent them from showing again:

360011756253-mceclip4.png

Any changes to the data still need to be made in your local copy, which you then re-upload; we'll process the file again and report on any potential issues identified. As long as you upload a file with the exact same name, the previous version will be overwritten (although all versions are maintained and accessible!).

This is also the time to verify that the columns in your tables were assigned the correct formats on upload. The format of each column is shown to the left of its name in the table. If you click on the icon, you will see a pop-up with more information on the field:

360011707954-mceclip5.png

If you wish to change any column formats, and also jump into describing your data, select Edit in the top right of the pop-up, which will take you to the data dictionary.
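Why verify formats at all? Any automatic inference works only from the values it sees, so identifier-like columns can be misread. The Python sketch below illustrates the general idea under stated assumptions: the functions `infer_column_type` and `infer_formats` and the labels `integer`, `decimal`, `date`, and `string` are hypothetical and do not reflect data.world's actual inference rules or format names.

```python
import csv
import io
from datetime import date


def infer_column_type(values):
    """Guess a format label from a column's values (illustrative only)."""
    non_blank = [v.strip() for v in values if v.strip()]
    if not non_blank:
        return "empty"

    def all_parse(parser):
        # True only if every non-blank value is accepted by the parser.
        for v in non_blank:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    # Try the strictest format first and fall back to plain text.
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    if all_parse(date.fromisoformat):
        return "date"
    return "string"


def infer_formats(csv_text):
    """Map each header name to a guessed format; assumes rectangular rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {}
    header, records = rows[0], rows[1:]
    columns = zip(*records)  # transpose records into columns
    return {field: infer_column_type(col) for field, col in zip(header, columns)}
```

In this sketch, a postal code column containing values such as `02134` is guessed to be an integer because every value parses as one, even though the leading zero is meaningful and the column is better treated as text. Cases like this are the ones worth checking by hand.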