About data profiling
What is data profiling?
Data profiling is a process of creating high-level summaries about data content and quality that can aid in decision making, data trust, and understanding.
Once you have profiled your data, you can perform data quality checks.
The summary statistics generated via profiling help you understand and trust your data by providing a quick look at the data.
By viewing the summary statistics, you can quickly see if your data has quality issues like incomplete data or data outside of a normal range (for example, data whose maximum value is too high or whose minimum value is too low).
Profiling is especially important for larger datasets, where visual spot-checks are difficult: it helps automate your data quality routine, leading to more trust and better decision-making.
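As an illustration of how summary statistics can surface the issues described above, here is a minimal Python sketch. It is not data.world's implementation: the function name `find_quality_issues`, the shape of the `profile` dictionary, and the example thresholds are all assumptions made for this example.

```python
def find_quality_issues(profile, expected_min, expected_max, max_null_fraction=0.0):
    """Return a list of warnings for one column's summary statistics.

    `profile` is a hypothetical dict with the keys
    "count", "null_count", "min", and "max".
    """
    issues = []

    # Incomplete data: too many missing values relative to the row count.
    if profile["count"] > 0:
        null_fraction = profile["null_count"] / profile["count"]
        if null_fraction > max_null_fraction:
            issues.append(
                f"incomplete: {profile['null_count']} of {profile['count']} values are null"
            )

    # Out-of-range data: the minimum or maximum falls outside the expected range.
    if profile["min"] is not None and profile["min"] < expected_min:
        issues.append(f"minimum {profile['min']} is below the expected {expected_min}")
    if profile["max"] is not None and profile["max"] > expected_max:
        issues.append(f"maximum {profile['max']} is above the expected {expected_max}")

    return issues


# Example: made-up summary statistics for an "age" column.
age_profile = {"count": 1000, "null_count": 37, "min": -4, "max": 212}
issues = find_quality_issues(age_profile, expected_min=0, expected_max=120)
for issue in issues:
    print(issue)
```

Because the check works on the summary rather than on individual rows, it stays cheap even for large datasets where spot-checking by eye is impractical.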
How do I set up data profiling in data.world?
There are two options for setting up data profiling:
For both features, the system creates summary statistics about the data, such as the mean, minimum, and maximum values and the null counts.
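To make these statistics concrete, the following Python sketch computes them for a single column using only the standard library. It is a simplified illustration, not how data.world computes profiles: the function name `profile_column` and the sample data are hypothetical, and missing values are assumed to be represented as `None`.

```python
from statistics import fmean


def profile_column(values):
    """Compute basic summary statistics for one numeric column.

    Missing values are assumed to be represented as None.
    """
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        # Statistics are undefined when every value is missing.
        "mean": fmean(non_null) if non_null else None,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


# Example: a small temperature column with two missing readings.
temperatures = [21.5, None, 19.0, 23.5, None, 20.0]
profile = profile_column(temperatures)
print(profile)
```

For this sample column, the profile reports two nulls out of six values, a mean of 21.0, a minimum of 19.0, and a maximum of 23.5.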