Abstract
This paper presents data quality indicators designed to measure the quality of numerical datasets. We devise a set of three indicators: Intrinsic Quality, Distance-based Quality Factor and Information Entropy. Quality measurements based on these indicators can inform further data processing and help support actual data quality improvements. We argue that the proposed indicators adequately capture, in a quantitative way, the impact of different numerical data quality issues, including (but not limited to) gaps, noise and outliers.
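The abstract names the indicators but does not define them. As a rough illustration only, the sketch below assumes that an entropy-style indicator can be approximated by the Shannon entropy of an equal-width histogram of the observed values, and that gaps can be summarised as the share of missing entries. The function names `shannon_entropy` and `gap_ratio`, the `bins` parameter and the binning scheme are hypothetical choices for this example, not the paper's definitions.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=10):
    """Shannon entropy (in bits) of an equal-width histogram.

    Assumption for illustration: missing entries (None or NaN) are
    ignored, and the remaining values are split into `bins` bins.
    """
    present = [v for v in values if v is not None and not math.isnan(v)]
    if not present:
        return 0.0
    lo, hi = min(present), max(present)
    if hi == lo:
        # All values identical: a single occupied bin, zero entropy.
        return 0.0
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in present)
    n = len(present)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def gap_ratio(values):
    """Fraction of entries that are missing (None or NaN)."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or math.isnan(v))
    return missing / len(values)


if __name__ == "__main__":
    readings = [1.0, 1.1, None, 0.9, 1.0, float("nan"), 100.0]
    print("gap ratio:", gap_ratio(readings))
    print("entropy (bits):", shannon_entropy(readings))
```

In this sketch a single extreme value such as `100.0` stretches the histogram range, so most readings collapse into one bin and the entropy changes accordingly. This is one plausible way an outlier could show up in a quantitative indicator, under the assumptions stated above.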
| Original language | English |
|---|---|
| Title of host publication | IoTBDS 2020 - Proceedings of the 5th International Conference on Internet of Things, Big Data and Security |
| Editors | Gary Wills, Peter Kacsuk, Victor Chang |
| Publisher | SciTePress |
| Pages | 341-348 |
| Number of pages | 8 |
| ISBN (Electronic) | 9789897584268 |
| Publication status | Published - 2020 |
| Event | 5th International Conference on Internet of Things, Big Data and Security, IoTBDS 2020 - Virtual, Online. Duration: 7 May 2020 → 9 May 2020 |
Conference
| Conference | 5th International Conference on Internet of Things, Big Data and Security, IoTBDS 2020 |
|---|---|
| City | Virtual, Online |
| Period | 7/05/20 → 9/05/20 |
Funding
This research is funded by the EPSRC Doctoral Training Partnership 2016-2017, University of Aberdeen, under award number EP/N509814/1.
Keywords
- Data Quality
- Data Quality Indicators
- Intrinsic Data Quality
- Numerical Data Quality
- Pre-processing