Overview and Importance of Data Quality for Machine Learning Tasks
- Abhinav Jain
- Hima Patel
- et al.
- 2020
- KDD 2020
The Data Quality for AI (DQAI) framework offers services that enable model developers and data scientists to establish a systematic and formalized data preparation program, streamlining the initial step of the model development lifecycle.
By implementing the DQAI framework, organizations can reduce the labor and time spent on data preparation, which in turn lowers the cost and duration of model development. Improving data quality early in the development cycle leads to more accurate models, better decision-making, and increased efficiency.
The framework is designed for preparing data for supervised classification or regression tasks and includes software for quality checks, remediation, audit reports, and automation. The same capabilities can also be used for custom data exploration and human-guided improvement of models, which makes the framework useful at any stage of model development.
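To make the idea of an automated quality check concrete, the following is a minimal sketch of the kind of metrics such a check might report for a labeled dataset. It is not the DQAI API: the function name `quality_report`, its parameters, and the three chosen metrics (missing-value fraction, exact duplicate rows, label balance) are assumptions made purely for illustration.

```python
from collections import Counter


def quality_report(rows, label_key):
    """Compute a few illustrative quality metrics (hypothetical helper, not DQAI).

    `rows` is a list of dicts with scalar (hashable) values;
    `label_key` names the target column.
    """
    n = len(rows)
    if n == 0:
        raise ValueError("dataset is empty")

    columns = sorted({key for row in rows for key in row})

    # Fraction of rows in which each column is missing (absent or None).
    missing = {c: sum(1 for r in rows if r.get(c) is None) / n for c in columns}

    # Number of rows that exactly repeat an earlier row.
    seen = Counter(tuple(r.get(c) for c in columns) for r in rows)
    duplicates = sum(count - 1 for count in seen.values())

    # Ratio of the rarest label count to the most common one;
    # 1.0 means perfectly balanced classes.
    labels = Counter(r[label_key] for r in rows if r.get(label_key) is not None)
    balance = min(labels.values()) / max(labels.values()) if labels else 0.0

    return {
        "rows": n,
        "missing_fraction": missing,
        "duplicate_rows": duplicates,
        "label_balance": balance,
    }


# Toy data, invented for this example only.
rows = [
    {"age": 34, "income": 52000, "label": "yes"},
    {"age": None, "income": 48000, "label": "no"},
    {"age": 34, "income": 52000, "label": "yes"},
    {"age": 51, "income": None, "label": "yes"},
]
report = quality_report(rows, "label")
```

A report like this could feed an audit summary or trigger a remediation step, for example dropping duplicates or flagging a column whose missing fraction exceeds a chosen threshold. Which metrics and thresholds are appropriate depends on the dataset and task.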