The Science of Data Collection: Insights from Surveys Can Improve Machine Learning Models

Abstract

Whether future AI models make the world safer or less safe for humans rests in part on our ability to efficiently collect accurate data from people about what they want the models to do. However, collecting high-quality data is difficult, and most AI/ML researchers are not trained in data collection methods. The growing emphasis on data-centric AI highlights the potential of data to enhance model performance. It also reveals an opportunity to gain insights from survey methodology, the science of collecting high-quality survey data. This webinar summarizes lessons from the survey methodology literature and discusses how they can improve the quality of training and feedback data, which in turn improves model performance. It also suggests ideas for collaborative research into how possible biases in data collection can be mitigated, making models more accurate and human-centric.

Date
May 30, 2024 12:00 PM
Event
United Nations ISWGHS-Global Network Webinar

Recording available at https://www.yammer.com/unstats