Veridical Data Science Toward Trustworthy AI

Friday, October 27, 2023 2:00 pm - 3:00 pm

Speaker: Bin Yu, UC Berkeley

Title: Veridical Data Science Toward Trustworthy AI

Abstract: Data science is central to AI and has driven most of the recent advances in biomedicine and beyond. Human judgment calls are ubiquitous at every step of a data science life cycle (DSLC): problem formulation, data cleaning, exploratory data analysis (EDA), modeling, and reporting. Such judgment calls are often responsible for the "dangers" of AI by creating a universe of hidden uncertainties well beyond sample-to-sample uncertainty. To mitigate these dangers, veridical (truthful) data science is introduced based on three principles: Predictability, Computability, and Stability (PCS). The PCS framework and documentation unify, streamline, and expand on the ideas and best practices of statistics and machine learning. In every step of a DSLC, PCS emphasizes reality checks through predictability, considers computability up front, and takes into account expanded sources of uncertainty, including those from data curation/cleaning and algorithm choice, to build more trust in data results. PCS will be showcased through collaborative research in finding genetic drivers of a heart disease, stress-testing a clinical decision rule, and identifying microbiome-related metabolite signatures for possible early cancer detection.

Where

Duques Hall, School of Business, 2201 G Street NW, Washington, DC 20052

Room: 151

Admission

Open to everyone.