Data science education: We're missing the boat, again
Abstract
In the first wave of data science education programs, data engineering topics (systems, scalable algorithms, data management, integration) tended to be de-emphasized in favor of machine learning and statistical modeling. Anecdotal evidence suggests this was a mistake: data scientists report spending most of their time grappling with data far upstream of modeling activities. A second wave of data science education is emerging, one with increased emphasis on practical issues in ethics, legal compliance, scientific reproducibility, data quality, and algorithmic bias. The data engineering community has a second chance to influence these programs beyond simply providing a set of tools. In this panel, we'll discuss the role of data engineering in data science education programs and how best to capitalize on emerging opportunities in this space.