Past Events

Speakers:  Dr. Sana Syed, MD, MSCR, MSDS is an Associate Professor in the Departments of Pediatrics and Public Health Sciences in the School of Medicine and the School of Data Science.

Dr. Thomas Hartka, MD, MS is an assistant professor in the Department of Emergency Medicine and the Department of Mechanical and Aerospace Engineering.

Co-facilitators: Phil Bourne, Founding Dean, School of Data Science and Professor of Biomedical Engineering and Sallie Keller, Distinguished Professor in Biocomplexity and a Division Director in the Biocomplexity Institute

Title: Physician Data Scientists: Wearing Two Hats

Abstract: As if the achievement of a medical degree wasn’t enough, two UVA doctors embarked on another challenging pursuit: an MS in Data Science. These recent “graduates” from UVA’s School of Data Science will share their experiences as medical doctors and data scientists in this Data Science for the Public Good Distinguished Speaker Series.

Date: September 27, 2021
Time: 4:00-5:00 p.m. ET

Watch Video



1 p.m. Plenary Session: Welcome and Keynote
2 p.m. Young Scholars Program: Overview and Highlights
2:15 p.m. Break
2:30-4:30 p.m. Poster Sessions

Please join us for our annual Data Science for the Public Good Symposium to be hosted virtually featuring keynote speaker Jeri Mulrow, Vice President and Director of Statistics and Evaluation Sciences at Westat, and this year's DSPG Young Scholars. *Note: This event is being recorded by audio, video, and photographic means. By attending, you grant the University of Virginia the right to use your voice/likeness in any depiction of this event. 


Keynote Speech Video       Poster Sessions       See Full Program

Speaker: Hadley Wickham
Hadley Wickham is a chief scientist at RStudio, and an adjunct professor of statistics at the University of Auckland, Stanford University, and Rice University. To learn more about Hadley, click here.

Title: dplyr: one language, many implementations

Abstract: One of dplyr's lesser known features is that it works with data stored in a wide range of ways, translating dplyr verbs into a variety of other computational frameworks. In his talk, Hadley will discuss three important backends: dtplyr, dbplyr, and multidplyr. These allow dplyr to seamlessly scale up to handle ever larger datasets:

  • dtplyr uses the fantastic data.table package to quickly work with large in-memory datasets; 
  • dbplyr converts your R code to SQL so you can work with data of any size in a relational database; and 
  • multidplyr allows you to easily take advantage of every core on your computer. 

Hadley will also discuss recent community contributions that extend these backends to key tidyr verbs, and share why he thinks separating description from computation is such a powerful idea.

Date: March 31, 2021
Time: 4-5 p.m. ET

Watch Video

Speaker: Vicki Vasques, Owner and Chairman of the Board of Tribal Tech, LLC and Cowan & Associates

Title: The Role for Data in Successfully Navigating a Woman-Owned Small Business

Abstract: Data serves as an invaluable resource in running any business, helping to show how resilient and resourceful we are. Because the majority of our business is with Federal, State and Tribal partners, we use data to develop new business and to continue doing business with the clients we already serve. We also use data to document our work, which is especially important for our health and wellness work throughout Indian Country. As our business grows, there is a greater need to grow our data resources – one email can go a long way when it comes to taking action on customer data. The current research and data analysis have been helpful as we’ve been trying to navigate through these times of a global pandemic, racism, civil unrest and economic uncertainty.

Date: February 18, 2021
Time: 4:00-5:00 p.m. ET

Watch Video

Speaker: Shamina Singh
Shamina Singh is the Founder & President of the Center for Inclusive Growth, the philanthropic hub of Mastercard. She also serves as Executive Vice President of Corporate Sustainability. Over a 15-year career in the public sector, she also has held senior positions in the White House and the U.S. House of Representatives. To learn more about Shamina, click here.

Title: Building the Field of Data Science for Social Impact

Abstract: With nearly 2.5 quintillion bytes of data produced daily, how might we leverage the potential of data to address the socio-economic challenges of the COVID-19 pandemic, systemic racism and the deepening divide of information inequality? Despite great advances in data science, those who may benefit most from precise and timely data analytics, the social sector and civic organizations, are lagging behind. With increased attention and support, they can leverage data analytics to make their work go further and faster, ultimately helping more people survive, thrive and strive in a digital economy. But transforming the role of data in addressing major social and economic issues is not a job for any one person or organization. Only through a crowding-in of time, talent and capital can the digital economy begin to work for everyone, everywhere.

Drawing on groundbreaking case studies and the use of data tools to unlock potential in economically distressed communities, Shamina Singh, together with leading data scientists from Mastercard, will share examples of how data is helping to advance an agenda of inclusive growth.

Date: October 6, 2020
Time: 4:00-5:00 p.m. ET

Watch Video

Come listen to a great keynote speaker and Oregon State DSPG young scholars discuss their summer data science projects to assist rural stakeholders in Oregon.

Keynote speaker: Professor Charisse Madlock-Brown, a faculty member in Health Informatics and Information Management at the University of Tennessee Health Science Center, will speak on the topic: "Social Determinants of Health Related to COVID-19 in Urban and Rural Communities."

Date: August 21, 2020
Time: 1-4 p.m. PT

This event is free and open to the public.

Register here      Flyer Here

*Note: These events are being recorded by audio, video, and photographic means. By attending, you grant the University of Virginia the right to use your voice/likeness in any depiction of these events. 

Please join us for our annual Data Science for the Public Good Symposium to be hosted virtually featuring keynote speaker Kenneth Prewitt, Carnegie Professor of Public Affairs and Special Advisor to the President, and this year's DSPG Young Scholars.

Thanks to a grant from the USDA, the DSPG Young Scholars program was able to expand beyond the Commonwealth of Virginia for the first time to create a three-state Coordinated Innovation Network among five partner universities: Oregon State University, Iowa State University, Virginia Tech, University of Virginia, and Virginia State University. 

As a result, this year’s DSPG Symposium brings over 60 undergraduate and graduate students together with postdoctoral fellows and faculty to present 30 research projects that address critical social issues relevant in the world today.

Date: August 7, 2020 
Time: 1-4:30 p.m. ET

Download Flyer       Poster Sessions      Keynote Speech Video

Speaker: Mark Hansen, David and Helen Gurley Brown Professor of Journalism and Innovation and Director of the David and Helen Gurley Brown Institute for Media Innovation at Columbia University

Title: Fascinating Revelations 

Abstract: Journalists have long turned to data and computation as important components in their reporting. In 1904, Joseph Pulitzer himself advocated for the inclusion of data analysis in his College of Journalism. “You want statistics to tell you the truth,” he explained, and then quickly pointed out that with statistics you can find “romance, human interest, humor and fascinating revelations.” Data and its analysis are a unique source for journalists, one with descriptive power on almost every beat. In ways that would have been hard for Pulitzer to fully anticipate, however, data and computation now form complex systems of power in our world. Journalists have a unique responsibility to assess if these systems are fair, holding this new form of power to account. For the last eight years, Hansen has been training journalists to report “on” as well as “with” data, code and algorithms. He will relate his experiences and what “computational journalism” might mean for the practice of data science.

Date: June 4, 2020
Time: 4:00-5:00 p.m. ET

Watch Video

Speaker: danah boyd, a partner researcher at Microsoft Research, the founder and president of Data & Society, and a visiting professor at New York University

Title: Data: Its Vulnerabilities and Legitimacy

Abstract: Data-driven and algorithmic systems increasingly underpin many decision-making systems, shaping where law enforcement are stationed and what news you are shown on social media. The design of these systems is inscribed with organizational and cultural values. Often, these systems depend on the behavior of everyday people, who may not act as expected. Meanwhile, adversarial actors also seek to manipulate the data upon which these systems are built for personal, political, and economic reasons. In response to calls for algorithmic accountability, computer scientists may seek to "de-bias" data, build "fair" algorithms, or make their models interpretable. Yet, the attack surface goes far beyond the technical realm. In turn, this challenges the legitimacy of the data.

Weaving together her work on media manipulation, search engine "data voids," and efforts to protect the 2020 U.S. Census, danah will help the audience think about the social and technical challenges that our data-centric world introduces.

Date: January 21, 2020
Time: 4:00-5:00 p.m. ET

Speaker: Katharine Abraham, Director of the Maryland Center for Economics and Policy, and a professor of Survey Methodology and Economics at the University of Maryland

Title: Informing Decisions while Protecting Privacy: The Future of the Federal Statistical System

Abstract: The Federal statistical system faces a multitude of challenges—limited budgets, increasing difficulty in obtaining survey responses, demands for more timely and detailed data, and growing concerns about individual privacy. Addressing these challenges will require a fundamental rethinking of how the statistical agencies do their work. Greater use of naturally-occurring “big data,” new models for data dissemination, and strengthened partnerships with academia and the business community all will be a part of an effective response. The Foundations of Evidence-Based Policymaking Act signed into law earlier this year lays the groundwork for some important first steps.

Date: September 17, 2019
Time: 4:00-6:30 p.m. ET

Arlington, VA

This year's Symposium featured keynote speaker Ron Jarmin, deputy director and chief operating officer of the U.S. Census Bureau. He spoke about the 2020 Census and its continual evolution to better suit the needs of our country. The Data Science for the Public Good Annual Symposium is a showcase for researchers across the country, including the Institute's DSPG Young Scholars.

This year's Symposium also featured keynote speaker Phil Bourne, director of the University of Virginia’s Data Science Institute and acting dean of the School of Data Science. This talk invited the audience to think about what we can learn about responsibility from genomics as we perform data science for the public good, and how openness, reproducibility, etc. can lead to responsible data science across all domains.

The 2019 symposium keynote speakers and the young scholar poster speed session are available on YouTube.

Keynote and Poster Session Videos       Research Projects

Speaker: Martin O'Malley, former governor of Maryland

Title: Smarter Government: The Data, the Map, and the Method

Date: June 14, 2019
Time: 4:00-6:30 p.m. ET

Download Slides       News Article