Overview

We use administrative data on federal grants to discover research topics and their trends in artificial intelligence (AI). Our data source is Federal RePORTER, a database of federally funded research grants that includes project abstracts and other project data, such as funding agencies and start years. We filter the Federal RePORTER project abstracts to keep those that describe AI projects. AI is a complex, hard-to-define theme, so this filtering problem is challenging. We use three different filtering methods: 1) an AI term-matching method proposed by the Organisation for Economic Co-operation and Development (OECD), 2) a method by Eads et al. that combines term matching and topic modeling, and 3) a Sentence-BERT (bidirectional encoder representations from transformers) method that measures the similarity between the AI Wikipedia page and each grant abstract. Each filtering method produces an AI-themed corpus, on which we run a non-negative matrix factorization (NMF) topic model. Using linear regression and visualization, we analyze the topic model results to discover AI research trends in federally funded projects.
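As a rough illustration of the filter-then-trend idea, the sketch below applies simple term matching to a few toy abstracts and fits a least-squares line to the yearly share of matches. Everything in it is an assumption made for illustration: the term list is hypothetical (it is not the OECD list), the records are invented rather than taken from Federal RePORTER, and the yearly share of matched abstracts stands in for the topic weights that an NMF model would supply in the project itself.

```python
import re
from collections import defaultdict

# Hypothetical term list for illustration only; NOT the OECD's actual list.
AI_TERMS = ["artificial intelligence", "machine learning",
            "deep learning", "neural network"]

# Invented toy records standing in for grant data (not real projects).
PROJECTS = [
    {"year": 2010, "abstract": "We study soil erosion in river basins."},
    {"year": 2010, "abstract": "A survey of rural health clinics."},
    {"year": 2011, "abstract": "Machine learning for protein folding."},
    {"year": 2011, "abstract": "Field trials of drought-resistant wheat."},
    {"year": 2012, "abstract": "A neural network for tumor detection."},
    {"year": 2012, "abstract": "Deep learning methods for speech data."},
]


def mentions_ai(abstract, terms=AI_TERMS):
    """Return True if any term appears as a whole phrase (case-insensitive)."""
    text = abstract.lower()
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)


def yearly_share(projects):
    """Map each year to the fraction of its abstracts that match a term."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for p in projects:
        totals[p["year"]] += 1
        if mentions_ai(p["abstract"]):
            hits[p["year"]] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


shares = yearly_share(PROJECTS)
slope = ols_slope(list(shares.keys()), list(shares.values()))
print(shares)
print(f"change in share per year: {slope:.2f}")
```

A positive slope suggests that matching abstracts are becoming more common over time and a negative slope suggests the opposite. Exact phrase matching of this kind misses plurals and other variants (for example "neural networks"), which is one reason a simple term list alone may be too coarse for a theme as broad as AI.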

Teaser Video

Zoom Link

 

Project Website

 

Fellows

Crystal Zang

University of Pittsburgh Graduate School of Public Health

Interns

Haleigh Tomlin 

Washington and Lee University

Cierra Oliveira 

Clemson University

Mentors

Joel Thurston

Senior Scientist, Biocomplexity Institute, University of Virginia

Eric Oh

Research Assistant Professor, Biocomplexity Institute, University of Virginia

Stephanie Shipp

Research Professor, Biocomplexity Institute, University of Virginia

Kathryn Linehan

Research Scientist, Biocomplexity Institute, University of Virginia

Stakeholders

John Jankowski 

Director of R&D Statistics Program, National Center for Science and Engineering Statistics

Audrey Kindlon

Survey Statistician, National Center for Science and Engineering Statistics