Overview
We use administrative data on federal grants to discover research topics and their trends in artificial intelligence (AI). Our data source is Federal RePORTER, a database of federally funded research grants that includes project abstracts and other project data such as funding agencies and start years. We filter the Federal RePORTER project abstracts for those that describe AI projects. AI is a complex, hard-to-define theme, so this filtering problem is challenging. We use three different filtering methods: 1) an AI term matching method proposed by the Organisation for Economic Co-operation and Development (OECD), 2) a method by Eads et al. that combines term matching and topic modeling, and 3) a Sentence-BERT (bidirectional encoder representations from transformers) method that measures the similarity between the AI Wikipedia page and each grant abstract. Each filtering method produces an AI-themed corpus, on which we run a non-negative matrix factorization (NMF) topic model. Using linear regression and visualization, we analyze the topic model results to discover AI research trends in federally funded projects.
Fellows
Crystal Zang
University of Pittsburgh Graduate School of Public Health
Interns
Haleigh Tomlin
Washington and Lee University
Cierra Oliveira
Clemson University
Mentors
Joel Thurston
Senior Scientist, Biocomplexity Institute, University of Virginia
Eric Oh
Research Assistant Professor, Biocomplexity Institute, University of Virginia
Stephanie Shipp
Research Professor, Biocomplexity Institute, University of Virginia
Kathryn Linehan
Research Scientist, Biocomplexity Institute, University of Virginia
Stakeholders
John Jankowski
Director of R&D Statistics Program, National Center for Science and Engineering Statistics
Audrey Kindlon
Survey Statistician, National Center for Science and Engineering Statistics