Overview:

Our project aims to measure how much open source software is in use, how much is created, who develops these tools, and how they are shared across sectors, institutions, and organizations. Building on past research, our team used data scraped from GitHub, the world’s largest code hosting platform, to classify users into academic, government, and business sectors using natural language processing and by linking multiple publicly available data sources. We then used social network analysis to examine collaborations within and across these sectors to better understand how open source software tools are developed across the globe.

Research Project Webpage:

Click here for more details about the project including findings, data, and methods.

Fellows:

Daniel Bullock

Indiana University Bloomington, Cognitive Neuroscience

Interns:

Morgan Klutzke

Indiana University, Psychology and Cognitive Science

Crystal Zang

Smith College, Mathematics, Statistical & Data Science

Mentors:

Gizem Korkmaz

Research Associate Professor, Biocomplexity Institute, University of Virginia

Brandon Kramer

Postdoctoral Associate, Biocomplexity Institute, University of Virginia

J Bayoán Santiago Calderón

Postdoctoral Associate, Biocomplexity Institute, University of Virginia

Stakeholder:

Carol Robbins, Senior Analyst, National Center for Science and Engineering Statistics, Science and Engineering Indicators