Gender Breakdown of Honorary Degree Recipients

This folder contains the data and code behind the article: A breakdown of the 1,740 honorary degrees Penn has granted in the last three centuries

See All Recipients

To see full lists of honorary degree recipients and their genders, follow the links below. Harvard's data does not cover all of its honorary degree recipients; its page only has complete information since 1990.

| School | File | Website |
| --- | --- | --- |
| Penn | genders_penn_clean.csv | Penn's Honorary Degrees |
| Harvard | genders_harvard_clean.csv | Harvard's Honorary Degrees |
| Brown | genders_brown_clean.csv | Brown's Honorary Degrees |

How It Works

Data Analysis

The data was scraped from each university's honorary degrees website using scraper.py. We chose to also gather data from Harvard and Brown because, of the other schools in the Ivy League, those two had the most data and the sites most friendly to web scraping.

After scraping the data and cleaning up names, we wrote gender.py to discern the recipients' genders by using two different Python modules, Gender Guesser and Genderizer. If both modules agreed on the recipient's gender, we kept that gender. Otherwise, we confirmed the recipient's gender manually.
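The agreement rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in gender.py: the function name `reconcile_gender` and the label strings `"male"` and `"female"` are assumptions, and the two guesses are passed in as plain strings rather than obtained from the actual modules.

```python
from typing import Optional

# Hypothetical label set; the real modules may return other values
# (for example "unknown"), which are treated here as "no confident guess".
CONFIDENT_LABELS = {"male", "female"}


def reconcile_gender(guess_a: Optional[str], guess_b: Optional[str]) -> Optional[str]:
    """Return the gender both guessers agree on, or None to flag manual review."""
    if guess_a in CONFIDENT_LABELS and guess_a == guess_b:
        return guess_a
    # Disagreement, or at least one non-confident answer: check by hand.
    return None


if __name__ == "__main__":
    print(reconcile_gender("female", "female"))
    print(reconcile_gender("male", "female"))
```

Returning `None` instead of picking one guess keeps every uncertain name visible for the manual confirmation step.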

To find gender breakdowns over different time periods, we wrote breakdown.py, which takes the school's name and, optionally, either a start year or a start year and an end year. For example, running python breakdown.py penn 1990 returns data on Penn from 1990 to the present, whereas running python breakdown.py penn 1990 1999 returns data from 1990 through 1999.
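One way a command line of this shape could be handled is sketched below. It is an illustration only and may differ from breakdown.py: the row keys `"year"` and `"gender"`, the function names, and the sample rows are all assumptions.

```python
import argparse
from collections import Counter


def parse_args(argv=None):
    """Parse: <school> [start_year] [end_year]."""
    parser = argparse.ArgumentParser(description="Gender breakdown for one school")
    parser.add_argument("school")
    parser.add_argument("start_year", nargs="?", type=int, default=None)
    parser.add_argument("end_year", nargs="?", type=int, default=None)
    return parser.parse_args(argv)


def breakdown(rows, start_year=None, end_year=None):
    """Count recipients by gender, keeping only rows inside the year range.

    Both bounds are inclusive; a bound of None means "no limit" on that side.
    """
    counts = Counter()
    for row in rows:
        year = int(row["year"])
        if start_year is not None and year < start_year:
            continue
        if end_year is not None and year > end_year:
            continue
        counts[row["gender"]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical rows; real input would come from the school's CSV file.
    sample = [
        {"year": "1985", "gender": "male"},
        {"year": "1992", "gender": "female"},
        {"year": "2005", "gender": "male"},
    ]
    args = parse_args(["penn", "1990", "1999"])
    print(args.school, dict(breakdown(sample, args.start_year, args.end_year)))
```

Leaving both year arguments optional lets the same function serve the "all years", "from a start year", and "between two years" cases.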

Finally, over_time.py calculates the percentage of female and male recipients for all years in the input CSV. This script only works for Penn and Brown, since Harvard doesn't have complete data before 1990.
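The per-year percentage calculation might look something like the sketch below. It is not the contents of over_time.py: the row keys `"year"` and `"gender"` and the function name `percentages_by_year` are assumptions made for illustration.

```python
from collections import defaultdict


def percentages_by_year(rows):
    """Return {year: {"female": pct, "male": pct}} for every year in rows."""
    counts = defaultdict(lambda: {"female": 0, "male": 0})
    for row in rows:
        gender = row["gender"]
        # Rows with any other label are skipped rather than guessed at.
        if gender in ("female", "male"):
            counts[int(row["year"])][gender] += 1

    result = {}
    for year, by_gender in sorted(counts.items()):
        # A year only appears in counts once it has at least one recipient,
        # so total is never zero here.
        total = by_gender["female"] + by_gender["male"]
        result[year] = {g: 100.0 * n / total for g, n in by_gender.items()}
    return result


if __name__ == "__main__":
    # Hypothetical rows; real input would be read from the cleaned CSV.
    sample = [
        {"year": "2000", "gender": "female"},
        {"year": "2000", "gender": "male"},
        {"year": "2001", "gender": "female"},
    ]
    for year, pct in percentages_by_year(sample).items():
        print(year, pct)
```

Years with no recipients simply do not appear in the result, which is one reason incomplete data, as with Harvard before 1990, would give a misleading series.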

Visualization

The visualizations for this project were made with C3.js, and the code can be found in charts.js.