This folder contains the data and code behind the article: A breakdown of the 1,740 honorary degrees Penn has granted in the last three centuries
To see full lists of honorary degree recipients and their genders, follow the links below. Harvard's data does not cover all of its honorary degree recipients; its page has complete information only since 1990.
| School | File | Website |
|---|---|---|
| Penn | genders_penn_clean.csv | Penn's Honorary Degrees |
| Harvard | genders_harvard_clean.csv | Harvard's Honorary Degrees |
| Brown | genders_brown_clean.csv | Brown's Honorary Degrees |
The data was scraped from each university's honorary degrees website using scraper.py. We also gathered data from Harvard and Brown because, among the other Ivy League schools, those two had the most data and the sites most amenable to web scraping.
After scraping the data and cleaning up names, we wrote gender.py to discern the recipients' genders by using two different Python modules, Gender Guesser and Genderizer. If both modules agreed on the recipient's gender, we kept that gender. Otherwise, we confirmed the recipient's gender manually.
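The agreement rule described above can be sketched as follows. This is a minimal illustration, not the code in gender.py: the function name `reconcile` and the label strings are hypothetical, and treating only "male" and "female" as usable answers is an assumption about how ambiguous guesses would be handled.

```python
def reconcile(guess_a, guess_b):
    """Return the agreed gender, or None when the name needs manual review.

    guess_a and guess_b are the labels produced by the two modules
    (hypothetical label strings; the real modules may use other values).
    """
    definite = {"male", "female"}  # assumption: anything else counts as unsure
    if guess_a == guess_b and guess_a in definite:
        return guess_a
    return None  # disagreement or an ambiguous label: confirm by hand
```

Returning `None` rather than picking one of the two guesses keeps every uncertain name visible for the manual check.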
To find gender breakdowns over different time periods, we wrote breakdown.py, which takes the school's name and, optionally, a start year or both a start year and an end year.
For example, running `python breakdown.py penn 1990` returns data on Penn from 1990 to present, whereas running `python breakdown.py penn 1990 1999` returns data from 1990 through 1999.
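The optional year range could be applied along these lines. This is a sketch under assumptions rather than the contents of breakdown.py: the function name and the `year` and `gender` column names are hypothetical, and the end year is treated as inclusive to match "1990 through 1999".

```python
from collections import Counter


def breakdown(rows, start=None, end=None):
    """Count recipients by gender, optionally limited to start..end inclusive.

    rows is an iterable of dicts such as those yielded by csv.DictReader;
    the "year" and "gender" keys are assumed column names.
    """
    counts = Counter()
    for row in rows:
        year = int(row["year"])
        if start is not None and year < start:
            continue  # before the requested window
        if end is not None and year > end:
            continue  # after the requested window
        counts[row["gender"]] += 1
    return counts
```

Leaving `end` as `None` covers the "from 1990 to present" case without needing to know the current year.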
Finally, over_time.py calculates the percentage of female and male recipients for all years in the input CSV. This script only works for Penn and Brown, since Harvard doesn't have complete data before 1990.
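A per-year percentage calculation might look like the sketch below. It is illustrative only and not taken from over_time.py: the function name and the `year` and `gender` column names are assumptions, and rows with any other gender label are simply skipped.

```python
from collections import defaultdict


def percent_by_year(rows):
    """Return {year: {"female": pct, "male": pct}} for every year in rows."""
    totals = defaultdict(lambda: {"female": 0, "male": 0})
    for row in rows:
        gender = row["gender"]
        if gender not in ("female", "male"):
            continue  # assumption: ignore rows without a usable label
        totals[int(row["year"])][gender] += 1

    result = {}
    for year, counts in sorted(totals.items()):
        n = counts["female"] + counts["male"]  # at least 1 by construction
        result[year] = {g: 100 * counts[g] / n for g in ("female", "male")}
    return result
```

Years with no recipients do not appear in the result, so a chart built from it would need to decide how to show those gaps.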
The visualizations for this project were made with C3.js, and the code can be found in charts.js.