Portfolio

Three projects in quantitative methods and data infrastructure for social science research.

D-SOCIAL-KLD

Dynamic corporate social responsibility scores, 1991–2018

KLD rates firms on dozens of binary ESG indicators each year. Rather than summing them into an index, this project estimates a latent CSR score for each firm-year using a dynamic two-parameter item response theory (IRT) model fit in Stan. Discrimination parameters let the data decide which indicators are most informative; a firm-specific random walk links scores across years.

Bayesian inference, IRT, Stan, R
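
To make the model structure concrete, here is a minimal Python sketch of the data-generating process such a model assumes: each firm's latent score follows a random walk, and each binary indicator is drawn with a two-parameter logistic probability. It illustrates the idea only and is not the project's Stan code. The function name, the slope-intercept parameterization (logit = discrimination * score - difficulty), and the default innovation standard deviation are all assumptions made for this example.

```python
import math
import random


def simulate_dynamic_2pl(n_firms, n_years, discrimination, difficulty,
                         innovation_sd=0.3, seed=0):
    """Simulate 0/1 indicators from a dynamic two-parameter IRT model.

    Hypothetical helper for illustration; parameter values are not estimates.
    Returns (scores, ratings) where scores[i][t] is firm i's latent score in
    year t and ratings[i][t][j] is its 0/1 rating on indicator j.
    """
    rng = random.Random(seed)
    scores, ratings = [], []
    for _ in range(n_firms):
        theta = rng.gauss(0.0, 1.0)  # initial latent score
        firm_scores, firm_ratings = [], []
        for t in range(n_years):
            if t > 0:
                # Firm-specific random walk links this year to the last.
                theta += rng.gauss(0.0, innovation_sd)
            # 2PL response probability for each indicator.
            probs = [1.0 / (1.0 + math.exp(-(a * theta - b)))
                     for a, b in zip(discrimination, difficulty)]
            firm_ratings.append([1 if rng.random() < p else 0 for p in probs])
            firm_scores.append(theta)
        scores.append(firm_scores)
        ratings.append(firm_ratings)
    return scores, ratings
```

An indicator with a larger discrimination value responds more sharply to changes in the latent score, which is why such indicators carry more weight than they would in a simple sum.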

Six’s Generals

Digitizing 2,206 Napoleonic-era military biographies

Georges Six’s Dictionnaire biographique (1934) contains structured biographical entries for every general and admiral of Revolutionary and Imperial France. This project builds a pipeline to convert Six’s semicolon-delimited prose into a machine-readable panel dataset, then uses it to examine promotion dynamics, battlefield careers, and the imperial honor economy.

NLP / OCR, Data pipeline, Python, Military history
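
As a rough illustration of the parsing step, the sketch below splits a semicolon-delimited entry into clauses and pulls out the first four-digit year in each. The function name, the year pattern, and the sample entry are invented for this example; the sample is not a transcription from Six, and the real pipeline presumably handles far more variation (OCR noise, full dates, place names).

```python
import re

# Matches a four-digit year between 1600 and 1899 (an assumed range).
YEAR_PATTERN = re.compile(r"\b(1[6-8]\d{2})\b")


def split_career_events(entry):
    """Split a semicolon-delimited entry into event records.

    Hypothetical helper: each record keeps the clause text and the first
    year found in it, or None when the clause carries no year.
    """
    events = []
    for clause in entry.split(";"):
        clause = clause.strip().rstrip(".")
        if not clause:
            continue  # skip empty fragments, e.g. after a trailing semicolon
        match = YEAR_PATTERN.search(clause)
        events.append({
            "text": clause,
            "year": int(match.group(1)) if match else None,
        })
    return events


# Invented sample entry, for illustration only.
sample = ("lieutenant, 4 March 1793; captain, 12 June 1796; "
          "wounded in action; general of brigade, 1 February 1805.")
events = split_career_events(sample)
```

Stacking these records across all entries, with one row per general and event, gives the long-format table from which a panel can be built.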

Shadow of Intervention

Measuring anticipated third-party intervention in civil wars

Theory predicts that expected outside intervention shapes the decision to rebel, but no existing measure covers the full population of potential interveners. This project constructs one using a two-stage learned-proxy design. First, a machine-learning ensemble predicts intervention probabilities for every directed dyad-year. Second, those predictions are aggregated into a country-year shadow of intervention, disciplined by a Nash fixed-point condition.

Machine learning, Causal inference, Civil conflict, Measurement
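
The aggregation step can be pictured with a deliberately simple stand-in. The sketch below collapses dyad-year probabilities into a country-year value as the probability that at least one potential intervener acts, treating dyads as independent. This is an assumption chosen for illustration, not necessarily the aggregation the project uses, and it leaves out the Nash fixed-point condition entirely. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def aggregate_to_country_year(dyad_predictions):
    """Collapse directed dyad-year probabilities to country-year values.

    dyad_predictions: iterable of (intervener, target, year, probability).
    Returns {(target, year): P(at least one intervention)}, assuming
    interventions are independent across interveners (illustrative only).
    """
    no_intervention = defaultdict(lambda: 1.0)
    for intervener, target, year, prob in dyad_predictions:
        # Multiply up the probability that nobody intervenes.
        no_intervention[(target, year)] *= 1.0 - prob
    return {key: 1.0 - value for key, value in no_intervention.items()}
```

Under this stand-in, two potential interveners who would each step in with probability 0.5 give a country-year value of 0.75.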