Historical Chinese Microdata. 40 Years of Dataset Construction by the Lee-Campbell Research Group


  • Cameron Campbell
  • James Lee




Keywords: China, Historical demography, Historical big data, Quantitative history, Inequality


The Lee-Campbell Group has spent forty years constructing and analysing individual-level datasets based largely on Chinese archival materials to produce a scholarship of discovery. Initially, we constructed datasets for the study of Chinese demographic behaviour, households, kin networks, and socioeconomic attainment. More recently, we have turned to the construction and analysis of datasets on civil and military officials and other educational and professional elites, focusing on their social origins and careers. As of July 2020, the datasets include nominative information on the behaviour and life outcomes of approximately two million individuals. This article is a retrospective on the construction of these datasets and a summary of the findings based on them. This is the first time we have presented all our projects together and discussed them and the results of our analysis as a single integrated whole. We begin by summarizing the contents, organization, and notable features of each dataset and by providing an integrated history of our data construction from 1979 to the present. We then summarize the most important results from our research on demographic behaviour, on family and household organization, and, more recently, on inequality and stratification. We conclude with a reflection on the importance of data discovery, flexibility, interaction, and collaboration to the success of our efforts.


