Measuring Migration Status Based on the Place of Marriage Overestimates the Share of Male Migrants in Historical Populations. Evidence From Dutch Marriage Certificates

e-ISSN: 2352-6343 DOI article: https://doi.org/10.51964/hlcs9583 © 2021, Rosenbaum-Feldbrügge, Puschmann This open-access work is licensed under a Creative Commons Attribution 4.0 International License, which permits use, reproduction & distribution in any medium for non-commercial purposes, provided the original author(s) and source are given credit. See http://creativecommons.org/licenses/.


Thanks to the construction of large databases such as LINKS and GENLIAS based on Dutch civil certificates, our knowledge of individual demographic behavior in the past has improved significantly. However, the use of such research infrastructures also introduces some potential pitfalls, as these databases do not contain all information available from the original sources. For instance, the places of residence of the bride and the groom at marriage are recorded on the original certificates but are lacking in LINKS. A common practice among researchers using LINKS and GENLIAS is therefore to identify migrants by comparing an individual's birth place with the place of marriage. The place of marriage, however, is not necessarily identical to the place of residence, because couples traditionally contracted their marriage in the bride's or bride's parents' municipality of residence. It is therefore particularly likely that grooms are erroneously considered migrants even though they had never moved before marriage. In this paper we explore whether this poses a problem for studies using the place of marriage as an equivalent of the place of residence. We do so with the help of the marriage certificates release of the Historical Sample of the Netherlands (HSN), which, unlike LINKS, contains both the place of marriage of the couple and the residence of the bride and groom, and thus allows us to compare the findings derived from both approaches. The analyses show that identifying migrants based on place of marriage indeed causes a significant overestimation of male migrants, but not of female migrants. We therefore suggest using the birth place of a couple's first child as a robustness check to avoid overestimating male migration in the past.
In the past as well as in contemporary societies, the demographic behavior of migrants differs substantially from that of non-migrants in various life domains, ranging from marriage and fertility to social mobility and mortality (Puschmann, 2015). As incorrect assumptions about the size and behavior of the migrant population might bias a wide range of demographic analyses, it is of utmost importance to correctly identify domestic migrants, international migrants and non-migrants. This applies even to studies which do not focus on migration but introduce migration status as a control variable.
There are several ways to identify migrants and measure individual migration status using historical sources. The gold standard in historical-demographic migration research is a series of linked registers which provide a longitudinal data structure, thereby covering the entire life course of an individual and enabling researchers to identify the exact timing, frequency, origin, and destination of moves. Examples of historical demographic databases which are based on such longitudinal source materials are the Historical Sample of the Netherlands (HSN; Mandemakers, 2000), the Scanian Economic Demographic Database (SEDD; Dribe & Quaranta, 2020), the POPUM and POPLINK databases (Westberg, Engberg, & Edvinsson, 2016), the Antwerp COR*-database (Jenkinson, Anguita, Paiva, Matsuo, & Matthijs) and the Utah Population Database (UPDB). Collecting the data for and constructing longitudinal historical databases, however, is costly and resource-intensive. In addition, detailed population registers are not available in most historical contexts and study periods.
Historical demographers therefore rely on cross-sectional data only or on a linkage of several cross-sections (mainly censuses) to identify migrants and non-migrants. Another approach is the linkage of individual civil certificates of birth, marriage and death. Particularly in the Netherlands, this approach has recently received much attention with the construction of the LINKing System for historical family reconstruction (LINKS), which aims at reconstructing all 19th- and 20th-century families in the Netherlands with the help of a digitized index of all civil certificates (Mourits, van Dijk, & Mandemakers, 2020). LINKS was originally established as a resource for genealogists, but recent research on the Dutch province of Zeeland shows that it serves as a valuable data source for historical demographic research on fertility and mortality as well (e.g. Mourits, 2019; van Dijk, 2019).
Researchers using LINKS or similar databases typically identify migrants by comparing an individual's place of birth with the place of marriage, as the actual residence of the groom and bride is not available in the database (e.g. Störmer, Gellatly, Boele, & de Moor, 2017). This approach might be problematic as the place of marriage is not necessarily identical to the residences of the groom and bride. In the 19th-century Netherlands, for instance, couples traditionally contracted their marriage in the bride's or the bride's parents' place of residence (Ekamper, van Poppel, & Mandemakers, 2011). Therefore, we expect that comparing the birth place with the place of marriage is potentially misleading when it comes to male migration trajectories (Rosenbaum-Feldbrügge & Debiasi, 2019).
In the present methodological paper we investigate how often the places of marriage of brides and grooms differ from their actual places of residence, in order to measure the size and direction of the bias. We do so with the help of the marriage certificates collected within the HSN. In contrast to LINKS, the HSN marriage certificates release contains, in addition to the birth places and the place of marriage, the bride's and groom's actual places of residence.
The HSN certificates release 2010 contains birth, marriage and death certificates of HSN research persons. For this specific exercise, we exclusively used the marriage certificates (N = 29,480), which contain information such as the municipality of marriage, the age and occupational title of the bride and groom, as well as their birth location and residence. From the raw HSN data we selected only first marriages contracted between 1850 and 1900 for which information on all places of residence and places of birth was available (N = 11,764). Analyses run for the period 1850-1940 revealed very similar results. Selecting first marriages in LINKS, in contrast, is much more complicated, as the database includes no information about individual civil status.

DATA AND METHODS
We study the migration status of grooms and brides based on two different approaches. The first approach compares birth place and marriage place (i.e. if wedding place ≠ birth place → groom/bride = migrant), which is often used in datasets which lack information on the residence of the bride and groom. We expect that this approach is prone to biases, especially regarding the migration status of the groom. The second approach is a comparison between the location of birth and the actual residence of the bride and groom as indicated on the marriage certificate (i.e. if residence groom ≠ birth place groom → groom = migrant; if residence bride ≠ birth place bride → bride = migrant).
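The two classification rules described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Spouse` class, its field names and the example municipalities are assumptions made for the illustration and do not correspond to actual HSN or LINKS variable names or records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spouse:
    # Field names are illustrative, not HSN or LINKS variable names.
    birth_place: str
    residence: str  # residence as stated on the marriage certificate


def migrant_by_marriage_place(spouse: Spouse, marriage_place: str) -> bool:
    """Approach 1: migrant if the place of marriage differs from the birth place."""
    return marriage_place != spouse.birth_place


def migrant_by_residence(spouse: Spouse) -> bool:
    """Approach 2: migrant if the residence at marriage differs from the birth place."""
    return spouse.residence != spouse.birth_place


# Invented example: a groom who still lives in his birth place
# but marries in the bride's municipality of residence.
groom = Spouse(birth_place="Utrecht", residence="Utrecht")
bride = Spouse(birth_place="Zeist", residence="Zeist")
marriage_place = "Zeist"

groom_flags = (migrant_by_marriage_place(groom, marriage_place), migrant_by_residence(groom))
bride_flags = (migrant_by_marriage_place(bride, marriage_place), migrant_by_residence(bride))
```

In this invented case the groom is flagged as a migrant under the first approach but not under the second, while the bride is a non-migrant under both, which is the kind of mismatch the comparison is meant to detect.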
We assume that the second approach is inherently more reliable, as it uses the actual residence of the groom and bride. At the same time, this approach based on place of residence is not ideal either. In practice, grooms and brides who married in their place of birth might in fact be return migrants, who moved to one or more other destinations at some point in their life course but moved back to their municipality of birth upon marriage. Moreover, if the residences of the bride and groom at marriage were different, one of them probably moved (immediately) after the wedding in order to form a new household with the spouse. The second approach is thus likely to lead to an underestimation of the total share of migrants in a population.
Table 1 shows the results of both approaches. The upper crosstab displays the distribution of migrants and non-migrants based on place of marriage (vertically) and place of residence (horizontally) among brides, and the lower crosstab shows the same type of results for grooms. The two approaches yield very similar classifications, with 92.8% of the brides and 85.5% of the grooms being classified identically (see the bold numbers in Table 1). Based on the place of marriage, 33.9% of the brides are considered migrants; based on the place of residence, 30.9% are (see the numbers in italics in Table 1). For brides, using the place of marriage thus leads to a minor, almost negligible, overestimation of the migrant population by 3 percentage points. For grooms, however, we find that 42.8% were migrants based on the place of marriage, while only 34.0% were considered migrants based on the place of residence. This corresponds to a major difference of almost 9 percentage points.
As expected, using the place of marriage as an indicator of migration status therefore leads to a considerable overestimation of the percentage of migrants among males.
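The size of the overestimation follows directly from the shares reported above. As a small worked check, the following Python sketch recomputes the percentage-point gaps from those four reported percentages; the dictionary layout and function name are chosen purely for illustration.

```python
# Shares of migrants (in percent) as reported in the text for Table 1.
REPORTED_SHARES = {
    "brides": {"marriage_place": 33.9, "residence": 30.9},
    "grooms": {"marriage_place": 42.8, "residence": 34.0},
}


def overestimation_pp(shares: dict) -> float:
    """Percentage-point gap between the marriage-place and residence-based shares."""
    return round(shares["marriage_place"] - shares["residence"], 1)


gaps = {group: overestimation_pp(s) for group, s in REPORTED_SHARES.items()}
print(gaps)  # {'brides': 3.0, 'grooms': 8.8}
```

The gap for grooms (8.8 percentage points) is roughly three times the gap for brides (3.0 percentage points).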

RESULTS
The construction of large databases such as LINKS opens new opportunities for research. Simultaneously, it also creates additional challenges, as LINKS (unlike, for instance, the HSN) lacks some information from the original source, such as the residence of the bride and groom and the presence or absence of signatures on the marriage certificate. In the absence of such information, certain analyses with regard to, for example, literacy are impossible to conduct, while in other cases researchers will be forced to use proxies which might cause bias.
This small exercise showed that using the marriage place as a proxy for residence led to a significant overestimation of the migrant population among males, but not among females. We think that this gender difference results from the fact that it was more common for couples to marry in the municipality where the bride or the bride's parents resided (Ekamper et al., 2011). Remarkably, based on the place of residence, migration propensities of males and females upon marriage were very similar (34.0% and 30.9%, respectively). Earlier research studying the share of migrants among the general population in the same research period based on the place of marriage therefore concluded incorrectly that males were much more mobile than females (Störmer et al., 2017).
To avoid such biases in the future, we generally advise researchers using databases originally designed for genealogists to become familiar with their specific shortcomings such as the dependency on vital events (see also Adams, Kasakoff, & Kok, 2002). In this particular case, we recommend that users of LINKS who aim to study migration utilize the birth place of the couple's first child as a robustness check. This way of dealing with the problem excludes childless couples, but LINKS certainly contains enough individual observations to enable such sensitivity analyses. In addition, using the first child's place of birth solves the problem of brides and grooms with different places of residence moving shortly after marriage. Finally, we would like to recommend that database administrators of LINKS add missing information from the original certificates in the future.
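One possible way to implement the suggested robustness check is sketched below in Python. It assumes, hypothetically, that each spouse's birth place, the place of marriage and the birth place of the couple's first child have already been linked; the function names and the example municipalities are illustrative and are not taken from LINKS.

```python
from typing import Optional


def migrant_by_first_child(birth_place: str,
                           first_child_birth_place: Optional[str]) -> Optional[bool]:
    """Migrant if the first child was born outside the parent's own birth place.

    Returns None for childless couples, which this check cannot classify.
    """
    if first_child_birth_place is None:
        return None
    return first_child_birth_place != birth_place


def classifications_disagree(birth_place: str,
                             marriage_place: str,
                             first_child_birth_place: Optional[str]) -> Optional[bool]:
    """True if the marriage-place rule and the first-child rule give different answers."""
    by_child = migrant_by_first_child(birth_place, first_child_birth_place)
    if by_child is None:
        return None
    by_marriage = marriage_place != birth_place
    return by_marriage != by_child


# Invented example: the groom marries in the bride's municipality, but the
# couple's first child is born in the groom's own birth place.
flag = classifications_disagree("Utrecht", "Zeist", "Utrecht")
```

Cases in which the two rules disagree can then be inspected or excluded in a sensitivity analysis; childless couples return `None` and drop out of the check, as noted above.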