March 12, 2025 | News | Recommended Reading
Online Genealogies for Demographic Research - Potential Benefits and Pitfalls
In a recent study, Andrea Colasurdo of the Max Planck Institute for Demographic Research (MPIDR) and Riccardo Omenti of the University of Bologna examined the potential benefits and pitfalls of using online genealogies for demographic research. Using the FamiLinx database as an example, they investigated how the completeness and quality of demographic information in online genealogy data affects its usability.

'With our analysis, we wanted to identify and propose new measures to assess the completeness and quality of demographic variables in the FamiLinx data at both the individual and family levels for the period 1600-1900,' explains Colasurdo. For the study, the researchers chose Sweden as a test country and analysed the extent to which the age and sex distributions and mortality rates of the digital population extracted from FamiLinx differed from those of the registered population. 'We asked: Are there clusters of completeness and quality within selected kinship networks? How are age and sex distributions and demographic estimates derived from online genealogy populations affected by the completeness and quality of reported demographic information?'

The figure shows the percentage differences in age-sex proportions between the Swedish genealogical population extracted from FamiLinx and the registered Swedish population over four calendar years: 1751, 1800, 1850 and 1900. © MPIDR
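To make the comparison in the figure concrete, here is a minimal sketch of how differences in age-sex proportions between two populations could be computed. It is an illustration only, not the authors' code: the function names and the toy counts are hypothetical, and it assumes the difference is expressed in percentage points (genealogical share minus registered share), which the article does not specify. A relative difference would divide by the registered share instead.

```python
def proportions(counts):
    """Convert {(age_group, sex): count} into shares of the total population."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("population is empty")
    return {cell: n / total for cell, n in counts.items()}


def percentage_point_differences(genealogical_counts, registered_counts):
    """Genealogical share minus registered share, in percentage points, per cell."""
    gen = proportions(genealogical_counts)
    reg = proportions(registered_counts)
    cells = sorted(set(gen) | set(reg))
    return {
        cell: 100.0 * (gen.get(cell, 0.0) - reg.get(cell, 0.0))
        for cell in cells
    }


# Toy counts invented for illustration; they are NOT taken from the study.
toy_genealogical = {
    ("0-14", "F"): 10, ("0-14", "M"): 20,
    ("15+", "F"): 30, ("15+", "M"): 40,
}
toy_registered = {
    ("0-14", "F"): 25, ("0-14", "M"): 25,
    ("15+", "F"): 25, ("15+", "M"): 25,
}

diffs = percentage_point_differences(toy_genealogical, toy_registered)
for cell, diff in diffs.items():
    print(cell, round(diff, 1))
```

A negative value means the age-sex group is under-represented in the genealogical population relative to the registered one; because both sets of shares sum to one, the differences sum to zero.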
Colasurdo and Omenti conclude that missing values and the accuracy of demographic information in FamiLinx are selective. When one demographic variable is available, researchers can effectively predict the availability of other demographic information. The completeness and quality of demographic variables within an individual's kinship network are significantly higher when that individual's own demographic information is more complete and accurate. FamiLinx populations have lower mortality rates than the registered population, and their representativeness improves towards the end of the 19th century.
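The idea that the presence of one variable predicts the presence of others can be illustrated with a simple check. The sketch below is not the measure proposed in the study: the field names, the equal-weight completeness score and the toy records are all assumptions made for illustration. It computes a per-record completeness score and compares how often a death year is recorded among profiles with and without a birth year.

```python
# Hypothetical demographic fields; real genealogy profiles may differ.
FIELDS = ("birth_year", "death_year", "sex")


def completeness(record):
    """Share of the fields in FIELDS that are present (not None) in a record."""
    return sum(record.get(field) is not None for field in FIELDS) / len(FIELDS)


def availability_given(records, target, given):
    """Share of records with `target` present, split by whether `given` is present.

    Returns (share when `given` is present, share when `given` is missing);
    a share is None if the corresponding group is empty.
    """
    def share(subset):
        if not subset:
            return None
        return sum(r.get(target) is not None for r in subset) / len(subset)

    with_given = [r for r in records if r.get(given) is not None]
    without_given = [r for r in records if r.get(given) is None]
    return share(with_given), share(without_given)


# Toy records invented for illustration; they are NOT taken from FamiLinx.
toy_records = [
    {"birth_year": 1801, "death_year": 1870, "sex": "F"},
    {"birth_year": 1805, "death_year": 1860, "sex": None},
    {"birth_year": 1810, "death_year": None, "sex": "M"},
    {"birth_year": None, "death_year": None, "sex": "M"},
    {"birth_year": None, "death_year": None, "sex": None},
]

scores = [completeness(r) for r in toy_records]
with_birth, without_birth = availability_given(
    toy_records, target="death_year", given="birth_year"
)
print(scores)
print(with_birth, without_birth)
```

A large gap between the two shares would indicate that missingness is selective rather than random, which is the pattern the researchers describe.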
The study shows that online genealogies are a promising data source for demographic research, but that their usefulness in demography depends on the quality and completeness of the demographic information collected and its selectivity. 'We encourage researchers to use the FamiLinx data with caution. The data source offers many opportunities for demographic research, particularly in historical demography. However, the limitations of online genealogical data must be addressed by applying appropriate methods to correct for bias and by careful sample selection,' said Omenti.
Original Publication
Colasurdo, A.; Omenti, R.:
Demographic Research 51:41, 1299–1350 (2024)

Keywords
completeness, data quality, digital data, FamiLinx, genealogies, kinship network