News | October 9, 2019

Computational Methods and Data Sources for Migration Research in the Digital Era


The Max Planck Institute for Demographic Research (MPIDR) organizes a workshop on migration data and models, at the 11th Social Informatics Conference in Doha, Qatar, on November 18th, 2019.

Migration plays a central role in population processes and represents an increasingly important component of social, economic, health, and political change across the globe. However, despite its growing importance, migration data remain expensive, difficult to collect, and burdened by inconsistencies arising from the differing definitions used by different organizations. On the other hand, the spread of the internet and online social networks may provide unprecedented opportunities for studying global population dynamics, offering, for instance, new data sources for studying demographic processes such as migration. The goal of this workshop is therefore to facilitate a conversation about improving migration data collection and developing new modeling approaches. It brings together social scientists interested in the estimation of internal and international migration flows with data scientists and statisticians who are familiar with strategies for inferring information on migration from new forms of digital data and with modeling approaches for integrating different data sources.

SocInfo 2019 Pre-conference Workshop: New computational methods and data sources for migration research in the digital era

Doha, Hilton Doha
November 18, 2019 — 8:30 am to 12:30 pm 

The goal of this workshop is to provide a forum for discussion of (i) recent research advances in internal and international migration, (ii) the relationships between internal and international migration, and (iii) opportunities that arise from combining traditional and digital data sources. The workshop includes three sessions, devoted to internal migration, international migration, and new data sources, respectively. Each presentation should last no more than 20 minutes, and each session ends with a Q&A.


8:30 — 8:45  Introductions

8:45 — 9:45  Panel 1 Data and Models for Internal Migration

Chair: Elin Charles-Edwards

Aude Bernard

Progress in Internal Migration Research Methods

Despite rising interest in migration, there is still a lag in research and knowledge, in part because of the absence of standardized data collections and standard migration indicators. This presentation will showcase two new directions in migration research that have sought to systematize the collection of migration data and deliver measures that are intuitive, meaningful, and practically useful: (1) the IMAGE project, which relies on the use of aggregate data in an origin-destination framework, and (2) the cohort migration project, which utilizes individual-level data to measure migration over the life course and track progression from one migration to the next. This presentation will describe these two complementary approaches to migration research, including their conceptual foundation, data requirements, analytical methods, and contribution to understanding.

Mark Ellis

Some Insights and Issues Arising from Using Linked Tax, Social Security, and Survey Records to Measure U.S. Internal Migration

The U.S. Census Bureau has developed procedures to link tax records from the Internal Revenue Service (IRS) to demographic data from the Social Security Administration (SSA) and Census survey records from decennial censuses in 2000 and 2010 and annual American Community Surveys from 2005-2015. These restricted-use linked records create a population-sized longitudinal database that annually tracks income and geographic location at fine spatial scales and also includes various standard individual and family demographic measures. I will discuss how these linked data mitigate coverage and selection issues and describe issues and procedures for dealing with missing annual records. I will also present a series of insights from using these data to assess the characteristics of the immobile population (those who never migrate, measured annually between 2000 and 2015), primary migrants (those who migrate once), and secondary migrants (those who move onward or return). The structure of the IRS and SSA data creates possibilities for measuring intergenerational effects on migration. I illustrate this with some analysis of those who were 16 in 2000 and living with one or more parents, following them to 2015, when they are 31. I end with some observations on the tremendous potential of administrative data for U.S. internal migration research and how their linkage with survey records offers the possibility of improving missing or allocated responses to migration questions on census surveys. Differences between a person's tax home and where that person reports they usually live complicate this issue. It is not clear which is the more legitimate measure of location for migration measurement.
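
As a rough illustration of the three groups named in this abstract (immobile, primary migrants, secondary migrants), the following minimal Python sketch classifies one person's annual sequence of locations. It is not the speaker's procedure: the function names, the location codes, and the rule that any year-to-year change of location counts as one migration are assumptions made for illustration, and the sketch assumes a complete record with no missing years.

```python
from typing import Sequence


def count_moves(locations: Sequence[str]) -> int:
    """Count year-to-year changes of location in an annual sequence.

    Assumption: each element is a location code for one year, in order,
    and there are no missing annual records.
    """
    return sum(1 for prev, curr in zip(locations, locations[1:]) if prev != curr)


def classify_migrant(locations: Sequence[str]) -> str:
    """Assign one of the three groups described in the abstract.

    Hypothetical rule: 0 moves -> immobile, exactly 1 move -> primary,
    2 or more moves (onward or return) -> secondary.
    """
    moves = count_moves(locations)
    if moves == 0:
        return "immobile"
    if moves == 1:
        return "primary"
    return "secondary"


if __name__ == "__main__":
    # Hypothetical location codes for a few years of one person's record.
    print(classify_migrant(["A", "A", "A", "A"]))  # never moves
    print(classify_migrant(["A", "A", "B", "B"]))  # moves once
    print(classify_migrant(["A", "B", "B", "A"]))  # moves, then returns
```

In real linked administrative data, missing annual records and the gap between tax home and usual residence noted above would need to be handled before any such count is meaningful.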


9:45 — 10:45  Panel 2 Data and Models for International Migration

Chair: Victoria Prieto

Nikola Sander

Existing and New Sources of Data on International Migration

In this talk, I discuss the strengths and limitations of official migration data and introduce a new source of longitudinal data on international migration. Data on international migration are commonly sourced from national population censuses and national household or labour force surveys. A significant limitation of these data is that emigrants are typically omitted. Hence, there is a dearth of data on emigrants from the perspective of the origin country. Existing strategies for surveying emigrant populations have mainly looked at emigrants from the perspective of the leading destination countries, thereby providing insights into the socio-economic characteristics of a selected sub-group of emigrants living abroad. However, this approach yields little insight into the migration process and its consequences over the life course. The German Emigration and Remigration Panel Study (GERPS) aims to address the dearth of data on emigrants. The GERPS project is the first of its kind to use a population register as a sampling frame for collecting data on emigrants in a life course perspective and covering all destination countries across the globe. The project illustrates how to generate new sources of data on international migration that are based on official statistics rather than social media data.

Joel E. Cohen

Measuring and Modeling Migration by Flows, Stocks, and Life Histories

Data on migration commonly describe flows or stocks. Migration life histories are an insufficiently studied third kind of data on migration. A migration life history is a continuous time series of an individual's place of current residence, together with time-dependent covariates such as education, health (including disabilities and functional status), socio-economic status, demographic indicators (including marital or union status and fertility history), and cultural indicators (including legal status of current residence and ethnic, sexual, religious and political self-identification). More attention is needed for gathering, modeling, and analyzing the migration life histories of large, representative samples of individuals. Sociology and epidemiology have developed rich methods and models for analyzing longitudinal data in life histories. Demographers can exploit and develop these methods and models. I will give a concrete example of how a simple longitudinal model applied to current estimates of flows imputed from differences in stocks can generate new predictions that can be tested using data on migration life histories.


10:45 — 11:05  Break

11:05 — 12:05  Panel 3 New data sources for migration research

Chair: Emanuele Del Fava

Francesco Rampazzo

Following a Trail of Breadcrumbs: a Study of Migration through Digital Traces

Measuring international migration is challenging. The lack of timely and comprehensive data about migrants, together with the different measures and definitions used by countries, is a barrier to understanding international migration. In my Ph.D. thesis, we complement traditional data sources with social media data, taking the United Kingdom as a specific case. We use the Integrated Model of European Migration to combine data from the Labour Force Survey and the Facebook Advertising Platform to study the number of European migrants in the UK, aiming to produce estimates closer to their true stock. The model provides a framework that assesses the limitations of the datasets in terms of the definition of migrants used; bias and accuracy are also considered in order to create an appropriate prior distribution that can adjust for these data issues. The model is divided into a migration theory-based model and a measurement error model. The estimates produced by the model suggest that there are more European migrants than the official estimates indicate. We discuss the advantages and limitations of this approach and suggest how even more data sources could be combined in this framework.

Emilio Zagheni

Combining Digital Traces and Traditional Sources for Migration Research

Digital trace data, including those from social media, Web applications, and smartphones, offer new opportunities to understand migration processes when combined with traditional data sources. In addition, emerging online survey tools available to researchers improve our ability to complement and expand traditional forms of data collection. This talk (i) presents some recent advances in migration studies that resulted from integrating digital data and tools with Census and survey data; (ii) examines current challenges in the field; and (iii) proposes approaches to make further progress in this research area.

12:05 — 12:30  Closing remarks

Organizing Committee

Emanuele Del Fava


Research Scientist, Max Planck Institute for Demographic Research, Rostock, Germany

Emanuele Del Fava (Ph.D. in Statistics, UHasselt 2012; MSc in Biostatistics, UHasselt 2008; MSc in Statistics Applied to Economics, UniPisa 2007) is a Research Scientist in the Lab of Digital and Computational Demography. Previously, he was a postdoctoral fellow at the Carlo F. Dondena Centre for Research on Social Dynamics and Public Policy, Bocconi University, Milan, Italy. He is a quantitative researcher with a highly multidisciplinary research perspective (biostatistics, epidemiology, demography, health economics) and an interest in computational methods and data. His expertise is mainly in statistical modeling of infectious disease data and international migration data. His current research relates to the development of computational models of migration flows in Europe.

Emilio Zagheni


Director, Max Planck Institute for Demographic Research, Rostock, Germany

Emilio Zagheni (Ph.D. in Demography, UC Berkeley 2010; MA in Statistics, UC Berkeley 2008) is Director of the Max Planck Institute for Demographic Research in Rostock, Germany and Affiliate Associate Professor of Sociology at the University of Washington, Seattle. Zagheni is a demographer who uses mathematical, statistical and computationally-intensive approaches to study the causes and consequences of population dynamics. Motivated by the ambition to improve people's lives through the scientific study of our societies, he is consolidating a portfolio that leverages interdisciplinary approaches to monitor demographic change, to explain population processes, and to predict future demographic outcomes. He is best known for his pioneering work on using Web and social media data for studying migration processes. In 2016, he received the Trailblazer Award from the European Association for Population Studies for his pivotal role in developing the field of Digital and Computational Demography. Emilio Zagheni has published in top journals in Demography (e.g., Demography, Population and Development Review, Population Research and Policy Review) and Statistics (e.g., Journal of the American Statistical Association, Biostatistics) as well as in ACM (Association for Computing Machinery) conference proceedings (e.g., WebSci, WWW, WSDM). He co-chairs the IUSSP (International Union for the Scientific Study of Population) Panel on Digital Demography.

Confirmed Speakers

© Aude Bernard, University of Queensland

Aude Bernard is a DECRA Fellow at the Queensland Centre for Population Research at the University of Queensland. Her research focuses on understanding migration processes and their consequences for individuals, regions, and nations, particularly from a demographic perspective. Most of her work is comparative and has a strong methodological focus to allow robust comparisons over time and between countries, including Australia. Her recent methodological contributions include the development of innovative cohort measures of migration and their applications to 16 OECD countries and China. As an adjunct research fellow at the Asian Demographic Research Institute at Shanghai University, Aude works closely with migration scholars in Asia to advance understanding of migration behavior in the region. She is currently co-editing a book on internal migration that brings together scholars from over 20 Asian countries. Aude was recently awarded the 2018 IPUMS International Award for her global assessment of the educational selectivity of migrants. She is the co-editor of the Journal of Population Research.

© Mark Ellis, University of Washington

Mark Ellis is a Professor of Geography at the University of Washington (UW). He received a John Simon Guggenheim fellowship in 2005 and was Director of UW’s Center for Studies in Demography and Ecology (CSDE) from 2008-9 and 2010-15.  He is currently Executive Director of UW's Northwest Federal Statistical Research Data Center, which provides access to restricted US survey and administrative demographic, health and economic data. Ellis’s research projects span population, economic and urban geography and have been funded by the Social Science Research Council, the Russell Sage Foundation, and the National Science Foundation. Most recently, his work has included investigations of a) the effect of labor markets and local immigration policies on immigrants’ internal migration; b) geographical mismatches between Science, Technology, Engineering and Mathematics (STEM) labor supply and demand and the internal migration of STEM trained workers; c) the dynamics of neighborhood change in US cities in response to immigration and racial diversity; and d) measuring trends in US internal migration using administrative data sources.

© Nikola Sander, Federal Institute for Population Research

Nikola Sander is a quantitative population geographer interested in the patterns and processes of human behavior in space and time. Her research focuses on quantifying, understanding, and visualizing migration and mobility. She grew up in Germany, received a Ph.D. in Australia, completed a post-doc in Vienna, held a position as an assistant professor at the Department of Demography at the University of Groningen, and is currently Research Director for Migration & Mobility at Germany's Federal Institute for Population Research.


© Joel E. Cohen, Ph. Mario Morgado

Joel E. Cohen is the Abby Rockefeller Mauzé Professor of Populations and head of the Laboratory of Populations at the Rockefeller University and Columbia University, New York. At Columbia University, he holds appointments in the Earth Institute and the Departments of International and Public Affairs; Earth and Environmental Sciences; and Statistics. He is a Visiting Scholar in the Department of Statistics of the University of Chicago. He is a member of the Scientific Advisory Board of the Vienna Institute of Demography (Institut für Demographie), Austria, 2017-2021, and the International Advisory Committee, School of Geographic Sciences, East China Normal University, Shanghai, 2018-2022. He was a visiting professor of the University of Tokyo, Japan, 2014, 2017, and 2019. Cohen studies demography, ecology, epidemiology, and the social organization of human and non-human populations and mathematical concepts useful in these fields. He earned doctorates in applied mathematics in 1970 and population sciences and tropical public health in 1973 from Harvard University. He has published 14 books and more than 435 papers and chapters. Cohen is a member of the American Academy of Arts and Sciences and the U.S. National Academy of Sciences.

© Francesco Rampazzo, University of Southampton

Francesco Rampazzo is a Ph.D. candidate at the University of Southampton, UK, and a doctoral fellow at the Max Planck Institute for Demographic Research. He is a demographer whose focus is on the potential use of digital trace data for population studies. He is interested in methodologies for combining traditional data sources with digital traces, and in non-probabilistic sampling methodologies.