IDEM 187
Topics in Digital and Computational Demography
Course Coordinator: Risto Conte Keivabu
Instructors: Tom Theile, Emilio Zagheni, Carolina Coimbra Vieira, Ebru Sanlitürk, Risto Conte Keivabu, Jordan Klein, Boris Barron, Benjamin-Samuel Schlüter, Irena Chen
Start date: 3 November 2025
End date: 7 November 2025
Location: Hybrid: in-person for students in the PHDS network and/or already in Rostock; Online (via Zoom) for everyone else.
Course Description
Rapid increases in computational power and the explosion of Internet, social media and mobile phone use have radically changed our lives, how we interact, and our behavior, including demographic choices and constraints. The digitalization of our lives has also led to the so-called “data revolution” that is transforming the social sciences.
Data science tools allow social scientists to address core demographic questions in new ways. At the same time, demographic and social science methods enable researchers to make sense of new and complex data sources for which novel approaches and research designs may be needed.
The main goals for this course are:
- To introduce students to core demographic and social science methods that are essential to interpret digital trace data.
- To introduce students to core data science methods that are key to advancing our understanding of population processes, given the increasing heterogeneity of data sources useful for demographic research.
- To introduce students to recent substantive advances in the field of Digital and Computational Demography, with emphasis on fostering critical thinking about modern demographic analysis and (big) data-driven discovery.
- To help students identify research questions in their own area of substantive interest that could be addressed with innovative data sources, and support them in the process of devising an appropriate research plan.
Organization
The course will be offered in a hybrid format: in-person for students in the PHDS network who are already in Rostock; online (via Zoom) for everyone else. Each day, there will be one lecture and one discussion session. The lecture will be pre-recorded and made available ahead of time.
Students are expected to watch the lecture carefully at their own pace and to complete the assignments before the discussion session, which will be held live every day from 14:00-17:00 CET (Central European Time). During the discussion session, homework assignments and/or hands-on computing exercises will be reviewed, assigned readings will be discussed and questions about the lecture will be addressed. Active participation of students is expected.
Each day, the lecture and discussion session will be presented by an experienced scholar in the field, who will focus on a relevant research topic in which they are an expert.
Students should generally expect to spend about 8 hours per day on the course (lectures, discussion sessions, readings, assignments).
Schedule
Day 1 (Nov. 3rd)
Instructors: Emilio Zagheni & Tom Theile
Topics: Introduction to Digital and Computational Demography; Approaches for combining representative data and non-probabilistic samples; Identifying sources of bias in digital trace data and adjusting for them. In the practice session, we will scrape websites and then access OpenAI web APIs, both with R.
Day 2 (Nov. 4th)
Instructors: Carolina Coimbra Vieira & Ebru Sanlitürk
Topics: Digital trace data for migration research: Introduction to migration theories and ethics of digital data use; Fundamentals of data collection and analysis of digital trace data; Advantages and critical challenges of using different types of digital trace data, such as Facebook, Instagram, Twitter, LinkedIn, Google Trends, Wikipedia, and bibliometric data.
Day 3 (Nov. 5th)
Instructor: Risto Conte Keivabu
Topics: Introduction to geospatial and environmental data; Working with geospatial data in R; Advantages and pitfalls of available open data on the environment; Working with open geospatial and environmental data; Handling of environmental data for demographic research; Introducing a geospatial component to migration/mobility data.
Day 4 (Nov. 6th)
Instructors: Jordan Klein & Boris Barron
Topics: Simulations in the social sciences; Formulating assumptions into empirical tests; Loss functions for performance evaluation; Ordinary differential equations in modeling contexts; Empirical vs mechanistic models; Verifying and calibrating models.
Day 5 (Nov. 7th)
Instructors: Irena Chen & Benjamin-Samuel Schlüter
Topics: Bayesian approaches with applications to demography; Introduction to Bayes (comparison to frequentist statistics), including Bayes' rule; Implementation of MCMC algorithms (HMC, Gibbs); Interpretation of results (credible intervals, posterior distributions) and model diagnostics; Mortality models; Methods for estimation issues in demography (missing data, small area estimation, multiple data sources).
Diversity of Student Backgrounds
Students in this course have different backgrounds. Some students may have strong computational and statistical skills, others may not. Some may be very familiar with demographic methods, others may have only basic knowledge of population processes. The instructors will emphasize substance and key statistical, mathematical, computational and demographic concepts to accommodate the range of backgrounds. There will also be different types of homework assignments. Some will involve computing and coding, others may involve critical reflections about the readings. In short, we will facilitate the participation of students who do not have an extensive background in statistics or computational methods but are eager to learn.
Course Prerequisites
Students should be familiar with programming with R/RStudio, Python (Anaconda), or an equivalent programming environment. Homework assignments that require programming can be completed using the programming environment of your choice. Solutions to the assignments will be discussed using R/RStudio or Python (Anaconda).
Instructions on how to download and install R can be found in “A (very) short introduction to R” by Torfs and Brauer (2014):
cran.r-project.org/doc/contrib/Torfs+Brauer-Short-R-Intro.pdf
A concise free Python starter course is available on Kaggle: www.kaggle.com/learn/python.
To install Python at Max Planck, use Miniforge (conda-forge.org/download) or PyCharm (www.jetbrains.com/pycharm).
Anaconda (www.anaconda.com/download) was previously a common way to install Python, but after Anaconda’s licensing changes for scientific institutions, direct use of Anaconda is not supported at Max Planck Institutes and access may be blocked on MPG networks. This restriction applies to Max Planck; colleagues at other educational institutions may still be able to use Anaconda if their institution qualifies under Anaconda’s current academic terms or holds an appropriate license. Please check with your local IT/licensing office.
Examination
Students will receive a pass/fail grade based on a multiple-choice final quiz and active participation in class. Students who pass will receive a certificate of completion.
Tuition
There is no tuition fee for this course.
Recruitment of students external to the IMPRS-PHDS network
Applicants should either be enrolled in a PhD program or have received their PhD. Applications from advanced master’s students will also be considered.
How to apply
- Applications must be submitted online via survey.demogr.mpg.de/index.php/235691?lang=en
- You will need to attach the following items, combined into a single PDF file:
- (1) Curriculum vitae, including a list of your scholarly publications.
- (2) A one-page statement of your research and how it relates to the course. Please include a short description of your knowledge of R, Python, and/or similar programming languages.
- The application deadline is 28 September 2025.
- Applicants will be informed of their acceptance by 8 October 2025.
- Applications submitted after the deadline will be considered only if logistically feasible.