IDEM 107

Smoothing Demographic Data: Flexible Models in Population Studies

Carlo Giovanni Camarda

Start: 25 May 2020
End: 29 May 2020

Location: Online course. Link tba.

Instructor:

Carlo Giovanni Camarda

Course description

This course provides an applied introduction to modern and flexible statistical techniques for modeling demographic data. Traditional demographic methods tend to either apply a large number of parameters or impose strong parametric assumptions. In this course you will learn to master flexible models to extract the most from your data with the fewest assumptions.

Smoothing the relationship between two variables (e.g. life expectancy and GDP per capita) is the simplest example in which no prior knowledge of the relationship is assumed. However, more complex examples are frequent in demography. They include the pattern of mortality at different ages and/or time points by sex and cause of death; the fertility pattern across ages, cohorts and parity; spatial patterns of demographic phenomena; and non-linear effects of age or income on specific health outcomes. Moreover, several population patterns are intrinsically continuous. It thus seems natural to model them with smooth functions, which can in practice be treated as continuous curves in, for instance, decomposition and rate-of-change calculations.

The course will start with an overview of generalized linear models (log-linear and logistic models). P-splines will then be presented as the most suitable and clear-cut smoothing approach for demographic data. This class of models can be easily generalized to more complex data structures (multi-dimensional and spatial data) and adapted to meet specific needs (forecasting and specialized smoothing).

We will focus on the few theoretical concepts that underpin the more detailed literature, and handouts for reproducing the results presented in class will be provided.

The handouts will emphasize the use of modern software such as R to apply the approaches presented to relevant demographic datasets. By the end of the course, smoothing won’t be seen as a mere black box, but as a modern statistical tool for exploring and modeling population data as well as possible.

Organization

Each of the five course days will consist of two one-hour lectures:

First one-hour lecture from 9:30-10:30 CET (Central European Time)
Second one-hour lecture from 14:30-15:30 CET

Detailed schedule

Monday, May 25: Introduction to Generalized Linear Models

9:30-10:30 CET

Introduction
Reminder on linear models
Generalized Linear Models

14:30-15:30 CET

Poisson GLMs
Including exposures

Tuesday, May 26: Discrete Smoothing

9:30-10:30 CET

Discrete Smoothing
Generalized Linear Smoothing

14:30-15:30 CET

Optimal amount of smoothing
Histogram smoothing
Incorporating exposures

Wednesday, May 27: P-splines: an introduction in demography

9:30-10:30 CET

Non-linear relationships
B-splines

14:30-15:30 CET

P-splines for Gaussian data
P-splines for Poisson data
Including exposures

Thursday, May 28: Extending P-splines

9:30-10:30 CET

Extrapolating with P-splines
P-splines with more covariates

14:30-15:30 CET

Smoothing spatial data
P-GAM for smoothing demographic data

Friday, May 29: More about P-splines

9:30-10:30 CET

Tensor P-splines
Extrapolation in two dimensions

14:30-15:30 CET

Shape constraints
Calculus with smooth data
Closure

Course prerequisites

The course is aimed at non-statisticians and introduces all concepts from the basics. However, elementary knowledge of demographic analysis (e.g. construction of a life table) and statistics (e.g. regression) is required. Familiarity with basic concepts in matrix algebra (transposing and inverting a matrix) is helpful but not essential. Participants are expected to have a working knowledge of R because the handouts require its use. They are also expected to re-read the slides and work through the handouts with R and an associated editor (e.g. RStudio) prior to each class.

Examination

Students will be evaluated on the basis of class participation.

General readings

A reading list will be provided, along with the lecture slides and handouts for reproducing all examples.
