IDEM 181

Data visualization – the art/skill cocktail

Start:  13 July 2020 
End:     17 July 2020

Instructor:

  • Ilya Kashnitsky, University of Southern Denmark

Location: Online course. Link to be announced.

Course description

When preparing academic papers, researchers too often treat producing high-quality plots as a secondary, less important task. To some extent this is driven by widespread software legacy issues and by the mostly outdated limitations imposed by traditional scientific publishers. Yet modern tools put data visualization at the center of research workflows when it comes to conveying results. Hence, the ability to turn a large dataset into an insightful visualization is an increasingly valuable skill in academia.

The course aims to give participants the flexibility that the R+tidyverse framework offers for visualizing data (the practical examples mostly use demographic data). The course covers some aspects of data visualization theory, with examples of best and worst practice, but it is primarily practice-oriented, combining live coding sessions with short lecture and showcase parts.

Practical coding sessions start with a basic introduction to tidy data manipulation and ggplot2. Next, practical examples cover the creation of some of the most useful types of plots. Important data visualization choices and caveats are discussed along the way. Special attention is devoted to producing geographical maps, which are no longer the luxury of professional cartographers but have turned, with the help of R, into yet another data visualization type. Going beyond ggplot2, the course also introduces interactive data visualization.

Organization

Each of the five course days will consist of one two-hour lecture, 14:00-16:00 CET (Central European Time), followed by a one-hour discussion session (16:00-17:00 CET), which is optional for participants.

Monday, July 13: BASICS

  • Basic dataviz principles
  • Impressive dataviz showcases
  • Tidy approach to data
  • {ggplot2} basics

Tuesday, July 14: TUNE-UP

  • More advanced {ggplot2}
  • Colors in dataviz
  • Themes and fonts
  • Population pyramids and animation

Wednesday, July 15: TOOLBOX

  • Useful types of dataviz
  • Dotplots – the most neglected and powerful type of dataviz
  • Heatmaps, equality-line, ggridges, treemap
  • Ternary plots and ternary colorcoding

Thursday, July 16: MAPS I

  • The basics of map projections
  • {sf} – the game changer in #rspatial, `geom_sf`
  • Useful spatial processing tricks

Friday, July 17: MAPS II

  • Mapping Europe with {eurostat}
  • Mapping the US with {tidycensus}
  • Mapping challenge

Course prerequisites

Participants should have basic experience in using R. Those starting from scratch are advised to take an online introductory course first (the swirl R package, https://swirlstats.com/, is one nice option). Participants need a laptop or desktop computer with the latest versions of R and RStudio installed. More information on the R packages to install will be sent before the course starts.

Evaluation

Small exercises during the labs and a final data visualization challenge.

General readings

I suggest two recent books that are both freely available online:

  • Fundamentals of Data Visualization by Claus Wilke, https://serialmentor.com/dataviz
  • Data Visualization: A Practical Introduction by Kieran Healy, https://socviz.co

The Max Planck Institute for Demographic Research (MPIDR) in Rostock is one of the leading demographic research centers in the world. It's part of the Max Planck Society, the internationally renowned German research society.