IDEM 181
Data visualization – the art/skill cocktail
Start: 13 July 2020
End: 17 July 2020
Instructor:
- Ilya Kashnitsky, University of Southern Denmark
Location: Online course. Link TBA.
Course description
When preparing academic papers, researchers too often treat producing high-quality plots as a secondary, less important task. To some extent this is driven by widespread software legacy issues and by mostly outdated limitations imposed by traditional scientific publishers. Yet modern tools place data visualization at the center of research workflows when it comes to conveying results. Hence, the ability to turn a large dataset into an insightful visualization is an increasingly valuable skill in academia.
The course aims to give participants the flexibility that the R + tidyverse framework offers for visualizing data (the practical examples mostly use demographic data). The course covers some aspects of data visualization theory and examples of best and worst practice, but it is also practice-oriented, with live coding sessions alongside short lecture and showcase parts.
Practical coding sessions start with a basic introduction to tidy data manipulation and ggplot2. Next, practical examples cover the creation of some of the most useful plot types. Important data visualization choices and caveats are discussed along the way. Special attention is devoted to producing geographical maps, which are no longer the luxury of professional cartographers but have become, with the help of R, just another type of data visualization. Going beyond ggplot2, the course also gives an introduction to interactive data visualization.
Organization
Each of the five course days consists of one two-hour lecture, 14:00-16:00 CET (Central European Time), followed by a one-hour discussion session (16:00-17:00 CET), which is optional for participants.
Monday, July 13: BASICS
- Basic dataviz principles
- Impressive dataviz showcases
- Tidy approach to data
- {ggplot2} basics
Tuesday, July 14: TUNE-UP
- More advanced {ggplot2}
- Colors in dataviz
- Themes and fonts
- Population pyramids and animation
Wednesday, July 15: TOOLBOX
- Useful types of dataviz
- Dotplots – the most neglected and powerful type of dataviz
- Heatmaps, equality-line, ggridges, treemap
- Ternary plots and ternary colorcoding
Thursday, July 16: MAPS I
- The basics of map projections
- {sf} – the game changer in #rspatial, `geom_sf`
- Useful spatial processing tricks
Friday, July 17: MAPS II
- Mapping Europe with {eurostat}
- Mapping the US with {tidycensus}
- Mapping challenge
Course prerequisites
Participants should have basic experience in using R. Those starting from scratch are advised to take an online introductory course first (the swirl R package, https://swirlstats.com/, is one good option). Participants need a laptop or desktop computer with the latest versions of R and RStudio installed. More information on which R packages to install will be sent before the course starts.
Evaluation
Small exercises during the labs and a final data visualization challenge.
General readings
I suggest two recent books that are both freely available online:
- Fundamentals of Data Visualization, by Claus Wilke: https://serialmentor.com/dataviz
- Data Visualization: A Practical Introduction, by Kieran Healy: https://socviz.co