MPIDR Technical Report

compareFinRaw.r – an R program to measure the difference between datasets

Walke, R., Müller, A.
MPIDR Technical Report TR-2012-003, 7 pages.
Rostock, Max-Planck-Institut für demografische Forschung (July 2012)
"tr-2012-003-files.zip" contains the R script, a Sweave script, the related control file, and a folder holding all example output files ("Output_ready").

Abstract

In any data-driven research, it is essential to know how the data in use may differ. Differences can arise between modification stages of a single dataset or between distinct datasets. In either case, the researcher has to be aware of them, because differing data properties can affect the conclusions drawn. This report describes an adaptable solution to this problem, implemented in the statistical software R [R 2011]. The program compareFinRaw.r is an automated tool that measures the differences between two datasets by computing distances, on two levels, for all relevant pairs of variables (columns). Two excerpts of the R-internal dataset Seatbelts [R 2011, Harvey1986] serve as an illustrative data example.
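The idea of a column-wise distance between two versions of a dataset can be illustrated with a minimal R sketch. It is not taken from compareFinRaw.r: the function name `column_distance`, the simulated modification, and the choice of mean absolute difference as the distance are assumptions made for illustration only, and the sketch does not reproduce the two levels of comparison mentioned in the abstract.

```r
# Two versions of an excerpt of the built-in Seatbelts time series.
raw <- as.data.frame(Seatbelts)[1:24, c("drivers", "front", "rear")]
fin <- raw
fin$front[1:3] <- fin$front[1:3] + 10  # simulate a modification stage

# Hypothetical distance: mean absolute difference per shared column.
# This is an illustrative choice, not the measure used by compareFinRaw.r.
column_distance <- function(a, b) {
  stopifnot(nrow(a) == nrow(b))
  shared <- intersect(names(a), names(b))
  vapply(shared, function(v) mean(abs(a[[v]] - b[[v]])), numeric(1))
}

distances <- column_distance(raw, fin)
print(distances)
```

Columns that are identical in both versions yield a distance of zero, so nonzero entries point directly to the variables that changed between the two stages.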
Keywords: data analysis, data comparability, data evaluation, data processing, software
The Max Planck Institute for Demographic Research (MPIDR) in Rostock is one of the leading international centers for population science. It is part of the Max Planck Society, one of the world's most renowned research organizations.