November 11, 2020 | Press Release
Using Bibliometric Records to Analyze Internal Migration of Researchers
Using data from scientific publications, an MPIDR team tracked the mobility of researchers in Mexico. © iStockphoto.com/Jose Girarte
A team at the MPIDR investigated the migration of researchers within Mexico using millions of bibliometric records. Their framework can now be used to analyze scientific mobility in other countries as well.
“We actually know quite a lot about international migration of researchers. But little has been done to understand the mobility of scientists within a country,” says Emilio Zagheni. Seeing this gap in research, the Director at the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany, formed a team to analyze the migration of researchers in the under-studied case of Mexico. Their paper was recently published in EPJ Data Science.
“The size and the impact of research in Mexico is perhaps not as well-known as that of other countries, but the scientific community in Mexico is large and productive,” says Andrea Miranda-González, a PhD student in Demography at UC Berkeley, who spent the summer of 2019 at the MPIDR to take part in this project. According to Scopus data from the past decade, Mexico has had more than 200,000 published scientists, who together produced over 217,000 publications. These figures are comparable to the number of researchers in Switzerland and the number of publications coming from Singapore. The scientific publications from Mexico have received an average of 9.4 citations each, which is on par with countries like China, Brazil, and Poland.
The literature on the mobility of scientists in Mexico is thin. The MPIDR team therefore combined demographic and network science techniques to explore internal scholarly migration within Mexico. They found that over the past decade, migration patterns of researchers in Mexico appear to be heterogeneous in size and direction across regions. While many researchers remain in their regions, there seems to be a preference for the capital, Mexico City, and the surrounding states as migration destinations.
The challenge lies in cleaning and refining bibliometric data
To carry out the analysis, the MPIDR team used millions of bibliometric records from the Scopus database, covering the period from 1996 to 2018, to track the movements of over 252,000 researchers in Mexico. “Bibliometric data is mainly what you see on the first page of a scientific publication, such as the title, author names, and affiliation addresses,” says software developer Tom Theile. “To reconstruct scientific mobility, we track changes in affiliation addresses over time. It is challenging to connect the correct authors of these many publications.” Some individuals may share the same name, and there could be many publications authored under that name. As a result, there is a huge number of possibilities regarding who has authored which paper.
“Such large datasets are not fully disambiguated. Using our disambiguation algorithm, we get much closer to the ideally disambiguated author names which are essential for estimating scholarly migration,” continues Tom Theile.
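The idea of tracking changes in affiliation addresses over time can be illustrated with a minimal Python sketch. It is not the team's code: the record layout (author_id, year, state), the function name extract_moves, and the sample records are hypothetical, and the author IDs are assumed to be already disambiguated.

```python
from collections import defaultdict

def extract_moves(records):
    """Infer moves from (author_id, year, state) affiliation records.

    Assumes author_id is already disambiguated (hypothetical input format).
    A move is recorded whenever two consecutive affiliations of the same
    author, ordered by year, name different states.
    """
    by_author = defaultdict(list)
    for author_id, year, state in records:
        by_author[author_id].append((year, state))

    moves = []
    for author_id, history in by_author.items():
        history.sort()  # chronological order (ties broken by state name)
        for (_, origin), (year, destination) in zip(history, history[1:]):
            if origin != destination:
                moves.append((author_id, origin, destination, year))
    return moves

# Invented sample records, for illustration only.
records = [
    ("a1", 2010, "Jalisco"),
    ("a1", 2012, "Jalisco"),
    ("a1", 2015, "Ciudad de México"),
    ("a2", 2011, "Yucatán"),
    ("a2", 2013, "Yucatán"),
    ("a3", 2009, "Nuevo León"),
    ("a3", 2010, "Puebla"),
    ("a3", 2014, "Nuevo León"),
]
moves = extract_moves(records)
for move in moves:
    print(move)
```

In this toy example, author a2 never changes state and so contributes no move, while a3 leaves Nuevo León and later returns. A real pipeline would also need to handle authors with several affiliations in the same year, which this sketch ignores.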
In a second step, the MPIDR team developed new methods for repurposing these data to study internal migration. “We developed a framework to refine bibliometric data and obtain a suitable sub-national level of data aggregation using a neural network algorithm,” says research scientist Samin Aref. Using network analysis, they found that the internal migration network has become denser and more diverse over time, featuring a dynamic core-periphery structure. Classic migration measures also showed that the redistribution of researchers in Mexico has decreased in the past 20 years.
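To make the two kinds of measurement more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the team's implementation: the release does not name the specific measures, so the sketch assumes the density of a directed network and the migration effectiveness index, a standard demographic measure of how much redistribution a set of flows produces. The flows dictionary and the region labels A, B, and C are invented.

```python
from collections import defaultdict

def network_density(flows):
    """Share of possible directed origin-destination links that carry a flow.

    flows: dict mapping (origin, destination) -> number of movers.
    """
    nodes = {region for pair in flows for region in pair}
    edges = {(o, d) for (o, d), count in flows.items() if count > 0 and o != d}
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0

def migration_effectiveness_index(flows):
    """100 * sum(|inflow - outflow|) / sum(inflow + outflow) over all regions.

    0 means flows cancel out (no net redistribution); 100 means every
    region only gains or only loses movers.
    """
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for (origin, destination), count in flows.items():
        if origin == destination:
            continue  # staying in the same region is not migration
        outflow[origin] += count
        inflow[destination] += count
    regions = set(inflow) | set(outflow)
    net = sum(abs(inflow[r] - outflow[r]) for r in regions)
    gross = sum(inflow[r] + outflow[r] for r in regions)
    return 100.0 * net / gross if gross else 0.0

# Invented flows between three hypothetical regions.
flows = {("A", "B"): 30, ("B", "A"): 10, ("A", "C"): 20}
density = network_density(flows)
mei = migration_effectiveness_index(flows)
print(f"density = {density:.2f}, effectiveness index = {mei:.1f}")
```

Computed period by period, a rising density would correspond to a denser network, and a falling effectiveness index would correspond to less redistribution of researchers.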
“The methods and data we combined open up new opportunities to understand the migration of researchers within country boundaries,” says Emilio Zagheni.
Miranda-González, A., Aref, S., Theile, T., Zagheni, E.: Scholarly migration within Mexico: analyzing internal migration among researchers using Scopus longitudinal bibliometric data. EPJ Data Science (2020). DOI: https://doi.org/10.1140/epjds/s13688-020-00252-9