How to Map ANY Region of the World using R programming
Would you like to learn how to create beautiful maps of any region of the world using the R programming language?
In this article we will see how to:
- create a choropleth world map using World Bank data.
- zoom in on any specific region of the world using OpenStreetMap.
- improve the scale and color palette of your maps.
As an example we will visualize Europe. But you can reuse the R code shared in this article on any region of your choice.
Download the R code of this tutorial by joining my newsletter on www.felixanalytix.com. You will receive an automatic email from me to access the R script.
How to Download World data using R
The first thing you want to do is to install the necessary R packages for this project.
We will load four packages: {tidyverse}, a collection of R packages for data wrangling and data visualization; {rnaturalearth}, to download raw data from the Natural Earth project; {sf}, to work with simple features in R; and {wbstats}, to query the World Bank API.
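As a minimal sketch, the setup could look like the block below. The install line is commented out so it only runs when you need it. Note that {rnaturalearthdata} is an addition of mine and is not mentioned above: {rnaturalearth} relies on it for the medium-scale country data.

```r
# Install the packages once (uncomment on first use).
# {rnaturalearthdata} is an assumption: it holds the medium-scale polygons.
# install.packages(c("tidyverse", "rnaturalearth", "rnaturalearthdata",
#                    "sf", "wbstats"))

library(tidyverse)      # dplyr for wrangling, ggplot2 for plotting
library(rnaturalearth)  # country polygons from Natural Earth
library(sf)             # simple features (spatial data frames)
library(wbstats)        # client for the World Bank API
```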
We will now download world data using the {rnaturalearth} package.
We will call the ne_countries() function with scale as “medium” (we don’t need detailed geographic data) and the returnclass will be “sf” (we want an sf object to work with the {sf} package). Note that “sf” just stands for “simple features”.
Using the filter() function from the dplyr R package, we will directly remove “Antarctica” from our world data object.
Now I also want to change the world map projection. This is easily done with the st_transform() function, also from the {sf} package. Here I decided to use the Mollweide projection.
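The three steps above (download, filter, reproject) can be sketched as follows. The object names world and world_moll are my own choice, and I am assuming the country name is stored in the admin column of the Natural Earth data.

```r
library(dplyr)
library(rnaturalearth)
library(sf)

# Download medium-scale country polygons as an sf object,
# then drop Antarctica (assumed to be labelled in the `admin` column).
world <- ne_countries(scale = "medium", returnclass = "sf") %>%
  filter(admin != "Antarctica")

# Reproject to Mollweide using a PROJ string.
world_moll <- st_transform(world, crs = "+proj=moll")

# Quick check of the new coordinate reference system.
st_crs(world_moll)$proj4string
```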
Get World Bank data
We will now download an indicator from the World Bank using the World Bank API. I chose the indicator named “unemployment (% of the labor force)”.
You can browse all the indicators available in the World Bank API through wb_cachelist, the cached list of metadata that ships with {wbstats}. We will again use the filter() function from dplyr to keep only the indicator we are interested in.
To download the indicator dataset, you can just use the wb_data() function from the {wbstats} package. Here I chose to get data for the year 2020 only. Let’s have a look at the dataset with the glimpse() function from {dplyr}.
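Here is a hedged sketch of the lookup and the download. The indicator code SL.UEM.TOTL.ZS (total unemployment as a share of the labor force) is my assumption of which series is meant, and the column name unemployment is just a convenient label. The download needs an internet connection.

```r
library(dplyr)
library(wbstats)

# Assumed indicator code: unemployment, total (% of total labor force).
ind <- "SL.UEM.TOTL.ZS"

# Look the indicator up in the metadata cached inside {wbstats}.
indicator_info <- wb_cachelist$indicators %>%
  filter(indicator_id == ind)
indicator_info$indicator

# Download the values for 2020 only; naming the vector element
# renames the value column to `unemployment`.
df <- wb_data(c(unemployment = ind), start_date = 2020, end_date = 2020)

glimpse(df)
```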
Visualize data distribution
As a quick exploratory data analysis, we will look at the distribution of our indicator. Here I will just plot a histogram using {ggplot2}.
Our data is strongly concentrated around 5–6% unemployment. This is a bit annoying because differences between these percentages will be hard to see in the color palette. We will therefore apply a square root transformation to the percentages so that the color palette is spread more evenly, as the histogram of the transformed values shows.
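The two histograms could be drawn roughly like this. The block downloads the data again so that it runs on its own; the indicator code, the column name unemployment and the number of bins are assumptions of mine.

```r
library(ggplot2)
library(wbstats)

# Assumed indicator code, renamed to `unemployment` for readability.
df <- wb_data(c(unemployment = "SL.UEM.TOTL.ZS"),
              start_date = 2020, end_date = 2020)

# Histogram of the raw percentages.
p_raw <- ggplot(df, aes(x = unemployment)) +
  geom_histogram(bins = 30, na.rm = TRUE) +
  labs(x = "Unemployment (%)", y = "Number of countries")

# Histogram after a square root transformation of the same values.
p_sqrt <- ggplot(df, aes(x = sqrt(unemployment))) +
  geom_histogram(bins = 30, na.rm = TRUE) +
  labs(x = "Square root of unemployment (%)", y = "Number of countries")

print(p_raw)
print(p_sqrt)
```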
Plot a World Map using R
The first thing we want to do is to join our World Bank data to our geographic “world” sf object. To do that we can simply use the left_join() function and join by “iso3c” code.
Using the {ggplot2} R package, we will create a world map with the function geom_sf(). We will also apply the square root transformation to see the percentage differences more clearly. With the scale_fill_viridis_c() function, we can specify that we want “sqrt” using the “trans” argument. We will also adjust the “breaks” argument accordingly.
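A sketch of the join and the world map follows, rebuilt from scratch so the block is self-contained. Several details are assumptions: the indicator code, the iso_a3 column used as join key on the Natural Earth side, and the break values, which are purely illustrative. In some Natural Earth releases a few countries carry a placeholder code in iso_a3, so they may appear as missing on the map.

```r
library(dplyr)
library(ggplot2)
library(rnaturalearth)
library(sf)
library(wbstats)

# World polygons without Antarctica, in the Mollweide projection.
world_moll <- ne_countries(scale = "medium", returnclass = "sf") %>%
  filter(admin != "Antarctica") %>%
  st_transform(crs = "+proj=moll")

# Assumed indicator code, renamed to `unemployment`.
df <- wb_data(c(unemployment = "SL.UEM.TOTL.ZS"),
              start_date = 2020, end_date = 2020)

# Join on the ISO 3-letter code (`iso_a3` is an assumed column name).
world_data <- world_moll %>%
  left_join(select(df, iso3c, unemployment), by = c("iso_a3" = "iso3c"))

# Choropleth with a square-root colour scale.
# Newer ggplot2 releases (3.5.0 and later) call `trans` `transform`.
p_world <- ggplot(world_data) +
  geom_sf(aes(fill = unemployment), color = "white") +
  scale_fill_viridis_c(
    trans  = "sqrt",
    breaks = c(1, 5, 10, 20, 30),  # illustrative breaks, in percent
    name   = "Unemployment (%)"
  ) +
  theme_void()

print(p_world)
```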
Creating a Bounding Box
To zoom in on a specific area you need to know its coordinates, i.e. its bounding box. The OpenStreetMap website has a nice tool to get the coordinates of a specific bounding box.
The numbers shown at the top right of the screen are latitudes and longitudes, and they can be copied into the code. Don’t forget to transform these coordinates to the Mollweide projection as well, using st_transform().
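One possible way to build the bounding box is to store its two opposite corners as points and reproject them. The corner coordinates below are hypothetical values that roughly frame Europe; replace them with the ones you read from OpenStreetMap.

```r
library(sf)

# Two opposite corners of the window, as (longitude, latitude) in WGS84.
# These values are hypothetical and only roughly frame Europe.
window_coord <- st_sfc(
  st_point(c(-20, 30)),  # bottom-left corner
  st_point(c(45, 73)),   # top-right corner
  crs = 4326
)

# Reproject the corners to Mollweide and extract them as a matrix
# with one row per corner and the columns "X" and "Y".
window_coord_sf <- st_coordinates(
  st_transform(window_coord, crs = "+proj=moll")
)

window_coord_sf
```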
How to create a map of Europe using R
To zoom in on Europe, we will add the coord_sf() function to the R code we used before for the world map. The “xlim” and “ylim” arguments take the “X” and “Y” columns of the “window_coord_sf” object created previously.
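To keep this sketch short and runnable without the World Bank download, it zooms on the plain country outlines only; the same coord_sf() call can be added to the filled map. The corner coordinates are the same hypothetical values as an approximate frame for Europe.

```r
library(dplyr)
library(ggplot2)
library(rnaturalearth)
library(sf)

# World polygons without Antarctica, in the Mollweide projection.
world_moll <- ne_countries(scale = "medium", returnclass = "sf") %>%
  filter(admin != "Antarctica") %>%
  st_transform(crs = "+proj=moll")

# Hypothetical window corners (longitude, latitude), reprojected.
window_coord_sf <- st_coordinates(st_transform(
  st_sfc(st_point(c(-20, 30)), st_point(c(45, 73)), crs = 4326),
  crs = "+proj=moll"
))

# The limits are given in the projected (Mollweide) coordinates.
p_zoom <- ggplot(world_moll) +
  geom_sf() +
  coord_sf(
    xlim   = window_coord_sf[, "X"],
    ylim   = window_coord_sf[, "Y"],
    expand = FALSE
  )

print(p_zoom)
```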
Changing the scale and color distribution
Let’s improve our visualization a bit. We will remove the African countries and Greenland. We will also reduce our data to European countries only, so the extreme yellow and dark blue colors will now correspond to the highest and lowest percentages among European countries. We will also remove the square root transformation, which can be misleading for some audiences.
We now have a different data story: this map compares unemployment between European countries only, while the previous plot used a scale based on global unemployment data. The picture is different: Spain and the Baltic countries stand out much more than they did on the global scale.
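A sketch of the Europe-only map could look like the block below. Filtering on the continent column of the Natural Earth data is my assumption of how to keep European countries only; it also drops the African countries and Greenland. The indicator code, the join key and the window corners are assumptions as well.

```r
library(dplyr)
library(ggplot2)
library(rnaturalearth)
library(sf)
library(wbstats)

# Keep European countries only (assumed `continent` column), reprojected.
europe_moll <- ne_countries(scale = "medium", returnclass = "sf") %>%
  filter(continent == "Europe") %>%
  st_transform(crs = "+proj=moll")

# Assumed indicator code, renamed to `unemployment`.
df <- wb_data(c(unemployment = "SL.UEM.TOTL.ZS"),
              start_date = 2020, end_date = 2020)

europe_data <- europe_moll %>%
  left_join(select(df, iso3c, unemployment), by = c("iso_a3" = "iso3c"))

# Hypothetical window corners (longitude, latitude), reprojected.
window_coord_sf <- st_coordinates(st_transform(
  st_sfc(st_point(c(-20, 30)), st_point(c(45, 73)), crs = 4326),
  crs = "+proj=moll"
))

# Linear colour scale: its range now spans European values only.
p_europe <- ggplot(europe_data) +
  geom_sf(aes(fill = unemployment), color = "white") +
  scale_fill_viridis_c(name = "Unemployment (%)") +
  coord_sf(
    xlim   = window_coord_sf[, "X"],
    ylim   = window_coord_sf[, "Y"],
    expand = FALSE
  ) +
  theme_void()

print(p_europe)
```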
You can get all the code of this tutorial by joining my newsletter on felixanalytix.com. Once you have subscribed, you will receive an automatic email with the URL of my GitHub account, where you can download the R scripts of all my tutorials.