
EDIT: Following a suggestion from Adriano Fantini and code from Andy South, we replaced rworldmap by rnaturalearth.

This tutorial is the first part in a series of 3:

  • General concepts illustrated with the world map (this document)
  • Adding additional layers: an example with points and polygons
  • Positioning and layout for complex maps

In this part, we will cover the fundamentals of mapping using ggplot2 associated with sf, and present the basic elements and parameters we can play with to prepare a map.

Maps are used in a variety of fields to express data in an appealing and interpretive way. Data can be expressed as simplified patterns, and this interpretation is generally lost if the data is only seen through a spreadsheet. Maps can add vital context by incorporating many variables into an easy to read and applicable context. Maps are also very important in the information world because they can quickly allow the public to gain better insight so that they can stay informed. It's critical to have maps be effective, which means creating maps that can be easily understood by a given audience. For instance, maps that need to be understood by children would be very different from maps intended to be shown to geographers.

Knowing what elements are required to enhance your data is key to making effective maps. Basic elements of a map that should be considered are polygons, points, lines, and text. Polygons, on a map, are closed shapes such as country borders. Lines are considered to be linear shapes that are not filled with any aspect, such as highways, streams, or roads. Finally, points are used to specify specific positions, such as city or landmark locations. With that in mind, one needs to think about what elements are required in the map to really make an impact, and convey the information for the intended audience. Layout and formatting are the second critical aspect to enhance data visually. The arrangement of these map elements and how they will be drawn can be adjusted to make a maximum impact.

A solution using R and its ecosystem of packages

Current solutions for creating maps usually involve GIS software, such as ArcGIS, QGIS, eSpatial, etc., which allows the user to visually prepare a map, in the same way as one would prepare a poster or a document layout. On the other hand, R, a free and open-source software development environment (IDE) that is used for computing statistical data and graphics in a programmable language, has developed advanced spatial capabilities over the years, and can be used to draw maps programmatically.

R is a powerful and flexible tool. R can be used for everything from computing on data sets to creating graphs and maps with the same data set. R is also free, which makes it easily accessible to anyone. Some other advantages of using R are that it has an interactive language, data structures, graphics availability, a developed community, and the advantage of adding more functionalities through an entire ecosystem of packages. R is a scriptable language that allows the user to write out code that executes the commands specified.

Using R to create maps brings these benefits to mapping. Elements of a map can be added or removed with ease; R code can be tweaked to make major enhancements with a stroke of a key. It is also easy to reproduce the same maps for different data sets. It is important to be able to script the elements of a map, so that it can be re-used and interpreted by any user. In essence, comparing typical GIS software and R for drawing maps is similar to comparing word processing software (e.g. Microsoft Office or LibreOffice) and a programmatic typesetting system such as LaTeX, in that typical GIS software implements a WYSIWYG approach ("What You See Is What You Get"), while R implements a WYSIWYM approach ("What You See Is What You Mean").

The package ggplot2 implements the grammar of graphics in R, as a way to create code that makes sense to the user: the grammar of graphics is a term used to break up graphs into semantic components, such as geometries and layers. Practically speaking, it allows (and forces!) the user to focus on graph elements at a higher level of abstraction, and on how the data must be structured to achieve the expected outcome. While ggplot2 is becoming the de facto standard for R graphs, it does not handle spatial data specifically. The current state-of-the-art of spatial objects in R relies on Spatial classes defined in the package sp, but the new package sf has recently implemented the "simple feature" standard, and is steadily taking over sp. Recently, the package ggplot2 has allowed the use of simple features from the package sf as layers in a graph. The combination of ggplot2 and sf therefore makes it possible to programmatically create maps, using the grammar of graphics, that are just as informative or visually appealing as those from traditional GIS software.

Getting started

Many R packages are available from CRAN, the Comprehensive R Archive Network, which is the primary repository of R packages. The full list of packages necessary for this series of tutorials can be installed with:

    # "lwgeom" is the CRAN package name (the original listed "libwgeom")
    install.packages(c("cowplot", "googleway", "ggplot2", "ggrepel",
        "ggspatial", "lwgeom", "sf", "rnaturalearth", "rnaturalearthdata"))

We start by loading the basic packages necessary for all maps, i.e. ggplot2 and sf. We also suggest using the classic dark-on-light theme for ggplot2 (theme_bw), which is appropriate for maps:

    library("ggplot2")
    theme_set(theme_bw())
    library("sf")

The package rnaturalearth provides a map of countries of the entire world. Use ne_countries to pull country data and choose the scale (rnaturalearthhires is necessary for scale = "large"). The function can return sp classes (default) or directly sf classes, as defined in the argument returnclass:

    library("rnaturalearth")
    library("rnaturalearthdata")

    world <- ne_countries(scale = "medium", returnclass = "sf")
    class(world)

    ## [1] "sf"         "data.frame"

General concepts illustrated with the world map

Data and basic plot (ggplot and geom_sf)

First, let us start with creating a base map of the world using ggplot2. This base map will then be extended with different map elements, as well as zoomed in to an area of interest. We can check that the world map was properly retrieved and converted into an sf object, and plot it with ggplot2:

    ggplot(data = world) +
        geom_sf()

This call nicely introduces the structure of a ggplot call: the first part ggplot(data = world) initiates the ggplot graph, and indicates that the main data is stored in the world object. The line ends with a + sign, which indicates that the call is not complete yet, and each subsequent line corresponds to another layer or scale. In this case, we use the geom_sf function, which simply adds a geometry stored in an sf object. By default, all geometry functions use the main data defined in ggplot(), but we will see later how to provide additional data.

Note that layers are added one at a time in a ggplot call, so the order of each layer is very important. All data will have to be in an sf format to be used by ggplot2; data in other formats (e.g. classes from sp) will be manually converted to sf classes if necessary.
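
As a minimal sketch of such a manual conversion, the function st_as_sf from sf can turn a plain data.frame of coordinates into an sf object (it also accepts Spatial* objects from sp). The cities data frame and its column names below are hypothetical, made up for illustration only:

```r
library("sf")

# Hypothetical example data: approximate longitude/latitude of two cities (WGS84)
cities <- data.frame(
    name = c("Miami", "Havana"),
    lon = c(-80.19, -82.37),
    lat = c(25.76, 23.11)
)

# Convert the data.frame to an sf object; EPSG:4326 is WGS84 longitude/latitude
cities_sf <- st_as_sf(cities, coords = c("lon", "lat"), crs = 4326)
class(cities_sf)
```

The resulting object can then be passed to geom_sf through its data argument, like the world object.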

Title, subtitle, and axis labels (ggtitle, xlab, ylab)

A title and a subtitle can be added to the map using the function ggtitle, passing any valid character string (e.g. with quotation marks) as arguments. Axis names are absent by default on a map, but can be changed to something more suitable (e.g. "Longitude" and "Latitude"), depending on the map:

    ggplot(data = world) +
        geom_sf() +
        xlab("Longitude") + ylab("Latitude") +
        # country names are stored in the lowercase column "name"
        ggtitle("World map", subtitle = paste0("(", length(unique(world$name)), " countries)"))

Map color (geom_sf)

In many ways, sf geometries are no different than regular geometries, and can be displayed with the same level of control on their attributes. Here is an example with the polygons of the countries filled with a green color (argument fill), using black for the outline of the countries (argument color):

    ggplot(data = world) +
        geom_sf(color = "black", fill = "lightgreen")

The package ggplot2 allows the use of more complex color schemes, such as a gradient on one variable of the data. Here is another example that shows the population of each country. In this example, we use the "viridis" colorblind-friendly palette for the color gradient (with option = "plasma" for the plasma variant), using the square root of the population (which is stored in the variable pop_est of the world object):

    ggplot(data = world) +
        geom_sf(aes(fill = pop_est)) +
        scale_fill_viridis_c(option = "plasma", trans = "sqrt")

Projection and extent (coord_sf)

The function coord_sf deals with the coordinate system, which includes both the projection and the extent of the map. By default, the map will use the coordinate system of the first layer that defines one (i.e. scanned in the order provided), or if none, fall back on WGS84 (latitude/longitude, the reference system used in GPS). Using the argument crs, it is possible to override this setting, and project on the fly to any projection. This can be achieved using any valid PROJ4 string (here, the European-centric ETRS89 Lambert Azimuthal Equal-Area projection):

    ggplot(data = world) +
        geom_sf() +
        coord_sf(crs = "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m +no_defs ")

If a Spatial Reference System Identifier (SRID) or a European Petroleum Survey Group (EPSG) code is available for the projection of interest, it can be used directly instead of the full PROJ4 string. The two following calls are equivalent for the ETRS89 Lambert Azimuthal Equal-Area projection, which is EPSG code 3035:

    ggplot(data = world) +
        geom_sf() +
        coord_sf(crs = "+init=epsg:3035")

    ggplot(data = world) +
        geom_sf() +
        coord_sf(crs = st_crs(3035))
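
To get a feel for what the parameters of the PROJ4 string mean, here is a small sketch (not part of the original tutorial) that reprojects a single point with st_transform from sf. The point sits at the projection center (lon_0 = 10, lat_0 = 52), so its projected coordinates should land close to the false easting and northing (x_0 and y_0) given in the string. The object names pt and pt_laea are arbitrary:

```r
library("sf")

# A single point at the projection center, in WGS84 longitude/latitude
pt <- st_sfc(st_point(c(10, 52)), crs = 4326)

# Reproject the data itself to ETRS89 / LAEA Europe (EPSG:3035)
pt_laea <- st_transform(pt, crs = 3035)

# Projected coordinates in meters (columns X and Y)
xy <- st_coordinates(pt_laea)
xy
```

Note the difference with coord_sf: st_transform changes the coordinates stored in the object, while coord_sf only changes how layers are drawn.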

The extent of the map can also be set in coord_sf, in practice allowing to "zoom" in the area of interest, provided by limits on the x-axis (xlim), and on the y-axis (ylim). Note that the limits are automatically expanded by a fraction to ensure that data and axes don't overlap; this can also be turned off to exactly match the limits provided with expand = FALSE:

    ggplot(data = world) +
        geom_sf() +
        coord_sf(xlim = c(-102.15, -74.12), ylim = c(7.65, 33.97), expand = FALSE)

Scale bar and North arrow (package ggspatial)

Several packages are available to create a scale bar on a map (e.g. prettymapr, vcd, ggsn, or legendMap). We introduce here the package ggspatial, which provides easy-to-use functions…

scale_bar that allows adding simultaneously the north symbol and a scale bar to the ggplot map. Five arguments need to be set manually: lon, lat, distance_lon, distance_lat, and distance_legend. The location of the scale bar has to be specified in longitude/latitude in the lon and lat arguments. The shaded distance inside the scale bar is controlled by the distance_lon argument, while its width is determined by distance_lat. Additionally, it is possible to change the font size for the legend of the scale bar (argument legend_size, which defaults to 3). The North arrow behind the "N" north symbol can also be adjusted for its length (arrow_length), its distance to the scale (arrow_distance), or the size of the N north symbol itself (arrow_north_size, which defaults to 6). Note that all distances (distance_lon, distance_lat, distance_legend, arrow_length, arrow_distance) are set to "km" by default in distance_unit; they can also be set to nautical miles with "nm", or miles with "mi".

    library("ggspatial")
    ggplot(data = world) +
        geom_sf() +
        annotation_scale(location = "bl", width_hint = 0.5) +
        annotation_north_arrow(location = "bl", which_north = "true",
            pad_x = unit(0.75, "in"), pad_y = unit(0.5, "in"),
            style = north_arrow_fancy_orienteering) +
        coord_sf(xlim = c(-102.15, -74.12), ylim = c(7.65, 33.97))

    ## Scale on map varies by more than 10%, scale bar may be inaccurate

Note the warning about the inaccurate scale bar: since the map uses unprojected data in longitude/latitude (WGS84) on an equidistant cylindrical projection (all meridians being parallel), length in (kilo)meters on the map directly depends mathematically on the degree of latitude. Plots of small regions or projected data will often allow for more accurate scale bars.
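
The dependence on latitude can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a spherical Earth of radius 6371 km (an approximation, and not how ggspatial computes its warning) and compares the ground length of one degree of longitude at the southern and northern edges of the map; deg_lon_km is a hypothetical helper written for this example:

```r
# Approximate length (km) of one degree of longitude at a given latitude,
# assuming a spherical Earth (radius_km is an assumed mean radius)
deg_lon_km <- function(lat_deg, radius_km = 6371) {
    (pi / 180) * radius_km * cos(lat_deg * pi / 180)
}

south <- deg_lon_km(7.65)   # southern limit of the map (ylim)
north <- deg_lon_km(33.97)  # northern limit of the map (ylim)

# Relative difference between the two edges of the map
rel_diff <- (south - north) / south
rel_diff
```

Under these assumptions, the relative difference exceeds 10%, which is consistent with the warning issued for this extent.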

Country names and other names (geom_text and annotate)

The world data set already contains country names and the coordinates of the centroid of each country (among more information). We can use this information to plot country names, using world as a regular data.frame in ggplot2. The function geom_text can be used to add a layer of text to a map using geographic coordinates. The function requires the data needed to enter the country names, which is the same data as the world map. Again, we have very flexible control to adjust the text at will on many aspects:

  • The size (argument size);
  • The alignment, which is centered by default on the coordinates provided. The text can be adjusted horizontally or vertically using the arguments hjust and vjust, which can either be a number between 0 (left/bottom) and 1 (right/top) or a character ("left", "middle", "right", "bottom", "center", "top"). The text can also be offset horizontally or vertically with the arguments nudge_x and nudge_y;
  • The font of the text, for example its color (argument color) or the type of font (fontface);
  • The overlap of labels, using the argument check_overlap, which removes overlapping text. Alternatively, when there are a lot of overlapping labels, the package ggrepel provides a geom_text_repel function that moves labels around so that they do not overlap.
  • For the text labels, we are defining the centroid of the countries with st_centroid, from the package sf. Then we combine the coordinates with the centroid, in the geometry of the spatial data frame. The package sf is necessary for the command st_centroid.

Additionally, the annotate function can be used to add a single character string at a specific location, as demonstrated here to add the Gulf of Mexico:

    library("sf")
    world_points <- st_centroid(world)
    # add the centroid coordinates as columns X and Y
    world_points <- cbind(world, st_coordinates(st_centroid(world$geometry)))

    ggplot(data = world) +
        geom_sf() +
        geom_text(data = world_points, aes(x = X, y = Y, label = name),
            color = "darkblue", fontface = "bold", check_overlap = FALSE) +
        annotate(geom = "text", x = -90, y = 26, label = "Gulf of Mexico",
            fontface = "italic", color = "grey22", size = 6) +
        coord_sf(xlim = c(-102.15, -74.12), ylim = c(7.65, 33.97), expand = FALSE)
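
As a sketch of the ggrepel alternative mentioned in the list above, geom_text_repel can replace geom_text when labels crowd each other. To keep the example self-contained, it uses a small hypothetical data frame of label positions (the coordinates are rough, made-up values, not the centroids computed above), with the same X, Y and name columns as world_points:

```r
library("ggplot2")
library("ggrepel")

# Hypothetical label positions (approximate, for illustration only)
labels <- data.frame(
    X = c(-77.3, -77.4, -76.8),
    Y = c(18.1, 18.9, 21.5),
    name = c("Jamaica", "Cayman Is.", "Cuba")
)

# geom_text_repel nudges labels away from each other and from the points
p <- ggplot(data = labels, aes(x = X, y = Y, label = name)) +
    geom_point() +
    geom_text_repel(color = "darkblue", fontface = "bold")
p
```

On the actual map, the same layer could presumably be used by passing data = world_points to geom_text_repel in place of the geom_text layer.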

Final map

Now to make the final touches, the theme of the map can be edited to make it more appealing. We suggested the use of theme_bw for a standard theme, but there are many other themes that can be selected from (see for instance ?ggtheme in ggplot2, or the package ggthemes which provides several useful themes). Moreover, specific theme elements can be tweaked to get to the final result:

  • Position of the legend: although not used in this example, the argument legend.position allows the legend to be placed automatically at a specific location (e.g. "top", "bottom", "left", or "right");
  • Grid lines (graticules) on the map: by using panel.grid.major and panel.grid.minor, grid lines can be adjusted. Here we set them to a gray color and dashed line type to clearly distinguish them from country border lines;
  • Map background: the argument panel.background can be used to color the background, which is essentially the ocean, with a light blue;
  • Many more elements of a theme can be adjusted, which would be too long to cover here. We refer the reader to the documentation for the function theme.

    ggplot(data = world) +
        geom_sf(fill = "antiquewhite") +
        geom_text(data = world_points, aes(x = X, y = Y, label = name),
            color = "darkblue", fontface = "bold", check_overlap = FALSE) +
        annotate(geom = "text", x = -90, y = 26, label = "Gulf of Mexico",
            fontface = "italic", color = "grey22", size = 6) +
        annotation_scale(location = "bl", width_hint = 0.5) +
        annotation_north_arrow(location = "bl", which_north = "true",
            pad_x = unit(0.75, "in"), pad_y = unit(0.5, "in"),
            style = north_arrow_fancy_orienteering) +
        coord_sf(xlim = c(-102.15, -74.12), ylim = c(7.65, 33.97), expand = FALSE) +
        xlab("Longitude") + ylab("Latitude") +
        ggtitle("Map of the Gulf of Mexico and the Caribbean Sea") +
        theme(panel.grid.major = element_line(color = gray(.5), linetype = "dashed", size = 0.5),
            panel.background = element_rect(fill = "aliceblue"))

Saving the map with ggsave

The final map now ready, it is very easy to save it using ggsave. This function allows a graphic (typically the last plot displayed) to be saved in a variety of formats, including the most common PNG (raster bitmap) and PDF (vector graphics), with control over the size and resolution of the outcome. For instance here, we save a PDF version of the map, which keeps the best quality, and a PNG version of it for web purposes:

    ggsave("map.pdf")
    ggsave("map_web.png", width = 6, height = 6, dpi = "screen")
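
By default ggsave saves the last plot displayed, which can be fragile in a script. As a minimal sketch, the plot can instead be stored in an object and passed explicitly through the plot argument. The stand-in plot p, the temporary output path and the dpi value below are assumptions chosen for illustration; in practice p would be the final map:

```r
library("ggplot2")

# Stand-in plot for illustration; replace with the final map object
p <- ggplot(data.frame(x = 1:3, y = 1:3), aes(x = x, y = y)) +
    geom_point()

# Write to a temporary directory so the example leaves no files behind
out <- file.path(tempdir(), "map_web.png")

# Pass the plot explicitly instead of relying on the last plot displayed
ggsave(out, plot = p, width = 6, height = 6, dpi = 150)
file.exists(out)
```

Width and height are in inches by default, so the pixel size of the PNG is the product of these values and dpi.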