I’ve had a lot of people ask me to make maps for them recently. There seems to be a magical aura around cartography where people say “I can’t make maps” or “I don’t do GIS”. It’s really not that hard. IF you don’t want to pay the big $$
for an ArcGIS licesne, you can make a quite servicable map with R.
There are a lot of different packages that you can use for map making, so one of the bigest issues folks have when they are first starting is knowing which package to start with.
‘sf’ which stands for “simple features” is one of the most user-friendly packages for basic map making (using vector graphics). It was built by the team that brought you the tidyvserse, so you know it’s probably good.
You can check out the website for more info: https://r-spatial.github.io/sf/
In a normal data set, you will have a table where each observation is a row, and each attribute associated with that observation is a column.
We’ll use stations from EMP’s discrete water quality survey as an example.
EMP = read.csv("EMP_Discrete_Water_Quality_Stations.csv")
EMP
## Station Location StationType
## 1 C10 San Joaquin River near Vernalis Shore-based
## 2 C10A San Joaquin River near Vernalis @ SJR Club Shore-based
## 3 C3 Sacramento River @ Green's Landing Shore-based
## 4 C3A Sacramento River @ Hood Shore-based
## 5 C7 San Joaquin River from Boat Dock @ Mossdale Bridge Shore-based
## 6 C9 West Canal @ Clifton Court Intake Shore-based
## 7 D10 Sacramento River @ Chipps Island Mid-channel
## 8 D11 Sherman Lake near Antioch Mid-channel
## 9 D12 San Joaquin River @ Antioch Ship Channel Mid-channel
## 10 D14A Big Break near Oakley Mid-channel
## 11 D15 San Joaquin River @ Jersey Point Mid-channel
## 12 D16 San Joaquin River @ Twitchell Island Mid-channel
## 13 D19 Frank's Tract near Russo's Landing Mid-channel
## 14 D2 Suisun Bay near Preston Point Mid-channel
## 15 D22 Sacramento River @ Emmaton Mid-channel
## 16 D24 Sacramento River below Rio Vista Bridge Shore-based
## 17 D26 San Joaquin River @ Potato Slough Mid-channel
## 18 D28A Old River @ Rancho Del Rio Mid-channel
## 19 D4 Sacramento River above Point Sacramento Mid-channel
## 20 D41 San Pablo Bay near Pinole Point Mid-channel
## 21 D41A San Pablo Bay near Mouth of Petaluma River Mid-channel
## 22 D42 San Pablo Bay near Mare Island Mid-channel
## 23 D6 Suisun Bay @ Bulls Head nr. Martinez Mid-channel
## 24 D7 Grizzly Bay @ Dolphin nr. Suisun Slough Mid-channel
## 25 D8 Suisun Bay off Middle Point nr. Nichols Mid-channel
## 26 MD10 Disappointment Slough @ Bishop Cut Shore-based
## 27 MD10A Disapointment Slough @ Bishop Cut Mid-channel
## 28 MD6 Sycamore Slough near Mouth Mid-channel
## 29 MD7 South Fork Mokelumne below Sycamore Slough Mid-channel
## 30 MD7A Little Potato Slough @ Terminous Mid-channel
## 31 NZ002 Carquinez Straite near Glencove Harbor Mid-channel
## 32 NZ004 Carquinez Straite near Ozol Pier Mid-channel
## 33 NZ032 Montezuma Slough, 2nd bend from mouth Mid-channel
## 34 NZ068 Sacramento River below Rio Vista Bridge Mid-channel
## 35 NZ325 San Pablo Bay near Rock Wall and Light 15 Mid-channel
## 36 NZS42 Suisun Slough @ Volanti Slough Mid-channel
## 37 P10 Middle River @ Victoria Canal Shore-based
## 38 P10A Middle River @ Union Point Shore-based
## 39 P12 Old River @ Tracy Road Bridge Shore-based
## 40 P12A Old River @ Oak Island Shore-based
## 41 P2 Mokelumne River @ Franklin Road Bridge Shore-based
## 42 P8 San Joaquin River @ Buckley Cove Mid-channel
## 43 S42 Suisun Slough 300' south of Volanti Slough Shore-based
## Latitude Longitude Status AverageDepth
## 1 37.67575 -121.2650 Inactive 3
## 2 37.67934 -121.2647 Active 4
## 3 38.34575 -121.5461 Inactive 2
## 4 38.36771 -121.5205 Active 1
## 5 37.78607 -121.3077 Inactive 1
## 6 37.83095 -121.5540 Active 45
## 7 38.04631 -121.9183 Active 23
## 8 38.04229 -121.7995 Inactive 12
## 9 38.02161 -121.8063 Active 6
## 10 38.01776 -121.7114 Inactive 23
## 11 38.05217 -121.6896 Inactive 4
## 12 38.09690 -121.6691 Active 1
## 13 38.04376 -121.6148 Active 35
## 14 38.06544 -122.0545 Inactive 6
## 15 38.08453 -121.7391 Active 4
## 16 38.15778 -121.6847 Inactive 2
## 17 38.07664 -121.5669 Active 1
## 18 37.97048 -121.5730 Active 34
## 19 38.06248 -121.8205 Active 5
## 20 38.03022 -122.3729 Active 74
## 21 38.08472 -122.3907 Active 23
## 22 38.05872 -122.2847 Inactive 43
## 23 38.04436 -122.1177 Active 4
## 24 38.11714 -122.0397 Active 5
## 25 38.05992 -121.9900 Active 34
## 26 38.04381 -121.4188 Inactive 3
## 27 38.04226 -121.4199 Active 4
## 28 38.14150 -121.4687 Inactive 2
## 29 38.12513 -121.4970 Inactive 1
## 30 38.11382 -121.4980 Inactive 1
## 31 38.06529 -122.2152 Active 45
## 32 38.03576 -122.1616 Active 23
## 33 38.16991 -122.0211 Active 12
## 34 38.14272 -121.6895 Active 6
## 35 38.05798 -122.2919 Active 23
## 36 38.18045 -122.0476 Active 4
## 37 37.89120 -121.4894 Inactive 1
## 38 37.89126 -121.4883 Inactive 35
## 39 37.80472 -121.4500 Inactive 6
## 40 37.80284 -121.4569 Inactive 4
## 41 38.25542 -121.4403 Inactive 2
## 42 37.97817 -121.3823 Active 1
## 43 38.18047 -122.0469 Inactive 34
You will notice that we have one line per station with various attributes, such as Station Type, Location, Latitude, Longitude, and depth.
(I actually just made up the depths for demonstration purposes, don’t use them for anything else)
The columns “latitude” and “longitude” tell you were the stations are, but R won’t recognize this as spatial data in this format. We need to identify which columns have the spatial component and what the coordinate reference system is. If you are not familiar with coordinate reference systems, check out this cheat sheet
The coordinate reference system is made up of several key components:
A geographic coordinate system locates latitude and longitude location using angles. Thus the spacing of each line of latitude moving north and south is not uniform. (example, WGS84)
A projected coordinate system locates latitude and longitude using static units. The line spacing is uniform so it is easier to transform and do analyses. However, they are usually only good for a small part of the earth (Example UTM)
We can specify the CRS via a list of the components, or by it’s numeric code .
EMP uses WGS84, which is number 4326.
library(sf)
EMPsf = st_as_sf(EMP, coords = c("Longitude", "Latitude"), crs = 4326)
EMPsf
## Simple feature collection with 43 features and 5 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -122.3907 ymin: 37.67575 xmax: -121.2647 ymax: 38.36771
## geographic CRS: WGS 84
## First 10 features:
## Station Location StationType
## 1 C10 San Joaquin River near Vernalis Shore-based
## 2 C10A San Joaquin River near Vernalis @ SJR Club Shore-based
## 3 C3 Sacramento River @ Green's Landing Shore-based
## 4 C3A Sacramento River @ Hood Shore-based
## 5 C7 San Joaquin River from Boat Dock @ Mossdale Bridge Shore-based
## 6 C9 West Canal @ Clifton Court Intake Shore-based
## 7 D10 Sacramento River @ Chipps Island Mid-channel
## 8 D11 Sherman Lake near Antioch Mid-channel
## 9 D12 San Joaquin River @ Antioch Ship Channel Mid-channel
## 10 D14A Big Break near Oakley Mid-channel
## Status AverageDepth geometry
## 1 Inactive 3 POINT (-121.265 37.67575)
## 2 Active 4 POINT (-121.2647 37.67934)
## 3 Inactive 2 POINT (-121.5461 38.34575)
## 4 Active 1 POINT (-121.5205 38.36771)
## 5 Inactive 1 POINT (-121.3077 37.78607)
## 6 Active 45 POINT (-121.554 37.83095)
## 7 Active 23 POINT (-121.9183 38.04631)
## 8 Inactive 12 POINT (-121.7995 38.04229)
## 9 Active 6 POINT (-121.8063 38.02161)
## 10 Inactive 23 POINT (-121.7114 38.01776)
You will notice that instead of a “latitude” and “logitude” column, we now have a “geometry” column that contains both latitude and longitude.
Now let’s plot these points.
library(ggmap)
ggplot() + geom_sf(data = EMPsf)
OK, so that’s pretty easy! But it would be nice if we had an outline of the waterways so the points aren’t floating out in space. I’ve got this shapefile of the waterways of teh Delta, and it’s pretty easy to import and plot it in sf.
waterways = read_sf("hydro_delta_marsh.shp")
waterways
## Simple feature collection with 282 features and 9 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -122.6408 ymin: 37.41522 xmax: -120.9357 ymax: 38.67781
## geographic CRS: NAD83
## # A tibble: 282 x 10
## AREA PERIMETER HYDRO_POLY HYDRO_PO_1 HYDRO_24K_ TYPE HNAME Shape_Leng
## <dbl> <dbl> <int> <int> <int> <chr> <chr> <dbl>
## 1 7.35e7 1033340 791 797 798 MR SACR~ 2.45
## 2 8.76e4 3319. 1965 1963 1964 S W 0.0357
## 3 7.92e6 87428. 1967 1965 1966 C SACT~ 0.829
## 4 1.04e5 2719. 1970 1969 1970 L GREE~ 0.0264
## 5 1.06e5 2798. 1977 1974 1975 L LAKE~ 0.0283
## 6 1.59e5 3392. 1982 1978 1979 S W 0.0314
## 7 4.26e4 1003. 1992 1989 1990 S W 0.00952
## 8 5.65e3 498. 2001 2008 2009 MR SOUT~ 0.00548
## 9 4.14e3 502. 2006 2012 2013 MR SOUT~ 0.00536
## 10 9.78e4 6630. 2008 2011 2012 MR SOUT~ 0.0746
## # ... with 272 more rows, and 2 more variables: Shape_Area <dbl>,
## # geometry <POLYGON [°]>
Notice that the ‘geometery’ column says ‘polygon’ instead of ‘point’ and it is actually a list with a bunch of points outlining the feature.
ggplot() +
geom_sf(data = waterways)+
geom_sf(data = EMPsf)
Excellent! Now let’s change some things. Fortunately, the ‘ggmap’ package can accept any of the ggplot themes, and all of the ggplot calls you might already know. You can also vary the shape, size, and color of the elements.
ggplot() +
geom_sf(data = waterways)+
geom_sf(data = EMPsf, aes(color = StationType, size = AverageDepth)) +
theme_bw()
We usually want a scale bar and north arrow on our maps. There are nice little built-in fucntions in the package “ggsn” that will do that for you.
library(ggsn)
ggplot() +
geom_sf(data = waterways)+
geom_sf(data = EMPsf, aes(color = StationType, size = AverageDepth)) +
theme_bw()+
#You can adjust the size, units, etc of your scale bar.
scalebar(data = EMPsf, dist = 20, dist_unit = "km",
transform = TRUE, st.dist = .1) +
#there are a number of different optinos for north arrow symbols. ?north
north(data = waterways, symbol = 2)
We’d also like a basemap so we can see the rest of the world.
This is also a good opportunity to review rasters versus vectors. Spatial data can either be represented as a grid of equal-sized boxes, with data associated with each box (raster), or a group of polygons, with data associated with each point/line/polygon. The delta waterways shapefile is a vector - each river is made up of a list of points and has lines connecting the points.
Basemaps are usually rasters, rather than vectors, and have very little information associated with them. Each pixel just has a color associated with it, so it’s just a picture of a map.
There are a number of free basemap sources online, including Google Maps, but Stamen maps are one of the easiest to work with.
#download your map from online
basemap = get_stamenmap(bbox = c(left = -122.6,
right = -121.0, bottom = 37.5, top = 38.6),
maptype = "terrain")
#use "ggmap" rather than "ggplot" to work with the basemap
ggmap(basemap)+ #plot the basemap
geom_sf(data = waterways, inherit.aes = F)+ #add teh waterways. Be sure to use "inherit.aes=F"
geom_sf(data = EMPsf, aes(color = StationType, size = AverageDepth),inherit.aes = FALSE) + #add EMP
#add scalebar and northa rrow
scalebar(data = EMPsf, dist = 20, dist_unit = "km",
transform = TRUE, st.dist = .1) +
scale_color_discrete(guide = FALSE) +
north(data = waterways, symbol = 2)
## Labels
We can label our points with geom_sf_text
and add annotations with “annotate”
ggmap(basemap, darken = c(.5, "white"))+ #plot the basemap and lighten it a little
geom_sf(data = waterways, inherit.aes = F, fill = "lightblue")+ #add the waterways
geom_sf(data = EMPsf, aes(color = StationType, size = AverageDepth),inherit.aes = FALSE) + #add EMP points
scalebar(data = EMPsf, dist = 20, dist_unit = "km",
transform = TRUE, st.dist = .1) + #add a scale bar
north(data = EMPsf, symbol = 3) + #Add north arrow
geom_sf_text(data = EMPsf, aes(label = Station), #label the points
nudge_x = 0.07, size = 3, #adjust size and position relative to points
inherit.aes = FALSE,
check_overlap = T) + #stop labels from being drawn over each other
annotate("text", label = "Yay Maps!", x = -122.2, y = 38.4, size = 8)
## Coordinate system already present. Adding new coordinate system, which will replace the existing one.
## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not
## give correct results for longitude/latitude data
And there you have it. This map obviously needs a fair amount of work to be beautiful, but you can get it going relatively easily. If you need a copy of the Delta shapefile, check out the “deltamapr” package: https://github.com/InteragencyEcologicalProgram/deltamapr