sf
packageSoil data has its specificities, but generally speaking we can get away with treating it like a spatial dataset for a significant part of the analysis workflow. This is why in this tutorial, we will present tools to read and process spatial data.
A lot (most?) of soil data is stored as using spatial formats – for better or worse. While the soil profile is a specific case (we can consider it as a set of horizons generated from a spatial point located at the surface of our planet, soil surveys, landsacpe attributes and enviornemental covariates are best represented as spatial datasets.
When reading such data, the main study cases are:
txt
or csv
In this tutorial, we will mostly cover the two first cases.
The “classic approach” is a collection of 4 packages:
sp
: provides a set of dedicated classes for the different vector and raster datatypes , and some analysis tools (overlays, etc.),rgdal
: input/output library built over GDAL/OGR to read and write spatial data,rgeos
: geometry library built over GEOS for all geometry operations (union, intersections, buffer, etc).Recently (this year), a new package has been introduced to manipulate vector data: sf
(for Simple Features). It’s a pretty interesting improvement on sp
/rgdal
/rgeos
, and we will focus on it in this tutorial.
In a few words, here’s what is interesting about sf
:
sf
is a data.frame
– which means that a lot of data analysis methods are readily available,sp
, rgdal
, and rgeos
under one unique package,rgdal
,sp
and rgdal
— the upcoming version will include spatial indexing!According to the standard:
“A simple feature is defined by the OpenGIS Abstract specification to have both spatial and non-spatial attributes. Spatial attributes are geometry valued, and simple features are based on 2D geometry with linear interpolation between vertices.”
Basically, spatial is not that special anymore: spatial attributes can be described as a geometry, which is just another attribute of the dataset.
Several types of geometry can be implemented using this standard: POINT
, LINESTRING
, POLYGON
, MULTIPOINT
, MULTILINESTRING
, MULTIPOLYGON
, GEOMETRYCOLLECTION
. There’s also more fancy stuff like CURVEPOLYGON
, etc.
Simple Features are represented using WKT/WKB (Well Known Text/Well Known Binary). It looks like:
## POINT(0 1)
## MULTIPOINT(1 6, 2 7, 3 8, 4 9, 5 10)
## MULTIPOLYGON(((0 0, 10 0, 10 10, 0 10, 0 0), (1 1, 1 2, 2 2, 2 1, 1 1), (5 5, 5 6, 6 6, 6 5, 5 5)), ((12 12, 22 12, 22 22, 12 22, 12 12), (13 13, 13 14, 14 14, 14 13, 13 13)), ((24 24, 34 24, 34 34, 24 34, 24 24)))
Simple Features are represented by sf
as a data.frame
. The spatial attributes of the Simple Features (or geometry
) are represenetd using WKT. Technically speaking, there are 3 types of classes implemented by the sf
package:
sf
is the data.frame
with spatial attributes,sfc
is the column storing the geometries of the different records in the data.frame
,sfg
is the geometry of each individual record.The functions provided by sf
are prefixed by st_
— similarly to PostGIS. That makes it easy to search for them on the command line too.
It is very often that point data (such as profiles) is exchanged using a text delimited format such as CSV.
CSV and TXT files can be read using the classic read.csv
, read.delim
, etc., in base
R. however, the readr
package — from the tidyverse
— does provide interesting alternatives that are much faster in read and write.
library(tidyverse)
df <- read_csv('./path/to/my/file.csv')
data.frame
to Simple FeaturesLet’s take the example dataset meuse
from the sp
package. meuse
is a data.frame
— similar to what could have been read from a CSV file using read_csv
:
data('meuse', package = "sp")
head(meuse)
## x y cadmium copper lead zinc elev dist om ffreq soil
## 1 181072 333611 11.7 85 299 1022 7.909 0.00135803 13.6 1 1
## 2 181025 333558 8.6 81 277 1141 6.983 0.01222430 14.0 1 1
## 3 181165 333537 6.5 68 199 640 7.800 0.10302900 13.0 1 1
## 4 181298 333484 2.6 81 116 257 7.655 0.19009400 8.0 1 2
## 5 181307 333330 2.8 48 117 269 7.480 0.27709000 8.7 1 2
## 6 181390 333260 3.0 61 137 281 7.791 0.36406700 7.8 1 2
## lime landuse dist.m
## 1 1 Ah 50
## 2 1 Ah 30
## 3 1 Ah 150
## 4 0 Ga 270
## 5 0 Ah 380
## 6 0 Ga 470
The easiest way to modifiy this data.frame
so that its coordinates (x
and y
) are used to generate a geometry
column is using st_as_sf
(“as simple feature”):
ms <- st_as_sf(
meuse,
coords = c('x', 'y'),
crs = "+init=epsg:28992"
)
ms
## Simple feature collection with 155 features and 12 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 178605 ymin: 329714 xmax: 181390 ymax: 333611
## epsg (SRID): 28992
## proj4string: +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +towgs84=565.4171,50.3319,465.5524,-0.398957388243134,0.343987817378283,-1.87740163998045,4.0725 +units=m +no_defs
## First 20 features:
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## 7 3.2 31 132 346 8.217 0.19009400 9.2 1 2 0 Ah
## 8 2.8 29 150 406 8.490 0.09215160 9.5 1 1 0 Ab
## 9 2.4 37 133 347 8.668 0.18461400 10.6 1 1 0 Ab
## 10 1.6 24 80 183 9.049 0.30970200 6.3 1 2 0 W
## 11 1.4 25 86 189 9.015 0.31511600 6.4 1 2 0 Fh
## 12 1.8 25 97 251 9.073 0.22812300 9.0 1 1 0 Ag
## 13 11.2 93 285 1096 7.320 0.00000000 15.4 1 1 1 W
## 14 2.5 31 183 504 8.815 0.11393200 8.4 1 1 0 Ah
## 15 2.0 27 130 326 8.937 0.16833600 9.1 1 1 0 Ah
## 16 9.5 86 240 1032 7.702 0.00000000 16.2 1 1 1 W
## 17 7.0 74 133 606 7.160 0.01222430 16.0 1 1 1 W
## 18 7.1 69 148 711 7.100 0.01222430 16.0 1 1 1 W
## 19 8.7 69 207 735 7.020 0.00000000 13.7 1 1 1 W
## 20 12.9 95 284 1052 6.860 0.00000000 14.8 1 1 1 <NA>
## dist.m geometry
## 1 50 POINT(181072 333611)
## 2 30 POINT(181025 333558)
## 3 150 POINT(181165 333537)
## 4 270 POINT(181298 333484)
## 5 380 POINT(181307 333330)
## 6 470 POINT(181390 333260)
## 7 240 POINT(181165 333370)
## 8 120 POINT(181027 333363)
## 9 240 POINT(181060 333231)
## 10 420 POINT(181232 333168)
## 11 400 POINT(181191 333115)
## 12 300 POINT(181032 333031)
## 13 20 POINT(180874 333339)
## 14 130 POINT(180969 333252)
## 15 220 POINT(181011 333161)
## 16 10 POINT(180830 333246)
## 17 10 POINT(180763 333104)
## 18 10 POINT(180694 332972)
## 19 10 POINT(180625 332847)
## 20 10 POINT(180555 332707)
There is a simple plotting funtion including in sf
. It is very similar to the old sp::spplot
:
plot(ms)
## Warning: plotting the first 10 out of 12 attributes; use max.plot = 12 to
## plot all
A wide selecton of spatial data formats can be read using the st_read
command (it’s using GDAL/OGR behind the scenes). Unlike readOGR
from the rgdal
package, generally the command can guess which driver it should use.
file_name <- system.file("shape/nc.shp", package="sf")
nc <- st_read(file_name)
## Reading layer `nc' from data source `/usr/local/lib/R/site-library/sf/shape/nc.shp' using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 14 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## epsg (SRID): 4267
## proj4string: +proj=longlat +datum=NAD27 +no_defs
print(nc)
## Simple feature collection with 100 features and 14 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## epsg (SRID): 4267
## proj4string: +proj=longlat +datum=NAD27 +no_defs
## First 20 features:
## AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74
## 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091
## 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487
## 3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188
## 4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508
## 5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421
## 6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452
## 7 0.062 1.547 1834 1834 Camden 37029 37029 15 286
## 8 0.091 1.284 1835 1835 Gates 37073 37073 37 420
## 9 0.118 1.421 1836 1836 Warren 37185 37185 93 968
## 10 0.124 1.428 1837 1837 Stokes 37169 37169 85 1612
## 11 0.114 1.352 1838 1838 Caswell 37033 37033 17 1035
## 12 0.153 1.616 1839 1839 Rockingham 37157 37157 79 4449
## 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671
## 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556
## 15 0.072 1.085 1842 1842 Vance 37181 37181 91 2180
## 16 0.190 2.204 1846 1846 Halifax 37083 37083 42 3608
## 17 0.053 1.171 1848 1848 Pasquotank 37139 37139 70 1638
## 18 0.199 1.984 1874 1874 Wilkes 37193 37193 97 3146
## 19 0.081 1.288 1880 1880 Watauga 37189 37189 95 1323
## 20 0.063 1.000 1881 1881 Perquimans 37143 37143 72 484
## SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry
## 1 1 10 1364 0 19 MULTIPOLYGON(((-81.47275543...
## 2 0 10 542 3 12 MULTIPOLYGON(((-81.23989105...
## 3 5 208 3616 6 260 MULTIPOLYGON(((-80.45634460...
## 4 1 123 830 2 145 MULTIPOLYGON(((-76.00897216...
## 5 9 1066 1606 3 1197 MULTIPOLYGON(((-77.21766662...
## 6 7 954 1838 5 1237 MULTIPOLYGON(((-76.74506378...
## 7 0 115 350 2 139 MULTIPOLYGON(((-76.00897216...
## 8 0 254 594 2 371 MULTIPOLYGON(((-76.56250762...
## 9 4 748 1190 2 844 MULTIPOLYGON(((-78.30876159...
## 10 1 160 2038 5 176 MULTIPOLYGON(((-80.02567291...
## 11 2 550 1253 2 597 MULTIPOLYGON(((-79.53050994...
## 12 16 1243 5386 5 1369 MULTIPOLYGON(((-79.53050994...
## 13 4 930 2074 4 1058 MULTIPOLYGON(((-78.74912261...
## 14 4 613 1790 4 650 MULTIPOLYGON(((-78.80680084...
## 15 4 1179 2753 6 1492 MULTIPOLYGON(((-78.49252319...
## 16 18 2365 4463 17 2980 MULTIPOLYGON(((-77.33220672...
## 17 3 622 2275 4 933 MULTIPOLYGON(((-76.29892730...
## 18 4 200 3725 7 222 MULTIPOLYGON(((-81.02056884...
## 19 1 17 1775 1 33 MULTIPOLYGON(((-81.80622100...
## 20 1 230 676 0 310 MULTIPOLYGON(((-76.48052978...
plot(nc)
## Warning: plotting the first 10 out of 14 attributes; use max.plot = 14 to
## plot all
plot(nc['AREA'])
Its counterpart is st_write
and is just as easy to use:
st_write(nc, "nc.shp")
st_write(nc, "nc.shp", delete_layer = TRUE)
# Note that write_st erases existing layers silently:
write_sf(nc, "nc.shp")
The drivers available on your machine can be printed using the st_drivers
command:
my_drivers <- st_drivers()
head(my_drivers)
## name long_name
## PCIDSK PCIDSK PCIDSK Database File
## netCDF netCDF Network Common Data Format
## JP2OpenJPEG JP2OpenJPEG JPEG-2000 driver based on OpenJPEG library
## PDF PDF Geospatial PDF
## ESRI Shapefile ESRI Shapefile ESRI Shapefile
## MapInfo File MapInfo File MapInfo File
## write copy is_raster is_vector
## PCIDSK TRUE FALSE TRUE TRUE
## netCDF TRUE TRUE TRUE TRUE
## JP2OpenJPEG FALSE TRUE TRUE TRUE
## PDF TRUE TRUE TRUE TRUE
## ESRI Shapefile TRUE FALSE FALSE TRUE
## MapInfo File TRUE FALSE FALSE TRUE
GDAL/OGR can read data from databases — so sf
does too:
# Using GDAL
meuse <- st_read("PG:dbname=postgis", "meuse")
# Using the Postgres package
library(RPostgreSQL)
# Connect to a PostGIS instance
conn <- dbConnect(PostgreSQL(), dbname = "postgis")
# Use this conection to make a query
x <- st_read_db(conn, "meuse", query = "SELECT * FROM meuse LIMIT 3;")
x <- st_read_db(conn, table = "public.meuse")
print(st_crs(x)) # SRID resolved by the database, not by GDAL!
# Close the DB connection
dbDisconnect(conn)
# Write spatial data as SQLite database
sf::write_sf(nc, 'nc.sqlite')
Very roughly, the sf
package provides two types of functions:
st_
:## [1] "st_agr<-.sf" "st_agr.sf" "st_as_sf.sf"
## [4] "st_bbox.sf" "st_boundary.sf" "st_buffer.sf"
## [7] "st_cast.sf" "st_centroid.sf" "st_convex_hull.sf"
## [10] "st_coordinates.sf" "st_crs<-.sf" "st_crs.sf"
## [13] "st_difference.sf" "st_geometry<-.sf" "st_geometry.sf"
## [16] "st_intersection.sf" "st_is.sf" "st_line_merge.sf"
## [19] "st_make_valid.sf" "st_polygonize.sf" "st_precision.sf"
## [22] "st_segmentize.sf" "st_set_precision.sf" "st_simplify.sf"
## [25] "st_split.sf" "st_sym_difference.sf" "st_transform.sf"
## [28] "st_triangulate.sf" "st_union.sf" "st_voronoi.sf"
## [31] "st_zm.sf"
dplyr
verbs addapted for the sf
data:## [1] "aggregate.sf" "cbind.sf"
## [3] "coerce,oldClass,S3-method" "coerce,sf,Spatial-method"
## [5] "coerce,Spatial,sf-method" "initialize,oldClass-method"
## [7] "merge.sf" "plot.sf"
## [9] "print.sf" "rbind.sf"
## [11] "[.sf" "[[<-.sf"
## [13] "$<-.sf" "show,oldClass-method"
## [15] "slotsFromS3,oldClass-method"
tidy
verbssf
is very well integrated with dplyr
and the tidyverse
: remember that Simple Features as implemented by sf
are fundamentally just data.frame
s!
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Example: joining non-spatial attributes
#
additional_data <- data.frame(
CNTY_ID = nc$CNTY_ID,
my_data <- runif(nrow(nc))
)
left_join(nc, additional_data, by = 'CNTY_ID')
## Simple feature collection with 100 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## epsg (SRID): 4267
## proj4string: +proj=longlat +datum=NAD27 +no_defs
## First 20 features:
## AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74
## 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091
## 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487
## 3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188
## 4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508
## 5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421
## 6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452
## 7 0.062 1.547 1834 1834 Camden 37029 37029 15 286
## 8 0.091 1.284 1835 1835 Gates 37073 37073 37 420
## 9 0.118 1.421 1836 1836 Warren 37185 37185 93 968
## 10 0.124 1.428 1837 1837 Stokes 37169 37169 85 1612
## 11 0.114 1.352 1838 1838 Caswell 37033 37033 17 1035
## 12 0.153 1.616 1839 1839 Rockingham 37157 37157 79 4449
## 13 0.143 1.663 1840 1840 Granville 37077 37077 39 1671
## 14 0.109 1.325 1841 1841 Person 37145 37145 73 1556
## 15 0.072 1.085 1842 1842 Vance 37181 37181 91 2180
## 16 0.190 2.204 1846 1846 Halifax 37083 37083 42 3608
## 17 0.053 1.171 1848 1848 Pasquotank 37139 37139 70 1638
## 18 0.199 1.984 1874 1874 Wilkes 37193 37193 97 3146
## 19 0.081 1.288 1880 1880 Watauga 37189 37189 95 1323
## 20 0.063 1.000 1881 1881 Perquimans 37143 37143 72 484
## SID74 NWBIR74 BIR79 SID79 NWBIR79 my_data....runif.nrow.nc..
## 1 1 10 1364 0 19 0.008945796
## 2 0 10 542 3 12 0.293739612
## 3 5 208 3616 6 260 0.277374958
## 4 1 123 830 2 145 0.813574215
## 5 9 1066 1606 3 1197 0.260427771
## 6 7 954 1838 5 1237 0.724405893
## 7 0 115 350 2 139 0.906092151
## 8 0 254 594 2 371 0.949040221
## 9 4 748 1190 2 844 0.073144469
## 10 1 160 2038 5 176 0.754675027
## 11 2 550 1253 2 597 0.286000621
## 12 16 1243 5386 5 1369 0.100053522
## 13 4 930 2074 4 1058 0.954068775
## 14 4 613 1790 4 650 0.415607119
## 15 4 1179 2753 6 1492 0.455102418
## 16 18 2365 4463 17 2980 0.971055656
## 17 3 622 2275 4 933 0.583987980
## 18 4 200 3725 7 222 0.962204624
## 19 1 17 1775 1 33 0.761702403
## 20 1 230 676 0 310 0.714508535
## geometry
## 1 MULTIPOLYGON(((-81.47275543...
## 2 MULTIPOLYGON(((-81.23989105...
## 3 MULTIPOLYGON(((-80.45634460...
## 4 MULTIPOLYGON(((-76.00897216...
## 5 MULTIPOLYGON(((-77.21766662...
## 6 MULTIPOLYGON(((-76.74506378...
## 7 MULTIPOLYGON(((-76.00897216...
## 8 MULTIPOLYGON(((-76.56250762...
## 9 MULTIPOLYGON(((-78.30876159...
## 10 MULTIPOLYGON(((-80.02567291...
## 11 MULTIPOLYGON(((-79.53050994...
## 12 MULTIPOLYGON(((-79.53050994...
## 13 MULTIPOLYGON(((-78.74912261...
## 14 MULTIPOLYGON(((-78.80680084...
## 15 MULTIPOLYGON(((-78.49252319...
## 16 MULTIPOLYGON(((-77.33220672...
## 17 MULTIPOLYGON(((-76.29892730...
## 18 MULTIPOLYGON(((-81.02056884...
## 19 MULTIPOLYGON(((-81.80622100...
## 20 MULTIPOLYGON(((-76.48052978...
sf
is also providing verbs that are adapted to working with geometries
. For example summarise
will also operate a spatial union:
nc %>%
mutate(
group = sample(LETTERS[1:3], size = n(), replace = TRUE),
area = st_area(.), # Note the computation of area on the fly
sqrt_area = sqrt(area)
) %>%
group_by(group) %>%
summarise(mean_area = mean(sqrt_area)) %>%
plot
## Warning in Math.units(area): Operation sqrt not meaningful for units
The main functions are st_crs
, which retrieves the Coordinate Reference System (CRS) of the sf
object, and st_transform
, which allows to re-project the dataset into a different CRS.
The CRS can be specified as an EPSG number (that is: an integer), or as a character string follwoing the Proj.4 standard.
# Retrieve projection information
st_crs(nc)
## $epsg
## [1] 4267
##
## $proj4string
## [1] "+proj=longlat +datum=NAD27 +no_defs"
##
## attr(,"class")
## [1] "crs"
# Test whether data is projected or not
st_is_longlat(nc)
## [1] TRUE
# Projection
#
# Using a Proj.4 string
nc_p <- st_transform(nc, crs = "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")
# using an EPSG code
nc_p <- st_transform(nc, crs = 32119)
st_crs(nc_p)
## $epsg
## [1] 32119
##
## $proj4string
## [1] "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs"
##
## attr(,"class")
## [1] "crs"
The sf
library wraps the capabilities of the GEOS (Geometry Engine Open Source) library — which is a very powerful piece of software used throughout the GIS industry, notably in PostGIS. The geometry functions in sf
are prefixed using st_
to mimic the PostGIS counterpart functions.
# Area of features
st_area(nc)
## Units: m^2
## [1] 1137388604 611077263 1423489919 694546292 1520740530 967727952
## [7] 615942210 903650119 1179347051 1232769242 1136287383 1524655447
## [13] 1427072339 1085967384 718190417 1894045426 524418608 1986966652
## [19] 812293946 626337436 439717946 640703090 863300187 1276536764
## [25] 1084141138 1697937846 1109973986 1800646340 1036414044 770546970
## [31] 1423171283 585232978 1311628320 1225109692 800263351 1186420419
## [37] 2194592880 1179139746 1550312884 690606074 665152821 1457875905
## [43] 1340538971 1005743617 988324117 1163892506 2019778925 1810512144
## [49] 944425951 1350102629 1685154840 1068192109 1691487743 2082117118
## [55] 1447094476 943881740 2045540782 1420919020 707671880 653368537
## [61] 1471077596 1436135571 1550981080 1186028306 788535203 1265443511
## [67] 1829439189 1447873644 918216585 1312723474 1043728456 961860121
## [73] 781910957 1046066043 986730350 917752685 601326678 1321916259
## [79] 2437929994 833538942 1210326313 1738550168 1228689799 1648412173
## [85] 1400582685 995107225 1678077407 2071844302 1228274216 519198288
## [91] 1784932020 808600209 1978619669 2439553215 1264050755 2288682896
## [97] 2181174167 2450241815 430695499 2165843695
# Distances between features
st_distance(nc_p[1:3,], nc_p[3:6,])
## Units: m
## [,1] [,2] [,3] [,4]
## [1,] 25652.53 440561.1 299772.2 361529.5
## [2,] 0.00 409428.5 268944.3 332589.8
## [3,] 0.00 367555.5 227017.5 290297.1
# Sanity check on the geometries
st_is_valid(nc[1:3,])
## [1] TRUE TRUE TRUE
The commands st_intersects
, st_disjoint
, st_touches
, st_crosses
, st_within
, st_contains
, st_overlaps
, st_equals
, st_covers
, st_covered_by
, st_equals_exact
and st_is_within_distance
are logical tests. They return a sparse matrix with matching (TRUE) indexes, or a full logical matrix:
# Intersects
st_intersects(nc_p[1:5,], nc_p[1:4,])
## [[1]]
## [1] 1 2
##
## [[2]]
## [1] 1 2 3
##
## [[3]]
## [1] 2 3
##
## [[4]]
## [1] 4
##
## [[5]]
## integer(0)
st_intersects(nc_p[1:5,], nc_p[1:4,], sparse = FALSE)
## [,1] [,2] [,3] [,4]
## [1,] TRUE TRUE FALSE FALSE
## [2,] TRUE TRUE TRUE FALSE
## [3,] FALSE TRUE TRUE FALSE
## [4,] FALSE FALSE FALSE TRUE
## [5,] FALSE FALSE FALSE FALSE
The commands st_buffer
, st_boundary
, st_convexhull
, st_union_cascaded
, st_simplify
, st_triangulate
, st_polygonize
, st_centroid
, st_segmentize
, and st_union
return new geometries, e.g.:
buf <- st_buffer(nc_p[c(1, 2, 14),], dist = 30000)
plot(buf, border = 'red')
## Warning: plotting the first 10 out of 14 attributes; use max.plot = 14 to
## plot all
The st_buffer
function computes a buffer outwards or inwards from features. Bonus trick: if you have invalid fatures, try to make a buffer of size zero
fix <- st_buffer(corrupted, dist = 0)
Commands st_intersection
, st_union
, st_difference
, st_sym_difference
return new geometries that are a function of pairs of geometries:
# Intersection
i <- st_intersection(nc_p[1:5,], nc_p[1:4,])
## Warning: attribute variables are assumed to be spatially constant
## throughout all geometries
plot(i)
## Warning: plotting the first 9 out of 28 attributes; use max.plot = 28 to
## plot all
# Union
u <- st_union(nc)
plot(u)
sp
classesNote that you can convert sf
objects back to either data.frame
or one of the old sp
classes very easily:
# Convert to data.frame
as.data.frame(nc) %>% head
## AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74
## 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091
## 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487
## 3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188
## 4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508
## 5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421
## 6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452
## SID74 NWBIR74 BIR79 SID79 NWBIR79 geometry
## 1 1 10 1364 0 19 MULTIPOLYGON(((-81.47275543...
## 2 0 10 542 3 12 MULTIPOLYGON(((-81.23989105...
## 3 5 208 3616 6 260 MULTIPOLYGON(((-80.45634460...
## 4 1 123 830 2 145 MULTIPOLYGON(((-76.00897216...
## 5 9 1066 1606 3 1197 MULTIPOLYGON(((-77.21766662...
## 6 7 954 1838 5 1237 MULTIPOLYGON(((-76.74506378...
# convert to one of the sp classes
as(nc, "Spatial") %>% summary
## Length Class Mode
## 100 SpatialPolygonsDataFrame S4
ggplot2
compatibilityIn the upcoming version of ggplot2
(note that I’m running the development version), sf
object can be plotted very easily using geom_sf
— so the incredible power of this plotting sytem can be takn advantage of:
library(ggplot2)
packageVersion('ggplot2')
## [1] '2.2.1.9000'
# Unprojected data
ggplot(nc) +
geom_sf(aes(fill = AREA))
# Facetting (on projected data this time!)
nc_grouped <- nc_p %>%
mutate(
# Creating a (random) group attribute
group = sample(LETTERS[1:3], size = n(), replace = TRUE)
)
ggplot(nc_grouped) +
geom_sf(aes(fill = sqrt(AREA))) +
facet_wrap(~group, ncol = 2)
# Going full fancy
p <- nc_grouped %>%
select(-group) %>%
ggplot() +
geom_sf(colour = "grey70")
p +
geom_sf(data = nc_grouped, aes(fill = sqrt(AREA))) +
facet_wrap(~group, ncol = 2) +
theme_bw()