Working with Raster Dataset in R
We begin with answering the questions. And the possible reason to reach the goal is to define questions like;
- what is a raster dataset?
- What tools/functions are used to import raster in R?
- How to I work with and plot raster data in R
- How missing or bad data in R are handled with R
Objectives
- Describe the fundamental attributes of a raster dataset
- Explore raster attributtes and metadata
- Import raster dataset into R workspace
- visualize raster object
- Distinguish single versus multi-bands rasters
Introduction to Raster data
This this section introduce you to the fundamental principles, packages and metadata/raster attributes that are needed to work with raster data in R. The section discuss some of the core metadata elements that you need to understand to work with rasters in R, including CRS and resolution. Furthermore, missing and bad data values stored in raster will be explored and techniques to handles these elements will be illustrated.
The book use several packages including the tidyverse ecosystem [@tidyverse]—with popular packages like the ggplot2[@ggplot] and dplyr [@dplyr]. The widely used packages for handling raster and vector data like raster [@raster], sp[@sp], sf [@sf] and rgdal [@rgdal] make core tools in this book. R needs these packages imported into the environment to use their functions, which can easily done with the require()
function.
require(sf)
require(sp)
require(raster)
require(tidyverse)
require(metR)
GeoTiff
A popular public domain raster data format is the GeoTIFF
format. If maximum portability and platform independence is important, this file format may be a good choice.
Explore the raster attribute
One of the common raster file is the *GeoTiff** that embed tags of metadata information bout the raster file. This metadata provide the information of the file and hence help us understand the internal structure of the file. This information can be accessed with the GDALinfo()
function [@rgdal]. Looking at the metadata help us have a glimpse of the file before even the file is imported into the workspace.
rgdal::GDALinfo("e:/GIS/Tanzania spatial data Bank/Lake_Tanganyika_Bathymetry/Lake_Tanganyika_Bathymetry/grid/tanganyika_dbm (2013_10_23 20_44_28 UTC).tif")
Read a GeoTIFF raster data
Once you have a glimpse of the information of the raster—for example the above information show that the tiff contain elevation values and provide the summary statistics of the elevation with minimum value of 0 and maximum value of 1500 with average of 607. It also show the geographical extent with minimum longitude o 29.05769 and maximul latitude of -8.811174. Furthermore, the metadata tell us that the file is projected with World Geodetic System (WGS84) and the cell has a horizontal resolution of 0.000833 degree. Once we know this information, we can read the file with the raster
function of raster package [@raster]
For this example, we use the bathmetry information of Lake Tanganyika found in Africa. It the world’s longest freshwater lake, the second largest largest by volume, and the second deepest lake in the world after lake Baikal in Siberia [@wiki2].
lt.bath = raster("e:/GIS/Tanzania spatial data Bank/Lake_Tanganyika_Bathymetry/Lake_Tanganyika_Bathymetry/grid/tanganyika_dbm (2013_10_23 20_44_28 UTC).tif")
We can summary
function to look at the statistics of the bathmetry of this lake. Looking at the descriptive statistics, we notice that the lake has the depth range from 0 to 1500 m and there is no cell without a value.
lt.bath %>% summary()
There are times when a raster file does not show the summary statistics. When this occurs you can manually calculate the cell values using the setMinMax()
function.
lt.bath %>%
setMinMax()
View Raster Coordinate Reference System
A spatial reference system (SRS) or coordinate referene system (CRS) is a coordinate-based local, regional or global system used to locate geographical entities [@wiki1]. A spatial reference system defines a specific map projection and transofrmation betweeen diffferent spatial reference systems. We can look the embedded CRS in the raster file with teh crs()
function from raster package.
lt.bath %>% crs()
We notice that the raster file is projected in Word Geodetic System of 1984 (WGS84). In summary the projection +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
tell us as follows:
proj=longlat
: the projectionis in longitude and latitude decimal degreesdatum=WGS84
: the datum is WGS84 and it referes to the 0,0 refereence for the coordinate system used in the projectionellps=WGS84
: the ellipsoidd—how the earth’s roundness is calculated for the data is WGS84
Dealing with missing data in raster
Raster data often has NoData
value to represent the absence of data. This is a value assigned to pixels where data is missing or absent. The raster comes with a ract1 contain both the elevation and bathmetry information. But if you want to plot area contour from the coastline offshore, you will need to remove elevation information and this is when you assign the elevation pixel with the NoData
value.
The values that is conventionally used to represent missing data varies by ther raster data type. For example, for floating points rasters, the figure -3.4e+38
is commonly used while for integers a figure -9999
is common. However, when raster are imported,
R assigns these missing cell with NA
.
tz.bath = raster::raster("d:/semba/thesisTypeset/etopo1/wioregio-7753.asc")
tz.bath %>% summary()
wioregio.7753
Min. -6668
1st Qu. -4383
Median -3347
3rd Qu. 310
Max. 4506
NA's 0
Assign projection and Reproject Raster Data in R
Sometimes we encounter raster datasets that do not “line up” when plotted or analyzed. Rasters that don’t line up are most often in different Coordinate Reference Systems (CRS). This section explains how to deal with rasters in different, known CRSs. It will walk though reprojecting rasters in R using the projectRaster()
function in the raster package. We can assess the projection of the two raster data we loaded earlier with the crs()
function from raster package. Let’s begin with the bathmetry raster of Lake Tanganyika
crs(tz.bath)
CRS arguments: NA
We notice that the bathmetry raster of Lake Tanganyika has defined coordinate reference system—WGS84. let us also check the bathmetry data from the coastal water of Tanzania using the same crs()
function.
crs(tz.bath)
CRS arguments: NA
Unfortunately, the bathmetry raster of coastal water of Tanzania lack the coordinate reference systm—this idicate that the projection is not defined yet. fortunate, raster package has projectRaster()
function that allows to reproject raster without defined CRS or reproject a raster from one CRS into another. Since the lt.bath
has the projection, we can use its projection to define the missing coordinate system in tz.bath
raster file. Because we need to define a projection of the missing raster, we simply use the crs()
function to copy the projection of lt.bath
into the tz.bath
as the code block illustrate
crs(tz.bath) = "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +datum=WGS84"
We can check the coordinate of the two files if they are correct
crs(tz.bath)
Since we know that the coastal water of Tanzania lies at zone 37 south, we can simply assign the appropriate projection and then transform the bathmetry from WGS84 to UTM zone 37 south. Since we know the text of the zone, let us define it
tzutm = "+proj=utm +zone=37 +south +datum=WGS84 +units=m +no_defs"
We then use the projection()
function to transform the CRS from WGS84 to UTM Zone 37 south
tz.bath.utm = tz.bath
projection(tz.bath.utm) = tzutm
Then check the files projections. Instead of using the crs()
to assess the type of projection, we use the projection()
function instead.
tz.bath %>%
projection(asText = F)
CRS arguments: NA
tz.bath.utm %>%
projection(asText = F)
CRS arguments:
+proj=utm +zone=37 +south +datum=WGS84 +units=m +no_defs
Raster resolution
Let’s next have a look at the resolution of reprojected tz.bath.utm
and the tz.bath
files.
tz.bath.utm %>% res();tz.bath %>% res()
[1] 0.01666667 0.01666667
[1] 0.01666667 0.01666667
We notice that the horizontal resolution of projected utm tz.bath.utm
file is given in meters of 1859.258 by 1860.036. But the the wgs84 tz.bath
is given in degree of 0.01666667 by 0.01666667. Therefore, depending on how you intend to use the raster in analysis and mapping, you will find yourself resonate between geographical coordianate system (WGS) and Universal Transeverse Mercator (UTM). The former is in degree while the later is in meters.
Raster Calculation
Often times we want to perform calculations on two or more rasters to create a new output raster. For example, if we are interested in mapping the heights of trees across an entire field site, we might want to calculate the difference between the Digital Surface Model (DSM, tops of trees) and the Digital Terrain Model (DTM, ground level). The resulting dataset is referred to as a Canopy Height Model (CHM) and represents the actual height of trees, buildings, etc. with the influence of ground elevation removed.