class: center, middle, inverse, title-slide .title[ # Hands-on Exercise 6: Handling, Processing, Visualising and Analysing Movement Data ] .author[ ### Dr. Kam Tin Seong
Assoc. Professor of Information Systems ] .institute[ ### School of Computing and Information Systems,
Singapore Management University
]
.date[
### 2020-2-15 (updated: 2022-06-14)
]

---

## Content

.large[
In this hands-on exercise, you will learn how to handle, process, visualise and analyse movement data using R.

By the end of this hands-on exercise, you will be able to:

- import geospatial data in *wkt* format into R and save the imported data as **simple feature** objects by using the **sf** package,
- map geospatial data by using the **tmap** package,
- import movement data in *wkt* format into R and save the imported data as **simple feature** objects by using the **sf** package,
- process movement data by using the **sf** and **tidyverse** packages,
- visualise movement data by using the **tmap** and **ggplot2** packages,
- analyse movement data by using R methods.
]

---

## Getting Started

.pull-left[
.large[
In this Hands-on Exercise, the following R packages will be used:

- sf, an R package specially designed to handle geospatial data in simple feature objects.
]]

--

.pull-right[
.large[
Write a code chunk to check, install and launch the R packages needed for this exercise, including **sf**, **tmap** and **tidyverse** (which provides **readr**).

```r
# Packages used in this exercise
packages = c('sf', 'tmap', 'tidyverse',
             'lubridate', 'clock',
             'sftime', 'rmarkdown')
for (p in packages){
  # Install the package only if it cannot be loaded
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  # Attach the package to the current session
  library(p, character.only = TRUE)
}
```
]]

---

## Visualising Geographical Data

### Importing wkt data

.pull-left[
- *Well-known text (WKT)* is a human-readable representation of spatial objects such as points, lines, or enclosed areas on a map. The figure below shows the structure of point, line and polygon data in wkt format.
]

---

## Visualising Geographical Data

### Importing wkt data

In the code chunk below, `read_sf()` of the **sf** package is used to parse *Schools.csv*, *Pubs.csv*, *Apartments.csv*, *Buildings.csv*, *Employers.csv*, and *Restaurants.csv* into R as sf data.frames.
```r
# The WKT geometry is stored in the "location" column of each file
schools <- read_sf("data/wkt/Schools.csv",
                   options = "GEOM_POSSIBLE_NAMES=location")
pubs <- read_sf("data/wkt/Pubs.csv",
                options = "GEOM_POSSIBLE_NAMES=location")
apartments <- read_sf("data/wkt/Apartments.csv",
                      options = "GEOM_POSSIBLE_NAMES=location")
buildings <- read_sf("data/wkt/Buildings.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
employers <- read_sf("data/wkt/Employers.csv",
                     options = "GEOM_POSSIBLE_NAMES=location")
restaurants <- read_sf("data/wkt/Restaurants.csv",
                       options = "GEOM_POSSIBLE_NAMES=location")
```

---

### Structure of a simple point feature data.frame

- After importing the data file into R, it is important for us to review the data object.

```r
print(apartments)
```

```
## Simple feature collection with 1517 features and 5 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -4616.828 ymin: 22.16098 xmax: 2488.067 ymax: 7829.905
## CRS:           NA
## # A tibble: 1,517 × 6
##    apartmentId rentalCost maxOccupancy numberOfRooms             location
##    <chr>       <chr>      <chr>        <chr>                      <POINT>
##  1 1           768.16     2            4              (1077.698 648.4427)
##  2 2           1014.55    2            1             (-185.9293 1520.327)
##  3 3           1057.39    4            3              (2123.014 5126.753)
##  4 4           1259.1     4            3               (2103.63 4266.933)
##  5 5           411.5      1            4              (7.058974 79.96164)
##  6 6           859.58     3            2              (2250.855 5251.337)
##  7 7           982.11     3            4              (486.8811 2251.126)
##  8 8           980.05     4            1              (1233.455 1768.611)
##  9 9           433.45     1            3              (1274.272 1163.505)
## 10 10          1104.33    3            4               (-1697.03 1239.03)
## # … with 1,507 more rows, and 1 more variable: buildingId <chr>
```

---

### Structure of a simple point feature data.frame

Notice that the *rentalCost*, *maxOccupancy* and *numberOfRooms* fields are stored as character (`<chr>`) rather than numeric values, which is not the correct data type.

**DIY:** Convert these three fields into the correct data type by using appropriate tidyverse functions.

---

### Structure of a simple polygon feature data.frame

Now we will print the *buildings* simple feature data.frame.
```r
print(buildings)
```

```
## Simple feature collection with 1042 features and 4 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -4762.191 ymin: -30.08359 xmax: 2650 ymax: 7850.037
## CRS:           NA
## # A tibble: 1,042 × 5
##    buildingId                            location buildingType maxOccupancy units
##    <chr>                                <POLYGON> <chr>        <chr>        <chr>
##  1 1          ((350.0639 4595.666, 390.0633 459… Commercial   ""           ""
##  2 2          ((-1926.973 2725.611, -1948.191 2… Residental   "12"         "[48…
##  3 3          ((685.6846 1552.131, 645.9985 154… Commercial   ""           "[38…
##  4 4          ((-976.7845 4542.382, -1053.288 4… Commercial   ""           ""
##  5 5          ((1259.306 3572.727, 1299.255 357… Residental   "2"          "[23…
##  6 6          ((478.8969 1082.484, 473.6596 113… Commercial   ""           ""
##  7 7          ((-1920.823 615.7447, -1960.818 6… Residental   ""           ""
##  8 8          ((-3302.657 5394.354, -3301.512 5… Commercial   ""           "[13…
##  9 9          ((-600.5789 4429.228, -495.9506 4… Commercial   ""           ""
## 10 10         ((-68.75908 5379.924, -28.78232 5… Residental   "5"          "[10…
## # … with 1,032 more rows
```

---

### Plotting the building footprint map: tmap methods

.pull-left[
The code chunk below plots the building polygon features by using `tm_polygons()`.

```r
tmap_mode("view")
tm_shape(buildings)+
  tm_polygons(col = "grey60",
              border.col = "black",
              border.lwd = 1) +
  tm_basemap(NULL)
tmap_mode("plot")
```

.small[
Things to learn from the code chunk:

- `tmap_mode()` is used to switch the display from static mode (i.e. "plot") to interactive mode (i.e. "view").
- `tm_shape()` is used to create a **tmap-element** that specifies a spatial data object (i.e. buildings).
- `tm_polygons()` is used to create a **tmap-element** that draws polygon features.
- `tm_basemap()` is used to turn off the default basemaps provided by leaflet. By default, three basemaps, namely *Esri.WorldGrayCanvas*, *OpenStreetMap* and *Esri.WorldTopoMap*, are provided.
]]

.pull-right[
]

---

### Plotting a composite map: tmap methods

.pull-left[
The code chunk below is used to plot a composite map by combining the buildings and employers simple feature data.frames.

```r
tmap_mode("plot")
tm_shape(buildings)+
  tm_polygons(col = "grey60",
              size = 1,
              border.col = "black",
              border.lwd = 1) +
tm_shape(employers) +
  tm_dots(col = "red")
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-9-1.png" width="504" />
]

---

### DIY: Plotting a composite map

.pull-left[
The Task: Plot a composite map by combining *buildings*, *apartments*, *employers*, *pubs*, *restaurants*, and *schools*.

```r
tmap_mode("plot")
tm_shape(buildings)+
  tm_polygons(col = "grey60",
              size = 1,
              border.col = "black",
              border.lwd = 1) +
tm_shape(employers) +
  tm_dots(col = "red") +
tm_shape(apartments) +
  tm_dots(col = "lightblue") +
tm_shape(pubs) +
  tm_dots(col = "green") +
tm_shape(restaurants) +
  tm_dots(col = "blue") +
tm_shape(schools) +
  tm_dots(col = "yellow")
tmap_mode("plot")
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-11-1.png" width="504" />
]

---

### Qualitative thematic map by building type

.pull-left[
In the code chunk below, the building footprints are coloured by the *buildingType* field.

```r
tmap_mode("plot")
tm_shape(buildings)+
*tm_polygons(col = "buildingType",
*          palette="Accent",
           border.col = "black",
*          border.alpha = .5,
*          border.lwd = 0.5)
tmap_mode("plot")
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-13-1.png" width="504" />
]

---

### Quantitative thematic map

.pull-left[
The code chunk below plots a proportional symbol map showing the geographical distribution of apartment rental costs in the study area.
```r
tm_shape(apartments)+
  tm_bubbles(col = "rentalCost",
             alpha = 0.8,
             n = 6,
             style = "jenks",
             palette="OrRd",
             size = "numberOfRooms",
             scale = 0.8,
             border.col = "black",
             border.lwd = 0.5)
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-15-1.png" width="504" />
]

---

## Movement Data

.pull-left[
In this section, you will learn how to handle, process, visualise and analyse movement data.

For the purpose of this hands-on exercise, *ParticipantStatusLogs1.csv* will be used.

### Importing wkt data

**DIY:** Using the steps you have learned, import *ParticipantStatusLogs1.csv* and save it as a simple feature data.frame.

```r
# The WKT geometry is stored in the "currentLocation" column
logs <- read_sf("data/wkt/ParticipantStatusLogs1.csv",
                options = "GEOM_POSSIBLE_NAMES=currentLocation")
```
]

--

.pull-right[
Let us examine the structure of the *logs* simple feature data.frame by using `glimpse()`.

```r
glimpse(logs)
```

Notice that `read_sf()` did not parse the *timestamp* field into the correct date-time data type.
]

---

### Processing movement data

.pull-left[
To process the movement data, the following steps will be performed:

- convert the *timestamp* field from character data type to date-time data type by using [`date_time_parse()`](https://clock.r-lib.org/reference/date-time-parse.html) of the clock package.
- derive a *day* field by using [`get_day()`](https://clock.r-lib.org/reference/Date-getters.html) of the clock package.
- extract records whereby the *currentMode* field is equal to the *Transport* class by using `filter()` of the dplyr package.
]

.pull-right[
DIY: Write a code chunk to perform the tasks described on the left.

```r
logs_selected <- logs %>%
  # Parse the character timestamp into a date-time value
  mutate(Timestamp = date_time_parse(timestamp,
                                     zone = "",
                                     format = "%Y-%m-%dT%H:%M:%S")) %>%
  # Derive the day of the month
  mutate(day = get_day(Timestamp)) %>%
  # Keep only the records logged while in transport
  filter(currentMode == "Transport")
```
]

---

### Plotting the moving data as points

.pull-left[
- Use appropriate tmap functions to create a map that looks similar to the figure on the right.
```r
tmap_mode("plot")
tm_shape(buildings)+
  tm_polygons(col = "grey60",
              size = 1,
              border.col = "black",
              border.lwd = 1) +
tm_shape(logs_selected) +
  tm_dots(col = "red")
tmap_mode("plot")
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-22-1.png" width="504" />
]

---

## Hexagon Binning Map

.pull-left[
In this section, you will learn how to create a [hexagon binning map](https://think.design/services/data-visualization-data-design/hexbin/) by using R.
]

---

### Computing the hexagons

.pull-left[
In the code chunk below, `st_make_grid()` of the sf package is used to create the hexagons.

```r
hex <- st_make_grid(buildings,
                    cellsize=100,
                    square=FALSE) %>%
  st_sf() %>%
  rowid_to_column('hex_id')
plot(hex)
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-24-1.png" width="504" />
]

---

### Performing point in polygon overlay

.pull-left[
The code chunk below performs a point in polygon overlay by using `st_join()` of the sf package.

```r
points_in_hex <- st_join(logs_selected,
                         hex,
                         join=st_within)
#plot(points_in_hex, pch='.')
```
]

---

### Performing point in polygon count

.pull-left[
In the code chunk below, `st_join()` of the sf package and `count()` of the dplyr package are used to count the number of event points in each hexagon.

```r
points_in_hex <- st_join(logs_selected,
                         hex,
                         join=st_within) %>%
  st_set_geometry(NULL) %>%
  count(name='pointCount', hex_id)
head(points_in_hex)
```

```
## # A tibble: 6 × 2
##   hex_id pointCount
##    <int>      <int>
## 1    169         35
## 2    212         56
## 3    225         21
## 4    226         94
## 5    227         22
## 6    228         45
```
]

---

### Performing relational join

.pull-left[
In the code chunk below, `left_join()` of the dplyr package is used to perform a left join by using *hex* as the target table and *points_in_hex* as the join table. The join ID is *hex_id*.
```r
hex_combined <- hex %>%
  left_join(points_in_hex,
            by = 'hex_id') %>%
  # Hexagons without any event point get a count of 0
  replace(is.na(.), 0)
```
]

---

### Plotting the hexagon binning map

.pull-left[
In the code chunk below, the tmap package is used to create the hexagon binning map.

```r
tm_shape(hex_combined %>%
           filter(pointCount > 0))+
  tm_fill("pointCount",
          n = 10,
          style = "quantile") +
  tm_borders(alpha = 0.1)
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-29-1.png" width="504" />
]

---

### Mapping outliers

.pull-left[
The code chunk below sorts the hexagons by the values of the *pointCount* field.

```r
Q <- hex_combined %>%
  st_drop_geometry() %>%
  filter(pointCount > 0) %>%
  arrange(desc(pointCount))
```

Then, `quantile()` is used to determine the cut-off values at the 25th and 75th percentiles.

```r
quantile(Q$pointCount,
         probs=c(.25, .75),
         na.rm = FALSE)
```

```
##    25%    75%
##  24.75 215.75
```
]

.pull-right[
In the code chunk below, `IQR()` is used to calculate the interquartile range.

```r
iqr <- IQR(Q$pointCount)
```

Then, the upper outlier threshold can be computed using the code chunk below.

```r
# 215.75 is the 75th percentile reported by quantile()
upperOutlier <- 215.75+1.5*iqr
```
]

---

### Mapping outliers

.pull-left[
The code chunk below is used to create the outlier hexagon binning map.

```r
tm_shape(buildings) +
  tm_polygons() +
tm_shape(hex_combined %>%
           filter(pointCount > upperOutlier))+
  tm_fill(col = "red") +
  tm_borders(alpha = 0.1)
```
]

.pull-right[
The hexagons coloured in red are locations with a relatively higher intensity of traffic.

<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-35-1.png" width="504" />
]

---

## Plotting Movement Path using R

.pull-left[
In this section, you will learn how to plot movement paths using R.
]

---

### Creating movement path from event points

.pull-left[
The code chunk below joins the event points into movement paths, grouping them by participant ID and day.
```r
logs_path <- logs_selected %>%
  group_by(participantId, day) %>%
  summarize(m = mean(Timestamp),
            do_union=FALSE) %>%
  st_cast("LINESTRING")
```
]

.pull-right[
```r
print(logs_path)
```

```
## Simple feature collection with 5781 features and 3 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: -4616.828 ymin: 35.4377 xmax: 2630 ymax: 7836.546
## CRS:           NA
## # A tibble: 5,781 × 4
## # Groups:   participantId [1,011]
##    participantId   day m                                       currentLocation
##    <chr>         <int> <dttm>                                     <LINESTRING>
##  1 0                 1 2022-03-01 13:34:23 (-2721.353 6862.861, -2689.275 6644.…
##  2 0                 2 2022-03-02 14:19:50 (-2721.353 6862.861, -2689.275 6644.…
##  3 0                 3 2022-03-03 13:39:13 (-2721.353 6862.861, -2689.275 6644.…
##  4 0                 4 2022-03-04 13:38:11 (-2721.353 6862.861, -2689.275 6644.…
##  5 0                 5 2022-03-05 13:08:02 (-2721.353 6862.861, -2689.275 6644.…
##  6 0                 6 2022-03-06 06:28:00 (-2721.353 6862.861, -2689.275 6644.…
##  7 1                 1 2022-03-01 18:07:24 (-1531.133 5597.244, -1863.284 5825.…
##  8 1                 2 2022-03-02 16:57:05 (-2619.036 5860.49, -2200.679 5828.4…
##  9 1                 3 2022-03-03 14:13:40 (-260.4575 5026.151, -352.5088 5294.…
## 10 1                 4 2022-03-04 14:31:45 (-3903.194 5967.837, -3655.902 5869.…
## # … with 5,771 more rows
```
]

---

### Plotting the Movement Paths

.pull-left[
Write a code chunk to overplot the GPS path of participant ID = 0 onto the background building footprint map.

```r
logs_path_selected <- logs_path %>%
  filter(participantId==0)
tmap_mode("plot")
tm_shape(buildings)+
  tm_polygons(col = "grey60",
              size = 1,
              border.col = "black",
              border.lwd = 1) +
tm_shape(logs_path_selected) +
  tm_lines(col = "blue")
tmap_mode("plot")
```
]

.pull-right[
<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-39-1.png" width="504" />
]

---

## Geographic Data Visualisation: ggplot2 methods

In this section, you will learn how to use ggplot2 to visualise geographic data.
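Before working with the exercise data, the idea can be tried on a few hand-written shapes. The chunk below is only a minimal sketch and is not part of the original exercise: the three footprints and the names `toy`, `toy_sf` and `p` are hypothetical. It assumes the `wkt` argument of `st_as_sf()` is used to parse the WKT column, and then draws the polygons with `geom_sf()`.

```r
library(sf)
library(ggplot2)

# Hypothetical toy footprints written as WKT strings (not the exercise data)
toy <- data.frame(
  buildingId = c("A", "B", "C"),
  buildingType = c("Commercial", "Residential", "School"),
  location = c(
    "POLYGON ((0 0, 40 0, 40 30, 0 30, 0 0))",
    "POLYGON ((60 10, 100 10, 100 50, 60 50, 60 10))",
    "POLYGON ((20 50, 50 50, 50 90, 20 90, 20 50))"
  )
)

# Parse the WKT column into a geometry column; the CRS is left undefined (NA)
toy_sf <- st_as_sf(toy, wkt = "location")

# Draw the polygons, filled by building type
p <- ggplot(toy_sf) +
  geom_sf(aes(fill = buildingType), colour = "black") +
  coord_sf() +
  theme_bw()
print(p)
```

Because the toy data has no coordinate reference system, the axes simply show the raw x and y values, as with the exercise data.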

---

### Import the buildings shapefile into R

```r
buildings_shp <- st_read(dsn = "data/geospatial",
                         layer = "buildings")
```

```
## Reading layer `buildings' from data source
##   `D:\tskam\ISSS608\Hands-on_Ex\Hands-on_Ex07\data\geospatial'
##   using driver `ESRI Shapefile'
## Simple feature collection with 1042 features and 5 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -4762.191 ymin: -30.08359 xmax: 2650 ymax: 7850.037
## CRS:           NA
```

### Plotting buildings

```r
ggplot(buildings_shp) +
  geom_sf() +
  coord_sf()
```

<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-41-1.png" width="504" />

```r
ggplot(buildings_shp) +
  geom_sf(aes(fill = region),
          color = "black",
          size = 0.1,
          show.legend = TRUE) +
  coord_sf() +
  theme_bw() +
  labs(title="Geographical region of the study area")
```

<img src="Hands-on_Ex07-MovementVis_files/figure-html/unnamed-chunk-42-1.png" width="504" />

---

## References

### sf Package

+ sf package [main page](https://r-spatial.github.io/sf/index.html)
+ [1. Simple Features for R](https://r-spatial.github.io/sf/articles/sf1.html)
+ [2. Reading, Writing and Converting Simple Features](https://r-spatial.github.io/sf/articles/sf2.html)
+ [3. Manipulating Simple Feature Geometries](https://r-spatial.github.io/sf/articles/sf3.html)
+ [4. Manipulating Simple Features](https://r-spatial.github.io/sf/articles/sf4.html)

### tmap

+ tmap package [main page](https://r-tmap.github.io/tmap/index.html)
+ [tmap: get started!](https://r-tmap.github.io/tmap/articles/tmap-getstarted.html)
+ [tmap: JSS article reproduction code](https://r-tmap.github.io/tmap/articles/tmap-JSS-code.html)

### Hexagon binning map

+ [Hexagonal Binning – a new method of visualization for data analysis](https://www.meccanismocomplesso.org/en/hexagonal-binning-a-new-method-of-visualization-for-data-analysis/)
+ [Hexbin](https://think.design/services/data-visualization-data-design/hexbin/)