--- title: "Get started" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Get started} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r include=FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>", out.width = "100%", dpi = 96, fig.align = "center") ``` The aim of the package `chessboard` is to provide tools to work with **directed** (asymmetric) and **undirected** (symmetrical) spatial (or non-spatial) **networks**. It implements different methods to detect neighbors, all based on the chess game (it goes beyond the rook and the queen available in many R packages) to create complex connectivity scenarios. `chessboard` aims to easily create various network objects, including: - **node list**, i.e. a list of all nodes of the network; - **edge list**, i.e. a list of all edges (links) between pairs of nodes; - **connectivity matrix**, i.e. a binary matrix of dimensions `n x n`, where `n` is the number of **nodes** (sampling units) indicating the presence (`1`) or the absence (`0`) of an **edge** (link) between pairs of nodes. ```{r 'setup', echo=TRUE} # Setup ---- library("chessboard") library("ggplot2") library("patchwork") ``` ```{r 'ggplot-theme', echo=FALSE} ## Custom ggplot2 theme ---- custom_theme <- function() { theme_light() + theme(plot.title = element_text(face = "bold", family = "serif", size = 18), plot.caption = element_text(face = "italic", family = "serif"), axis.title = element_blank(), axis.text = element_text(family = "serif")) } ``` \ ## Network properties `chessboard` can handle spatial networks, but it does not explicitly use geographical coordinates to find neighbors (it is not based on spatial distance). Instead, it identifies neighbors according to **node labels** (i.e. the node position on a two-dimension chessboard) and a specific method (pawn, fool, rook, bishop, knight, queen, wizard, etc.) derived from the chess game. \ ```{r 'cb-network', eval=TRUE, fig.height=8, fig.width=5, echo=FALSE, fig.cap="Figure 1. Network as a chessboard", out.width='60%'} sites <- expand.grid("transect" = 1:3, "quadrat" = 1:5) nodes <- create_node_labels(data = sites, transect = "transect", quadrat = "quadrat") gg_chessboard(nodes) ``` \ The package `chessboard` is designed to work with two-dimensional networks (i.e. sampling on a regular grid), where one dimension is called **transect** and the other is called **quadrat**. By convention, the dimension **transect** corresponds to the x-axis, and the **quadrat** corresponds to the y-axis (Fig. 1). `chessboard` can also deal with one-dimensional network (either **transect-only** or **quadrat-only**). The network can be undirected or directed. If the network is **directed**, it will have (by default) these two orientations: - from bottom to top (along transect) for quadrats - from left to right (along quadrat) for transects \ ## Neighbors detection `chessboard` implements the following rules to detect neighbors and to create edges: - the **degree** of neighborhood: the number of adjacent nodes that will be used to create direct edges. - the **orientation** of neighborhood: can neighbors be detected horizontally, vertically and/or diagonally? - the **direction** of neighborhood: does the sampling has a main direction? This can be particularly relevant for directed networks (e.g. rivers). \ ## Workflow The Figure 2 shows the general workflow and the main features of `chessboard`. \ ```{r echo = FALSE, out.width = "100%", fig.cap = "Figure 2. Workflow and main features of `chessboard`", fig.align = 'center'} knitr::include_graphics("figures/diagramme.png") ``` \ ### Data The package `chessboard` comes with a real-world example: a survey sampling along the French river _L'Adour_ (Fig. 3). _L'Adour_ is a river in southwestern France. It rises in the Pyrenees and flows into the Atlantic Ocean (Bay of Biscay). It's oriented from south-east (upstream) to north-west (downstream). \ ```{r 'map-adour-river', echo=FALSE, fig.height=9, fig.width=10, out.width='80%', fig.cap='Figure 3. Location of the French river L\'Adour'} knitr::include_graphics("figures/map-adour-river.png") ``` \ Along this river, a survey has been realized at **three** locations (Fig. 4). At each location, a sampling has been conducted on a regular grid composed of **three** transects each of them composed of **five** quadrats. \ ```{r 'import-adour-river', echo=FALSE} ## Import the spatial layer of Adour river ---- path_to_file <- system.file("extdata", "adour_lambert93.gpkg", package = "chessboard") adour_river <- sf::st_read(path_to_file, quiet = TRUE) ``` ```{r 'import-adour-sites', echo=FALSE} ## Import sites data ---- path_to_file <- system.file("extdata", "adour_survey_sampling.csv", package = "chessboard") nodes <- read.csv(path_to_file) ## Convert data.frame to sf object ---- nodes_sf <- sf::st_as_sf(nodes, coords = c("longitude", "latitude"), crs = "epsg:2154") ``` ```{r 'map-adour-sites', fig.height=9, fig.width=12, out.width='80%', echo=FALSE, fig.cap='Figure 4. Survey sampling along the river L\'Adour'} ggplot() + geom_sf(data = adour_river, col = "steelblue") + geom_sf(data = nodes_sf, shape = 19, size = 2) + labs(caption = "RGF93 / Lambert-93 Projection") + custom_theme() + geom_segment(aes(x = 454180, xend = 440170, y = 6216290, yend = 6263320), arrow = arrow(length = unit(0.75, 'cm'), type = 'closed'), linewidth = 2.25) + geom_text(aes(x = 334500, y = 6285000), label = "River", hjust = 0, color = "steelblue", fontface = "bold", size = 6, family = "serif") + geom_text(aes(x = 414950, y = 6312200), label = "Location 3", hjust = -0.20, color = "black", fontface = "bold", size = 6, family = "serif") + geom_text(aes(x = 474655, y = 6236708), label = "Location 1", color = "black", fontface = "bold", size = 6, family = "serif") + geom_text(aes(x = 467250, y = 6287620), label = "Location 2", color = "black", fontface = "bold", size = 6, family = "serif") ``` The arrow in Fig. 4 indicates the direction of the river flow. This means that our sampling design is a **directed spatial network** (both inside a location and between locations) where the main direction is from upstream to downstream. \ Let's import this dataset provided by `chessboard`. ```{r 'import-data', echo=TRUE} # Import data ---- path_to_file <- system.file("extdata", "adour_survey_sampling.csv", package = "chessboard") sampling <- read.csv(path_to_file) dim(sampling) ``` ```{r 'head-of-data', echo=TRUE} # First rows ---- head(sampling, 10) ``` ```{r 'tail-of-data', echo=TRUE} # Last rows ---- tail(sampling, 10) ``` This `data.frame` contains the following columns: - `location`: the identifier of the location (`numeric`) - `transect`: the identifier of the transect (`numeric`) - `quadrat`: the identifier of the quadrat (`numeric`) - `longitude`: the longitude of the site (**node**) defined in the [RGF93 / Lambert-93](https://epsg.io/2154) projection - `latitude`: the latitude of the site (**node**) defined in the [RGF93 / Lambert-93](https://epsg.io/2154) projection **N.B.** The column `location` is optional if the survey has been conducted at one single location. If the network has one dimension, one of the columns `transect` and `quadrat` can be omitted. If the survey is not spatial, the columns `longitude` and `latitude` can be omitted. \ ### Node labels When working with `chessboard`, the **first step** is to create node labels with the function `create_node_labels()`. \ But first, let's reduce the size of data by selecting the first location. ```{r 'select-data'} # Select the first location ---- sampling <- sampling[sampling$"location" == 1, ] dim(sampling) ``` \ Let's create node labels with the function `create_node_labels()`. ```{r 'create-nodes-labels'} # Create node labels ---- nodes <- create_node_labels(data = sampling, location = "location", transect = "transect", quadrat = "quadrat") nodes ``` Node labels are a combination of the transect and the quadrat identifiers. They must be **unique**. \ We can visualize this sampling on a Cartesian referential, i.e. non-spatial, where the x-axis corresponds to **transects** and the y-axis represents the **quadrats** (Fig. 5). This new referential is called a chessboard. ```{r 'plot-sampling-units', fig.height=8, fig.width=5, echo=TRUE, out.width='60%', fig.cap='Figure 5. Sampling survey as a chessboard'} # Visualize chessboard ---- gg_chessboard(nodes) ``` \ The function `get_node_list()` can be used to extract and order the node list. ```{r 'get-node-labels'} # Extract node labels ---- get_node_list(nodes) ``` \ ### Edge list The creation of a list of edges (links) between nodes (sampling units) is based on the detection of neighbors. In `chessboard` different methods have been implemented to define neighborhood (argument `method` of the function `create_edge_list()`). These methods are named `'pawn'`, `'rook'`, `'bishop'`, `'queen'`, etc. A complete list of available methods is available at: [https://frbcesab.github.io/chessboard/reference/index.html#detect-neighbors](https://frbcesab.github.io/chessboard/reference/index.html#detect-neighbors) \ Before using the function `create_edge_list()`, users can explore these different methods by calling the functions `pawn()`, `rook()`, `bishop()`, `queen()`, etc. These functions only work on a specific node (argument `focus`). \ Let's take a look of the neighborhood method `pawn()`. ```{r 'method-pawn'} # Explore pawn method to find neighbors ---- neighbors_pawn <- pawn(nodes = nodes, focus = "2-3", degree = 1, directed = FALSE, reverse = FALSE) neighbors_pawn ``` \ The package `chessboard` contains functions to visualize detected neighbors on a chessboard: `gg_chessboard()` is used to plot a chessboard (dimensions defined by the node list), `geom_node()` emphasizes the focus node (in red), and `geom_neighbors()` adds the detected neighbors (dots in black). \ ```{r 'nb-pawn', fig.height=8, fig.width=5, echo=TRUE, fig.cap="Figure 6. Detected neighbors (pawn method)", out.width='50%'} gg_chessboard(nodes) + geom_node(nodes, focus = "2-3") + geom_neighbors(nodes, neighbors_pawn) ``` \ The function `pawn()` can detect neighbors vertically, i.e. among quadrats along a transect. User can change the default settings by increasing the degree of neighborhood (`degree = 4`, Fig. 7A), by adding directionality (`directed = TRUE`, Fig. 7B), and by reversing the default directionality (`directed = TRUE` and `reverse = TRUE`, Fig. 7C). ```{r 'cb-pawn', fig.height=4.3, fig.width=12, echo=FALSE, fig.cap="Figure 7. Pawn movements", out.width='100%'} demo_sites <- expand.grid("transect" = 1:9, "quadrat" = 1:9) demo_nodes <- create_node_labels(data = demo_sites, transect = "transect", quadrat = "quadrat") demo_focus <- "5-5" pawn_1 <- gg_chessboard(demo_nodes, "A. Undirected network", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, pawn(demo_nodes, demo_focus, degree = 4, directed = FALSE, reverse = FALSE)) pawn_2 <- gg_chessboard(demo_nodes, "B. Directed network", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, pawn(demo_nodes, demo_focus, degree = 4, directed = TRUE, reverse = FALSE)) pawn_3 <- gg_chessboard(demo_nodes, "C. Directed network (reverse)", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, pawn(demo_nodes, demo_focus, degree = 4, directed = TRUE, reverse = TRUE)) (pawn_1 | pawn_2 | pawn_3) ``` \ Let's take another example. The function `bishop()` can detect neighbors diagonally. User can change the default settings by increasing the degree of neighborhood (`degree = 4`, Fig. 8A), by adding directionality (`directed = TRUE`, Fig. 8B), and by reversing the default directionality (`directed = TRUE` and `reverse = TRUE`, Fig. 8C). ```{r 'cb-bishop', eval=TRUE, fig.height=4.3, fig.width=12, echo=FALSE, fig.cap="Figure 8. Bishop movements", out.width='100%'} demo_sites <- expand.grid("transect" = 1:9, "quadrat" = 1:9) demo_nodes <- create_node_labels(data = demo_sites, transect = "transect", quadrat = "quadrat") demo_focus <- "5-5" bishop_1 <- gg_chessboard(demo_nodes, "A. Undirected network", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, bishop(demo_nodes, demo_focus, degree = 4, directed = FALSE, reverse = FALSE)) bishop_2 <- gg_chessboard(demo_nodes, "B. Directed network", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, bishop(demo_nodes, demo_focus, degree = 4, directed = TRUE, reverse = FALSE)) bishop_3 <- gg_chessboard(demo_nodes, "C. Directed network (reverse)", "") + geom_node(demo_nodes, demo_focus) + geom_neighbors(demo_nodes, bishop(demo_nodes, demo_focus, degree = 4, directed = TRUE, reverse = TRUE)) (bishop_1 | bishop_2 | bishop_3) ``` \ The vignette [Chess pieces](https://frbcesab.github.io/chessboard/articles/chess-pieces.html) shows all possible methods available in `chessboard`. \ Now, let's use the function `create_edge_list()` to create an edge list using the method `'pawn'` with a degree `1` of neighborhood and in a directional way. ```{r 'create-edges-list-pawn'} # Create edge list ---- edges_pawn <- create_edge_list(nodes = nodes, method = "pawn", degree = 1, directed = TRUE, reverse = FALSE, self = FALSE) edges_pawn ``` \ It's possible to visualize these edges on a map, i.e. by using spatial coordinates. First, we need to convert our sites into a spatial object (`POINT`). ```{r 'df-to-sf'} # Convert nodes to sf object ---- nodes_sf <- sf::st_as_sf(nodes, coords = c("longitude", "latitude"), crs = "epsg:2154") head(nodes_sf) ``` \ Now we can use the function `edges_to_sf()` to convert our edge list into a spatial object (`LINESTRING`). ```{r 'edges-list-to-sf'} # Convert edge list to sf ---- edges_pawn_sf <- edges_to_sf(edges = edges_pawn, sites = nodes_sf) edges_pawn_sf ``` \ We can now map our nodes and edges. ```{r 'map-edges-list-pawn', fig.height=8, fig.width=6.5, echo=TRUE, fig.cap="Figure 9. Edge list (pawn method)", out.width='80%'} # Map of nodes and edges ---- ggplot(nodes_sf) + geom_sf(size = 12) + geom_sf(data = edges_pawn_sf) + theme_light() ``` \ Users may want to combine different methods to detect neighbors to build complex scenarios. It's possible by using for each method the function `create_edge_list()` and using the function `append_edge_lists()` to merge all edges in a single list. ```{r 'create-edges-list-bishop'} # Create edge list (Bishop method) ---- edges_bishop <- create_edge_list(nodes = nodes, method = "bishop", degree = 1, directed = TRUE, reverse = FALSE, self = FALSE) edges_bishop # Merge Pawn and Bishop edges ---- edges <- append_edge_lists(edges_pawn, edges_bishop) # Convert edges to spatial layer ---- edges_sf <- edges_to_sf(edges, nodes_sf) ``` \ ```{r 'map-edges-list-pawn-bishop', fig.height=8, fig.width=6.5, echo=TRUE, fig.cap="Figure 10. Edges list (combined methods)", out.width='80%'} # Map of nodes and edges ---- ggplot(nodes_sf) + geom_sf(size = 12) + geom_sf(data = edges_sf) + theme_light() ``` \ ### Connectivity matrix From this edge list, we can build a **connectivity matrix**, i.e. a binary matrix of dimensions `n x n`, where `n` is the number of nodes (sampling units) indicating the presence (`1`) or the absence (`0`) of an edge (link) between pairs of nodes. \ We can use the function `connectivity_matrix()` of the package `chessboard`. ```{r 'connectivity-matrix'} # Create connectivity matrix ---- conn_matrix <- connectivity_matrix(edges) conn_matrix ``` \ The package `chessboard` provides a function to visualize this matrix: `gg_matrix()`. ```{r 'plot-connectivity-matrix', fig.height=8, fig.width=8, fig.cap="Figure 11. Connectivity matrix", out.width='80%'} # Visualize connectivity matrix ---- gg_matrix(conn_matrix) ``` \ Optionally, we can use the function `matrix_to_edge_list()` to convert back the connectivity matrix to edge list. ```{r 'connectivity-matrix-to-df'} # Convert connectivity matrix to edge list ---- matrix_to_edge_list(conn_matrix) ``` \ ## Extending chessboard `chessboard` has been built to be compatible with the following R packages: [`igraph`](https://r.igraph.org/) (Csardi & Nepusz 2006), [`sf`](https://r-spatial.github.io/sf/) (Pebesma 2018), and [`ggplot2`](https://ggplot2.tidyverse.org) (Wickham 2016). \ ### with `sf` As seen before, the edge list can be converted into an spatial object with the function `edges_to_sf()`. The output is an `sf` `LINESTRING` that can be handled by many functions of the package `sf`. For instance, let's project the coordinate system. ```{r 'transform-crs'} # Convert edges to spatial layer ---- edges_sf <- edges_to_sf(edges, nodes_sf) # Project the CRS ---- edges_sf_lonlat <- sf::st_transform(edges_sf, crs = "epsg:4326") # Check ---- edges_sf edges_sf_lonlat ``` User can also export this spatial layer. ```{r 'export-sf', eval=FALSE} # Export layer as a GeoPackage ---- sf::st_write(edges_sf, "edge_list.gpkg") ``` For more information about the package `sf`, please visit the [manual](https://cran.r-project.org/package=sf). \ ### with `ggplot2` All plotting functions in `chessboard` are produced with the `ggplot2` engine and are highly customizable. For instance, let's change the default theme. ```{r 'change-theme', fig.height=8, fig.width=8, fig.cap="Figure 12. Custom connectivity matrix", out.width='80%'} # Change default ggplot2 theme ---- gg_matrix(conn_matrix) + theme_bw() + theme(legend.position = "none") ``` For more information about the package `ggplot2`, please visit the [manual](https://cran.r-project.org/package=ggplot2). \ ### with `igraph` The package `igraph` is commonly use to analyze network data. User can use the function `igraph::graph_from_data_frame()` to convert the edge list created with `chessboard` to an `igraph` object. ```{r 'to-igraph'} # Convert edge list to igraph object ---- igraph_obj <- igraph::graph_from_data_frame(d = edges, directed = TRUE, vertices = nodes) # Check ----- class(igraph_obj) print(igraph_obj) ``` Let's plot our network using `igraph`. ```{r 'plot-igraph', fig.height=8, fig.width=8, fig.cap="Figure 13. Network visualization w/ `igraph`", out.width='80%'} # Plot the network w/ igraph ---- plot(igraph_obj) ``` For more information about the package `igraph`, please visit the [manual](https://cran.r-project.org/package=igraph). \ ## References Bivand R & Wong D (2018) Comparing implementations of global and local indicators of spatial association. **TEST**, 27, 716–748. . Csardi G & Nepusz T (2006) The igraph software package for complex network research. **InterJournal, Complex Systems**, 1695, 1–9. . Pebesma E (2018) Simple Features for R: Standardized support for spatial vector data. **The R Journal**, 10, 439–446. . Wickham H (2016) _ggplot2: Elegant graphics for data analysis_ (p. 213). Springer-Verlag. .