\documentclass[twoside,11pt]{article} % \VignetteIndexEntry{Reading shapefiles into R} \SweaveOpts{eps=TRUE} <>= options(SweaveHooks=list(fig=function() par(mar=c(1,1,1,1)))) @ \usepackage[latin1]{inputenc} \usepackage{graphicx} \usepackage[colorlinks=true,urlcolor=blue]{hyperref} \usepackage{color} \usepackage{anysize} \marginsize{2cm}{3cm}{2cm}{3cm} \newcommand{\pkg}[1]{\texttt{#1}} \newcommand{\bold}[1]{{\textbf {#1}}} \newcommand{\R}{{\sf R}} \begin{document} %\bibliographystyle{plain} \thispagestyle{empty} <>= library(spatstat.geom) options(useFancyQuotes=FALSE) sdate <- read.dcf(file = system.file("DESCRIPTION", package = "overlapptest"), fields = "Date") sversion <- read.dcf(file = system.file("DESCRIPTION", package = "overlapptest"), fields = "Version") @ \title{Reading shapefiles into \R{} for use with the \texttt{overlapptest} package} \author{Marcelino de la Cruz} \date{ \Sexpr{sdate} \\ \pkg{overlapptest} version \texttt{\Sexpr{sversion}} } \maketitle This vignette explains how to read shapefile data into \R{} for use the \pkg{overlapptest} package. It is based on the vignette about handling shape files in the package \texttt{spatstat.geom} (Baddeley et al. 2015). This vignette is part of the documentation included in \pkg{overlapptest} version \texttt{\Sexpr{sversion}}. The information applies to \pkg{overlapptest} versions \texttt{1.3-0} and above. \section{Shapefiles} As explained by Baddeley et al. (2015), a shapefile represents a list of spatial objects --- a list of points, a list of lines, or a list of polygonal regions --- and each object in the list may have additional variables attached to it. The \pkg{overlapptest} package deals only with polygonal objects. A dataset stored in shapefile format is actually stored in a collection of text files, for example \begin{verbatim} mydata.shp mydata.prj mydata.sbn mydata.dbf \end{verbatim} which all have the same base name \texttt{mydata} but different file extensions. To refer to this collection you will always use the filename with the extension \texttt{shp}, for example \texttt{mydata.shp}. \section{Helper packages} \label{S:helpers} With earlier versions of \pkg{overlapptest} we used packages \pkg{rgdal} and \pkg{maptools} to respectively read and manipulate files with the shapefile format. As these packages will be retired by the end of 2023, from version \texttt{1.3-0} we will use \pkg{spatstat.geom} and \pkg{sf}. The \pkg{sf} package contains functions for reading shapefiles. The \pkg{spatstat.geom} package contains functions for handling polygons (as \texttt{owin} objects) which allow computing areas, centroids, intersections and rotations. \section{How to read shapefiles into \R{} for use with \pkg{overlapptest}} To read shapefile data into \R{}, for use with \pkg{overlapptest}, two steps must be followed: \begin{enumerate} \item using the facilities of \pkg{sf}, read the shapefiles as a sf object. \item convert the sf object into a multi-polygonal owin object supported by \pkg{spatstat.geom}. \end{enumerate} \subsection{Read shapefiles using \pkg{sf}} Here's how to read shapefile data. \begin{enumerate} \item ensure that the package \pkg{sf} is installed. \item start R and load the packages: <<>>= library(sf) @ \item read the shapefile into \texttt{R} using \texttt{st\_read}, for example <>= x <- st_read("mydata.shp") @ \item To find out what kind of spatial objects are represented by the dataset, inspect its class: <>= class(x) @ For applications of the \pkg{overlapptest} package, the class should typicaly be a \texttt{sf}. \end{enumerate} For example, to read in the shapefile data supplied in the \pkg{overlapptest} package, we just should set the working directory to the folder where there are the sapefile data and use the \texttt{st\_read()} function of \pkg{sf}. <>= dire<-getwd() @ <<>>= setwd(system.file("shapes", package="overlapptest")) Androsace <- st_read("Androsace.shp", quiet =TRUE) @ <<>>= class(Androsace) @ <>= setwd(dire) @ \subsection{Convert data to \pkg{spatstat.geom} format} The \pkg{spatstat.geom} must be loaded in order to convert the data. <<>>= library(spatstat.geom) @ In addition, for applications of the \pkg{overlapptest} package, it is fundamental to avoid the automatic correction of polygons implemented in \pkg{spatstat.geom} which, by default, will try to "repare" overlapping pieces (it would also try repairing polygon self-intersections, so the geometry of the shapefiles should be reliable). For this, just type, <<>>= spatstat.geom::spatstat.options(fixpolygons=FALSE) @ There are different ways to convert the dataset to an object in the \pkg{spatstat.geom} package, as explained in the corresponding vignette in \pkg{spatstat}. For applications of the \pkg{overlapptest}, the most convenient way is combining all the polygonal elements of the same type (ussually present in a unique shapefile) into a single "polygonal region", and convert this to a single object of class \texttt{owin}. To do this, use \texttt{as.owin(x)}, but with the argument \texttt{"check\_polygon"} set to \texttt{"FALSE"}, to avoid errors if some polygons are traversed in the wrong direction. The result is a single window (object of class \texttt{"owin"}) in the \pkg{spatstat.geom} package. In our example, <<>>= Androsace <- as.owin(Androsace, check_polygons=FALSE) @ \subsection{Checking the reliability of the \texttt{owin} object} For this, we should load the \pkg{overlapptest} package. <>= library(overlapptest) @ The function \texttt{check.ventana()} will check that the vertices of all polygons are listed anticlockwise (to ensure that \pkg{spatstat.geom} considers them as "solid" polygons and not "holes", something necessary to be able to compute intersections among them). If it finds some clockwise listed vertices, it would try to reorder them, and will return the corrected \texttt{owin} object. If it succeded, the order number of the corrected polygon(s) would be listed as the attribute \texttt{corrected} of the \texttt{owin} object. If it would not, the order number of the wrong polygon(s) would be listed as the attribute \texttt{not.corrected}. These polygons should be corrected manually before using the other functions in the \pkg{overlapptest} package. <>= check.ventana <- function (ventana) { malos1 <- NULL malos2 <- NULL for (i in 1:length(ventana$bdry)) { xy <- list(x = ventana$bdry[[i]]$x, y= ventana$bdry[[i]]$y) test <- try(owin(poly=xy), silent=TRUE) if(inherits (test, "try-error")){ malos1 <- c(malos1, i) test2 <- try(owin(poly = lapply(xy, rev), silent = TRUE)) if(!inherits (test2, "try-error")){ ventana$bdry[[i]] <- owin(poly = lapply(xy, rev))$bdry } else{ malos2 <- c(malos2, i) ventana$bdry[[i]]$area <- NULL ventana$bdry[[i]]$hole <- NULL } } } attr(ventana, "corrected") <- malos1 attr(ventana, "not.corrected") <- malos2 if (!is.null(malos1)) cat(length(malos1), "problematic polygon(s) detected \n \n") if (!is.null(malos1) & is.null(malos2)) cat("all problematic polygons have been repared\n \n") if (!is.null(malos2)) cat(length(malos2), "problematic polygon(s) could not been repared\n \n") return(ventana) } @ <<>>= Androsace <- check.ventana(Androsace) @ In this case the message warns that 1 polygon has been corrected so no manual correction is necessary. To re-check this, we can examine the attributes of the \texttt{owin} object. <>= options(width=80) @ <<>>= attributes (Androsace) @ In case that running \texttt{check.ventana()} would not produce any warnings that would mean that all the polygons were correct. Once the polygons have been checked, the \texttt{owin} object could be used with the other functions in the \pkg{overlapptest} package. \end{document}