--- title: "Tour de trackeR" author: "[Hannah Frick](https://www.frick.ws) and [Ioannis Kosmidis](https://ikosmidis.com)" date: "`r Sys.Date()`" output: rmarkdown::html_vignette bibliography: trackeR.bib vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{Tour de trackeR} %\VignetteEncoding{UTF-8} --- The **trackeR** package provides infrastructure for handling running and cycling data from GPS-enabled tracking devices. A short demonstration of its functionality is provided below, based on data from running activities. A more comprehensive introduction to the package can be found in the vignette "Infrastructure for Running and Cycling Data", which can be accessed by typing ```{r, vignette, eval = FALSE} vignette("trackeR", package = "trackeR") ``` ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 7 ) ``` ## Reading data **trackeR** can currently import files in the Training Centre XML (TCX) format and .db3 files (SQLite databases, used, for example, by devices from GPSports) through the corresponding functions *readTCX*() and *readDB3*(). It also offers support for JSON files from [Golden Cheetah](http://www.goldencheetah.org/) via *readJSON*(). ```{r, runDF, message = FALSE} library("trackeR") filepath <- system.file("extdata/tcx/", "2013-06-01-183220.TCX.gz", package = "trackeR") runDF <- readTCX(file = filepath, timezone = "GMT") ``` These read functions return a `data.frame` of the following structure ```{r, str_runDF} str(runDF) ``` That `data.frame` can be used as an input to the constructor function for **trackeR**'s *trackeRdata* class, to produce a session-based and unit-aware object that can be used for further analyses. ```{r, runTr0} runTr0 <- trackeRdata(runDF) ``` The *read_container*() function combines the two steps of importing the data and constructing the *trackeRdata* object. ```{r, runTr1} runTr1 <- read_container(filepath, type = "tcx", timezone = "GMT") identical(runTr0, runTr1) ``` The *read_directory*() function can be used to read `all` supported files in a directory and produce the corresponding *trackeRdata* objects. ## Visualisations The package includes an example data set which can be accessed through ```{r, runs} data("runs", package = "trackeR") ``` The default behaviour of the *plot* method for *trackeRdata* objects is to show how heart rate and pace evolve over the session. ```{r, plot_runs, fig.width = 7, fig.height = 5} plot(runs, session = 1:7) ``` The elevation profile of a training session is also accessible, here along with the pace. ```{r, plot_runs_2, fig.width = 7, fig.height = 5} plot(runs, session = 8, what = c("altitude", "pace")) ``` The route taken during a training session can also be plotted on maps from various sources e.g., from Google or OpenStreetMap. This can be done either on a static map ```{r, plotRoute, message = FALSE, warning = FALSE, fig.width = 7, fig.height = 7} tryCatch(plot_route(runs, session = 1, source = "stamen"), error = function(x) "Failed to donwload map data") ``` or on an interactive map. ```{r, leafletRoute, fig.width = 7, fig.height = 7} tryCatch(leaflet_route(runs, session = c(1, 6, 12)), error = function(x) "Failed to donwload map data") ``` ## Session summaries The summary of sessions includes basic statistics like duration, time spent moving, average speed, pace, and heart rate. The speed threshold used to distinguish moving from resting can be set by the argument *moving_threshold*. ```{r, summary} summary(runs, session = 1, moving_threshold = c(cycling = 2, running = 1, swimming = 0.5)) ``` It is usually desirable to visualise summaries from multiple sessions. This can be done using the *plot* method for summary objects. Below, we produce such a plot for average heart rate, average speed, distance, and duration. ```{r, plot_summary, fig.width = 7, fig.height = 7} runs_summary <- summary(runs) plot(runs_summary, group = c("total", "moving"), what = c("avgSpeed", "distance", "duration", "avgHeartRate")) ``` The timeline plot is useful to visualise the date and time that the sessions took place and provide information of their relative duration. ```{r, timeline, fig.width = 7, fig.height = 7} timeline(runs_summary) ``` ## Time in zones The time spent training in certain zones, e.g., speed zones, can also be calculated and visualised. ```{r, plot_zones, fig.width = 7, fig.height = 5} run_zones <- zones(runs[1:4], what = "speed", breaks = c(0, 2:6, 12.5)) plot(run_zones) ``` ## Quantifying work capacity via W' (W prime) *trackeR* can also be used to calculate and visualise the work capacity W' (pronounced as `W prime`). The comprehensive vignette "Infrastructure for Running and Cycling Data" provides the definition of work capacity and details on the *version* and *quantity* arguments. ```{r, plot_Wprime, fig.width = 7, fig.height = 5} wexp <- Wprime(runs, session = 11, quantity = "expended", cp = 4, version = "2012") plot(wexp, scaled = TRUE) ``` ## Distribution and concentration profiles @trackeR:Kosmidis+Passfield:2015 introduce the concept of distribution and concentration profiles for which **trackeR** provides an implementation. These profiles are motivated by the need to compare sessions and use information on such variables as heart rate or speed during a session for further modelling. The distribution profile for a variable such as speed or heart rate describes the time exercising above a (speed or heart rate) threshold. Here, the distribution profiles for the first 4 sessions are calculated for speed with thresholds ranging from 0 to 12.5 m/s in increments of 0.05 m/s. ```{r, distProfile, fig.width = 7, fig.height = 5} d_profile <- distribution_profile(runs, session = 1:4, what = "speed", grid = list(speed = seq(0, 12.5, by = 0.05))) plot(d_profile, multiple = TRUE) ``` Sessions 4 and 1 are longer than session 2 and 3, as visible by the higher amount of time spent exercising above 0 m/s. Sessions 3 and 4 show a larger amount of time spent exercising above 4 m/s than the other sessions. This is easier to spot in the concentration profiles which are the negative derivative of the distribution profiles. The concentration profile for session 3 has a mode at around 3.5 meters per second and another one above 4 meters per second, showing that this session involved training at a combination of low and high speeds. ```{r, conProfile, fig.width = 7, fig.height = 5} c_profile <- concentrationProfile(d_profile, what = "speed") plot(c_profile, multiple = TRUE, smooth = TRUE) ``` More details on distribution and concentration profiles can be found in the comprehensive vignette "Infrastructure for Running and Cycling Data". ## Functional principal components analysis The distribution and concentration can be used for further analysis such as a functional principal components analysis (PCA) to describe the differences between the profiles. The concentration profiles for all session ```{r, prep_profiles, fig.width = 7, fig.height = 5} runsT <- threshold(runs) dp_runs <- distribution_profile(runsT, what = "speed") dp_runs_S <- smoother(dp_runs) cp_runs <- concentration_profile(dp_runs_S) plot(cp_runs, multiple = TRUE, smooth = FALSE) ``` vary in their shape (unimodal or multimodal), height, and location (revealing concentrations at higher or lower speeds). The function *funPCA*() can be used to fit a functional PCA, here with 4 principal components. ```{r, funPCA} cpPCA <- funPCA(cp_runs, what = "speed", nharm = 4) ``` For the speed concentration profiles here, the first two components cover `r round(sum(cpPCA$varprop[1:2]*100))`% of the variability in the profiles. ```{r, plot_fPCA, fig.width = 7, fig.height = 7} round(cpPCA$varprop, 2) plot(cpPCA, harm = 1:2) ``` A plot of the first two principal components reveals that the profiles here differ mostly in the general height of the curves and the location. These two aspects of the profiles can be captured by two univariate measures of the sessions, the time spent moving and the average speed moving. ```{r, plot_scores, fig.show = 'hold', fig.width = 7, fig.height = 5} ## plot scores vs summary statistics scoresSP <- data.frame(cpPCA$scores) names(scoresSP) <- paste0("speed_pc", 1:4) d <- cbind(runs_summary, scoresSP) library("ggplot2") ## pc1 ~ session duration (moving) ggplot(d) + geom_point(aes(x = as.numeric(durationMoving), y = speed_pc1)) + theme_bw() ## pc2 ~ avg speed (moving) ggplot(d) + geom_point(aes(x = avgSpeedMoving, y = speed_pc2)) + theme_bw() ``` ## References