Help for package phytoclass

Title:

Estimate Chla Concentrations of Phytoplankton Groups

Version:

2.0.0

Description:

Determine the chlorophyll a (Chl a) concentrations of different phytoplankton groups based on their pigment biomarkers. The method uses non-negative matrix factorisation and simulated annealing to minimise error between the observed and estimated values of pigment concentrations (Hayward et al. (2023) <doi:10.1002/lom3.10541>). The approach is similar to the widely used 'CHEMTAX' program (Mackey et al. 1996) <doi:10.3354/meps144265>, but is more straightforward, accurate, and not reliant on initial guesses for the pigment to Chl a ratios for phytoplankton groups.

Imports:

bestNormalize, dplyr, dynamicTreeCut, ggplot2, Metrics, RcppML, stats, tidyr

License:

MIT + file LICENSE

Encoding:

UTF-8

RoxygenNote:

7.3.1

Depends:

R (≥ 3.8)

LazyData:

true

Suggests:

knitr, rmarkdown, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

URL:

https://github.com/phytoclass/phytoclass/

BugReports:

https://github.com/phytoclass/phytoclass/issues/

Config/testthat/edition:

NeedsCompilation:

Packaged:

2024-11-14 08:36:26 UTC; algh

Author:

Alexander Hayward [aut, cre, cph], Tylar Murray [aut], Andy McKenzie [aut]

Maintainer:

Alexander Hayward <phytoclass@outlook.com>

Repository:

CRAN

Date/Publication:

2024-11-14 08:50:02 UTC

Add weights to the data, bound at a maximum.

Description

Add weights to the data, bound at a maximum.

Usage

Bounded_weights(S, weight.upper.bound = 30)

Arguments

S

Sample data matrix – a matrix of pigment samples

weight.upper.bound

Upper bound for weights (default is 30)

Value

A vector of weights, each capped at the upper bound

Examples

Bounded_weights(Sm, weight.upper.bound = 30)
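
As a rough illustration of what bounding weights at a maximum means, the sketch below derives one weight per pigment column as the inverse of the column mean and caps it at the upper bound. The inverse-mean rule, the helper name bounded_weights_sketch, and the toy values are assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical helper (not the package function): inverse column means,
# capped at an upper bound.
bounded_weights_sketch <- function(S, weight.upper.bound = 30) {
  w <- 1 / colMeans(S)            # assumed rule: scarce pigments get larger weights
  pmin(w, weight.upper.bound)     # cap every weight at the bound
}

# Toy sample matrix: 3 samples x 3 pigments (made-up values)
S_toy <- matrix(
  c(0.01, 0.50, 1.0,
    0.02, 0.40, 1.2,
    0.03, 0.60, 0.8),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("Per", "Fuco", "Tchla"))
)
w_toy <- bounded_weights_sketch(S_toy, weight.upper.bound = 30)
```

In this toy case the weight for Per would be 50 without the cap, so it is truncated to 30, while the other two weights are left as they are.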

Cluster the samples in the S (sample) matrix

Description

Cluster the samples in the S (sample) matrix

Usage

Cluster(Data, min_cluster_size)

Arguments

Data

S (sample) matrix

min_cluster_size

the minimum size required for a cluster

Value

A named list of length two. The first element, "cluster.list", is a list of clusters, and the second element, "cluster.plot", is the cluster analysis object (dendrogram) that can be plotted.

Examples

Cluster.result <- Cluster(Sm, 14)
Cluster.result$cluster.list
plot(Cluster.result$cluster.plot)
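
To show the general shape of the result (a list of clusters plus a dendrogram object), here is a minimal sketch using base R hierarchical clustering with a fixed tree cut. This is a swapped-in technique for illustration only: it does not reproduce how Cluster() transforms the data or chooses clusters, and the toy data are made up.

```r
# Swapped-in technique: base-R hierarchical clustering with a fixed tree cut.
set.seed(1)
# Toy data: two well-separated groups of 10 samples, 3 pigments each
toy <- rbind(
  matrix(rnorm(30, mean = 0, sd = 0.1), ncol = 3),
  matrix(rnorm(30, mean = 5, sd = 0.1), ncol = 3)
)
tree <- hclust(dist(toy), method = "ward.D2")    # dendrogram object
groups <- cutree(tree, k = 2)                    # cluster label per sample
cluster_list <- split(as.data.frame(toy), groups)
# plot(tree) would draw the dendrogram
```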

Calculate the condition number ...

Description

Calculate the condition number ...

Usage

Condition_test(S, Fn, min.val = NULL, max.val = NULL)

Arguments

S

Fn

min.val

max.val
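
For readers unfamiliar with the term, the sketch below computes the 2-norm condition number of a small made-up ratio matrix with base R. It only illustrates the quantity itself; how Condition_test() uses min.val and max.val is not documented here, so nothing about that is assumed.

```r
# Toy ratio matrix (rows = groups, columns = pigments); values are made up.
F_toy <- matrix(
  c(1.0, 0.0, 1,
    0.9, 0.1, 1),
  nrow = 2, byrow = TRUE
)
# kappa(exact = TRUE) returns the 2-norm condition number:
# the ratio of the largest to the smallest singular value.
cond_toy <- kappa(F_toy, exact = TRUE)
sv <- svd(F_toy)$d
```

A large condition number indicates that the rows are close to collinear, which makes the groups harder to tell apart numerically.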

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Description

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Usage

Conduit_1(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Description

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Usage

Conduit_2(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Description

Conduit between minimise_elements function and Fac_F_R of steepest descent algorithm.

Usage

Conduit_3(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

Sets the default minimum and maximum values for the pigment ratios of phytoplankton groups. To use this function, pigment and phytoplankton group names must follow the naming conventions of phytoclass.

Description

Sets the default minimum and maximum values for the pigment ratios of phytoplankton groups. To use this function, pigment and phytoplankton group names must follow the naming conventions of phytoclass.

Usage

Default_min_max(min_max, Fmat, place)

Arguments

min_max

Fmat

place

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Description

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Usage

Fac_F_RR1(Fmat, vary, S, cm)

Arguments

Fmat

vary

S

cm

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Description

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Usage

Fac_F_RR2(Fmat, vary, place, S, cm)

Arguments

Fmat

vary

place

S

cm

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Description

Part of the steepest descent algorithm; works to reduce error given the S and F matrices.

Usage

Fac_F_RR3(Fmat, vary, place, S, cm)

Arguments

Fmat

vary

place

S

cm

Fm data

Description

Fm data

Usage

Fm

Format

`Fm`

A data frame with 9 rows and 15 columns:

chl_c1: XX
Per: XX
X19but: XX

...

Source

Fp data

Description

Fp data

Usage

Fp

Format

`Fp`

A data frame with 9 rows and 15 columns:

chl_c1: XX
Per: XX
X19but: XX

...

Source

Remove any column values that average 0. Further to this, also remove phytoplankton groups from the F matrix if their diagnostic pigment isn’t present.

Description

Remove any column values that average 0. Further to this, also remove phytoplankton groups from the F matrix if their diagnostic pigment isn’t present.

Usage

Matrix_checks(S, Fmat)

Arguments

S

Sample data matrix – a matrix of pigment samples

Fmat

Pigment to Chl a matrix

Value

Named list with new S and Fmat matrices

Examples

MC <- Matrix_checks(Sm, Fm)  
Snew <- MC$Snew
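
The two checks described above can be sketched on toy matrices with base R. The rule used here for deciding that a group has lost its diagnostic pigment (no pigment other than Tchla remains for it) is an assumption for illustration, as are all names and values.

```r
# Toy sample matrix with one pigment (Per) that is never detected
S_toy <- cbind(Per = c(0, 0, 0), Fuco = c(0.4, 0.5, 0.6), Tchla = c(1, 1.2, 0.8))
# Toy ratio matrix: rows are groups, columns are pigments (1 = pigment present)
F_toy <- rbind(
  Dinoflagellates = c(Per = 1, Fuco = 0, Tchla = 1),
  Diatoms         = c(Per = 0, Fuco = 1, Tchla = 1)
)
keep_pig <- colMeans(S_toy) > 0                  # drop pigments averaging 0
S_new <- S_toy[, keep_pig, drop = FALSE]
F_new <- F_toy[, keep_pig, drop = FALSE]
# Assumed rule: drop a group when no pigment other than Tchla remains for it
diag_cols <- setdiff(colnames(F_new), "Tchla")
keep_grp <- rowSums(F_new[, diag_cols, drop = FALSE]) > 0
F_new <- F_new[keep_grp, , drop = FALSE]
```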

Part of the steepest descent algorithm

Description

Part of the steepest descent algorithm

Usage

Minimise_elements(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

A function that reduces every element that didn't reduce in the index function

Description

A function that reduces every element that didn't reduce in the index function

Usage

Minimise_elements1(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

A function that reduces every element that didn't reduce in the index function

Description

A function that reduces every element that didn't reduce in the index function

Usage

Minimise_elements2(Fmat, place, S, cm)

Arguments

Fmat

place

S

cm

Performs the non-negative matrix factorisation for given phytoplankton pigments and pigment ratios, to attain an estimate of phytoplankton class abundances.

Description

Performs the non-negative matrix factorisation for given phytoplankton pigments and pigment ratios, to attain an estimate of phytoplankton class abundances.

Usage

NNLS_MF(Fn, S, cm = NULL)

Arguments

Fn

Pigment to Chl a matrix

S

Sample data matrix – a matrix of pigment samples

cm

Weights for each column

Value

A list containing

The F matrix (pigment: Chl a) ratios
The root mean square error (RMSE)
The C matrix (class abundances for each group)

Examples

MC <- Matrix_checks(Sm,Fm)
Snew <- MC$Snew
Fnew <- MC$Fnew
cm <- Bounded_weights(Snew, weight.upper.bound = 30)
NNLS_MF(Fnew, Snew, cm)
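
The idea of estimating non-negative class abundances C such that S is approximately C times F, with F held fixed, can be sketched in base R. The multiplicative update below is a swapped-in, generic technique chosen because it needs no extra packages; it is not the solver used by NNLS_MF(), and the toy matrices are made up.

```r
# Swapped-in technique: multiplicative updates for C in S ~ C %*% F, F held fixed.
F_toy <- rbind(c(0.5, 0.0, 1),    # group 1 pigment:Chl a ratios (made up)
               c(0.0, 0.8, 1))    # group 2
C_true <- rbind(c(2, 1), c(0.5, 3), c(1, 1))
S_toy <- C_true %*% F_toy         # noise-free toy samples

C_est <- matrix(1, nrow = nrow(S_toy), ncol = nrow(F_toy))  # non-negative start
for (i in 1:2000) {
  # Each update keeps C_est non-negative and does not increase the squared error
  C_est <- C_est * (S_toy %*% t(F_toy)) / (C_est %*% F_toy %*% t(F_toy) + 1e-12)
}
rmse <- sqrt(mean((S_toy - C_est %*% F_toy)^2))
```

Because the toy samples are noise-free, C_est should end up close to C_true and the RMSE close to zero.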

Perform matrix factorisation for phytoplankton pigments and pigment ratios

Description

Performs the non-negative matrix factorisation for given phytoplankton pigments and pigment ratios, to attain an estimate of phytoplankton class abundances.

Usage

NNLS_MF_Final(Fn, S, S_Chl, cm)

Arguments

Fn

S

S_Chl

cm

Details

Unlike NNLS_MF(), it also removes any weighting and normalisation, and multiplies relative abundances by chlorophyll values to determine the biomass of phytoplankton groups.

This function normalises each column in F to row sum

Description

This function normalises each column in F to row sum

Usage

Normalise_F(Fmat)

Arguments

Fmat

Value

A matrix
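
One plausible reading of normalising to the row sum is that every row of F is divided by its own sum. The sketch below shows that reading on a made-up matrix; whether Normalise_F() does exactly this is an assumption.

```r
# Toy F matrix: rows are groups, columns are pigments (made-up ratios)
F_toy <- rbind(
  Diatoms     = c(Fuco = 0.6, Chl_c3 = 0.2, Tchla = 1),
  Haptophytes = c(Fuco = 0.3, Chl_c3 = 0.1, Tchla = 1)
)
# Divide every row by its own sum so each group's ratios add up to 1
F_norm <- sweep(F_toy, 1, rowSums(F_toy), "/")
```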

This function normalises each column in S to row sum

Description

This function normalises each column in S to row sum

Usage

Normalise_S(S)

Arguments

S

Value

A matrix

Final matrix factorisation step for samples with Prochlorococcus

Description

Final matrix factorisation step for samples with Prochlorococcus

Usage

Prochloro_NNLS_MF_Final(Fn, S, S_Chl, cm, S_dvChl)

Arguments

Fn

S

S_Chl

cm

S_dvChl

Normalise F for prochloro

Description

Normalise F for prochloro

Usage

Prochloro_Normalise_F(Fmat)

Arguments

Fmat

Prochloro random neighbour

Description

Prochloro random neighbour

Usage

Prochloro_Random_Neighbour(
  Fn,
  Temp,
  chlv,
  s_c,
  N,
  place,
  S,
  cm,
  min.val,
  max.val
)

Arguments

Fn

Temp

chlv

s_c

N

place

S

cm

min.val

max.val

Selects a random neighbour for the simulated annealing algorithm.

Description

Selects a random neighbour for the simulated annealing algorithm.

Usage

Prochloro_Random_Neighbour_2(
  Fn,
  Temp,
  chlv,
  s_c,
  place,
  S,
  cm,
  min.val,
  max.val,
  chlvp
)

Arguments

Fn

Temp

chlv

s_c

place

S

cm

min.val

max.val

chlvp

Prochloro Wrangling

Description

Prochloro Wrangling

Usage

Prochloro_Wrangling(Fl, min.val, max.val)

Arguments

Fl

min.val

max.val

Select a random neighbour when the previous random neighbour is beyond the minimum or maximum value.

Description

Select a random neighbour when the previous random neighbour is beyond the minimum or maximum value.

Usage

Random_neighbour(Fn, Temp, chlv, s_c, N, place, S, cm, min.val, max.val)

Arguments

Fn

Temp

chlv

s_c

N

place

S

cm

min.val

max.val

Selects a random neighbour for the simulated annealing algorithm.

Description

Selects a random neighbour for the simulated annealing algorithm.

Usage

Random_neighbour2(Fn, Temp, chlv, s_c, place, S, cm, min.val, max.val)

Arguments

Fn

Temp

chlv

s_c

place

S

cm

min.val

max.val

Randomise individual elements in the F matrix.

Description

Randomise individual elements in the F matrix.

Usage

Randomise_elements(x, min.scaler, max.scaler)

Arguments

x

min.scaler

max.scaler

Value

numeric
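
As a hedged sketch of what randomising a single element could look like, the helper below multiplies a ratio by a random factor drawn between min.scaler and max.scaler. The helper name and the multiplicative rule are assumptions for illustration, not the package implementation.

```r
# Hypothetical helper: perturb one ratio by a random multiplicative factor
randomise_sketch <- function(x, min.scaler, max.scaler) {
  x * runif(1, min = min.scaler, max = max.scaler)
}
set.seed(42)
new_ratio <- randomise_sketch(0.5, min.scaler = 0.9, max.scaler = 1.1)
```

With these scalers the new value stays within 10% of the original ratio.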

Select the new F matrix element with lowest error in the steepest descent algorithm.

Description

Select the new F matrix element with lowest error in the steepest descent algorithm.

Usage

Replace_Rand(Fmat, i, S, cm, min.scaler, max.scaler)

Arguments

Fmat

i

S

cm

min.scaler

max.scaler

Apply the steepest descent algorithm

Description

Apply the steepest descent algorithm

Usage

SAALS(Ft, min.value, max.value, place, S, cm, num.loops)

Arguments

Ft

min.value

max.value

place

S

cm

num.loops

Sm data

Description

Sm data

Usage

Sm

Format

`Sm`

A data frame with 29 rows and 15 columns:

chl_c1: XX
Per: XX
X19but: XX

...

Source

Sp data

Description

Sp data

Usage

Sp

Format

`Sp`

A data frame with 29 rows and 15 columns:

chl_c1: XX
Per: XX
X19but: XX

...

Source

Stand-alone version of the steepest descent algorithm. This is similar to the CHEMTAX steepest descent algorithm. Using this function is not required, and because results are not bounded by minimum and maximum values, they may be unrealistic.

Description

Stand-alone version of the steepest descent algorithm. This is similar to the CHEMTAX steepest descent algorithm. Using this function is not required, and because results are not bounded by minimum and maximum values, they may be unrealistic.

Usage

Steepest_Desc(Fmat, S, num.loops)

Arguments

Fmat

Pigment to Chl a matrix

S

Sample data matrix – a matrix of pigment samples

num.loops

Number of loops/iterations to perform (no default)

Value

A list containing

The F matrix (pigment: Chl a) ratios
RMSE (Root Mean Square Error)
Condition number
class abundances
Figure (plot of results)
MAE (Mean Absolute Error)

Examples

MC <- Matrix_checks(Sm,Fm)
Snew <- MC$Snew
Fnew <- MC$Fnew
SDRes <- Steepest_Desc(Fnew,Snew, num.loops = 20)
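
The two error measures in the returned list follow their standard definitions, which can be written in base R as below. The observed and estimated values are made up, and which quantities the package compares internally is not assumed here.

```r
# Made-up observed and estimated pigment concentrations
observed  <- c(0.10, 0.50, 1.00, 0.30)
estimated <- c(0.12, 0.45, 1.00, 0.33)
rmse <- sqrt(mean((observed - estimated)^2))   # root mean square error
mae  <- mean(abs(observed - estimated))        # mean absolute error
```

RMSE penalises large deviations more heavily than MAE, so it is never smaller than MAE for the same residuals.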

Performs the steepest descent algorithm for a set number of iterations

Description

Performs the steepest descent algorithm for a set number of iterations

Usage

Steepest_Descent(Fmat, place, S, cm, num.loops)

Arguments

Fmat

place

S

cm

num.loops

Apply weights to F/S matrices

Description

Apply weights to F/S matrices

Usage

Weight_error(S, cm)

Arguments

S

cm

Value

A matrix
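
A minimal sketch of column weighting, assuming that each pigment column of S is simply multiplied by its weight in cm. The values are made up and the exact operation inside Weight_error() is not confirmed here.

```r
# Toy sample matrix and one weight per pigment column (made-up values)
S_toy <- cbind(Per = c(0.01, 0.02), Tchla = c(1.0, 1.2))
cm_toy <- c(Per = 30, Tchla = 1)
# Multiply column j of S by weight j
S_weighted <- sweep(S_toy, 2, cm_toy, "*")
```

Weighting in this way lets low-concentration pigments contribute to the error on a scale comparable to Tchla.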

Converts data-types and selects data for randomisation in the simulated annealing algorithm

Description

Converts data-types and selects data for randomisation in the simulated annealing algorithm

Usage

Wrangling(Fl, min.val, max.val)

Arguments

Fl

min.val

max.val

min_max data

Description

min_max data

Usage

min_max

Format

`min_max`

A data frame with 76 rows and 4 columns:

class: XX
Pig_Abbrev: XX
min: XX
max: max

...

Source

Phytoclass - simulated annealing

Description

This is the main phytoclass algorithm. It performs the simulated annealing algorithm on the S and F matrices. See the example data (Fm, Sm) for how to set up the matrices, and the vignette for more detailed instructions. Different pigments and phytoplankton groups may be used.

Usage

simulated_annealing(
  S,
  Fmat = NULL,
  user_defined_min_max = NULL,
  do_matrix_checks = TRUE,
  niter = 500,
  step = 0.009,
  weight.upper.bound = 30,
  verbose = TRUE
)

Arguments

S

Sample data matrix – a matrix of pigment samples

Fmat

Pigment to Chl a matrix

user_defined_min_max

Data frame with the same format as the built-in min_max data

do_matrix_checks

This should only be set to TRUE when using the default values. This will remove pigment columns that have column sums of 0. Set to FALSE if using customised names for pigments and phytoplankton groups

niter

Number of iterations (default is 500)

step

Step ratio used (default is 0.009)

weight.upper.bound

Upper limit of the weights applied (default value is 30).

verbose

Logical value. Output error and temperature at each iteration. Default value of TRUE

Value

A list containing

Fmat matrix
RMSE (Root Mean Square Error)
condition number
Class abundances
Figure (plot of results)
MAE (Mean Absolute Error)
Error

Examples

# Using the built-in matrices Sm and Fm
set.seed(5326)
sa.example <- simulated_annealing(Sm, Fm, niter = 5)
sa.example$Figure

#Using non-default data:
# Set up a new F matrix
Fu <- data.frame(
  Per = c(0, 0, 0, 0, 1, 0, 0, 0),
  X19but = c(0, 0, 0, 0, 0, 1, 1, 0),
  Fuco = c(0, 0, 0, 1, 0, 1, 1, 0),
  Pra = c(1, 0, 0, 0, 0, 0, 0, 0),
  X19hex = c(0, 0, 0, 0, 0, 1, 0, 0),
  Allo = c(0, 0, 1, 0, 0, 0, 0, 0),
  Zea = c(1, 1, 0, 0, 0, 0, 0, 1),
  Chl_b = c(1, 1, 0, 0, 0, 0, 0, 0),
  Tchla = c(1, 1, 1, 1, 1, 1, 1, 1)
)

rownames(Fu) <- c(
  "Prasinophytes", "Chlorophytes", "Cryptophytes"
  , "Diatoms-2", "Dinoflagellates-1",
  "Haptophytes", "Pelagophytes", "Syn"
)

#Set up a new Min_max file
Min_max <- data.frame(
  Class = c(
    "Syn", "Chlorophytes", "Chlorophytes", "Prasinophytes", "Prasinophytes",
    "Prasinophytes", "Cryptophytes", "Diatoms-2", "Diatoms-2", "Pelagophytes",
    "Pelagophytes", "Pelagophytes", "Dinoflagellates-1", "Haptophytes",
    "Haptophytes", "Haptophytes", "Haptophytes", "Diatoms-2", "Cryptophytes",
    "Prasinophytes", "Chlorophytes", "Syn", "Dinoflagellates-1", "Pelagophytes"
  ),
  Pig_Abbrev = c(
    "Zea", "Zea", "Chl_b", "Pra", "Zea", "Chl_b", "Allo", "Chl_c3",
    "Fuco", "Chl_c3", "X19but", "Fuco", "Per", "X19but", "X19hex",
    "Fuco", "Tchla", "Tchla", "Tchla", "Tchla", "Tchla", "Tchla", "Tchla",
    "Tchla"
  ),
  min = as.numeric(c(
    0.0800, 0.0063, 0.1666, 0.0642, 0.0151, 0.4993, 0.2118, 0.0189,
    0.3315, 0.1471, 0.2457, 0.3092, 0.3421, 0.0819, 0.2107, 0.0090,
    1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000
  )),
  max = as.numeric(c(
    1.2123, 0.0722, 0.9254, 0.4369, 0.1396, 0.9072, 0.5479, 0.1840,
    0.9332, 0.2967, 1.0339, 1.2366, 0.8650, 0.2872, 1.3766, 0.4689,
    1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000
  ))
)
# Run with your own setup (make sure all names match between your data (S),
# the F matrix, and min_max)
Results <- simulated_annealing(
  S = Sm,
  Fmat = Fu,
  user_defined_min_max = Min_max,
  do_matrix_checks = TRUE,
  # You may want to change this to FALSE if your naming conventions are different.
  niter = 1,
  step = 0.01,
  weight.upper.bound = 30)
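
For readers new to simulated annealing, the generic accept/reject loop can be sketched on a one-dimensional toy problem. This is a textbook version under stated assumptions (Gaussian neighbour proposals, geometric cooling, a made-up objective); it is not the neighbour selection or cooling schedule used by simulated_annealing().

```r
set.seed(123)
objective <- function(x) (x - 2)^2        # toy error function, minimum at x = 2
x_cur <- 10; err_cur <- objective(x_cur)
x_best <- x_cur; err_best <- err_cur
temp <- 1
for (i in 1:2000) {
  x_new <- x_cur + rnorm(1, sd = 0.5)     # random neighbour of the current state
  err_new <- objective(x_new)
  # Always accept improvements; accept worse moves with probability exp(-delta / temp)
  if (err_new < err_cur || runif(1) < exp(-(err_new - err_cur) / temp)) {
    x_cur <- x_new; err_cur <- err_new
  }
  if (err_cur < err_best) { x_best <- x_cur; err_best <- err_cur }
  temp <- temp * 0.995                    # geometric cooling
}
```

Early on, the high temperature lets the search accept worse states and escape poor regions; as the temperature falls, it settles near the minimum.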

Perform simulated annealing algorithm for S and F matrices

Description

Perform simulated annealing algorithm for S and F matrices

Performs the simulated annealing algorithm for samples with divinyl chlorophyll and Prochlorococcus. Divinyl chlorophyll must be the final column of both the S and F matrices, with chlorophyll a as the second-to-last column. See how the example Sp and Fp matrices are organised.

Usage

simulated_annealing_Prochloro(
  S,
  Fmat = NULL,
  user_defined_min_max = NULL,
  do_matrix_checks = TRUE,
  niter = 500,
  step = 0.009,
  weight.upper.bound = 30,
  verbose = TRUE
)

Arguments

S

Sample data matrix – a matrix of pigment samples

Fmat

Pigment to Chl a matrix

user_defined_min_max

Data frame with the same format as the built-in min_max data

do_matrix_checks

niter

Number of iterations (default is 500)

step

Step ratio used (default is 0.009)

weight.upper.bound

Upper limit of the weights applied (default value is 30).

verbose

Logical value. Output error and temperature at each iteration. Default value of TRUE

Value

A list containing

Fmat matrix
RMSE (Root Mean Square Error)
condition number
Class abundances
Figure (plot of results)
MAE (Mean Absolute Error)
Error

Examples

# Using the built-in matrices Sp and Fp
set.seed(5326)
sa.example <- simulated_annealing_Prochloro(Sp, Fp, niter = 5)
sa.example$Figure

# To use with non-default values, see the 'simulated_annealing' example.

Title

Description

Title

Usage

target(x)

Arguments

x

Turn each non-zero element of the F-matrix into a vector

Description

Turn each non-zero element of the F-matrix into a vector

Usage

vectorise(Fmat)

Arguments

Fmat
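
Extracting the non-zero elements of a matrix as a vector can be done with logical indexing in base R, as sketched below on a made-up matrix. Whether vectorise() returns values in this order, or keeps the positions alongside them, is an assumption.

```r
# Toy F matrix with some zero entries (made-up values)
F_toy <- rbind(c(0.0, 0.5, 1),
               c(0.8, 0.0, 1))
nonzero <- F_toy[F_toy > 0]        # values in column-major order
positions <- which(F_toy > 0)      # linear indices, usable to write values back
```

Keeping the linear indices means updated values can be written back with F_toy[positions] <- new_values.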