Title: Simultaneous Analysis of Multiplexed Metabarcodes
Version: 0.1.2
Description: A comprehensive set of wrapper functions for the analysis of multiplex metabarcode data. It includes robust wrappers for 'Cutadapt' and 'DADA2' to trim primers, filter reads, perform amplicon sequence variant (ASV) inference, and assign taxonomy. The package can handle single metabarcode datasets, datasets with two pooled metabarcodes, or multiple datasets simultaneously. The final output is a matrix per metabarcode, containing both ASV abundance data and associated taxonomic assignments. An optional function converts these matrices into 'phyloseq' and 'taxmap' objects. For more information on 'DADA2', including how 'DADA2' infers sample sequences, see Callahan et al. (2016) <doi:10.1038/nmeth.3869>. For more details on the demulticoder R package, see Sudermann et al. (2025) <doi:10.1094/PHYTO-02-25-0043-FI>.
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (≥ 3.0.2)
Imports: furrr, purrr, readr, stringr, tidyr, dplyr, ggplot2, tibble, utils
Suggests: BiocManager, Biostrings, dada2, metacoder, ShortRead, phyloseq, rmarkdown, RcppParallel, testthat (≥ 3.0.0)
Config/testthat/edition: 3
URL: https://grunwaldlab.github.io/demulticoder/, https://github.com/grunwaldlab/demulticoder
BugReports: https://github.com/grunwaldlab/demulticoder/issues
RoxygenNote: 7.3.2
Config/Needs/website: rmarkdown
NeedsCompilation: no
Packaged: 2025-04-30 21:59:31 UTC; marthasudermann
Maintainer: Martha A. Sudermann <sudermam@oregonstate.edu>
Author: Martha A. Sudermann [aut, cre, cph], Zachary S. L Foster [aut], Samantha Dawson [aut], Hung Phan [aut], Jeff H. Chang [aut], Niklaus Grünwald [aut, cph]
Repository: CRAN
Date/Publication: 2025-05-05 09:50:02 UTC

Combine taxonomic assignments and bootstrap values for each metabarcode into a single character vector

Description

Combine taxonomic assignments and bootstrap values for each metabarcode into a single character vector

Usage

assignTax_as_char(tax_results, temp_directory_path, metabarcode)

Arguments

tax_results

The dataframe containing taxonomic assignments
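
As a rough illustration of what combining assignments and bootstrap values can look like, the base R sketch below pastes each rank's taxon name together with its bootstrap value into one string per ASV. The toy `tax` and `boot` matrices mimic the two components that 'DADA2' returns from assignTaxonomy when outputBootstraps = TRUE. The helper name `combine_tax_boot` and the output layout are assumptions made for illustration and may differ from what the package actually writes.

```r
# Toy inputs: one row per ASV, one column per taxonomic rank.
tax <- matrix(
  c("Fungi", "Ascomycota", "Stramenopila", "Oomycota"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2"), c("Kingdom", "Phylum"))
)
boot <- matrix(
  c(100, 87, 99, 64),
  nrow = 2, byrow = TRUE,
  dimnames = dimnames(tax)
)

# Hypothetical helper: "taxon--bootstrap--rank" for each rank, joined by ";".
combine_tax_boot <- function(tax, boot) {
  vapply(seq_len(nrow(tax)), function(i) {
    paste(tax[i, ], boot[i, ], colnames(tax), sep = "--", collapse = ";")
  }, character(1))
}

tax_char <- combine_tax_boot(tax, boot)
names(tax_char) <- rownames(tax)
print(tax_char)
```

Keeping one string per ASV makes it easy to attach the assignments as a single column next to the abundance counts.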


Assign taxonomy functions

Description

Assign taxonomy functions

Usage

assign_tax(
  analysis_setup,
  asv_abund_matrix,
  retrieve_files = FALSE,
  overwrite_existing = FALSE,
  db_rps10 = "oomycetedb.fasta",
  db_its = "fungidb.fasta",
  db_16S = "bacteriadb.fasta",
  db_other1 = "otherdb1.fasta",
  db_other2 = "otherdb2.fasta"
)

Arguments

analysis_setup

An object containing directory paths and data tables, produced by the prepare_reads function

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants

retrieve_files

Logical, TRUE/FALSE whether to copy files from the temp directory to the output directory. Default is FALSE.

overwrite_existing

Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.

db_rps10

The reference database for the rps10 metabarcode

db_its

The reference database for the ITS metabarcode

db_16S

The SILVA 16S-rRNA reference database provided by the user

db_other1

The reference database for other metabarcode 1 (assumes format is like SILVA DB entries)

db_other2

The reference database for other metabarcode 2 (assumes format is like SILVA DB entries)

Details

At this point, the 'DADA2' function assignTaxonomy is used to assign taxonomy to the inferred ASVs.

Value

Taxonomic assignments of each unique ASV sequence

Examples


# Assign taxonomy to ASVs for each metabarcode
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)
cut_trim(
  analysis_setup,
  cutadapt_path = "/usr/bin/cutadapt",
  overwrite_existing = TRUE
)
# Keep the returned matrix so that it can be passed to assign_tax
asv_abund_matrix <- make_asv_abund_matrix(
  analysis_setup,
  overwrite_existing = TRUE
)
assign_tax(
  analysis_setup,
  asv_abund_matrix,
  retrieve_files = FALSE,
  overwrite_existing = TRUE
)


Assign taxonomy

Description

Assign taxonomy

Usage

assign_taxonomyDada2(
  asv_abund_matrix,
  temp_directory_path,
  metabarcode = "metabarcode",
  barcode_params
)

Arguments

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants.

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

metabarcode

The metabarcode used throughout the workflow (applicable options: 'rps10', 'its', 'r16S', 'other1', 'other2')


Filter ASV abundance matrix and convert to 'taxmap' and 'phyloseq' objects

Description

Filter ASV abundance matrix and convert to 'taxmap' and 'phyloseq' objects

Usage

convert_asv_matrix_to_objs(
  analysis_setup,
  min_read_depth = 0,
  minimum_bootstrap = 0,
  save_outputs = FALSE
)

Arguments

analysis_setup

An object containing directory paths and data tables, produced by the prepare_reads function

min_read_depth

ASV filter parameter. If the mean read depth of an ASV across all samples is less than this threshold, the ASV will be filtered out.

minimum_bootstrap

Threshold for the bootstrap support value of taxonomic assignments. Assignments with support below this minimum bootstrap threshold will be set to N/A.

save_outputs

Logical, indicating whether to save the resulting phyloseq and 'taxmap' objects. If TRUE, the objects will be saved; if FALSE, they will only be available in the global environment. Default is FALSE.

Value

ASV matrix converted to 'taxmap' object

Examples


# Convert final matrix to 'taxmap' and 'phyloseq' objects for downstream analysis steps
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)
cut_trim(
  analysis_setup,
  cutadapt_path = "/usr/bin/cutadapt",
  overwrite_existing = TRUE
)
# Keep the returned matrix so that it can be passed to assign_tax
asv_abund_matrix <- make_asv_abund_matrix(
  analysis_setup,
  overwrite_existing = TRUE
)
assign_tax(
  analysis_setup,
  asv_abund_matrix,
  retrieve_files = FALSE,
  overwrite_existing = TRUE
)
objs <- convert_asv_matrix_to_objs(
  analysis_setup
)
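
The effect of the two filtering arguments can be pictured with a small base R sketch. It assumes a toy table with one row per ASV, per-sample read counts, and a single rank with its bootstrap value; the column names and threshold values are hypothetical and do not reflect the package's internal layout or defaults.

```r
# Toy ASV table: counts for three samples plus a genus call and its bootstrap.
asv_table <- data.frame(
  asv_id = c("ASV1", "ASV2", "ASV3"),
  sample_a = c(120, 2, 40),
  sample_b = c(80, 0, 35),
  sample_c = c(100, 1, 45),
  genus = c("Phytophthora", "Pythium", "Fusarium"),
  genus_bootstrap = c(98, 90, 42),
  stringsAsFactors = FALSE
)
sample_cols <- c("sample_a", "sample_b", "sample_c")

min_read_depth <- 10      # illustrative thresholds, not package defaults
minimum_bootstrap <- 50

# Drop ASVs whose mean read depth across samples is below the threshold.
mean_depth <- rowMeans(asv_table[, sample_cols])
filtered <- asv_table[mean_depth >= min_read_depth, ]

# Set weakly supported assignments to NA.
filtered$genus[filtered$genus_bootstrap < minimum_bootstrap] <- NA
print(filtered)
```

In this toy case the low-depth ASV is removed entirely, while the ASV with weak bootstrap support is kept but loses its genus label.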


Count overlap to see how well the reads were merged

Description

Count overlap to see how well the reads were merged

Usage

countOverlap(data_tables, merged_reads, barcode, output_directory_path)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

merged_reads

Intermediate merged read RData file

barcode

The metabarcode used throughout the workflow (applicable options: 'rps10', 'its', 'r16S', 'other1', 'other2')

output_directory_path

The path to the directory where resulting files are output

Value

A plot describing how well reads merged and information on overlap between reads


Make ASV sequence matrix

Description

Make ASV sequence matrix

Usage

createASVSequenceTable(merged_reads, orderBy = "abundance", barcode_params)

Arguments

merged_reads

Intermediate merged read RData file

orderBy

(Optional). character(1). Default "abundance". Specifies how the sequences (columns) of the returned table should be ordered (decreasing). Valid values: "abundance", "nsamples", NULL.

Value

raw_seqtab
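
The orderBy options can be illustrated on a toy sequence table. The sketch below is a base R approximation under the assumption that rows are samples and columns are sequences; `order_columns` is a hypothetical helper, not a function from this package or from 'DADA2'.

```r
# Toy sequence table: rows are samples, columns are sequences.
seqtab <- matrix(
  c(5, 0, 50,
    0, 0, 40,
    3, 2,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), c("ACGT", "TTGA", "GGCC"))
)

# Hypothetical helper mirroring the documented orderBy choices.
order_columns <- function(seqtab, orderBy = "abundance") {
  if (is.null(orderBy)) return(seqtab)
  key <- switch(orderBy,
    abundance = colSums(seqtab),      # total reads per sequence
    nsamples = colSums(seqtab > 0),   # number of samples containing it
    stop("orderBy must be 'abundance', 'nsamples', or NULL")
  )
  seqtab[, order(key, decreasing = TRUE), drop = FALSE]
}

by_abundance <- colnames(order_columns(seqtab, "abundance"))
by_nsamples <- colnames(order_columns(seqtab, "nsamples"))
print(by_abundance)
print(by_nsamples)
```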


Main command to trim primers using 'Cutadapt' and core 'DADA2' functions

Description

Main command to trim primers using 'Cutadapt' and core 'DADA2' functions

Usage

cut_trim(analysis_setup, cutadapt_path, overwrite_existing = FALSE)

Arguments

analysis_setup

An object containing directory paths and data tables, produced by the prepare_reads function

cutadapt_path

Path to the 'Cutadapt' program.

overwrite_existing

Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.

Details

If samples contain two pooled metabarcodes (such as ITS1 and rps10), reads will also be demultiplexed prior to the 'DADA2'-specific read trimming steps.

Value

Trimmed reads, primer counts, quality plots, and ASV matrix.

Examples


# Remove remaining primers from raw reads, demultiplex pooled barcoded samples,
# and then trim reads based on specific 'DADA2' parameters
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)
cut_trim(
  analysis_setup,
  cutadapt_path = "/usr/bin/cutadapt",
  overwrite_existing = TRUE
)


Wrapper function for filterAndTrim function from 'DADA2', to be used after primer trimming

Description

Wrapper function for filterAndTrim function from 'DADA2', to be used after primer trimming

Usage

filter_and_trim(
  output_directory_path,
  temp_directory_path,
  cutadapt_data_barcode,
  barcode_params,
  barcode
)

Arguments

output_directory_path

The path to the directory where resulting files are output

cutadapt_data_barcode

Metabarcode-specific FASTQ read files trimmed of primers

Value

Filtered and trimmed reads


Format ASV abundance matrix

Description

Format ASV abundance matrix

Usage

format_abund_matrix(
  data_tables,
  asv_abund_matrix,
  seq_tax_asv,
  output_directory_path,
  metabarcode
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants

seq_tax_asv

An amplicon sequence variant matrix with taxonomic information


General functions to format user-specified databases

Description

General functions to format user-specified databases

Usage

format_database(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  metabarcode,
  db_its,
  db_rps10,
  db_16S,
  db_other1,
  db_other2
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

metabarcode

The metabarcode used throughout the workflow (applicable options: 'rps10', 'its', 'r16S', 'other1', 'other2')

Value

Formatted database(s) for the specified metabarcode type(s)


A 16S database that has modified headers and is output in the reference_databases folder

Description

A 16S database that has modified headers and is output in the reference_databases folder

Usage

format_db_16S(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  db_16S
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

db_16S

The SILVA 16S rRNA reference database provided by the user

Value

The SILVA 16S rRNA database with modified headers


An ITS database that has modified headers and is output in the reference_databases folder

Description

An ITS database that has modified headers and is output in the reference_databases folder

Usage

format_db_its(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  db_its
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

db_its

The UNITE ITS reference database provided by the user

Value

The UNITE ITS database with modified headers


Another user-specified database that is initially in the format specified by 'DADA2', with a header containing taxonomic levels (kingdom down to species, separated by semicolons)

Description

Another user-specified database that is initially in the format specified by 'DADA2', with a header containing taxonomic levels (kingdom down to species, separated by semicolons)

Usage

format_db_other1(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  db_other1
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

db_other1

A reference database other than SILVA, UNITE, or oomyceteDB (assumes format is like SILVA DB entries)

Value

The database with modified headers


A second user-specified database that is initially in the format specified by 'DADA2', with a header containing taxonomic levels (kingdom down to species, separated by semicolons)

Description

A second user-specified database that is initially in the format specified by 'DADA2', with a header containing taxonomic levels (kingdom down to species, separated by semicolons)

Usage

format_db_other2(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  db_other2
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

db_other2

A second reference database other than SILVA, UNITE, or oomyceteDB (assumes format is like SILVA DB entries)

Value

The database with modified headers


Create modified reference rps10 database for downstream analysis

Description

Create modified reference rps10 database for downstream analysis

Usage

format_db_rps10(
  data_tables,
  data_path,
  output_directory_path,
  temp_directory_path,
  db_rps10
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

data_path

Path to the data directory

output_directory_path

The path to the directory where resulting files are output

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

db_rps10

The oomyceteDB rps10 reference database provided by the user

Value

The oomyceteDB database with modified headers


Retrieve the paths of the filtered and trimmed FASTQ files

Description

Retrieve the paths of the filtered and trimmed FASTQ files

Usage

get_fastq_paths(data_tables, my_direction, my_primer_pair_id)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

my_direction

Whether primer is in forward or reverse direction

my_primer_pair_id

The specific metabarcode ID


Get primer counts for each sample after primer removal and trimming steps

Description

Get primer counts for each sample after primer removal and trimming steps

Usage

get_post_trim_hits(primer_data, cutadapt_data, output_directory_path)

Arguments

primer_data

Primer data.frame created using the orient_primers function to parse information on forward and reverse primer sequences.

cutadapt_data

FASTQ read files trimmed of primers

output_directory_path

The path to the directory where resulting files are output

Value

Table of read counts across each sample


Get primer counts for each sample before primer removal and trimming steps

Description

Get primer counts for each sample before primer removal and trimming steps

Usage

get_pre_primer_hits(primer_data, fastq_data, output_directory_path)

Arguments

primer_data

Primer data.frame created using the orient_primers function to parse information on forward and reverse primer sequences.

fastq_data

A data.frame containing the read file paths and the direction of the reads by sample

output_directory_path

The path to the directory where resulting files are output

Value

The number of reads in which the primer is found
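
Counting the reads that contain a primer can be sketched in base R by translating IUPAC ambiguity codes into a regular expression. This is only an illustration on in-memory toy reads: the helper names are hypothetical, the primer and reads are made up, and the package may count hits differently (for example, for each primer orientation and directly on FASTQ files).

```r
# Map IUPAC ambiguity codes to regular-expression character classes.
iupac <- c(
  A = "A", C = "C", G = "G", T = "T",
  R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
  K = "[GT]", M = "[AC]", B = "[CGT]", D = "[AGT]",
  H = "[ACT]", V = "[ACG]", N = "[ACGT]"
)

# Hypothetical helper: turn a primer into a regular expression.
primer_to_regex <- function(primer) {
  bases <- strsplit(toupper(primer), "")[[1]]
  paste(iupac[bases], collapse = "")
}

# Hypothetical helper: number of reads in which the primer is found.
count_primer_hits <- function(primer, reads) {
  sum(grepl(primer_to_regex(primer), reads))
}

# Toy reads; the ambiguous base Y matches either C or T.
reads <- c("GTTGGTTAGAGTAAAAGACT", "ACGTACGTACGT", "TTGTCGGTTAGAGCC")
n_hits <- count_primer_hits("GTYGGTTAGAG", reads)
print(n_hits)
```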


Final inventory of read counts after each step, from input through removal of chimeras. This function handles datasets with more than one sample; it has not yet been optimized for a single sample.

Description

Final inventory of read counts after each step, from input through removal of chimeras. This function handles datasets with more than one sample; it has not yet been optimized for a single sample.

Usage

get_read_counts(
  asv_abund_matrix,
  temp_directory_path,
  output_directory_path,
  metabarcode
)

Arguments

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants


Function to infer ASVs, for multiple loci

Description

Function to infer ASVs, for multiple loci

Usage

infer_asv_command(
  output_directory_path,
  temp_directory_path,
  data_tables,
  barcode_params,
  barcode
)

Arguments

output_directory_path

The path to the directory where resulting files are output

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences


Core 'DADA2' function to learn errors and infer ASVs

Description

Core 'DADA2' function to learn errors and infer ASVs

Usage

infer_asvs(
  data_tables,
  my_direction,
  my_primer_pair_id,
  barcode_params,
  output_directory_path
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

my_direction

Whether primer is in forward or reverse direction

my_primer_pair_id

The specific metabarcode ID

output_directory_path

The path to the directory where resulting files are output

Value

asv_data


Quality filtering to remove chimeras and short sequences

Description

Quality filtering to remove chimeras and short sequences

Usage

make_abund_matrix(
  raw_seqtab,
  temp_directory_path,
  barcode_params = barcode_params,
  barcode
)

Arguments

raw_seqtab

An RData file containing intermediate read data before chimeras were removed

Value

asv_abund_matrix The returned final ASV abundance matrix
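
The short-sequence part of this filtering step can be sketched in base R. Chimera removal itself is handled by 'DADA2' (for example, its removeBimeraDenovo function) and is not reproduced here. The sketch assumes a sequence table whose column names are the ASV sequences; the threshold name `min_asv_length` and its value are hypothetical.

```r
# Toy sequence table with sequences of different lengths as column names.
seqtab <- matrix(
  c(10, 4, 7,
     8, 0, 9),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("s1", "s2"), c("ACGTACGTAC", "ACG", "ACGTACGT"))
)

min_asv_length <- 8  # hypothetical threshold for illustration

# Keep only the columns whose sequence is at least min_asv_length long.
keep <- nchar(colnames(seqtab)) >= min_asv_length
seqtab_filtered <- seqtab[, keep, drop = FALSE]
print(dim(seqtab_filtered))
```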


Make an amplicon sequence variant (ASV) abundance matrix for each of the input barcodes

Description

Make an amplicon sequence variant (ASV) abundance matrix for each of the input barcodes

Usage

make_asv_abund_matrix(analysis_setup, overwrite_existing = FALSE)

Arguments

analysis_setup

An object containing directory paths and data tables, produced by the prepare_reads function

overwrite_existing

Logical, indicating whether to overwrite existing results. Default is FALSE.

Details

The function processes data for each unique barcode separately, inferring ASVs, merging reads, and creating an ASV abundance matrix. To do this, the 'DADA2' core denoising algorithm is used to infer ASVs.

Value

The ASV abundance matrix (asv_abund_matrix)

Examples


# The primary wrapper function for 'DADA2' ASV inference steps
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)
cut_trim(
  analysis_setup,
  cutadapt_path = "/usr/bin/cutadapt",
  overwrite_existing = TRUE
)
make_asv_abund_matrix(
  analysis_setup,
  overwrite_existing = TRUE
)


Prepare for primer trimming with 'Cutadapt'. Make new sub-directories and specify paths for the trimmed and untrimmed reads

Description

Prepare for primer trimming with 'Cutadapt'. Make new sub-directories and specify paths for the trimmed and untrimmed reads

Usage

make_cutadapt_tibble(fastq_data, metadata_primer_data, temp_directory_path)

Arguments

fastq_data

A data.frame containing the read file paths and the direction of the reads by sample

metadata_primer_data

A data.frame combining the metadata and primer data

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

Value

Returns a larger data.frame containing paths to temporary read directories, which is used as input when running 'Cutadapt'


Plots a histogram of read length counts of all sequences within the ASV matrix

Description

Plots a histogram of read length counts of all sequences within the ASV matrix

Usage

make_seqhist(asv_abund_matrix, output_directory_path)

Arguments

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants

Value

histogram with read length counts of all sequences within ASV matrix
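
The counts behind such a histogram can be tabulated in base R. The sketch assumes the ASV sequences are available as a character vector (for instance, taken from the matrix's sequence labels); the toy sequences are made up, and the package's own plotting code is not shown.

```r
# Toy ASV sequences.
asv_seqs <- c("ACGTACGTAC", "ACGTACGTAA", "ACGTACGT", "ACGTACGTACGT")

# Length of each sequence, then the number of ASVs at each length.
seq_lengths <- nchar(asv_seqs)
length_counts <- table(seq_lengths)
print(length_counts)
```

The resulting table can be passed to any plotting function to draw the histogram.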


Merge forward and reverse reads

Description

Merge forward and reverse reads

Usage

merge_reads_command(
  output_directory_path,
  temp_directory_path,
  barcode_params,
  barcode
)

Arguments

output_directory_path

The path to the directory where resulting files are output

Value

merged_reads Intermediate merged read RData file


Takes in the user's forward and reverse primer sequences and creates the complement, reverse, and reverse complement of each primer in one data.frame

Description

Takes in the user's forward and reverse primer sequences and creates the complement, reverse, and reverse complement of each primer in one data.frame

Usage

orient_primers(primers_params_path)

Arguments

primers_params_path

A path to the CSV file that holds the primer information.

Value

A data.frame with oriented primer information.
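
The four orientations can be derived with base R string functions, as in the sketch below. It handles IUPAC ambiguity codes through a fixed translation table. The helper names and the layout of the returned data.frame are assumptions for illustration; the package's own data.frame may be organized differently.

```r
# Complement each base, including IUPAC ambiguity codes.
complement_seq <- function(x) {
  chartr("ACGTRYKMBDHVSWN", "TGCAYRMKVHDBSWN", toupper(x))
}

# Reverse the order of the bases.
reverse_seq <- function(x) {
  vapply(strsplit(x, ""), function(b) paste(rev(b), collapse = ""), character(1))
}

# Hypothetical helper: all four orientations of one primer.
orient_one_primer <- function(primer) {
  data.frame(
    orientation = c("forward", "complement", "reverse", "reverse_complement"),
    sequence = c(
      primer,
      complement_seq(primer),
      reverse_seq(primer),
      reverse_seq(complement_seq(primer))
    ),
    stringsAsFactors = FALSE
  )
}

oriented <- orient_one_primer("GTYGGTTAGAG")  # toy primer
print(oriented)
```

Searching for all four orientations is what allows primers to be found regardless of the strand and direction in which a read was sequenced.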


Wrapper script for plotQualityProfile after trim steps and primer removal.

Description

Wrapper script for plotQualityProfile after trim steps and primer removal.

Usage

plot_post_trim_qc(
  cutadapt_data,
  output_directory_path,
  n = 5e+05,
  barcode_params
)

Arguments

cutadapt_data

FASTQ read files trimmed of primers

output_directory_path

The path to the directory where resulting files are output

n

(Optional). Default 500,000. The number of records to sample from the fastq file.

Value

Quality profiles of reads after primer trimming


Wrapper function for plotQualityProfile function

Description

Wrapper function for plotQualityProfile function

Usage

plot_qc(cutadapt_data, output_directory_path, n = 5e+05, barcode_params)

Arguments

cutadapt_data

FASTQ read files trimmed of primers

output_directory_path

The path to the directory where resulting files are output

n

(Optional). Default 500,000. The number of records to sample from the fastq file.

Value

'DADA2' wrapper function for making quality profiles for each sample


Prepare final ASV abundance matrix

Description

Prepare final ASV abundance matrix

Usage

prep_abund_matrix(cutadapt_data, asv_abund_matrix, data_tables, metabarcode)

Arguments

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants

metabarcode

The metabarcode used throughout the workflow (applicable options: 'rps10', 'its', 'r16S', 'other1', 'other2')


Reads the user's metadata file, then combines and reformats it using the primer data. Part of the larger prepare_reads function.

Description

Reads the user's metadata file, then combines and reformats it using the primer data. Part of the larger prepare_reads function.

Usage

prepare_metadata_table(metadata_file_path, primer_data)

Arguments

metadata_file_path

The path to the metadata file.

primer_data

Primer data.frame created using the orient_primers function to parse information on forward and reverse primer sequences.

Value

A data.frame containing the merged metadata and primer data called metadata_primer_data.
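
Conceptually, this step is a join between sample metadata and primer information. The base R sketch below assumes that both tables share a `primer_name` column; that column name, the other column names, and the placeholder sequences are hypothetical and chosen only to make the example run.

```r
# Toy metadata: one row per sample, with the metabarcode it was amplified with.
metadata <- data.frame(
  sample_name = c("S1", "S2", "S3"),
  primer_name = c("rps10", "its", "rps10"),
  stringsAsFactors = FALSE
)

# Toy primer table with placeholder sequences.
primer_data <- data.frame(
  primer_name = c("rps10", "its"),
  forward = c("ACGTACGT", "TTGACCAA"),
  reverse = c("GGCCTTAA", "CCAATTGG"),
  stringsAsFactors = FALSE
)

# Attach the matching primer sequences to every sample.
metadata_primer_data <- merge(metadata, primer_data, by = "primer_name")
print(metadata_primer_data)
```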


Prepare reads for primer trimming using 'Cutadapt'

Description

Prepare reads for primer trimming using 'Cutadapt'

Usage

prepare_reads(
  data_directory = "data",
  output_directory = tempdir(),
  tempdir_path = NULL,
  tempdir_id = "demulticoder_run",
  overwrite_existing = FALSE
)

Arguments

data_directory

Directory path where the user has placed raw FASTQ (forward and reverse reads), metadata.csv, and primerinfo_params.csv files. Default is "data".

output_directory

User-specified directory for outputs. Default is tempdir().

tempdir_path

Path to a temporary directory. If NULL, a temporary directory path will be identified using the tempdir() command.

tempdir_id

ID for temporary directories. The user can provide any helpful ID, whether it be a date or specific name for the run. Default is "demulticoder_run"

overwrite_existing

Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.

Value

A list containing data tables, including metadata, primer sequences to search for based on orientation, paths for trimming reads, and user-defined parameters for all subsequent steps.

Examples


# Pre-filter raw reads and parse metadata and primer information to prepare
# for primer trimming and filtering
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)


Matching Order Primer Check

Description

Matching Order Primer Check

Usage

primer_check(fastq_data)

Arguments

fastq_data

A data.frame containing the read file paths and the direction of the reads by sample

Value

None


Run 'DADA2' taxonomy functions for single metabarcode

Description

Run 'DADA2' taxonomy functions for single metabarcode

Usage

process_single_barcode(
  data_tables,
  temp_directory_path,
  output_directory_path,
  asv_abund_matrix,
  metabarcode = metabarcode,
  barcode_params
)

Arguments

data_tables

The data tables containing the paths to read files, metadata, and metabarcode information with associated primer sequences

asv_abund_matrix

The final abundance matrix containing amplicon sequence variants


Takes in the FASTQ files from the user and creates a data.frame with the paths to files that will be created and used in the future. Included in a larger 'read_prefilt_fastq' function.

Description

Takes in the FASTQ files from the user and creates a data.frame with the paths to files that will be created and used in the future. Included in a larger 'read_prefilt_fastq' function.

Usage

read_fastq(data_directory_path, temp_directory_path)

Arguments

data_directory_path

The path to the directory containing raw FASTQ (forward and reverse reads), metadata.csv, and primerinfo_params.csv files.

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow.

Value

A data.frame with the FASTQ file paths, primer orientations and sequences, and parsed sample names.
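
Parsing sample names and read directions from file names can be sketched with regular expressions in base R. The `_R1`/`_R2` naming convention used here is an assumption, as are the column names; in a real run the file names would come from something like list.files() on the data directory rather than a hard-coded vector.

```r
# Toy file names following an assumed <sample>_R1/_R2 convention.
fastq_files <- c(
  "S1_R1.fastq.gz", "S1_R2.fastq.gz",
  "S2_R1.fastq.gz", "S2_R2.fastq.gz"
)

# Capture the sample name and the read number.
pattern <- "^(.+)_R([12])\\.fastq(\\.gz)?$"

fastq_data <- data.frame(
  file = fastq_files,
  sample_name = sub(pattern, "\\1", fastq_files),
  direction = ifelse(sub(pattern, "\\2", fastq_files) == "1", "Forward", "Reverse"),
  stringsAsFactors = FALSE
)
print(fastq_data)
```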


Take in user's 'DADA2' parameters and make a data frame for downstream steps

Description

Take in user's 'DADA2' parameters and make a data frame for downstream steps

Usage

read_parameters_table(primers_params_path)

Arguments

primers_params_path

A path to the CSV file that holds the primer information.

Value

A data.frame with information on the 'DADA2' parameters.


A function for calling the read_fastq, primer_check, and remove_ns functions. It processes the FASTQ files and prepares them for primer trimming with 'Cutadapt'. Part of the larger 'prepare_reads' function.

Description

A function for calling the read_fastq, primer_check, and remove_ns functions. It processes the FASTQ files and prepares them for primer trimming with 'Cutadapt'. Part of the larger 'prepare_reads' function.

Usage

read_prefilt_fastq(
  data_directory_path = data_directory_path,
  multithread,
  temp_directory_path
)

Arguments

data_directory_path

The path to the directory containing raw FASTQ (forward and reverse reads), metadata.csv, and primerinfo_params.csv files

multithread

(Optional). Default is FALSE. If TRUE, input files are filtered in parallel via mclapply. If an integer is provided, it is passed to the mc.cores argument of mclapply. Note that the parallelization here is by forking, and each process is loading another fastq file into memory. This option is ignored in Windows, as Windows does not support forking, with mc.cores set to 1. If memory is an issue, execute in a clean environment and reduce the chunk size n and/or the number of threads.

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

Value

Returns filtered reads that have no Ns


Wrapper function for core 'DADA2' filter and trim function for first filtering step

Description

Wrapper function for core 'DADA2' filter and trim function for first filtering step

Usage

remove_ns(fastq_data, multithread, temp_directory_path)

Arguments

fastq_data

A data.frame containing the read file paths and the direction of the reads by sample

multithread

(Optional). Default is FALSE. If TRUE, input files are filtered in parallel via mclapply. If an integer is provided, it is passed to the mc.cores argument of mclapply. Note that the parallelization here is by forking, and each process is loading another fastq file into memory. This option is ignored in Windows, as Windows does not support forking, with mc.cores set to 1. If memory is an issue, execute in a clean environment and reduce the chunk size n and/or the number of threads.

temp_directory_path

User-defined temporary directory to output unfiltered, trimmed, and filtered read directories throughout the workflow

Value

Return prefiltered reads with no Ns
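
The idea behind this prefiltering step is that a read pair is discarded if either mate contains an ambiguous base (N). In 'DADA2' this corresponds to running filterAndTrim with maxN = 0 on FASTQ files. The base R sketch below shows only the logic, on in-memory toy sequences with hypothetical variable names.

```r
# Toy paired reads: element i of each vector belongs to the same pair.
fwd_reads <- c("ACGTACGT", "ACGNACGT", "TTGACCAA")
rev_reads <- c("TGCATGCA", "TGCATGCA", "GGNTTAAC")

# Flag pairs in which either mate contains an N.
has_n <- grepl("N", fwd_reads, fixed = TRUE) | grepl("N", rev_reads, fixed = TRUE)

# Drop flagged pairs from both vectors so the mates stay in sync.
fwd_clean <- fwd_reads[!has_n]
rev_clean <- rev_reads[!has_n]
print(fwd_clean)
print(rev_clean)
```

Filtering both mates with the same logical vector keeps forward and reverse reads in matching order, which later merging steps rely on.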


Core function for running 'Cutadapt'

Description

Core function for running 'Cutadapt'

Usage

run_cutadapt(
  cutadapt_path,
  cutadapt_data_barcode,
  barcode_params,
  minCutadaptlength
)

Arguments

cutadapt_path

A path to the 'Cutadapt' program.

minCutadaptlength

Read lengths that are lower than this threshold will be discarded. Default is 0.

Value

Trimmed reads.


Set up directory paths for subsequent analyses

Description

This function sets up the directory paths for subsequent analyses. It checks whether the specified output directories exist or creates them if they don't. The function also provides paths to primer and metadata files within the data directory.

Usage

setup_directories(
  data_directory = "data",
  output_directory = tempdir(),
  tempdir_path = NULL,
  tempdir_id = "demulticoder_run"
)

Arguments

data_directory

Directory path where the user has placed raw FASTQ (forward and reverse reads), metadata.csv, and primerinfo_params.csv files. Default is "data".

output_directory

User-specified directory path for outputs. Default is tempdir().

tempdir_path

Path to a temporary directory. If NULL, a temporary directory path will be identified using the tempdir() command.

tempdir_id

ID for temporary directories. The user can provide any helpful ID, whether it be a date or specific name for the run. Default is "demulticoder_run".

Value

A list with paths for data, output, temporary directories, primer, and metadata files.
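
The behaviour described above (create directories when missing, then return a list of paths) can be approximated in base R as follows. `setup_dirs_sketch` is a hypothetical stand-in that omits the tempdir_path argument for brevity; the list element names are assumptions, while the file names metadata.csv and primerinfo_params.csv are the ones documented for the data directory.

```r
# Hypothetical stand-in for the directory setup logic.
setup_dirs_sketch <- function(data_directory,
                              output_directory,
                              tempdir_id = "demulticoder_run") {
  temp_path <- file.path(tempdir(), tempdir_id)

  # Create the output and temporary directories only if they do not exist yet.
  for (d in c(output_directory, temp_path)) {
    if (!dir.exists(d)) dir.create(d, recursive = TRUE)
  }

  list(
    data_directory_path = data_directory,
    output_directory_path = output_directory,
    temp_directory_path = temp_path,
    primers_params_path = file.path(data_directory, "primerinfo_params.csv"),
    metadata_file_path = file.path(data_directory, "metadata.csv")
  )
}

paths <- setup_dirs_sketch("data", file.path(tempdir(), "demo_outputs"))
str(paths)
```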