---
title: "Tutorial"
author: "Max Conway"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Introduction

README.Rmd and the Introduction vignette give a brief overview of the package. This vignette is a more complete guide, with usage examples and exercises.

## Installation and Optimizers

The `fbar` package can be installed in the usual way:

```{r, eval=FALSE}
install.packages('fbar')
library(fbar)
```

or

```{r, eval=FALSE}
devtools::install_github('maxconway/fbar')
library(fbar)
```

However, `fbar`, like all flux balance analysis packages, requires a linear programming library in order to conduct simulations. `fbar` can use a number of linear programming libraries via the R Optimization Infrastructure (`ROI`) library and its plugins, and also supports `Rglpk` and `gurobi` directly.

The suggested way to get started quickly is the `ROI.plugin.ecos` library. To install and set it up, run:

```{r, eval=FALSE}
ROI::ROI_registered_solvers()
install.packages('ROI.plugin.ecos')
library('ROI.plugin.ecos') # This line is necessary to register the plugin with ROI the first time
ROI::ROI_registered_solvers()
```

The `ROI.plugin.ecos` library does not export any functions, but running the `library` statement after installation is necessary to register it with `ROI`.

Installing other optimizers to work with `ROI` is normally similar, but you may have to install a separate package on your operating system first.

## Looking at a model and understanding what it means

Load a simple model for *Escherichia coli* by running the following code:

```{r}
library(fbar) # load fbar package
data(ecoli_core)
```

To look at the model, you might want to use the `filter` and `select` functions from `dplyr`, or, if you're using RStudio, the `View` function.

### Questions

1. 
One of the reactions has an unusually long equation.
    a. What does this reaction represent?
    b. Why does it have such a long equation?
    c. Why is the stoichiometry of this reaction unusual?
2. Some of the equations are one-sided.
    a. What does this mean?
    b. Why don't these reactions have gene sets associated with them?
3. The letters in square brackets represent compartments.
    a. Why are these useful?
    b. What might [c] stand for?
    c. What might [e] stand for?
    d. This *E. coli* model has only two compartments. Why might a model of *S. cerevisiae* have more?
4. The columns `lowbnd` and `uppbnd` represent the limits on reaction rates.
    a. Why is ±1000 used in many places in these columns?
    b. In what ways can you tell whether a reaction is reversible?
5. The column `obj_coef` represents the objective coefficient.
    a. If we multiplied everything in this column by 5, how would that affect the model?
    b. If we multiplied everything in this column by -1, how would that affect the model?
6. `geneAssociation` shows which genes control the reaction.
    a. Which reactions would be affected if we knocked out gene b1241?
    b. Which reactions would be affected if we knocked out gene b0351?
    c. Which reactions would be affected if we knocked out gene b0356?
    d. Which reactions would be affected if we knocked out genes b0351 and b0356?
    e. Which reactions would be affected if we knocked out genes b0351 and b1241?
    f. Which reactions would be affected if we knocked out genes b0356 and b1241?

## Parsing and evaluating a model

To find the fluxes, and then compare them to the original model, do the following:

```{r, eval=FALSE}
library(dplyr) # load dplyr, to explore data

ecoli_fluxes <- ecoli_core %>%
  reactiontbl_to_expanded() %>%
  expanded_to_ROI() %>%
  ROI::ROI_solve() %>%
  ROI::solution()

ecoli_core_evaluated <- ecoli_core %>%
  mutate(flux = ecoli_fluxes)
```

### Questions

1. The code above performs a number of operations, using a number of packages.
    a. What does each line do?
You'll probably want to use R's help function, `?`.
    b. What does `::` mean?
2. A new column has been added to `ecoli_core_evaluated`, called `flux`.
    a. What does this represent?
    b. What does it mean when a value is zero?
    c. Why are some of the numbers negative?
    d. What would be suitable units for this column?
    e. How does `flux` compare to `uppbnd` and `lowbnd`?

## Modifying models

The code in the previous section is explicit, but we don't necessarily want to type it all out each time we evaluate a model. The code below does (roughly) the same thing in one line, so we can explore the model faster.

```{r, eval=FALSE}
evaluated <- find_fluxes_df(ecoli_core)
```

### Questions

1. By altering `ecoli_core` and rerunning `find_fluxes_df`, you can see the effects of changes to the model.
    a. Which reactions can you delete without changing biomass production?
    b. Alter the bounds of a reaction to increase biomass production.
    c. Find another reaction and change the bounds to reduce biomass production again, but not to 0.
2. Look at the source code of `find_fluxes_df` (you can see it just by typing the name at the console).
    a. What does the argument `do_minimization` do?
    b. Why is this possible?
    c. Why would we want to do this?
3. Find the reaction `EX_ac(e)`.
    a. What does it do?
    b. Find the maximum acetate production possible without taking biomass production below 0.5.

## Next

When you're done with this, you might want to look at the vignette `Multi-Objective Optimization case study`, to see an example of this package in a more complicated context. Try using the code to find a good tradeoff between production of acetate and biomass.
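Exploring a tradeoff like this usually means changing one bound, re-evaluating, and repeating. The chunk below is a minimal sketch of that loop's first step, not part of `fbar`: `set_bounds` is a hypothetical helper, and it assumes the reaction table identifies reactions in a column called `abbreviation`, alongside the `lowbnd` and `uppbnd` columns discussed above. The toy table and the bound values are made up for illustration.

```{r, eval=FALSE}
# Hypothetical helper (not provided by fbar): return a copy of a reaction
# table with new bounds for exactly one reaction.
# Assumes the table has columns `abbreviation`, `lowbnd` and `uppbnd`.
set_bounds <- function(model, reaction, lowbnd = NULL, uppbnd = NULL) {
  hit <- model$abbreviation == reaction
  if (sum(hit) != 1) {
    stop("expected exactly one reaction named ", reaction)
  }
  if (!is.null(lowbnd)) model$lowbnd[hit] <- lowbnd
  if (!is.null(uppbnd)) model$uppbnd[hit] <- uppbnd
  model
}

# Toy table with the same column names, so the helper can be tried
# without a solver installed.
toy <- data.frame(
  abbreviation = c("R1", "R2"),
  lowbnd = c(-1000, 0),
  uppbnd = c(1000, 1000),
  stringsAsFactors = FALSE
)
toy_limited <- set_bounds(toy, "R1", lowbnd = -10)

# With fbar and a solver installed, the same pattern should apply to the
# real model (the value 1 is an arbitrary example):
# constrained <- set_bounds(ecoli_core, "EX_ac(e)", lowbnd = 1)
# evaluated <- find_fluxes_df(constrained)
```

Because the helper returns a modified copy rather than changing `ecoli_core` in place, the original model stays available for comparison after each run.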