Type: | Package |
Version: | 1.0 |
Date: | 2023-12-01 |
Title: | Data Manipulation using Formula |
Description: | A tool for manipulating data using the generic formula. A single formula makes it easy to add, replace, and remove variables before running the analysis. |
Depends: | R (≥ 3.5.0) |
Imports: | utils, stats, formula.tools(≥ 1.7.1) |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Repository: | CRAN |
URL: | https://github.com/serafinialessio/dformula |
BugReports: | https://github.com/serafinialessio/dformula/issues |
NeedsCompilation: | no |
Encoding: | UTF-8 |
LazyData: | true |
Packaged: | 2023-12-01 09:29:09 UTC; alessioserafini |
Author: | Alessio Serafini |
Maintainer: | Alessio Serafini <srf.alessio@gmail.com> |
Date/Publication: | 2023-12-01 10:10:02 UTC |
Add variables
Description
Add new variables by mutating the input variables using a formula.
Usage
add(from, formula, as = NULL,
position = c("right", "left"),
na.remove = FALSE, logic_convert = TRUE,...)
Arguments
from |
a data.frame object with variables |
formula |
a formula indicating the operations used to create the new variables. See the Details section for an explanation. |
as |
a character vector with names of new variables. |
position |
if the new variables are positioned at the beginning (
na.remove |
a logical value indicating whether NA values should be removed. |
logic_convert |
a logical value indicating whether the new logical variables are converted to
... |
further arguments |
Details
The formula is written as:
~ new_variables
The right-hand side specifies the new variables to add, computed from the existing variables using the I() function.
For example:
~ I(log(column_names1)) + I(column_names2/100)
adds log(column_names1) and column_names2/100 to the data.
If na.remove is set to TRUE, the new variables are created and added to the input dataset, and then the observations with missing values are removed.
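The example formula can be read as a few base R steps. The following is only a sketch of the operation the formula describes, not the package's implementation; it uses the airquality columns Ozone and Wind in place of column_names1 and column_names2, the new column names log_Ozone and Wind_100 are hypothetical, and it assumes that position = "right" means the new columns are appended after the original ones.

```r
# Base R sketch of `~ I(log(Ozone)) + I(Wind/100)`; not dformula's own code.
dt <- datasets::airquality

# Build the new variables from the existing ones.
# The names log_Ozone and Wind_100 are hypothetical.
new_vars <- data.frame(
  log_Ozone = log(dt$Ozone),
  Wind_100  = dt$Wind / 100
)

# Assumption: position = "right" appends the new columns after the originals.
added <- cbind(dt, new_vars)

# na.remove = TRUE: after adding, drop every row that has a missing value.
added_complete <- added[stats::complete.cases(added), ]
```

The original columns are left untouched, so the result has the six airquality columns followed by the two new ones.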
Value
Returns a data.frame object with the original and the new variables.
Author(s)
Alessio Serafini
Examples
data("airquality")
dt <- airquality
head(add(from = dt, formula = ~ log(Ozone)))
head(add(from = dt, formula = ~ log(Ozone) + log(Wind)))
head(add(from = dt, formula = ~ log(Ozone), as = "Ozone_1"))
head(add(from = dt, formula = Ozone + Wind ~ log()))
head(add(from = dt, formula = ~ log()))
head(add(from = dt, formula = .~ log(), position = "left"))
head(add(from = dt, formula = .~ log(), na.remove = TRUE))
head(add(from = dt, formula = ~ I((Ozone>5))))
head(add(from = dt, formula = ~ I((Ozone>5)), logic_convert = FALSE ))
head(add(from = dt, formula = Ozone + Wind ~ C(Ozone-Ozone)))
head(add(from = dt, formula = ~ C(log(Ozone))))
head(add(from = dt, formula = ~ C(5)))
head(add(from = dt, formula = Ozone + Wind ~ C(Ozone-Ozone)))
head(add(from = dt, formula = Ozone + Wind ~ C(log(Ozone))))
foo <- function(x, a = 100){return(x-x + a)}
head(add(from = dt, formula = Ozone + Month~ I(foo(a = 100))))
head(add(from = dt, formula = Ozone + Month~ foo()))
head(add(from = dt, formula = ~ I(foo(Ozone, a = 100))))
World population
Description
World population and area by country.
Usage
data("population_data")
Format
A data frame with 159 observations on the following 3 variables.
Country
a character vector with country names
Population
a numeric vector with population
Area
a numeric vector with the area of the countries
Source
Examples
data(population_data)
str(population_data)
Remove a subset
Description
Selects the rows and the variables to remove by specifying a condition using a formula.
Usage
remove(from, formula = .~., na.remove = FALSE, ...)
Arguments
from |
a data.frame object with variables |
formula |
a formula indicating the columns and the rows to remove. See the Details section for an explanation. |
na.remove |
a logical value indicating whether NA values should be removed. |
... |
further arguments |
Details
The formula is composed of two parts:
column_names ~ rows_conditions
The left-hand side gives the names of the columns to remove, and the right-hand side gives the conditions identifying the rows to remove, using the I() function.
For example:
column_names1 + column_names2 ~ I(column_names1 == "a") + I(column_names2 > 4)
First, rows are selected for removal if the observations in column_names1 are equal to a and the observations in column_names2 are greater than 4; then column_names1 and column_names2 are removed and the other variables are returned.
If na.remove is set to TRUE, the observations with missing values are removed after the subsetting.
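As a rough illustration of the two steps (rows first, then columns), the base R sketch below mirrors a formula such as Ozone + Wind ~ I(Ozone > 10) on airquality. It is not the package's implementation, and the treatment of rows whose condition evaluates to NA (kept here) is an assumption.

```r
# Base R sketch of `Ozone + Wind ~ I(Ozone > 10)` for removal; not dformula's own code.
dt <- datasets::airquality

# Step 1: flag the rows to remove.
# Assumption: rows where the condition is NA are not removed.
drop_rows <- !is.na(dt$Ozone) & dt$Ozone > 10

# Step 2: name the columns to remove (the left-hand side of the formula).
drop_cols <- c("Ozone", "Wind")

# Keep everything that was not flagged.
removed <- dt[!drop_rows, !(names(dt) %in% drop_cols), drop = FALSE]

# na.remove = TRUE: drop rows with missing values after the subsetting.
removed_complete <- removed[stats::complete.cases(removed), ]
```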
Value
Returns a data.frame object without the selected elements.
Author(s)
Alessio Serafini
Examples
data("airquality")
dt <- airquality
head(remove(from = dt, formula = .~ I(Ozone > 10)))
head(remove(from = dt, formula = .~ I(Ozone > 10), na.remove = TRUE))
head(remove(from = dt, formula = Ozone ~ .))
head(remove(from = dt, formula = Ozone~ I(Ozone > 10)))
head(remove(from = dt, formula = Ozone + Wind~ I(Ozone > 10)))
head(remove(from = dt, formula = Ozone + . ~ I(Ozone > 10)))
head(remove(from = dt, formula = Ozone + NULL ~ I(Ozone > 10)))
Rename variables
Description
Rename variables using formulas
Usage
rename(from, formula, ...)
Arguments
from |
a data.frame object with variables |
formula |
a formula indicating the columns to rename and their new names. See the Details section for an explanation. |
... |
further arguments |
Details
The formula is composed of two parts:
column_names ~ new_variables_name
The left-hand side selects the columns to rename, and the right-hand side gives the new names of the selected columns.
For example:
column_names1 + column_names2 ~ new_variables_name1 + new_variables_name2
changes the name of column_names1 to new_variables_name1 and the name of column_names2 to new_variables_name2.
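The pairing of old and new names is positional: the first name on the left is matched with the first name on the right, and so on. A base R sketch of that mapping, using the airquality example Ozone + Wind ~ Ozone_new + Wind_new, might look as follows; it illustrates the operation only and is not the package's implementation.

```r
# Base R sketch of `Ozone + Wind ~ Ozone_new + Wind_new`; not dformula's own code.
dt <- datasets::airquality

old_names <- c("Ozone", "Wind")          # left-hand side of the formula
new_names <- c("Ozone_new", "Wind_new")  # right-hand side, matched by position

renamed <- dt
names(renamed)[match(old_names, names(renamed))] <- new_names
```

Only the column names change; the data and the column order stay the same.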
Value
The original data.frame with changed column names
Author(s)
Alessio Serafini
Examples
data("airquality")
dt <- airquality
head(rename(from = dt, Ozone ~ Ozone1))
head(rename(from = dt, Ozone + Wind ~ Ozone_new + Wind_new))
Select a subset
Description
Selects the rows and the variables by specifying a condition using a formula.
Usage
select(from, formula = .~., as = NULL, na.remove = FALSE, na.return = FALSE,...)
Arguments
from |
a data.frame object with variables |
formula |
a formula indicating the columns and the rows to select. See the Details section for an explanation. |
as |
a character vector with names of new variables. |
na.remove |
a logical value indicating whether NA values should be removed |
na.return |
a logical value indicating whether only the observations with NA values should be shown |
... |
further arguments |
Details
The formula is composed of two parts:
column_names ~ row_conditions
The left-hand side gives the names of the columns to select, and the right-hand side gives the conditions used to select the rows, using the I() function.
For example:
column_names1 + column_names2 ~ I(column_names1 == "a") + I(column_names2 > 4)
First, rows are selected if the observations in column_names1 are equal to a and the observations in column_names2 are greater than 4; then column_names1 and column_names2 are returned.
If na.remove is set to TRUE, the observations with missing values are removed after the subsetting.
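For comparison, the same row-then-column selection can be sketched in base R. The snippet mirrors Ozone + Wind ~ I(Ozone > 10) on airquality; it is an illustration under the assumption that rows whose condition is NA are not selected, and it is not the package's implementation.

```r
# Base R sketch of `Ozone + Wind ~ I(Ozone > 10)` for selection; not dformula's own code.
dt <- datasets::airquality

# Step 1: rows that satisfy the condition.
# Assumption: rows where the condition is NA are not selected.
keep_rows <- !is.na(dt$Ozone) & dt$Ozone > 10

# Step 2: columns named on the left-hand side of the formula.
keep_cols <- c("Ozone", "Wind")

selected <- dt[keep_rows, keep_cols, drop = FALSE]

# na.remove = TRUE: drop rows with missing values after the subsetting.
selected_complete <- selected[stats::complete.cases(selected), ]
```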
Value
Returns a data.frame object containing the selected elements.
Author(s)
Alessio Serafini
Examples
data("airquality")
dt <- airquality
## Selects columns and filter rows
select(from = dt, formula = .~ I(Ozone > 10 & Wind > 10))
select(from = dt, formula = Ozone ~ I(Wind > 10))
select(from = dt, formula = Ozone + Wind~ I(Ozone > 10))
## All rows and filter columns
select(from = dt, formula = Ozone ~ .)
select(from = dt, formula = Ozone + Wind ~ NULL)
Transform variables
Description
Mutate input variables using a formula.
Usage
transform(from, formula, as = NULL,
na.remove = FALSE, logic_convert = TRUE, ...)
Arguments
from |
a data.frame object with variables |
formula |
a formula indicating the columns to transform and the operations to apply. See the Details section for an explanation. |
as |
a character vector with names of new variables. |
na.remove |
a logical value indicating whether NA values should be removed. |
logic_convert |
a logical value indicating whether the new logical variables are converted to
... |
further arguments |
Details
The formula is composed of two parts:
column_names ~ transformed_variables
The left-hand side gives the names of the columns to transform, and the right-hand side gives the operations applied to the selected columns, using the I() function.
For example:
column_names1 + column_names2 ~ I(log(column_names1)) + I(column_names2/100)
column_names1 is replaced by log(column_names1) and column_names2 is divided by 100.
If na.remove is set to TRUE, the variables are transformed, and then the observations with missing values are removed.
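Unlike adding variables, a transformation overwrites the selected columns in place. A base R sketch of the example formula, with the airquality columns Ozone and Wind standing in for column_names1 and column_names2, is shown below; it illustrates the operation only and is not the package's implementation.

```r
# Base R sketch of `Ozone + Wind ~ I(log(Ozone)) + I(Wind/100)`; not dformula's own code.
dt <- datasets::airquality

transformed <- dt
transformed$Ozone <- log(dt$Ozone)   # first left-hand column, first operation
transformed$Wind  <- dt$Wind / 100   # second left-hand column, second operation

# na.remove = TRUE: drop rows with missing values after the transformation.
transformed_complete <- transformed[stats::complete.cases(transformed), ]
```

The column names and their order are the same as in the input; only the values of the selected columns change.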
Value
Returns the original data.frame object with the transformed variables.
Author(s)
Alessio Serafini
Examples
data("airquality")
dt <- airquality
head(transform(from = dt, Ozone ~ I(Ozone-Ozone)))
head(transform(from = dt, Ozone ~ log(Ozone)))
head(transform(from = dt, Ozone ~ I(Ozone>5)))
head(transform(from = dt, Ozone ~ I(Ozone>5), logic_convert = TRUE))
head(transform(from = dt, ~ log()))
head(transform(from = dt, . ~ log()))
head(transform(from = dt, NULL ~ log()))
head(transform(from = dt, Ozone + Day ~ log()))
head(transform(from = dt, Ozone + Day ~ log(Ozone/100) + exp(Day)))
head(transform(from = dt, Ozone ~ log()))
head(transform(from = dt,Ozone + Wind ~ C(log(1))))
head(transform(from = dt,Ozone + Wind ~ log(Ozone) + C(10)))
head(transform(from = dt, Ozone + Wind~ C(log(Ozone))))
foo <- function(x, a = 100){return(x-x + a)}
head(transform(from = dt, Ozone + Wind ~ foo(a = 100)))
head(transform(from = dt, . ~ foo(a = 100)))
head(transform(from = dt, Ozone + Wind ~ log(log(1))))