Help for package crimeutils

Title:

A Comprehensive Set of Functions to Clean, Analyze, and Present Crime Data

Version:

0.5.1

Description:

A collection of functions that make it easier to understand crime (or other) data, and assist others in understanding it. The package helps you read data from various sources, clean it, fix column names, and graph the data.

Depends:

R (≥ 2.10)

Imports:

dplyr, stringr, ggplot2, readr, gridExtra, scales, magrittr, gt, grDevices, tidyr, stats, methods, rlang

License:

MIT + file LICENSE

URL:

https://github.com/jacobkap/crimeutils/

BugReports:

https://github.com/jacobkap/crimeutils/issues/

RoxygenNote:

7.2.2

Suggests:

spelling, testthat (≥ 2.1.0), covr

Language:

en-US

Encoding:

UTF-8

NeedsCompilation:

Packaged:

2022-12-07 05:04:32 UTC; jkkap

Author:

Jacob Kaplan

[aut, cre]

Maintainer:

Jacob Kaplan <jkkaplan6@gmail.com>

Repository:

CRAN

Date/Publication:

2022-12-07 15:10:07 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Capitalizes the first letter of every word

Description

Capitalizes the first letter of every word

Usage

capitalize_words(words, lowercase_of = TRUE)

Arguments

words

A string or vector of strings with words you want capitalized

lowercase_of

If TRUE (default), keeps the string " of " lowercase, as is customary in English writing (e.g. District of Columbia).

Value

The original string with the first letter of each word capitalized

Examples

capitalize_words("district of columbia")
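
The behavior described above can be approximated in base R. This is a minimal sketch, not the package's implementation; the helper name capitalize_sketch is hypothetical.

```r
# Hypothetical helper illustrating the idea; not the crimeutils source code.
capitalize_sketch <- function(words, lowercase_of = TRUE) {
  # \\b matches a word boundary; \\U upper-cases the captured letter (needs perl = TRUE)
  out <- gsub("\\b([a-z])", "\\U\\1", words, perl = TRUE)
  if (lowercase_of) {
    # Put " Of " back to lowercase, e.g. "District of Columbia"
    out <- gsub(" Of ", " of ", out, fixed = TRUE)
  }
  out
}

capitalize_sketch("district of columbia")
capitalize_sketch("district of columbia", lowercase_of = FALSE)
```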

Creates new columns to indicate which values are outliers based on the average value.

Description

Creates new columns to indicate which values are outliers based on the average value.

Usage

indicate_outliers(
  data,
  select_columns = NULL,
  group_variable,
  std_dev_value = 1.96,
  zero_is_outlier = FALSE
)

Arguments

data

A data.frame

select_columns

A string or vector of strings with the name(s) of the numeric columns to check for outliers. If NULL (default), will use all numeric columns in the data.

group_variable

A string with the name of the column with the grouping variable.

std_dev_value

A number indicating how many standard deviations away from the mean to determine if a value is an outlier.

zero_is_outlier

If TRUE (not default), reports any zero value as an outlier.

Value

The initial data.frame with new columns for each numeric variable included with a value of 0 if not an outlier and 1 if that row is an outlier.

Examples

indicate_outliers(mtcars, "drat", group_variable = "am")
indicate_outliers(mtcars, "drat", group_variable = "am", zero_is_outlier = TRUE)
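
To make the outlier rule concrete, the sketch below flags values that lie more than std_dev_value standard deviations from their group mean. It assumes that this is the rule the function applies; the helper name flag_outliers_sketch and the "_outlier" column suffix are hypothetical and may differ from what the package produces.

```r
# Hypothetical helper; assumes outliers are judged against the group mean and sd.
flag_outliers_sketch <- function(data, column, group_variable,
                                 std_dev_value = 1.96,
                                 zero_is_outlier = FALSE) {
  values <- data[[column]]
  groups <- data[[group_variable]]
  # ave() returns the group-level statistic repeated for every row of the group
  group_mean <- ave(values, groups, FUN = mean)
  group_sd   <- ave(values, groups, FUN = sd)
  is_outlier <- abs(values - group_mean) > std_dev_value * group_sd
  if (zero_is_outlier) {
    is_outlier <- is_outlier | values == 0
  }
  # 1 if the row is an outlier, 0 otherwise (column suffix is an assumption)
  data[[paste0(column, "_outlier")]] <- as.integer(is_outlier)
  data
}

flagged <- flag_outliers_sketch(mtcars, "drat", group_variable = "am")
table(flagged$drat_outlier)
```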

Create a line graph with 95% confidence interval bars

Description

Create a line graph with 95% confidence interval bars

Usage

make_average_graph(
  data,
  x_col,
  y_col,
  confidence_interval_error_bars = TRUE,
  mean_line = TRUE,
  type = c("line", "bar")
)

Arguments

data

A data.frame with the data you want to graph

x_col

A string with the name of the x-axis column

y_col

A string with the name of the y-axis column

confidence_interval_error_bars

A boolean (default TRUE) for whether to include 95% confidence intervals or not.

mean_line

If TRUE (default), will add a dashed line with the overall mean.

type

A string for whether it should make a line graph ("line", default) or a bar graph ("bar").

Value

A ggplot object. Also prints the graph to the Plots panel.

Examples

data = data.frame(x = sample(15:25, size = 200, replace = TRUE),
y = sample(1:100, size = 200, replace = TRUE))
make_average_graph(data, "x", "y")
make_average_graph(data, "x", "y", confidence_interval_error_bars  = FALSE)
make_average_graph(data, "x", "y", type = "bar", mean_line = FALSE)
make_average_graph(data, "x", "y", confidence_interval_error_bars  = FALSE, type = "bar")
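
The quantities such a graph displays can be computed by hand. The sketch below calculates the mean of y and a 95% confidence interval for each value of x, assuming a normal approximation (mean plus or minus 1.96 standard errors). Whether the package uses exactly this formula is an assumption, and summary_sketch is a hypothetical name.

```r
set.seed(1)
data <- data.frame(x = sample(15:25, size = 200, replace = TRUE),
                   y = sample(1:100, size = 200, replace = TRUE))

# One row per x value: mean of y and an approximate 95% confidence interval
summary_sketch <- do.call(rbind, lapply(split(data, data$x), function(d) {
  m  <- mean(d$y)
  se <- sd(d$y) / sqrt(nrow(d))  # standard error of the mean
  data.frame(x = d$x[1], mean = m,
             lower = m - 1.96 * se, upper = m + 1.96 * se)
}))

summary_sketch
mean(data$y)  # the overall mean, i.e. what a dashed mean line would show
```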

Make a nice-looking barplot.

Description

Make a nice-looking barplot.

Usage

make_barplots(data, column, count = TRUE, title = NULL, ylab = NULL)

Arguments

data

A data.frame with the data you want to graph.

column

A string with the name of the column you want to make the plot from.

count

A boolean (default TRUE) indicating if you want the barplot to show a count of the column values or a percent.

title

A string with the text you want as the title.

ylab

A string with the text you want as the y-axis label.

Value

A barplot object.

Examples

make_barplots(mtcars, "cyl")

make_barplots(mtcars, "cyl", count = FALSE, title = "hello", ylab = "YLAB Label")

Create a descriptive statistics table from numeric variables

Description

Create a descriptive statistics table from numeric variables

Usage

make_desc_stats_table(
  data,
  columns,
  output = c("min", "median", "mean", "sd", "max", "sum", "NAs"),
  decimals = 2,
  title = NULL,
  subtitle = NULL,
  footnote = NULL
)

Arguments

data

A data.frame with the data you want to make the table from.

columns

A string or vector of strings with the names of the columns you want to use.

output

A string or vector of strings indicating which math functions you want to perform on the columns and present in the table. Options are: 'min', 'median', 'mean', 'sd', 'max', 'sum', and 'NAs'. Default is to use all of these math functions. The order you put in these values is the order the table will present the columns.

decimals

A positive integer for how many decimal places you want to round to.

title

A string with the text you want as the title.

subtitle

A string with the text you want as the subtitle.

footnote

A string with the text you want as the footnote.

Value

A data.frame with the data that generates the table, which is outputted in the Viewer tab.

Examples

make_desc_stats_table(mtcars, columns = c("mpg", "disp", "wt", "cyl"))

make_desc_stats_table(mtcars, c("mpg", "disp", "wt"), output = c("mean", "min"),
decimals = 4, title = "hello", subtitle = "world")

Creates a .tex file with LaTeX code to create a table from an R data.frame.

Description

Creates a .tex file with LaTeX code to create a table from an R data.frame.

Usage

make_latex_tables(
  data,
  file,
  caption = "",
  label = "",
  multi_column = NULL,
  footnote = "",
  sideways = FALSE,
  longtable = FALSE
)

Arguments

data

A data.frame or a list of data.frames. If a data.frame, the table is created with the values in that data.frame. If a list of data.frames, the table gets one panel for each data.frame. If the list is named, will use the names to create panel labels.

file

A string with the name of the file to save the .tex as.

caption

(Optional) A string with the caption for the table (i.e. the table title).

label

(Optional) A string with the reference for the table - to be used when referencing the table in the text. If NULL,

multi_column

(Optional) A named vector with the names being the names of the multi-column and the values being the width of the multi-column.

footnote

(Optional) A string with text for the footnote of the table.

sideways

(Optional) If TRUE, will make a sideways table (useful for large tables), otherwise (default) will make a normal table.

longtable

(Optional) If TRUE, will make a longtable table (useful for long tables), otherwise (default) will make a normal table.

Value

Nothing. It will create a .tex file in the current working directory.

Examples

## Not run: 
make_latex_tables(mtcars, file =  "text.tex", caption = "This is a description of the table",
label = "internal_table_label", footnote = "Here is some info you should know to read this table",
longtable = TRUE)

## End(Not run)

Create a table showing the mean, median, and mode of a certain column

Description

Create a table showing the mean, median, and mode of a certain column

Usage

make_mean_median_mode_table_by_group(
  data,
  group_column,
  data_column,
  total_row = TRUE
)

Arguments

data

A data.frame with the data you want to make the table from.

group_column

A string with the name of the variable you are grouping by

data_column

A string with the name of the variable you want to get the mean, median, and mode from. The variable should be numeric.

total_row

A boolean (default TRUE) for whether to include a row at the bottom for the overall mean, median, and mode (i.e. not by group).

Value

A data.frame with the first column showing the category grouped by. Then one column for the mean, one column for the median, and one column for the mode.

Examples

make_mean_median_mode_table_by_group(mtcars, "gear", "mpg")
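
Base R has no built-in function for the statistical mode, so a by-group table like this one needs a small helper. The sketch below is one way to write it; mode_sketch is a hypothetical name, and breaking ties by taking the smallest value is an assumption, not necessarily what the package does.

```r
# Hypothetical helper: most frequent value; ties resolve to the smallest value
mode_sketch <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# One row per gear value with the mean, median, and mode of mpg
by_gear <- do.call(rbind, lapply(split(mtcars$mpg, mtcars$gear), function(v) {
  data.frame(mean = mean(v), median = median(v), mode = mode_sketch(v))
}))
by_gear
```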

Get mean and standard deviation of variables by group

Description

Get mean and standard deviation of variables by group

Usage

make_mean_std_dev_by_group_table(data, group_column, columns, total_row = TRUE)

Arguments

data

A data.frame with the data you want to make the table from.

group_column

A string with the name of the variable you are grouping by

columns

A string or vector of strings for the variables you want to get the mean and standard deviation for.

total_row

A boolean (default TRUE) for whether to include a row at the bottom for the overall mean and standard deviation (i.e. not by group).

Value

A data.frame with the first column showing the category grouped by. Then one column for each variable you want the mean and standard deviation for. Will give the mean and standard deviation as a single string with the standard deviation in parentheses.

Examples

make_mean_std_dev_by_group_table(mtcars, "gear", c("mpg", "disp"))
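
The "mean (standard deviation)" string format described in the Value section can be reproduced with tapply() and sprintf(). This is a sketch under the assumption of two decimal places; the exact formatting the package uses may differ, and mean_sd_sketch is a hypothetical name.

```r
# Group-wise mean and standard deviation of mpg by gear
group_means <- tapply(mtcars$mpg, mtcars$gear, mean)
group_sds   <- tapply(mtcars$mpg, mtcars$gear, sd)

# Combine into a single string per group, e.g. "mean (sd)"
mean_sd_sketch <- data.frame(
  gear = names(group_means),
  mpg  = sprintf("%.2f (%.2f)", as.numeric(group_means), as.numeric(group_sds))
)
mean_sd_sketch
```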

Make a table showing the number (n) and percent of the population (e.g. % of nrow()) for each value in a variable(s).

Description

Make a table showing the number (n) and percent of the population (e.g. % of nrow()) for each value in a variable(s).

Usage

make_n_and_percent_table(data, columns)

Arguments

data

A data.frame with the data you want to make the table from.

columns

A string or vector of strings with the column names to make the N and % from.

Value

A data.frame with one row for each value in the inputted variable(s) and columns showing the N and % for that value.

Examples

make_n_and_percent_table(mtcars, c("cyl", "gear"))
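
For a single column, the N and % values can be derived with table() and prop.table(). The sketch below shows the idea for mtcars$cyl; the column names in n_pct_sketch are hypothetical and need not match the package's output.

```r
counts <- table(mtcars$cyl)

# One row per distinct value with its count and share of all rows
n_pct_sketch <- data.frame(
  value   = names(counts),
  n       = as.vector(counts),
  percent = as.vector(round(100 * prop.table(counts), 2))
)
n_pct_sketch
```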

Make a graph of coefficient values and 95 percent confidence interval for regression.

Description

Make a graph of coefficient values and 95 percent confidence interval for regression.

Usage

make_regression_graph(model, coefficients = NULL)

Arguments

model

A 'lm' object made from making a model using 'lm()'.

coefficients

A string or vector of strings with the coefficient names. Will then make the graph only with those coefficients.

Value

Outputs a 'ggplot2' graph

Examples

make_regression_graph(model = lm(mpg ~ cyl + disp + hp + drat, data = mtcars))
make_regression_graph(model = lm(mpg ~ cyl + disp + hp + drat, data = mtcars),
coefficients = c("cyl", "disp"))
make_regression_graph(model = lm(mpg ~ cyl + disp, data = mtcars))

Turns regression results into a data.frame for easy conversion to a table

Description

Turns regression results into a data.frame for easy conversion to a table

Usage

make_regression_table(model, coefficients_only = TRUE)

Arguments

model

A 'lm' object made from making a model using 'lm()'.

coefficients_only

If TRUE (default), returns only the coefficients, standard error, t-value, p-value, and confidence intervals. Otherwise, also returns the r-squared, the adjusted r-squared, the f-stat, the p-value for the f-stat, and the degrees of freedom.

Value

A data.frame with the regression results

Examples

make_regression_table(lm(mpg ~ cyl, data = mtcars))
make_regression_table(lm(mpg ~ cyl, data = mtcars), coefficients_only = FALSE)
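
The pieces listed for coefficients_only can all be pulled from an 'lm' object with base R. The sketch below assembles them by hand; it illustrates where the numbers come from and is not the package's implementation. The name regression_sketch is hypothetical.

```r
model <- lm(mpg ~ cyl, data = mtcars)

# Estimate, standard error, t-value, and p-value for each coefficient
coef_table <- as.data.frame(coef(summary(model)))

# 95% confidence intervals for the same coefficients
ci <- confint(model, level = 0.95)

regression_sketch <- cbind(coef_table, ci)
regression_sketch

# Model-level statistics, as returned when coefficients_only = FALSE
summary(model)$r.squared
summary(model)$adj.r.squared
summary(model)$fstatistic
```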

Make a nice-looking stat_count (similar to barplot) plot.

Description

Make a nice-looking stat_count (similar to barplot) plot.

Usage

make_stat_count_plots(
  data,
  column,
  count = TRUE,
  title = NULL,
  ylab = NULL,
  xlab = NULL
)

Arguments

data

A data.frame with the data you want to graph.

column

A string with the name of the column you want to make the plot from.

count

A boolean (default TRUE) indicating if you want the barplot to show a count of the column values or a percent.

title

A string with the text you want as the title.

ylab

A string with the text you want as the y-axis label.

xlab

A string with the text you want as the x-axis label.

Value

A stat_count object

Examples

make_stat_count_plots(mtcars, "mpg")

make_stat_count_plots(mtcars, "mpg", count = FALSE, title = "hello", ylab = "YLAB Label")

Returns abbreviations of state name input.

Description

Returns abbreviations of state name input.

Usage

make_state_abb(state)

Arguments

state

A vector of strings with the names of US states.

Value

A vector of strings with the abbreviations of the inputted state names.

Examples

make_state_abb("california")
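
A similar lookup can be written with the state.name and state.abb vectors that ship with base R. The sketch below assumes case-insensitive matching; state_abb_sketch is a hypothetical name, and names not found in state.name (which covers only the 50 states) return NA.

```r
# Hypothetical helper built on base R's state.name / state.abb vectors
state_abb_sketch <- function(state) {
  state.abb[match(tolower(state), tolower(state.name))]
}

state_abb_sketch(c("california", "New York"))
```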

Pad decimal places with trailing zeros.

Description

Pad decimal places with trailing zeros.

Usage

pad_decimals(numbers, digits = NULL)

Arguments

numbers

A number or vector of numbers.

digits

Number of decimal places to pad. If NULL (default), uses the maximum number of decimal places in the numbers input. If digits is less than the number of decimal places in the data, rounds the data to the decimal place specified. If rounding at a 5, follows R's rules to round to the nearest even number.

Value

The original numbers, now as strings with trailing zeros added to the decimal places.

Examples

pad_decimals(c(2, 3.4, 8.808))
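
Padding to a fixed number of decimal places can be sketched with base R's formatC(). This is an illustration only: pad_sketch is a hypothetical name, it requires digits to be given explicitly, and because formatC() relies on C-level formatting, its handling of exact ties may not match the rounding rule described above.

```r
# Hypothetical helper: fixed-point formatting with trailing zeros
pad_sketch <- function(numbers, digits) {
  formatC(numbers, format = "f", digits = digits)
}

pad_sketch(c(2, 3.4, 8.808), digits = 3)
```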

A set of colorblind friendly colors for graphs.

Description

A set of colorblind friendly colors for graphs.

Usage

scale_color_crim(...)

Arguments

...

Arguments passed to discrete_scale()

Value

The ggplot graph with colors set.

Examples

ggplot2::ggplot(mtcars, ggplot2::aes(x = mpg, y = hp, color = as.character(cyl))) +
  ggplot2::geom_point(size = 2) +
  scale_color_crim()

A set of colorblind friendly fill colors for graphs.

Description

A set of colorblind friendly fill colors for graphs.

Usage

scale_fill_crim(...)

Arguments

...

Arguments passed to discrete_scale()

Value

The ggplot graph with fills set.

Examples

ggplot2::ggplot(mtcars, ggplot2::aes(x = cyl, fill = as.character(cyl))) +
 ggplot2::geom_bar() +
  scale_fill_crim()

A set of linetypes

Description

A set of linetypes

Usage

scale_linetype_crim(...)

Arguments

...

Arguments passed to discrete_scale()

Value

The ggplot graph with linetypes set.

Examples

ggplot2::ggplot(mtcars, ggplot2::aes(x = mpg, y = hp, linetype = as.character(cyl))) +
  ggplot2::geom_line(size = 1) +
  scale_linetype_crim() +
  theme_crim()

Create a PDF with one scatterplot for each group in the data.

Description

Create a PDF with one scatterplot for each group in the data.

Usage

scatterplot_data_graph(
  data,
  numeric_variable1,
  numeric_variable2,
  group_variable,
  file_name
)

Arguments

data

A data.frame with the data you want to graph.

numeric_variable1

A string with the name of the first column with numeric data to graph.

numeric_variable2

A string with the name of the second column with numeric data to graph.

group_variable

A string with the name of the column with the grouping variable.

file_name

A string with the name of the PDF to be made with one page for each graph.

Value

A PDF with one page per graph

Examples

## Not run: 
scatterplot_data_graph(mtcars, numeric_variable1 = "mpg", numeric_variable2 = "disp",
group_variable = "gear", file_name = "test.pdf")

## End(Not run)

A minimalist theme designed for graphics in academic research

Description

A minimalist theme designed for graphics in academic research

Usage

theme_crim()

Value

The graph with the theme changed.

Examples

ggplot2::ggplot(mtcars) +
ggplot2::geom_point(ggplot2::aes(x = wt, y = mpg)) +
theme_crim()

Create a PDF with one time-series graph for each group in the data.

Description

Create a PDF with one time-series graph for each group in the data.

Usage

time_series_data_graph(
  data,
  numeric_variable,
  time_variable,
  group_variable,
  outlier_std_dev_value = 1.96,
  file_name
)

Arguments

data

A data.frame with the data you want to graph.

numeric_variable

A string with the name of the column with numeric data to graph.

time_variable

A string with the name of the column that contains the time variable.

group_variable

A string with the name of the column with the grouping variable.

outlier_std_dev_value

A number that indicates how many standard deviations from the group mean a value must be to count as an outlier. Outliers will be colored orange in the graph.

file_name

A string with the name of the PDF to be made with one page for each graph.

Value

A PDF with one page per graph

Get ORIs that consistently report their data every year.

Description

Get ORIs that consistently report their data every year.

Usage

ucr_constant_reporter_oris(data, minimum_months_reported)

Arguments

data

A data.frame with Uniform Crime Report (UCR) data. Requires at least the ORI, year, and number_of_months_reported columns.

minimum_months_reported

Integer indicating the minimum number of months an agency must report in each year to be kept in the data.

Value

A vector with the ORIs that report the minimum number of months for every year in the data.
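
The selection rule can be sketched in base R as follows. The toy data, the lowercase column name ori, and the helper name constant_reporters_sketch are all assumptions made for illustration; the package's own logic may differ in detail.

```r
# Toy UCR-style data: agency "B" reports only 6 months in 2020
ucr <- data.frame(
  ori  = c("A", "A", "B", "B"),
  year = c(2019, 2020, 2019, 2020),
  number_of_months_reported = c(12, 12, 12, 6)
)

# Hypothetical helper: keep ORIs that meet the minimum in every year of the data
constant_reporters_sketch <- function(data, minimum_months_reported) {
  meets_minimum <- data$number_of_months_reported >= minimum_months_reported
  n_years <- length(unique(data$year))
  # Count the distinct years in which each ORI met the minimum
  years_ok <- tapply(data$year[meets_minimum], data$ori[meets_minimum],
                     function(y) length(unique(y)))
  as.vector(names(years_ok)[as.vector(years_ok) == n_years])
}

constant_reporters_sketch(ucr, minimum_months_reported = 12)
constant_reporters_sketch(ucr, minimum_months_reported = 6)
```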