---
title: "ForCausality: A Curated Collection of Causal Inference Datasets and Tools"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ForCausality: A Curated Collection of Causal Inference Datasets and Tools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(ForCausality)
library(ggplot2)
library(dplyr)
```

# Introduction

The `ForCausality` package provides a curated and comprehensive collection of datasets designed for **causal inference research**. It brings together data from diverse domains such as **clinical trials, cancer studies, epidemiological surveys, environmental exposures, and health-related observational studies**.

The package includes a wide range of data types, covering **treatment outcomes, risk factors, survival data, case-control studies, and exposure assessments**. These datasets enable researchers and students to perform **causal analysis, risk evaluation, and advanced statistical modeling**, supporting both applied work and methodological development in causal inference.

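
To see which datasets an installed copy of the package actually ships, base R's `utils::data()` can be queried for the package's data index. The chunk below is a minimal sketch: `list_package_datasets()` is a hypothetical helper written for this vignette, not a function exported by `ForCausality`, and the guard only exists so the chunk does nothing if the package is unavailable.

```{r list-datasets}
# Hypothetical helper (not part of ForCausality): return the names and titles
# of the datasets that an installed package ships, as a data frame.
list_package_datasets <- function(pkg) {
  info <- utils::data(package = pkg)$results
  data.frame(
    name = unname(info[, "Item"]),
    title = unname(info[, "Title"]),
    stringsAsFactors = FALSE
  )
}

# Only query ForCausality if it is installed
if (requireNamespace("ForCausality", quietly = TRUE)) {
  causal_datasets <- list_package_datasets("ForCausality")
  head(causal_datasets)
}
```

The helper works for any installed package, so the same call can be used to compare `ForCausality` with other data packages.
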

## Dataset Suffixes

Each dataset in the `ForCausality` package uses a suffix to denote the type of R object:

- `_df`: A data frame

## Example Datasets

Below are selected example datasets included in the `ForCausality` package:

- `Colon_df`: Chemotherapy for Stage B/C colon cancer
- `Stroke_df`: Fictional ischemic stroke case-control data with risk factors, exposures, and confounders
- `Pph_df`: An external control trial of treatments for post-partum hemorrhage

## Data Visualization with Colon Data

```{r patterns-subcortical-plot, fig.width=6, fig.height=4, out.width="100%", fig.alt = "Bar chart showing the number of patients by treatment group in the Colon_df dataset"}
# Summarize the number of patients per treatment group
colon_summary <- Colon_df %>%
  group_by(rx) %>%
  summarise(count = n())

# Create a simple bar chart (geom_col() plots the precomputed counts directly)
ggplot(colon_summary, aes(x = rx, y = count, fill = rx)) +
  geom_col() +
  labs(
    title = "Number of Patients by Treatment Group",
    x = "Treatment Group",
    y = "Number of Patients"
  ) +
  theme_minimal() +
  guides(fill = "none") # Hide the legend since the x-axis already shows the groups
```

## Conclusion

The `ForCausality` package provides a well-curated collection of datasets specifically tailored for **causal inference research**. It integrates data from clinical trials, cancer studies, epidemiological surveys, environmental exposures, and health-related observational studies.

By offering structured and documented datasets, the package facilitates **causal analysis, risk assessment, and advanced statistical modeling**, serving as a valuable resource for researchers, educators, and students interested in causal inference.

For detailed information and full documentation of each dataset, please refer to the reference manual and help files included within the package.
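
The help files mentioned above can be opened from the R console. The chunk below is a minimal sketch, assuming the package is installed and that each dataset is documented under a help topic with the same name as the object (for example `Colon_df`); it is not evaluated when the vignette is built because it opens interactive help pages.

```{r dataset-help, eval = FALSE}
# Open the help page for a single dataset (equivalent to ?Colon_df)
help("Colon_df", package = "ForCausality")

# Browse the index of all documented topics in the package
help(package = "ForCausality")
```
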