---
title: "tidysummary"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{tidysummary}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
tidysummary: An Elegant Approach to Summarizing Clinical Data.
The goal of tidysummary is to streamline the analysis of clinical data by automatically selecting appropriate descriptive statistics and inference methods based on variable types.
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
# Prepare your data
Provide a data frame containing the variables to analyze, with variables in columns and observations in rows:
- Continuous variables: Numeric.
- Categorical variables: Factor (Ordinal Categorical variables: ordered Factor).
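As a minimal sketch of a data frame in this shape, the following base R code builds a small made-up dataset. The column names (`age`, `sex`, `stage`, `group`) and all values are hypothetical and only illustrate the expected column types.

``` r
# Hypothetical example data: names and values are illustrative only.
clinical <- data.frame(
  age   = c(54, 61, 47, 70, 58, 66),                 # continuous: numeric
  sex   = factor(c("F", "M", "M", "F", "F", "M")),   # categorical: factor
  stage = factor(c("I", "II", "III", "II", "I", "III"),
                 levels = c("I", "II", "III"),
                 ordered = TRUE),                    # ordinal: ordered factor
  group = factor(c("A", "A", "A", "B", "B", "B"))    # grouping variable
)

# Inspect the column types before analysis.
str(clinical)
```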
# add_var()
The add_var() function prepares your dataset for downstream analysis by classifying variables into:
- Continuous variables: Further subdivided by normality and equal variance assumptions.
- Categorical variables: Further subdivided by ordered status and expected frequency.
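To make these subdivisions concrete, the sketch below runs comparable checks by hand in base R. The specific tests used here (Shapiro-Wilk, Bartlett, chi-squared expected counts) and the `mtcars` cross-table are assumptions chosen for illustration; they are not confirmed to be what add_var() applies internally.

``` r
# Normality of a continuous variable within each group (Shapiro-Wilk).
norm_p <- tapply(iris$Sepal.Length, iris$Species,
                 function(x) shapiro.test(x)$p.value)

# Equality of variances across groups (Bartlett's test).
var_p <- bartlett.test(Sepal.Length ~ Species, data = iris)$p.value

# Expected cell frequencies for a categorical variable by group.
# suppressWarnings() hides the small-count warning from chisq.test().
tab <- table(mtcars$cyl, mtcars$am)
expected <- suppressWarnings(chisq.test(tab)$expected)

norm_p
var_p
expected
```

Small expected counts in `expected` are the usual reason to prefer an exact test over a chi-squared approximation.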
## Usage
Specify the variables to summarize in `var` and the grouping variable in `group`.
``` r
library(tidysummary)
library(magrittr)  # provides %>% if it is not already attached
data <- iris %>%
  add_var(var = c("Sepal.Length", "Sepal.Width"), group = "Species")
```
The function can automatically check normality using statistical tests. Use the `norm` argument to choose how:
### norm
- `'auto'`: The default. Checks normality automatically, but behaves like `'ask'` when n \> 1000.
- `'ask'`: Displays the automatic result and QQ plots, then prompts for manual confirmation.
- `true`: Treats all variables as normal.
- `false`: Treats all variables as non-normal.
``` r
data <- iris %>%
  add_var(var = c("Sepal.Length", "Sepal.Width"), group = "Species", norm = "ask")
```
# add_summary()
The add_summary() function summarizes the result of add_var() and returns a summary data frame with variables as rows and groups as columns.
## Usage
Pass the result of add_var() to add_summary():
``` r
summary <- data %>%
  add_summary()
```
To customize the summary style, use the following arguments:
### add_overall
- `TRUE`: The default. Includes an "Overall" summary column.
- `FALSE`: Shows only the per-group summary columns.
### continuous_format
Format string that overrides both `norm_continuous_format` and `unnorm_continuous_format`.
Accepted placeholders are `'{mean}'`, `'{SD}'`, `'{median}'`, `'{Q1}'`, `'{Q3}'`.
### norm_continuous_format
Default is `'{mean} ± {SD}'`. Accepted placeholders are the same as for `continuous_format`.
### unnorm_continuous_format
Default is `'{median} ({Q1}, {Q3})'`. Accepted placeholders are the same as for `continuous_format`.
### categorical_format
Format string for categorical variables. Default is `'{n} ({pct})'`. Accepted placeholders are `'{n}'` and `'{pct}'`.
### binary_show
- `'last'`: The default. Shows only the last level.
- `'first'`: Shows only the first level.
- `'all'`: Shows all levels.
``` r
summary <- data %>%
  add_summary(add_overall = TRUE,
              continuous_format = "{mean} ± {SD}",
              categorical_format = "{n} ({pct})",
              binary_show = "last")
```
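To illustrate how placeholder-based format strings of this kind can be read, the sketch below fills `'{mean} ± {SD}'` and `'{median} ({Q1}, {Q3})'` by hand in base R. The helper `fill_format()` is hypothetical and the rounding to two decimals is an assumption; this is not the package's implementation.

``` r
# Hypothetical helper: replace each {name} placeholder with a formatted value.
fill_format <- function(fmt, values, digits = 2) {
  for (name in names(values)) {
    fmt <- gsub(paste0("{", name, "}"),
                formatC(values[[name]], format = "f", digits = digits),
                fmt, fixed = TRUE)
  }
  fmt
}

# Summary statistics matching the documented placeholders.
x <- iris$Sepal.Length
q <- quantile(x, c(0.25, 0.75), names = FALSE)
stats <- list(mean = mean(x), SD = sd(x), median = median(x),
              Q1 = q[1], Q3 = q[2])

fill_format("{mean} ± {SD}", stats)
fill_format("{median} ({Q1}, {Q3})", stats)
```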
# add_p()
The add_p() function adds p-values to the result of add_summary() and returns a summary_with_p data frame with variables as rows and groups as columns.
## Usage
Pass the result of add_summary() to add_p():
``` r
summary_with_p <- summary %>%
  add_p()
```
To customize the columns of summary_with_p, use the following arguments:
### asterisk
- `TRUE`: Shows asterisk significance markers.
- `FALSE`: The default. Shows p-values.
### add_method
- `TRUE`: Shows the method text.
- `FALSE`: The default. Does not show the method text.
- `'code'`: Shows methods as codes, numbered in order of appearance.
### add_statistic_name
- `TRUE`: Shows the statistic name.
- `FALSE`: The default. Does not show the statistic name.
### add_statistic_value
- `TRUE`: Shows the statistic value.
- `FALSE`: The default. Does not show the statistic value.
``` r
summary_with_p <- summary %>%
  add_p(asterisk = TRUE, add_method = "code")
```
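As a rough illustration of asterisk significance markers, the sketch below maps p-values to stars in base R. The cut-offs (0.05, 0.01, 0.001) are a common convention assumed here, and `p_to_stars()` is a hypothetical helper; check the package documentation for the thresholds add_p() actually uses.

``` r
# Hypothetical helper: convert p-values to significance stars.
# Assumed cut-offs: p < 0.001 "***", p < 0.01 "**", p < 0.05 "*", otherwise "".
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                   labels = c("***", "**", "*", ""),
                   right = FALSE))
}

p_to_stars(c(0.2, 0.03, 0.004, 0.0002))
```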