Skip to content

tsostarics/contrastable

Repository files navigation

contrastable

CRAN status

Codecov test coverage Lifecycle: stable R-CMD-check

This package provides utilities to set different common contrast coding schemes for use with regression models. Please see tsostarics.github.io/contrastable/ for the package website containing documentation, get started guide, and related vignettes.

Installation

You can install from CRAN with:

install.packages("contrastable")

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("tsostarics/contrastable", build_vignettes = TRUE)

Citation

To cite contrastable in publications, please use

Sostarics, T. (2024). contrastable: Contrast Coding Utilities in R. R package version 1.0.2. https://CRAN.R-project.org/package=contrastable. doi:10.32614/CRAN.package.contrastable

A BibTeX entry for LaTeX users is

@Manual{contrastable,
  author = {Thomas Sostarics},
  title = {{contrastable}: Contrast Coding Utilities in {R}},
  year = {2024},
  note = {R package version 1.0.2},
  url = {https://CRAN.R-project.org/package=contrastable},
  doi = {10.32614/CRAN.package.contrastable},
}

See the Citation Examples section in the contrasts vignette for suggestions and examples of how to cite this package in a paper.

Usage

Here is a simple example showing how to set particular factors to a specific contrast scheme.

library(contrastable)
my_data <- mtcars
my_data$gear <- ordered(my_data$gear) # Set as ordered factor in dataframe

set_contrasts can be used to set the contrasts onto the dataframe itself, which is needed when a modeling function lacks a contrasts argument.

# Specify the contrast schemes we want, factor conversion done automatically
# Set reference level with + and intercept with *
my_data <- set_contrasts(my_data, 
                         cyl ~ scaled_sum_code + 6,
                         carb ~ helmert_code,
                         vs ~ treatment_code + 1,
                         print_contrasts = TRUE)
#> Converting to factors: cyl carb vs
#> Expect contr.treatment or contr.poly for unset factors: gear
#> $cyl
#>   4    8   
#> 4  2/3 -1/3
#> 6 -1/3 -1/3
#> 8 -1/3  2/3
#> 
#> $carb
#>   <2   <3   <4   <6   <8  
#> 1 -1/2 -1/3 -1/4 -1/5 -1/6
#> 2  1/2 -1/3 -1/4 -1/5 -1/6
#> 3    0  2/3 -1/4 -1/5 -1/6
#> 4    0    0  3/4 -1/5 -1/6
#> 6    0    0    0  4/5 -1/6
#> 8    0    0    0    0  5/6
#> 
#> $vs
#>   0
#> 0 1
#> 1 0

We can use glimpse_contrasts to get information about the factors and diagnostics about the scheme we have set.

# Create a reusable list to use with multiple functions
contrast_schemes <- list(
  cyl ~ scaled_sum_code + 6,
  carb ~ helmert_code,
  vs ~ treatment_code + 1
)

# Get information about our contrasts, even those we didn't explicitly set
# (gear is ordered, and so uses contr.poly by default)
glimpse_contrasts(my_data,
                  contrast_schemes,
                  add_namespace = TRUE,
                  show_all_factors = TRUE) |>
  knitr::kable()
factor n level_names scheme reference intercept
cyl 3 4, 6, 8 contrastable::scaled_sum_code 6 grand mean
carb 6 1, 2, 3,…. contrastable::helmert_code NA grand mean
vs 2 0, 1 contrastable::treatment_code 1 mean(1)
gear 3 3, 4, 5 stats::contr.poly NA grand mean

enlist_contrasts can be used to generate a named list of contrasts that can be used in the contrasts argument of various modeling functions.

# Get a list of the contrasts we've explicitly set
enlist_contrasts(mtcars, contrast_schemes)
#> $cyl
#>   4    8   
#> 4  2/3 -1/3
#> 6 -1/3 -1/3
#> 8 -1/3  2/3
#> 
#> $carb
#>   <2   <3   <4   <6   <8  
#> 1 -1/2 -1/3 -1/4 -1/5 -1/6
#> 2  1/2 -1/3 -1/4 -1/5 -1/6
#> 3    0  2/3 -1/4 -1/5 -1/6
#> 4    0    0  3/4 -1/5 -1/6
#> 6    0    0    0  4/5 -1/6
#> 8    0    0    0    0  5/6
#> 
#> $vs
#>   0
#> 0 1
#> 1 0

You can also set multiple contrasts at once using {tidyselect} functionality.

# Create a new dataframe with a bunch of factors
my_data2 <- 
  data.frame(a = gl(2,10),
             b = gl(5,2, ordered = TRUE),
             c = gl(5,2),
             d = 1:10,
             e = 11:20)

enlist_contrasts(my_data2,
                 where(is.ordered) ~ polynomial_code,
                 where(is.unordered) ~ helmert_code,
                 d + e ~ sum_code)
#> $b
#>              .L         .Q            .C         ^4
#> 1 -6.324555e-01  0.5345225 -3.162278e-01  0.1195229
#> 2 -3.162278e-01 -0.2672612  6.324555e-01 -0.4780914
#> 3 -3.510833e-17 -0.5345225  1.755417e-16  0.7171372
#> 4  3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
#> 5  6.324555e-01  0.5345225  3.162278e-01  0.1195229
#> 
#> $a
#>     <2
#> 1 -0.5
#> 2  0.5
#> 
#> $c
#>     <2         <3    <4   <5
#> 1 -0.5 -0.3333333 -0.25 -0.2
#> 2  0.5 -0.3333333 -0.25 -0.2
#> 3  0.0  0.6666667 -0.25 -0.2
#> 4  0.0  0.0000000  0.75 -0.2
#> 5  0.0  0.0000000  0.00  0.8
#> 
#> $d
#>     2  3  4  5  6  7  8  9 10
#> 1  -1 -1 -1 -1 -1 -1 -1 -1 -1
#> 2   1  0  0  0  0  0  0  0  0
#> 3   0  1  0  0  0  0  0  0  0
#> 4   0  0  1  0  0  0  0  0  0
#> 5   0  0  0  1  0  0  0  0  0
#> 6   0  0  0  0  1  0  0  0  0
#> 7   0  0  0  0  0  1  0  0  0
#> 8   0  0  0  0  0  0  1  0  0
#> 9   0  0  0  0  0  0  0  1  0
#> 10  0  0  0  0  0  0  0  0  1
#> 
#> $e
#>    12 13 14 15 16 17 18 19 20
#> 11 -1 -1 -1 -1 -1 -1 -1 -1 -1
#> 12  1  0  0  0  0  0  0  0  0
#> 13  0  1  0  0  0  0  0  0  0
#> 14  0  0  1  0  0  0  0  0  0
#> 15  0  0  0  1  0  0  0  0  0
#> 16  0  0  0  0  1  0  0  0  0
#> 17  0  0  0  0  0  1  0  0  0
#> 18  0  0  0  0  0  0  1  0  0
#> 19  0  0  0  0  0  0  0  1  0
#> 20  0  0  0  0  0  0  0  0  1

The functions in this package aim to be helpful when potential mistakes are made and transparent when things happen behind the scenes (e.g., automatic factor coercion). You can check out descriptions of various messages and warnings in the warnings vignette with vignette('warnings', 'contrastable').