Expand a variable into binary indicators
expand_bin.Rd
This function takes a dataframe and one or more variables, and expands each variable into binary indicators. Each variable is split by the split_by separator, and each choice is represented by a binary column. Each binary column is named by joining the original variable name and the choice name with the bin_sep separator.
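The splitting idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `sketch_expand` is a hypothetical helper, it handles a single character vector, and it ignores every argument other than split_by and bin_sep.

```r
# Hypothetical helper, not part of the package: illustrates the splitting idea only.
sketch_expand <- function(x, var_name, split_by = " ", bin_sep = ".") {
  # Split each entry into its choices; an NA entry stays NA.
  choices <- strsplit(x, split_by, fixed = TRUE)
  # Distinct choices across the non-missing rows, sorted for a stable column order.
  lvls <- sort(unique(unlist(choices[!is.na(x)])))
  out <- lapply(lvls, function(lv) {
    # 1 if the row contains the choice, 0 otherwise.
    ind <- vapply(choices, function(ch) as.integer(lv %in% ch), integer(1))
    # Rows with a missing original value get NA in every indicator.
    ind[is.na(x)] <- NA_integer_
    ind
  })
  # Column names join the variable name and the choice with bin_sep.
  names(out) <- paste(var_name, lvls, sep = bin_sep)
  as.data.frame(out, check.names = FALSE)
}

x <- c("a b c", "a c", "d", NA)
res <- sketch_expand(x, "var1")
res
```

The sketch yields the columns var1.a, var1.b, var1.c and var1.d, with NA in every indicator for the missing row.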
Usage
expand_bin(
  df,
  vars,
  split_by = " ",
  bin_sep = ".",
  drop_undefined = NULL,
  value_in = NULL,
  value_in_suffix = NULL,
  remove_new_bin = TRUE,
  remove_other_bin = TRUE
)
Arguments
- df
The input dataframe.
- vars
A character vector with the names of the variables to expand.
- split_by
The separator used to split the variable into choices (default: " ").
- bin_sep
The separator used to separate the original variable name and the choice name in the binary columns (default: ".").
- drop_undefined
A character vector of values to treat as undefined. Defaults to NULL (no values).
- value_in
A character vector of values to consider as value_in. Defaults to NULL (no values).
- value_in_suffix
A character scalar or an empty string to append to the variable names. Defaults to NULL.
- remove_new_bin
A logical scalar indicating whether to remove the new binary columns if they already exist in the dataframe. Defaults to TRUE.
- remove_other_bin
A logical scalar indicating whether to remove other binary columns starting with the variable name and the bin_sep. Defaults to TRUE.
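To make the naming rule behind bin_sep and remove_other_bin concrete, here is a minimal base R sketch of which existing column names "start with the variable name and the bin_sep". The column names are made up for illustration, and how the package itself performs the match is an assumption; the sketch uses `startsWith()` so that a separator such as "." is compared literally rather than as a regular-expression wildcard.

```r
var_name <- "var1"
bin_sep <- "."

# Made-up column names for illustration only.
existing <- c("var1", "var1.a", "var1.old", "var10.a", "var2.a")

# Prefix shared by every binary column derived from var1.
prefix <- paste0(var_name, bin_sep)

# Literal prefix match: "var10.a" is not matched because it does not start with "var1.".
matched <- existing[startsWith(existing, prefix)]
matched
```

Under these assumptions, only var1.a and var1.old are matched; the original var1 column and the columns of other variables are left alone.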
Value
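The modified dataframe, with one binary column added per distinct choice found in each expanded variable. As the example below shows, a data.frame input is converted to a data.table (with a warning), and rows where the original variable is NA get NA in every binary column.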
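As a rough check on how many binary columns to expect, the distinct choices per variable can be counted in base R before calling the function. This is a sketch that assumes the default split_by = " " and no drop_undefined or value_in handling; it does not call the package.

```r
df <- data.frame(
  var1 = c("a b c", "a c", "d", NA),
  var2 = c("a b c", "a c", "c a", NA),
  stringsAsFactors = FALSE
)

# Count the distinct choices in each column, ignoring missing values.
n_choices <- vapply(df, function(x) {
  length(unique(unlist(strsplit(x[!is.na(x)], " ", fixed = TRUE))))
}, integer(1))
n_choices
```

Here var1 has four distinct choices and var2 has three, which matches the number of binary columns created for each variable in the example.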
Examples
df <- data.frame(var1 = c("a b c", "a c", "d", NA), var2 = c("a b c", "a c", "c a", NA))
df <- expand_bin(df, c("var1", "var2"))
#> Warning: Converting df to data.table.
df
#>      var1 var1.a var1.b var1.c var1.d   var2 var2.a var2.b var2.c
#>    <char>  <int>  <int>  <int>  <int> <char>  <int>  <int>  <int>
#> 1:  a b c      1      1      1      0  a b c      1      1      1
#> 2:    a c      1      0      1      0    a c      1      0      1
#> 3:      d      0      0      0      1    c a      1      0      1
#> 4:   <NA>     NA     NA     NA     NA   <NA>     NA     NA     NA