
Fast data.frame comparisons at the cell level

Usage

ifelsedata(x, y, arg = NULL, matchCols = FALSE)

Arguments

x

A `data.frame` or `matrix`

y

A `data.frame` or `matrix` of the same dimensions as `x`, a vector whose length matches the number of rows of `x`, or a vector of length 1. If `y` is a `data.frame` or `matrix` with dimensions different from those of `x`, the larger of the two is trimmed to match the dimensions of the smaller.

arg

(Optional) A logical test expression involving `x` and `y`, for example `"x >= y"`. If `arg` is omitted, all values of `y` are assumed to be logical.

matchCols

(Optional) A boolean that determines how columns are matched: by name if `TRUE`, by position if `FALSE`. Columns are returned in the order in which they appear in `x`. Columns not present in `x` are not returned. Defaults to `FALSE`.
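The name-based matching described for `matchCols` can be pictured with a small base R sketch. This is an illustration under assumptions, not the package's implementation: the toy data frames and the helper names `shared`, `x_m`, and `y_m` are hypothetical.

```r
# Hypothetical toy inputs; only X5 and X6 exist in both data frames.
x <- data.frame(X5 = 1:3, X6 = 4:6, X11 = 7:9)
y <- data.frame(X6 = c(5L, 5L, 5L), X4 = 0L, X5 = c(2L, 2L, 2L))

# Keep the names of x that also occur in y, preserving the order of x.
shared <- names(x)[names(x) %in% names(y)]

# Subset both inputs to the shared columns so they line up by name.
x_m <- x[, shared, drop = FALSE]
y_m <- y[, shared, drop = FALSE]

print(shared)
```

Because the shared names are taken from `names(x)`, the column order follows `x`, and columns that exist only in `y` (here `X4`) drop out.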

Value

Returns a `data.frame` trimmed to the smallest number of rows and the smallest number of columns found in `x` and `y`. Each cell holds the value from `x` where the test passes and `NA` where it fails.
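The return value described here can be approximated in base R as follows. This is a minimal sketch assuming position-based matching and a fixed test of `x >= y`; the variable names (`n_row`, `n_col`, `x_t`, `y_t`, `pass`, `out`) are hypothetical, and the function's actual internals may differ.

```r
# Hypothetical toy inputs of different sizes.
x <- data.frame(a = 1:4, b = 5:8, c = 9:12)
y <- data.frame(a = c(2L, 2L, 2L), b = c(6L, 6L, 6L))

# Trim both inputs to the smallest row and column counts.
n_row <- min(nrow(x), nrow(y))
n_col <- min(ncol(x), ncol(y))
x_t <- x[seq_len(n_row), seq_len(n_col), drop = FALSE]
y_t <- y[seq_len(n_row), seq_len(n_col), drop = FALSE]

# Evaluate the test cell by cell, then blank out the cells that fail it.
pass <- as.matrix(x_t) >= as.matrix(y_t)
out <- x_t
out[!pass] <- NA

print(out)
```

The result has three rows and two columns, with the values of `x` kept wherever `x >= y` holds and `NA` elsewhere.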

Examples

# create dummy data
x <- data.frame(matrix(data = sample(1:10, 100, TRUE), nrow = 10, ncol = 10))
y <- data.frame(matrix(data = sample(1:10, 100, TRUE), nrow = 10, ncol = 10))

# keep cells of x where x >= y or x equals y - 2
ifelsedata(x, y, "x >= y | x == y - 2")
#>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
#> 1   4 NA  7  1 NA  6 NA  7  4  NA
#> 2   8 NA  6  8 NA NA  2  6  4   5
#> 3  NA 10 NA  1 NA NA  4  8  9  NA
#> 4  10  8  9 NA  7  8  9 NA NA   9
#> 5  NA 10 NA  2 10 10  6 NA 10  NA
#> 6   8  6 NA NA  9  9  7 NA NA  NA
#> 7  NA NA  8 10 NA  9 10 NA  7  NA
#> 8   5  8  2 NA NA  2 NA NA  7  NA
#> 9   2 10 NA  2  9 NA 10  5  2   6
#> 10 NA  4 NA  8 NA NA  4 NA  1   8

# rename x columns
colnames(x) <- paste0("X", 5:14)
# match with column names
ifelsedata(x, y, "x >= y | x == y - 2", TRUE)
#>    X5 X6 X7 X8 X9 X10
#> 1  NA  6  7 NA  4  NA
#> 2   8 NA  6  8  7   1
#> 3   4 10 NA  1  1  NA
#> 4  10  8  9 NA  7   8
#> 5  NA 10  6 NA 10  10
#> 6   8 NA NA NA  9   9
#> 7  NA NA  8 10  5   9
#> 8  NA  8 NA NA NA  NA
#> 9  NA 10  4 NA  9   1
#> 10 NA NA  6  8  5   4

# match based on booleans in y
y <- data.frame(matrix(data = sample(c(TRUE, FALSE), 100, TRUE),
                       nrow = 10,
                       ncol = 10))
# test based on TRUE/FALSE in y
ifelsedata(x, y)
#>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
#> 1  NA NA  7  1 NA  6 NA  7 NA   4
#> 2  NA  4  6  8 NA  1  2 NA  4  NA
#> 3  NA 10 NA  1  1  4 NA  8  9  NA
#> 4  NA  8  9  1  7  8 NA  1 NA  NA
#> 5   3 10  6  2 10 NA NA NA 10  NA
#> 6   8  6  1  2  9 NA NA  6  3   5
#> 7   9 NA  8 NA  5 NA NA NA NA  NA
#> 8   5 NA  2 NA NA  2  4  3 NA   2
#> 9   2 10  4 NA NA NA NA NA  2   6
#> 10  1 NA NA NA NA  4  4  6  1  NA
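The examples above do not cover a vector `y`. The base R sketch below shows one plausible reading of that case, in which a vector with one element per row of `x` is compared against every cell in the matching row. It is an assumption-laden illustration rather than the function's implementation; `y_vec`, `pass`, and `res` are hypothetical names, and the test `x >= y` is chosen only for the example.

```r
# Hypothetical toy data: three rows, two columns.
x <- data.frame(a = c(1, 5, 9), b = c(4, 2, 8))
y_vec <- c(2, 5, 9)  # one value per row of x

# A matrix is stored column by column, so a vector of length nrow(x)
# is recycled down each column: cell [i, j] is compared with y_vec[i].
pass <- as.matrix(x) >= y_vec

# Keep the cells of x that pass and set the rest to NA.
res <- x
res[!pass] <- NA

print(res)
```

A length 1 `y` behaves the same way under this reading, since a single value is recycled across every cell.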