The ops module provides five infix operators that cover four common tasks:
| Task | Operators |
|---|---|
| String concatenation | %p% |
| Set membership | %nin% |
| Case-insensitive matching | %match%, %map% |
| Strict equality | %is% |

Note: All code examples in this vignette are static (eval = FALSE). Output is hand-written to reflect the current implementation. If you modify the operators, re-verify the examples manually or switch the chunks to eval = TRUE.
The character-only operators (%p%, %match%, %map%) validate their arguments
and raise an informative error for non-character input, NA values, or
empty vectors. %nin% and %is% are unrestricted:
they mirror base R behaviour for any type.

%p%: Paste with a space

Concatenates two character vectors element-wise with a single space.
Equivalent to paste(lhs, rhs, sep = " ") but reads more
naturally in pipelines.
"Hello" %p% "world"
#> [1] "Hello world"
c("good", "hello") %p% c("morning", "world")
#> [1] "good morning" "hello world"

A length-1 operand is recycled over the longer vector in the usual R fashion:
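As a minimal sketch of recycling, the snippet below uses a local stand-in for %p% built on paste(), following the equivalence stated above. The stand-in is an assumption made so the code runs without the package, and it omits the packaged operator's input validation.

```r
# Stand-in for the packaged operator (assumption: plain paste(), no validation)
`%p%` <- function(lhs, rhs) paste(lhs, rhs, sep = " ")

# The length-1 left operand is recycled across the longer right operand
labels <- "Gene:" %p% c("TP53", "EGFR", "PTEN")
labels
#> [1] "Gene: TP53" "Gene: EGFR" "Gene: PTEN"
```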
Empty strings are valid — the space is always inserted:
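A short illustration of the empty-string case, again with a hypothetical paste()-based stand-in rather than the packaged operator. Because the separator is added unconditionally, the result keeps a leading or trailing space.

```r
# Stand-in for the packaged operator (assumption: plain paste(), no validation)
`%p%` <- function(lhs, rhs) paste(lhs, rhs, sep = " ")

left_empty  <- "" %p% "world"
right_empty <- "Hello" %p% ""

left_empty
#> [1] " world"
right_empty
#> [1] "Hello "
```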
NA values and non-character inputs are rejected:
"Hello" %p% NA
#> Error in `%p%()`:
#> ! `rhs` must be a non-empty character vector without NA values.
123 %p% "world"
#> Error in `%p%()`:
#> ! `lhs` must be a non-empty character vector without NA values.

%nin%: Not-in operator

Returns TRUE for every element of x that is
not present in table. A concise
alternative to !(x %in% table).
c("A", "B", "C") %nin% c("B", "D")
#> [1] TRUE FALSE TRUE
1:5 %nin% c(2, 4)
#> [1] TRUE FALSE TRUE FALSE TRUE

%nin% is the exact negation of %in%: it accepts any
type and follows base R semantics for NA and type
coercion:
# NA matches NA in the table
NA %nin% c(NA, 1)
#> [1] FALSE
# NA does not match non-NA elements
NA %nin% c(1, 2)
#> [1] TRUE

R coerces types before comparing, so character strings can match numeric values by their printed representation:
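The coercion rule can be sketched with a local stand-in for %nin%, defined here as the negation of %in% (the equivalence given above). The stand-in is an assumption so the snippet runs without the package.

```r
# Stand-in for the packaged operator (assumption: simple negation of %in%)
`%nin%` <- function(x, table) !(x %in% table)

# The numeric table is coerced to "1" and "2" before matching
plain_match <- "1" %nin% c(1, 2)
plain_match
#> [1] FALSE

# "1.0" is not the printed form of 1, so it counts as absent
padded_match <- "1.0" %nin% c(1, 2)
padded_match
#> [1] TRUE
```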
Empty vectors are handled without error: an empty table yields TRUE for every element of x, and an empty x yields a zero-length result:
c("a", "b") %nin% character(0) # nothing to be in
#> [1] TRUE TRUE
character(0) %nin% c("a", "b")
#> logical(0)

%match% and %map% convert both sides to lower case
before comparing, so "tp53" and "TP53"
are treated as the same string. Both require non-NA,
non-empty character vectors on both sides.
%match%: Return match indices

Like base::match(), but case-insensitive. Returns an
integer vector of positions; unmatched elements become
NA.
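A minimal sketch of the behaviour described above, using a hypothetical stand-in that lower-cases both sides and delegates to base::match(). The packaged operator also validates its input, which this stand-in does not.

```r
# Stand-in for the packaged operator (assumption: tolower() + match(), no validation)
`%match%` <- function(x, table) match(tolower(x), tolower(table))

positions <- c("tp53", "egfr", "akt1") %match% c("TP53", "BRCA1", "EGFR")
positions
#> [1]  1  3 NA
```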
When table contains duplicates the index of the
first match is returned, matching base R behaviour:
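For instance, with the same kind of hypothetical stand-in (lower-casing plus base::match(), an assumption rather than the package source), a table holding two spellings of one symbol resolves to the first position:

```r
# Stand-in for the packaged operator (assumption: tolower() + match(), no validation)
`%match%` <- function(x, table) match(tolower(x), tolower(table))

# "TP53" and "tp53" are equal after lower-casing; position 1 wins
first_hit <- "tp53" %match% c("TP53", "tp53", "EGFR")
first_hit
#> [1] 1
```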
Duplicate elements in x are each matched
independently:
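A short sketch of repeated queries, again assuming a stand-in built from tolower() and base::match() in place of the packaged operator:

```r
# Stand-in for the packaged operator (assumption: tolower() + match(), no validation)
`%match%` <- function(x, table) match(tolower(x), tolower(table))

# Each copy of the query gets its own result, here the same index twice
repeated <- c("tp53", "TP53", "brca1") %match% c("TP53", "BRCA1")
repeated
#> [1] 1 1 2
```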
Non-character inputs, NA values, and empty vectors are
rejected on both sides:
# empty x
character(0) %match% c("TP53")
#> Error in `%match%()`:
#> ! `x` must be a non-empty character vector without NA values.
# NA in x
c("tp53", NA) %match% c("TP53")
#> Error in `%match%()`:
#> ! `x` must be a non-empty character vector without NA values.
# empty table
c("tp53") %match% character(0)
#> Error in `%match%()`:
#> ! `table` must be a non-empty character vector without NA values.
# NA in table
c("tp53") %match% c("TP53", NA)
#> Error in `%match%()`:
#> ! `table` must be a non-empty character vector without NA values.

%map%: Return a named character vector

Like %match%, but returns a named character
vector instead of indices. Names are the canonical entries from
table; values are the original elements from
x. Unmatched entries are silently dropped. Output order
follows x.
Output order follows x, not table:
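The ordering rule can be sketched with a hypothetical stand-in for %map%, written from the description above (names taken from table, values from x, unmatched entries dropped). It is an assumption for illustration and skips the packaged operator's validation.

```r
# Stand-in for the packaged operator (assumption: built on match(), no validation)
`%map%` <- function(x, table) {
  idx  <- match(tolower(x), tolower(table))
  keep <- !is.na(idx)
  stats::setNames(x[keep], table[idx[keep]])
}

mapped <- c("egfr", "tp53") %map% c("TP53", "BRCA1", "EGFR")

# Names are canonical entries, listed in the order of the queries
names(mapped)
#> [1] "EGFR" "TP53"
unname(mapped)
#> [1] "egfr" "tp53"
```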
Unmatched elements are dropped rather than returned as
NA:
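As an illustration, assuming the same kind of match()-based stand-in in place of the packaged operator, a query with no counterpart in table simply disappears from the result:

```r
# Stand-in for the packaged operator (assumption: built on match(), no validation)
`%map%` <- function(x, table) {
  idx  <- match(tolower(x), tolower(table))
  keep <- !is.na(idx)
  stats::setNames(x[keep], table[idx[keep]])
}

kept <- c("tp53", "akt1", "egfr") %map% c("TP53", "BRCA1", "EGFR")

# "akt1" has no match, so only two entries remain
length(kept)
#> [1] 2
names(kept)
#> [1] "TP53" "EGFR"
```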
When nothing matches, an empty named character vector is returned:
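A sketch of the no-match case with a hypothetical stand-in. How the packaged operator prints its empty result is not shown here; the snippet only checks the type and length.

```r
# Stand-in for the packaged operator (assumption: built on match(), no validation)
`%map%` <- function(x, table) {
  idx  <- match(tolower(x), tolower(table))
  keep <- !is.na(idx)
  stats::setNames(x[keep], table[idx[keep]])
}

none <- c("akt1", "unknown") %map% c("TP53", "BRCA1")

is.character(none)
#> [1] TRUE
length(none)
#> [1] 0
```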
Duplicate elements in x that match are both
retained:
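For example, under the same assumed stand-in, two spellings of one symbol both survive and share a canonical name:

```r
# Stand-in for the packaged operator (assumption: built on match(), no validation)
`%map%` <- function(x, table) {
  idx  <- match(tolower(x), tolower(table))
  keep <- !is.na(idx)
  stats::setNames(x[keep], table[idx[keep]])
}

dups <- c("tp53", "TP53") %map% c("TP53", "BRCA1")

names(dups)
#> [1] "TP53" "TP53"
unname(dups)
#> [1] "tp53" "TP53"
```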
The same error rules as %match% apply on both sides:
# empty x
character(0) %map% c("TP53")
#> Error in `%map%()`:
#> ! `x` must be a non-empty character vector without NA values.
# NA in x
c("tp53", NA) %map% c("TP53")
#> Error in `%map%()`:
#> ! `x` must be a non-empty character vector without NA values.
# empty table
c("tp53") %map% character(0)
#> Error in `%map%()`:
#> ! `table` must be a non-empty character vector without NA values.
# NA in table
c("tp53") %map% c("TP53", NA)
#> Error in `%map%()`:
#> ! `table` must be a non-empty character vector without NA values.

%is%: Identical comparison

Wraps base::identical(). Returns a single
TRUE or FALSE with no tolerance for type or
attribute differences.
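A minimal sketch of the single-value result, using a local stand-in that calls base::identical() directly (the wrapping described above; the stand-in itself is an assumption so the code runs without the package). The element-wise == is shown for contrast.

```r
# Stand-in for the packaged operator (assumption: thin wrapper over identical())
`%is%` <- function(x, y) identical(x, y)

same <- c(1, 2, 3) %is% c(1, 2, 3)
same
#> [1] TRUE

# == compares element by element; %is% collapses to one answer
elementwise <- c(1, 2, 3) == c(1, 2, 4)
elementwise
#> [1]  TRUE  TRUE FALSE

different <- c(1, 2, 3) %is% c(1, 2, 4)
different
#> [1] FALSE
```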
Unlike ==, %is% distinguishes types, names,
and storage mode:
1:3 %is% c(1, 2, 3) # integer vs double
#> [1] FALSE
c(a = 1, b = 2) %is% c(b = 1, a = 2) # same values, different names
#> [1] FALSE

NULL and NA are compared like any other object, so NA variants of
different types are not identical:
NULL %is% NULL
#> [1] TRUE
NA %is% NA
#> [1] TRUE
NA %is% NA_real_ # logical NA vs double NA
#> [1] FALSE

%is% accepts any type; there is no input
restriction.
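The operators compose naturally in bioinformatics pipelines. The example below finds the query symbols that are missing from a canonical gene list, maps the matched queries to their canonical names, builds an annotation label for each canonical gene, and confirms that the canonical list is unchanged.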
library(evanverse)
canonical <- c("TP53", "BRCA1", "EGFR", "MYC", "PTEN")
query <- c("tp53", "brca1", "AKT1", "egfr", "unknown")
# 1. Which queries are not in the canonical set (case-insensitive)?
missing_idx <- which(is.na(query %match% canonical))
query[missing_idx]
#> [1] "AKT1" "unknown"
# 2. Map matched queries to their canonical names
query %map% canonical
#> TP53 BRCA1 EGFR
#> "tp53" "brca1" "egfr"
# 3. Build an annotation column using %p%
anno <- "Gene:" %p% canonical
anno
#> [1] "Gene: TP53" "Gene: BRCA1" "Gene: EGFR" "Gene: MYC" "Gene: PTEN"
# 4. Check that the canonical list hasn't changed
canonical %is% c("TP53", "BRCA1", "EGFR", "MYC", "PTEN")
#> [1] TRUE

Help pages: ?"%p%", ?"%nin%", ?"%match%", ?"%map%", ?"%is%"