sqlq helps writing SQL queries, using factory functions or an explicit syntax tree. With sqlq, identifiers and strings are safely quoted when needed. Complex conditional expressions can be created for WHERE clauses, and they can be built dynamically.
The original motivation for this package was to help building complex WHERE clauses, especially when the conditions are not known in advance. Here is a set of conditional parameters that may be used to construct a WHERE clause:
author <- NULL
year_range <- list(min=2010, max=2020)
price_ranges <- list(list(min=5, max=9), list(min=20,max=29),
list(min=50,max=100))
In the following example the WHERE clause is built dynamically, not knowing the values of the different conditions in advance (the list price ranges may be empty or contain 1, 10 or more elements, the author name may be provided or not, etc.):
library(sqlq)
expr <- ExprCommOp$new("and")
if (! is.null(author))
expr$add(ExprBinOp$new(ExprField$new('author')), '=', ExprValue$new(author))
if (! is.null(year_range))
expr$add(make_between('year', year_range$min, year_range$max))
if (length(price_ranges) > 0) {
or <- ExprCommOp$new("or")
for (rng in price_ranges)
or$add(make_between('price', rng$min, rng$max))
expr$add(or)
}
where <- make_where(expr)
The final WHERE expression constructed this way is then used to build a SELECT query:
make_select_all("books", where = where)$toString()
#> [1] "SELECT * FROM books WHERE \"year\" BETWEEN 2010 AND 2020 AND (price BETWEEN 5 AND 9 OR price BETWEEN 20 AND 29 OR price BETWEEN 50 AND 100);"
Another motivation was to automatically generate the correct quoting for identifiers and strings, depending on the database management system (DBMS) used, but avoid quoting when unnecessary. Identifiers are only quoted when they contain special characters (spaces, punctuation, etc.) or when they are reserved keywords.
Using the factory function make_select_all()
, we can
create a simple select *
query on a table:
We may add the DISTINCT keyword to remove duplicates:
Using the make_select()
factory function, we choose the
fields we want to retrieve:
We may add the LIMIT
keyword by specifying the limit
inside the make_select*()
functions:
The make_select*()
functions accept also a
WHERE
clause:
Expression objects can be combined to form more complex expressions. In the following example we want to retrieve all books whose author’s name starts with an “A” and whose publication year is after 2015.
We first define the expressions to check author en year:
year_cond <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(2015))
author_cond <- ExprBinOp$new(
ExprField$new("author"), "like",
ExprValue$new("A%")
)
Then we build the WHERE statement:
Finally we construct the SELECT query:
Arbitrarily complex expressions can be built using the various
Expr*
classes and used inside a WHERE clause. In this
example we use a vector of patterns to build a condition that tests a
field on the different patterns.
From the following list of patterns:
And the author field:
We build the list of expressions that checks all the patterns:
expressions <- lapply(
patterns,
function(x) {
ExprBinOp$new(
author, "like",
ExprValue$new(x)
)
}
)
Then we link all these expressions with OR operators:
Finally, we build the SELECT query:
In this example, we make a join between tables books and authors.
First we define the fields we want to retrieve, from table books:
Then we define the join statement, that operates on
authors.id and books.author_id, using the
make_join()
function:
Finally we create the SELECT query:
make_select(tabl = "books", fields = fields, join = join)$toString()
#> [1] "SELECT \"title\", authors.name FROM books INNER JOIN authors ON authors.id = books.author_id;"
By default, the make_join()
function creates an
INNER JOIN
. If you want to use a different type of join,
you can specify it using the type
argument:
You can also join multiple tables by using the add()
method of the SelectQuery
class, which is returned by the
make_select_all()
or make_select()
functions:
To generate an INSERT INTO
query, use the
make_insert()
:
values <- list(list('John Smith', 'Memories', 1999),
list('Barbara', 'My Life', 2010))
make_insert(tabl = 'books', fields = c('author', 'title', 'year'),
values = values)$toString()
#> [1] "INSERT INTO books (author, \"title\", \"year\") VALUES ('John Smith', 'Memories', 1999), ('Barbara', 'My Life', 2010);"
To generate a CREATE TABLE
query, use the
make_create_table()
function.
We must first define the fields/columns of the table:
fields_def <- list(ExprFieldDef$new('id', 'integer', primary=TRUE),
ExprFieldDef$new('title', 'varchar(200)', nullable=FALSE),
ExprFieldDef$new('author', 'varchar(80)', nullable=FALSE))
Then we can call the make_create_table()
function:
To generate a DELETE FROM
query, use the
make_delete()
function.
We first define the WHERE clause:
Then we call the make_delete()
function:
To generate a UPDATE
query, use the
make_update()
function, along with the
make_set()
function.
We first define the WHERE clause:
Then we create the SET
statement using the
make_set()
function:
Finally we create the query using the make_update()
function:
Binary operators are used to compare two expressions, such as fields
or values. The ExprBinOp
class is used to create such
expressions. It accepts two operands and an operator as arguments. The
operands can be ExprField
, ExprValue
, or other
Expr
instances.
comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994))
comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe"))
where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "SELECT DISTINCT * FROM books WHERE (\"year\" > 1994) OR (author = 'John Doe');"
Commutative operators are used to combine multiple expressions with
the same operator, such as AND
or OR
. The
ExprCommOp
class is used to create such expressions. It
accepts an operator and a list of expressions as arguments.
or <- ExprCommOp$new("or", list(
ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994)),
ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe")),
ExprBinOp$new(ExprField$new("title"), "like", ExprValue$new("A%"))
))
where <- StmtWhere$new(or)
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "SELECT DISTINCT * FROM books WHERE (\"year\" > 1994) OR (author = 'John Doe') OR (\"title\" LIKE 'A%');"
Using the make_between()
function, we can create a
BETWEEN
expression:
make_between()
accepts both atomic values and
ExprValue
instances for low and high limits.
To test if a field is NULL use the ExprIsNull
class:
To test if a field is NULL use the ExprIsNotNull
class:
In order to generate correct SQL requests for a DBMS, it is
recommended to inform the sqlq package about the connector we
are using. This is done with the sqlq_conn
global
option.
First we create the database connector (here a connector to an In-Memory instance of SQLite DB):
Then we declare the connector to sqlq:
Finally we build the request:
fields <- c("The Title", "author")
where <- StmtWhere$new(ExprBinOp$new(ExprField$new("author"), "=",
ExprValue$new("John Doe")))
make_select("books", fields = fields, where = where)$toString()
#> [1] "SELECT `The Title`, author FROM books WHERE author = 'John Doe';"
We can see that now backticks (`) are used instead of regular quotes (“). This is indeed the official mean of quoting identifiers in SQLite. This is also the case for MariaDB.
By default keywords and alphabetical operators (OR, AND ,etc.) are written uppercase.
You can force lowercase by setting the global option
sqlq_uppercase
to FALSE
:
options(sqlq_uppercase = FALSE)
comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994))
comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe"))
where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "select distinct * from books where (`year` > 1994) or (author = 'John Doe');"
Unnecessary spaces may be removed by setting the
sqlq_spaces
global option to FALSE
.
Without setting the option, spaces are set around non-alphabetical
operators (here >
):
where <- StmtWhere$new(ExprBinOp$new(
ExprField$new("year"), ">",
ExprValue$new(1994)
))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "select distinct * from books where `year` > 1994;"
When setting the option to FALSE
, such space characters
are removed: