Introduction to sqlq package

sqlq helps writing SQL queries, using factory functions or an explicit syntax tree. With sqlq, identifiers and strings are safely quoted when needed. Complex conditional expressions can be created for WHERE clauses, and they can be built dynamically.

The original motivation for this package was to help building complex WHERE clauses, especially when the conditions are not known in advance. Here is a set of conditional parameters that may be used to construct a WHERE clause:

author <- NULL
year_range <- list(min=2010, max=2020)
price_ranges <- list(list(min=5, max=9), list(min=20,max=29),
                     list(min=50,max=100))

In the following example the WHERE clause is built dynamically, not knowing the values of the different conditions in advance (the list price ranges may be empty or contain 1, 10 or more elements, the author name may be provided or not, etc.):

library(sqlq)
expr <- ExprCommOp$new("and")
if (! is.null(author))
  expr$add(ExprBinOp$new(ExprField$new('author')), '=', ExprValue$new(author))
if (! is.null(year_range))
  expr$add(make_between('year', year_range$min, year_range$max))
if (length(price_ranges) > 0) {
  or <- ExprCommOp$new("or")
  for (rng in price_ranges)
    or$add(make_between('price', rng$min, rng$max))
  expr$add(or)
}
where <- make_where(expr)

The final WHERE expression constructed this way is then used to build a SELECT query:

make_select_all("books", where = where)$toString()
#> [1] "SELECT * FROM books WHERE \"year\" BETWEEN 2010 AND 2020 AND (price BETWEEN 5 AND 9 OR price BETWEEN 20 AND 29 OR price BETWEEN 50 AND 100);"

Another motivation was to automatically generate the correct quoting for identifiers and strings, depending on the database management system (DBMS) used, but avoid quoting when unnecessary. Identifiers are only quoted when they contain special characters (spaces, punctuation, etc.) or when they are reserved keywords.

SELECT query

Using the factory function make_select_all(), we can create a simple select * query on a table:

make_select_all("books")$toString()
#> [1] "SELECT * FROM books;"

We may add the DISTINCT keyword to remove duplicates:

make_select_all("books", distinct = TRUE)$toString()
#> [1] "SELECT DISTINCT * FROM books;"

Selecting specific fields

Using the make_select() factory function, we choose the fields we want to retrieve:

make_select("books", fields = c("title", "author"))$toString()
#> [1] "SELECT \"title\", author FROM books;"

LIMIT keyword

We may add the LIMIT keyword by specifying the limit inside the make_select*() functions:

make_select_all("books", limit = 16)$toString()
#> [1] "SELECT * FROM books LIMIT 16;"

WHERE clause

The make_select*() functions accept also a WHERE clause:

where <- StmtWhere$new(ExprBinOp$new(
  ExprField$new("year"), ">",
  ExprValue$new(2015)
))
make_select_all("books", where = where)$toString()
#> [1] "SELECT * FROM books WHERE \"year\" > 2015;"

Complex where clause

Expression objects can be combined to form more complex expressions. In the following example we want to retrieve all books whose author’s name starts with an “A” and whose publication year is after 2015.

We first define the expressions to check author en year:

year_cond <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(2015))
author_cond <- ExprBinOp$new(
  ExprField$new("author"), "like",
  ExprValue$new("A%")
)

Then we build the WHERE statement:

where <- StmtWhere$new(ExprBinOp$new(year_cond, "and", author_cond))

Finally we construct the SELECT query:

make_select_all("books", where = where)$toString()
#> [1] "SELECT * FROM books WHERE (\"year\" > 2015) AND (author LIKE 'A%');"

Dynamic where clause

Arbitrarily complex expressions can be built using the various Expr* classes and used inside a WHERE clause. In this example we use a vector of patterns to build a condition that tests a field on the different patterns.

From the following list of patterns:

patterns <- c("A%", "D%", "Z%")

And the author field:

author <- ExprField$new("author")

We build the list of expressions that checks all the patterns:

expressions <- lapply(
  patterns,
  function(x) {
    ExprBinOp$new(
      author, "like",
      ExprValue$new(x)
    )
  }
)

Then we link all these expressions with OR operators:

cond <- ExprCommOp$new("or", expressions)

Finally, we build the SELECT query:

make_select_all("books", where = StmtWhere$new(cond))$toString()
#> [1] "SELECT * FROM books WHERE (author LIKE 'A%') OR (author LIKE 'D%') OR (author LIKE 'Z%');"

JOIN statement

In this example, we make a join between tables books and authors.

First we define the fields we want to retrieve, from table books:

fields <- list(ExprField$new("title"), ExprField$new("name", "authors"))

Then we define the join statement, that operates on authors.id and books.author_id, using the make_join() function:

join <- make_join("id", "authors", "author_id", "books")

Finally we create the SELECT query:

make_select(tabl = "books", fields = fields, join = join)$toString()
#> [1] "SELECT \"title\", authors.name FROM books INNER JOIN authors ON authors.id = books.author_id;"

Setting the type of JOIN

By default, the make_join() function creates an INNER JOIN. If you want to use a different type of join, you can specify it using the type argument:

make_join("id", "authors", "author_id", "books", type = "LEFT")$toString()
#> [1] "LEFT OUTER JOIN authors ON authors.id = books.author_id"

Join on multiple tables

You can also join multiple tables by using the add() method of the SelectQuery class, which is returned by the make_select_all() or make_select() functions:

x <- make_select_all(tabl = "books")
x$add(make_join("book_id", "bookcat", "id", "books"))
x$add(make_join("id", "categories", "cat_id", "bookcat"))
x$toString()
#> [1] "SELECT * FROM books INNER JOIN bookcat ON bookcat.book_id = books.id INNER JOIN categories ON categories.id = bookcat.cat_id;"

INSERT INTO query

To generate an INSERT INTO query, use the make_insert():

values <- list(list('John Smith', 'Memories', 1999),
               list('Barbara', 'My Life', 2010))
make_insert(tabl = 'books', fields = c('author', 'title', 'year'),
            values = values)$toString()
#> [1] "INSERT INTO books (author, \"title\", \"year\") VALUES ('John Smith', 'Memories', 1999), ('Barbara', 'My Life', 2010);"

CREATE TABLE query

To generate a CREATE TABLE query, use the make_create_table() function.

We must first define the fields/columns of the table:

fields_def <- list(ExprFieldDef$new('id', 'integer', primary=TRUE),
                   ExprFieldDef$new('title', 'varchar(200)', nullable=FALSE),
                   ExprFieldDef$new('author', 'varchar(80)', nullable=FALSE))

Then we can call the make_create_table() function:

make_create_table(tabl = 'books', fields_def = fields_def)$toString()
#> [1] "CREATE TABLE books (id INTEGER PRIMARY KEY, \"title\" VARCHAR(200) NOT NULL, author VARCHAR(80) NOT NULL);"

DELETE FROM query

To generate a DELETE FROM query, use the make_delete() function.

We first define the WHERE clause:

where <- StmtWhere$new(ExprBinOp$new(
  ExprField$new("year"), "<",
  ExprValue$new(2015)
))

Then we call the make_delete() function:

make_delete(tabl = "books", where = where)$toString()
#> [1] "DELETE FROM books WHERE \"year\" < 2015;"

UPDATE query

To generate a UPDATE query, use the make_update() function, along with the make_set() function.

We first define the WHERE clause:

where <- StmtWhere$new(ExprBinOp$new(
  ExprField$new("year"), "<",
  ExprValue$new(2010)
))

Then we create the SET statement using the make_set() function:

set <- make_set(price = 9.50, old = TRUE)

Finally we create the query using the make_update() function:

make_update('books', set = set, where = where)$toString()
#> [1] "UPDATE books SET price = 9.5, \"old\" = TRUE WHERE \"year\" < 2010;"

Noticeable expressions

Binary operators

Binary operators are used to compare two expressions, such as fields or values. The ExprBinOp class is used to create such expressions. It accepts two operands and an operator as arguments. The operands can be ExprField, ExprValue, or other Expr instances.

comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994))
comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe"))
where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "SELECT DISTINCT * FROM books WHERE (\"year\" > 1994) OR (author = 'John Doe');"

Commutative operators

Commutative operators are used to combine multiple expressions with the same operator, such as AND or OR. The ExprCommOp class is used to create such expressions. It accepts an operator and a list of expressions as arguments.

or <- ExprCommOp$new("or", list(
  ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994)),
  ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe")),
  ExprBinOp$new(ExprField$new("title"), "like", ExprValue$new("A%"))
))
where <- StmtWhere$new(or)
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "SELECT DISTINCT * FROM books WHERE (\"year\" > 1994) OR (author = 'John Doe') OR (\"title\" LIKE 'A%');"

BETWEEN

Using the make_between() function, we can create a BETWEEN expression:

make_between('i', 1, 10)$toString()
#> [1] "i BETWEEN 1 AND 10"

make_between() accepts both atomic values and ExprValue instances for low and high limits.

IS NULL and IS NOT NULL

To test if a field is NULL use the ExprIsNull class:

StmtWhere$new(ExprIsNull$new(ExprField$new("name")))$toString()
#> [1] "WHERE name IS NULL"

To test if a field is NULL use the ExprIsNotNull class:

StmtWhere$new(ExprIsNotNull$new(ExprField$new("name")))$toString()
#> [1] "WHERE name IS NOT NULL"

Style options

Using proper quoting for a specific database

In order to generate correct SQL requests for a DBMS, it is recommended to inform the sqlq package about the connector we are using. This is done with the sqlq_conn global option.

First we create the database connector (here a connector to an In-Memory instance of SQLite DB):

mydb <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

Then we declare the connector to sqlq:

options(sqlq_conn = mydb)

Finally we build the request:

fields <- c("The Title", "author")
where <- StmtWhere$new(ExprBinOp$new(ExprField$new("author"), "=",
                                     ExprValue$new("John Doe")))
make_select("books", fields = fields, where = where)$toString()
#> [1] "SELECT `The Title`, author FROM books WHERE author = 'John Doe';"

We can see that now backticks (`) are used instead of regular quotes (“). This is indeed the official mean of quoting identifiers in SQLite. This is also the case for MariaDB.

Keywords uppercase/lowercase

By default keywords and alphabetical operators (OR, AND ,etc.) are written uppercase.

You can force lowercase by setting the global option sqlq_uppercase to FALSE:

options(sqlq_uppercase = FALSE)
comp1 <- ExprBinOp$new(ExprField$new("year"), ">", ExprValue$new(1994))
comp2 <- ExprBinOp$new(ExprField$new("author"), "=", ExprValue$new("John Doe"))
where <- StmtWhere$new(ExprBinOp$new(comp1, "or", comp2))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "select distinct * from books where (`year` > 1994) or (author = 'John Doe');"

Spaces

Unnecessary spaces may be removed by setting the sqlq_spaces global option to FALSE.

Without setting the option, spaces are set around non-alphabetical operators (here >):

where <- StmtWhere$new(ExprBinOp$new(
  ExprField$new("year"), ">",
  ExprValue$new(1994)
))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "select distinct * from books where `year` > 1994;"

When setting the option to FALSE, such space characters are removed:

options(sqlq_spaces = FALSE)
where <- StmtWhere$new(ExprBinOp$new(
  ExprField$new("year"), ">",
  ExprValue$new(1994)
))
make_select_all("books", distinct = TRUE, where = where)$toString()
#> [1] "select distinct * from books where `year`>1994;"