---
title: "maldipickr"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{maldipickr}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r setup}
library(maldipickr)
```
## Quickstart
The `{maldipickr}` package helps microbiologists reduce duplicate/clonal bacteria from their cultures and eventually exclude previously selected bacteria. `{maldipickr}` achieve this feat by grouping together data from MALDI Biotyper and helps choose representative bacteria from each group using user-relevant metadata -- a process known as **cherry-picking**.
`{maldipickr}` cherry-picks bacterial isolates with MALDI Biotyper:
* [using taxonomic identification report](#using-taxonomic-identification-report)
* [using spectra data](#using-spectra-data)
### Using taxonomic identification report
First make sure `{maldipickr}` is installed and loaded, alternatively [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).
Cherry-picking four isolates based on their taxonomic identification by the MALDI Biotyper is done in a few steps with `{maldipickr}`.
#### Get example data
We import an example Biotyper CSV report and glimpse at the table.
```{r quickstart_report_data, eval = TRUE}
report_tbl <- read_biotyper_report(
system.file("biotyper_unknown.csv", package = "maldipickr")
)
report_tbl %>%
dplyr::select(name, bruker_species, bruker_log) %>% knitr::kable()
```
#### Delineate clusters and cherry-pick
Delineate clusters from the identifications after filtering the reliable ones and cherry-pick one representative spectra.
Unreliable identifications based on the log-score are replaced by "not reliable identification", but stay tuned as they do not represent the same isolates!
```{r quickstart_report_filter, eval = TRUE}
report_tbl <- report_tbl %>%
dplyr::mutate(
bruker_species = dplyr::if_else(bruker_log >= 2, bruker_species,
"not reliable identification")
)
knitr::kable(report_tbl)
```
The chosen ones are indicated by `to_pick` column.
```{r quickstart_report_delineate, eval = TRUE}
report_tbl %>%
delineate_with_identification() %>%
pick_spectra(report_tbl, criteria_column = "bruker_log") %>%
dplyr::relocate(name, to_pick, bruker_species) %>%
knitr::kable()
```
### Using spectra data
In parallel to taxonomic identification reports, `{maldipickr}` process spectra data.
Make sure `{maldipickr}` is installed and loaded, alternatively [follow the instructions to install the package](https://clavellab.github.io/maldipickr/index.html#installation).
Cherry-picking six isolates from three species based on their spectra data obtained from the MALDI Biotyper is done in a few steps with `{maldipickr}`.
#### Get example data
We set up the directory location of our example spectra data, but adjust for your requirements. We import and process the spectra which gives us a named list of three objects: spectra, peaks and metadata (more details in Value section of `process_spectra()`).
```{r quickstart_spectra_data, eval = TRUE}
spectra_dir <- system.file("toy-species-spectra", package = "maldipickr")
processed <- spectra_dir %>%
import_biotyper_spectra() %>%
process_spectra()
```
#### Delineate clusters and cherry-pick
Delineate spectra clusters using Cosine similarity and cherry-pick one representative spectra.
The chosen ones are indicated by `to_pick` column.
```{r quickstart_spectra_delineate, eval = TRUE}
processed %>%
list() %>%
merge_processed_spectra() %>%
coop::tcosine() %>%
delineate_with_similarity(threshold = 0.92) %>%
set_reference_spectra(processed$metadata) %>%
pick_spectra() %>%
dplyr::relocate(name, to_pick) %>%
knitr::kable()
```
This provides only a brief overview of the features of `{maldipickr}`, browse the other vignettes to learn more about additional features.
## Session information
```{r session, eval = TRUE}
sessionInfo()
```