This vignette of package groupedHyperframe (CRAN, GitHub, RPubs) documents the creation of a groupedHyperframe object, the batch processes for a groupedHyperframe, and the aggregation of various statistics over a multi-level grouping structure.
Package groupedHyperframe may require the development versions of the spatstat family.
devtools::install_github('spatstat/spatstat')
devtools::install_github('spatstat/spatstat.data')
devtools::install_github('spatstat/spatstat.explore')
devtools::install_github('spatstat/spatstat.geom')
devtools::install_github('spatstat/spatstat.linnet')
devtools::install_github('spatstat/spatstat.model')
devtools::install_github('spatstat/spatstat.random')
devtools::install_github('spatstat/spatstat.sparse')
devtools::install_github('spatstat/spatstat.univar')
devtools::install_github('spatstat/spatstat.utils')
Examples in this vignette require that the search path has the following packages attached:
library(groupedHyperframe)
library(spatstat.data)
library(survival) # to help hyperframe understand Surv object
Users should remove the parameter mc.cores = 1L from all examples to engage all CPU cores on the current host under macOS. The authors of package groupedHyperframe are forced to have mc.cores = 1L in this vignette to pass CRAN's submission check.
Term / Abbreviation | Description | Reference |
---|---|---|
\|> | Forward pipe operator | ?base::pipeOp, introduced in R 4.1.0 |
attr | Attributes | base::attr; base::attributes |
CRAN, R | The Comprehensive R Archive Network | https://cran.r-project.org |
data.frame | Data frame | base::data.frame |
formula | Formula | stats::formula |
fv, fv.object, fv.plot | (Plot of) function value table | spatstat.explore::fv.object; spatstat.explore::plot.fv |
groupedData, ~ g1/.../gm | Grouped data frame; nested grouping structure | nlme::groupedData; nlme::lme |
hypercolumns, hyperframe | (Hyper columns of) hyper data frame | spatstat.geom::hyperframe |
inherits | Class inheritance | base::inherits |
kerndens | Kernel density | stats::density.default()$y |
mc.cores | Number of CPU cores to use | parallel::mclapply; parallel::detectCores |
multitype | Multitype object | spatstat.geom::is.multitype |
object.size | Memory allocation | utils::object.size |
pmean, pmedian | Parallel mean and median | groupedHyperframe::pmean; groupedHyperframe::pmedian |
pmax, pmin | Parallel maxima and minima | base::pmax; base::pmin |
ppp, ppp.object | (Marked) point pattern | spatstat.geom::ppp.object |
quantile | Quantile | stats::quantile |
save, xz | Save with xz compression | base::save(., compress = 'xz'); base::saveRDS(., compress = 'xz'); https://en.wikipedia.org/wiki/XZ_Utils |
S3, generic, methods | S3 object oriented system | base::UseMethod; utils::methods; utils::getS3method; https://adv-r.hadley.nz/s3.html |
search | Search path | base::search |
Surv | Survival object | survival::Surv |
trapz, cumtrapz | (Cumulative) trapezoidal integration | pracma::trapz; pracma::cumtrapz; https://en.wikipedia.org/wiki/Trapezoidal_rule |
This work is supported by NCI R01CA222847 (I. Chervoneva, T. Zhan, and H. Rui) and R01CA253977 (H. Rui and I. Chervoneva).
groupedHyperframe Class
The S3 class groupedHyperframe inherits from the hyperframe class, in a similar fashion as the groupedData class inherits from the data.frame class.
In addition to what a hyperframe object carries, a groupedHyperframe object has the attribute attr(., 'group'), a formula that specifies the (nested) grouping structure.
groupedHyperframe
hyperframe
The S3 method dispatch as.groupedHyperframe.hyperframe() converts a hyperframe to a groupedHyperframe. Data set spatstat.data::osteo has the serial number of sampling volume brick nested in the bone sample id.
osteo |> as.groupedHyperframe(group = ~ id/brick)
#> Grouped Hyperframe: ~id/brick
#>
#> 40 brick nested in
#> 4 id
#>
#> id shortid brick pts depth
#> 1 c77za4 4 1 (pp3) 45
#> 2 c77za4 4 2 (pp3) 60
#> 3 c77za4 4 3 (pp3) 55
#> 4 c77za4 4 4 (pp3) 60
#> 5 c77za4 4 5 (pp3) 85
#> 6 c77za4 4 6 (pp3) 90
#> 7 c77za4 4 7 (pp3) 95
#> 8 c77za4 4 8 (pp3) 65
#> 9 c77za4 4 9 (pp3) 100
#> 10 c77za4 4 10 (pp3) 100
data.frame
The S3 method dispatch as.groupedHyperframe.data.frame() converts a data.frame to a groupedHyperframe. This function inspects the input by the (nested) grouping structure, identifies the column(s) with elements not identical within the lowest group, and converts them into hypercolumns. The data set Ki67. in this package has column logKi67, whose elements are not identical within the nested grouping structure ~ patientID/tissueID.
(Ki67g = Ki67. |> as.groupedHyperframe(group = ~ patientID/tissueID, mc.cores = 1L))
#> Grouped Hyperframe: ~patientID/tissueID
#>
#> 6 tissueID nested in
#> 6 patientID
#>
#> logKi67 tissueID Tstage PFS recfreesurv_mon recurrence adj_rad adj_chemo
#> 1 (numeric) TJUe_I17 2 100+ 100 0 FALSE FALSE
#> 2 (numeric) TJUe_G17 1 22 22 1 FALSE FALSE
#> 3 (numeric) TJUe_F17 1 99+ 99 0 FALSE NA
#> 4 (numeric) TJUe_D17 1 99+ 99 0 FALSE TRUE
#> 5 (numeric) TJUe_J18 1 112 112 1 TRUE TRUE
#> 6 (numeric) TJUe_N17 4 12 12 1 TRUE FALSE
#> histology Her2 HR node race age patientID
#> 1 3 TRUE TRUE TRUE White 66 PT00037
#> 2 3 FALSE TRUE FALSE Black 42 PT00039
#> 3 3 FALSE TRUE FALSE White 60 PT00040
#> 4 3 FALSE TRUE TRUE White 53 PT00042
#> 5 3 FALSE TRUE TRUE White 52 PT00054
#> 6 2 TRUE TRUE TRUE Black 51 PT00059
Converting a data.frame with cell intensities, etc., into a groupedHyperframe reduces memory allocation, but does not much reduce the size of the saved file if xz compression is used.
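The memory claim can be explored with base R alone. The sketch below is an illustration under stated assumptions, not the package's conversion code: it simulates a hypothetical long table (names long, nested, patientID, age and intensity are made up), uses a plain list column as a stand-in for a hypercolumn, and compares both layouts with utils::object.size and with base::saveRDS(., compress = 'xz').

```r
# Hypothetical data: 50 patients, 2000 cell intensities each
set.seed(1)
n_patient <- 50L
n_cell <- 2000L
long <- data.frame(
  patientID = rep(sprintf('PT%03d', seq_len(n_patient)), each = n_cell),
  age = rep(sample(40:80, size = n_patient, replace = TRUE), each = n_cell),
  intensity = rnorm(n_patient * n_cell)
)

# Nested layout: one row per patient; the intensities sit in a list column
# (a base-R stand-in for a hypercolumn, not a real hyperframe)
nested <- unique(long[c('patientID', 'age')])
nested$intensity <- split(long$intensity, f = long$patientID)

# Memory allocation inside R
object.size(long)
object.size(nested)

# File size on disk with xz compression
f_long <- tempfile(fileext = '.rds')
f_nested <- tempfile(fileext = '.rds')
saveRDS(long, file = f_long, compress = 'xz')
saveRDS(nested, file = f_nested, compress = 'xz')
file.size(f_long)
file.size(f_nested)
```

The patient-level columns are stored once per patient in the nested layout instead of once per cell, which is where the in-memory saving comes from.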
groupedHyperframe with ppp-hypercolumn
Function grouped_ppp() creates a groupedHyperframe with one-and-only-one ppp-hypercolumn. In the following example, the argument formula specifies

- numeric mark hladr and multitype mark phenotype, on the left-hand-side
- OS, gender and age, before the | separator on the right-hand-side
- image_id nested in patient_id, after the | separator on the right-hand-side.

(s = grouped_ppp(formula = hladr + phenotype ~ OS + gender + age | patient_id/image_id,
  data = wrobel_lung, mc.cores = 1L))
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp.
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp)
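The three parts of such a formula can be taken apart with base R alone. The sketch below only illustrates how the left-hand side, the part before | and the part after | could be separated; it is not the parsing code used by grouped_ppp(), and the variable names marks, covariates and grouping are hypothetical.

```r
f <- hladr + phenotype ~ OS + gender + age | patient_id/image_id

lhs <- f[[2L]]  # hladr + phenotype
rhs <- f[[3L]]  # OS + gender + age | patient_id/image_id

# `|` binds more loosely than `+` and `/`, so it is the top-level call on the right-hand side
stopifnot(identical(rhs[[1L]], as.name('|')))

marks <- all.vars(lhs)             # variables on the left-hand side
covariates <- all.vars(rhs[[2L]])  # variables before the | separator
grouping <- all.vars(rhs[[3L]])    # variables after the | separator, highest level first

marks
covariates
grouping
```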
ppp-hypercolumn
In this section, we outline the batch processes of spatial point pattern analyses applicable to the one-and-only-one ppp-hypercolumn of a hyperframe.
These batch processes are not intended for a hyperframe with multiple ppp-hypercolumns in the foreseeable future, as that would require checking for name clashes in the $marks from multiple ppp-hypercolumns.
fv-hypercolumn
Batch Process | Workhorse in spatstat.explore | Applicable To | fv-hypercolumn Suffix |
---|---|---|---|
Emark_() | Emark() | numeric marks | .E |
Vmark_() | Vmark() | numeric marks | .V |
markcorr_() | markcorr() | numeric marks | .k |
markvario_() | markvario() | numeric marks | .gamma |
Gcross_() | Gcross() | multitype marks | .G |
Kcross_() | Kcross() | multitype marks | .K |
Jcross_() | Jcross() | multitype marks | .J |
numeric-hypercolumn
Batch Process | Workhorse in spatstat.geom | Applicable To | numeric-hypercolumn Suffix |
---|---|---|---|
nncross_() | nncross.ppp(., what = 'dist') | multitype marks | .nncross |
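For intuition about the kind of numbers a nearest-neighbour batch process stores, the following base-R sketch computes, for each point of one mark level, the distance to the nearest point of another mark level. It uses simulated coordinates, the hypothetical mark levels 'a' and 'b', and a brute-force distance matrix rather than spatstat.geom::nncross.ppp(); it is an illustration of the quantity, not of the package's implementation.

```r
set.seed(1)
n <- 60L
x <- runif(n)
y <- runif(n)
type <- sample(c('a', 'b'), size = n, replace = TRUE)  # hypothetical multitype mark

i <- (type == 'a')  # points to measure from
j <- (type == 'b')  # points to measure to
stopifnot(any(i), any(j))

# Pairwise Euclidean distances: one row per 'a' point, one column per 'b' point
d <- sqrt(outer(x[i], x[j], FUN = `-`)^2 + outer(y[i], y[j], FUN = `-`)^2)

# Nearest 'b' neighbour of each 'a' point
nn_dist <- apply(d, MARGIN = 1L, FUN = min)
summary(nn_dist)
```

One such numeric vector per point pattern is what makes the result a numeric-hypercolumn rather than an ordinary column.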
Multiple batch processes may be applied to a hyperframe (or groupedHyperframe) in a pipeline.
r = seq.int(from = 0, to = 250, by = 10)
out = s |>
Emark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
# Vmark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
# markcorr_(r = r, correction = 'best', mc.cores = 1L) |> # slow
# markvario_(r = r, correction = 'best', mc.cores = 1L) |> # slow
Gcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
# Kcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
nncross_(i = 'CK+.CD8-', j = 'CK-.CD8+', correction = 'best', mc.cores = 1L) # fast
#>
The returned hyperframe (or groupedHyperframe) has

- fv-hypercolumn hladr.E, created by function Emark_() on numeric mark hladr
- fv-hypercolumn phenotype.G, created by function Gcross_() on multitype mark phenotype
- numeric-hypercolumn phenotype.nncross, created by function nncross_() on multitype mark phenotype

out
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp. hladr.E phenotype.G
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp) (fv) (fv)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp) (fv) (fv)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp) (fv) (fv)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp) (fv) (fv)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp) (fv) (fv)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp) (fv) (fv)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp) (fv) (fv)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp) (fv) (fv)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp) (fv) (fv)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp) (fv) (fv)
#> phenotype.nncross
#> 1 (numeric)
#> 2 (numeric)
#> 3 (numeric)
#> 4 (numeric)
#> 5 (numeric)
#> 6 (numeric)
#> 7 (numeric)
#> 8 (numeric)
#> 9 (numeric)
#> 10 (numeric)
When a nested grouping structure ~g1/g2/.../gm is present, we may aggregate over the fv-hypercolumns, the numeric-hypercolumns, and the numeric marks in the ppp-hypercolumn by any one of the grouping levels ~g1, ~g2, …, or ~gm. If the lowest grouping ~gm is specified, then no aggregation is performed.
fv-hypercolumns
Function aggregate_fv() aggregates the function values of the fv-hypercolumns (as plotted by fv.plot) and their cumulative trapezoidal integration. In the following example, we have

- numeric-hypercolumns hladr.E.value and phenotype.G.value, aggregated function values from fv-hypercolumns hladr.E and phenotype.G
- numeric-hypercolumns hladr.E.cumtrapz and phenotype.G.cumtrapz, aggregated cumulative trapezoidal integration from fv-hypercolumns hladr.E and phenotype.G

(afv = out |>
aggregate_fv(by = ~ patient_id, f_aggr_ = pmean, mc.cores = 1L))
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id hladr.E.value hladr.E.cumtrapz
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
#> phenotype.G.value phenotype.G.cumtrapz
#> 1 (numeric) (numeric)
#> 2 (numeric) (numeric)
#> 3 (numeric) (numeric)
#> 4 (numeric) (numeric)
#> 5 (numeric) (numeric)
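The two kinds of aggregated columns can be mimicked in base R. The sketch below assumes that each function value table is reduced to a numeric vector on the common grid r, that the cumulative integration follows the usual trapezoidal rule, and that aggregation with pmean amounts to an element-wise mean across the curves of one group. The helpers cumtrapz_ and pmean_ and the two toy curves are hypothetical and are not part of the package.

```r
# Cumulative trapezoidal integration of y over x (hypothetical helper)
cumtrapz_ <- function(x, y) {
  n <- length(x)
  c(0, cumsum(diff(x) * (y[-n] + y[-1L]) / 2))
}

# Element-wise ("parallel") mean of equal-length numeric vectors (hypothetical helper)
pmean_ <- function(...) {
  dots <- list(...)
  Reduce(f = `+`, x = dots) / length(dots)
}

r <- seq.int(from = 0, to = 250, by = 10)
curve1 <- 2 * r  # toy function values for a first image
curve2 <- 4 * r  # toy function values for a second image of the same patient

# Aggregated function values
value <- pmean_(curve1, curve2)

# Aggregated cumulative trapezoidal integration
cumtrapz <- pmean_(cumtrapz_(r, curve1), cumtrapz_(r, curve2))
```

Because the toy curves are linear, the trapezoidal rule is exact here: value equals 3 * r and cumtrapz equals 1.5 * r^2.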
Each of the numeric-hypercolumns contains tabulated values on the common grid of r. One “slice” of this grid may be extracted.
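As an illustration of such a slice, the base-R sketch below picks the value at a single distance, r = 50, from every curve. The list vals and its two toy curves are hypothetical stand-ins for one aggregated numeric-hypercolumn; the accessor actually offered by the package may differ.

```r
r <- seq.int(from = 0, to = 250, by = 10)

# Hypothetical aggregated curves, one numeric vector per patient, all on the grid r
vals <- list(patient1 = sqrt(r), patient2 = r / 10)

# Position of the requested distance on the grid
id <- match(50, table = r)
stopifnot(!is.na(id))

# One number per patient: the curve value at r = 50
slice <- vapply(vals, FUN = function(v) v[id], FUN.VALUE = NA_real_)
slice
```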
numeric-hypercolumns and numeric mark(s) in ppp-hypercolumn
Function aggregate_quantile() aggregates the quantile of

- numeric-hypercolumns. In the following example, we have numeric-hypercolumn phenotype.nncross.quantile, aggregated quantile of numeric-hypercolumn phenotype.nncross
- numeric mark(s) in the ppp-hypercolumn. In the following example, we have numeric-hypercolumn hladr.quantile, aggregated quantile of numeric mark hladr in ppp-hypercolumn

out |>
aggregate_quantile(by = ~ patient_id, probs = seq.int(from = 0, to = 1, by = .1), mc.cores = 1L)
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id phenotype.nncross.quantile hladr.quantile
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
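A base-R sketch of one plausible reading of this aggregation follows. It assumes that quantiles are computed per image with stats::quantile and then averaged element-wise within a patient; whether aggregate_quantile() proceeds exactly this way is an assumption, and the two distance vectors are made up.

```r
probs <- seq.int(from = 0, to = 1, by = .1)

# Hypothetical nearest-neighbour distances from two images of one patient
image1 <- c(12, 30, 7, 55, 21)
image2 <- c(40, 18, 63, 25, 9, 33)

# Quantiles per image, on the same probabilities
q1 <- quantile(image1, probs = probs)
q2 <- quantile(image2, probs = probs)

# Element-wise mean of the two quantile vectors (assumed aggregation rule)
q_patient <- (q1 + q2) / 2
q_patient
```

Using the same probs for every image is what makes the quantile vectors comparable position by position.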
Function aggregate_kerndens() aggregates the kernel density of

- numeric-hypercolumns. In the following example, we have numeric-hypercolumn phenotype.nncross.kerndens, aggregated kernel density of numeric-hypercolumn phenotype.nncross
- numeric mark(s) in the ppp-hypercolumn. In the following example, we have numeric-hypercolumn hladr.kerndens, aggregated kernel density of numeric mark hladr in ppp-hypercolumn

(mdist = out$phenotype.nncross |> unlist() |> max())
#> [1] 354.2968
out |>
aggregate_kerndens(by = ~ patient_id, from = 0, to = mdist, mc.cores = 1L)
#> Column(s) image_id removed; as they are not identical per aggregation-group
#> Hyperframe:
#> OS gender age patient_id phenotype.nncross.kerndens hladr.kerndens
#> 1 3488+ F 85 #01 0-889-121 (numeric) (numeric)
#> 2 1605 M 66 #02 1-037-393 (numeric) (numeric)
#> 3 176 M 84 #03 2-080-378 (numeric) (numeric)
#> 4 2042+ M 79 #04 2-223-153 (numeric) (numeric)
#> 5 3747+ M 68 #05 2-286-740 (numeric) (numeric)
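The role of the arguments from and to can be seen in a base-R sketch: kernel densities are comparable point by point only if they are evaluated on one shared grid. The sketch assumes that the densities come from stats::density.default() and are averaged element-wise within a patient; the simulated distances and the names image1, image2 and kerndens are hypothetical.

```r
set.seed(1)
# Hypothetical nearest-neighbour distances from two images of one patient
image1 <- rexp(200, rate = 1 / 50)
image2 <- rexp(300, rate = 1 / 80)

# A common evaluation range for both images
mdist <- max(image1, image2)

d1 <- density(image1, from = 0, to = mdist)
d2 <- density(image2, from = 0, to = mdist)

# Same from, to and default n give identical evaluation grids
stopifnot(isTRUE(all.equal(d1$x, d2$x)))

# Element-wise mean of the two density curves (assumed aggregation rule)
kerndens <- (d1$y + d2$y) / 2
```

Without a shared from and to, each call would choose its own range from the data and the two $y vectors would refer to different x positions.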