| Title: | Supervised Classification for Functional Data via Signed Depth |
| Version: | 0.1.0 |
| Description: | Provides a suite of supervised classifiers for functional data based on the concept of signed depth. The core pipeline computes Fraiman-Muniz (FM) functional depth in either its Tukey or Simplicial variant, derives a signed depth by comparing each curve to a reference median curve via the signed distance integral, and feeds the resulting scalar summary into several classifiers: the k-Ranked Nearest Neighbour (k-RNN) rule, a moving-average smoother, a kernel-density Bayes rule, logistic regression on signed depth and distance to the mode, and a generalised additive model (GAM) classifier. Cross-validation routines for tuning the neighbourhood size k and parametric bootstrap confidence intervals are also included. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| Language: | en-GB |
| RoxygenNote: | 7.3.1 |
| Depends: | R (≥ 4.1.0) |
| Imports: | stats, graphics, mgcv, modeest |
| Suggests: | testthat (≥ 3.0.0), spelling, knitr, rmarkdown |
| Config/testthat/edition: | 3 |
| URL: | https://github.com/dapr12/fdclassify |
| BugReports: | https://github.com/dapr12/fdclassify/issues |
| NeedsCompilation: | no |
| Packaged: | 2026-04-22 14:53:05 UTC; mbbxkdp3 |
| Author: | Diego Andrés Pérez Ruiz |
| Maintainer: | Diego Andrés Pérez Ruiz <diego.perezruiz@manchester.ac.uk> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-23 20:10:03 UTC |
Bayesian Kernel-Density Classifier on Signed Depth
Description
Classifies functional curves via Bayes' rule applied to the signed depths.
Class-conditional densities f_g(\mathrm{sdp}) are estimated by
kernel density estimation; prior probabilities are estimated from class
frequencies or supplied by the user.
Usage
bayes_depth_classify(
X_train,
y_train,
X_test = NULL,
priors = NULL,
bw_method = "nrd0",
grid = NULL,
type = c("tukey", "simplicial")
)
## S3 method for class 'fd_bayes_fit'
print(x, ...)
Arguments
X_train: Numeric matrix (N x M) of training curves.
y_train: Integer vector of labels (0/1).
X_test: Numeric matrix of test curves. If NULL, the training curves are classified.
priors: Named numeric vector of prior probabilities for the two classes. If NULL, priors are estimated from the class frequencies.
bw_method: Bandwidth selection method passed to stats::density(). Default "nrd0".
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
x: A "fd_bayes_fit" object.
...: Ignored.
Details
The posterior probability that a new observation belongs to class g
is
P(g \mid \mathrm{sdp}_0) =
\frac{f_g(\mathrm{sdp}_0)\,\pi_g}
{\sum_{g'} f_{g'}(\mathrm{sdp}_0)\,\pi_{g'}}.
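As a minimal illustration of the rule above, the posterior can be computed by hand from class-wise density() fits on a scalar score. This is a sketch with illustrative names (sdp, post0), not the package's internal code:

```r
set.seed(1)
sdp <- c(rnorm(50, -1), rnorm(50, 1))   # stand-in signed depths for the two classes
y   <- rep(0:1, each = 50)
pi0 <- mean(y == 0); pi1 <- 1 - pi0     # priors from class frequencies
d0  <- density(sdp[y == 0], bw = "nrd0")
d1  <- density(sdp[y == 1], bw = "nrd0")
# evaluate each fitted density at the observed scores by interpolation
f0  <- approx(d0$x, d0$y, xout = sdp, rule = 2)$y
f1  <- approx(d1$x, d1$y, xout = sdp, rule = 2)$y
post0 <- f0 * pi0 / (f0 * pi0 + f1 * pi1)   # P(g = 0 | sdp_0)
pred  <- ifelse(post0 >= 0.5, 0L, 1L)
mean(pred == y)
```

With well-separated scores, the rule recovers most labels; bayes_depth_classify() performs the analogous computation on actual signed depths.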
Value
An object of class "fd_bayes_fit" with components:
- predicted: Predicted class labels.
- prob: Matrix with columns prob_0 and prob_1: posterior probabilities.
- log_odds: Log-odds \log(f_0\pi_0 / f_1\pi_1).
- sd_train: The "fd_signed_depth" object for the training set.
The print method invisibly returns x.
Examples
set.seed(11)
M <- 80; N <- 100
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(50, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(50, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 50)
fit <- bayes_depth_classify(X, y)
table(fit$predicted, y)
Fraiman-Muniz Functional Depth
Description
Computes the Fraiman-Muniz (FM) depth for a collection of discretised
functional observations. Two variants are supported: the Tukey-FM depth,
which maps depth values to [0, 1], and the Simplicial-FM depth,
which maps to [0.5, 1]. The two are related by
\text{Tukey-FM}_i = 2\,(\text{Simplicial-FM}_i - 1/2).
Usage
fm_depth(X, grid = NULL, type = c("tukey", "simplicial"))
Arguments
X: Numeric matrix of dimension N x M: N curves observed at M time points.
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Character string, either "tukey" (default) or "simplicial".
Details
For a discretised dataset \{x_i(t_j)\}, i = 1,\ldots,N,
j = 1,\ldots,M, the Simplicial-FM depth is
\text{FM}_i = \sum_{j=2}^{M}(t_j - t_{j-1})
\left[1 - \left|\frac{1}{2} - F_{N,t_j}(x_i(t_j))\right|\right],
where F_{N,t} is the empirical CDF of the sample at time t.
The Tukey-FM depth substitutes the Tukey univariate depth
\min\{F_{N,t}(x), 1 - F_{N,t}(x)\} at each time point.
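The Simplicial-FM formula translates almost line for line into base R. The sketch below assumes a regular grid on [0, 1] and is illustrative rather than the package's implementation:

```r
# Simplicial-FM depth, transcribed from the formula above.
fm_simplicial <- function(X, t = seq(0, 1, length.out = ncol(X))) {
  # pointwise empirical CDF F_{N,t_j} evaluated at each observation
  Fn <- apply(X, 2, function(col) ecdf(col)(col))   # N x M matrix
  integrand <- 1 - abs(0.5 - Fn)                    # in [0.5, 1] pointwise
  w <- diff(t)                                      # t_j - t_{j-1}
  as.vector(integrand[, -1] %*% w)                  # sum over j = 2..M
}
set.seed(2)
X <- matrix(rnorm(30 * 50), nrow = 30)
d <- fm_simplicial(X)
range(d)   # all values lie in [0.5, 1]
```

Applying the relation Tukey-FM = 2(Simplicial-FM - 1/2) then rescales these values to [0, 1].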
Value
A numeric vector of length N containing the FM depth value
for each curve.
References
Fraiman, R. and Muniz, G. (2001). Trimmed means for functional data. Test, 10(2), 419–440.
López-Pintado, S. and Romo, J. (2009). On the concept of depth for functional data. Journal of the American Statistical Association, 104(486), 718–734.
Examples
set.seed(1)
N <- 50; M <- 100
X <- matrix(rnorm(N * M), nrow = N)
d <- fm_depth(X)
plot(d, xlab = "Curve index", ylab = "Tukey-FM depth")
GAM Classifier on Signed Depth
Description
Fits a generalised additive model (GAM) with a smooth term on the signed depth to estimate class membership probabilities. An optional iterative outlier down-weighting scheme is available (Section 3.8 of Perez Ruiz, 2020).
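The underlying fit reduces to a single mgcv call with a smooth on the scalar signed depth. A minimal sketch, assuming a data frame with an illustrative column sdp (not the package's internals):

```r
library(mgcv)  # listed in Imports
set.seed(6)
df  <- data.frame(y   = rep(0:1, each = 60),
                  sdp = c(rnorm(60, -0.5), rnorm(60, 0.5)))  # stand-in signed depths
# smooth term on signed depth; k corresponds to the k_gam argument
fit <- gam(y ~ s(sdp, k = 10), family = binomial, data = df)
p   <- predict(fit, type = "response")   # fitted class-1 probabilities
```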
Usage
gam_depth_classify(
X_train,
y_train,
X_test = NULL,
covariates = c("sdp", "sdp+mode"),
k_gam = 10L,
n_pc = 10L,
downweight = FALSE,
max_iter = 10L,
grid = NULL,
type = c("tukey", "simplicial")
)
## S3 method for class 'fd_gam_fit'
print(x, ...)
Arguments
X_train: Numeric matrix (N x M) of training curves.
y_train: Integer vector of labels (0/1).
X_test: Numeric matrix. If NULL, the training curves are classified.
covariates: Character: "sdp" (signed depth only, default) or "sdp+mode" (signed depth plus signed distance to the mode).
k_gam: Basis dimension for the smooth term. Default 10.
n_pc: Number of PCs for mode estimation. Default 10.
downweight: Logical; if TRUE, the iterative outlier down-weighting scheme is applied. Default FALSE.
max_iter: Maximum down-weighting iterations. Default 10.
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
x: A "fd_gam_fit" object.
...: Ignored.
Value
An object of class "fd_gam_fit" with components:
- predicted: Predicted class labels.
- prob: Matrix with columns prob_0 and prob_1.
- gam_fit: The fitted gam object.
- sd_train: The "fd_signed_depth" object for the training set.
- covariates: Character: covariates used.
The print method invisibly returns x.
Examples
set.seed(17)
M <- 80; N <- 100
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(50, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(50, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 50)
fit <- gam_depth_classify(X, y)
print(fit)
Cross-Validation for k-RNN Neighbourhood Size
Description
Selects the optimal half-neighbourhood size k by R-fold
cross-validation, using either the minimum-error rule or the
one-standard-error (1-SE) rule (Section 3.3.1 of Perez Ruiz, 2020).
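The two selection rules can be sketched in a few lines, given vectors of per-k mean CV errors and standard errors (toy numbers below; the direction of "simplest k" under the 1-SE rule here, the smallest k within one standard error of the minimum, is an assumption and the package's convention may differ):

```r
cv_error <- c(0.30, 0.22, 0.20, 0.21, 0.25)   # mean CV error for k = 1..5
cv_se    <- c(0.04, 0.03, 0.03, 0.03, 0.04)   # standard error per k
k_min  <- which.min(cv_error)                 # minimum-error rule
thresh <- cv_error[k_min] + cv_se[k_min]      # 1-SE threshold
k_1se  <- min(which(cv_error <= thresh))      # smallest k within 1 SE
c(k_min = k_min, k_1se = k_1se)               # k_min = 3, k_1se = 2
```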
Usage
krnn_cv(
X,
y,
k_max = NULL,
R = 10L,
rule = c("min", "1se"),
grid = NULL,
type = c("tukey", "simplicial"),
seed = NULL
)
## S3 method for class 'krnn_cv'
print(x, ...)
Arguments
X: Numeric matrix (N x M) of curves.
y: Integer vector of class labels (0/1).
k_max: Maximum value of k to evaluate. If NULL, a data-dependent default is used.
R: Number of CV folds. Default 10.
rule: Character: "min" (minimum-error rule, default) or "1se" (one-standard-error rule).
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
seed: Optional integer seed for reproducibility.
x: A "krnn_cv" object.
...: Ignored.
Value
An object of class "krnn_cv" with components:
- k_opt: Selected optimal k.
- cv_error: Mean CV misclassification error for each k.
- cv_se: Standard error of the CV error for each k.
- k_max: Maximum k evaluated.
- R: Number of folds used.
- rule: The selection rule used.
The print method invisibly returns x.
Examples
set.seed(3)
M <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(50, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(50, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 50)
cv <- krnn_cv(X, y, k_max = 20, R = 5, seed = 1)
print(cv)
plot_krnn_cv(cv)
k-Ranked Nearest Neighbour Classifier for Functional Data
Description
Fits the k-RNN classifier of Perez Ruiz (2020). Curves are ranked by their
signed depth and each new observation is assigned to the majority class
among its 2k nearest ranked neighbours (k above and k below).
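The ranked-neighbour vote can be sketched on a scalar score as follows (illustrative helper, not the exported API; observations at the extreme ranks get fewer than 2k neighbours, and ties go to class 0 here):

```r
# Majority vote among the k training scores ranked just below and the k
# ranked just above each new score.
krnn_vote <- function(sdp_train, y_train, sdp_new, k = 5L) {
  ord <- order(sdp_train)
  s   <- sdp_train[ord]
  lab <- y_train[ord]
  vapply(sdp_new, function(s0) {
    pos   <- findInterval(s0, s)          # number of training scores <= s0
    below <- seq_len(pos)
    above <- setdiff(seq_along(s), below)
    nb    <- c(tail(below, k), head(above, k))  # k below, k above (fewer at edges)
    as.integer(mean(lab[nb]) > 0.5)       # majority class among neighbours
  }, integer(1))
}
p <- krnn_vote(c(-2, -1, -0.5, 0.5, 1, 2), c(0L, 0L, 0L, 1L, 1L, 1L),
               sdp_new = c(-1.5, 1.5), k = 1L)
p  # c(0L, 1L)
```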
Usage
krnn_fit(
X_train,
y_train,
X_test = NULL,
k = 5L,
grid = NULL,
type = c("tukey", "simplicial"),
sd_train = NULL
)
## S3 method for class 'krnn_fit'
print(x, ...)
Arguments
X_train: Numeric matrix (N x M) of training curves.
y_train: Integer vector of class labels (0/1) of length N.
X_test: Numeric matrix of test curves. If NULL, the training curves are classified.
k: Positive integer: half-neighbourhood size. The total number of neighbours per observation is 2k.
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant passed to fm_depth(). Default "tukey".
sd_train: Optional pre-computed "fd_signed_depth" object for the training set.
x: A "krnn_fit" object.
...: Ignored.
Value
An object of class "krnn_fit" with components:
- predicted: Integer vector of predicted class labels.
- prob: Numeric vector of estimated probabilities \hat{P}(y=0 \mid \mathrm{sdp}).
- sd_train: The "fd_signed_depth" object for the training set.
- y_train: The training labels.
- k: The value of k used.
The print method invisibly returns x.
References
Perez Ruiz, D. A. (2020). Supervised Classification for Functional Data. PhD thesis, University of Manchester. Section 3.2.
Examples
set.seed(7)
M <- 100; n <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(n / 2, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(n / 2, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = n / 2)
idx_tr <- sample(n, 60)
fit <- krnn_fit(X[idx_tr, ], y[idx_tr], X[-idx_tr, ], k = 5)
print(fit)
k-RNN Moving-Average Smoother
Description
Estimates the conditional probability
\hat{P}(y = 0 \mid \mathrm{sdp}) using the running-mean smoother
(moving average) with span 2k. This is the regression interpretation
of the k-RNN (Section 3.4 of Perez Ruiz, 2020).
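A running-mean sketch of the smoother, averaging the class-0 indicator over a ranked window of signed depths (illustrative; the boundary handling and the exact span convention here are assumptions, not the package's internals):

```r
running_mean <- function(sdp, y, k = 3L) {
  ord <- order(sdp)
  y0  <- as.integer(y[ord] == 0)        # indicator of class 0, in depth order
  n   <- length(y0)
  prob <- sapply(seq_len(n), function(i) {
    nb <- max(1, i - k):min(n, i + k)   # ranked window of span about 2k
    mean(y0[nb])                        # local estimate of P(y = 0 | sdp)
  })
  list(sdp = sdp[ord], prob = prob)
}
set.seed(5)
sm <- running_mean(c(rnorm(30, -1), rnorm(30, 1)), rep(0:1, each = 30), k = 5L)
```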
Usage
krnn_smoother(
X,
y,
k = 10L,
grid_eval = NULL,
boot = FALSE,
B = 200L,
alpha = 0.05,
grid_fd = NULL,
type = c("tukey", "simplicial")
)
Arguments
X: Numeric matrix (N x M) of curves.
y: Integer vector of labels (0/1).
k: Half-neighbourhood size.
grid_eval: Optional numeric vector of signed-depth values at which to evaluate the smoother. Defaults to the training signed depths.
boot: Logical; if TRUE, parametric bootstrap confidence intervals are computed. Default FALSE.
B: Number of bootstrap replicates (only used when boot = TRUE). Default 200.
alpha: Nominal level: coverage is 1 - alpha. Default 0.05.
grid_fd: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
Value
An object of class "krnn_smoother" with components:
- sdp_eval: Evaluation points (signed depths).
- prob: Estimated conditional probabilities.
- ci_lower, ci_upper: Bootstrap CI bounds (if boot = TRUE).
- k: Half-neighbourhood size used.
Examples
set.seed(5)
M <- 100; N <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(40, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(40, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 40)
sm <- krnn_smoother(X, y, k = 10)
plot_krnn_smoother(sm)
Logistic Regression Classifier on Signed Depth
Description
Fits a logistic regression model with class label as response and signed depth (and optionally signed distance to the mode) as covariates (Section 3.7 of Perez Ruiz, 2020).
Usage
logistic_depth_classify(
X_train,
y_train,
X_test = NULL,
model = c("sdp", "sdp+mode", "sdp*mode"),
n_pc = 10L,
grid = NULL,
type = c("tukey", "simplicial")
)
## S3 method for class 'fd_logistic_fit'
print(x, ...)
Arguments
X_train: Numeric matrix (N x M) of training curves.
y_train: Integer vector of labels (0/1).
X_test: Numeric matrix. If NULL, the training curves are classified.
model: Character: "sdp" (default), "sdp+mode", or "sdp*mode"; see Details.
n_pc: Number of principal components for mode estimation. Default 10.
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
x: A "fd_logistic_fit" object.
...: Ignored.
Details
Three model formulae are supported:
- "sdp": Model 1, signed depth only.
- "sdp+mode": Model 2, signed depth plus signed distance to the mode.
- "sdp*mode": Model 3, Model 2 plus the interaction term.
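The three formulae correspond to ordinary glm() calls; a sketch with illustrative column names sdp and mode_dist (the package constructs these covariates internally):

```r
set.seed(4)
df <- data.frame(y         = rep(0:1, each = 40),
                 sdp       = c(rnorm(40, -0.3), rnorm(40, 0.3)),
                 mode_dist = rnorm(80))
m1 <- glm(y ~ sdp,             family = binomial, data = df)  # Model 1
m2 <- glm(y ~ sdp + mode_dist, family = binomial, data = df)  # Model 2
m3 <- glm(y ~ sdp * mode_dist, family = binomial, data = df)  # Model 3
length(coef(m3))  # 4: intercept, sdp, mode_dist, interaction
```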
Value
An object of class "fd_logistic_fit" with components:
- predicted: Predicted class labels.
- prob: Matrix with columns prob_0 and prob_1.
- glm_fit: The fitted glm object.
- sd_train: The "fd_signed_depth" object for the training set.
- model: Character: model formula used.
The print method invisibly returns x.
Examples
set.seed(13)
M <- 80; N <- 100
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(50, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(50, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 50)
fit <- logistic_depth_classify(X, y, model = "sdp")
summary(fit$glm_fit)
Plot a krnn_cv Object
Description
Plots the cross-validation error curve against the neighbourhood size k, with standard-error bars and a vertical line at the selected optimum.
Usage
plot_krnn_cv(x, ...)
Arguments
x: A "krnn_cv" object.
...: Additional graphical parameters passed to plot().
Value
Invisibly returns x.
Examples
set.seed(3)
M <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(50, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(50, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 50)
cv <- krnn_cv(X, y, k_max = 20, R = 5, seed = 1)
plot_krnn_cv(cv)
Plot a krnn_smoother Object
Description
Plots the estimated conditional probability
\hat{P}(y = 0 \mid \mathrm{sdp}) against the signed depth, with
optional bootstrap confidence bands.
Usage
plot_krnn_smoother(x, ...)
Arguments
x: A "krnn_smoother" object.
...: Additional graphical parameters passed to plot().
Value
Invisibly returns x.
Examples
set.seed(5)
M <- 100; N <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(40, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(40, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = 40)
sm <- krnn_smoother(X, y, k = 10)
plot_krnn_smoother(sm)
Predict Method for krnn_fit Objects
Description
Classifies new functional observations using a fitted
krnn_fit object.
Usage
## S3 method for class 'krnn_fit'
predict(object, newdata, y_true = NULL, ...)
Arguments
object: A "krnn_fit" object.
newdata: Numeric matrix of new curves, observed on the same M-point grid as the training curves.
y_true: Optional integer vector of true labels for computing the misclassification rate.
...: Ignored.
Value
A list with components predicted, prob, and (if
y_true is supplied) misclass.
Examples
set.seed(7)
M <- 100; n <- 80
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(n / 2, sin(2 * pi * t) + rnorm(M, sd = 0.4)))
X1 <- t(replicate(n / 2, cos(2 * pi * t) + rnorm(M, sd = 0.4)))
X <- rbind(X0, X1)
y <- rep(0:1, each = n / 2)
idx_tr <- sample(n, 60)
fit <- krnn_fit(X[idx_tr, ], y[idx_tr], k = 5)
pred <- predict(fit, newdata = X[-idx_tr, ], y_true = y[-idx_tr])
pred$misclass
Reference (Median) Curve
Description
Returns the curve (or average of curves) that attains the maximum Fraiman-Muniz depth over the combined sample — the functional analogue of the median.
Usage
reference_curve(X, depth = NULL, ...)
Arguments
X: Numeric matrix (N x M) of curves.
depth: Numeric vector of length N of pre-computed FM depths. If NULL, depths are computed with fm_depth().
...: Additional arguments forwarded to fm_depth().
Value
A numeric vector of length M.
Examples
set.seed(1)
X <- matrix(rnorm(50 * 100), nrow = 50)
ref <- reference_curve(X)
plot(ref, type = "l", xlab = "t", ylab = "x(t)",
main = "Reference (median) curve")
Signed Depth
Description
The core transformation of the k-RNN pipeline. For each curve computes
\mathrm{sdp}(x_i(t)) =
\mathrm{sgn}\!\left\{\int_I(x_i(t)-x_{\mathrm{ref}}(t))\,dt\right\}
\times D(x_i(t)),
where D(\cdot) is the Fraiman-Muniz depth and the sign is derived
from the signed distance integral (equation 3.2 of Perez Ruiz, 2020).
Curves above the reference receive a positive signed depth; curves below
receive a negative one.
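A sketch of the transformation, using the trapezoidal rule for the signed distance integral (illustrative code; in the package, fm_depth() and reference_curve() supply depth and x_ref):

```r
signed_depth_sketch <- function(X, depth, x_ref,
                                t = seq(0, 1, length.out = ncol(X))) {
  dev <- sweep(X, 2, x_ref)                 # x_i(t_j) - x_ref(t_j)
  M   <- ncol(X)
  mid <- (dev[, -1, drop = FALSE] + dev[, -M, drop = FALSE]) / 2
  sdi <- as.vector(mid %*% diff(t))         # trapezoidal signed distance integral
  sign(sdi) * depth                         # attach the sign to the depth
}
# Two flat curves, one unit above and one unit below a zero reference:
X <- rbind(rep(1, 20), rep(-1, 20))
sd_vals <- signed_depth_sketch(X, depth = c(0.8, 0.6), x_ref = rep(0, 20))
sd_vals  # c(0.8, -0.6)
```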
Usage
signed_depth(
X,
grid = NULL,
type = c("tukey", "simplicial"),
x_ref = NULL,
depth = NULL
)
## S3 method for class 'fd_signed_depth'
print(x, ...)
Arguments
X: Numeric matrix (N x M) of curves.
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
type: Depth variant: "tukey" (default) or "simplicial".
x_ref: Optional pre-computed reference curve (numeric vector of length M). If NULL, it is computed with reference_curve().
depth: Optional pre-computed FM depth vector of length N.
x: A "fd_signed_depth" object.
...: Ignored.
Value
An object of class "fd_signed_depth" with components:
- sdp: Numeric vector of length N: the signed depths.
- depth: Numeric vector of length N: raw FM depths.
- sdi: Numeric vector of length N: signed distance integrals.
- x_ref: Numeric vector of length M: reference curve.
- grid: Numeric vector of length M: time grid used.
- type: Character: depth variant used.
- N: Number of curves.
- M: Number of time points.
The print method invisibly returns x.
References
Perez Ruiz, D. A. (2020). Supervised Classification for Functional Data. PhD thesis, University of Manchester.
Examples
set.seed(42)
N <- 60; M <- 100
t <- seq(0, 1, length.out = M)
X0 <- t(replicate(N / 2, sin(2 * pi * t) + rnorm(M, sd = 0.3)))
X1 <- t(replicate(N / 2, cos(2 * pi * t) + rnorm(M, sd = 0.3)))
X <- rbind(X0, X1)
sd_obj <- signed_depth(X)
plot(sd_obj$sdp, col = rep(1:2, each = N / 2),
xlab = "Curve index", ylab = "Signed depth",
main = "Signed depth by group")
abline(h = 0, lty = 2)
Signed Distance Integral
Description
For each curve x_i(t), computes
\int_I \bigl(x_i(t) - x_{\mathrm{ref}}(t)\bigr)\,dt,
which is positive when the curve lies predominantly above the reference and negative when it lies below. This integral assigns a sign to the depth of each curve (equation 3.1 of Perez Ruiz, 2020).
Usage
signed_distance_integral(X, x_ref = NULL, grid = NULL)
Arguments
X: Numeric matrix (N x M) of curves.
x_ref: Numeric vector of length M giving the reference curve. If NULL, it is computed with reference_curve().
grid: Numeric vector of length M giving the time grid. If NULL, an equally spaced grid on [0, 1] is used.
Value
A numeric vector of length N.
Examples
set.seed(1)
X <- matrix(rnorm(40 * 80), nrow = 40)
sdi <- signed_distance_integral(X)
hist(sdi, main = "Signed distance integrals", xlab = "SDI")
Simulate a Two-Group Functional Dataset
Description
Generates a labelled functional dataset from two Gaussian processes,
useful for illustrating and testing the classifiers in fdclassify.
The two groups differ in their mean function:
x_i(t) = \mu_g(t) + \varepsilon_i(t), \quad
\varepsilon_i(t) \sim \mathcal{GP}(0, \sigma^2),
where \mu_0(t) = A_0 \sin(2\pi f_0 t) and
\mu_1(t) = A_1 \cos(2\pi f_1 t) by default.
Usage
simulate_fd(
n0 = 50L,
n1 = 50L,
M = 100L,
sigma = 0.4,
A0 = 1,
A1 = 1,
f0 = 1,
f1 = 1,
seed = NULL
)
Arguments
n0: Number of curves in group 0. Default 50.
n1: Number of curves in group 1. Default 50.
M: Number of time points. Default 100.
sigma: Standard deviation of the noise. Default 0.4.
A0, A1: Amplitudes of the mean functions. Both default to 1.
f0, f1: Frequencies of the mean functions. Both default to 1.
seed: Optional integer seed.
Value
A list with:
- X: Numeric matrix of dimension (n_0+n_1) \times M.
- y: Integer vector of labels (0 or 1).
- grid: Numeric vector of length M on [0, 1].
Examples
dat <- simulate_fd(n0 = 40, n1 = 40, seed = 1)
matplot(t(dat$X[dat$y == 0, ]), type = "l", col = "steelblue",
lty = 1, xlab = "t", ylab = "x(t)", main = "Simulated curves")
matlines(t(dat$X[dat$y == 1, ]), col = "firebrick", lty = 1)
legend("topright", legend = c("Group 0", "Group 1"),
col = c("steelblue","firebrick"), lty = 1)