--- title: "Gasper: GrAph Signal ProcEssing in R" author: Basile de Loynes, Fabien Navarro, Baptiste Olivier date: "`r Sys.Date()`" output: rmarkdown::pdf_document: citation_package: natbib fig_caption: no keep_tex: true includes: in_header: template-latex.tex number_sections: yes geometry: margin=1in fontsize: 11pt bibliography: references biblio-style: apalike vignette: > %\VignetteIndexEntry{Gasper Vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} library(gasper) if (as.numeric(R.version$minor)>6){ RNGkind(sample.kind = "Rounding") } set.seed(434343) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width=3, fig.height=3, fig.align="center" ) knitr::opts_knit$set(global.par = TRUE) ``` \begin{abstract} We present a short tutorial on to the use of the \proglang{R} \pkg{gasper} package. Gasper is a package dedicated to signal processing on graphs. It also provides an interface to the SuiteSparse Matrix Collection. \end{abstract} # Introduction The emerging field of Graph Signal Processing (GSP) aims to bridge the gap between signal processing and spectral graph theory. One of the objectives is to generalize fundamental analysis operations from regular grid signals to irregular structures in the form of graphs. There is an abundant literature on GSP, in particular we refer the reader to @shuman2013emerging and @ortega2018graph for an introduction to this field and an overview of recent developments, challenges and applications. GSP has also given rise to numerous applications in machine/deep learning: convolutional neural networks (CNN) on graphs @bruna2013spectral, @henaff2015deep, @defferrard2016convolutional, semi-supervised classification with graph CNN @kipf2016semi, @hamilton2017inductive, community detection @tremblay2014graph, to name just a few. Different software programs exist for processing signals on graphs, in different languages. The Graph Signal Processing toolbox (GSPbox) is an easy to use matlab toolbox that performs a wide variety of operations on graphs. This toolbox was port to Python as the PyGSP @perraudin2014gspbox. There is also another matlab toolbox the Spectral Graph Wavelet Transform (SGWT) toolbox dedicated to the implementation of the SGWT developed in @hammond2011wavelets. However, to our knowledge, there are not yet any tools dedicated to GSP in \proglang{R}. A development version of the \pkg{gasper} package is currently available online\footnote{\url{https://github.com/fabnavarro/gasper}}, while the latest stable release can be obtained from the Comprehensive R Archive Network\footnote{\url{https://cran.r-project.org/web/packages/gasper/}}. In particular, it includes the methodology and codes\footnote{\url{https://github.com/fabnavarro/SGWT-SURE}} developed in @de2019data and provides an interface to the SuiteSparse Matrix Collection @davis2011university. This vignette is organized as follows. Section 2 introduces the interface to the SuiteSparse Matrix Collection and some visualization tools for GSP. Section 3 gives a short introduction to some underlying concepts of GSP, focusing on the Graph Fourier Transform. Section 4 gives an illustration for denoising signals on graphs using SGWT, thresholding techniques, and the minimization of Stein Unbiased Risk Estimator for an automatic selection of the threshold parameter. # Graphs Collection and Visualization A certain number of graphs are present in the package. 
Each graph is stored as an Rdata file containing a list composed of the graph's weight matrix $W$ (in the form of a sparse matrix denoted by `sA`) and the coordinates associated with the graph (if it has any).

An interface is also provided to retrieve the matrices related to many problems provided by the SuiteSparse Matrix Collection (formerly known as the University of Florida Sparse Matrix Collection) @davis2011university, @kolodziej19. This collection is a large and actively growing set of sparse matrices that arise in real applications (such as structural engineering, computational fluid dynamics, computer graphics/vision, optimization, economic and financial modeling, mathematics and statistics, to name just a few). For more details, see https://sparse.tamu.edu/.

The package includes the `SuiteSparseData` dataset, which contains data from the SuiteSparse Matrix Collection. The structure of this data frame mirrors the structure of the main table presented on the SuiteSparse Matrix Collection website, allowing users to query and explore the dataset directly within \proglang{R}.

Here is a sample of the `SuiteSparseData` dataset, showing the first 15 rows of the table:
```{r, eval=FALSE}
head(SuiteSparseData, 15)
```

```{r, echo=FALSE}
SuiteSparseData_subset <- head(SuiteSparseData, 15)
df1 <- kableExtra::kbl(SuiteSparseData_subset,
                       valign = 't',
                       format = "latex",
                       booktabs = TRUE,
                       caption = "Overview of the first 15 matrices from the SuiteSparse Matrix Collection.")
kableExtra::kable_styling(df1, latex_options = c("striped", "HOLD_position"))
```

For example, to retrieve all undirected graphs with between 100 and 150 rows and columns:
```{r, eval=FALSE}
filtered_mat <- SuiteSparseData[SuiteSparseData$Kind == "Undirected Graph" &
                                SuiteSparseData$Rows >= 100 & SuiteSparseData$Rows <= 150 &
                                SuiteSparseData$Cols >= 100 & SuiteSparseData$Cols <= 150, ]
filtered_mat
```

```{r, echo=FALSE}
filtered_mat <- SuiteSparseData[SuiteSparseData$Kind == "Undirected Graph" &
                                SuiteSparseData$Rows >= 100 & SuiteSparseData$Rows <= 150 &
                                SuiteSparseData$Cols >= 100 & SuiteSparseData$Cols <= 150, ]
df2 <- kableExtra::kbl(filtered_mat,
                       valign = 't',
                       format = "latex",
                       booktabs = TRUE,
                       caption = "Subset of undirected matrices with 100 to 150 rows and columns.")
kableExtra::kable_styling(df2, latex_options = c("striped", "HOLD_position"))
```

```{r, include=FALSE}
data(grid1)
grid1_1 <- grid1
grid1$info <- NULL
```

The `download_graph` function allows the user to download a matrix from this collection, based on the name of the matrix and the name of the group that provides it. An example is given below:
```{r, eval=FALSE}
matrixname <- "grid1"
groupname <- "AG-Monien"
download_graph(matrixname, groupname)
```

```{r}
attributes(grid1)
```

The output is stored (in a temporary folder) as a list composed of:

* `sA`, the corresponding sparse matrix (in compressed sparse column format);
```{r}
str(grid1$sA)
```
* possibly coordinates `xy` (stored in a `data.frame`);
```{r}
head(grid1$xy, 3)
```
* `dim`, the numbers of rows, columns and numerically nonzero elements;
```{r}
grid1$dim
```
* `temp`, the path to the temporary directory where the matrix and downloaded files (including singular values if requested) are stored.
```{r, eval=FALSE}
list.files(grid1$temp)
```

Metadata associated with the matrix can be displayed via
```{r, eval=FALSE}
file.show(paste(grid1$temp, "grid1", sep = ""))
```
or printed in the console:
```{r, eval=FALSE}
cat(readLines(paste(grid1$temp, "grid1", sep = ""), n = 14), sep = "\n")
```

The `download_graph` function has an optional `svd` argument; setting `svd = TRUE` downloads a ".mat" file containing the singular values of the matrix, if available.

For further insights, the `get_graph_info` function retrieves detailed information about the matrix from the SuiteSparse Matrix Collection website. `get_graph_info` fetches three tables, "MatrixInformation", "MatrixProperties" and "SVDStatistics", providing a comprehensive overview of the matrix (the `rvest` package needs to be installed).
```{r, eval=FALSE}
matrix_info <- get_graph_info(matrixname, groupname)
matrix_info
```

The `download_graph` function also has an optional argument `add_info` which, when set to `TRUE`, automatically calls `get_graph_info` and appends the retrieved information to the output of `download_graph`. This makes it easy to get both the graph data and its associated information in a single function call.
```{r, eval=FALSE}
downloaded_graph <- download_graph(matrixname, groupname, add_info = TRUE)
downloaded_graph$info
```

```{r, echo=FALSE, eval=T, fig.pos='H'}
data(grid1)
df1 <- kableExtra::kbl(grid1_1$info[[1]],
                       valign = 't',
                       format = "latex",
                       booktabs = TRUE)
df2 <- kableExtra::kable(grid1_1$info[[2]],
                         valign = 't',
                         format = "latex",
                         booktabs = TRUE)
x_table <- knitr::kables(
  list(df1, df2),
  format = "latex",
  caption = "Matrix Information (left) and Matrix Properties (right).")
kableExtra::kable_styling(x_table, latex_options = c("HOLD_position"))
```

The package also allows plotting a (planar) graph using the function `plot_graph`, and contains a function, `plot_signal`, to plot signals defined on top of the graph.
```{r, fig.show='hold', fig.cap="Graph (left) and graph signal (right)."}
f <- rnorm(nrow(grid1$sA))
plot_graph(grid1)
plot_signal(grid1, f, size = 2)
```

In cases where node coordinates are not available, `plot_graph` employs simple spectral graph drawing to compute them. This is done using the function `spectral_coords`, which computes the spectral coordinates based on the eigenvectors associated with the two smallest non-zero eigenvalues of the graph's Laplacian @hall70.
```{r, eval=FALSE, include=FALSE}
grid1$xy <- NULL
attributes(grid1)
plot_graph(grid1)
```

# A Short Introduction to Graph Signal Processing

Graph theory provides a robust mathematical framework for representing complex systems. In this context, entities are modeled as vertices (or nodes) and their interconnections as edges, encapsulating a broad spectrum of real-world phenomena from social and communication networks to molecular structures and brain connectivity patterns. Among the diverse types of graphs, such as undirected, directed, weighted, bipartite, and multigraphs, each offers distinct analytical advantages tailored to specific contexts. This vignette is devoted to undirected, connected graphs, where edges link two vertices symmetrically, often with weighted values expressing the strength or intensity of the connection. For instance, in a road network graph, the weights might correspond to the length of each road segment.
Graphs are defined by ${G}=({V}, {E})$, where ${V}$ denotes the set of vertices or nodes and ${E}$ represents the set of edges. Each edge $(i, j) \in {E}$ connects nodes $i$ and $j$, potentially with an associated weight $w_{ij}$. The connectivity and interaction structure of ${G}$ is encoded in the adjacency matrix $\W$, where $w_{ij} = w_{ji}$ for $i,j\in V$. The size of the graph is the number of nodes $n=|V|$. The degree matrix $\D$ is a diagonal matrix with $\D_{ii} = \sum_{j\in V} w_{ij}$. These matrices, $\W$ and $\D$, serve as the foundation for analyzing signal behavior on graph structures in GSP.

The spectral properties of the Laplacian matrices offer deep insights into the structure of graphs. The unnormalized Laplacian, $\La = \D - \W$, has non-negative eigenvalues, with the smallest being zero; the multiplicity of the zero eigenvalue equals the number of connected components in the graph. This number can be retrieved from the "MatrixProperties" table using the `get_graph_info` function, if the graph has been downloaded with `download_graph`, or computed with a depth-first search algorithm, for instance using the \pkg{igraph} \proglang{R} package \cite{igraph}.
```{r}
grid1$info$MatrixProp["Strongly Connect Components",]
```

On the other hand, the normalized Laplacian, $\La_{\mathrm{norm}} = I - \D^{-1/2} \W \D^{-1/2}$, has a spectrum contained in the interval $[0, 2]$. The multiplicity of the zero eigenvalue again gives the number of connected components, and the smallest non-zero eigenvalue, often referred to as the spectral gap, plays a crucial role in determining the graph's propensity for clustering. The larger this gap, the more pronounced the cluster structures within the graph. By scaling the eigenvalues according to node degrees, the normalized Laplacian accentuates the separation between clusters, making the spectral gap a significant measure in spectral graph clustering algorithms. However, a large spectral gap might present challenges for fast spectral filtering, depending on the approximation methods used, where it could lead to issues in convergence or computational efficiency. Lastly, the random walk Laplacian, $\La_{\mathrm{rw}} = I - \D^{-1}\W$, is generally better suited for directed graphs and scenarios involving random walk dynamics. Its spectrum is also non-negative. The choice of which Laplacian to use is dictated by the particular graph properties one aims to emphasize or analyze in a given application.

The `laplacian_mat` function (which supports both standard and sparse matrix representations) allows the computation of these three forms of Laplacian matrices. For example, let $G=({V}, {E})$ be a simple undirected graph with the vertex set ${V} = \{1, 2, 3\}$ and the edge set ${E} = \{\{1, 2\}, \{2, 3\}\}$. The corresponding adjacency matrix $\W$, as well as its unnormalized, normalized, and random walk Laplacians, can be represented and calculated as follows:
```{r}
W <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), ncol = 3)
laplacian_mat(W, "unnormalized")
laplacian_mat(W, "normalized")
laplacian_mat(W, "randomwalk")
```

GSP extends classical signal processing concepts to signals defined on graphs. Let ${G}$ be a graph on the vertex set ${V} = \{v_1, \ldots, v_n\}$. A graph signal on ${G}$ is a function $f : {V} \rightarrow \R$ that assigns a value to each node of the graph. It can be represented as a vector $(f(v_1), f(v_2), \ldots, f(v_n))^\top \in \R^n$, where each entry $f_i = f(v_i)$ is the signal value at node $v_i$.

The Laplacian quadratic form $f^\top\La f$ gives a measure of a graph signal's smoothness:
\[
f^\top \La f = \frac{1}{2} \sum_{(i,j) \in E} w_{ij} (f_i - f_j)^2,
\]
where a lower value indicates that the signal varies little between connected nodes, and is thus smoother on the graph. It is a global measure of the signal's "frequency" content on the graph, which is insightful for understanding its overall variation. The `smoothmodulus` function calculates this form for a given graph signal, returning a scalar value that quantifies the signal's smoothness in relation to the graph's structure. Moreover, the `randsignal` function can be used to generate graph signals with varying smoothness properties.
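To make this concrete, here is a minimal sketch (not evaluated) computing the quadratic form for hypothetical signal values on the toy path graph defined above; the commented call assumes `smoothmodulus` takes the signal and the adjacency matrix:
```{r, eval=FALSE}
f_toy <- c(0.2, 1, 0.4)                   # hypothetical values at nodes 1, 2, 3
L_toy <- laplacian_mat(W, "unnormalized") # W from the example above
as.numeric(t(f_toy) %*% L_toy %*% f_toy)  # Laplacian quadratic form f^T L f
# smoothmodulus(f_toy, W)                 # packaged equivalent (assumed
#                                         # (signal, adjacency) argument order)
```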
To analyze graph signals, the concept of the Graph Fourier Transform (GFT) is fundamental. The GFT provides a means to represent graph signals in the frequency domain, analogous to the classical Fourier Transform for traditional signals. Given a graph ${G}$, a GFT can be defined as the representation of signals on an orthonormal basis for $\R^n$ consisting of eigenvectors of the graph shift operator. The choice of the graph shift operator is essential, as it determines the basis for the GFT; it can be either the Laplacian matrix or the adjacency matrix. In this tutorial, we focus on signal processing using the Laplacian matrix as the shift operator.

For undirected graphs, the Laplacian matrix $\La$ is symmetric and positive semi-definite, with non-negative real eigenvalues. Given the eigendecomposition of the graph Laplacian $\La = U \Lambda U^T$, where $U$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues, the GFT of a signal $f$ is given by $\hat{f} = U^T f$. Here, $\hat{f}$ represents the graph signal in the frequency domain. The elements of $\hat{f}$ are the coefficients of the signal $f$ with respect to the eigenvectors of $\La$, which can be interpreted as the frequency components of the signal on the graph. The inverse GFT is given by $f = U \hat{f}$, which allows the reconstruction of the graph signal in the vertex domain from its frequency representation.

The GFT provides a powerful tool for analyzing and processing signals on graphs. It enables the identification of signal components that vary smoothly or abruptly over the graph, facilitating tasks such as filtering, denoising, and compression of graph signals. The function `forward_gft` performs a GFT decomposition and returns the associated Fourier coefficients; the function `inverse_gft` performs the reconstruction.
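As an illustration, the forward and inverse GFT can be written directly from the eigendecomposition. The sketch below (not evaluated) uses `eigensort`, which also appears in the denoising example of the next section, and mirrors what `forward_gft` and `inverse_gft` compute:
```{r, eval=FALSE}
L <- laplacian_mat(grid1$sA)      # unnormalized Laplacian of grid1
U <- eigensort(L)$evectors        # eigenvectors sorted by increasing eigenvalue
f <- rnorm(nrow(L))               # a random graph signal
hatf <- t(U) %*% f                # forward GFT: hat(f) = U^T f
f_rec <- U %*% hatf               # inverse GFT: f = U hat(f)
all.equal(as.numeric(f_rec), f)   # TRUE: the signal is recovered exactly
```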
# Data-Driven Graph Signal Denoising

We give an example of an application in the case of the denoising of a noisy signal $f$ defined on a graph $G$ with set of vertices $V$. More precisely, the (unnormalized) graph Laplacian matrix $\La\in\R^{V\times V}$ associated with $G$ is the symmetric matrix defined as $\La=\D - \W$, where $\W$ is the matrix of weights with coefficients $(w_{ij})_{i,j\in V}$, and $\D$ the diagonal matrix with diagonal coefficients $\D_{ii}= \sum_{j\in V} w_{ij}$. A signal $f$ on the graph $G$ is a function $f:V\rightarrow \R$.

The degradation model can be written as
\[
\tilde f = f + \xi,
\]
where $\xi\sim\mathcal{N}(0,\sigma^2 I_n)$. The purpose of denoising is to build an estimator of $f$ that depends only on $\tilde f$. A simple way to construct an effective non-linear estimator is to threshold the SGWT coefficients of $\tilde f$ on a frame (see @hammond2011wavelets for details about the SGWT).

A general thresholding operator $\tau$ with threshold parameter $t\geq 0$ applied to some signal $f$ is defined as
\begin{equation}\label{eq:tau}
\tau(x,t)=x\max \{ 1-t^{\beta}|x|^{-\beta},0 \},
\end{equation}
with $\beta \geq 1$. The most popular choices are soft thresholding ($\beta=1$), James-Stein thresholding ($\beta=2$) and hard thresholding ($\beta=\infty$).

Given the Laplacian and a frame, denoising in this framework can be summarized as follows:

* Analysis: compute the SGWT transform $\WT \tilde f$;
* Thresholding: apply a given thresholding operator to the coefficients $\WT \tilde f$;
* Synthesis: apply the inverse SGWT transform to obtain an estimate of the original signal.

Each of these steps can be performed via one of the functions `analysis`, `betathresh` and `synthesis`. The Laplacian is given by the function `laplacian_mat`. The `tight_frame` function allows the construction of a tight frame based on @gobel2018construction and @coulhon2012heat. In order to select a threshold value, we consider the method developed in @de2019data, which consists in determining the threshold that minimizes the Stein Unbiased Risk Estimator (SURE) in a graph setting (see @de2019data for more details).

We give an illustrative example on the `grid1` graph from the previous section. We start by computing the Laplacian matrix (from the adjacency matrix), its eigendecomposition and the frame coefficients.
```{r}
A <- grid1$sA
L <- laplacian_mat(A)
val1 <- eigensort(L)
evalues <- val1$evalues
evectors <- val1$evectors
#- largest eigenvalue
lmax <- max(evalues)
#- parameter that controls the scale number
b <- 2
tf <- tight_frame(evalues, evectors, b = b)
```

Wavelet frames can be seen as special filter banks. The tight frame considered here is a finite collection $(\psi_j)_{j=0, \ldots,J}$ forming a finite partition of unity on the compact interval $[0,\lambda_1]$, where $\lambda_1$ is the largest eigenvalue of the Laplacian spectrum $\mathrm{sp}(\La)$. This partition is defined as follows: let $\omega : \mathbb R^+ \rightarrow [0,1]$ be some function with support in $[0,1]$, satisfying $\omega \equiv 1$ on $[0,b^{-1}]$ for some $b>1$, and set
\begin{equation*}
\psi_0(x)=\omega(x)~~\textrm{and}~~\psi_j(x)=\omega(b^{-j}x)-\omega(b^{-j+1}x)~~\textrm{for}~~j=1, \ldots, J,~~\textrm{where}~~J= \left \lfloor \frac{\log \lambda_1}{\log b} \right \rfloor + 2.
\end{equation*}
Thanks to Parseval's identity, the following set of vectors is a tight frame:
\[
\mathfrak F = \left \{ \sqrt{\psi_j}(\La)\delta_i,~j=0, \ldots, J,~i \in V \right \}.
\]
The `plot_filter` function represents the elements $\sqrt{\psi_j}$ (filters) of this partition, with
\[
\omega(x) = \begin{cases}
1 & \text{if } x \in [0,b^{-1}] \\
b \cdot \frac{x}{1 - b} + \frac{b}{b - 1} & \text{if } x \in (b^{-1}, 1] \\
0 & \text{if } x > 1
\end{cases}
\]
which corresponds to the tight frame constructed from the `zetav` function.
```{r, echo=FALSE}
par(mgp = c(1.5, 0.5, 0), tcl = 0.2, mar = .1 + c(2.5, 2.5, 0, 0),
    oma = c(0, 0, 0, 0), cex.axis = 0.8, las = 1)
```

```{r, fig.width=4, fig.height=2, fig.cap="Plot of the spectral graph filters on the spectrum of the grid1 graph."}
plot_filter(lmax, b)
```

The SGWT of a signal $f \in \mathbb R^V$ is given by
\[
\WT f = \left ( \left(\sqrt{\psi_0}(\La)f\right)^{T},\ldots,\left(\sqrt{\psi_J}(\La)f\right)^{T} \right )^{T} \in \mathbb R^{n(J+1)}.
\]
The adjoint linear transformation $\WT^\ast$ of $\WT$ is given by
\[
\WT^\ast \left (\eta_{0}^{T}, \eta_{1}^{T}, \ldots, \eta_{J}^T \right )^{T} = \sum_{j\geq 0} \sqrt{\psi_j}(\La)\eta_{j}.
\]
The tightness of the underlying frame implies that $\WT^\ast \WT=\mathrm{Id}_{\mathbb R^V}$, so that a signal $f \in \mathbb R^V$ can be recovered exactly by applying $\WT^\ast$ to its wavelet coefficients $((\WT f)_i)_{i=1, \ldots, n(J+1)} \in \mathbb R^{n(J+1)}$.
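This reconstruction property can be checked numerically; the sketch below (not evaluated) applies `analysis` and `synthesis`, introduced above, to an arbitrary signal:
```{r, eval=FALSE}
g <- rnorm(nrow(L))              # an arbitrary signal on grid1
wc_g <- analysis(g, tf)          # wavelet coefficients W g
g_rec <- synthesis(wc_g, tf)     # adjoint reconstruction W* W g
all.equal(as.numeric(g_rec), g)  # TRUE, up to numerical precision
```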
Then, noisy observations $\tilde f$ are generated from a random signal $f$.
```{r}
n <- nrow(L)
f <- randsignal(0.01, 3, A)
sigma <- 0.01
noise <- rnorm(n, sd = sigma)
tilde_f <- f + noise
```

Below is a graphical representation of the original signal and its noisy version.
```{r, fig.show='hold', fig.cap="Original graph signal (left) and noisy observations (right)."}
plot_signal(grid1, f, size = 2)
plot_signal(grid1, tilde_f, size = 2)
```

We compute the SGWT transforms $\WT \tilde f$ and $\WT f$.
```{r}
wcn <- analysis(tilde_f, tf)
wcf <- analysis(f, tf)
```

An alternative that avoids the explicit frame calculation is given by the `forward_sgwt` function, which provides a forward SGWT. For example:
```{r, eval=FALSE}
wcf <- forward_sgwt(f, evalues, evectors, b = b)
```

The optimal threshold is then determined by minimizing the SURE (using Donoho and Johnstone's trick @donoho1995adapting, which remains valid here; see @de2019data). More precisely, the SURE for a general thresholding process $h$ is given by the following identity:
\begin{equation}
\mathbf{SURE}(h)=-n \sigma^2 + \|h(\widetilde F)-\widetilde F\|^2 + 2 \sum_{i,j=1}^{n(J+1)} \gamma_{i,j}^2 \partial_j h_i(\widetilde F),
\end{equation}
where $\gamma_{i,j}^2=\sigma^2(\WT \WT ^\ast)_{i,j}$ can be computed from the frame (or estimated via Monte Carlo simulation). The functions `SUREthresh` and `SURE_MSEthresh` evaluate the SURE (in a global fashion) considering the general thresholding operator $\tau$ of Equation \eqref{eq:tau}. These functions provide two different ways of applying the threshold, "uniform" and "dependent" (\emph{i.e.}, the same threshold for each coefficient vs. a threshold normalized by the variance of each coefficient). The second approach generally provides better results (especially when the weights have been computed via the frame). A comparative example of these two approaches is given below (with the $\beta=2$ James-Stein attenuation threshold).
```{r}
diagWWt <- colSums(t(tf)^2)
thresh <- sort(abs(wcn))
opt_thresh_d <- SURE_MSEthresh(wcn,
                               wcf,
                               thresh,
                               diagWWt,
                               beta = 2,
                               sigma,
                               NA,
                               policy = "dependent",
                               keepwc = TRUE)
opt_thresh_u <- SURE_MSEthresh(wcn,
                               wcf,
                               thresh,
                               diagWWt,
                               beta = 2,
                               sigma,
                               NA,
                               policy = "uniform",
                               keepwc = TRUE)
```

We can plot the MSE risks and their SURE estimates as functions of the threshold parameter (assuming that $\sigma$ is known).
```{r, fig.width=4, fig.height=3, fig.cap="MSE risk and its SURE estimates as a function of the threshold parameter."}
plot(thresh, opt_thresh_u$res$MSE,
     type = "l", xlab = "t", ylab = "risk", log = "x")
lines(thresh, opt_thresh_u$res$SURE - n * sigma^2, col = "red")
lines(thresh, opt_thresh_d$res$MSE, lty = 2)
lines(thresh, opt_thresh_d$res$SURE - n * sigma^2, col = "red", lty = 2)
legend("topleft",
       legend = c("MSE_u", "SURE_u", "MSE_d", "SURE_d"),
       col = rep(c("black", "red"), 2),
       lty = c(1, 1, 2, 2), cex = 1)
```

Finally, the synthesis step produces the resulting estimators of $f$, \emph{i.e.}, the ones that minimize the unknown MSE risk and the ones that minimize its SURE estimate.
```{r}
wc_oracle_u <- opt_thresh_u$wc[, opt_thresh_u$min["xminMSE"]]
wc_oracle_d <- opt_thresh_d$wc[, opt_thresh_d$min["xminMSE"]]
wc_SURE_u <- opt_thresh_u$wc[, opt_thresh_u$min["xminSURE"]]
wc_SURE_d <- opt_thresh_d$wc[, opt_thresh_d$min["xminSURE"]]

hatf_oracle_u <- synthesis(wc_oracle_u, tf)
hatf_oracle_d <- synthesis(wc_oracle_d, tf)
hatf_SURE_u <- synthesis(wc_SURE_u, tf)
hatf_SURE_d <- synthesis(wc_SURE_d, tf)

res <- data.frame("Input_SNR" = round(SNR(f, tilde_f), 2),
                  "MSE_u" = round(SNR(f, hatf_oracle_u), 2),
                  "SURE_u" = round(SNR(f, hatf_SURE_u), 2),
                  "MSE_d" = round(SNR(f, hatf_oracle_d), 2),
                  "SURE_d" = round(SNR(f, hatf_SURE_d), 2))
```

```{r risk, echo=F}
df <- kableExtra::kbl(res,
                      valign = 't',
                      format = "latex",
                      booktabs = TRUE,
                      caption = "Comparison of SNR performance between uniform and dependent policies.")
kableExtra::kable_styling(df, latex_options = c("striped", "HOLD_position"))
```

It can be seen from Table \ref{tab:risk} that in both cases the SURE provides a good estimate of the MSE, and therefore the resulting estimators have performances close (in terms of SNR) to those obtained by minimizing the unknown risk.

Equivalently, the estimators can be obtained via the inverse SGWT given by the function `inverse_sgwt`. For example:
```{r, eval=FALSE}
hatf_oracle_u <- inverse_sgwt(wc_oracle_u, evalues, evectors, b)
```

Or, if the coefficients have not been stored for each threshold value (\emph{i.e.}, with the argument `keepwc = FALSE` when calling `SUREthresh`), using the thresholding function `betathresh`, \emph{e.g.},
```{r, eval=FALSE}
wc_oracle_u <- betathresh(wcn, thresh[opt_thresh_u$min[[1]]], 2)
```

Notably, the SURE can also be applied in a level-dependent manner using `SUREthresh` at each scale (the output of `SUREthresh` can be retrieved with the argument `keepSURE = TRUE` in the function call).
```{r}
J <- floor(log(lmax) / log(b)) + 2
LD_opt_thresh_u <- LD_SUREthresh(J = J,
                                 wcn = wcn,
                                 diagWWt = diagWWt,
                                 beta = 2,
                                 sigma = sigma,
                                 hatsigma = NA,
                                 policy = "uniform",
                                 keepSURE = FALSE)
hatf_LD_SURE_u <- synthesis(LD_opt_thresh_u$wcLDSURE, tf)
print(paste0("LD_SURE_u = ", round(SNR(f, hatf_LD_SURE_u), 2), "dB"))
```

Even though the SURE no longer depends on the original signal, it still depends on $\sigma^2$. Two naive (biased) estimators of this variance are obtained via the `GVN` and `HPVN` functions (see @de2019data for more details). Another possible improvement would be to use a scale-dependent variance estimator (especially in the case of the "dependent" policy). Furthermore, the major limitations are the need to diagonalize the graph's Laplacian and the computation of the weights involved in the SURE (which requires an explicit calculation of the frame). To address the first limitation, several strategies have been proposed in the literature, notably approximation by Chebyshev polynomials (see @hammond2011wavelets or @shuman18). Combined with these approximations, a Monte Carlo method to estimate the SURE weights has been proposed in @chedemail22, extending the applicability of the SURE to large graphs.