### Wavefield retrieval, manifold learning and data reduction

**Wavefield retrieval from surface array data**

There have been extensive efforts to develop data regularization strategies and interpolation techniques that accommodate the limitations of present-day acquisition. Acquisition designs that depart from regular sampling and exploit frames with respect to which the data compress (motivated by compressed sensing) have been considered; however, the underlying theory is subtle and incomplete for solutions of the system of equations describing elastic waves. The multi-scale physics of waves provides opportunities to discover new directions in probing, retrieval and compression.

*Sparsity, compression, geometry*. The wavepackets, like curvelets, are positioned via translation on a lattice for a given scale and orientation. Naturally, the points of the lattice have no knowledge of the waves in the data. We developed a nonlinear technique in which the points become a function of the data to be decomposed, thus departing from the use of frames, and obtain optimal compression. This technique makes use of block Hankel operators and so-called AAK theory [75,76].

Recently, we introduced a method for frequency extrapolation assuming that the data (in time) can be sparsely represented by a sum of Lorentzian functions. This condition restricts, for example, the ‘density’ of reflections in a given time window. The current implementation of this technique relies on knowledge of the band-pass filter, a requirement that needs to be relaxed further. The extrapolation provides insight into the ultimate information contained in finite-frequency data.

Via compression, we can also recover geometrical information (slopes, curvatures, etc.) by connecting points in phase space that correspond to the significant wavepacket coefficients using graph cuts. This approach appeared to accommodate conflicting slopes in the data and proved to be a powerful tool for interpolation.
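The link between the Lorentzian assumption and Hankel operators can be illustrated with a small numerical sketch. A Lorentzian pulse in time has a Fourier transform that decays exponentially in frequency and oscillates with the pulse's arrival time, so on a uniform grid of positive frequencies a sum of K such pulses becomes a sum of K damped complex exponentials, and the Hankel matrix built from those samples has rank K. The code below is a minimal illustration under simplifying assumptions: it ignores the band-pass filter, uses a hypothetical three-term test signal, and estimates the poles with a standard subspace (ESPRIT-type) method, which is a stand-in and not the AAK-based technique of [75,76].

```python
import numpy as np

def hankel_matrix(samples, rows):
    """Return the Hankel matrix H[i, j] = samples[i + j] with `rows` rows."""
    samples = np.asarray(samples)
    cols = samples.size - rows + 1
    index = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    return samples[index]

def estimate_poles(samples, num_terms, rows):
    """Estimate poles z_k of samples[n] = sum_k a_k * z_k**n (ESPRIT-type)."""
    u, _, _ = np.linalg.svd(hankel_matrix(samples, rows), full_matrices=False)
    signal_space = u[:, :num_terms]
    # Shift invariance: the basis without its first row equals the basis
    # without its last row times a matrix whose eigenvalues are the poles.
    shift = np.linalg.pinv(signal_space[:-1]) @ signal_space[1:]
    return np.linalg.eigvals(shift)

# Hypothetical test data: three damped complex exponentials on 64 frequencies.
true_poles = np.array([0.95 * np.exp(0.3j), 0.90 * np.exp(1.1j), 0.97 * np.exp(-0.7j)])
amplitudes = np.array([1.0, 0.5, 2.0])
freq_index = np.arange(64)
signal = (amplitudes[None, :] * true_poles[None, :] ** freq_index[:, None]).sum(axis=1)

# The Hankel matrix has as many significant singular values as there are terms.
singular_values = np.linalg.svd(hankel_matrix(signal, 32), compute_uv=False)
numerical_rank = int(np.sum(singular_values > 1e-8 * singular_values[0]))

# Fit poles and amplitudes, then extrapolate to 32 unobserved frequencies.
poles = estimate_poles(signal, num_terms=3, rows=32)
vandermonde = poles[None, :] ** freq_index[:, None]
fitted_amplitudes = np.linalg.lstsq(vandermonde, signal, rcond=None)[0]
new_index = np.arange(64, 96)
extrapolated = (fitted_amplitudes[None, :] * poles[None, :] ** new_index[:, None]).sum(axis=1)
```

In this noise-free setting the extrapolation is exact up to rounding. With noisy, band-limited data the number of terms and the conditioning of the pole estimate become the limiting factors, which is one way to read the remark about the information contained in finite-frequency data.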

*Gradients, nonlinear diffusion*. Through multi-component data we have implicit information about the spatial derivatives (and, hence, the directionality) of the wavefield. We can use these derivatives to form structure tensors (which we can extend to allow for conflicting slopes). We constructed a nonlinear anisotropic diffusion equation using these tensors; the associated process effectively provides a new interpolation concept.
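As an illustration of how derivatives determine local directionality, the sketch below forms a 2-D structure tensor from the time and space derivatives of a wavefield section and reads off a local slope and a coherence measure. It is a minimal example under stated assumptions: the derivatives are obtained by finite differences on a synthetic plane wave (a stand-in for derivatives inferred from multi-component data), the smoothing is a simple box average, and the function names are hypothetical. It covers only the single-slope case and does not implement the extension to conflicting slopes or the diffusion equation.

```python
import numpy as np

def box_smooth(field, width):
    """Average a 2-D field over a width x width window (zero padded at edges)."""
    kernel = np.ones(width) / width
    along_t = np.apply_along_axis(np.convolve, 0, field, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, along_t, kernel, mode="same")

def structure_tensor(d_t, d_x, width=9):
    """Smoothed entries (J_tt, J_tx, J_xx) of the gradient outer product."""
    return (box_smooth(d_t * d_t, width),
            box_smooth(d_t * d_x, width),
            box_smooth(d_x * d_x, width))

def local_slope(j_tt, j_tx, eps=1e-12):
    """Least-squares slope p solving d_x + p * d_t = 0 over each window."""
    return -j_tx / (j_tt + eps)

def coherence(j_tt, j_tx, j_xx, eps=1e-12):
    """(lam1 - lam2) / (lam1 + lam2); values near 1 indicate a single slope."""
    gap = np.sqrt((j_tt - j_xx) ** 2 + 4.0 * j_tx ** 2)
    return gap / (j_tt + j_xx + eps)

# Synthetic plane wave u(t, x) = f(t - p * x) with slope p = 0.4.
t = np.linspace(0.0, 1.0, 200)
x = np.linspace(0.0, 1.0, 100)
p_true = 0.4
wavefield = np.sin(2 * np.pi * 8.0 * (t[:, None] - p_true * x[None, :]))

d_t, d_x = np.gradient(wavefield, t, x)   # finite-difference stand-in
j_tt, j_tx, j_xx = structure_tensor(d_t, d_x)
slope = local_slope(j_tt, j_tx)
coh = coherence(j_tt, j_tx, j_xx)
```

Away from the edges the estimated slope is close to `p_true` and the coherence is close to 1. Where events with different slopes cross, the coherence drops, which is the situation the extended tensors are meant to handle.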

The **Dirichlet-to-Neumann map** represents the data in the analysis of the inverse boundary value problem for seismic waves. This map models observations when acting on a boundary source, or equivalently simultaneous sources, which can be synthesized from actual sources in multiple ways. We have studied the probing of this Dirichlet-to-Neumann map in the frequency domain, carried out an analysis and obtained conditions and a procedure to enable accurate low-rank approximations. This property lends itself to time-harmonic wavefield recovery with matrix completion techniques.
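To make the connection between low rank and recovery concrete, the following sketch completes a partially observed complex matrix by alternating between enforcing the observed entries and projecting onto matrices of a fixed rank. It is an illustration only: the matrix is a synthetic, exactly rank-2 stand-in rather than an actual Dirichlet-to-Neumann map, the sampling pattern is uniformly random, and this simple iterative truncated-SVD scheme is a generic choice, not necessarily the procedure referred to above.

```python
import numpy as np

def rank_projection(matrix, rank):
    """Best rank-`rank` approximation in the Frobenius norm (truncated SVD)."""
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vh[:rank]

def complete_low_rank(observed, mask, rank, num_iters=500):
    """Fill unobserved entries by alternating data consistency and rank projection."""
    estimate = np.zeros_like(observed)
    for _ in range(num_iters):
        # Keep measured entries, use the current estimate elsewhere.
        filled = np.where(mask, observed, estimate)
        estimate = rank_projection(filled, rank)
    return estimate

# Synthetic stand-in for a time-harmonic data matrix (assumption: exact rank 2).
rng = np.random.default_rng(0)
size, rank = 40, 2
left = rng.standard_normal((size, rank)) + 1j * rng.standard_normal((size, rank))
right = rng.standard_normal((rank, size)) + 1j * rng.standard_normal((rank, size))
full_map = left @ right

mask = rng.random((size, size)) < 0.7          # True where an entry was measured
observed = np.where(mask, full_map, 0.0)
recovered = complete_low_rank(observed, mask, rank)
relative_error = np.linalg.norm(recovered - full_map) / np.linalg.norm(full_map)
```

The scheme needs the rank as an input and assumes that the observed entries are spread over all rows and columns. In practice the map is only approximately low rank, so the achievable error is bounded below by the truncation error of the low-rank approximation.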

**Classification of seismic phases and manifold learning**

The analysis of seismic waves remains a tedious task that seismologists often perform by inspection of the available seismograms. The complexity and noisy nature of the signal combined with a sparse and irregular sampling make this analysis difficult and imprecise. Yet, a detailed interpretation of the geometric contents of these datasets can provide valuable prior information for the solution of corresponding inverse problems.

This state of affairs is indicative of a lack of effective and robust algorithms for the computational parsing and interpretation of seismograms. Indeed, the limited frequency content, strong nonlinearity and temporally scattered nature of these signals prove challenging for standard signal processing techniques.

In this context, recent advances in signal processing research show remarkable promise for the automatic characterization of the subtle and irregular patterns that are present in seismic waves. The scattering transform, in particular, through its invariance to translation and stability under small temporal distortions, constitutes a compelling new direction for the processing of seismic waves. In a pilot study, we have shown that the scattering transform combined with dimensionality reduction techniques can elucidate the fundamental organization of seismic signals. Our approach leverages ideas from signal processing and machine learning to create an efficient computational pipeline through which individual events and their constitutive wave patterns can be identified and connected across signals. We expect a wide range of applications, such as the classification of tremors and the identification of repeating earthquakes.
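A stripped-down version of such a pipeline can be sketched in a few lines. The code below computes time-averaged first- and second-order scattering coefficients with a small bank of Gaussian band-pass filters applied via the FFT, then embeds the resulting feature vectors in two dimensions with principal component analysis. Everything here is an assumption made for illustration: the filter shapes and parameters, the global time averaging, the synthetic "events" and the use of PCA as the dimensionality reduction step. It is not the pipeline of the pilot study.

```python
import numpy as np

def gabor_bank(length, num_scales, top_freq=0.35, rel_width=0.25):
    """Approximately analytic band-pass filters in the Fourier domain, one octave apart."""
    freqs = np.fft.fftfreq(length)
    centers = top_freq * 2.0 ** (-np.arange(num_scales))
    offsets = (freqs[None, :] - centers[:, None]) / (rel_width * centers[:, None])
    return np.exp(-0.5 * offsets ** 2)

def scattering_features(signal, bank):
    """Time-averaged first- and second-order scattering coefficients."""
    first = np.abs(np.fft.ifft(np.fft.fft(signal)[None, :] * bank, axis=1))
    features = [first.mean(axis=1)]
    for j1 in range(bank.shape[0] - 1):
        # Second order: the envelope at scale j1, analysed at coarser scales only.
        coarser = bank[j1 + 1:]
        second = np.abs(np.fft.ifft(np.fft.fft(first[j1])[None, :] * coarser, axis=1))
        features.append(second.mean(axis=1))
    return np.concatenate(features)

def pca_embed(features, dim=2):
    """Project feature vectors onto their leading principal components."""
    centered = features - features.mean(axis=0)
    _, _, vh = np.linalg.svd(centered, full_matrices=False)
    return centered @ vh[:dim].T

rng = np.random.default_rng(0)
length, num_scales = 512, 5
bank = gabor_bank(length, num_scales)
time = np.arange(length)

def synthetic_event(freq, onset):
    """Hypothetical noisy wave burst with a given dominant frequency and onset."""
    envelope = np.exp(-(((time - onset) / 20.0) ** 2))
    return envelope * np.sin(2 * np.pi * freq * time) + 0.05 * rng.standard_normal(length)

labels = np.arange(20) % 2
signals = [synthetic_event(0.05 if lab == 0 else 0.15, rng.uniform(100, 400)) for lab in labels]
features = np.array([scattering_features(s, bank) for s in signals])
embedding = pca_embed(features)
```

Because the convolutions are circular and the averaging is global, the features of a signal and of any circularly shifted copy agree up to rounding, which is the translation invariance mentioned above in its simplest form. The onset time of an event therefore does not affect where it lands in the embedding.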

**Uncertainty quantification**

We developed data analysis tools for uncertainty quantification of discretized linearized inverse problems, based on $\ell^{2}$- and $\ell^{1}$-regularization [88]. These can be used to reveal the presence of systematic errors or to validate the assumed statistical model. Our methods include bounds on the performance of randomized estimators of large matrices, bounds for the bias, and resampling methods for model validation.
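The text does not specify which randomized estimators are meant. As one common example of the kind, the sketch below estimates the trace of a large symmetric matrix from matrix-vector products with random sign vectors (Hutchinson's estimator), and reports an empirical standard error alongside the estimate. The matrix, the number of probes and the function names are hypothetical choices for illustration.

```python
import numpy as np

def hutchinson_trace(matvec, size, num_probes, rng):
    """Estimate trace(A) from products A @ z with random sign vectors z."""
    samples = np.empty(num_probes)
    for i in range(num_probes):
        probe = rng.choice([-1.0, 1.0], size=size)
        samples[i] = probe @ matvec(probe)    # unbiased single-probe estimate
    estimate = samples.mean()
    standard_error = samples.std(ddof=1) / np.sqrt(num_probes)
    return estimate, standard_error

rng = np.random.default_rng(0)
size = 200
factor = rng.standard_normal((size, size))
matrix = factor @ factor.T / size             # symmetric positive semidefinite

# Only matrix-vector products are used, so A never has to be formed explicitly.
estimate, standard_error = hutchinson_trace(lambda v: matrix @ v, size, 2000, rng)
exact = np.trace(matrix)
```

The estimator only requires the action of the matrix on a vector, which is what makes it usable when the matrix is too large to store. The empirical standard error is a data-driven indication of accuracy, whereas performance bounds of the kind mentioned above give guarantees in advance.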