Sparse Models for Space-Time Environmental Data Inspired from Statistical Physics
Dionissios T. Hristopulos
Geostatistics Laboratory, School of Mineral Resources Engineering
Technical University of Crete - Chania 73100, Greece
Laboratory Web: http://www.geostatistics.tuc.gr/index.php?id=4908
This presentation will focus on models for space-time data based on statistical physics. Statistical physics provides a general framework for developing space-time models using Boltzmann-Gibbs probability density functions and stochastic partial differential equations (SPDEs). In geostatistics and in machine learning, spatial models are typically defined in terms of an explicit covariance (kernel) function. In contrast, in the Boltzmann-Gibbs approach the covariance function is not specified directly; it follows from the underlying joint probability density model. This density is determined by an energy function that incorporates interactions between different locations and times. In the SPDE formulation, the covariance function is determined by the stochastic equation of motion of the random field, which leads to a corresponding partial differential equation for the covariance function.
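As a minimal illustration of the Boltzmann-Gibbs construction for the Gaussian case, the relation between the energy function and the covariance can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the abstract: x is the vector of field values at the space-time points of interest, m is its mean, H is the energy function, Z is the normalizing constant (partition function), J is the precision matrix that encodes the interactions, and C is the covariance matrix.

```latex
f(\mathbf{x}) = \frac{1}{Z}\, e^{-H(\mathbf{x})}, \qquad
H(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{m})^{\top} \mathbf{J}\,(\mathbf{x}-\mathbf{m}), \qquad
\mathbf{C} = \mathbf{J}^{-1}.
```

In this sketch the model is specified through J, that is, through the interactions in H, and the covariance C is a derived quantity rather than an input.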
I will briefly discuss the connection between the Boltzmann-Gibbs and SPDE formulations for Gaussian random fields. I will then review some results which are based on Boltzmann-Gibbs densities equipped with an energy function that comprises short-range interactions. These results include: (1) a class of flexible spatial covariance functions; (2) a non-separable covariance function with a composite space-time metric; (3) a family of non-separable covariance functions that are based on linear response theory combined with the space transform; and (4) ongoing efforts to generalize Boltzmann-Gibbs models from continuum and lattice spaces to irregular sampling geometries. The space-time models generated by the Boltzmann-Gibbs formulation with short-range energy functions have sparse precision matrices by construction. This is a significant advantage for the processing of big spatial or space-time datasets, since it avoids the computationally demanding inversion of large covariance matrices that is common in geostatistics and Gaussian process regression. I will illustrate these concepts with applications to environmental and energy resources datasets.
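The computational point about sparse precision matrices can be sketched with a toy example. The code below is not one of the models from the talk: it assumes a zero-mean Gaussian field on a one-dimensional lattice with a hypothetical nearest-neighbor energy H(x) = 0.5 * [beta * sum_i x_i^2 + alpha * sum_i (x_{i+1} - x_i)^2]. The function names `lattice_precision` and `conditional_mean` and the parameters `alpha` and `beta` are illustrative assumptions. Prediction at unobserved nodes then needs only a sparse linear solve with the precision matrix.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve


def lattice_precision(n, alpha=1.0, beta=0.1):
    """Sparse precision matrix J of a toy short-range energy on n lattice nodes.

    Assumed energy (illustrative only):
        H(x) = 0.5 * [beta * sum_i x_i^2 + alpha * sum_i (x_{i+1} - x_i)^2]
    so that H(x) = 0.5 * x^T J x with J = beta * I + alpha * D^T D.
    """
    # D is the (n-1) x n forward-difference operator: (D x)_i = x_{i+1} - x_i
    D = sp.diags([-np.ones(n - 1), np.ones(n - 1)], offsets=[0, 1],
                 shape=(n - 1, n))
    return (beta * sp.identity(n) + alpha * (D.T @ D)).tocsc()


def conditional_mean(J, obs_idx, obs_val):
    """Conditional mean of a zero-mean Gaussian field given some observed nodes.

    Uses the precision-based identity E[x_u | x_o] = -J_uu^{-1} J_uo x_o,
    which requires a sparse solve but no covariance matrix.
    """
    n = J.shape[0]
    mask = np.ones(n, dtype=bool)
    mask[obs_idx] = False
    unobs_idx = np.flatnonzero(mask)

    J = J.tocsr()
    J_uu = J[unobs_idx][:, unobs_idx].tocsc()  # unobserved-unobserved block
    J_uo = J[unobs_idx][:, obs_idx]            # unobserved-observed block

    x = np.empty(n)
    x[obs_idx] = obs_val
    x[unobs_idx] = spsolve(J_uu, -(J_uo @ obs_val))
    return x


# Toy usage: 200 nodes, 4 made-up observations (not real data)
n = 200
J = lattice_precision(n)
obs_idx = np.array([10, 60, 120, 180])
obs_val = np.array([1.0, -0.5, 2.0, 0.3])
x_hat = conditional_mean(J, obs_idx, obs_val)
```

Because each node interacts only with its nearest neighbors, J is tridiagonal here, so storage and the cost of the solve grow roughly linearly with the number of nodes, whereas a dense covariance-based predictor would need to store and factorize a full matrix. The same identity applies in higher dimensions and on space-time lattices, although the sparsity pattern and solver cost differ.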