Moderator: Jean-Christophe Mourrat (Lyon)
Jean-Christophe Mourrat (Lyon),
Leticia Cugliandolo (Paris),
Brice Huang (Boston),
Antoine Maillard (Paris),
Rémi Monasson (Paris),
Guilhem Semerjian (Paris),
Eliran Subag (Rehovot),
Pierfrancesco Urbani (Saclay).
Jean-Christophe Mourrat (Lyon): Celebrating the Abel Prize of Michel Talagrand.
Leticia Cugliandolo (Paris): TBA.
Brice Huang (Boston): Capacity threshold for the Ising perceptron. (Cancelled)
We show that the capacity of the Ising perceptron is, with high probability, upper bounded by the constant $\alpha \approx 0.833$ conjectured by Krauth and Mézard, under the condition that an explicit two-variable function $S(\lambda_1,\lambda_2)$ is maximized at $(1,0)$. Earlier work of Ding and Sun proves the matching lower bound subject to a similar numerical condition; together, these results give a conditional proof of the conjecture of Krauth and Mézard.
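As a small illustrative sketch (not part of the proof strategy above), the storage problem behind this capacity can be stated concretely: given $m = \alpha n$ i.i.d. Gaussian patterns, ask whether some $\sigma \in \{-1,+1\}^n$ classifies all of them with non-negative margin. The function and parameter names below are my own choices for a brute-force check at tiny sizes:

```python
import numpy as np

def has_solution(G, kappa=0.0):
    """Return True iff some sigma in {-1,+1}^n satisfies
    <g_a, sigma>/sqrt(n) >= kappa for every pattern g_a (rows of G)."""
    m, n = G.shape
    for idx in range(2 ** n):
        sigma = 1 - 2 * np.array([(idx >> k) & 1 for k in range(n)])
        if np.all(G @ sigma / np.sqrt(n) >= kappa):
            return True
    return False

# Empirical satisfiability probability on either side of alpha ~ 0.833
# (finite-size effects are strong at n this small).
rng = np.random.default_rng(0)
n = 10
for alpha in (0.4, 1.4):
    m = int(alpha * n)
    sat = np.mean([has_solution(rng.standard_normal((m, n)))
                   for _ in range(20)])
    print(f"alpha={alpha}: empirical P(SAT) = {sat:.2f}")
```

The conjectured capacity is the value of $\alpha$ at which the probability of satisfiability jumps from one to zero as $n \to \infty$.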
Antoine Maillard (Paris): Average-case matrix discrepancy.
We study the operator norm discrepancy of a sequence of n large symmetric matrices, defined as the minimal operator norm
of any signed sum formed from these matrices. Kunisky and Zhang (2023) recently introduced a random model for this problem,
where the matrices are drawn from the Gaussian Orthogonal Ensemble. This model can be seen as a random variant
of the celebrated Matrix Spencer conjecture, and as a matrix-valued analog of the symmetric binary perceptron in statistical physics.
In this talk, we analyze the satisfiability transition for this random matrix discrepancy problem as the number of matrices
scales quadratically with their dimension:
• First, we prove that the expected number of signings achieving a given operator norm objective exhibits a sharp threshold
at a critical number of matrices. Below this threshold, solutions are typically absent; above it, the average number of solutions
grows exponentially.
• Second, by combining a second-moment analysis with recent results from Altschuler (2023) on margin concentration
in perceptron-like problems, we identify a second critical threshold beyond which solutions exist with high probability.
Our results establish that a collection of n=O(d^2) Gaussian random matrices can be balanced so that the spectrum of the
resulting signed matrix is macroscopically compressed relative to the typical semicircle law. The proofs rely on concentration
inequalities and large deviation estimates for correlated Gaussian matrices under spectral norm constraints. We will also discuss
the breakdown of the second-moment method in certain regions of the phase diagram, and the challenges that arise when
applying advanced non-rigorous statistical-physics tools to this problem.
This talk is based on the manuscript arXiv:2410.17887, as well as ongoing work.
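As an illustrative sketch (the normalizations and names here are mine, not the manuscript's), the discrepancy being studied can be computed by brute force for tiny instances: draw n GOE matrices and minimize the operator norm of the signed sum over all 2^n signings.

```python
import itertools
import numpy as np

def goe(d, rng):
    """Sample a d x d GOE-like symmetric matrix, normalized so the
    spectrum is O(1) (semicircle support roughly [-2, 2])."""
    A = rng.standard_normal((d, d))
    return (A + A.T) / np.sqrt(2 * d)

def min_discrepancy(mats):
    """Brute-force minimum of ||sum_i eps_i A_i||_op over eps in {-1,+1}^n."""
    best = np.inf
    for eps in itertools.product((1, -1), repeat=len(mats)):
        M = sum(e * A for e, A in zip(eps, mats))
        best = min(best, np.linalg.norm(M, 2))  # ord=2: spectral norm
    return best

rng = np.random.default_rng(1)
d, n = 8, 10
mats = [goe(d, rng) for _ in range(n)]
# The best signing compresses the spectrum well below the norm of a
# typical uniformly random signing.
print(min_discrepancy(mats))
```

The satisfiability transition concerns the regime n = Θ(d²), far beyond what this exhaustive search can reach; the sketch only makes the objective concrete.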
Rémi Monasson (Paris): Accelerated Sampling with Stacked Restricted Boltzmann Machines.
Sampling complex distributions is an important but difficult objective in various fields, including physics, chemistry, and statistics. An improvement of standard Monte Carlo (MC) methods, intensively used in particular in the context of disordered systems, is Parallel Tempering, also called replica exchange MC, in which a sequence of MC Markov chains at decreasing temperatures are run in parallel and can swap their configurations. In this seminar I will show how the ideas of parallel tempering can be applied in the context of restricted Boltzmann machines (RBM), a paradigm of unsupervised architectures, capable of learning complex, multimodal distributions. Inspired by Deep Tempering, an approach introduced for deep belief networks, we show how to learn on top of the first RBM a stack of nested RBMs, using the representations of an RBM as 'data' for the next one along the stack. In our Stacked Tempering approach the hidden configurations of a machine can be exchanged with the visible configurations of the next one in the stack. Replica exchanges between the different RBMs are facilitated by the increasingly clustered representations learnt by deeper RBMs, allowing for fast transitions between the different modes of the data distribution. Analytical calculations of mixing times in a simplified theoretical setting shed light on why Stacked Tempering works, and how hyperparameters, such as the aspect ratios of the RBMs and the weight regularization, should be chosen. We illustrate the efficiency of the Stacked Tempering method with respect to standard and replica exchange MC on several datasets: MNIST, in-silico Lattice Proteins, and the 2D-Ising model.
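The exchange move underlying Parallel Tempering (and, mutatis mutandis, the hidden/visible swaps of Stacked Tempering) is the standard replica-swap Metropolis rule. A minimal sketch in my own notation, not the authors' code:

```python
import numpy as np

def swap_accept(beta_lo, beta_hi, E_lo, E_hi, rng):
    """Metropolis rule for exchanging the configurations held at two
    inverse temperatures beta_lo < beta_hi, with E_lo (resp. E_hi) the
    energy currently held by the beta_lo (resp. beta_hi) chain.
    Detailed balance gives acceptance prob min(1, exp(dBeta * dE))."""
    delta = (beta_hi - beta_lo) * (E_hi - E_lo)
    return bool(rng.random() < min(1.0, np.exp(delta)))

# The swap is always accepted when it moves the lower-energy
# configuration to the colder (higher-beta) chain.
rng = np.random.default_rng(3)
print(swap_accept(beta_lo=0.5, beta_hi=2.0, E_lo=-1.0, E_hi=3.0, rng=rng))  # True
```

Swaps that lower the energy of the cold chain are accepted deterministically; the reverse moves are accepted with the Boltzmann-suppressed probability, which preserves the product of Gibbs measures.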
Guilhem Semerjian (Paris): Matrix denoising via low-degree polynomials.
In this talk I will discuss the additive version of the matrix denoising problem, where a random symmetric matrix S has to be inferred from the observation of Y=S+Z, with Z an independent random matrix modeling the noise. Systematic approximations to the Bayes-optimal estimator of S can be built by considering polynomial estimators. When the prior distributions on S and Z are orthogonally invariant, this procedure allows one to recover asymptotically the estimator introduced by Bun, Allez, Bouchaud and Potters in 2016. It also opens the way to the discussion of finite-size corrections, and to non-orthogonally invariant priors. A special case of particular interest occurs when S has a Wishart distribution, the denoising problem then being a simplified version of the extensive-rank matrix factorization problem.
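As a toy illustration (my scalings; the Bayes-optimal polynomial estimators of the talk are far richer), the simplest member of the polynomial family is the degree-1 shrinkage S_hat = a*Y, with a chosen to minimize the mean squared error for independent S and Z:

```python
import numpy as np

rng = np.random.default_rng(2)
d, sigma = 200, 1.0  # dimension and noise level (illustrative values)

def sym(d, scale, rng):
    """Symmetric Gaussian matrix with entries of variance scale^2/d."""
    A = rng.standard_normal((d, d))
    return scale * (A + A.T) / np.sqrt(2 * d)

S = sym(d, 1.0, rng)      # signal
Z = sym(d, sigma, rng)    # independent noise
Y = S + Z                 # observation

# Degree-1 polynomial estimator S_hat = a * Y; minimizing
# E||S - a*Y||_F^2 over a gives a = E||S||^2 / (E||S||^2 + E||Z||^2).
a = 1.0 / (1.0 + sigma ** 2)
S_hat = a * Y

err_raw = np.linalg.norm(S - Y, "fro")
err_shrunk = np.linalg.norm(S - S_hat, "fro")
print(err_shrunk < err_raw)  # shrinkage beats the raw observation
```

Higher-degree polynomials in Y refine this scalar shrinkage into eigenvalue-dependent shrinkage, which is how the rotationally invariant estimator of Bun, Allez, Bouchaud and Potters is recovered in the orthogonally invariant setting.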
Eliran Subag (Rehovot): Disordered Gibbs measures and Gaussian conditioning.
Given a Gaussian energy function H(x) and another random process O(x) (an observable), both defined on the same
configuration space, what is the law of O(y) for y sampled from the Gibbs measure associated to H(x)? We will see
the answer to this question in the high temperature phase in a general setting, and in the more challenging low
temperature phase for spherical spin glasses. For both, the answer will be given in terms of the law of O(x) for
deterministic x, conditional on an appropriate event. In the former case the conditioning will only involve the value
of the energy H(x) at the same point x. In the latter, we additionally need to specify the energy and its derivatives
over a sequence of critical points with a certain geometry.
Based on joint work with Amir Dembo.
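For orientation (this is the classical fact behind the phrase "Gaussian conditioning", not the talk's specific result): when the pair (O(x), H(x)) is jointly Gaussian and centered, conditioning on the value of the energy at the same point x gives

```latex
% Classical Gaussian conditioning: law of O(x) given H(x) = e
\[
  O(x) \,\big|\, \{H(x) = e\}
  \;\sim\;
  \mathcal{N}\!\left(
    \frac{\operatorname{Cov}\!\big(O(x),H(x)\big)}{\operatorname{Var}\!\big(H(x)\big)}\, e,\;
    \operatorname{Var}\!\big(O(x)\big)
      - \frac{\operatorname{Cov}\!\big(O(x),H(x)\big)^{2}}{\operatorname{Var}\!\big(H(x)\big)}
  \right).
\]
```

In the low temperature phase described above, the conditioning event is richer: it also fixes the energy and its derivatives along a sequence of critical points with the prescribed geometry.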
Pierfrancesco Urbani (Saclay): The KHGPS model and the spin glass transition in a field.
The KHGPS model is a soft mean-field spin glass with an unusual spin glass transition in a field at zero temperature. I will discuss the corresponding properties and the physics of the spin glass phase, and argue that this suggests an interesting scenario for the transition in finite dimensions. A deformation of the model allows one to construct a zero-temperature field theory which becomes strongly coupled on approaching the bare critical point. If time permits, I will also discuss a possible missing ingredient in the final picture.