NLA MT25, Gaussian random matrices


Flashcards

gaussian-random-matrix-definition

@Define a Gaussian random matrix.


A matrix $G \in \mathbb R^{m \times n}$ whose entries $G _ {ij}$ are i.i.d. $\sim N(0, 1)$.
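
For concreteness, a minimal NumPy sketch of drawing such a matrix (sizes and seed are illustrative):

```python
import numpy as np

# Draw an m-by-n matrix with i.i.d. N(0, 1) entries.
rng = np.random.default_rng(0)
m, n = 1000, 50
G = rng.standard_normal((m, n))
```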

orthogonal-invariance-of-gaussian-matrices

@State what is meant by the orthogonal invariance of Gaussian random matrices $G$.


The distribution of a Gaussian random matrix is invariant under orthogonal multiplication, i.e. if:

  • $G$ is a Gaussian matrix
  • $Q$ is an orthogonal matrix independent of $G$

Then:

  • $QG$ and $GQ$ are both Gaussian random matrices
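
A quick numerical sanity check of this, assuming $Q$ is obtained from the QR factorisation of an independent Gaussian matrix (sizes and trial count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# An orthogonal Q independent of G, via QR of a fresh Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Pool the entries of QG over repeated draws of G; if QG is again
# Gaussian, the pooled entries should look like N(0, 1) samples.
samples = np.concatenate(
    [(Q @ rng.standard_normal((n, n))).ravel() for _ in range(20)]
)
print(samples.mean(), samples.std())            # ~0, ~1
print(np.quantile(samples, [0.25, 0.5, 0.75]))  # ~[-0.674, 0, 0.674]
```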

square-gaussian-ill-conditioning

@State a result about the conditioning of a square Gaussian matrix, and contrast this with the rectangular case.


Suppose $G \in \mathbb R^{n \times n}$ has i.i.d. $N(0, 1)$ entries. Then $G$ is almost surely nonsingular, but the density of $\sigma _ \min(G)$ is bounded away from zero near the origin, so for small $t$,

\[\mathbb P(\sigma _ \min(G) < t) \sim t,\]

and consequently

\[\mathbb E[\sigma _ \min(G)^{-2}] = \infty.\]

In contrast, a rectangular Gaussian matrix with $m > n$ has $\sigma _ \min \approx \sqrt{m} - \sqrt{n}$, bounded away from zero by ∆marchenko-pastur, and hence $\mathbb E[\sigma _ \min^{-2}]$ is finite.
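
A Monte Carlo illustration of the contrast (sizes and trial counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
trials, n = 200, 100

# Smallest singular value: square (n x n) vs tall (4n x n) Gaussian.
sq = [np.linalg.svd(rng.standard_normal((n, n)), compute_uv=False)[-1]
      for _ in range(trials)]
rect = [np.linalg.svd(rng.standard_normal((4 * n, n)), compute_uv=False)[-1]
        for _ in range(trials)]

print(min(sq))    # small, occasionally tiny: no uniform lower bound
print(min(rect))  # concentrates near sqrt(4n) - sqrt(n) = 10, away from 0
```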

Bite-sized

The two key facts about Gaussian random matrices used throughout randomised NLA are: orthogonal invariance (the distribution is unchanged by left/right multiplication by an independent orthogonal matrix), and Marchenko-Pastur (rectangular Gaussians are well-conditioned with high probability).

Source: Lecture 14, Gaussian random matrices slide and §14.1 of the lecture notes.

@bite~

bite-gaussian-invariance-utility

Why is the orthogonal invariance of Gaussian random matrices so useful in randomised NLA analysis?


It lets us replace $GQ$ or $QG$ with another Gaussian matrix wherever $Q$ is an orthogonal matrix independent of $G$. Combined with Marchenko-Pastur conditioning of the resulting rectangular Gaussian, this is the workhorse tool behind error bounds for sketch-and-solve, Blendenpik, HMT randomised SVD, and generalised Nyström.

Source: Lecture 14, §14.1.1 of the lecture notes (and its applications in §14.4, §14.5, §15.2).
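
A minimal sketch-and-solve illustration of the pattern above (sizes, column scaling, and noise level are made up for the demo, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, s = 20000, 50, 500

# A tall least-squares problem with graded column norms.
A = rng.standard_normal((m, n)) @ np.diag(np.logspace(0, 3, n))
b = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

# Sketch-and-solve: compress the m-row problem to s rows with a
# Gaussian sketch G, then solve the small problem exactly.
G = rng.standard_normal((s, m)) / np.sqrt(s)
x_sk, *_ = np.linalg.lstsq(G @ A, G @ b, rcond=None)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Residual of the sketched solution vs the optimal one: close to 1.
print(np.linalg.norm(A @ x_sk - b) / np.linalg.norm(A @ x_ls - b))
```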

@bite~

Two proofs of orthogonal invariance for a Gaussian column $g \sim N(0, I _ n)$ under independent orthogonal $Q$: (a) compute the mean $\mathbb E[Qg] = 0$ and covariance $\mathbb E[(Qg)(Qg)^\top] = Q I _ n Q^\top = I _ n$, so $Qg$ has the same multivariate Gaussian distribution as $g$; (b) the joint pdf $\tfrac{1}{(2\pi)^{n/2}} e^{-\|g\|^2/2}$ depends only on $\|g\|$, and $\vert \det Q \vert = 1$, so the change of variables $g \mapsto Qg$ leaves the pdf invariant.

Source: Lecture 14, §14.1.1 of the lecture notes (proof of orthogonal invariance).
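
Both ingredients can be checked numerically; a small sketch verifying the moments in (a) and the change-of-variables facts in (b) (sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, N = 50, 100000
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# (a) Empirical mean and covariance of Qg over many draws of g.
Qgs = Q @ rng.standard_normal((n, N))
print(np.abs(Qgs.mean(axis=1)).max())             # ~0
print(np.abs(Qgs @ Qgs.T / N - np.eye(n)).max())  # ~0 up to sampling error

# (b) |det Q| = 1 and the pdf's argument ||g|| is preserved by Q.
g = rng.standard_normal(n)
print(abs(np.linalg.det(Q)))                      # 1.0
print(np.linalg.norm(Q @ g), np.linalg.norm(g))   # equal
```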

@bite~

bite-when-gaussian-sketch-suboptimal

When is a dense Gaussian sketch not the right choice in practice, despite being analytically convenient?


When forming $GA$ is too expensive: a dense Gaussian sketch costs $O(smn)$ flops, which for the typical choice $s = O(n)$ is the same order as direct QR-based least-squares. Structured fast sketches like the SRFT ($G = SFD$ with $F$ an FFT, $D$ a random $\pm 1$ diagonal, $S$ a row-subsampling matrix) reduce this to $O(mn \log s + sn^2)$, and CountSketch is even faster on sparse $A$. The trade-off is that the theoretical analysis becomes much more delicate.

Source: Lecture 14, §14.4 of the lecture notes (“Complexity and faster sketches” remark, non-examinable).
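
A minimal sketch of an SRFT-style transform under the $G = SFD$ factorisation above, using the complex FFT for simplicity (a real-arithmetic variant would use a DCT or Hadamard transform; the function name and sizes are illustrative):

```python
import numpy as np

def srft_sketch(A, s, rng):
    """Apply an SRFT-style sketch G @ A with G = S F D:
    D = random +/-1 diagonal, F = normalised FFT, S = row subsampling.
    Output is complex; cost is O(mn log m) via the FFT, vs O(smn) dense."""
    m = A.shape[0]
    DA = rng.choice([-1.0, 1.0], size=m)[:, None] * A  # D A
    FDA = np.fft.fft(DA, axis=0) / np.sqrt(m)          # F D A
    rows = rng.choice(m, size=s, replace=False)        # S: sample s rows
    return np.sqrt(m / s) * FDA[rows]                  # rescale the sample

rng = np.random.default_rng(5)
A = rng.standard_normal((4096, 30))
SA = srft_sketch(A, 300, rng)
```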

@bite~



