Geometric Deep Learning HT26, Sets
Flashcards
Suppose:
- We have a set of $d$-dimensional features, specified as an $n \times d$ matrix $\mathbf X = (\pmb x _ 1, \ldots, \pmb x _ n)^\top$
- $f$ is a function operating on these matrices
- $\Sigma _ n$ is the permutation group on $n$ elements
@Define what it means for $f$ to be permutation invariant, and give an @example of a class of such functions.
For any permutation matrix $\mathbf P = \rho(g)$ representing a permutation $g \in \Sigma _ n$, we have
\[f(\mathbf P \mathbf X) = f(\mathbf X)\]One example is
\[f(\mathbf X) = \phi\left( \sum _ {u \in \mathcal V} \psi(\mathbf x _ u) \right)\]where $\psi$ is independently applied to every node’s features, and $\phi$ is applied to the sum-aggregated outputs.
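This construction can be checked numerically. A minimal sketch, where the specific choices of $\psi$ (here `tanh`) and $\phi$ (here squaring) are illustrative stand-ins, not from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

def psi(x):
    # Applied independently to each element's feature vector (row-wise)
    return np.tanh(x)

def phi(z):
    # Applied to the sum-aggregated output
    return z ** 2

def f(X):
    # f(X) = phi( sum_u psi(x_u) )
    return phi(psi(X).sum(axis=0))

X = rng.normal(size=(5, 3))            # n = 5 elements, d = 3 features
P = np.eye(5)[rng.permutation(5)]      # random permutation matrix

# Permutation invariance: f(PX) = f(X)
assert np.allclose(f(P @ X), f(X))
```

Any permutation-invariant aggregator (sum, mean, max) in place of the sum gives the same invariance property.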
Suppose:
- We have a set of $d$-dimensional features, specified as an $n \times d$ matrix $\mathbf X = (\pmb x _ 1, \ldots, \pmb x _ n)^\top$, possibly with some graph structure
- $f$ is a function operating on these matrices
Often we want a function $\mathbf F$ that acts locally, in a node-wise manner, producing a new matrix $\mathbf H = \mathbf F(\mathbf X)$ that stacks the locally computed features.
We don’t have permutation invariance here. What do we have instead?
Permutation equivariance.
Suppose:
- We have a set of $d$-dimensional features, specified as an $n \times d$ matrix $\mathbf X = (\pmb x _ 1, \ldots, \pmb x _ n)^\top$
- $\mathbf F$ is a function operating on these matrices
- $\Sigma _ n$ is the permutation group on $n$ elements
@Define what it means for $\mathbf F$ to be permutation equivariant, and give an @example of a class of such functions.
For any permutation matrix $\mathbf P = \rho(g)$ representing a permutation $g \in \Sigma _ n$, we have
\[\mathbf F(\mathbf P \mathbf X) = \mathbf P \mathbf F(\mathbf X)\]One example is a shared node-wise linear transform
\[\mathbf F _ {\mathbf \Theta} (\mathbf X) = \mathbf X \mathbf \Theta\]where $\mathbf \Theta \in \mathbb R^{d \times d'}$ is a weight matrix.
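The equivariance of the shared linear transform follows from associativity of matrix products, $(\mathbf P \mathbf X)\mathbf \Theta = \mathbf P(\mathbf X \mathbf \Theta)$, and is easy to verify numerically. A minimal sketch with illustrative shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, d_out = 4, 3, 2                  # illustrative dimensions
X = rng.normal(size=(n, d))            # node features
Theta = rng.normal(size=(d, d_out))    # shared weight matrix
P = np.eye(n)[rng.permutation(n)]      # random permutation matrix

# Permutation equivariance: F(PX) = P F(X), with F(X) = X @ Theta
assert np.allclose((P @ X) @ Theta, P @ (X @ Theta))
```

The key point is that $\mathbf \Theta$ is shared across all nodes; a per-node weight would break equivariance.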
@State a result about the class of all linear permutation-equivariant functions, i.e. linear $\mathbf F$ satisfying
\[\mathbf F(\mathbf P \mathbf X) = \mathbf P \mathbf F(\mathbf X)\]
Give an interpretation of this result.
Any such map can be written as a linear combination of two generators: the identity $\mathbf F _ 1 \mathbf X = \mathbf X$, and the average $\mathbf F _ 2 \mathbf X = \frac 1 n \mathbf 1 \mathbf 1^\top \mathbf X$.
This says that a linear equivariant map can only process each node independently (and identically), or aggregate over all nodes at once.
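A combination of these two generators can be checked for equivariance directly, using the facts $\mathbf P \mathbf 1 = \mathbf 1$ and $\mathbf 1^\top \mathbf P = \mathbf 1^\top$. A minimal sketch, with arbitrary illustrative coefficients `a`, `b`:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 5, 3
X = rng.normal(size=(n, d))
P = np.eye(n)[rng.permutation(n)]      # random permutation matrix
a, b = 0.7, -1.3                       # arbitrary mixing coefficients
ones = np.ones((n, 1))

def F(X):
    # Linear combination of the two generators: a * identity + b * average
    return a * X + b * (ones @ (ones.T @ X)) / n

# Equivariance: F(PX) = P F(X), since P @ ones == ones
assert np.allclose(F(P @ X), P @ F(X))
```

The second term works because the all-ones vector is fixed by every permutation, so averaging commutes with row permutation.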