Computer Vision MT25, Image transformations
Flashcards
Suppose we represent an image via a 2D function $f$. @Describe the differences between:
- a pointwise transformation
- a geometric transformation
- image filtering
In all cases, we make a new image $g$ from $f$.
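- Pointwise: $g(x, y) = t(f(x,y))$, where $t$ maps intensity values to intensity values. Each output pixel depends only on the input pixel at the same location, so the range of the image changes.
- Geometric: $g(x, y) = f(T(x,y))$, where $T$ maps coordinates to coordinates. The domain of the image changes: pixel values are moved rather than altered.
- Filtering: $g(x, y) = F(N(x, y))$, where $N(x, y)$ is the set of values of $f$ in a neighbourhood of $(x, y)$ and $F$ is some function that combines them into a single value.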
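The three kinds of transformation can be illustrated on a toy image. This is a minimal sketch assuming NumPy; the particular choices of $t$ (squaring), $T$ (a horizontal flip) and $F$ (a 3×3 mean with edge padding) are illustrative assumptions, not part of the definitions above.

```python
import numpy as np

# Toy 3x3 greyscale image with intensities in [0, 1]; indexing is f[y, x].
f = np.array([[0.0, 0.5, 1.0],
              [0.2, 0.4, 0.6],
              [1.0, 0.0, 0.5]])
H, W = f.shape

# Pointwise: g(x, y) = t(f(x, y)), here with t(v) = v**2 applied to every value.
g_pointwise = f ** 2

# Geometric: g(x, y) = f(T(x, y)), here with T(x, y) = (W - 1 - x, y), a horizontal flip.
g_geometric = f[:, W - 1 - np.arange(W)]

# Filtering: g(x, y) = F(N(x, y)), here N is the 3x3 neighbourhood and F is the mean.
# Edge padding (an assumption) supplies neighbours for border pixels.
padded = np.pad(f, 1, mode="edge")
g_filtered = np.zeros_like(f)
for y in range(H):
    for x in range(W):
        g_filtered[y, x] = padded[y:y + 3, x:x + 3].mean()
```

Only the filtered image mixes values from different pixels; the pointwise and geometric results can each be computed one pixel at a time.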
Suppose we represent an image via a 2D function $f$. @Define the gamma correction filter.
Warps
@State the name of the technique for applying a transformation to an image by iterating over source pixels and drawing them at the target location, and state a problem with this.
This is called a forward warp. The problem is that there may be “holes” in the generated image: target pixels that no source pixel maps to.
@State the name of the technique for applying a transformation to an image by iterating over output pixels and computing where the pixels come from, possibly interpolating if the source location is between pixels.
Backward warp.
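The hole problem of a forward warp is easy to reproduce. The sketch below assumes NumPy and uses a hypothetical helper `forward_warp_scale` that scales an image by a factor `s`, rounding each target location to the nearest pixel; `NaN` is used here only as a marker for target pixels that were never written.

```python
import numpy as np

def forward_warp_scale(src, s):
    """Forward warp: iterate over source pixels and draw each at its scaled target location."""
    h, w = src.shape
    out_h, out_w = int(h * s), int(w * s)
    out = np.full((out_h, out_w), np.nan)  # NaN marks target pixels never written
    for y in range(h):
        for x in range(w):
            tx, ty = int(round(x * s)), int(round(y * s))
            if 0 <= tx < out_w and 0 <= ty < out_h:
                out[ty, tx] = src[y, x]
    return out

src = np.arange(9, dtype=float).reshape(3, 3)
warped = forward_warp_scale(src, 2)
holes = int(np.isnan(warped).sum())  # number of target pixels left unfilled
```

With a 3×3 source scaled by 2, only 9 of the 36 target pixels receive a value, so most of the output consists of holes.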
@Define the optical axis. What do rotation around this axis and translation along it correspond to in the image?
The axis passing through the centre of the camera and the centre of the image plane.
- Rotation around the axis: Rotation of the image around its centre.
- Translation along the axis: Scales the image around its centre.

Bite-sized
Image negation is the pointwise transformation $g(x, y) = 1 - f(x, y)$ (assuming $f$ is normalised to $[0, 1]$), producing a photographic-negative-style inversion of intensities.
Contrast adjustment is the affine pointwise transformation $g(x, y) = a f(x, y) + b$, where $a$ scales the dynamic range (contrast) and $b$ shifts the brightness.
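Both pointwise transformations are one-liners on an array. This is a small sketch assuming NumPy and an image normalised to $[0, 1]$; the function names are hypothetical, and clipping the result back to $[0, 1]$ is an added assumption rather than part of the definition.

```python
import numpy as np

def negate(f):
    """Image negation: g = 1 - f, for f normalised to [0, 1]."""
    return 1.0 - f

def adjust_contrast(f, a, b):
    """Affine pointwise transformation g = a * f + b, clipped to [0, 1] (clipping is an assumption)."""
    return np.clip(a * f + b, 0.0, 1.0)

f = np.array([[0.0, 0.25],
              [0.5, 1.0]])
neg = negate(f)
bright = adjust_contrast(f, a=1.5, b=0.1)
```

Here the brightest pixel would map to $1.5 \cdot 1 + 0.1 = 1.6$, which is why some policy for out-of-range values is needed in practice.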
@State the affine transformation matrix in homogeneous coordinates for each of the four 2D operations: translation, scaling, rotation, horizontal shearing.
For a point $(x, y, 1)^\top$:
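- Translation by $(\delta _ x, \delta _ y)$:
\[\begin{pmatrix} 1 & 0 & \delta _ x \\ 0 & 1 & \delta _ y \\ 0 & 0 & 1 \end{pmatrix}\]
- Uniform scaling by factor $s$:
\[\begin{pmatrix} s & 0 & 0 \\ 0 & s & 0 \\ 0 & 0 & 1 \end{pmatrix}\]
- Rotation by angle $\theta$ (anti-clockwise):
\[\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\]
- Horizontal shearing by factor $m$:
\[\begin{pmatrix} 1 & m & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\]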
The general affine transformation has 6 free parameters (the top two rows of a $3 \times 3$ matrix with last row $(0, 0, 1)$).
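The four operations can be written as $3 \times 3$ matrices and composed by matrix multiplication. The sketch below assumes NumPy; the function names are hypothetical, and the matrices follow the standard homogeneous-coordinate forms for a column vector $(x, y, 1)^\top$.

```python
import numpy as np

def translation(dx, dy):
    return np.array([[1.0, 0.0, dx],
                     [0.0, 1.0, dy],
                     [0.0, 0.0, 1.0]])

def scaling(s):
    return np.array([[s, 0.0, 0.0],
                     [0.0, s, 0.0],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def shear_x(m):
    return np.array([[1.0, m, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

p = np.array([1.0, 2.0, 1.0])  # the point (1, 2) in homogeneous coordinates

# Matrices compose right to left: scale first, then rotate, then translate.
composite = translation(3, -1) @ rotation(np.pi / 2) @ scaling(2)
q = composite @ p
```

Any such product keeps the last row $(0, 0, 1)$, so the composite is again affine and is fully described by its top two rows.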
@Describe what camera motions correspond to affine image transformations, and what motions go beyond affine.
Of the 6 camera-motion degrees of freedom (3 rotation + 3 translation):
- Rotation around the optical axis: maps to image rotation around the centre. Affine.
- Translation along the optical axis (zoom in/out): maps to image scaling around the centre. Affine.
- Translation perpendicular to the optical axis: maps to image translation. Affine.
What goes beyond affine is rotation around an axis not aligned with the optical axis, e.g. tilting the camera up or down. These produce perspective distortions where parallel scene lines converge — handled by a homography rather than an affine transformation.
A backward warp uses the inverse transformation $T^{-1}$, where $T$ here maps source coordinates to target coordinates: for every target pixel $(x, y)$, look up the corresponding source coordinate. If the result lies between integer source pixels, interpolate (bilinear, bicubic, etc.). This avoids the holes and pile-ups (several source pixels landing on the same target pixel) of forward warps.
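A backward warp with bilinear interpolation might be sketched as follows. This assumes NumPy, takes `T` to be a $3 \times 3$ homogeneous matrix mapping source coordinates to target coordinates, and clamps out-of-range source coordinates to the image border; the clamping policy and the function names are assumptions made for illustration.

```python
import numpy as np

def bilinear(src, x, y):
    """Sample src at real-valued (x, y); coordinates are clamped to the image (an assumption)."""
    h, w = src.shape
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0  # fractional offsets within the pixel cell
    top = (1 - ax) * src[y0, x0] + ax * src[y0, x1]
    bottom = (1 - ax) * src[y1, x0] + ax * src[y1, x1]
    return (1 - ay) * top + ay * bottom

def backward_warp(src, T, out_shape):
    """For each target pixel, apply T^{-1} to find the source coordinate, then interpolate."""
    T_inv = np.linalg.inv(T)
    out = np.zeros(out_shape)
    for ty in range(out_shape[0]):
        for tx in range(out_shape[1]):
            sx, sy, sw = T_inv @ np.array([tx, ty, 1.0])
            out[ty, tx] = bilinear(src, sx / sw, sy / sw)
    return out

src = np.arange(9, dtype=float).reshape(3, 3)
scale2 = np.diag([2.0, 2.0, 1.0])  # uniform scaling by 2, source -> target
warped = backward_warp(src, scale2, (6, 6))
```

Because the loop runs over target pixels, every output pixel is assigned exactly once, so there are neither holes nor pile-ups.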
@State the three properties preserved by every affine transformation.
- Collinearity: any three points on a line stay on a line after the transformation.
- Parallelism: parallel lines stay parallel.
- Convexity: convex sets stay convex.
In contrast, a general homography preserves only collinearity (and the more general projective invariant: the cross-ratio of four collinear points). Parallelism and convexity are not preserved under a homography in general.
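The difference in what is preserved can be checked numerically. The sketch below assumes NumPy; the matrices `A` and `H` are arbitrary illustrative choices, and parallelism is tested with the 2D cross product of the two line directions, which is zero exactly when the lines are parallel.

```python
import numpy as np

def apply(M, pts):
    """Apply a 3x3 homogeneous matrix M to an (n, 2) array of points, returning (n, 2)."""
    hom = np.hstack([pts, np.ones((len(pts), 1))]) @ M.T
    return hom[:, :2] / hom[:, 2:3]

def direction_cross(p0, p1, q0, q1):
    """2D cross product of the directions of lines p0-p1 and q0-q1 (zero iff parallel)."""
    d1, d2 = p1 - p0, q1 - q0
    return d1[0] * d2[1] - d1[1] * d2[0]

# Two parallel horizontal segments: (0,0)-(1,0) and (0,1)-(1,1).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

A = np.array([[2.0, 0.5, 3.0],
              [0.3, 1.5, -1.0],
              [0.0, 0.0, 1.0]])  # an arbitrary affine map (last row is (0, 0, 1))
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.2, 1.0]])  # an arbitrary homography with a non-trivial last row

affine_cross = direction_cross(*apply(A, pts))      # stays zero: still parallel
homography_cross = direction_cross(*apply(H, pts))  # non-zero: no longer parallel
```

The non-trivial last row of `H` makes the homogeneous divisor depend on position, which is what lets parallel lines converge.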
A 2D rotation matrix by angle $\theta$ (anti-clockwise) is
\[R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\]
and satisfies $R(\theta)^\top R(\theta) = I$ (orthogonal) and $\det R(\theta) = 1$ (orientation-preserving). These are the defining properties of $\mathrm{SO}(2)$.
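Both properties can be confirmed numerically for any particular angle. This is a quick sanity check assuming NumPy, with an arbitrarily chosen $\theta$; it also checks that composing two rotations adds their angles, which follows from the angle-addition formulas.

```python
import numpy as np

def R(theta):
    """2D anti-clockwise rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s, c]])

theta = 0.7  # arbitrary angle in radians
is_orthogonal = np.allclose(R(theta).T @ R(theta), np.eye(2))
determinant = np.linalg.det(R(theta))
composes = np.allclose(R(0.3) @ R(0.4), R(0.7))
```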