Image Pose, Attitude, and Rotation Angles#
In photogrammetry and computer vision, an image’s pose (also called attitude or orientation) describes how the camera coordinate system is rotated relative to a world / mapping coordinate system.
In weitsicht an image is geo-referenced when it has:
a camera model (intrinsics),
a position (camera center),
an orientation (weitsicht.Rotation),
and a CRS definition for that pose (the perspective image CRS).
Rotation matrix (what it means)#
weitsicht.Rotation stores a 3×3 rotation matrix \(R\).
By convention in weitsicht, the matrix maps camera CRS → world CRS (the perspective image CRS):
\[\mathbf{v}_{world} = R \, \mathbf{v}_{cam}\]
To transform a 3D point \(\mathbf{X}\) in world coordinates into camera coordinates you typically use:
\[\mathbf{X}_{cam} = R^T \, (\mathbf{X}_{world} - \mathbf{P}_0)\]
where \(\mathbf{P}_0\) is the projection center (camera position) in world CRS.
This convention matches how weitsicht.ImagePerspective.project() and
weitsicht.ImagePerspective.pixel_to_ray_vector() use the rotation.
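The two directions of this transform can be sketched in plain Python. This is a minimal illustration under stated assumptions, not weitsicht code: the helper names `world_to_camera` and `camera_to_world` and the example pose are hypothetical, and matrices are plain nested lists.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(m):
    """Transpose a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]

def world_to_camera(r_cam_to_world, p0, x_world):
    """X_cam = R^T (X_world - P0), with R stored as camera-to-world."""
    diff = [x_world[i] - p0[i] for i in range(3)]
    return mat_vec(transpose(r_cam_to_world), diff)

def camera_to_world(r_cam_to_world, p0, x_cam):
    """X_world = P0 + R X_cam."""
    rotated = mat_vec(r_cam_to_world, x_cam)
    return [p0[i] + rotated[i] for i in range(3)]

# Hypothetical pose: camera rotated 90 deg about the world Z axis,
# projection center 100 units above the ground plane.
k = math.radians(90.0)
R = [[math.cos(k), -math.sin(k), 0.0],
     [math.sin(k),  math.cos(k), 0.0],
     [0.0, 0.0, 1.0]]
P0 = [10.0, 20.0, 100.0]

x_world = [15.0, 20.0, 0.0]
x_cam = world_to_camera(R, P0, x_world)   # approximately [0, -5, -100]
x_back = camera_to_world(R, P0, x_cam)    # recovers x_world
```

The negative Z component of `x_cam` means the point lies in front of the camera, because the optical axis points along \(-Z_{CAM}\).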
Note
In the weitsicht camera CRS the optical axis points along \(-Z_{CAM}\). With the camera-to-world rotation
\(R\), the viewing direction in world CRS is therefore -R[:, 2].
Note
Some software packages store a world-to-camera rotation and compute camera coordinates like
\(\mathbf{X}_{cam} = R \, (\mathbf{X}_{world} - \mathbf{P}_0)\).
In weitsicht the stored matrix maps camera-to-world (\(\mathbf{v}_{world} = R\,\mathbf{v}_{cam}\)),
because then the rotation matrix directly shows how the camera axes are aligned in the world frame (its columns are
the camera basis vectors expressed in world coordinates). This is also commonly how INS/IMU systems report attitude.
This is mainly a personal preference that makes debugging easier; the two conventions are equivalent and differ only by a transpose of the rotation matrix (\(R_{world \to cam} = R^T\)).
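To make the column interpretation concrete, here is a small sketch with a hand-written matrix. The variable names (`R_c2w`, `R_w2c`) and the example pose are assumptions chosen for illustration; no weitsicht API is used.

```python
def transpose(m):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[m[j][i] for j in range(3)] for i in range(3)]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def column(m, j):
    """Return column j of a 3x3 matrix."""
    return [m[i][j] for i in range(3)]

# Hypothetical camera-to-world rotation: nadir view, rotated 90 deg about world Z.
R_c2w = [[0.0, -1.0, 0.0],
         [1.0,  0.0, 0.0],
         [0.0,  0.0, 1.0]]

# Columns are the camera axes expressed in world coordinates.
cam_x_in_world = column(R_c2w, 0)                 # camera X points along world +Y
cam_y_in_world = column(R_c2w, 1)                 # camera Y points along world -X
view_dir_world = [-c for c in column(R_c2w, 2)]   # optical axis (-Z_CAM): straight down

# A package that stores world-to-camera rotations would hold the transpose.
R_w2c = transpose(R_c2w)

P0 = [0.0, 0.0, 50.0]
X_world = [3.0, 4.0, 0.0]
diff = [X_world[i] - P0[i] for i in range(3)]

x_cam_a = mat_vec(transpose(R_c2w), diff)  # camera-to-world convention: R^T (X - P0)
x_cam_b = mat_vec(R_w2c, diff)             # world-to-camera convention: R (X - P0)
```

Both expressions give the same camera coordinates, which is why mixing up the conventions usually shows up as a transposed (inverted) rotation rather than as an obviously wrong number.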
Spatial similarity transformation (notation mapping)#
Some photogrammetry / geodesy texts describe the relation between a local/sensor frame (lowercase) and a superior/world frame (uppercase) as a 3D spatial similarity transformation:
\[\mathbf{P} = \mathbf{P}_0 + s \, R \, (\mathbf{p} - \mathbf{p}_0)\]
where:
\(\mathbf{p}\) is a point in the local/sensor coordinate system,
\(\mathbf{p}_0\) is the point of rotation (origin) in the local/sensor coordinate system,
\(s\) is a scale factor,
\(\mathbf{P}\) is the same point in the superior/world coordinate system,
\(\mathbf{P}_0\) is the point of rotation (origin) in the superior/world coordinate system,
\(R\) is the 3×3 rotation matrix.
For camera poses in weitsicht this is a rigid transform, i.e. \(s = 1\). The local system is the
camera CRS whose origin is the camera projection center, so \(\mathbf{p}_0 = \mathbf{0}\) and
\(\mathbf{P}_0\) is the camera center in world/perspective CRS. This yields the standard 3D transform used in
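weitsicht:
\[\mathbf{X}_{world} = \mathbf{P}_0 + R \, \mathbf{X}_{cam}\]
and the inverse:
\[\mathbf{X}_{cam} = R^T \, (\mathbf{X}_{world} - \mathbf{P}_0)\]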
For how 2D image observations (\(x, y\)) are linked to this pose via interior orientation (\(x_0, y_0, c\) and distortion) and the classic collinearity equations, see Perspective Image.
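As a short worked step, the inverse can be derived from the general similarity transformation. This assumes the transformation is written in the common form \(\mathbf{P} = \mathbf{P}_0 + s \, R \, (\mathbf{p} - \mathbf{p}_0)\) with the symbols defined above, and uses only the fact that a rotation matrix satisfies \(R^{-1} = R^T\):

\[
\begin{aligned}
\mathbf{P} - \mathbf{P}_0 &= s \, R \, (\mathbf{p} - \mathbf{p}_0) \\
R^T (\mathbf{P} - \mathbf{P}_0) &= s \, (\mathbf{p} - \mathbf{p}_0) \\
\mathbf{p} &= \mathbf{p}_0 + \tfrac{1}{s} \, R^T (\mathbf{P} - \mathbf{P}_0)
\end{aligned}
\]

Setting \(s = 1\) and \(\mathbf{p}_0 = \mathbf{0}\) reduces the last line to \(\mathbf{X}_{cam} = R^T (\mathbf{X}_{world} - \mathbf{P}_0)\).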
Two common photogrammetry angle notations#
There are many valid angle conventions in the literature. The same symbol names may mean different axis orders or different sign conventions depending on the software stack. Always validate your convention with a known control point.
OPK (omega, phi, kappa) - typical for aerial / near-nadir imagery#
OPK (\(\omega, \varphi, \kappa\)) is the classic photogrammetry notation for image exterior orientation angles. It is especially common for aerial imagery where the camera looks close to nadir:
\(\omega\) (omega): small tilt/roll component,
\(\varphi\) (phi): small tilt/pitch component,
\(\kappa\) (kappa): rotation around the camera Z axis (often related to heading / image rotation).
Practical intuition:
When images are close to vertical, \(\omega\) and \(\varphi\) are usually small.
\(\kappa\) often dominates and is close to the map-direction / heading rotation.
Note
OPK is most intuitive for near-nadir imagery. For near-horizontal views, AZK/APK (or quaternions / rotation matrices) are often the better choice.
In weitsicht you can build this with:
from weitsicht import Rotation
rot = Rotation.from_opk_degree(omega=1.2, phi=-0.5, kappa=42.0)
OPK rotation matrix definition#
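In weitsicht the OPK angles define the camera-to-world rotation matrix as:
\[R = R_\omega \, R_\varphi \, R_\kappa\]
with:
\[
R_\omega = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & -\sin\omega \\ 0 & \sin\omega & \cos\omega \end{bmatrix}, \quad
R_\varphi = \begin{bmatrix} \cos\varphi & 0 & \sin\varphi \\ 0 & 1 & 0 \\ -\sin\varphi & 0 & \cos\varphi \end{bmatrix}, \quad
R_\kappa = \begin{bmatrix} \cos\kappa & -\sin\kappa & 0 \\ \sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
which evaluates to:
\[
R = \begin{bmatrix}
\cos\varphi\cos\kappa & -\cos\varphi\sin\kappa & \sin\varphi \\
\cos\omega\sin\kappa + \sin\omega\sin\varphi\cos\kappa & \cos\omega\cos\kappa - \sin\omega\sin\varphi\sin\kappa & -\sin\omega\cos\varphi \\
\sin\omega\sin\kappa - \cos\omega\sin\varphi\cos\kappa & \sin\omega\cos\kappa + \cos\omega\sin\varphi\sin\kappa & \cos\omega\cos\varphi
\end{bmatrix}
\]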
This is the same OPK definition used by weitsicht.Rotation.
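For validation against other tools, an OPK matrix can be composed by hand. The sketch below assumes the common photogrammetric composition \(R = R_x(\omega)\,R_y(\varphi)\,R_z(\kappa)\); that axis order and the helper name `opk_to_matrix` are assumptions of this example, so compare the result with weitsicht.Rotation before relying on it.

```python
import math

def rot_x(a):
    """Elementary rotation about the X axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    """Elementary rotation about the Y axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    """Elementary rotation about the Z axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def opk_to_matrix(omega_deg, phi_deg, kappa_deg):
    """Assumed composition: R = R_x(omega) R_y(phi) R_z(kappa), camera-to-world."""
    o, p, k = (math.radians(v) for v in (omega_deg, phi_deg, kappa_deg))
    return mat_mul(mat_mul(rot_x(o), rot_y(p)), rot_z(k))

R = opk_to_matrix(1.2, -0.5, 42.0)

# Viewing direction is the negated third column (optical axis = -Z_CAM).
view_dir = [-R[i][2] for i in range(3)]

# Angle between the viewing direction and straight down (0, 0, -1).
off_nadir_deg = math.degrees(math.acos(-view_dir[2]))
```

With small \(\omega\) and \(\varphi\) the off-nadir angle stays small (about 1.3 degrees here), while \(\kappa\) only spins the image around the viewing direction.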
AZK / APK (alpha, zeta, kappa) - typical for terrestrial / horizontal imagery#
AZK (also written APK, using \(\alpha, \zeta, \kappa\)) is commonly used to describe a camera direction
and roll. In weitsicht the angles returned by weitsicht.Rotation.apk are defined on the camera’s +Z axis:
\(\alpha\) (alpha): azimuth of the camera +Z axis, in the XY plane,
\(\zeta\) (zeta): off-nadir angle (0 deg = nadir, 90 deg = horizontal),
\(\kappa\) (kappa): rotation around the camera +Z axis.
Note
In the weitsicht camera CRS the optical axis is \(-Z_{CAM}\). If you prefer AZK/APK angles defined on the
viewing direction (optical axis), you can convert approximately with:
\(\alpha_{view} = \alpha + 180^\circ\) (wrapped to your preferred range)
\(\kappa_{view} = -\kappa\)
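The conversion above can be written as a small helper. The function name `apk_to_view_angles` and the choice to wrap the azimuth into [0, 360) are assumptions of this sketch.

```python
def apk_to_view_angles(alpha_deg, kappa_deg):
    """Convert +Z-axis based alpha/kappa to viewing-direction based angles.

    alpha_view = alpha + 180 deg, wrapped into [0, 360)
    kappa_view = -kappa
    """
    alpha_view = (alpha_deg + 180.0) % 360.0
    kappa_view = -kappa_deg
    return alpha_view, kappa_view

alpha_view, kappa_view = apk_to_view_angles(120.0, 30.0)
```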
This notation is often convenient for terrestrial / oblique imagery where the camera looks close to horizontal (\(\zeta \approx 90^\circ\)) and you want an immediate interpretation of:
where the camera is looking (azimuth + up/down tilt),
and how the image is rolled.
Note
For \(\zeta \approx 0^\circ\) (near-nadir) or \(\zeta \approx 180^\circ\) the azimuth/roll split is not unique (gimbal lock). In that case OPK, quaternions, or a rotation matrix are usually the better interchange format.
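The non-uniqueness is easy to demonstrate numerically. The sketch assumes a Z-Y-Z composition \(R = R_z(\alpha)\,R_y(\zeta)\,R_z(\kappa)\), which is an assumption of this example and not taken from the weitsicht source; at \(\zeta = 0\) only the sum \(\alpha + \kappa\) affects the matrix.

```python
import math

def rot_y(a):
    """Elementary rotation about the Y axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    """Elementary rotation about the Z axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apk_to_matrix(alpha_deg, zeta_deg, kappa_deg):
    """Assumed composition: R = R_z(alpha) R_y(zeta) R_z(kappa)."""
    a, z, k = (math.radians(v) for v in (alpha_deg, zeta_deg, kappa_deg))
    return mat_mul(mat_mul(rot_z(a), rot_y(z)), rot_z(k))

def max_abs_diff(a, b):
    """Largest absolute element-wise difference of two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Same alpha + kappa (80 deg), different split between the two angles.
diff_nadir = max_abs_diff(apk_to_matrix(30.0, 0.0, 50.0),
                          apk_to_matrix(70.0, 0.0, 10.0))

# Away from the singularity the same two angle sets give different rotations.
diff_horizontal = max_abs_diff(apk_to_matrix(30.0, 90.0, 50.0),
                               apk_to_matrix(70.0, 90.0, 10.0))
```

`diff_nadir` is zero up to floating-point rounding, so angles recovered from such a matrix can differ between tools even though the rotation itself is identical.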
In weitsicht you can build this with:
from weitsicht import Rotation
rot = Rotation.from_apk_degree(alpha=120.0, zeta=90.0, kappa=0.0)
AZK / APK rotation matrix definition#
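In weitsicht the AZK/APK angles define the camera-to-world rotation matrix as:
\[R = R_\alpha \, R_\zeta \, R_\kappa\]
with:
\[
R_\alpha = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
R_\zeta = \begin{bmatrix} \cos\zeta & 0 & \sin\zeta \\ 0 & 1 & 0 \\ -\sin\zeta & 0 & \cos\zeta \end{bmatrix}, \quad
R_\kappa = \begin{bmatrix} \cos\kappa & -\sin\kappa & 0 \\ \sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
which evaluates to:
\[
R = \begin{bmatrix}
\cos\alpha\cos\zeta\cos\kappa - \sin\alpha\sin\kappa & -\cos\alpha\cos\zeta\sin\kappa - \sin\alpha\cos\kappa & \cos\alpha\sin\zeta \\
\sin\alpha\cos\zeta\cos\kappa + \cos\alpha\sin\kappa & -\sin\alpha\cos\zeta\sin\kappa + \cos\alpha\cos\kappa & \sin\alpha\sin\zeta \\
-\sin\zeta\cos\kappa & \sin\zeta\sin\kappa & \cos\zeta
\end{bmatrix}
\]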
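As a cross-check of the angle interpretation, the sketch below builds an AZK/APK matrix by hand and reads off the viewing direction. It assumes the Z-Y-Z composition \(R = R_z(\alpha)\,R_y(\zeta)\,R_z(\kappa)\) with the azimuth measured from the world X axis towards Y; both are assumptions of this example, and `apk_to_matrix` is a hypothetical helper, not a weitsicht function.

```python
import math

def rot_y(a):
    """Elementary rotation about the Y axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    """Elementary rotation about the Z axis (angle in radians)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apk_to_matrix(alpha_deg, zeta_deg, kappa_deg):
    """Assumed composition: R = R_z(alpha) R_y(zeta) R_z(kappa), camera-to-world."""
    a, z, k = (math.radians(v) for v in (alpha_deg, zeta_deg, kappa_deg))
    return mat_mul(mat_mul(rot_z(a), rot_y(z)), rot_z(k))

R = apk_to_matrix(120.0, 90.0, 0.0)

# Camera +Z axis in world coordinates (third column of R).
plus_z = [R[i][2] for i in range(3)]

# The optical axis is -Z_CAM, so the viewing direction is the negated column.
view_dir = [-c for c in plus_z]

# Azimuth of the viewing direction in the XY plane, wrapped into [0, 360).
view_azimuth_deg = math.degrees(math.atan2(view_dir[1], view_dir[0])) % 360.0
```

For \(\zeta = 90^\circ\) the viewing direction is horizontal, and its azimuth comes out as \(\alpha + 180^\circ = 300^\circ\) under these assumptions.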
Other common orientation representations#
Besides OPK and AZK/APK you will often encounter:
Yaw/Pitch/Roll (also called heading/pitch/roll or roll/pitch/yaw): widely used in navigation and IMU/GNSS. (Be careful: different communities use different axis orders and sign conventions.)
Pan/Tilt/Roll: common for gimbals and terrestrial camera rigs (conceptually similar to yaw/pitch/roll).
Quaternions: compact, numerically stable, no gimbal lock; very common in SfM/SLAM and robotics outputs.
Axis-angle / rotation vector (Rodrigues): used in many vision libraries (e.g. OpenCV) and bundle adjustment outputs.
Rotation matrix directly: the most explicit form, often used in academic outputs and for validation/debugging.
If you frequently exchange data with other tools, it can be safer to store the pose as a rotation matrix or quaternion and only convert to angles for display/interaction.
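A round trip between a rotation matrix and a unit quaternion can be sketched as follows. This is a generic textbook conversion (quaternion ordered as w, x, y, z), not the weitsicht implementation; other tools may order the components differently or store the inverse rotation, so the function names and the ordering here are assumptions.

```python
import math

def matrix_to_quaternion(r):
    """Convert a proper 3x3 rotation matrix to a unit quaternion (w, x, y, z).

    Branches on the largest diagonal term to avoid dividing by a value near zero.
    """
    trace = r[0][0] + r[1][1] + r[2][2]
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)
        return (0.25 * s,
                (r[2][1] - r[1][2]) / s,
                (r[0][2] - r[2][0]) / s,
                (r[1][0] - r[0][1]) / s)
    if r[0][0] > r[1][1] and r[0][0] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[0][0] - r[1][1] - r[2][2])
        return ((r[2][1] - r[1][2]) / s,
                0.25 * s,
                (r[0][1] + r[1][0]) / s,
                (r[0][2] + r[2][0]) / s)
    if r[1][1] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[1][1] - r[0][0] - r[2][2])
        return ((r[0][2] - r[2][0]) / s,
                (r[0][1] + r[1][0]) / s,
                0.25 * s,
                (r[1][2] + r[2][1]) / s)
    s = 2.0 * math.sqrt(1.0 + r[2][2] - r[0][0] - r[1][1])
    return ((r[1][0] - r[0][1]) / s,
            (r[0][2] + r[2][0]) / s,
            (r[1][2] + r[2][1]) / s,
            0.25 * s)

def quaternion_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) back to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

# 90 deg rotation about Z (trace > 0 branch).
R_z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
q_z90 = matrix_to_quaternion(R_z90)

# 180 deg rotation about X (trace < 0, exercises a diagonal branch).
R_x180 = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
q_x180 = matrix_to_quaternion(R_x180)
```

Note that \(q\) and \(-q\) describe the same rotation, so compare poses from different tools via the reconstructed matrix rather than component by component.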