HMOO Reading Notes: Camera Models in Computer Vision

These are my notes on Chapter 6, Camera Models, of Multiple View Geometry.

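A camera model (camera) is a mapping from points in the 3D world to points on the 2D image plane. All the models covered in the book are central projection camera models, and the properties of the projection matrix let us specialize the most general model, the general projective camera, into more restricted ones.

Cameras also fall into two broad classes depending on whether the camera centre is at a finite point or at infinity: finite cameras and cameras at infinity. The affine cameras in the second class are the parallel projection models.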

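General form of the camera model

The general form is \[ \mathbf{x}=KR[I | - \mathbf{\bar{C}}]\mathbf{X} \]where K holds the intrinsics, and R and \(\mathbf{\bar{C}}\) (the camera orientation and the inhomogeneous coordinates of the camera centre) are the extrinsics. The camera matrix P is \(KR[I | - \mathbf{\bar{C}}]\).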
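As a minimal sketch of the general form, the snippet below assembles P from K, R and the camera centre and projects one point. The helper names `camera_matrix` and `project` and all numeric values are illustrative assumptions, not something taken from the book.

```python
import numpy as np

def camera_matrix(K, R, C):
    """Assemble P = K R [I | -C] from intrinsics K, rotation R and camera centre C."""
    C = np.asarray(C, dtype=float).reshape(3, 1)
    return K @ R @ np.hstack([np.eye(3), -C])

def project(P, X):
    """Map an inhomogeneous 3D point X to inhomogeneous image coordinates."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)  # homogeneous image point
    return x[:2] / x[2]

# Illustrative values only, not taken from the book.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                    # camera axes aligned with the world axes
C = np.array([0.0, 0.0, -5.0])   # camera centre 5 units behind the world origin

P = camera_matrix(K, R, C)
pixel = project(P, [1.0, 0.5, 0.0])
```

With these values the point lies 5 units in front of the camera, so `pixel` comes out as (480, 320), and `P` applied to the homogeneous camera centre gives the zero vector.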

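Pinhole model

K has the form \[ K = \begin{bmatrix} f & & c_x\\ & f & c_y\\ & & 1 \end{bmatrix} \]where f is the focal length and \((c_x, c_y)\) is the principal point. This camera has nine degrees of freedom: three for K, and three each for rotation and translation.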

CCD camera model

K has the form \[ K = \begin{bmatrix} f_x & & c_x\\ & f_y & c_y\\ & & 1 \end{bmatrix} \]This camera has ten degrees of freedom: four for K, and three each for rotation and translation.

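Finite projective camera model

K has the form \[ K = \begin{bmatrix} f_x & s & c_x\\ & f_y & c_y\\ & & 1 \end{bmatrix} \]This camera has eleven degrees of freedom: five for K, and three each for rotation and translation. Note that the left 3×3 submatrix of P (that is, KR) is non-singular. Conversely, any camera matrix whose left 3×3 submatrix is non-singular is a finite projective camera, and K and R can be recovered from that submatrix by RQ decomposition.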
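The RQ step can be sketched with NumPy alone by running a QR decomposition on a flipped, transposed copy of the left 3×3 block. The function names `rq3` and `decompose_finite_camera` are hypothetical, and fixing the signs so that K has a positive diagonal is an assumption about which of the equivalent factorizations is wanted.

```python
import numpy as np

def rq3(M):
    """Factor a 3x3 matrix as M = K @ R with K upper triangular and R orthogonal."""
    J = np.fliplr(np.eye(3))            # exchange matrix (reverses row/column order)
    Q_, R_ = np.linalg.qr((J @ M).T)    # QR of the flipped, transposed matrix
    return J @ R_.T @ J, J @ Q_.T

def decompose_finite_camera(P):
    """Recover K (positive diagonal, K[2, 2] = 1), rotation R and centre C from P."""
    M = P[:, :3]
    if np.linalg.det(M) < 0:            # P is homogeneous, so its overall sign is free
        M = -M
    K, R = rq3(M)
    D = np.diag(np.sign(np.diag(K)))    # flip signs so that diag(K) > 0
    K, R = K @ D, D @ R
    C = -np.linalg.solve(P[:, :3], P[:, 3])   # C = -M^{-1} p_4
    return K / K[2, 2], R, C

# Synthetic check with made-up values.
K_true = np.array([[800.0, 2.0, 320.0],
                   [0.0, 780.0, 240.0],
                   [0.0, 0.0, 1.0]])
a = 0.3
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
C_true = np.array([1.0, -2.0, 3.0])
P = -2.5 * K_true @ R_true @ np.hstack([np.eye(3), -C_true.reshape(3, 1)])

K, R, C = decompose_finite_camera(P)
```

The arbitrary factor of -2.5 on P is there to show that the recovered K, R and C do not depend on the homogeneous scale. If SciPy is available, `scipy.linalg.rq` could replace `rq3`; the sign fix is still needed afterwards.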

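General projective camera model

The general projective camera is the most general camera model. Its only constraint is that P has rank three, and it has eleven degrees of freedom. Any general projective camera matrix P maps a point X in the 3D world to a point x on the image plane. Some properties of P are listed below.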

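Let M be the left 3×3 submatrix of P. If M is non-singular the camera is a finite camera; otherwise it is a camera at infinity.
Camera centre: defined as the right null space of P, i.e. the vector \(\mathbf{C}\) with P\(\mathbf{C} = 0\).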

Finite camera: M is non-singular and \(\mathbf{C}=\begin{bmatrix} -M^{-1}\mathbf{p}_4\\ 1 \end{bmatrix}\), where \(\mathbf{p}_4\) is the fourth column of P.
Camera at infinity: M is singular and \(\mathbf{C}=\begin{bmatrix}\mathbf{d}\\ 0 \end{bmatrix}\), where \(\mathbf{d}\) is the null vector of M.

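Column vectors: the first three columns are the images of the vanishing points of the world X, Y and Z axis directions, and the fourth column is the image of the world origin. This follows from applying P to (1, 0, 0, 0), the direction of the X axis, and likewise to (0, 1, 0, 0), (0, 0, 1, 0) and (0, 0, 0, 1).
Row vectors: the third row is the principal plane, the set of points that are imaged at infinity in the image, i.e. P\(\mathbf{X}=(x, y, 0)^T\). This plane also passes through the camera centre. The other two rows are the axis planes: the first row is the plane through the camera centre and the image y-axis, and the second row is the plane through the camera centre and the image x-axis.
Principal point: the intersection of the principal axis with the image plane. With M the left 3×3 submatrix of P, the principal point is \(\mathbf{x}_0 = M\mathbf{m}^3\), where \(\mathbf{m}^{3T}\) is the third row of M.
Principal ray: \(\mathbf{v}=\det(M)\mathbf{m}^3\) is the direction vector of the principal axis. It always points toward the front of the camera, whatever the sign of the scale of P.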
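Depth of a point with respect to the camera: \[ P(X, Y, Z, T)^T = w(x, y, 1)^T \\ \operatorname{depth}(\mathbf{X};P)=\frac{\operatorname{sign}(\det M)\,w}{T \left \| \mathbf{m}^3 \right \|} \]The depth is positive for points in front of the camera.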
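Computing the camera centre from the camera matrix: solve \(P \mathbf{C} = 0\) with the SVD.
Decomposing the camera matrix P into K and R: the known structure of K for each camera model (for example, positive diagonal entries) picks out the correct solution.
A non-zero skew most likely means that the image is a picture of a picture.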
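The properties in the list above can be checked numerically. The sketch below is one possible NumPy reading of them, with hypothetical function names and a made-up camera. The `depth` function uses the convention in which points in front of the camera have positive depth.

```python
import numpy as np

def camera_centre(P):
    """Homogeneous camera centre: the right null vector of P, taken from the SVD."""
    _, _, Vt = np.linalg.svd(P)
    return Vt[-1]                       # singular vector of the zero singular value

def principal_point(P):
    """x_0 = M m^3, returned in inhomogeneous image coordinates."""
    M = P[:, :3]
    x0 = M @ M[2]
    return x0[:2] / x0[2]

def principal_axis(P):
    """v = det(M) m^3, which points to the front of the camera."""
    M = P[:, :3]
    return np.linalg.det(M) * M[2]

def depth(P, X):
    """Depth of the homogeneous point X = (X, Y, Z, T) measured from the principal plane."""
    M = P[:, :3]
    w = P[2] @ X
    return np.sign(np.linalg.det(M)) * w / (X[3] * np.linalg.norm(M[2]))

# Made-up finite camera at (0, 0, -5) looking along +Z, with an arbitrary negative scale.
P = -2.0 * np.array([[800.0, 0.0, 320.0, 1600.0],
                     [0.0, 800.0, 240.0, 1200.0],
                     [0.0, 0.0, 1.0, 5.0]])

C = camera_centre(P)
C = C / C[3]                            # finite camera, so the last coordinate is non-zero
x0 = principal_point(P)
v = principal_axis(P)
d = depth(P, np.array([1.0, 0.5, 0.0, 1.0]))
```

Even though P carries a factor of -2, the centre comes out as (0, 0, -5, 1), the principal point as (320, 240), the principal axis along +Z, and the depth of the test point as 5.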

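Cameras at infinity

When the camera centre lies at infinity, the model is a camera at infinity, which is equivalent to the matrix M being singular. An affine camera is one whose third row can be written as (0, 0, 0, 1). Such a camera maps points at infinity to points at infinity.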
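A small numerical illustration of that last statement: the helper `is_affine` and the matrix entries below are assumptions made up for the example, and the absolute tolerance is a simplification that ignores the overall scale of P.

```python
import numpy as np

def is_affine(P, tol=1e-12):
    """True if the third row of P has the form (0, 0, 0, s) with s != 0."""
    return bool(np.all(np.abs(P[2, :3]) <= tol) and abs(P[2, 3]) > tol)

# Made-up affine camera: only the third row is constrained.
P_affine = np.array([[2.0, 0.1, 0.3, 4.0],
                     [0.0, 1.5, -0.2, 1.0],
                     [0.0, 0.0, 0.0, 1.0]])

direction = np.array([1.0, 2.0, 3.0, 0.0])   # a point at infinity (T = 0)
image = P_affine @ direction                  # homogeneous image point
```

The third coordinate of `image` is zero, so the image of a point at infinity is itself a point at infinity, and the left 3×3 block of `P_affine` is singular because its last row is zero.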

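Dolly zoom

The dolly zoom is a filming technique in which the camera moves away from the subject while the lens zooms in (or the reverse), so that the subject keeps its size while the background changes. As the camera tracks back to infinity, the camera matrix tends to \[ P_{\infty}=\lim_{t\rightarrow \infty}P_t=K\begin{bmatrix} \mathbf{r}^{1T} & -\mathbf{r}^{1T}\mathbf{\bar{C}} \\ \mathbf{r}^{2T} & -\mathbf{r}^{2T}\mathbf{\bar{C}}\\ \mathbf{0}^{T} & -\mathbf{r}^{3T}\mathbf{\bar{C}} \end{bmatrix} \]which is an affine camera. The entry \(d_0 = -\mathbf{r}^{3T}\mathbf{\bar{C}}\) is the depth of the world origin along the principal ray.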
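The limit can be observed numerically. The sketch below assumes the camera tracks back by t along its principal ray while the focal length is magnified by \(d_t/d_0\), with \(d_t = d_0 + t\); the function names and all values are illustrative.

```python
import numpy as np

def dolly_zoom_camera(K, R, C, t):
    """Camera after tracking back by t along the principal ray and zooming by d_t / d_0."""
    r3 = R[2]
    d0 = -r3 @ C                         # depth of the world origin
    dt = d0 + t
    C_t = C - t * r3                     # centre moved backwards along the principal ray
    zoom = np.diag([dt / d0, dt / d0, 1.0])
    P_t = K @ zoom @ R @ np.hstack([np.eye(3), -C_t.reshape(3, 1)])
    return P_t * (d0 / dt)               # rescaling is allowed because P is homogeneous

def affine_limit(K, R, C):
    """P_infinity: the third row of R [I | -C] replaced by (0, 0, 0, d_0)."""
    Rt = R @ np.hstack([np.eye(3), -C.reshape(3, 1)])
    Rt[2] = [0.0, 0.0, 0.0, -R[2] @ C]
    return K @ Rt

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
C = np.array([0.0, 0.0, -5.0])

P_inf = affine_limit(K, R, C)
errors = [np.linalg.norm(dolly_zoom_camera(K, R, C, t) - P_inf)
          for t in (1.0, 1e2, 1e4, 1e6)]
```

The entries of `errors` shrink as t grows. The fourth column of the rescaled \(P_t\) stays the same for every t, which reflects the fact that the world origin keeps its image position during the move.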

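Error of approximating a projective camera with an affine camera

When a 3D point X is imaged by a projective camera and by its affine approximation, the difference between the two image points is \[ \mathbf{x}_{affine}=P_{\infty}\mathbf{X} \\ \mathbf{x}_{proj}=P_{0}\mathbf{X} \\ \mathbf{x}_0=[c_x, c_y]^T \\ \mathbf{x}_{affine} - \mathbf{x}_{proj}= \frac{\Delta }{d_0}(\mathbf{x}_{proj} - \mathbf{x}_0) \]where \(\Delta\) is the depth of X measured from the plane at depth \(d_0\) through the world origin. The affine approximation is therefore more accurate when the following two conditions hold.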

The depth relief \(\Delta\) is small compared with the average depth \(d_0\).
The point is close to the principal ray, for example with a small field of view.
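The error relation can be verified on a synthetic example. In this sketch \(d_0\) is taken as \(-\mathbf{r}^{3T}\mathbf{\bar{C}}\) and \(\Delta\) as \(\mathbf{r}^{3T}\) applied to the inhomogeneous point, which is my reading of the symbols; the camera and the point are made up.

```python
import numpy as np

def to_pixels(x):
    """Convert a homogeneous image point to inhomogeneous coordinates."""
    return x[:2] / x[2]

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = 0.2
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
C = np.array([0.5, -0.3, -5.0])

Rt = R @ np.hstack([np.eye(3), -C.reshape(3, 1)])
d0 = Rt[2, 3]                            # d_0 = -r^{3T} C
P0 = K @ Rt                              # projective camera

Rt_inf = Rt.copy()
Rt_inf[2] = [0.0, 0.0, 0.0, d0]
P_inf = K @ Rt_inf                       # affine approximation

X = np.array([1.0, 0.5, 0.4])
delta = R[2] @ X                         # depth relief of X relative to the world origin
Xh = np.append(X, 1.0)

x_proj = to_pixels(P0 @ Xh)
x_affine = to_pixels(P_inf @ Xh)
x0 = K[:2, 2]                            # principal point (c_x, c_y)

lhs = x_affine - x_proj
rhs = (delta / d0) * (x_proj - x0)
```

Here `lhs` and `rhs` agree up to floating-point error. Shrinking the component of X along the principal ray, or moving X toward the principal ray, shrinks both.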

Types of affine camera

The simplest form is the camera matrix \[ P= \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix} \]which simply drops the Z coordinate, mapping the world point (X, Y, Z, 1) to (X, Y, 1).

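Orthographic projection

A general parallel projection adds the rotation and translation of a coordinate change and, as above, drops the Z axis, so the camera matrix becomes \[ P= \begin{bmatrix} \mathbf{r}^{1T} & t_1\\ \mathbf{r}^{2T} & t_2\\ \mathbf{0}^{T} & 1 \end{bmatrix} \]This camera has five degrees of freedom.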
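A short sketch of the orthographic camera makes the loss of depth visible: sliding a point along the viewing direction \(\mathbf{r}^3\) leaves its image unchanged. The function name and the numbers are illustrative assumptions.

```python
import numpy as np

def orthographic_camera(R, t1, t2):
    """Rows r^1, r^2 of R with offsets t1, t2, and third row (0, 0, 0, 1)."""
    P = np.zeros((3, 4))
    P[0, :3], P[0, 3] = R[0], t1
    P[1, :3], P[1, 3] = R[1], t2
    P[2, 3] = 1.0
    return P

a = 0.4
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(a), -np.sin(a)],
              [0.0, np.sin(a), np.cos(a)]])
P = orthographic_camera(R, 2.0, -1.0)

X = np.array([1.0, 2.0, 3.0, 1.0])
X_shifted = X + np.append(7.0 * R[2], 0.0)   # slide the point along r^3

x = P @ X
x_shifted = P @ X_shifted
```

Both points map to the same image point because \(\mathbf{r}^1\) and \(\mathbf{r}^2\) are orthogonal to \(\mathbf{r}^3\).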

Scaled orthographic projection

Compared with orthographic projection, this model has one more degree of freedom, an overall scale k: \[ P= \begin{bmatrix} k & & \\ & k & \\ & & 1 \end{bmatrix} \begin{bmatrix} \mathbf{r}^{1T} & t_1\\ \mathbf{r}^{2T} & t_2\\ \mathbf{0}^{T} & 1 \end{bmatrix} \]

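Weak perspective projection

Compared with scaled orthographic projection, the two image axes have different scales: \[ P= \begin{bmatrix} \alpha_x & & \\ & \alpha_y & \\ & & 1 \end{bmatrix} \begin{bmatrix} \mathbf{r}^{1T} & t_1\\ \mathbf{r}^{2T} & t_2\\ \mathbf{0}^{T} & 1 \end{bmatrix} \]

[Figure: schematic of weak perspective projection]

The object is first projected orthographically onto a plane (the plane at depth \(d_0\) mentioned above) and then projected perspectively onto the image.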

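Affine camera

The affine camera adds one more degree of freedom, the skew. It can be written in the general affine form given at the start of this section, where the only constraint is that the third row is (0, 0, 0, 1).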
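The four models differ only in the left-hand calibration factor, so they can be generated by one function. This is a sketch under the assumption that the skew s enters as the (1, 2) entry of that factor; the function name `affine_camera` and the parameter values are made up. Counting parameters gives 5, 6, 7 and 8 degrees of freedom respectively.

```python
import numpy as np

def affine_camera(R, t1, t2, alpha_x=1.0, alpha_y=1.0, s=0.0):
    """[[alpha_x, s, 0], [0, alpha_y, 0], [0, 0, 1]] times the orthographic camera."""
    calib = np.array([[alpha_x, s, 0.0],
                      [0.0, alpha_y, 0.0],
                      [0.0, 0.0, 1.0]])
    ortho = np.zeros((3, 4))
    ortho[0, :3], ortho[0, 3] = R[0], t1
    ortho[1, :3], ortho[1, 3] = R[1], t2
    ortho[2, 3] = 1.0
    return calib @ ortho

R = np.eye(3)
t1, t2 = 0.5, -0.5

cameras = {
    "orthographic": affine_camera(R, t1, t2),                      # 5 DOF
    "scaled orthographic": affine_camera(R, t1, t2, 2.0, 2.0),     # 6 DOF
    "weak perspective": affine_camera(R, t1, t2, 2.0, 3.0),        # 7 DOF
    "affine": affine_camera(R, t1, t2, 2.0, 3.0, 0.1),             # 8 DOF
}

X = np.array([1.0, 2.0, 3.0, 1.0])
images = {name: (P @ X)[:2] for name, P in cameras.items()}
```

Every matrix in `cameras` keeps the third row (0, 0, 0, 1), so each of them is an affine camera in the sense used above.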


Thursday, March 24, 2022
