HMOO Reading Notes: Introduction to PCN: Point Completion Network

Sunday, January 17, 2021

Introduction to PCN: Point Completion Network

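The previous post covered PointNet, which uses deep learning to classify or segment an input point cloud. This post introduces PCN (Point Completion Network) [1], whose goal is to predict (or reconstruct) the complete point cloud Y from a partially observed point cloud X. The figure below is the illustration from the PCN paper: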

[Figure: PCN illustration from the paper]

Model Overview

The model uses an encoder-decoder architecture. Its pipeline is as follows:

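The input point cloud X is an \(N \times 3\) matrix.
The encoder outputs a \(k\)-dimensional vector, which the paper calls the global feature. The encoder is a PointNet model (see the previous post).
The decoder has two parts, a coarse decoder and a fine decoder, which produce \(Y_{coarse}\) and \(Y_{detail}\) respectively. \(Y_{coarse}\) is an \(s \times 3\) matrix representing a point cloud of \(s\) points, and \(Y_{detail}\) is an \(n \times 3\) matrix representing a point cloud of \(n\) points, where \(n = u^2 s\) and \(u\) is a positive integer whose meaning is explained below.
The loss function measures the distance between two point clouds. The paper uses the Chamfer Distance and the Earth Mover's Distance (see the previous post).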
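
To make the encoder step concrete, here is a minimal NumPy sketch of a PointNet-style encoder: a shared per-point MLP followed by max-pooling over the points, which yields one \(k\)-dimensional global feature regardless of point order. The helper names (`init_mlp`, `mlp`, `encode`), the layer sizes, and the random weights are all assumptions made for illustration; they are not taken from the paper, and a real model would learn the weights.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random placeholder weights (assumption); a trained model would learn these."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    """Dense layers with ReLU between them (no activation after the last layer)."""
    for i, (w, b) in enumerate(weights):
        x = x @ w + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def encode(points, weights):
    """PointNet-style encoder: the same MLP on every point, then max-pool."""
    per_point = mlp(points, weights)   # shape (N, k)
    return per_point.max(axis=0)       # shape (k,), the global feature

rng = np.random.default_rng(0)
N, k = 128, 16                          # hypothetical sizes
X = rng.standard_normal((N, 3))         # partial input point cloud, N x 3
enc_weights = init_mlp(rng, [3, 32, k])

global_feature = encode(X, enc_weights)
# Feeding the same points in a different order gives the same feature.
shuffled_feature = encode(X[rng.permutation(N)], enc_weights)
```

Because the max is taken over the point axis, the global feature does not depend on the order of the rows of X, which is the property that makes this kind of encoder suitable for unordered point clouds.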

The architecture of the model is shown below:

[Figure: PCN Architecture]

Other Details

The coarse decoder takes the \(k\)-dimensional vector as input and outputs an \(s \times 3\) matrix. It consists of an MLP followed by a reshape.
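The fine decoder borrows the idea of FoldingNet [2]: imagine that the neighborhood of each point in \(Y_{coarse}\) is a small \(u \times u\) planar patch, and that the plane formed by these \(u^2\) points can be deformed into any shape by some transformation. The input of the fine decoder is therefore built from the following parts: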

The points of \(Y_{coarse}\): each point is repeated \(u^2\) times, giving dimensions \(u^2 s \times 3\).
Global feature: repeated \(u^2 s\) times, giving dimensions \(u^2 s \times k\).
2D grid feature: dimensions \(u^2 s \times 2\).

The fine decoder is an MLP whose output is \(u^2 s \times 3\), that is, \(u^2 s\) points.
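
The shapes described above can be checked with a small NumPy sketch of both decoders. It follows only what is stated here: the coarse decoder is an MLP plus a reshape, and the fine decoder is an MLP applied to the concatenation of the repeated coarse points, the repeated global feature, and a 2D grid. The function names, layer sizes, random weights, and the `grid_scale` parameter are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random placeholder weights (assumption); a trained model would learn these."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    """Dense layers with ReLU between them (no activation after the last layer)."""
    for i, (w, b) in enumerate(weights):
        x = x @ w + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def coarse_decoder(global_feature, weights, s):
    """MLP from k dims to 3*s dims, then reshape into s points."""
    return mlp(global_feature, weights).reshape(s, 3)

def fine_decoder_input(Y_coarse, global_feature, u, grid_scale=0.05):
    """Build the (u^2 s) x (3 + k + 2) input; grid_scale is a hypothetical value."""
    s = Y_coarse.shape[0]
    lin = np.linspace(-grid_scale, grid_scale, u)
    gx, gy = np.meshgrid(lin, lin)
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)       # (u^2, 2)
    point_feat = np.repeat(Y_coarse, u * u, axis=0)          # (u^2 s, 3)
    global_feat = np.tile(global_feature, (s * u * u, 1))    # (u^2 s, k)
    grid_feat = np.tile(grid, (s, 1))                        # (u^2 s, 2)
    return np.concatenate([point_feat, global_feat, grid_feat], axis=1)

def fine_decoder(Y_coarse, global_feature, weights, u):
    """Shared MLP mapping every input row to one 3D point."""
    return mlp(fine_decoder_input(Y_coarse, global_feature, u), weights)

rng = np.random.default_rng(0)
k, s, u = 16, 8, 4                                           # hypothetical sizes
v = rng.standard_normal(k)                                   # stand-in global feature
coarse_weights = init_mlp(rng, [k, 32, 3 * s])
fine_weights = init_mlp(rng, [3 + k + 2, 32, 3])

Y_coarse = coarse_decoder(v, coarse_weights, s)              # (s, 3)
features = fine_decoder_input(Y_coarse, v, u)                # (u^2 s, 3 + k + 2)
Y_detail = fine_decoder(Y_coarse, v, fine_weights, u)        # (u^2 s, 3)
```

Note the pairing of `np.repeat` and `np.tile`: each coarse point occupies \(u^2\) consecutive rows, and the full \(u \times u\) grid is laid over those rows, so every coarse point is matched with every grid coordinate exactly once.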
The Chamfer distance is defined as: \[ CD(S_1, S_2) = \frac{1}{|S_1|}\sum_{x \in S_1}\ \min_{y \in S_2}\left \| x-y \right \|_2 + \frac{1}{|S_2|}\sum_{y \in S_2}\ \min_{x \in S_1}\left \| y-x \right \|_2 \]
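
As a sanity check on the formula, the following is a brute-force NumPy sketch of the Chamfer distance exactly as defined above. It builds the full pairwise distance matrix, so it is only meant for small point clouds; the function name `chamfer_distance` and the toy inputs are chosen here for illustration.

```python
import numpy as np

def chamfer_distance(S1, S2):
    """Symmetric Chamfer distance between point sets of shape (|S1|, 3) and (|S2|, 3)."""
    diff = S1[:, None, :] - S2[None, :, :]       # (|S1|, |S2|, 3) via broadcasting
    dist = np.linalg.norm(diff, axis=-1)         # pairwise Euclidean distances
    s1_to_s2 = dist.min(axis=1).mean()           # average nearest-neighbor distance from S1
    s2_to_s1 = dist.min(axis=0).mean()           # average nearest-neighbor distance from S2
    return s1_to_s2 + s2_to_s1

A = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0, 0.0]])
cd_AB = chamfer_distance(A, B)
```

For these toy inputs, the two points of A are at distances 0 and 1 from the single point of B (average 0.5), while the point of B is at distance 0 from its nearest point in A, so the total is 0.5. The two point sets do not need to have the same number of points.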

References

[1] PCN: Point Completion Network

[2] FoldingNet: Point Cloud Auto-encoder via Deep Grid Deformation
