Camera matrix

In computer vision, a camera matrix or (camera) projection matrix is a $3 \times 4$ matrix that describes the mapping of a pinhole camera from 3D points in the world to 2D points in an image.

Let $\mathbf{x}$ be the representation of a 3D point in homogeneous coordinates (a 4-dimensional vector), and let $\mathbf{y}$ be the representation of the image of this point in the pinhole camera (a 3-dimensional vector). Then the following relation holds:

$$\mathbf{y} \sim \mathbf{C}\,\mathbf{x}$$

where $\mathbf{C}$ is the camera matrix and the sign $\sim$ means that the left-hand and right-hand sides are equal up to multiplication by a non-zero scalar.

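Since the camera matrix $\mathbf{C}$ is involved in the mapping between elements of two projective spaces, it too can be regarded as a projective element. Multiplying it by any non-zero scalar yields an equivalent camera matrix, so it has only 11 degrees of freedom (its 12 entries minus one overall scale).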
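The scale ambiguity expressed by $\sim$ can be checked numerically. The following is a minimal sketch in plain Python; the matrix entries, the 3D point, and the helper names `matvec` and `dehomogenize` are hypothetical and chosen only for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dehomogenize(y):
    """Convert a homogeneous image point (y1, y2, y3) to 2D coordinates."""
    return [y[0] / y[2], y[1] / y[2]]


# Hypothetical 3x4 camera matrix; the entries are illustrative only.
C = [[1, 0, 2, 3],
     [0, 1, 1, -1],
     [0, 0, 1, 2]]

# The same matrix multiplied by a non-zero scalar.
C_scaled = [[-3 * c for c in row] for row in C]

x = [1, 2, 2, 1]  # the 3D point (1, 2, 2) in homogeneous coordinates

y = matvec(C, x)                # homogeneous image point
y_scaled = matvec(C_scaled, x)  # differs from y only by the factor -3

p = dehomogenize(y)
p_scaled = dehomogenize(y_scaled)
print(p, p_scaled)
```

Both matrices yield the same 2D image point after division by the third component, which is why the overall scale of the camera matrix carries no information.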

Derivation

According to the pinhole camera model, the mapping from the coordinates of a 3D point P to the 2D image coordinates of the point's projection onto the image plane is given by

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \frac{f}{x_3} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

where $(x_1, x_2, x_3)$ are the 3D coordinates of P relative to a camera-centered coordinate system, $(y_1, y_2)$ are the resulting image coordinates, and $f$ is the camera's focal length, for which we assume $f > 0$. Furthermore, we assume that $x_3 > 0$.
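As a quick numerical illustration, with values chosen here purely as an example (they do not come from the source), take $f = 2$ and a point P with camera-centered coordinates $(1, 2, 4)$:

```latex
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
  = \frac{2}{4} \begin{pmatrix} 1 \\ 2 \end{pmatrix}
  = \begin{pmatrix} 0.5 \\ 1 \end{pmatrix}
```

Doubling the depth to $x_3 = 8$ while keeping $x_1$ and $x_2$ fixed halves both image coordinates, since the factor $f / x_3$ becomes $2/8$.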

To derive the camera matrix, the expression above is rewritten in terms of homogeneous coordinates. Instead of the 2D vector $(y_1, y_2)$ we consider the projective element (a 3D vector) $\mathbf{y} = (y_1, y_2, 1)$, and instead of equality we consider equality up to scaling by a non-zero number, denoted $\sim$. First, we write the homogeneous image coordinates as expressions in the usual 3D coordinates:

$$\begin{pmatrix} y_1 \\ y_2 \\ 1 \end{pmatrix} = \frac{f}{x_3} \begin{pmatrix} x_1 \\ x_2 \\ \frac{x_3}{f} \end{pmatrix} \sim \begin{pmatrix} x_1 \\ x_2 \\ \frac{x_3}{f} \end{pmatrix}$$

Finally, the 3D coordinates are also expressed in a homogeneous representation $\mathbf{x}$, and this is how the camera matrix appears:

$$\begin{pmatrix} y_1 \\ y_2 \\ 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{f} & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ 1 \end{pmatrix} \quad \text{or} \quad \mathbf{y} \sim \mathbf{C}\,\mathbf{x}$$

where $\mathbf{C}$ is the camera matrix, which here is given by

$$\mathbf{C} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{f} & 0 \end{pmatrix}$$

and the corresponding camera matrix now becomes

$$\mathbf{C} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{1}{f} & 0 \end{pmatrix} \sim \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

The last step, which multiplies the matrix by the non-zero scalar $f$, is a consequence of $\mathbf{C}$ itself being a projective element.

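The camera matrix derived here may appear trivial in the sense that it contains very few non-zero elements. This depends to a large extent on the particular coordinate systems chosen for the 3D and 2D points. In practice, however, other forms of camera matrices are common, as shown in the sections that follow.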
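The equivalence of the two matrix forms with the original pinhole formula can be checked with a short sketch. The focal length, the point, and the helper names `matvec` and `project` are assumptions made for this example only.

```python
def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def project(C, point):
    """Apply a 3x4 camera matrix to a 3D point and dehomogenize."""
    y = matvec(C, list(point) + [1.0])
    return [y[0] / y[2], y[1] / y[2]]


f = 2.0                 # illustrative focal length, f > 0
P = (1.0, 2.0, 4.0)     # illustrative point with x3 > 0

# Direct pinhole formula: (y1, y2) = (f / x3) * (x1, x2).
direct = [f * P[0] / P[2], f * P[1] / P[2]]

# Form with 1/f in the third row.
C_a = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0 / f, 0.0]]

# Equivalent form obtained by multiplying C_a by f.
C_b = [[f, 0.0, 0.0, 0.0],
       [0.0, f, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0]]

via_a = project(C_a, P)
via_b = project(C_b, P)
print(direct, via_a, via_b)
```

All three computations give the same image coordinates, so either matrix form may be used.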

Camera position

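The camera matrix $\mathbf{C}$ derived in the previous section has a null space that is spanned by the vector

$$\mathbf{n} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$

This is also the homogeneous representation of the 3D point with coordinates (0, 0, 0); that is, the "camera center" (also known as the entrance pupil, the position of the pinhole of a pinhole camera) is at O. This means that the camera center (and only this point) cannot be mapped to a point in the image plane by the camera (or, equivalently, it maps to all points in the image, since every ray in the image passes through this point).

For any other 3D point with $x_3 = 0$, the result $\mathbf{y} \sim \mathbf{C}\,\mathbf{x}$ is well-defined and has the form $\mathbf{y} = (y_1 \; y_2 \; 0)^{\top}$. This corresponds to a point at infinity in the projective image plane (although, if the image plane is taken to be a Euclidean plane, no corresponding intersection point exists).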
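Both cases can be illustrated numerically. This is a minimal sketch with an assumed focal length and an assumed point in the plane $x_3 = 0$; `matvec` is a hypothetical helper.

```python
def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


f = 2.0  # illustrative focal length
C = [[f, 0.0, 0.0, 0.0],
     [0.0, f, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

# Homogeneous representation of the camera center (0, 0, 0).
center = [0.0, 0.0, 0.0, 1.0]
y_center = matvec(C, center)   # the zero vector: no image point is defined

# Another point with x3 = 0, here (3, -1, 0).
in_plane = [3.0, -1.0, 0.0, 1.0]
y_inf = matvec(C, in_plane)    # third component is 0: a point at infinity

print(y_center, y_inf)
```

The zero vector is not a valid projective element, whereas a vector whose last component is zero is valid and simply cannot be dehomogenized to finite image coordinates.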

Normalized camera matrix and normalized image coordinates

The camera matrix derived above can be simplified even further if we assume that $f = 1$:

$$\mathbf{C}_0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \left( \begin{array}{c|c} \mathbf{I} & \mathbf{0} \end{array} \right)$$

where $\mathbf{I}$ denotes the $3 \times 3$ identity matrix. Note that the $3 \times 4$ matrix $\mathbf{C}$ is here written as the concatenation of a $3 \times 3$ matrix and a 3-dimensional vector. The camera matrix $\mathbf{C}_0$ is sometimes referred to as a canonical form.

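So far, all points in the 3D world have been represented in a camera-centered coordinate system, that is, a coordinate system with its origin at the camera center (the position of the pinhole of a pinhole camera). In practice, however, the 3D points may be represented in terms of coordinates relative to an arbitrary coordinate system (X1', X2', X3'). Assuming that the camera coordinate axes (X1, X2, X3) and the axes (X1', X2', X3') are of Euclidean type (orthogonal and isotropic), there is a unique Euclidean 3D transformation (rotation and translation) between the two coordinate systems. In other words, the camera is not necessarily located at the origin looking along the z axis.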

The two operations of rotation and translation of 3D coordinates can be represented as the two $4 \times 4$ matrices

$$\left( \begin{array}{c|c} \mathbf{R} & \mathbf{0} \\ \hline \mathbf{0} & 1 \end{array} \right) \quad \text{and} \quad \left( \begin{array}{c|c} \mathbf{I} & \mathbf{t} \\ \hline \mathbf{0} & 1 \end{array} \right)$$

where $\mathbf{R}$ is a $3 \times 3$ rotation matrix and $\mathbf{t}$ is a 3-dimensional translation vector. When the first matrix is multiplied onto the homogeneous representation of a 3D point, the result is the homogeneous representation of the rotated point; the second matrix performs a translation instead. Performing the two operations in sequence, that is, first the rotation and then the translation (with the translation vector given in the already rotated coordinate system), gives a combined rotation and translation matrix

$$\left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \\ \hline \mathbf{0} & 1 \end{array} \right)$$
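The order of the two operations matters, which a small sketch makes concrete. The rotation (90 degrees about the X3 axis) and the translation vector (5, 6, 7) are assumed values, and `matmul` is a hypothetical helper.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Rotation by 90 degrees about the X3 axis, in 4x4 homogeneous form.
rotation = [[0, -1, 0, 0],
            [1, 0, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

# Translation by t = (5, 6, 7), in 4x4 homogeneous form.
translation = [[1, 0, 0, 5],
               [0, 1, 0, 6],
               [0, 0, 1, 7],
               [0, 0, 0, 1]]

# Rotate first, then translate: the rightmost factor acts first.
rotate_then_translate = matmul(translation, rotation)

# The opposite order gives a different transformation.
translate_then_rotate = matmul(rotation, translation)

for row in rotate_then_translate:
    print(row)
```

Only the first product has the block form with $\mathbf{R}$ in the upper left and the unmodified $\mathbf{t}$ in the last column; in the second product the last column is the rotated translation vector instead.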

Assuming that $\mathbf{R}$ and $\mathbf{t}$ are precisely the rotation and translation that relate the two coordinate systems (X1, X2, X3) and (X1', X2', X3') above, this implies that

$$\mathbf{x} = \left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \\ \hline \mathbf{0} & 1 \end{array} \right) \mathbf{x}'$$

where $\mathbf{x}'$ is the homogeneous representation of the point P in the coordinate system (X1', X2', X3').

Assuming also that the camera matrix is given by $\mathbf{C}_0$, the mapping from the coordinates in the (X1', X2', X3') system to homogeneous image coordinates becomes

$$\mathbf{y} \sim \mathbf{C}_0\,\mathbf{x} = \left( \begin{array}{c|c} \mathbf{I} & \mathbf{0} \end{array} \right) \left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \\ \hline \mathbf{0} & 1 \end{array} \right) \mathbf{x}' = \left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \end{array} \right) \mathbf{x}'$$

Consequently, the camera matrix that relates points in the coordinate system (X1', X2', X3') to image coordinates is

$$\mathbf{C}_N = \left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \end{array} \right)$$

a concatenation of a 3D rotation matrix and a 3-dimensional translation vector.

This type of camera matrix is referred to as a normalized camera matrix. It assumes a focal length of 1 and that image coordinates are measured in a coordinate system whose origin lies at the intersection of the axis X3 with the image plane and whose units are the same as those of the 3D coordinate system. The resulting image coordinates are referred to as normalized image coordinates.
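A normalized camera matrix can be assembled and applied as follows. This is only a sketch: the rotation, the translation, the world point, and the helper `matvec` are assumptions made for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Illustrative rotation: 90 degrees about the X3 axis.
R = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]

# Illustrative translation vector.
t = [1, 0, 5]

# Normalized camera matrix C_N = (R | t): append t as a fourth column.
C_N = [row + [ti] for row, ti in zip(R, t)]

# The point (1, 2, 3) in the (X1', X2', X3') system, homogeneous form.
x_world = [1, 2, 3, 1]

y = matvec(C_N, x_world)                 # homogeneous normalized image point
normalized = [y[0] / y[2], y[1] / y[2]]  # normalized image coordinates
print(y, normalized)
```

The intermediate vector `y` equals the point expressed in camera-centered coordinates, and dividing by its third component performs the pinhole projection with focal length 1.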

Camera position

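Again, the null space of the normalized camera matrix $\mathbf{C}_N$ described above is spanned by a 4-dimensional vector:

$$\mathbf{n} = \begin{pmatrix} -\mathbf{R}^{-1}\,\mathbf{t} \\ 1 \end{pmatrix} = \begin{pmatrix} \tilde{\mathbf{n}} \\ 1 \end{pmatrix}$$

This is, again, the coordinates of the camera center, now relative to the (X1', X2', X3') system. This can be seen by applying first the rotation and then the translation to the 3-dimensional vector $\tilde{\mathbf{n}}$: the result is $\mathbf{R}\,\tilde{\mathbf{n}} + \mathbf{t} = -\mathbf{t} + \mathbf{t} = \mathbf{0}$, that is, the 3D coordinates (0, 0, 0).

This implies that the camera center (in its homogeneous representation) lies in the null space of the camera matrix, provided that it is represented in terms of 3D coordinates relative to the same coordinate system as the one the camera matrix refers to.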

The normalized camera matrix $\mathbf{C}_N$ can now be written as

$$\mathbf{C}_N = \mathbf{R} \left( \begin{array}{c|c} \mathbf{I} & \mathbf{R}^{-1}\,\mathbf{t} \end{array} \right) = \mathbf{R} \left( \begin{array}{c|c} \mathbf{I} & -\tilde{\mathbf{n}} \end{array} \right)$$

where $\tilde{\mathbf{n}}$ is the 3D coordinates of the camera relative to the (X1', X2', X3') system.
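The camera center can be recovered from $\mathbf{R}$ and $\mathbf{t}$ in a few lines. The sketch below uses assumed values for $\mathbf{R}$ and $\mathbf{t}$ and relies on the fact that the inverse of a rotation matrix equals its transpose; `matvec` and `transpose` are hypothetical helpers.

```python
def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


# Illustrative rotation (90 degrees about X3) and translation.
R = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]
t = [1, 0, 5]
C_N = [row + [ti] for row, ti in zip(R, t)]

# Camera center: n_tilde = -R^{-1} t, with R^{-1} = R^T for a rotation.
n_tilde = [-c for c in matvec(transpose(R), t)]

# Its homogeneous representation lies in the null space of C_N.
residual = matvec(C_N, n_tilde + [1])

# Check of the factorization C_N = R (I | -n_tilde): the last column
# R * (-n_tilde) must reproduce t.
last_column = matvec(R, [-c for c in n_tilde])

print(n_tilde, residual, last_column)
```

With these values the camera sits at (0, 1, -5) in the (X1', X2', X3') system, and both checks come out as expected.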

General camera matrix

Given the mapping produced by a normalized camera matrix, the resulting normalized image coordinates can be transformed by means of an arbitrary 2D homography. This includes 2D translations and rotations as well as scaling (isotropic and anisotropic), but also general 2D perspective transformations. Such a transformation can be represented as a $3 \times 3$ matrix $\mathbf{H}$ that maps the homogeneous normalized image coordinates $\mathbf{y}$ to the homogeneous transformed image coordinates $\mathbf{y}'$:

$$\mathbf{y}' = \mathbf{H}\,\mathbf{y}$$

Inserting the above expression for the normalized image coordinates in terms of the 3D coordinates gives

$$\mathbf{y}' = \mathbf{H}\,\mathbf{C}_N\,\mathbf{x}'$$

This produces the most general form of the camera matrix:

$$\mathbf{C} = \mathbf{H}\,\mathbf{C}_N = \mathbf{H} \left( \begin{array}{c|c} \mathbf{R} & \mathbf{t} \end{array} \right)$$
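A general camera matrix can be composed and applied as in the following sketch. The matrix `H` used here is an assumed example that only scales and shifts the image coordinates (an affine special case of a homography); its numbers, the pose, the world point, and the helper names are all hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(M, v):
    """Multiply a matrix (a list of rows) by a vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Assumed 2D transformation: scale by 800, then shift by (320, 240).
H = [[800, 0, 320],
     [0, 800, 240],
     [0, 0, 1]]

# Assumed normalized camera matrix (R | t), with R a 90 degree rotation
# about X3 and t = (1, 0, 5).
C_N = [[0, -1, 0, 1],
       [1, 0, 0, 0],
       [0, 0, 1, 5]]

# General camera matrix C = H C_N.
C = matmul(H, C_N)

x_world = [1, 2, 3, 1]  # the point (1, 2, 3), homogeneous form

# One-step projection with C, and the equivalent two-step projection.
y_one_step = matvec(C, x_world)
y_two_step = matvec(H, matvec(C_N, x_world))

image_point = [y_one_step[0] / y_one_step[2], y_one_step[1] / y_one_step[2]]
print(C, image_point)
```

Because matrix multiplication is associative, applying $\mathbf{C}$ directly gives the same homogeneous vector as applying $\mathbf{C}_N$ first and $\mathbf{H}$ afterwards.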