DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
A Survey of Derivative Research on Neural Radiance Fields (NeRF)
Kento Doi, Matsuo Lab
What is NeRF?
• NeRF: Neural Radiance Field
• Best paper honorable mention at ECCV 2020
• An NN that takes a 3D point's coordinates and a viewing direction as input and outputs color and density (see the formula below)
• Achieved a large performance gain on the novel view synthesis task
• References: past DL輪読会 slides and a Nikkei Cross Trend article
B. Mildenhall et al. "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis". ECCV, 2020.
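In the paper's formulation, NeRF is a single MLP that maps a 3D position and a viewing direction to an emitted color and a volume density:

$$F_\Theta : (\mathbf{x}, \mathbf{d}) \mapsto (\mathbf{c}, \sigma)$$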
How NeRF works
• A scene is represented by an NN (NeRF) that predicts the color and density of 3D points (left figure)
• Input: a 3D point's coordinates and the viewing direction (to capture direction-dependent effects such as reflection and transparency)
• Output: color and density (intuitively, density expresses how opaque the point is)
• An image conditioned on a viewpoint is synthesized via volume rendering (center figure; a numerical sketch follows the citation below)
• The color of each pixel is computed by marching along its camera ray
• Imagining light traveling through colored jelly may help
• Trained with a reconstruction loss (right figure)
B. Mildenhall et al. "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis". ECCV, 2020.
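A minimal numerical sketch of the ray-marching step, assuming a callable field(points, direction) -> (rgb, sigma) stands in for the trained network (an illustrative interface, not the authors' code):

```python
import numpy as np

def render_ray(field, origin, direction, t_near=2.0, t_far=6.0, n_samples=64):
    """Estimate one pixel's color by quadrature along its camera ray."""
    t = np.linspace(t_near, t_far, n_samples)             # sample depths along the ray
    pts = origin + t[:, None] * direction                 # (n_samples, 3) query points
    rgb, sigma = field(pts, direction)                    # colors (n, 3) and densities (n,)
    delta = np.append(np.diff(t), 1e10)                   # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)                  # per-segment opacity
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))  # accumulated transmittance T_i
    weights = trans * alpha                               # compositing weights
    return (weights[:, None] * rgb).sum(axis=0)           # final pixel color
```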
Demo video
Demo video on YouTube: https://www.youtube.com/watch?v=JuH79E8rdKc
Contents of this talk
• This talk surveys derivative research on neural radiance fields (NeRF)
• The studies are grouped into the following categories:
• NeRF from unconstrained photo collections
• NeRF for video
• Data-efficient NeRF
• NeRF + GAN
• Implementation improvements and speed-ups
• Controllable NeRF
• Only the key points are covered; details are left to the papers
NeRF in the Wild
NeRF in the Wild: Neural Radiance Fields for Unconstrained
Photo Collections
• Bibliographic information
• Title: NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections
• Research group: Google Research
• CVPR 2021 oral
• Project page
• Overview
• NeRF from unconstrained photo collections
• Reconstructs tourist landmarks from images collected on the internet (right figure)
R. Martin-Brualla et al. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. CVPR, 2021.
NeRF in the Wild: Neural Radiance Fields for Unconstrained
Photo Collections
• Motivation: 3D reconstruction from unconstrained photo collections would be valuable
• e.g., reconstructing a tourist landmark in 3D from images gathered via Google search
• Challenges of 3D reconstruction from such images:
• Appearance changes caused by differences in weather, time of day, etc.
• Appearance and disappearance of dynamic objects such as pedestrians
Two mechanisms are proposed to resolve these problems:
1. A latent variable that governs appearance variation
2. A mechanism that explicitly separates the scene into static and transient parts
R. Martin-Brualla et al. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. CVPR, 2021.
NeRF in the Wild: Neural Radiance Fields for Unconstrained
Photo Collections
• Latent Appearance Modeling
• A technique for handling appearance changes due to weather, time of day, etc.
• Introduces a per-image latent variable $l_i^{(a)}$ that governs the appearance style of image $i$
• Transient Objects
• Handles dynamic objects
• Separate network branches for the static and transient parts of the scene
• The transient part is predicted conditioned on a latent variable $l_i^{(\tau)}$
• The per-image uncertainty is also estimated explicitly
• Regions with high uncertainty are down-weighted in the loss (see the sketch below)
R. Martin-Brualla et al. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. CVPR, 2021.
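As a rough illustration of this down-weighting, here is a minimal sketch of an uncertainty-weighted reconstruction loss in the spirit of NeRF-W; the exact regularizers and the lambda_u default are assumptions, not the paper's verbatim recipe.

```python
# Minimal sketch (assumed interface, not the authors' code): c_pred/c_true are
# per-ray RGB arrays, beta is the predicted per-ray uncertainty, and sigma_tau
# holds the transient densities sampled along each ray.
import numpy as np

def nerfw_style_loss(c_pred, c_true, beta, sigma_tau, lambda_u=0.01):
    recon = ((c_pred - c_true) ** 2).sum(-1) / (2 * beta ** 2)  # uncertain rays count less
    reg_beta = np.log(beta ** 2) / 2         # stops beta from growing without bound
    reg_tau = lambda_u * sigma_tau.mean(-1)  # discourages overusing the transient branch
    return (recon + reg_beta + reg_tau).mean()
```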
NeRF in the Wild: Neural Radiance Fields for Unconstrained
Photo Collections
• Experiments
• Baselines:
• Neural Rerendering in the Wild (multi-view stereo + pix2pix)
• NeRF
• NeRF-A (Latent Appearance Modeling only)
• NeRF-U (Transient Objects only)
• Dataset:
• Photos of world landmarks (the dataset created for Neural Rerendering in the Wild)
• Quantitative evaluation
NeRF in the Wild: Neural Radiance Fields for Unconstrained
Photo Collections
NeRF for video (single camera)
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Bibliographic information
• CVPR 2021
• Work by an Adobe Research intern
• Project page
• Overview
• 4D view synthesis from a monocular video
• Generates images at arbitrary viewpoints and times
• Achieved with Neural Scene Flow Fields
• Represents the scene as a function that takes time as an input
• Can reproduce complex scenes containing thin structures
Z. Li et al. Neural Scene Flow Fields for Space-Time
View Synthesis of Dynamic Scenes. CVPR, 2021.
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Formulation
• Input: a 3D point's coordinates, the viewing direction, and time
• Output:
• radiance and density
• 3D scene flow (the point's displacement to the previous and next frames)
• disocclusion weights (coefficients that weight the losses described later)
• The model can be written as the function below ($i$ denotes time; see the reconstruction after this list)
• Training
• The model is trained with reconstruction losses against neighboring frames
• Near object boundaries, occlusion makes flow hard to predict accurately, so the loss is down-weighted there (disocclusion weights)
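The equation on this slide is an image in the original; in the paper's notation it is approximately

$$(\mathbf{c}_i,\ \sigma_i,\ \mathcal{F}_i,\ \mathcal{W}_i) = F_\theta(\mathbf{x}, \mathbf{d}, i),$$

where $\mathcal{F}_i$ collects the forward/backward scene flows and $\mathcal{W}_i$ the corresponding disocclusion weights.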
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Figure illustrating the previous page's formulation (panels: 3D scene flow, disocclusion weights)
Z. Li et al. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. CVPR, 2021.
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Regularization techniques
• Scene flow prior (cycle consistency)
• The flow from time $i \to i+1$ and the flow from $i+1 \to i$ are constrained to be consistent (see the sketch after the citation below)
• Geometric consistency prior
• The 3D scene flow projected into the frame is constrained to match 2D optical flow
• Single-view depth prior
• Consistency with monocular depth estimation
Z. Li et al. Neural Scene Flow Fields for Space-Time
View Synthesis of Dynamic Scenes. CVPR, 2021.
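A minimal sketch of what the cycle-consistency term could look like; flow_fwd/flow_bwd are assumed callables standing in for the model's predicted scene flows, not the authors' API.

```python
import numpy as np

def cycle_consistency_loss(x, t, flow_fwd, flow_bwd):
    f = flow_fwd(x, t)          # predicted displacement from time t to t+1
    b = flow_bwd(x + f, t + 1)  # displacement back from t+1 to t, at the warped point
    return np.abs(f + b).sum(-1).mean()  # forward and backward flows should cancel
```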
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Two models represent the static and dynamic parts of the scene separately
• Static part: a standard NeRF
• Dynamic part: the proposed model
Z. Li et al. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. CVPR, 2021.
Neural Scene Flow Fields for Space-Time View Synthesis of
Dynamic Scenes
• Experimental results
Z. Li et al. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. CVPR, 2021.
Methods similar to Neural Scene Flow Fields
• Model video with a time-dependent radiance field (+ regularization)
• Deformable Neural Radiance Fields (D-NeRF, NeRFies)
• Novel view synthesis from selfie videos; the demos are presented brilliantly
• Reviewers reportedly pointed out that it does not handle multiple people (per a tweet by the authors)
• Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video (NR-NeRF)
• Validated on real data
• D-NeRF: Neural Radiance Fields for Dynamic Scenes
• Validated on dynamic CG data
K. Park et al. Deformable Neural Radiance Fields. arXiv, 2020. A. Pumarola et al. D-NeRF: Neural Radiance Fields for Dynamic Scenes. arXiv, 2020.
Methods similar to Neural Scene Flow Fields
• Space-time Neural Irradiance Fields for Free-Viewpoint Video
• Regularization via depth estimation
• STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering
• Problem setting in which a single rigid object moves through the scene
W. Xian et al. Space-time Neural Irradiance Fields for Free-Viewpoint Video.
W. Yuan et al. STaR: Self-supervised Tracking and Reconstruction
of Rigid Objects in Motion with Neural Rendering. arXiv, 2021.
NeRF for video (multi camera)
Neural 3D Video Synthesis (DyNeRF)
• Bibliographic information
• Title: Neural 3D Video Synthesis
• Research group: Facebook Reality Labs Research (internship work)
• Project page
• Overview
• NeRF for dynamic scenes
• Estimates radiance from position, direction, and time
• Also proposes methods for efficient training
• Generates free-viewpoint video from 18 cameras
• 1/40 the model size of the baseline (one model per frame)
T. Li et al. Neural 3D Video Synthesis. arXiv, 2021.
Neural 3D Video Synthesis (DyNeRF)
• Formulation
• Predicts radiance and density from a 3D point's coordinates and viewing direction plus the time step
• In practice, the field is conditioned on a learned latent code $\mathbf{z}_t$ for each time step (the $\mathbf{z}_t$ are trained jointly; see below)
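In the paper's notation this amounts to a time-conditioned field, roughly

$$F_\Theta : (\mathbf{x}, \mathbf{d}, \mathbf{z}_t) \mapsto (\mathbf{c}, \sigma).$$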
Neural 3D Video Synthesis (DyNeRF)
• Rendering and the loss function are the same as in NeRF
• As in NeRF, the radiance field is trained with a reconstruction loss against the original images
• With naive training, however, training time grows linearly with the number of frames...
Two methods for efficient training are proposed:
1. Hierarchical Training
2. Ray Importance Sampling
Neural 3D Video Synthesis (DyNeRF)
• Technique 1: Hierarchical Training
• Efficiency through a training curriculum:
1. Pre-train on temporally subsampled frames
2. Fine-tune on all frames
• Technique 2: Ray Importance Sampling
• Preferentially sample rays through the dynamic pixels of the scene (see the sketch after this list)
1. Sampling Based on Global Median Maps (DyNeRF-ISG)
• Preferentially samples pixels that differ strongly from the per-pixel median over all time steps
2. Sampling Based on Temporal Difference (DyNeRF-IST)
• Preferentially samples pixels that differ strongly from neighboring frames
• In the experiments, the second method performed best
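A minimal sketch of temporal-difference importance sampling in the spirit of DyNeRF-IST; the residual measure and the eps floor are illustrative assumptions, not the paper's exact weighting.

```python
import numpy as np

def sample_ray_pixels(frame_t, frame_prev, n_rays, eps=1e-2):
    """Pick pixel locations to cast training rays through, biased toward motion."""
    resid = np.abs(frame_t.astype(float) - frame_prev).mean(-1)  # per-pixel temporal change
    weights = resid.ravel() + eps              # eps keeps static pixels occasionally sampled
    p = weights / weights.sum()
    idx = np.random.choice(weights.size, size=n_rays, p=p)
    return np.unravel_index(idx, resid.shape)  # (rows, cols) of the rays to train on
```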
Neural 3D Video Synthesis (DyNeRF)
• The experimental dataset was captured with 21 GoPro cameras
• 18 of them were used for training
• Camera parameters were estimated with COLMAP from the first frame of each video
T. Li et al. Neural 3D Video Synthesis. arXiv, 2021.
Neural 3D Video Synthesis (DyNeRF)
• The method's effectiveness is validated against baselines
• Superior in photorealism (bottom left)
• NeRF-T ... a model without the latent code $\mathbf{z}_t$
• DyNeRF† ... without the training-efficiency techniques
• The remaining three are the proposed method
• Compared with training one NeRF per frame, the model is compressed by a factor of 40 (bottom right)
T. Li et al. Neural 3D Video Synthesis. arXiv, 2021.
Neural 3D Video Synthesis (DyNeRF)
• Qualitative evaluation
• See the demo videos on the project page
Data-efficient NeRF
• Problem: one model can represent only one scene
• pixelNeRF: Neural Radiance Fields from One or Few Images
• CVPR 2021
• Achieves few-shot NeRF by learning a radiance field conditioned on the input images
• GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering
• Generalizes across scenes and can reconstruct a scene from sparse inputs
• A single model can represent multiple scenes
A. Trevithick and B. Yang. GRF: Learning a General Radiance
Field for 3D Scene Representation and Rendering. arXiv, 2020.
A. Yu et al. pixelNeRF: Neural Radiance Fields from One or Few Images. CVPR, 2021.
Data-efficient NeRF
• Learned Initializations for Optimizing Coordinate-Based Neural Representations
• CVPR 2021
• Uses meta-learning to obtain good initial weights
• The results also translate into faster training
M. Tancik et al. Learned Initializations for Optimizing Coordinate-Based Neural Representations. CVPR, 2021.
NeRF + GAN
GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis
• Bibliographic information
• NeurIPS 2020
• Overview
• cGAN + NeRF
• 3D-aware image generation using a radiance field
• Generates images from a radiance field conditioned on shape and appearance codes
• Surpasses a voxel-based method (HoloGAN) in photorealism
K. Schwarz et al. GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. NeurIPS, 2020.
pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-
Aware Image Synthesis
• Bibliographic information
• arXiv
• Stanford University
• Overview
• NeRF + StyleGAN + SIREN
• Generates images from a radiance field via volume rendering
• Architecture modeled after StyleGAN
• Uses sinusoidal activations (SIREN)
• Better photorealism than prior work (HoloGAN)
E. R. Chan et al. pi-GAN: Periodic Implicit Generative Adversarial
Networks for 3D-Aware Image Synthesis. arXiv, 2020.
Implementation improvements and speed-ups
Implementation improvements
• NeRF++: Analyzing and Improving Neural Radiance Fields [Zhang+]
• Analyzes NeRF's network design and makes it applicable to broader (unbounded) scenes
• JaxNeRF [Deng+]
• A JAX implementation of NeRF
• Faster than the original TensorFlow implementation
K. Zhang et al. NeRF++: Analyzing and Improving Neural Radiance Fields. arXiv, 2020.
Speeding up NeRF
• Neural Sparse Voxel Fields [Liu+ NeurIPS2020]
• Partitions space into sparse voxels and learns a NeRF within each voxel
• More than 10× faster
• DeRF: Decomposed Radiance Fields [Rebain+ arXiv2020]
• Decomposes the scene with a Voronoi partition and assigns a NeRF to each cell
• More than 3× more efficient (faster?)
L. Liu et al. Neural Sparse Voxel Fields. NeurIPS, 2020.
D. Rebain et al. DeRF: Decomposed Radiance Fields. arXiv, 2020.
Controllable NeRF
Controllable NeRF (I'm not very familiar with this area)
• NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis (CVPR 2021)
• Learns MLPs that predict density, surface normals, material parameters, and the distance and direction to surfaces
• Enables rendering under arbitrary lighting
• NeRD: Neural Reflectance Decomposition from Image Collections
• Seems similar in spirit to NeRV (?)
P. Srinivasan et al. NeRV: Neural Reflectance and Visibility
Fields for Relighting and View Synthesis. CVPR, 2021.
M. Boss et al. NeRD: Neural Reflectance Decomposition from Image Collections. arXiv, 2020.
Summary
• Introduced derivative research on neural radiance fields (NeRF)
• The work can be broadly classified as follows:
• NeRF from unconstrained photo collections
• NeRF for video
• Data-efficient NeRF
• NeRF + GAN
• Implementation improvements and speed-ups
• Controllable NeRF
• Impressions
• Going forward, we will likely need to think about what is required for practical applications
• NeRF research appears to be led by Google Research and Facebook
Appendix
References
• B. Mildenhall et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV, 2020.
• R. Martin-Brualla et al. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. CVPR, 2021.
• Z. Li et al. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. CVPR, 2021.
• K. Park et al. Deformable Neural Radiance Fields. arXiv, 2020.
• A. Pumarola et al. D-NeRF: Neural Radiance Fields for Dynamic Scenes. arXiv, 2020.
• T. Li et al. Neural 3D Video Synthesis. arXiv, 2021.
• A. Yu et al. pixelNeRF: Neural Radiance Fields from One or Few Images. CVPR, 2021.
• A. Trevithick and B. Yang. GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering. arXiv, 2020.
• M. Tancik et al. Learned Initializations for Optimizing Coordinate-Based Neural Representations. CVPR, 2021.
References
• K. Schwarz et al. GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. NeurIPS, 2020.
• E. R. Chan et al. pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis. arXiv, 2020.
• L. Liu et al. Neural Sparse Voxel Fields. NeurIPS, 2020.
• K. Zhang et al. NeRF++: Analyzing and Improving Neural Radiance Fields. arXiv, 2020.
• B. Deng et al. JaxNeRF. https://github.com/google-research/google-research/tree/master/jaxnerf
• P. Srinivasan et al. NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis. CVPR, 2021.
• M. Boss et al. NeRD: Neural Reflectance Decomposition from Image Collections. arXiv, 2020.