SlideShare una empresa de Scribd logo
1 de 22
Descargar para leer sin conexión
Deep Learning for
3D Point Clouds
Vijaylaxmi Nagurkar
Introduction
This article presents a summary of an in-depth study on Point Cloud Learning. Point cloud in AI
has gained a lot of attention in recent years due to its contribution in evolving fields like
autonomous driving, computer vision etc.Deep learning has already mastered solving problems
related to 2D vision. But when it comes to 3D Point Clouds, deep learning techniques are still
ina naive stage as they face a lot of unique challenges. This study covers the various
methodologies to address such unique challenges.
The major tasks which will be explained are: 3D shape classification, 3D object detection and
tracking, and 3D point cloud segmentation.
With numerous publicly available datasets like ModelNet, ShapeNet , ScanNet, Semantic3D,
and the KITTI Vision Benchmark Suite. With these datasets, there are a lot of challenges that
may arise due to different methods applied in these datasets. Hence the research of deep
learning on 3D Point Clouds has gained further velocity to address these challenges.
The following figure shows a taxonomy of existing deep learning methods for 3D
point clouds.
3D Shape Classification
Under this, methods first learn the embedding of individual points , then a global shape
embedding is extracted from the whole point cloud using an aggregation method.After which
several fully connected layers are connected to achieve classification. They are further divided
into projection-based networks and point-based networks.
Projection-based methods first project an unstructured point cloud into an intermediate regular
representation, and then leverage the well-established 2D or 3D convolution to achieve shape
classification. In contrast, point-based methods directly work on raw point clouds without any
voxelization or projection. Point-based methods do not introduce explicit information loss and
becoming increasingly popular.
Projection-based Networks
These methods project 3D point clouds into following modalities :
● Multi-View representation
● Volumetric representation
Multi-view representation
3D object is first projected into multiple views, corresponding view-wise features are extracted , which
are finally fused for accurate object detection. The major challenge is to aggregate multiple view-wise
features into a discriminative global representation. Many methods have been proposed to overcome
this challenge and to improve recognition accuracy.
Volumetric representation
Early methods usually apply 3D Convolution Neural Network (CNN) build upon the volumetric
representation of 3D point clouds.
3D Shape Classification
Point-based Networks
It emphasizes on feature learning of each point. They are further divided into:
● Pointwise MLP Networks
● Convolution-based Networks
● Graph-based Networks
● Data Indexing-based Networks
● Other Networks
● Pointwise MLP Networks
Multi Layer Perceptrons are used to model each point independently, after which a symmetric function
is used to aggregate a global feature.
● Convolution-based Networks
As 3D Point Clouds are irregular, it is difficult to design convolution kernels for them.They can be
further divided into continuous convolution networks and discrete convolution networks.
Point-based Networks
Point-based Networks continued ..
● 3D Continuous Convolution Networks
Convolution kernels are defined on a continuous space, and the weights for the neighbouring
points are related to spatial distribution w.r.t centre point.
● 3D Discrete Convolution Networks
Convolutional kernels are defined on regular grids, where the weights for neighboring points
are related to the offsets with respect to the center point.
Graph based Networks
Each point in a point cloud is considered as the vertex of a graph after which directed edges for the graph
based on the neighbors of each point are generated. Feature learning is then performed in spatial or spectral
domains.
These can be further divided into:
● Graph-based Methods in Spatial Domain.
Operations like Convolution and Pooling are defined in spatial domain.convolution is implemented through
MLP over spatial neighbors, pooling produces a new coarsened graph by aggregating information from each
point’s neighbors.
● Graph-based Methods in Spectral Domain
Convolutions are defined as spectral filtering. It is implemented as the multiplication of signals on graphs with
eigenvectors of the graph Laplacian matrix.
Data Indexing based Networks
Based on different data indexing structures like octree and kd-tree, these networks are formed.
Point features are learned hierarchically from leaf nodes to the root node along a tree.At each
level, point features are first learned through MLP based on local cues (which models
interdependencies between points in a local region) and global contextual cues (which models
the relationship for one position with respect to all other positions). Then, the feature of a
non-leaf node is computed from its child nodes using MLP and aggregated by max pooling. For
classification, the above process is repeated until the root node is attained.
Following inferences are drawn for various Point Based Networks:
● Pointwise MLP networks are usually served as basic building blocks for other types of
networks to learn pointwise features.
● As a standard deep learning architecture, convolution-based networks can achieve
superior performance on irregular 3D point clouds. More attention should be paid to both
discrete and continuous convolution networks for irregular data.
● Due to its inherent strong capability to handle irregular data, graph-based networks have
attracted increasingly more attention in recent years. However, it is still challenging to
extend graph-based networks in the spectral domain to various graph structures.
● Most networks need to down-sample a point cloud into a fixed small size. This sampling
process discards details of the shape. Developing networks that can deal with large-scale
point clouds is still in its infancy.
3D Object Detection And Tracking
3D Object Detection
3D object detection aims to accurately locate all objects of interest in a given scene. They can
be further divided into :
● Region proposal-based methods
● Single shot methods
Region Proposal-based Methods
Several possible regions (also called proposals) which contain objects are first proposed.Then
region wise features are extracted to determine the category label of each proposal. They can
be further divided into:
3D Object Detection And Tracking
● Multi-view based: Fuse proposal wise features from different view maps
● Segmentation-based : These methods first leverage existing semantic segmentation
techniques to remove most background points, and then generate a large amount of
high-quality proposals on foreground points to save computation.
● Frustum-based methods: These methods first leverage existing 2D object detectors to
generate 2D candidate regions of objects and then extract a 3D frustum proposal for each
2D candidate region.
3D Object Detection And Tracking
Single Shot Methods
Class probabilities are predicted directly and regress 3D bounding boxes of objects using a
single-stage network.No post-processing or region proposal generation is needed. This makes
them the perfect candidate for real time applications and they can run at very high speed. They
can be further divided into:
● BEV-based : BEV representation is taken as the input.
● Point cloud-based: Point cloud is converted into regular representation after which CNN is
applied to predict both categories and 3D boxes of objects.
3D Scene Flow Estimation
Similar to optical flow estimation in 2D vision, several methods have started to learn useful information
for example 3D scene flow, spatial-temporary information from a sequence of point clouds.
Following observations are made for object detection and tracking:
● Region proposal-based methods are the most frequently investigated methods among these two
categories, and outperform single shot methods by a large margin on both KITTI test 3D and
BEV benchmarks.
● There are two limitations for existing 3D object detectors. First, the long-range detection
capability of existing methods is relatively poor. Second, how to fully exploit the texture
information in images is still an open problem.
● Multi-task learning is a future direction in 3D object detection. For example, MMF [99] learns a
crossmodality representation to achieve state-of-the-art detection performance by incorporating
multiple tasks.
● 3D object tracking and scene flow estimation are emerging research topics, and have gradually
attracted increasing attention since 2019.
3D POINT CLOUD SEGMENTATION
It requires the understanding of both global geometric structure and the fine-grained details of each
point.They can be further divided into:
● Semantic segmentation (scene level)
● Instance segmentation (object level) and
● Part segmentation (part level)
Semantic Segmentation
Point-based Networks
They work on irregular point clouds. Due to unstructured and orderless behaviour of point clouds, it is
not possible to directly apply CNNs. They can be further divided into pointwise MLP methods, point
convolution methods, RNNbased methods, and graph-based methods.
Pointwise MLP methods
For high efficiency, shared MLP is used as the basic unit in their network. To capture local geometric
patterns, these methods learn a feature for each point by aggregating the information from local
neighboring points.This is known as Neighboring feature pooling.
Semantic Segmentation
● Point Convolution Methods
They provide effective convolution operations for point clouds. Neighboring points are binned into
kernel cells and then convolved with kernel weights.
● RNN-based Methods
They capture inherent context features from point clouds. They are also used for semantic
segmentation of point clouds.
● Graph-based Methods
They capture underlying shapes and geometric structures of 3D point clouds.
Instance Segmentation
A more accurate and fine grained reasoning of points is needed, hence this is more challenging as
compared to semantic segmentation. It separates instances with the same semantic meaning in
addition to distinguishing points with different semantic meaning.It is further divided into
proposal-based methods and proposal-free methods.
● Proposal-based Methods
Instance segmentation is subdivided into 3D object detection and instance mask prediction. They are
straightforward and intuitive.But they require multi-stage training and pruning of redundant
proposals.Hence they are time taking and cost expensive computations.
● Proposal-free Methods
They don't have an object detection module. Instance segmentation is considered as a subsequent
clustering step after semantic segmentation.The existing methods assume that points belonging to the
same instance should have very similar features. Hence they focus mainly on discriminative feature
learning and point grouping.
Part Segmentation
It faces twofold difficulty. First, shape parts with the same semantic label have a large
geometric variation and ambiguity. Second, the method should be robust to noise and
sampling.
Conclusion
● Point-based networks are the most frequently investigated methods. However, point
representation naturally does not have the explicit neighbouring information, most existing
point-based methods have to resort the expensive neighbor searching mechanism.
● Learning from imbalanced data is still a challenging problem in point cloud segmentation.
● The majority of existing approaches work on small point clouds (e.g., 1m×1m with 4096
points). In practice, the point clouds acquired by depth sensors are usually immense and
large-scale. Therefore, it is desirable to further investigate the problem of efficient
segmentation of large-scale point clouds.
● A handful of works have started to learn spatio-temporal information from dynamic point
clouds. It is expected that the spatio-temporal information can help to improve the
performance of subsequent tasks such as 3D object recognition, segmentation, and
completion

Más contenido relacionado

La actualidad más candente

Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Yu Huang
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clusteringSOYEON KIM
 
[기초개념] Graph Convolutional Network (GCN)
[기초개념] Graph Convolutional Network (GCN)[기초개념] Graph Convolutional Network (GCN)
[기초개념] Graph Convolutional Network (GCN)Donghyeon Kim
 
Graph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptxGraph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptxNeo4j
 
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...Deep Learning JP
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesDataStax
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Databricks
 
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...Ulaş Bağcı
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep LearningYu Huang
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation岳華 杜
 
Computer Vision Structure from motion
Computer Vision Structure from motionComputer Vision Structure from motion
Computer Vision Structure from motionWael Badawy
 
Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Venkata Reddy Konasani
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...Deep Learning JP
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Seonho Park
 
Image processing9 segmentation(pointslinesedges)
Image processing9 segmentation(pointslinesedges)Image processing9 segmentation(pointslinesedges)
Image processing9 segmentation(pointslinesedges)John Williams
 
Final thesis presentation
Final thesis presentationFinal thesis presentation
Final thesis presentationPawan Singh
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudArithmer Inc.
 

La actualidad más candente (20)

Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling Deep learning for 3-D Scene Reconstruction and Modeling
Deep learning for 3-D Scene Reconstruction and Modeling
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
 
[기초개념] Graph Convolutional Network (GCN)
[기초개념] Graph Convolutional Network (GCN)[기초개념] Graph Convolutional Network (GCN)
[기초개념] Graph Convolutional Network (GCN)
 
Graph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptxGraph Data Modeling Best Practices(Eric_Monk).pptx
Graph Data Modeling Best Practices(Eric_Monk).pptx
 
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
Zipline: Airbnb’s Machine Learning Data Management Platform with Nikhil Simha...
 
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...
Lec7: Medical Image Segmentation (I) (Radiology Applications of Segmentation,...
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Computer Vision Structure from motion
Computer Vision Structure from motionComputer Vision Structure from motion
Computer Vision Structure from motion
 
Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
 
U-Net (1).pptx
U-Net (1).pptxU-Net (1).pptx
U-Net (1).pptx
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
Image processing9 segmentation(pointslinesedges)
Image processing9 segmentation(pointslinesedges)Image processing9 segmentation(pointslinesedges)
Image processing9 segmentation(pointslinesedges)
 
Final thesis presentation
Final thesis presentationFinal thesis presentation
Final thesis presentation
 
Weakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloudWeakly supervised semantic segmentation of 3D point cloud
Weakly supervised semantic segmentation of 3D point cloud
 

Similar a Deep learning for 3 d point clouds presentation

Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingPreferred Networks
 
3D Point Cloud Classification.pptx
3D Point Cloud Classification.pptx3D Point Cloud Classification.pptx
3D Point Cloud Classification.pptxsajalagarwal70
 
Best Techniques of Point cloud to 3D.pdf
Best Techniques of Point cloud to 3D.pdfBest Techniques of Point cloud to 3D.pdf
Best Techniques of Point cloud to 3D.pdfRvtcad
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous drivingYu Huang
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image VYu Huang
 
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic Visualization
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic VisualizationNon-Photorealistic Rendering of 3D Point Clouds for Cartographic Visualization
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic VisualizationMatthias Trapp
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving IIYu Huang
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdfmokamojah
 
fusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving Ifusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving IYu Huang
 
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4IRJET Journal
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)Yu Huang
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
Indoor scene understanding for autonomous agents
Indoor scene understanding for autonomous agentsIndoor scene understanding for autonomous agents
Indoor scene understanding for autonomous agentsVarun Bhaseen
 
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...Seiya Ito
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIYu Huang
 
A deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleA deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleIAESIJAI
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...IRJET Journal
 
What is point cloud annotation?
What is point cloud annotation?What is point cloud annotation?
What is point cloud annotation?Annotation Support
 

Similar a Deep learning for 3 d point clouds presentation (20)

Introduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable RenderingIntroduction to 3D Computer Vision and Differentiable Rendering
Introduction to 3D Computer Vision and Differentiable Rendering
 
3D Point Cloud Classification.pptx
3D Point Cloud Classification.pptx3D Point Cloud Classification.pptx
3D Point Cloud Classification.pptx
 
Best Techniques of Point cloud to 3D.pdf
Best Techniques of Point cloud to 3D.pdfBest Techniques of Point cloud to 3D.pdf
Best Techniques of Point cloud to 3D.pdf
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V
 
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic Visualization
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic VisualizationNon-Photorealistic Rendering of 3D Point Clouds for Cartographic Visualization
Non-Photorealistic Rendering of 3D Point Clouds for Cartographic Visualization
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
 
fusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving Ifusion of Camera and lidar for autonomous driving I
fusion of Camera and lidar for autonomous driving I
 
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
Real-time 3D Object Detection on LIDAR Point Cloud using Complex- YOLO V4
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Indoor scene understanding for autonomous agents
Indoor scene understanding for autonomous agentsIndoor scene understanding for autonomous agents
Indoor scene understanding for autonomous agents
 
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
[3D勉強会@関東] Deep Reinforcement Learning of Volume-guided Progressive View Inpa...
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
fusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving IIfusion of Camera and lidar for autonomous driving II
fusion of Camera and lidar for autonomous driving II
 
A deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicleA deep learning based stereo matching model for autonomous vehicle
A deep learning based stereo matching model for autonomous vehicle
 
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
Locate, Size and Count: Accurately Resolving People in Dense Crowds via Detec...
 
What is point cloud annotation?
What is point cloud annotation?What is point cloud annotation?
What is point cloud annotation?
 
Mmpaper draft10
Mmpaper draft10Mmpaper draft10
Mmpaper draft10
 

Último

Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 

Último (20)

Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 

Deep learning for 3 d point clouds presentation

  • 1. Deep Learning for 3D Point Clouds Vijaylaxmi Nagurkar
  • 2. Introduction This article presents a summary of an in-depth study on Point Cloud Learning. Point cloud in AI has gained a lot of attention in recent years due to its contribution in evolving fields like autonomous driving, computer vision etc.Deep learning has already mastered solving problems related to 2D vision. But when it comes to 3D Point Clouds, deep learning techniques are still ina naive stage as they face a lot of unique challenges. This study covers the various methodologies to address such unique challenges. The major tasks which will be explained are: 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. With numerous publicly available datasets like ModelNet, ShapeNet , ScanNet, Semantic3D, and the KITTI Vision Benchmark Suite. With these datasets, there are a lot of challenges that may arise due to different methods applied in these datasets. Hence the research of deep learning on 3D Point Clouds has gained further velocity to address these challenges.
  • 3. The following figure shows a taxonomy of existing deep learning methods for 3D point clouds.
  • 4. 3D Shape Classification Under this, methods first learn the embedding of individual points , then a global shape embedding is extracted from the whole point cloud using an aggregation method.After which several fully connected layers are connected to achieve classification. They are further divided into projection-based networks and point-based networks. Projection-based methods first project an unstructured point cloud into an intermediate regular representation, and then leverage the well-established 2D or 3D convolution to achieve shape classification. In contrast, point-based methods directly work on raw point clouds without any voxelization or projection. Point-based methods do not introduce explicit information loss and becoming increasingly popular.
  • 5. Projection-based Networks These methods project 3D point clouds into following modalities : ● Multi-View representation ● Volumetric representation Multi-view representation 3D object is first projected into multiple views, corresponding view-wise features are extracted , which are finally fused for accurate object detection. The major challenge is to aggregate multiple view-wise features into a discriminative global representation. Many methods have been proposed to overcome this challenge and to improve recognition accuracy. Volumetric representation Early methods usually apply 3D Convolution Neural Network (CNN) build upon the volumetric representation of 3D point clouds.
  • 7. Point-based Networks It emphasizes on feature learning of each point. They are further divided into: ● Pointwise MLP Networks ● Convolution-based Networks ● Graph-based Networks ● Data Indexing-based Networks ● Other Networks ● Pointwise MLP Networks Multi Layer Perceptrons are used to model each point independently, after which a symmetric function is used to aggregate a global feature. ● Convolution-based Networks As 3D Point Clouds are irregular, it is difficult to design convolution kernels for them.They can be further divided into continuous convolution networks and discrete convolution networks.
  • 9. Point-based Networks continued .. ● 3D Continuous Convolution Networks Convolution kernels are defined on a continuous space, and the weights for the neighbouring points are related to spatial distribution w.r.t centre point. ● 3D Discrete Convolution Networks Convolutional kernels are defined on regular grids, where the weights for neighboring points are related to the offsets with respect to the center point.
  • 10. Graph based Networks Each point in a point cloud is considered as the vertex of a graph after which directed edges for the graph based on the neighbors of each point are generated. Feature learning is then performed in spatial or spectral domains. These can be further divided into: ● Graph-based Methods in Spatial Domain. Operations like Convolution and Pooling are defined in spatial domain.convolution is implemented through MLP over spatial neighbors, pooling produces a new coarsened graph by aggregating information from each point’s neighbors. ● Graph-based Methods in Spectral Domain Convolutions are defined as spectral filtering. It is implemented as the multiplication of signals on graphs with eigenvectors of the graph Laplacian matrix.
  • 11. Data Indexing based Networks Based on different data indexing structures like octree and kd-tree, these networks are formed. Point features are learned hierarchically from leaf nodes to the root node along a tree.At each level, point features are first learned through MLP based on local cues (which models interdependencies between points in a local region) and global contextual cues (which models the relationship for one position with respect to all other positions). Then, the feature of a non-leaf node is computed from its child nodes using MLP and aggregated by max pooling. For classification, the above process is repeated until the root node is attained.
  • 12. Following inferences are drawn for various Point Based Networks: ● Pointwise MLP networks are usually served as basic building blocks for other types of networks to learn pointwise features. ● As a standard deep learning architecture, convolution-based networks can achieve superior performance on irregular 3D point clouds. More attention should be paid to both discrete and continuous convolution networks for irregular data. ● Due to its inherent strong capability to handle irregular data, graph-based networks have attracted increasingly more attention in recent years. However, it is still challenging to extend graph-based networks in the spectral domain to various graph structures. ● Most networks need to down-sample a point cloud into a fixed small size. This sampling process discards details of the shape. Developing networks that can deal with large-scale point clouds is still in its infancy.
  • 13. 3D Object Detection And Tracking 3D Object Detection 3D object detection aims to accurately locate all objects of interest in a given scene. They can be further divided into : ● Region proposal-based methods ● Single shot methods Region Proposal-based Methods Several possible regions (also called proposals) which contain objects are first proposed.Then region wise features are extracted to determine the category label of each proposal. They can be further divided into:
  • 14. 3D Object Detection And Tracking ● Multi-view based: Fuse proposal wise features from different view maps ● Segmentation-based : These methods first leverage existing semantic segmentation techniques to remove most background points, and then generate a large amount of high-quality proposals on foreground points to save computation. ● Frustum-based methods: These methods first leverage existing 2D object detectors to generate 2D candidate regions of objects and then extract a 3D frustum proposal for each 2D candidate region.
  • 15. 3D Object Detection And Tracking Single Shot Methods Class probabilities are predicted directly and regress 3D bounding boxes of objects using a single-stage network.No post-processing or region proposal generation is needed. This makes them the perfect candidate for real time applications and they can run at very high speed. They can be further divided into: ● BEV-based : BEV representation is taken as the input. ● Point cloud-based: Point cloud is converted into regular representation after which CNN is applied to predict both categories and 3D boxes of objects.
  • 16. 3D Scene Flow Estimation Similar to optical flow estimation in 2D vision, several methods have started to learn useful information for example 3D scene flow, spatial-temporary information from a sequence of point clouds. Following observations are made for object detection and tracking: ● Region proposal-based methods are the most frequently investigated methods among these two categories, and outperform single shot methods by a large margin on both KITTI test 3D and BEV benchmarks. ● There are two limitations for existing 3D object detectors. First, the long-range detection capability of existing methods is relatively poor. Second, how to fully exploit the texture information in images is still an open problem. ● Multi-task learning is a future direction in 3D object detection. For example, MMF [99] learns a crossmodality representation to achieve state-of-the-art detection performance by incorporating multiple tasks. ● 3D object tracking and scene flow estimation are emerging research topics, and have gradually attracted increasing attention since 2019.
  • 17. 3D POINT CLOUD SEGMENTATION It requires the understanding of both global geometric structure and the fine-grained details of each point.They can be further divided into: ● Semantic segmentation (scene level) ● Instance segmentation (object level) and ● Part segmentation (part level)
  • 18. Semantic Segmentation Point-based Networks They work on irregular point clouds. Due to unstructured and orderless behaviour of point clouds, it is not possible to directly apply CNNs. They can be further divided into pointwise MLP methods, point convolution methods, RNNbased methods, and graph-based methods. Pointwise MLP methods For high efficiency, shared MLP is used as the basic unit in their network. To capture local geometric patterns, these methods learn a feature for each point by aggregating the information from local neighboring points.This is known as Neighboring feature pooling.
  • 19. Semantic Segmentation ● Point Convolution Methods They provide effective convolution operations for point clouds. Neighboring points are binned into kernel cells and then convolved with kernel weights. ● RNN-based Methods They capture inherent context features from point clouds. They are also used for semantic segmentation of point clouds. ● Graph-based Methods They capture underlying shapes and geometric structures of 3D point clouds.
  • 20. Instance Segmentation A more accurate and fine grained reasoning of points is needed, hence this is more challenging as compared to semantic segmentation. It separates instances with the same semantic meaning in addition to distinguishing points with different semantic meaning.It is further divided into proposal-based methods and proposal-free methods. ● Proposal-based Methods Instance segmentation is subdivided into 3D object detection and instance mask prediction. They are straightforward and intuitive.But they require multi-stage training and pruning of redundant proposals.Hence they are time taking and cost expensive computations. ● Proposal-free Methods They don't have an object detection module. Instance segmentation is considered as a subsequent clustering step after semantic segmentation.The existing methods assume that points belonging to the same instance should have very similar features. Hence they focus mainly on discriminative feature learning and point grouping.
  • 21. Part Segmentation It faces twofold difficulty. First, shape parts with the same semantic label have a large geometric variation and ambiguity. Second, the method should be robust to noise and sampling.
  • 22. Conclusion ● Point-based networks are the most frequently investigated methods. However, point representation naturally does not have the explicit neighbouring information, most existing point-based methods have to resort the expensive neighbor searching mechanism. ● Learning from imbalanced data is still a challenging problem in point cloud segmentation. ● The majority of existing approaches work on small point clouds (e.g., 1m×1m with 4096 points). In practice, the point clouds acquired by depth sensors are usually immense and large-scale. Therefore, it is desirable to further investigate the problem of efficient segmentation of large-scale point clouds. ● A handful of works have started to learn spatio-temporal information from dynamic point clouds. It is expected that the spatio-temporal information can help to improve the performance of subsequent tasks such as 3D object recognition, segmentation, and completion