DensePose: Dense Human Pose Estimation In The Wild
Presented by Guangrui Li
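Goals
• 1. Using a novel annotation strategy, build a large-scale dataset of correspondences between RGB images and a surface-based representation of the human body;
• 2. Using this dataset, run experiments with both an FCN and a region-based system; the experiments show that the latter performs better;
• 3. Try several ways of exploiting the dataset and propose the most effective one.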
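COCO-DensePose Dataset
task 1: segment the body into several parts, such as the head and torso
task 2: annotate points on each part; the point-annotation strategy varies with the complexity of the part. For details, see the SMPL model cited by the paper.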
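Novel annotation method
• Annotation accuracy:
• Because the images can be rendered from a model, a subset of the annotated points is compared directly against the ground-truth coordinates on the rendered model by computing the geodesic distance.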
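The geodesic distance used for this check can be sketched as a shortest path over the edges of the surface mesh. This is a minimal sketch, not the authors' implementation: the helper names `edge_graph` and `geodesic_distance` and the toy mesh are hypothetical, and an edge-path length only approximates (and can overestimate) the true geodesic on the surface.

```python
import heapq
import math


def edge_graph(vertices, faces):
    """Build {vertex: {neighbour: edge_length}} from a triangle mesh."""
    graph = {i: {} for i in range(len(vertices))}
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            length = math.dist(vertices[u], vertices[v])
            graph[u][v] = length
            graph[v][u] = length
    return graph


def geodesic_distance(graph, source, target):
    """Length of the shortest path along mesh edges (Dijkstra)."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, length in graph[node].items():
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf  # target not reachable from source


# Toy mesh (assumption): a unit square split into two triangles along 0-2.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]
graph = edge_graph(vertices, faces)

d_02 = geodesic_distance(graph, 0, 2)  # follows the diagonal edge
d_13 = geodesic_distance(graph, 1, 3)  # no direct edge, goes around a corner
```

On this toy mesh the path from vertex 1 to vertex 3 has length 2, while the true surface distance is the square's diagonal; a finer mesh narrows that gap.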
Evaluation Method
• 1. Pointwise
the Ratio of Correct Point (RCP) correspondences, where a correspondence is declared correct if the geodesic distance is below a certain threshold
• 2. Per-instance
geodesic point similarity (GPS)
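Both measures can be computed from a list of per-point geodesic errors. The formula for the per-instance measure did not survive on this slide, so the sketch below assumes a Gaussian-kernel form: the mean over a person's annotated points of exp(-g^2 / (2 * kappa^2)), where g is the geodesic error. The function names, the sample errors, the threshold and the value of `kappa` are all illustrative assumptions.

```python
import math


def ratio_of_correct_points(geodesic_errors, threshold):
    """Fraction of correspondences whose geodesic error is below threshold."""
    if not geodesic_errors:
        return 0.0
    correct = sum(1 for g in geodesic_errors if g < threshold)
    return correct / len(geodesic_errors)


def geodesic_point_similarity(geodesic_errors, kappa):
    """Mean Gaussian-kernel score over one instance (assumed form of GPS)."""
    if not geodesic_errors:
        return 0.0
    scores = [math.exp(-(g * g) / (2.0 * kappa * kappa)) for g in geodesic_errors]
    return sum(scores) / len(scores)


# Hypothetical geodesic errors for the annotated points of one person.
errors = [0.0, 0.05, 0.2, 0.5]
rcp = ratio_of_correct_points(errors, threshold=0.1)
gps = geodesic_point_similarity(errors, kappa=0.25)
```

RCP gives a hard pass/fail per point, whereas the kernel score degrades smoothly with the error, which makes it usable as a per-person similarity in the way keypoint similarity is used for sparse pose.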
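Learning Dense Human Pose Estimation
Fully-convolutional dense pose regression
1. classifier: classifies each pixel as belonging to a body part [cross-entropy loss]
2. regressor: localizes the point coordinates within that part [smooth L1 loss]
Drawback: with a single network carrying this many tasks (one classifier and 24 regressors), scale invariance is hard to maintain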
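The two loss terms can be sketched for a single annotated pixel as follows. This is a sketch under stated assumptions rather than the paper's training code: class index 0 is taken to be background, the 24 regressors are indexed by part, only the regressor of the ground-truth part is penalised, and the two terms are summed without weights.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one pixel, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def smooth_l1(pred, target, beta=1.0):
    """Quadratic near zero, linear for large errors."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def pixel_loss(part_logits, uv_per_part, gt_part, gt_uv):
    """Classification loss plus regression loss of the ground-truth part."""
    loss = cross_entropy(part_logits, gt_part)
    if gt_part == 0:  # assumption: index 0 is background, no coordinates
        return loss
    u, v = uv_per_part[gt_part - 1]  # one (u, v) regressor per part
    return loss + smooth_l1(u, gt_uv[0]) + smooth_l1(v, gt_uv[1])


# Hypothetical outputs for one pixel: 25 logits (background + 24 parts)
# and 24 (u, v) predictions, one per part.
part_logits = [0.0] * 25
uv_per_part = [(0.5, 0.5)] * 24
loss = pixel_loss(part_logits, uv_per_part, gt_part=3, gt_uv=(0.5, 0.7))
```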
Learning Dense Human Pose Estimation
• Region-based system
"We use a cascade of region proposal generation and feature pooling, followed by a fully-convolutional network that densely predicts discrete part labels and continuous surface coordinates."
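The feature-pooling step is what lets the prediction head see every person at the same resolution, regardless of box size. The sketch below illustrates this with nearest-neighbour sampling on a plain 2-D grid; it is a deliberate simplification (real systems pool with interpolation over multi-channel features), and `pool_region` and the toy feature map are hypothetical.

```python
def pool_region(feature_map, box, out_size):
    """Resample the box (x0, y0, x1, y1), end-exclusive, to out_size x out_size."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    pooled = []
    for i in range(out_size):
        # Sample at the centre of each output cell, clamped to the box.
        src_y = y0 + min(height - 1, int((i + 0.5) * height / out_size))
        row = []
        for j in range(out_size):
            src_x = x0 + min(width - 1, int((j + 0.5) * width / out_size))
            row.append(feature_map[src_y][src_x])
        pooled.append(row)
    return pooled


# Toy 8x8 feature map whose value encodes its position as 10 * y + x.
feature_map = [[10 * y + x for x in range(8)] for y in range(8)]

small_person = pool_region(feature_map, (2, 2, 4, 4), out_size=2)
large_person = pool_region(feature_map, (0, 0, 8, 8), out_size=2)
```

Both calls return a 2x2 grid, so the downstream fully-convolutional head receives a fixed-size input whether the proposal covers 2x2 or 8x8 cells.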
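Distillation-based ground-truth interpolation
• For each training sample, annotations are provided for only a subset of the points.
• In addition, the idea of distillation is used:
• First, train a teacher net whose goal is to reconstruct the points for which no annotation was provided;
• Then train this network together with the student net, which ultimately gives better results.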
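One way to read the teacher/student steps is that the teacher's predictions fill in the unannotated points, giving the student a dense training target. The sketch below shows only that merging step, under assumptions: the teacher is represented by a precomputed grid of hypothetical predictions, human annotations take priority wherever they exist, and background pixels stay unsupervised. The name `densify_targets` and the `(part, u, v)` tuple layout are illustrative.

```python
def densify_targets(sparse_gt, teacher_pred, foreground):
    """Merge sparse annotations with teacher predictions into dense targets.

    Each cell of sparse_gt is a (part, u, v) tuple or None if unannotated.
    Cells outside the person (foreground is False) remain None.
    """
    dense = []
    for gt_row, pred_row, fg_row in zip(sparse_gt, teacher_pred, foreground):
        row = []
        for gt, pred, is_person in zip(gt_row, pred_row, fg_row):
            if gt is not None:
                row.append(gt)      # keep the human annotation
            elif is_person:
                row.append(pred)    # fall back to the teacher's estimate
            else:
                row.append(None)    # background: no supervision
        dense.append(row)
    return dense


# Hypothetical 2x3 crop: one annotated point, four person pixels.
sparse_gt = [[(1, 0.2, 0.3), None, None],
             [None, None, None]]
teacher_pred = [[(1, 0.25, 0.3), (1, 0.4, 0.3), (2, 0.1, 0.1)],
                [(1, 0.2, 0.5), (1, 0.4, 0.5), (2, 0.1, 0.4)]]
foreground = [[True, True, False],
              [True, True, False]]

dense_targets = densify_targets(sparse_gt, teacher_pred, foreground)
```

The student would then be trained on `dense_targets` instead of the single annotated point, which is where the denser supervision comes from.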

