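Korean Society of Health Informatics and Statistics, Fall Conference 2013

Analytic Visualization of “Big” Data
Analytic Data Visualization
Myung-Hoe Huh

2013.11.29

Department of Statistics, Korea University stat420@korea.ac.kr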


Health Info & Stat
Data Visualization
- Descriptive vs Analytic ...
- Small vs Big ...

[Diagram: science, technology, art]

Contents
- Scatterplot
- Biplot
- Regression Biplot
- Kernel PCA
- SVM Biplot

Scatterplot
- The “Lego” (basic building block) of analytic data visualization
- Can reflect a third variable

quakes data: longitude (= x), latitude (= y), depth (= z)

Scatterplot
- For large $n$ (≧ ...), over-plotting can seriously obscure the structure of the data.

Skin Segmentation Data: red vs. green
      

Scatterplot
- For large $n$ (≧ ...), the alpha channel (point transparency) can be used to reveal point density.

Skin Segmentation Data: red vs. green
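
Why transparency helps with over-plotting can be shown with a little arithmetic. The sketch below assumes standard "over" compositing of identical points on an empty background; the function names `stacked_opacity` and `alpha_for_saturation` are illustrative and do not come from the slides.

```python
def stacked_opacity(alpha: float, k: int) -> float:
    """Opacity seen where k points of equal alpha overlap (standard 'over' compositing)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - alpha) ** k


def alpha_for_saturation(k: int, target: float = 0.95) -> float:
    """Alpha at which k overlapping points reach the target opacity."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - target) ** (1.0 / k)


# With alpha = 0.05 a single point is faint, while dense regions approach full opacity.
for k in (1, 10, 100):
    print(k, round(stacked_opacity(0.05, k), 3))
```

A smaller alpha keeps dense regions from saturating, so the darkness of a region stays informative about how many points it contains.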
      

Scatterplot
- lowess: a nonparametric regression smoother for bivariate data

cars data: distance vs. speed
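
The idea behind lowess can be sketched as a local linear fit at each point. This is a simplified version under stated assumptions: tricube weights, a nearest-neighbour bandwidth and no robustness iterations, so it is not the full lowess algorithm. The data are synthetic stand-ins, not the cars data, and `lowess_simple` is a hypothetical name.

```python
import numpy as np


def lowess_simple(x, y, frac=2 / 3):
    """Locally weighted linear fit at each x[i] with tricube weights (no robustness steps)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = min(n, max(2, int(np.ceil(frac * n))))   # neighbours used in each local fit
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        h = np.sort(d)[k - 1]                    # bandwidth: distance to the k-th nearest point
        if h > 0:
            w = (1.0 - np.clip(d / h, 0.0, 1.0) ** 3) ** 3   # tricube weights
        else:
            w = (d == 0).astype(float)           # all neighbours tie with x[i]
        sw = np.sqrt(w)
        design = np.column_stack([np.ones(n), x - x[i]])
        coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        fitted[i] = coef[0]                      # local intercept is the fitted value at x[i]
    return fitted


# Synthetic bivariate data (illustrative only).
rng = np.random.default_rng(0)
x_demo = np.sort(rng.uniform(4, 25, size=50))
y_demo = 0.1 * x_demo ** 2 + rng.normal(scale=5, size=50)
smooth = lowess_simple(x_demo, y_demo)
```

Plotting `smooth` against `x_demo` on top of the scatterplot gives the smoothed trend; a smaller `frac` follows the data more closely.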

Scatterplot
- 3D rotation for three variables

Skin Segmentation Data: red, green, blue

- ggobi: 3D rotation for four or more variables

Biplot of Observations and Variables, Gabriel (1971)

- The biplot is a graph that shows $n$ observations and $p$ variables simultaneously.

Protein data (rows: 25 nations, columns: 9 protein sources)

Biplot of Observations and Variables, Gabriel (1971)

- Idea: Linear projection

Protein data: variable cereal
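
The linear-projection idea can be sketched with a singular value decomposition. This is a minimal illustration assuming one common choice of markers (principal component scores for observations, loadings for variables); Gabriel's biplot admits other factorizations. The function name `biplot_coordinates` and the synthetic 25 × 9 matrix are assumptions, not the protein data.

```python
import numpy as np


def biplot_coordinates(X, scale=True):
    """Return 2-D row markers G, column markers H, and the centred (scaled) matrix Z."""
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)
    if scale:
        Z = Z / Z.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    G = U[:, :2] * s[:2]      # observations: first two principal component scores
    H = Vt[:2].T              # variables: first two loading vectors (arrow tips)
    return G, H, Z


# Synthetic 25 x 9 matrix standing in for a nations-by-sources table.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(25, 9))
G, H, Z = biplot_coordinates(X_demo)
approx = G @ H.T              # rank-2 approximation of Z read off the biplot
```

The inner product of observation marker `G[i]` with variable marker `H[j]` approximates `Z[i, j]`, which is the same as projecting the observation point onto the variable's arrow and scaling by the arrow length.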

Regression Biplot, Huh and Lee (2013)

- A regression biplot is a graph of $n$ observations on $x_1, \ldots, x_p$, arranged by predicted $y$.
- Assume that the model fit is determined by a function of a linear combination of $x_1, \ldots, x_p$. For instance,
  $$\hat{y} = b_0 + b_1 x_1 + \cdots + b_p x_p,$$
  or
  $$\log \frac{\hat{\pi}}{1 - \hat{\pi}} = b_0 + b_1 x_1 + \cdots + b_p x_p.$$
- Set the vertical dimension by the direction of the regression coefficients,
  $$b = (b_1, \ldots, b_p)', \quad \text{or} \quad b / \lVert b \rVert.$$
- Set the horizontal dimension by the direction of the principal axis of
  $$\mathbf{x}_1^{\perp}, \ldots, \mathbf{x}_n^{\perp},$$
  where $\mathbf{x}_i^{\perp}$ denotes the orthogonal component generated from the projection of the $i$-th observation $\mathbf{x}_i$ on $b$.
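
The two axes described above can be sketched for the linear regression case as follows. This is an illustration under assumptions: ordinary least squares on centred predictors, no standardization, and synthetic data. The function name `regression_biplot_axes` is hypothetical and is not taken from Huh and Lee (2013).

```python
import numpy as np


def regression_biplot_axes(X, y):
    """Vertical axis: regression direction. Horizontal axis: principal axis of the remainder."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    design = np.column_stack([np.ones(len(y)), Xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    b = coef[1:]
    u = b / np.linalg.norm(b)                 # unit vector along the regression coefficients
    vertical = Xc @ u                         # ordering agrees with the predicted y
    X_perp = Xc - np.outer(vertical, u)       # part of each observation orthogonal to u
    _, _, Vt = np.linalg.svd(X_perp, full_matrices=False)
    h = Vt[0]                                 # principal axis of the orthogonal components
    horizontal = X_perp @ h
    return horizontal, vertical, u, h


# Synthetic data (illustrative only).
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(60, 4))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=60)
horizontal, vertical, u, h = regression_biplot_axes(X_demo, y_demo)
```

Observation $i$ is plotted at `(horizontal[i], vertical[i])`, and variable $j$ can be drawn as an arrow from the origin to `(h[j], u[j])`.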

Regression Biplot, Huh and Lee (2013)

Example 1. Stack Loss Data (...; $y$ = loss of ammonia, ...)

Regression Biplot, Huh and Lee (2013)

Example 2. Magazine Data (...; $y$ = subscription (0, 1), ...)

Kernel PCA, Schölkopf et al. (1998)

- For $n$ observations $x_1, \ldots, x_n$ ($p \times 1$), consider the nonlinear mapping
  $$\Phi(x_1), \ldots, \Phi(x_n)$$
  to a Hilbert space, in which $\langle \Phi(x_i), \Phi(x_j) \rangle = k(x_i, x_j)$.
- Denoting $K = \{ k(x_i, x_j) \}$, kernel PCA is obtained by eigen-decomposing
  $$\tilde{K} = \left(I - \tfrac{1}{n} 1 1'\right) K \left(I - \tfrac{1}{n} 1 1'\right).$$
- Kernel PCA yields a plot of observations by projecting $\tilde{\Phi}(x_1), \ldots, \tilde{\Phi}(x_n)$ on
  $$v = \sum_{i=1}^{n} a_i \tilde{\Phi}(x_i),$$
  where $\tilde{\Phi}(x_i) = \Phi(x_i) - \frac{1}{n} \sum_{l=1}^{n} \Phi(x_l)$ and $a = (a_1, \ldots, a_n)'$ is an eigenvector of $\tilde{K}$.



Kernel PCA Diagram (or Kernel Biplot), Huh (2013)

- Aim: Representation of $p$ variables in the kernel PC plot of observations.
- Proposed procedure:

1) For each $j = 1, \ldots, p$, map $x_i + c\, e_j$ on the plane, $i = 1, \ldots, n$, where $c$ is a constant and $e_j = (0, \ldots, 1, \ldots, 0)'$. For a point $x$, the projection on a kernel principal axis with coefficient vector $a$ (an eigenvector of the centered kernel matrix) is given by
   $$a' \left(I - \tfrac{1}{n} 1 1'\right) \left(k(x) - \tfrac{1}{n} K 1\right), \qquad k(x) = (k(x_1, x), \ldots, k(x_n, x))',$$
   where $K = \{ k(x_i, x_{i'}) \}$ is the kernel matrix of the observations.

2) For each $i$, link the projection points of $x_i$ and $x_i + c\, e_j$ by an arrow.
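
The two-step procedure can be sketched as below: fit kernel PCA, then project each observation and its shifted copy with the usual out-of-sample formula. This is a sketch under assumptions, not the author's code: the RBF form $\exp(-\sigma \lVert x - x' \rVert^2)$, the names `fit_kernel_pca`, `project` and `arrow_diagram`, and the synthetic data are all illustrative.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Assumed RBF form: k(a, b) = exp(-sigma * ||a - b||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def fit_kernel_pca(X, sigma, n_components=2):
    n = len(X)
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - 1.0 / n                          # centring matrix I - 11'/n
    vals, vecs = np.linalg.eigh(J @ K @ J)
    order = np.argsort(vals)[::-1][:n_components]
    coef = vecs[:, order] / np.sqrt(vals[order])     # unit-norm principal axes
    return K, J, coef


def project(X_train, K, J, coef, X_new, sigma):
    """Out-of-sample projection: coef' (I - 11'/n) (k(x) - K1/n) for each new x."""
    k_new = rbf_kernel(X_train, X_new, sigma)        # n x m
    centred = J @ (k_new - K.mean(axis=1, keepdims=True))
    return centred.T @ coef                          # m x n_components


def arrow_diagram(X, j, c, sigma=0.1):
    """Arrow start and end points for variable j with perturbation size c."""
    X = np.asarray(X, dtype=float)
    K, J, coef = fit_kernel_pca(X, sigma)
    start = project(X, K, J, coef, X, sigma)         # step 1: the observations themselves
    shifted = X.copy()
    shifted[:, j] += c                               # x_i + c * e_j
    end = project(X, K, J, coef, shifted, sigma)
    return start, end                                # step 2: draw an arrow start[i] -> end[i]


rng = np.random.default_rng(4)
X_demo = rng.normal(size=(20, 3))
start, end = arrow_diagram(X_demo, j=1, c=0.5)
```

Because the kernel map is nonlinear, the arrows for one variable generally differ in length and direction across observations, unlike the single arrow per variable in a linear biplot.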
Example 1. Arrow diagrams [...] for kernel PCA of the iris data with RBF kernel, ...

Example 2. Arrow diagrams [...] for kernel PCA of the spam data [...]

SVM-Guided Biplot as an extension of Regression Biplot
- Idea: Combine the linear/logistic regression biplot and kernel PCA.
- Classification/Regression part:
  Observations are classified as $y_i = -1$ or $1$ for $i = 1, \ldots, n$. The SVM classifier is
  $$f(x) = \sum_{i=1}^{n} \alpha_i y_i k(x_i, x) + b_0,$$
  where $\alpha_i \geq 0$.
  The vertical dimension is set to
  $$w = \frac{\sum_{i} \alpha_i y_i \Phi(x_i)}{\lVert \sum_{i} \alpha_i y_i \Phi(x_i) \rVert}, \qquad \Big\lVert \sum_{i} \alpha_i y_i \Phi(x_i) \Big\rVert^2 = \sum_{i} \sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} k(x_i, x_{i'}).$$

SVM-Guided Biplot: Classification
- Kernel PCA part: with $w$ the unit-norm vertical direction in the feature space,
  $$\Phi^{\perp}(x_i) = \Phi(x_i) - (w' \Phi(x_i))\, w, \quad i = 1, \ldots, n.$$
  $$\therefore \ \langle \Phi^{\perp}(x_i), \Phi^{\perp}(x_{i'}) \rangle = \langle \Phi(x_i), \Phi(x_{i'}) \rangle - (w' \Phi(x_i)) (w' \Phi(x_{i'})) = k(x_i, x_{i'}) - g_i g_{i'}, \quad i, i' = 1, \ldots, n,$$
  where $g_i = w' \Phi(x_i)$.
  Hence
  $$K \to K^{\perp} = K - g g' \quad (g = (g_1, \ldots, g_n)').$$
  The horizontal dimension is determined by eigen-decomposing $K^{\perp}$.

- Perturbation scheme for arrow diagrams:
  Define $x_i + c\, e_j$ ($p \times 1$), where $e_j$ represents a perturbation of the $j$-th variable whose magnitude is controlled by $c$. Then project $\Phi(x_i + c\, e_j)$ on the first (vertical) and the second (horizontal) dimensions.
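
The kernel-deflation step can be checked numerically. The sketch below does not fit an SVM: the vector `beta` (standing for the products $\alpha_i y_i$ of a fitted classifier) is filled with placeholder values, and the RBF form, the function name `svm_guided_axes` and the omission of centering are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, sigma):
    """Assumed RBF form: k(a, b) = exp(-sigma * ||a - b||^2)."""
    sq = (X ** 2).sum(axis=1)[:, None] + (X ** 2).sum(axis=1)[None, :] - 2.0 * X @ X.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def svm_guided_axes(K, beta):
    """K: n x n kernel matrix; beta_i = alpha_i * y_i, supplied by the caller."""
    norm_w = np.sqrt(beta @ K @ beta)            # norm of sum_i beta_i Phi(x_i)
    g = K @ beta / norm_w                        # vertical coordinates g_i = w' Phi(x_i)
    K_perp = K - np.outer(g, g)                  # Gram matrix of the components orthogonal to w
    vals, vecs = np.linalg.eigh(K_perp)          # eigenvalues in ascending order
    horizontal = np.sqrt(vals[-1]) * vecs[:, -1] # scores on the leading axis of K_perp
    return horizontal, g, K_perp


rng = np.random.default_rng(5)
X_demo = rng.normal(size=(25, 3))
K_demo = rbf_kernel(X_demo, sigma=0.1)
beta_demo = rng.normal(size=25)                  # placeholder for alpha_i * y_i (hypothetical values)
horizontal, vertical, K_perp = svm_guided_axes(K_demo, beta_demo)
```

Since $K^{\perp} \beta = 0$, the direction $w$ carries no variation after deflation, so the horizontal axis describes only what the classifier direction leaves unexplained.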

Example 1. Iris Data: Versicolor vs. Virginica [sigma = 0.1, C = 1, ...]

Importance of Variables (in the case of large $p$)

- It is necessary to select a small number of variables in determining the first and second dimensions.
- Measures of importance (definition): length of arrows
1) in the vertical direction,
2) in the horizontal direction.
- Plot arrow diagrams for important variables only.
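
One way to turn arrow lengths into a variable ranking is sketched below. Averaging the absolute arrow components over observations is an assumption made for illustration (the slide only says "length of arrows"), and the names `arrow_importance` and `top_variables` are hypothetical.

```python
import numpy as np


def arrow_importance(start, end):
    """start, end: arrays of shape (p, n, 2); last axis = (horizontal, vertical) coordinates."""
    diff = np.abs(np.asarray(end, dtype=float) - np.asarray(start, dtype=float))
    length = diff.mean(axis=1)                   # mean arrow component per variable, shape (p, 2)
    return {"horizontal": length[:, 0], "vertical": length[:, 1]}


def top_variables(importance, k):
    """Indices of the k variables with the largest importance, largest first."""
    return np.argsort(importance)[::-1][:k]


# Toy arrows for p = 3 variables and n = 4 observations (made-up numbers).
start_demo = np.zeros((3, 4, 2))
end_demo = np.zeros((3, 4, 2))
end_demo[:, :, 0] = np.array([0.1, 0.5, 0.2])[:, None]   # horizontal components
end_demo[:, :, 1] = np.array([0.9, 0.0, 0.3])[:, None]   # vertical components
importance = arrow_importance(start_demo, end_demo)
keep_vertical = top_variables(importance["vertical"], k=1)
keep_horizontal = top_variables(importance["horizontal"], k=2)
```

Only the variables returned by `top_variables` would then have their arrow diagrams drawn.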

Example 2. Spam Data [sigma = 0.1, C = 10, ...], ...

SVM-Guided Biplot: Regression
- The same method can be applied to SVM regression.
- Example 3. Aerobic Fitness [...] for oxygen uptake (= $y$) with RBF kernel (sigma = 0.1, C = 10, epsilon = 0.1, ...)

Concluding Remarks
- The biplot method can be extended to suit linear regression or classification (logistic regression).
- The biplot method can be extended to allow nonlinear mapping of observations and variables by fully utilizing the kernel trick.

http://blog.naver.com/huh4200

Goldfish Bowl (on the iPad)

References
Gabriel, K.R. (1971). “The biplot graphic display of matrices with application to principal component analysis”. Biometrika, 58, 453-467.
Huh, M.H. (2013). “Arrow diagrams for kernel principal component analysis”. Communications for Statistical Applications and Methods, 20, 175-184.
Huh, M.H. (2013). “SVM-guided biplot of observations and variables”. Communications for Statistical Applications and Methods. (to appear)
Huh, M.H. and Lee, Y.G. (2013). “Biplots of multivariate data guided by linear and/or logistic regression”. Communications for Statistical Applications and Methods, 20, 129-136.
Schölkopf, B., Smola, A. and Müller, K.R. (1998). “Nonlinear component analysis as a kernel eigenvalue problem”. Neural Computation, 10, 1299-1319.

2013.11.29

26

Health Info & Stat

Más contenido relacionado

La actualidad más candente

Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationIJERA Editor
 
Supporting Flight Test And Flight Matching
Supporting Flight Test And Flight MatchingSupporting Flight Test And Flight Matching
Supporting Flight Test And Flight Matchingj2aircraft
 
A Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsA Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsIJERA Editor
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...Yoshihiro Nagano
 

La actualidad más candente (6)

Parallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix MultiplicationParallel Processing Technique for Time Efficient Matrix Multiplication
Parallel Processing Technique for Time Efficient Matrix Multiplication
 
Supporting Flight Test And Flight Matching
Supporting Flight Test And Flight MatchingSupporting Flight Test And Flight Matching
Supporting Flight Test And Flight Matching
 
A Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather EventsA Novel Approach to Analyze Satellite Images for Severe Weather Events
A Novel Approach to Analyze Satellite Images for Severe Weather Events
 
Vldb14
Vldb14Vldb14
Vldb14
 
Four data models in GIS
Four data models in GISFour data models in GIS
Four data models in GIS
 
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
[ICLR/ICML2019読み会] A Wrapped Normal Distribution on Hyperbolic Space for Grad...
 

Destacado

데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)YongGeun Song
 
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
대화형지도 Carto를 활용한 데이터  분석 및 통찰력대화형지도 Carto를 활용한 데이터  분석 및 통찰력
대화형지도 Carto를 활용한 데이터 분석 및 통찰력선경 김선경
 
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석YBIGTA
 
인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)삵 (sarc.io)
 
데이터 분석 실무 1강
데이터 분석 실무 1강데이터 분석 실무 1강
데이터 분석 실무 1강YongGeun Song
 
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어Dongsam Byun
 
판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템Dongsam Byun
 
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영FAST CAMPUS
 
비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래HT Kim
 
예측 분석 산업별 사례 147
예측 분석 산업별 사례 147예측 분석 산업별 사례 147
예측 분석 산업별 사례 147eungjin cho
 
검색로그시스템 with Python
검색로그시스템 with Python검색로그시스템 with Python
검색로그시스템 with Pythonitproman35
 
파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트itproman35
 
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) 20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) Tae Young Lee
 
데이터분석의 길 4: “고수는 통계학습의 달인이다”
데이터분석의 길 4:  “고수는 통계학습의 달인이다”데이터분석의 길 4:  “고수는 통계학습의 달인이다”
데이터분석의 길 4: “고수는 통계학습의 달인이다”Jaimie Kwon (권재명)
 
[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장sung ki choi
 
빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석Newsjelly
 
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인Ji Lee
 
빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)Channy Yun
 
빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법Ji Lee
 
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스Ji Lee
 

Destacado (20)

데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)데이터 분석 실무 2강 (실습 1차)
데이터 분석 실무 2강 (실습 1차)
 
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
대화형지도 Carto를 활용한 데이터  분석 및 통찰력대화형지도 Carto를 활용한 데이터  분석 및 통찰력
대화형지도 Carto를 활용한 데이터 분석 및 통찰력
 
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
장가(시집) 갈 수 있을까? 스피드 데이팅 데이터 분석
 
인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)인프라 성능 데이터 분석 시작하기 (김아령)
인프라 성능 데이터 분석 시작하기 (김아령)
 
데이터 분석 실무 1강
데이터 분석 실무 1강데이터 분석 실무 1강
데이터 분석 실무 1강
 
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
음성인식 및 웹 기반 어플리케이션을 통한 유비쿼터스 스마트홈 제어
 
판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템판매정보 빅데이터 분석을 통한 판매 예측 시스템
판매정보 빅데이터 분석을 통한 판매 예측 시스템
 
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
데이터 분석을 통해 웹페이지 Ui 개선, 소비자의 니즈를 캐치하는 마케터 바쁘남 김두영
 
비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래비즈니스 데이터 분석의 현재와 미래
비즈니스 데이터 분석의 현재와 미래
 
예측 분석 산업별 사례 147
예측 분석 산업별 사례 147예측 분석 산업별 사례 147
예측 분석 산업별 사례 147
 
검색로그시스템 with Python
검색로그시스템 with Python검색로그시스템 with Python
검색로그시스템 with Python
 
파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트파이썬 데이터 분석 3종세트
파이썬 데이터 분석 3종세트
 
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering) 20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
20141214 빅데이터실전기술 - 유사도 및 군집화 방법 (Similarity&Clustering)
 
데이터분석의 길 4: “고수는 통계학습의 달인이다”
데이터분석의 길 4:  “고수는 통계학습의 달인이다”데이터분석의 길 4:  “고수는 통계학습의 달인이다”
데이터분석의 길 4: “고수는 통계학습의 달인이다”
 
[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장[아꿈사] 게임 기초 수학 물리 1,2장
[아꿈사] 게임 기초 수학 물리 1,2장
 
빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석빅데이터 시각화 기술 특허 동향 분석
빅데이터 시각화 기술 특허 동향 분석
 
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
빅데이터 분석 시각화 분석 : 4장 빅데이터와 시각화 디자인
 
빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)빅데이터 기술 현황과 시장 전망(2014)
빅데이터 기술 현황과 시장 전망(2014)
 
빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법빅데이터 분석 시각화 분석 : 3장 시각화 방법
빅데이터 분석 시각화 분석 : 3장 시각화 방법
 
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
빅데이터 분석 시각화 분석 : 1장 시각화정의 2장 프로세스
 

Similar a Analytic Data Visualization Techniques

MFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemMFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemCSCJournals
 
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...BRNSSPublicationHubI
 
tadejko2007.pdf
tadejko2007.pdftadejko2007.pdf
tadejko2007.pdfMhartono
 
iPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperiPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperBrainlab
 
Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...csandit
 
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET Journal
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesIDES Editor
 
Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...journalBEEI
 
Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...ijma
 
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable SensorsA Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable Sensorsecgpapers
 
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsApplication of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsIOSR Journals
 
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...ijma
 
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm OptimizationReduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimizationijeei-iaes
 
Human Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataHuman Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataIRJET Journal
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingMaruthi Nataraj K
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Seval Çapraz
 

Similar a Analytic Data Visualization Techniques (20)

MFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand SystemMFBLP Method Forecast for Regional Load Demand System
MFBLP Method Forecast for Regional Load Demand System
 
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
Computational Dual-system Imaging Processing Methods for Lumbar Spine Specime...
 
tadejko2007.pdf
tadejko2007.pdftadejko2007.pdf
tadejko2007.pdf
 
iPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White PaperiPlan BOLD MRI Mapping Clinical White Paper
iPlan BOLD MRI Mapping Clinical White Paper
 
Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...Adaptive lifting based image compression scheme using interactive artificial ...
Adaptive lifting based image compression scheme using interactive artificial ...
 
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBPIRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
IRJET- Diabetic Haemorrhage Detection using DWT and Elliptical LBP
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
 
L14.pdf
L14.pdfL14.pdf
L14.pdf
 
PCA and SVD in brief
PCA and SVD in briefPCA and SVD in brief
PCA and SVD in brief
 
Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...Hybrid medical image compression method using quincunx wavelet and geometric ...
Hybrid medical image compression method using quincunx wavelet and geometric ...
 
Lec-3 DIP.pptx
Lec-3 DIP.pptxLec-3 DIP.pptx
Lec-3 DIP.pptx
 
Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...Mr image compression based on selection of mother wavelet and lifting based w...
Mr image compression based on selection of mother wavelet and lifting based w...
 
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable SensorsA Joint QRS Detection and Data Compression Scheme for Wearable Sensors
A Joint QRS Detection and Data Compression Scheme for Wearable Sensors
 
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of ReservoirsApplication of Artificial Neural Network (Ann) In Operation of Reservoirs
Application of Artificial Neural Network (Ann) In Operation of Reservoirs
 
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
MR Image Compression Based on Selection of Mother Wavelet and Lifting Based W...
 
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm OptimizationReduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
Reduction of Active Power Loss byUsing Adaptive Cat Swarm Optimization
 
H235055
H235055H235055
H235055
 
Human Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerDataHuman Activity Recognition Using AccelerometerData
Human Activity Recognition Using AccelerometerData
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and Forecasting
 
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
Statistical Data Analysis on a Data Set (Diabetes 130-US hospitals for years ...
 

Más de Myung-Hoe Huh

법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117Myung-Hoe Huh
 
데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008Myung-Hoe Huh
 
22 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 2014040422 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 20140404Myung-Hoe Huh
 
21 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 2014032521 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 20140325Myung-Hoe Huh
 
Data visualization using r pt 20140316
Data visualization using r pt 20140316Data visualization using r pt 20140316
Data visualization using r pt 20140316Myung-Hoe Huh
 
통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413Myung-Hoe Huh
 
통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knouMyung-Hoe Huh
 

Más de Myung-Hoe Huh (7)

법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
법에서의 통계학 대학원 바이오정보통계학과 워크숍 20150117
 
데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008데이터 사이언티스트 키노트 Pt 20141008
데이터 사이언티스트 키노트 Pt 20141008
 
22 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 2014040422 r data manipulation 2 pt 20140404
22 r data manipulation 2 pt 20140404
 
21 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 2014032521 r data manipulation 1 pt 20140325
21 r data manipulation 1 pt 20140325
 
Data visualization using r pt 20140316
Data visualization using r pt 20140316Data visualization using r pt 20140316
Data visualization using r pt 20140316
 
통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413통계학의 유래와 전망 20130413
통계학의 유래와 전망 20130413
 
통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou통계적 시각화 Pt 20130119 knou
통계적 시각화 Pt 20130119 knou
 

Último

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 

Último (20)

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Analytic Data Visualization Techniques

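  • 1. Korean Society of Health Informatics and Statistics, Fall Conference 2013. Analytic Visualization of “Big” Data (Analytic Data Visualization). Myung-Hoe Huh (許明會), Department of Statistics, Korea University, stat420@korea.ac.kr. 2013.11.29. Health Info & Stat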
  • 2. Data Visualization - Descriptive vs Analytic ... - Small vs Big ... science technology art
  • 3. Contents - Scatterplot - Biplot - Regression Biplot - Kernel PCA - SVM Biplot
  • 4. Scatterplot - “Lego” for analytic data visualization - Reflecting the third variable. quakes: longitude (= x), latitude (= y), depth (= z)
  • 5. Scatterplot - For large $n$, over-plotting can seriously distort the display. Skin Segmentation Data: red vs. green
  • 6. Scatterplot - For large $n$, the alpha channel (point transparency) can be used. Skin Segmentation Data: red vs. green
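The effect of the alpha channel on over-plotting can be illustrated with a small calculation. The sketch below assumes standard "source-over" compositing and points of one colour drawn with the same alpha; the function name `stacked_opacity` and the alpha value 0.01 are illustrative choices, not taken from the slides.

```python
def stacked_opacity(alpha: float, k: int) -> float:
    """Opacity of a pixel after k points of equal alpha are drawn on top of each other."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Each new layer lets a fraction (1 - alpha) of what is underneath show through.
    return 1.0 - (1.0 - alpha) ** k


if __name__ == "__main__":
    # Hypothetical alpha of 0.01: sparse regions stay faint, dense regions saturate.
    for k in (1, 10, 100, 1000):
        print(f"{k:5d} overlapping points -> opacity {stacked_opacity(0.01, k):.3f}")
```

With a small alpha, colour intensity acts as a rough density estimate, which is why transparency helps when $n$ is large.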
  • 7. Scatterplot - lowess: a nonparametric regression for bivariate data. cars data: distance vs. speed
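As a rough illustration of what lowess does, the following sketch fits a weighted straight line around each point using tricube weights. It is a simplified version under stated assumptions: it omits the robustness iterations of the full lowess procedure, and the function name `lowess_fit` and the default `frac=2/3` are choices made here, not taken from the slides.

```python
import numpy as np


def lowess_fit(x, y, frac=2 / 3):
    """Locally weighted linear regression (no robustness iterations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))          # neighbours used in each local fit
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        # Bandwidth: distance to the k-th nearest point (guarded against zero).
        h = max(np.sort(d)[k - 1], np.finfo(float).eps)
        with np.errstate(over="ignore"):
            w = np.clip(1.0 - (d / h) ** 3, 0.0, None) ** 3   # tricube weights
        sw = np.sqrt(w)
        # Weighted least squares for a local line a + b * (x - x[i]).
        A = np.column_stack([np.ones(n), x - x[i]]) * sw[:, None]
        coef, *_ = np.linalg.lstsq(A, y * sw, rcond=None)
        fitted[i] = coef[0]                      # value of the local line at x[i]
    return fitted


if __name__ == "__main__":
    rng = np.random.default_rng(0)               # synthetic data, not the cars data
    x = np.sort(rng.uniform(0, 10, 50))
    y = np.sin(x) + rng.normal(scale=0.3, size=50)
    print(np.round(lowess_fit(x, y, frac=0.4)[:5], 3))
```

Because each local fit is linear, the smoother reproduces exactly linear data without bias, which is a convenient sanity check.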
  • 8. Scatterplot - 3D rotation for three variables. Skin Segmentation Data: red, green, blue - ggobi: 3D rotation for four or more variables
  • 9. Biplot of Observations and Variables, Gabriel (1971) - The biplot is a graph that shows $n$ observations and $p$ variables. Protein data (rows: 25 nations, columns: 9 protein sources)
  • 10. Biplot of Observations and Variables, Gabriel (1971) - Idea: linear projection. Protein data: variable cereal
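The linear-projection idea behind the biplot can be sketched with a singular value decomposition. This is a minimal version assuming the common "principal component" scaling, in which row markers are PC scores and column markers are the right singular vectors; the slides do not say which scaling was used, and `biplot_coords` is a name introduced here.

```python
import numpy as np


def biplot_coords(X, scale=True):
    """Return 2-D markers for observations (rows) and variables (columns)."""
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)                     # centre each variable
    if scale:
        Z = Z / Z.std(axis=0, ddof=1)          # optional standardisation
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    obs = U[:, :2] * s[:2]                     # observation markers (PC scores)
    var = Vt[:2].T                             # variable markers (axis directions)
    return obs, var


if __name__ == "__main__":
    rng = np.random.default_rng(1)             # synthetic data, not the protein data
    X = rng.normal(size=(25, 9))
    obs, var = biplot_coords(X)
    # The inner product obs[i] @ var[j] approximates the centred, scaled value Z[i, j],
    # i.e. projecting observation i on the direction of variable j reads off its value.
    print(obs.shape, var.shape)
```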
  • 11. Regression Biplot, Huh and Lee (2013)
      - The regression biplot is a graph for $n$ observations of $x_1, \ldots, x_p$, arranged by predicted $y$.
      - Assume that the model fit is determined by a function of a linear combination of $x_1, \ldots, x_p$. For instance, $\hat{y} = b_0 + b_1 x_1 + \cdots + b_p x_p$, or $\log\{\hat{\pi}/(1-\hat{\pi})\} = b_0 + b_1 x_1 + \cdots + b_p x_p$.
      - Set the vertical dimension by the direction of the regression coefficients $\mathbf{b} = (b_1, \ldots, b_p)'$, or $\mathbf{b}/\|\mathbf{b}\|$.
      - Set the horizontal dimension by the direction of the principal axis of $\mathbf{x}_1^{\perp}, \ldots, \mathbf{x}_n^{\perp}$, where $\mathbf{x}_i^{\perp}$ denotes the orthogonal component generated from the projection of observation $\mathbf{x}_i$ on $\mathbf{b}$.
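The two axes of the regression biplot described on this slide can be sketched for the linear-regression case as follows. This is an illustration under assumptions, not the authors' code: variables are centred but not standardised, ordinary least squares supplies the coefficients, and the function name `regression_biplot_axes` is hypothetical.

```python
import numpy as np


def regression_biplot_axes(X, y):
    """Vertical axis along the regression coefficients, horizontal axis along
    the first principal axis of the part of X orthogonal to them."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    b, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)   # OLS slopes
    u = b / np.linalg.norm(b)                  # vertical direction b / ||b||
    R = Xc - np.outer(Xc @ u, u)               # components orthogonal to b
    _, _, Vt = np.linalg.svd(R, full_matrices=False)
    v = Vt[0]                                  # first principal axis of R
    coords = np.column_stack([Xc @ v, Xc @ u]) # (horizontal, vertical) per observation
    return u, v, coords


if __name__ == "__main__":
    rng = np.random.default_rng(2)             # synthetic data
    X = rng.normal(size=(21, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=21)
    u, v, coords = regression_biplot_axes(X, y)
    print(np.round(u, 3), np.round(v, 3))
```

The vertical coordinate is proportional to the centred fitted value, so observations appear ordered by predicted $y$ from bottom to top.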
  • 12. Regression Biplot, Huh and Lee (2013) Example 1. Stack Loss Data ($y$ = loss of ammonia)
  • 13. Regression Biplot, Huh and Lee (2013) Example 2. Magazine Data ($y$ = subscription, coded 0/1)
  • 14. Kernel PCA, Schölkopf et al. (1998)
      - For $n$ observations $\mathbf{x}_1, \ldots, \mathbf{x}_n$ (an $n \times p$ data matrix), consider the nonlinear mapping $\Phi(\mathbf{x}_1), \ldots, \Phi(\mathbf{x}_n)$ to a Hilbert space, in which $\langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_{i'}) \rangle = K(\mathbf{x}_i, \mathbf{x}_{i'})$.
      - Denoting $K = \{K(\mathbf{x}_i, \mathbf{x}_{i'})\}$, kernel PCA is obtained from eigen-decomposing $\tilde{K} = (I - \mathbf{1}\mathbf{1}'/n)\, K\, (I - \mathbf{1}\mathbf{1}'/n)$.
      - Kernel PCA yields a plot of observations by projecting the centered $\tilde{\Phi}(\mathbf{x}_1), \ldots, \tilde{\Phi}(\mathbf{x}_n)$ on $\mathbf{v} = \sum_{i} a_i \tilde{\Phi}(\mathbf{x}_i)$, where $\mathbf{v}'\mathbf{v} = \mathbf{a}'\tilde{K}\mathbf{a} = 1$ and $\mathbf{a}$ is an eigenvector of $\tilde{K}$.
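The kernel PCA steps on this slide (kernel matrix, double centring, eigen-decomposition, projection) can be sketched as below. The RBF kernel is written as $\exp(-\sigma\|\mathbf{x}-\mathbf{z}\|^2)$, which is an assumption about how the `sigma` values quoted in the examples are parameterised; the function names are introduced here for illustration.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """k(x, z) = exp(-sigma * ||x - z||^2) (assumed parameterisation)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def kernel_pca(X, sigma=0.1, n_comp=2):
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix I - 11'/n
    Kc = J @ K @ J                             # centred kernel matrix
    lam, vec = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    top = np.argsort(lam)[::-1][:n_comp]       # keep the largest n_comp
    a = vec[:, top] / np.sqrt(lam[top])        # scale so that a' Kc a = 1
    scores = Kc @ a                            # projections of the observations
    return scores, a, K


if __name__ == "__main__":
    rng = np.random.default_rng(3)             # synthetic data, not the iris data
    X = rng.normal(size=(40, 4))
    scores, a, K = kernel_pca(X, sigma=0.1)
    print(scores.shape)
```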
  • 15. Kernel PCA Diagram (or Kernel Biplot), Huh (2013)
      - Aim: representation of the $p$ variables in the kernel PC plot of observations.
      - Proposed procedure:
        1) For each $\mathbf{x}_i$, $i = 1, \ldots, n$, map $\mathbf{x}_i + c\,\mathbf{e}_j$ on the plane, $j = 1, \ldots, p$, where $c$ is a constant and $\mathbf{e}_j = (0, \ldots, 1, \ldots, 0)'$. The projection of a point $\mathbf{z}$ is given by $\sum_{i'} a_{i'} \big\{ K(\mathbf{z}, \mathbf{x}_{i'}) - \tfrac{1}{n}\sum_{i''} K(\mathbf{z}, \mathbf{x}_{i''}) - \tfrac{1}{n}\sum_{i''} K(\mathbf{x}_{i'}, \mathbf{x}_{i''}) + \tfrac{1}{n^2}\sum_{i''}\sum_{i'''} K(\mathbf{x}_{i''}, \mathbf{x}_{i'''}) \big\}$.
        2) For each $j$, link the projection points of $\mathbf{x}_i$ and $\mathbf{x}_i + c\,\mathbf{e}_j$ by an arrow.
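The arrow construction can be sketched by projecting each observation and its perturbed copy onto the kernel principal components with the usual out-of-sample centring. This is a sketch under assumptions: the RBF kernel is parameterised as $\exp(-\sigma\|\mathbf{x}-\mathbf{z}\|^2)$, one common constant `c` is used for all variables, and all function names are hypothetical.

```python
import numpy as np


def rbf(A, B, sigma):
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def fit_kpca(X, sigma, n_comp=2):
    """Return the kernel matrix and the scaled eigenvectors of its centred version."""
    n = X.shape[0]
    K = rbf(X, X, sigma)
    J = np.eye(n) - np.ones((n, n)) / n
    lam, vec = np.linalg.eigh(J @ K @ J)
    top = np.argsort(lam)[::-1][:n_comp]
    return K, vec[:, top] / np.sqrt(lam[top])


def project(Z, X, K, a, sigma):
    """Project arbitrary points Z onto the kernel PCs fitted on X."""
    Kz = rbf(Z, X, sigma)
    # Centre with the training means: row mean of Kz, column mean of K, grand mean of K.
    Kz_c = Kz - Kz.mean(axis=1, keepdims=True) - K.mean(axis=0)[None, :] + K.mean()
    return Kz_c @ a


def arrows(X, j, c, K, a, sigma):
    """Start and end points of the arrows for variable j (perturbation x_i + c e_j)."""
    shifted = X.copy()
    shifted[:, j] += c
    return project(X, X, K, a, sigma), project(shifted, X, K, a, sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(4)             # synthetic data
    X = rng.normal(size=(30, 4))
    K, a = fit_kpca(X, sigma=0.1)
    start, end = arrows(X, j=0, c=0.5, K=K, a=a, sigma=0.1)
    print(np.round((end - start)[:3], 4))      # arrow vectors for three observations
```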
  • 16. Example 1. Arrow diagrams for kernel PCA of the iris data with RBF kernel
  • 17. Example 1. Arrow diagrams for kernel PCA of the iris data with RBF kernel
  • 18. Example 2. Arrow diagrams for kernel PCA of the spam data
  • 19. SVM-Guided Biplot as an Extension of Regression Biplot
      - Idea: combine the linear/logistic regression biplot and kernel PCA.
      - Classification/regression part: observations are classified by the SVM classifier as $y_i = -1$ or $1$ for $i = 1, \ldots, n$, with $f(\mathbf{x}) = \mathbf{w}'\Phi(\mathbf{x}) + b$, where $\mathbf{w} = \sum_{i} \alpha_i y_i \Phi(\mathbf{x}_i)$, $\alpha_i \geq 0$.
      - The vertical dimension is set to $\mathbf{w}/\|\mathbf{w}\|$ ($\|\mathbf{w}\|^2 = \sum_{i}\sum_{i'} \alpha_i \alpha_{i'} y_i y_{i'} K(\mathbf{x}_i, \mathbf{x}_{i'})$).
  • 20. SVM-Guided Biplot: Classification
      - Kernel PCA part: take the component of $\Phi(\mathbf{x}_i)$ orthogonal to $\mathbf{w}$, $\Phi^{\perp}(\mathbf{x}_i) = \Phi(\mathbf{x}_i) - \{\mathbf{w}'\Phi(\mathbf{x}_i)/\|\mathbf{w}\|^2\}\,\mathbf{w}$, $i = 1, \ldots, n$. ∴ $\langle \Phi^{\perp}(\mathbf{x}_i), \Phi^{\perp}(\mathbf{x}_{i'}) \rangle = K(\mathbf{x}_i, \mathbf{x}_{i'}) - \{\mathbf{w}'\Phi(\mathbf{x}_i)\}\{\mathbf{w}'\Phi(\mathbf{x}_{i'})\}/\|\mathbf{w}\|^2$, $i, i' = 1, \ldots, n$. Hence $K \to K^{\perp}$. The horizontal dimension is determined by eigen-decomposing $K^{\perp}$.
      - Perturbation scheme for arrow diagrams: define $\mathbf{x}_i + c\,\mathbf{e}_j$ ($p \times 1$), where $\mathbf{e}_j$ represents a perturbation whose magnitude is controlled by $c$. Then project $\Phi(\mathbf{x}_i + c\,\mathbf{e}_j)$ on the first (vertical) and the second (horizontal) dimensions.
  • 21. Example 1. Iris Data: Versicolor vs. Virginica [sigma=0.1, C=1]
  • 22. Importance of Variables (in the case of large $p$) - It is necessary to select a small number of variables in determining the first and second dimensions. - Measures of importance (definition): length of arrows 1) in the vertical direction, 2) in the horizontal direction. - Plot arrow diagrams for the important variables only.
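One way to turn arrow lengths into importance measures is sketched below. The slide only says that importance is the length of the arrows in each direction; averaging the absolute displacement over observations, the array layout, and the name `arrow_importance` are assumptions made here for illustration.

```python
import numpy as np


def arrow_importance(start, end):
    """start, end: arrays of shape (p, n, 2) holding, for each of p variables and
    n observations, the (horizontal, vertical) coordinates of arrow tails and heads.
    Returns the mean absolute arrow length per variable in each direction."""
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    delta = np.abs(end - start)
    horizontal = delta[:, :, 0].mean(axis=1)
    vertical = delta[:, :, 1].mean(axis=1)
    return vertical, horizontal


if __name__ == "__main__":
    rng = np.random.default_rng(5)             # synthetic arrows for 6 variables
    start = rng.normal(size=(6, 20, 2))
    end = start + rng.normal(scale=0.1, size=(6, 20, 2))
    vertical, horizontal = arrow_importance(start, end)
    top3 = np.argsort(-vertical)[:3]           # variables with the longest vertical arrows
    print(top3)
```

Only the variables ranked highest on either measure would then be drawn as arrows.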
  • 23. Example 2. Spam Data [sigma=0.1, C=10]
  • 24. SVM-Guided Biplot: Regression - The same method can be applied to SVM regression. - Example 3. Aerobic Fitness data for oxygen uptake (= $y$) with RBF kernel (sigma=0.1, C=10, epsilon=0.1)
  • 25. Concluding Remarks - The biplot method can be extended to suit linear regression or classification (logistic regression). - The biplot method can be extended to allow nonlinear mapping of observations and variables by fully utilizing the kernel trick. http://blog.naver.com/huh4200 Goldfish Bowl (on the iPad)
  • 26. References
      - Gabriel, K.R. (1971). “The biplot graphic display of matrices with application to principal component analysis”. Biometrika, 58, 453-467.
      - Huh, M.H. (2013). “Arrow diagrams for kernel principal component analysis”. Communications for Statistical Applications and Methods, 20, 175-184.
      - Huh, M.H. (2013). “SVM-guided biplot of observations and variables”. Communications for Statistical Applications and Methods. (to appear)
      - Huh, M.H. and Lee, Y.G. (2013). “Biplots of multivariate data guided by linear and/or logistic regression”. Communications for Statistical Applications and Methods, 20, 129-136.
      - Schölkopf, B., Smola, A. and Müller, K.-R. (1998). “Nonlinear component analysis as a kernel eigenvalue problem”. Neural Computation, 10, 1299-1319.