AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
PR-232
주성훈, Samsung SDS
2020. 3. 15.
1. Research Background
Introduction
• AutoML: automating the search for ML components, so far mainly the architecture and hyperparameters.
• Architecture search (NAS, Neural Architecture Search)
• Hyperparameters
• Learning rule (activation function, full forward pass, data augmentation, weight optimization, layer and weight pruning)
Figure source (AutoML): https://arxiv.org/pdf/1810.13306.pdf
Architecture search
• Limitation: constrained search space
• The building blocks and the search algorithm are designed by human experts, and NAS depends on them.
• The search therefore takes place in a constrained search space.
Search space examples:
Saining Xie et al. (2019) https://arxiv.org/pdf/1904.01569.pdf PR-155
Golnaz Ghiasi et al. (2019) https://arxiv.org/pdf/1904.07392.pdf PR-166
Yanan Sun et al. (2019) https://arxiv.org/pdf/1710.10741.pdf
AutoML-Zero
• We propose to automatically search for whole ML algorithms using little restriction on form and only simple mathematical operations as building blocks.
→ Matrix decomposition and derivatives are excluded from the building blocks.
• Starting from a blank slate, all the way to the final algorithm
• A truly enormous search space…
• 4 days
2. Methods
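Evolutionary method
• Example with P=5, T=3: randomly select T algorithms from the population of P.
Type (i)
Insert or remove a random instruction.
Removal is twice as likely as insertion.
Type (ii)
Replace all instructions in a function.
Type (iii)
Replace only one argument.
When modifying a real-valued constant, multiply it by a number chosen at random from [0.5, 2.0] and flip its sign with 10% probability.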
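The selection step and the three mutation types can be sketched as follows. This is a toy illustration under stated assumptions: the instruction format, the op set `OPS`, the address count `N_ADDR`, the equal probability of the three mutation types, and dropping the oldest population member each cycle are my own choices; the slide only states the random selection of T and the mutation rules themselves.

```python
import random

OPS = ("add", "sub", "mul")  # hypothetical op set
N_ADDR = 4                   # hypothetical number of memory addresses

def random_instruction(rng):
    return {"op": rng.choice(OPS), "out": rng.randrange(N_ADDR),
            "in1": rng.randrange(N_ADDR), "in2": rng.randrange(N_ADDR)}

def mutate_constant(c, rng):
    # Multiply by a random factor in [0.5, 2.0]; flip the sign with 10% probability.
    c *= rng.uniform(0.5, 2.0)
    if rng.random() < 0.1:
        c = -c
    return c

def mutate(program, rng):
    child = [dict(ins) for ins in program]
    kind = rng.choice(("i", "ii", "iii"))  # assumption: types equally likely
    if kind == "i":
        # Removal is twice as likely as insertion: 2/3 vs. 1/3.
        if child and rng.random() < 2 / 3:
            child.pop(rng.randrange(len(child)))
        else:
            child.insert(rng.randrange(len(child) + 1), random_instruction(rng))
    elif kind == "ii":
        # Replace every instruction in the function.
        child = [random_instruction(rng) for _ in child]
    elif child:
        # Replace a single argument of one instruction.
        ins = rng.choice(child)
        ins[rng.choice(("out", "in1", "in2"))] = rng.randrange(N_ADDR)
    return child

def evolution_cycle(population, fitness, T, rng):
    """population is ordered oldest first; returns the next population."""
    tournament = rng.sample(population, T)  # randomly select T
    parent = max(tournament, key=fitness)   # best of the T becomes the parent
    child = mutate(parent, rng)
    return population[1:] + [child]         # assumption: drop the oldest
```

With P=5 and T=3, `evolution_cycle(population, fitness, 3, rng)` keeps the population size at 5 while replacing one member per cycle.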
Step 3 best algorithm
3. Experimental Results
Random search (RS)
• Evolution / RS success rate: how often evolution finds acceptable algorithms, relative to RS.
An acceptable algorithm is one that outperforms the hand-designed reference model.
• Task difficulty: measured from how many algorithms RS has to try per acceptable algorithm found.
ex) In the linear regression case, RS finds 1 acceptable algorithm per 10^7.4 algorithms, so the linear regressor task difficulty is 7.4.
→ The sparser the acceptable algorithms are in the search space, the more AutoML-Zero outperforms RS.
[Figure labels: 4 ops, 7 ops, 5 ops, 9 ops]
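The linear regression example suggests reading task difficulty as the negative base-10 logarithm of the fraction of acceptable algorithms found by RS. A small sketch under that reading (the function name and arguments are hypothetical):

```python
import math

def task_difficulty(n_acceptable, n_tried):
    """Negative log10 of the fraction of RS-sampled algorithms that are acceptable."""
    return -math.log10(n_acceptable / n_tried)
```

For 1 acceptable algorithm per 10^7.4 tried, `task_difficulty(1, 10 ** 7.4)` gives approximately 7.4, matching the value quoted for the linear regressor.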
AutoML-Zero vs. hand-designed reference (2-layer FC NN)
• Binary classification tasks are built from CIFAR-10 and MNIST.
• Two of the 10 classes are picked for each binary classification task; 10C2 = 45 pairs.
• Each pair has 8000 train / 2000 valid examples.
• 36 of the 45 pairs: Tsearch (the search tasks; 1 to 10 are used in each evolution cycle)
• 9 of the 45 pairs: Tselect (used to select the algorithm with the best accuracy)
• Final evaluation on the CIFAR-10 test set
• Number of possible operations: 7/58/58 for Setup/Predict/Learn
• Training epochs: 1 or 10; evolution parameters: P=100, T=10
• Maximum number of instructions for Setup/Predict/Learn: 21/21/45
→ Figure 6: illustration of 1 experiment, (5, 20).
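The task construction above can be sketched in a few lines: enumerate the 45 class pairs and split them 36/9. How the pairs are assigned to the two groups is not stated on the slide; the random shuffle and the `seed` argument below are assumptions.

```python
import random
from itertools import combinations

def split_tasks(n_classes=10, n_search=36, seed=0):
    """Enumerate all class pairs and split them into search and selection tasks."""
    pairs = list(combinations(range(n_classes), 2))  # 10C2 = 45 pairs
    rng = random.Random(seed)
    rng.shuffle(pairs)  # assumption: random assignment
    return pairs[:n_search], pairs[n_search:]

t_search, t_select = split_tasks()
```

Each pair, such as `(3, 5)`, then defines one binary classification task restricted to those two classes.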
• The parameters of the best model (learning rate, mean of the uniform distribution, etc.) are tuned by random search on the Tselect dataset. Likewise, the hyperparameters of the linear/nonlinear baselines are tuned by random search.
• [CIFAR-10] Accuracy of the best algorithm over 5 trials: 84.06 ± 0.10%
Linear baseline: logistic regression, 77.65 ± 0.22%
Nonlinear baseline: 2-layer fully connected neural network, 82.22 ± 0.17%
• Other binary classification tasks:
1) SVHN (32 x 32 x 3) (88.12% AutoML-Zero vs. 59.58% linear baseline vs. 85.14% nonlinear baseline)
2) down-sampled ImageNet (128 x 128 x 3) (80.78% vs. 76.44% vs. 78.44%)
3) Fashion MNIST (28 x 28 x 1) (98.60% vs. 97.90% vs. 98.21%)
→ Search space design: convolution, batch normalization.
→ AutoML-Zero outperforms the 2-layer FC NN.
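AutoML-Zero on challenging tasks
1) Few training examples
• When the training dataset is limited to 80 examples and trained for 100 epochs, AutoML-Zero discovers a noisy ReLU (similar to dropout).
• Is this a coincidence?
Across 30 experiments each with 80 examples vs. 800 examples, the noisy ReLU appears significantly more often with 80 examples (p<0.0005).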
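As a rough illustration of what a noisy ReLU layer might look like, the sketch below perturbs the inputs with uniform noise before a linear map and a ReLU. Where the noise is injected, its distribution, and the names `noisy_relu_layer` and `noise_scale` are assumptions; the slide does not show the discovered code.

```python
import random

def relu(x):
    return max(0.0, x)

def noisy_relu_layer(inputs, weights, noise_scale, rng):
    """Add uniform noise to each input, then apply a linear map followed by ReLU.

    weights is a list of rows, one row per output unit.
    """
    noisy = [xi + rng.uniform(-noise_scale, noise_scale) for xi in inputs]
    return [relu(sum(w * xi for w, xi in zip(row, noisy))) for row in weights]
```

Setting `noise_scale` to 0 recovers a plain ReLU layer, which makes the regularizing effect of the noise easy to compare in a small experiment.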
2) Fast training
• With 800 training examples and only 10 epochs, AutoML-Zero discovers learning-rate decay.
• Is this a coincidence?
Across 30 experiments each with 10 epochs vs. 100 epochs, learning-rate decay appears in all 30 of the 10-epoch runs (30/30) but in only 3 of the 100-epoch runs (3/30).
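Learning-rate decay can be expressed with a single elementary operation applied once per step. The slide does not say which schedule was discovered; repeatedly applying arctan is used below purely as an illustrative assumption, chosen because arctan(x) < x for any positive x.

```python
import math

def decayed_learning_rates(lr0, steps):
    """Return the learning rate used at each step when it is shrunk by arctan every step."""
    rates = []
    lr = lr0
    for _ in range(steps):
        rates.append(lr)
        lr = math.atan(lr)  # arctan(x) < x for x > 0, so the rate keeps decreasing
    return rates
```

The resulting sequence starts at `lr0` and decreases monotonically while staying positive, which is the qualitative behavior one wants when only a few epochs are available.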
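3) Multiple classes
• When all 10 CIFAR-10 classes are used, AutoML-Zero discovers a learning rate computed from the weights through a sine function, sin(weight statistic).
• Across 30 experiments each for multi-class vs. binary-class, this never appears in the binary-class runs (0/30) but appears in 24 of the multi-class runs (24/30).
→ AutoML-Zero discovers different algorithms depending on the task.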
16/19
2) Functional equivalence checking(FEC)
•
•
4) Hurdle :
•
•
3. Experimental Results
1) Migration
•
17/19
4. Conclusions
• An ambitious goal for AutoML is proposed: automatically searching for whole ML algorithms with little restriction on form.
• Future work: extending the search space with higher-order tensors and function calls.
• Starting from empty component functions (Setup, Predict, Learn), AutoML-Zero discovers linear regressors, neural networks, gradient descent, multiplicative interactions, weight averaging, normalized gradients, and more.
Thank you.

PR-232: AutoML-Zero:Evolving Machine Learning Algorithms From Scratch
