In statistics, machine learning, and information theory, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
4. About Dimensionality Reduction
● “Curse of dimensionality”
○ High time + space complexity
○ Overfitting
○ Presence of irrelevant data
● What ML algorithms want
○ Uncorrelated data or independent variables
○ Just enough data to predict well
● “blessing of dimensionality”
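The distance-concentration effect behind the "curse of dimensionality" can be sketched with a small NumPy experiment (my illustration, not from the slides): as the number of dimensions grows, distances between random points bunch together, so nearest and farthest neighbors become hard to tell apart.

```python
import numpy as np

# Sketch: relative spread of pairwise distances shrinks as dimension grows,
# which is one symptom of the curse of dimensionality.
rng = np.random.default_rng(0)

spreads = []
for d in (2, 200, 20000):
    points = rng.random((100, d))        # 100 random points in [0, 1]^d
    ref = rng.random(d)                  # a random query point
    dists = np.linalg.norm(points - ref, axis=1)
    spread = (dists.max() - dists.min()) / dists.min()
    spreads.append(spread)
    print(f"d={d:>6}: relative spread of distances = {spread:.3f}")
```

With growing `d`, the printed spread drops sharply, meaning "nearest neighbor" carries less and less signal.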
5. Feature Selection
● What is Feature Selection?
○ Simplify models to make them easier to interpret by users
○ Shorter training times
○ Avoid the curse of dimensionality
○ Enhanced generalization by reducing overfitting
● Ways to select feature subsets
○ Optimum method
○ Heuristic method
○ Randomized method
● Feature evaluation methods
○ Filter methods (unsupervised)
○ Wrapper methods (supervised)
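A minimal sketch of one filter method, assuming a variance threshold as the unsupervised criterion (one common choice; the slides do not name a specific filter): features are scored without ever looking at the labels.

```python
import numpy as np

def variance_filter(X, threshold=0.0):
    """Filter method (unsupervised): keep features whose variance
    exceeds `threshold`; the labels are never consulted."""
    variances = X.var(axis=0)
    keep = variances > threshold
    return X[:, keep], keep

# Toy data: the middle column is constant, so it carries no information.
X = np.array([[1.0, 5.0, 0.2],
              [2.0, 5.0, 0.9],
              [3.0, 5.0, 0.1]])
X_reduced, mask = variance_filter(X, threshold=0.01)
print(mask)   # → [ True False  True]  (the constant feature is dropped)
```

A wrapper method would instead retrain the actual model on each candidate subset and keep the subset with the best validation score.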
7. Subset Selection
● Forward Selection method
○ Starts with an empty feature set
○ Try each remaining feature
○ Estimate the classification/regression error of adding each feature
○ Add the feature that gives the maximum improvement
○ Stop when there is no significant improvement
● Backward Elimination method
○ Starts with the full feature set
○ Try removing each feature
○ Drop the feature whose removal has the smallest impact on error
○ Stop when there is no significant improvement
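The forward-selection steps above can be sketched as greedy search (a minimal illustration, assuming a least-squares fit as the error estimator and a tolerance as the stopping rule, neither of which is specified in the slides):

```python
import numpy as np

def mse(X, y):
    """Least-squares fit error used to score a candidate feature subset."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((X @ w - y) ** 2)

def forward_selection(X, y, tol=1e-3):
    """Greedy forward selection: start empty, repeatedly add the single
    feature that most reduces the error, stop when the improvement
    falls below `tol`."""
    selected, remaining = [], list(range(X.shape[1]))
    best_err = np.inf
    while remaining:
        errs = {j: mse(X[:, selected + [j]], y) for j in remaining}
        j_best = min(errs, key=errs.get)
        if best_err - errs[j_best] < tol:
            break                        # no significant improvement
        selected.append(j_best)
        remaining.remove(j_best)
        best_err = errs[j_best]
    return selected

# Toy data: y depends only on features 0 and 2; feature 1 is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2]
sel = forward_selection(X, y)
print(sel)   # → [0, 2]
```

Backward elimination is the mirror image: start from all features and repeatedly drop the one whose removal increases the error least.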
8. Univariate and Multivariate
● Univariate
○ Pearson correlation coefficient
○ F-score
○ Chi-square
○ Signal-to-noise ratio
○ Mutual information
● Multivariate
○ Minimum Redundancy and Maximum Relevance (mRMR)
○ Fast Correlation based Feature Selection (FCBF)
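As a sketch of one univariate criterion from the list, here is feature ranking by absolute Pearson correlation (each feature is scored on its own against the target, ignoring interactions between features, which is exactly what multivariate methods like mRMR try to account for):

```python
import numpy as np

def pearson_scores(X, y):
    """Univariate criterion: score each feature independently by the
    absolute Pearson correlation with the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)

# Toy data: only feature 1 is related to the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 1] + rng.normal(scale=0.1, size=500)
scores = pearson_scores(X, y)
print(scores.argmax())   # → 1
```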
9. Feature extraction
● It doesn’t select a subset of the existing features
● It creates a new feature set
● The created features are uncorrelated among themselves
● Types of feature extractions
○ PCA (Principal Component Analysis)
○ LDA (Linear Discriminant Analysis)
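A minimal PCA sketch (via eigendecomposition of the covariance matrix, one standard formulation) that also demonstrates the claim above: the extracted features are uncorrelated among themselves.

```python
import numpy as np

def pca(X, k):
    """PCA: project centered data onto the k eigenvectors of the
    covariance matrix with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return Xc @ top

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
Z = pca(X, k=2)

# The new features are uncorrelated: off-diagonal covariance ≈ 0.
C = np.cov(Z, rowvar=False)
print(np.round(C, 6))
```

LDA works similarly but is supervised: it picks directions that separate the classes rather than directions of maximum variance.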
16. Eigenvectors and eigenvalues
● Each principal component is an eigenvector (of the covariance matrix)
● When we use SVD, every eigenvector comes with an eigenvalue
● Discard eigenvectors whose eigenvalue is (near) zero
● The remaining eigenvectors give projection values which can later be used to reduce the dimensions of other inputs
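The SVD workflow above can be sketched as follows (my illustration, assuming NumPy's `svd`): singular vectors with (near-)zero singular values carry no variance and are dropped, and the surviving vectors form a projection matrix that can be reused on new inputs.

```python
import numpy as np

# Construct data that truly lives in a 2-dimensional subspace of R^5.
rng = np.random.default_rng(3)
basis = rng.normal(size=(2, 5))
X = rng.normal(size=(100, 2)) @ basis

mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
keep = s > 1e-10 * s.max()        # discard (near-)zero singular values
print(keep.sum())                  # → 2 surviving directions

P = Vt[keep]                       # projection matrix, shape (k, 5)
x_new = rng.normal(size=(1, 5))
z = (x_new - mean) @ P.T           # reduce a new input to k dimensions
print(z.shape)                     # → (1, 2)
```

Keeping `mean` and `P` is all that is needed to project future inputs into the reduced space.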