ML DL AI DS BD - An Introduction

1. Machine Learning, Deep Learning, AI, Big Data, Data Science, Data Analytics by Dony Riyanto. Prepared and presented to Panin Asset Management, January 2019.
2. General Definitions
3. Machine Learning
• Machine learning (ML) is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to "learn" (e.g., progressively improve performance on a specific task) from data, without being explicitly programmed.
4. Machine learning tasks
Machine learning tasks are typically classified into several broad categories:
• Supervised learning: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input signal can be only partially available, or restricted to special feedback.
• Semi-supervised learning: The computer is given only an incomplete training signal: a training set with some (often many) of the target outputs missing.
• Active learning: The computer can only obtain training labels for a limited set of instances (based on a budget), and also has to optimize its choice of objects to acquire labels for. When used interactively, these can be presented to the user for labeling.
• Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning).
• Reinforcement learning: Data (in the form of rewards and punishments) are given only as feedback to the program's actions in a dynamic environment, such as driving a vehicle or playing a game against an opponent.[4]:3
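To make the contrast between the first and fourth categories above concrete, here is a minimal sketch in pure Python (the data, thresholds, and helper names are illustrative, not from the deck): a 1-nearest-neighbour rule learns from labeled examples (supervised), while a tiny 2-means clustering loop finds structure without any labels (unsupervised).

```python
# Supervised: labeled examples -> learn a rule (here, 1-nearest-neighbour).
train = [(1.0, "small"), (1.2, "small"), (4.8, "large"), (5.1, "large")]

def predict(x):
    # Assign the label of the closest labeled training example.
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

# Unsupervised: no labels -> find structure (here, 2-means clustering).
data = [1.0, 1.2, 4.8, 5.1]

def two_means(points, iters=10):
    c1, c2 = min(points), max(points)  # crude initial centroids
    for _ in range(iters):
        # Assign each point to its nearest centroid, then recompute centroids.
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

print(predict(1.1))      # label borrowed from nearby labeled points
print(two_means(data))   # two cluster centres found without labels
```

The supervised model needed a "teacher" (the labels in `train`); the clustering routine discovered the two groups on its own.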
5. ML Diagram
6. ML Diagram
7. 1. Regression Algorithms
• Ordinary Least Squares Regression (OLSR)
• Linear Regression
• Logistic Regression
• Stepwise Regression
• Multivariate Adaptive Regression Splines (MARS)
• Locally Estimated Scatterplot Smoothing (LOESS)
2. Instance-based Algorithms
• k-Nearest Neighbour (kNN)
• Learning Vector Quantization (LVQ)
• Self-Organizing Map (SOM)
• Locally Weighted Learning (LWL)
3. Regularization Algorithms
• Ridge Regression
• Least Absolute Shrinkage and Selection Operator (LASSO)
• Elastic Net
• Least-Angle Regression (LARS)
4. Decision Tree Algorithms
• Classification and Regression Tree (CART)
• Iterative Dichotomiser 3 (ID3)
• C4.5 and C5.0 (different versions of a powerful approach)
• Chi-squared Automatic Interaction Detection (CHAID)
• Decision Stump
• M5
• Conditional Decision Trees
5. Bayesian Algorithms
• Naive Bayes
• Gaussian Naive Bayes
• Multinomial Naive Bayes
• Averaged One-Dependence Estimators (AODE)
• Bayesian Belief Network (BBN)
• Bayesian Network (BN)
6. Clustering Algorithms
• k-Means
• k-Medians
• Expectation Maximisation (EM)
• Hierarchical Clustering
8. 7. Association Rule Learning Algorithms
• Apriori algorithm
• Eclat algorithm
8. Artificial Neural Network Algorithms
• Perceptron
• Back-Propagation
• Hopfield Network
• Radial Basis Function Network (RBFN)
9. Deep Learning Algorithms
• Deep Boltzmann Machine (DBM)
• Deep Belief Networks (DBN)
• Convolutional Neural Network (CNN)
• Stacked Auto-Encoders
10. Dimensionality Reduction Algorithms
• Principal Component Analysis (PCA)
• Principal Component Regression (PCR)
• Partial Least Squares Regression (PLSR)
• Sammon Mapping
• Multidimensional Scaling (MDS)
• Projection Pursuit
• Linear Discriminant Analysis (LDA)
• Mixture Discriminant Analysis (MDA)
• Quadratic Discriminant Analysis (QDA)
• Flexible Discriminant Analysis (FDA)
11. Ensemble Algorithms
• Boosting
• Bootstrapped Aggregation (Bagging)
• AdaBoost
• Stacked Generalization (blending)
• Gradient Boosting Machines (GBM)
• Gradient Boosted Regression Trees (GBRT)
• Random Forest
12. Other Algorithms
• Computational intelligence (evolutionary algorithms, etc.)
• Computer Vision (CV)
• Natural Language Processing (NLP)
• Recommender Systems
• Reinforcement Learning
• Graphical Models
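One of the simplest entries in the catalogue above, the Decision Stump (category 4), fits in a few lines and shows the shape shared by many of these algorithms: search over candidate models, score each against the training data, keep the best. This is an illustrative sketch with made-up data, not an implementation from the deck.

```python
# A decision stump: a one-level decision tree, i.e. a single
# threshold on a single feature, chosen to maximize accuracy.

def fit_stump(xs, ys):
    """Find the threshold on xs that best separates binary labels ys."""
    best = None
    for t in sorted(set(xs)):
        # Predict 1 for x >= t, 0 otherwise; count correct predictions.
        correct = sum((x >= t) == (y == 1) for x, y in zip(xs, ys))
        if best is None or correct > best[1]:
            best = (t, correct)
    return best[0]

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
ys = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(xs, ys)
print(threshold)  # the split that separates the low group from the high group
```

On its own a stump is a weak learner; ensemble methods in category 11 (e.g. AdaBoost) combine many such stumps into a strong model.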
9. Machine learning applications
Another categorization of machine learning tasks arises when one considers the desired output of a machine-learned system:
• In classification, inputs are divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one or more (multi-label classification) of these classes. This is typically tackled in a supervised way. Spam filtering is an example of classification, where the inputs are email (or other) messages and the classes are "spam" and "not spam".
• In regression, also a supervised problem, the outputs are continuous rather than discrete.
• In clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
• Density estimation finds the distribution of inputs in some space.
• Dimensionality reduction simplifies inputs by mapping them into a lower-dimensional space. Topic modeling is a related problem, where a program is given a list of human language documents and is tasked to find out which documents cover similar topics.
• Among other categories of machine learning problems, learning to learn learns its own inductive bias based on previous experience. Developmental learning, elaborated for robot learning, generates its own sequences (also called curriculum) of learning situations to cumulatively acquire repertoires of novel skills through autonomous self-exploration and social interaction with human teachers, using guidance mechanisms such as active learning, maturation, motor synergies, and imitation.
10. Relation to Data Mining
Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of (previously) unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy.
Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data.
11. When to use Machine Learning
12. ML between AI and DL
13. Deep Learning
• Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.
• Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to, and in some cases superior to, human experts.
• Deep learning models are vaguely inspired by information processing and communication patterns in biological nervous systems, yet have various differences from the structural and functional properties of biological brains (especially human brains), which make them incompatible with neuroscience evidence.
14. Deep Learning (contd)
• Most modern deep learning models are based on an artificial neural network, although they can also include propositional formulas or latent variables organized layer-wise in deep generative models, such as the nodes in deep belief networks and deep Boltzmann machines.
• In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the raw input may be a matrix of pixels; the first representational layer may abstract the pixels and encode edges; the second layer may compose and encode arrangements of edges; the third layer may encode a nose and eyes; and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn which features to optimally place in which level on its own. (Of course, this does not completely obviate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction.)
• The "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial credit assignment path (CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a feedforward neural network, the depth of the CAPs is that of the network: the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited.[2] No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth > 2. A CAP of depth 2 has been shown to be a universal approximator in the sense that it can emulate any function.[citation needed] Beyond that, more layers do not add to the function-approximation ability of the network. Deep models (CAP > 2) are able to extract better features than shallow models, and hence the extra layers help in learning features.
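The layer-by-layer transformation and the CAP-depth rule of thumb can be sketched in a few lines of plain Python (the weights and inputs below are illustrative, chosen only to show the mechanics, and the network is not trained): two hidden layers plus one output layer give a feedforward CAP depth of 3, i.e. "deep" by the depth > 2 criterion above.

```python
import math

def layer(inputs, weights):
    # One fully connected layer with a sigmoid non-linearity:
    # each output neuron sees the whole input vector.
    return [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
        for row in weights
    ]

x = [0.5, -1.0]                            # raw input (e.g. pixel features)
h1 = layer(x, [[1.0, 0.5], [-0.5, 1.0]])   # first, slightly more abstract layer
h2 = layer(h1, [[1.0, -1.0], [0.5, 0.5]])  # second, more composite layer
y = layer(h2, [[1.0, 1.0]])                # output layer

cap_depth = 3  # hidden layers (2) + output layer (1), per the rule above
print(y, "CAP depth:", cap_depth)
```

In a real deep learning system the weights are learned (e.g. by back-propagation) rather than fixed, which is exactly how each level comes to encode useful features on its own.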
15. Deep Learning as a Better ML Process
16. DL Applications
• Automatic speech recognition
• Image recognition
• Visual art processing
• Natural language processing
• Drug discovery and toxicology
• CRM
• Recommendation systems
• Bioinformatics
• Mobile advertising
• Image restoration
• Financial fraud detection
• Military
17. Artificial Intelligence
18. Artificial Intelligence
• Artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals. In computer science, AI research is defined as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term "artificial intelligence" is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving".
• Includes:
• Expert systems
• Fuzzy logic
• Robotics (e.g. humanoid, arm, chatbot, etc.)
• Natural language processing
• Neural networks (general)
• Etc.
• Some of the subfields of AI were absorbed into, or transformed into, the Machine Learning/Deep Learning fields, driven by advances in computational technology and huge amounts of data (especially unstructured data, i.e. without rows and columns).
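The expert-system subfield listed above predates modern ML and works from hand-written rules rather than learned parameters. Here is a deliberately tiny, hypothetical sketch of that idea (the rules, facts, and diagnosis strings are invented for illustration): knowledge lives in an ordered rule base, and inference is just firing the first rule whose condition matches.

```python
# A miniature rule-based expert system: hand-coded knowledge,
# no learning from data (contrast with the ML/DL slides above).

RULES = [
    (lambda f: f["temp_c"] > 38.0 and f["cough"], "possible flu"),
    (lambda f: f["temp_c"] > 38.0, "fever of unknown cause"),
    (lambda f: True, "no diagnosis"),  # default fallback rule
]

def diagnose(facts):
    # Fire the first rule whose condition holds for the given facts.
    for condition, conclusion in RULES:
        if condition(facts):
            return conclusion

print(diagnose({"temp_c": 39.2, "cough": True}))   # -> possible flu
print(diagnose({"temp_c": 36.8, "cough": False}))  # -> no diagnosis
```

The contrast with ML is the point: here a human encodes the decision logic explicitly, whereas the decision stump and neural-network sketches earlier derive their behaviour from data.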
19. Data Science
• Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.
• Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
• Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.
20. DS Area
21. Data Analysis
• Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, while being used in different business, science, and social science domains.
• Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes, while business intelligence covers data analysis that relies heavily on aggregation, focusing mainly on business information.[1] In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, while CDA focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data. All of the above are varieties of data analysis.
• Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term data analysis is sometimes used as a synonym for data modeling.
22. Business Intelligence
• Business intelligence (BI) comprises the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies include reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics. BI technologies can handle large amounts of structured data to help identify, develop and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these data. Identifying new opportunities and implementing an effective strategy based on insights can provide businesses with a competitive market advantage and long-term stability.
• Business intelligence can be used by enterprises to support a wide range of business decisions, ranging from operational to strategic. Basic operating decisions include product positioning or pricing. Strategic business decisions involve priorities, goals and directions at the broadest level. In all cases, BI is most effective when it combines data derived from the market in which a company operates (external data) with data from company sources internal to the business, such as financial and operations data (internal data). When combined, external and internal data can provide a complete picture.
• Business Intelligence (structured data sources, enterprise-goal intensive, traditional process flow; result: insight from previous data, for executives) vs. Big Data Analytics (unstructured/semi-structured data, wide-ranging goals, modern approach/flow; result: predictive, for executives and/or machines to decide the next step).
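The descriptive-statistics and EDA steps described above can be sketched with Python's standard library alone (the sample values and the 2-standard-deviation outlier rule are illustrative choices, not part of the original slide):

```python
import statistics

sales = [120, 135, 128, 150, 240, 131, 127]  # one week of daily sales (toy data)

# Descriptive statistics: summarize what the data looks like.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),
}

# A simple exploratory check: flag values far from the median,
# the kind of anomaly EDA is meant to surface for further study.
outliers = [v for v in sales if abs(v - summary["median"]) > 2 * summary["stdev"]]
print(summary["median"], outliers)
```

A confirmatory (CDA) step would then test a specific hypothesis about the flagged value (e.g. "was the spike a promotion effect?") rather than merely surfacing it.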
23. BI Steps
24. The "cube" output
25. Cube Example
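The "cube" output on the slides above is, at its core, aggregation of a fact table along dimensions, which can be sketched in plain Python (the products, regions, and quantities are made-up example data):

```python
from collections import defaultdict

# A tiny fact table: (product, region, quantity sold).
facts = [
    ("laptop", "north", 10), ("laptop", "south", 7),
    ("phone",  "north", 25), ("phone",  "south", 30),
    ("laptop", "north", 5),
]

# Roll up sales along the (product, region) dimensions: one cell per combination.
cube = defaultdict(int)
for product, region, qty in facts:
    cube[(product, region)] += qty

# "Slicing" the cube: fix one dimension and aggregate over the rest.
north_total = sum(v for (p, r), v in cube.items() if r == "north")
print(cube[("laptop", "north")], north_total)
```

Real OLAP engines precompute many such roll-ups and slices so that executives can drill down interactively; this sketch shows only the underlying group-and-aggregate idea.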
26. Big Data
• Big data is a term used to refer to data sets that are too large or complex for traditional data-processing application software to adequately deal with. Data with many cases (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source. Big data was originally associated with three key concepts: volume, variety, and velocity. Other concepts later attributed to big data are veracity (i.e., how much noise is in the data) and value.
• Current usage of the term "big data" tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem." Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on." Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data sets in areas including Internet search, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology and environmental research.
27. Big Data: Definition Shifting
The term has been in use since the 1990s, with some giving credit to John Mashey for popularizing it. Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. Big data philosophy encompasses unstructured, semi-structured and structured data; however, the main focus is on unstructured data. Big data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many exabytes of data. Big data requires a set of techniques and technologies with new forms of integration to reveal insights from datasets that are diverse, complex, and of a massive scale.
A 2016 definition states that "Big data represents the information assets characterized by such a high volume, velocity and variety to require specific technology and analytical methods for its transformation into value". Additionally, a new V, veracity, is added by some organizations to describe it, a revisionism challenged by some industry authorities. The three Vs (volume, variety and velocity) have been further expanded to other complementary characteristics of big data:
• Machine learning: big data often doesn't ask why and simply detects patterns
• Digital footprint: big data is often a cost-free byproduct of digital interaction (e.g. people's interactions on Facebook)
A 2018 definition states "Big data is where parallel computing tools are needed to handle data", and notes, "This represents a distinct and clearly defined change in the computer science used, via parallel programming theories, and losses of some of the guarantees and capabilities made by Codd's relational model". The growing maturity of the concept more starkly delineates the difference between "big data" and "Business Intelligence":
• Business Intelligence uses descriptive statistics with data with high information density to measure things, detect trends, etc.
• Big data uses inductive statistics and concepts from nonlinear system identification to infer laws (regressions, nonlinear relationships, and causal effects) from large sets of data with low information density to reveal relationships and dependencies, or to perform predictions of outcomes and behaviors.
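The "parallel computing tools" mentioned in the 2018 definition (Hadoop being the classic example) build on the map-reduce pattern, which can be illustrated in miniature with the standard library. This sketch runs sequentially on two made-up text chunks; in a real cluster, each map step would run on a different machine, which is exactly what makes the pattern scale.

```python
from collections import Counter
from functools import reduce

# Two "chunks" of a larger corpus, as a cluster would split them.
chunks = [
    "big data big value",
    "big data needs parallel tools",
]

def map_chunk(text):
    # Map step: compute local word counts for one chunk, with no shared state,
    # so chunks can be processed independently (and thus in parallel).
    return Counter(text.split())

def reduce_counts(a, b):
    # Reduce step: merge partial counts into a combined result.
    return a + b

totals = reduce(reduce_counts, map(map_chunk, chunks))
print(totals["big"], totals["data"])
```

Because the map step touches only its own chunk and the reduce step is associative, the same code structure works whether the chunks live on one laptop or on thousands of nodes; that independence, not the word counting, is the point of the pattern.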
28. Source: unstructured/raw data
29. The Pyramid of Wisdom
30. Big Data is surrounding us
31. Introduction (About the Presenter)
• Has worked in programming and computer/electronics engineering since 1991 (roughly 29 years)
• Prefers programming languages, logic and algorithms (problem solving)
• Has worked professionally at several companies across industries (ISP, manufacturing, broadcasting, telecommunications, finance/finserv, books/libraries)
• Has served as a lecturer/teacher and as a consultant across industries (agriculture, aquafarming/cold storage, telco, retail, manufacturing, human resources/capital, etc.)
• Since 2013, has focused on IoT, Drone/UAV, infosec, big data, machine learning, etc.
• Built a first AI application in 1992 using Turbo Prolog; voice recognition (1992/93); polymorphic/heuristic code (1993); a Fuzzy Logic application (1996/97); AI lecturer in 2008/10; began studying MapReduce/Hadoop in 2014
• Active participant in several Big Data conferences, including: Data For Public Policy (international conference, UN/Pulse Lab), Big Data Week (1st, 2nd), Precision Agriculture projects (with CIagri), Jakarta Smart City (literacy index, BPAD DKI Jakarta), etc.