This presentation is a friendly introduction to Artificial Intelligence, Data Science and Machine Learning. It touches on the beginnings of AI, the steps involved in Data Science, the roles involving operations on data, and the buzz around "Technology Singularity".
It ends by looking at tools and system requirements for people who might want to start a career in AI.
Have fun exploring Artificial Intelligence!
An Elementary Introduction to Artificial Intelligence, Data Science and Machine Learning
1. An Elementary Introduction to Data Science, Machine Learning & Artificial Intelligence
Author: Agbo Dozie
2. | Content
00 Turing’s 1950 Question
01 Introduction to Artificial Intelligence
02 The Machines and The Algorithms
03 Mathematics in Artificial Intelligence
04 Real-world use cases of Artificial Intelligence
05 Why and How to start a Data Science & AI career
06 AI Singularity
07 System Requirements for ML, DS, and DL
08 Tools for Artificial Intelligence
7. | Introduction to AI
What is AI?
The English Oxford Living Dictionary gives this definition: “The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”
Merriam-Webster Dictionary says “Artificial Intelligence is a branch of computer science that deals with simulating human intelligence”.
[Diagram labels: Artificial Intelligence (a program that can sense, reason, act, and adapt) • Machine Learning • Deep Learning • Data Science]
8. | AI complements the skills that humans are naturally good at
Human: Common Sense • Morals • Imagination • Compassion • Abstraction • Dreaming • Generalisation
AI: Locating Knowledge • Pattern Identification • Natural Language • Machine Learning • Eliminating Bias • Endless Capacity
9. | The drivers behind AI’s projected growth
1 Spectacular improvements in AI performances
Thanks to new technologies and the increase in data generation, significant improvements in AI research are being made.
2 Increasing adoption of AI in businesses and daily life
The performance improvements brought especially by deep learning have enabled the development of more sophisticated applications fit for business and daily life use.
3 Growing awareness of AI importance and social implications
The emergence of daily-life applications of AI has raised awareness of its importance for the future and its social implications.
Projected global AI market revenue (Source: Tractica)
2016: $0.6b • 2020F: $6.0b • 2025E: $36.8b (+57% per year)
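To see where a "per year" growth figure like the one on this slide comes from, here is a minimal sketch of the compound annual growth rate formula. It uses the rounded 2016 and 2025 revenue figures from the slide; the function name is only illustrative. Because the inputs are rounded, the result lands close to the quoted +57% rather than exactly on it.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Rounded endpoints read off the slide, in billions of US dollars.
revenue_2016 = 0.6
revenue_2025 = 36.8

cagr = compound_annual_growth_rate(revenue_2016, revenue_2025, years=2025 - 2016)
print(f"Implied growth rate: {cagr:.1%} per year")
```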
10. | AI: Regulatory Considerations
▪ Government Office for Science: Artificial intelligence: opportunities and implications for the future of decision making
▪ Financial Stability Board: Artificial intelligence and machine learning in financial services
▪ The Institute of Internal Auditors: Artificial Intelligence – Considerations for the Profession of Internal Auditing
▪ Centre for Data Ethics and Innovation: To provide independent and expert advice to the UK government
▪ Department for Digital, Culture, Media & Sport: Centre for Data Ethics and Innovation
12. | Data Science works because Data is being created at a Phenomenal Rate...
▪ 294 billion emails were sent worldwide every day in 2019.
▪ By 2025, it’s estimated that 463 exabytes of data will be created each day globally; that’s the equivalent of 212,765,957 DVDs per day!
▪ 500 million tweets are sent out each day. That means about 6,000 tweets every second. The most popular emoji in these tweets is the tears of joy 😂
▪ 4.2 million times a year, we blink our eyes.
▪ 70 million kisses are made every day.
▪ $13 billion was the size of the global AI market in 2017; it is expected to grow at an annual growth rate of 50.1%.
▪ 65 billion WhatsApp messages are sent every day. As of October 2019, the 2nd most common emoji on the platform was the “Red Heart” ❤️. There’s a lot of love in the air, or should I say on WhatsApp.
Source: WEForum
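Daily totals this large are easier to feel as per-second rates. The short sketch below only divides the daily counts quoted on this slide by the number of seconds in a day; the dictionary keys are illustrative labels, not names from any dataset.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Daily counts as quoted on the slide.
daily_counts = {
    "emails": 294e9,
    "tweets": 500e6,
    "WhatsApp messages": 65e9,
}

# Convert each daily total into an average per-second rate.
per_second = {name: count / SECONDS_PER_DAY for name, count in daily_counts.items()}

for name, rate in per_second.items():
    print(f"{name}: about {rate:,.0f} per second")
```

The tweet rate comes out a little under 6,000 per second, which matches the rounded figure on the slide.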
14. | 5 W’s of Data Science
Columns: DATA (Traditional Data, Big Data) and DATA SCIENCE (Business Intelligence, Traditional Methods, Machine Learning)
Timeline: PAST • NOW • FUTURE
WHEN it is applied
▪ Traditional Data, Big Data: At the beginning of your analysis
▪ Business Intelligence: After the data has been gathered & organized
▪ Traditional Methods, Machine Learning: After BI reports have been created and discussed
WHY you need it
▪ Traditional Data, Big Data: Data-driven decisions require well-organized and relevant raw data stored in a digital format
▪ Business Intelligence: Use data to create reports and dashboards to gain business insights
▪ Traditional Methods: Assess potential future scenarios by using advanced statistical methods
▪ Machine Learning: Utilize artificial intelligence to predict behavior in unprecedented ways
WHAT techniques are involved
▪ Traditional Data: Data Collection; Preprocessing (class labeling: categorical vs numerical; data cleansing; dealing with missing values)
▪ Big Data: Data Collection; Preprocessing (class labeling: number, text, images, videos, audio; data cleansing; dealing with missing values)
▪ Business Intelligence: Analyze the data; extract info and present it in the form of metrics, KPIs, reports, and dashboards
▪ Traditional Methods: Regression, Clustering, Factor Analysis, Time Series
▪ Machine Learning: Supervised Learning, Unsupervised Learning, Reinforcement Learning
WHERE it is applied
▪ Traditional Data: Basic Customer Data, Historical Stock Price Data
▪ Big Data: Social Media, Financial Trading Data
▪ Business Intelligence: Price Optimization, Inventory Management
▪ Traditional Methods: User Experience (UX), Sales Forecasting
▪ Machine Learning: Fraud Detection, Client Retention
WHO performs the tasks
▪ Traditional Data: Data Architect, Data Engineer, Database Administrator
▪ Big Data: Big Data Architect, Big Data Engineer
▪ Business Intelligence: BI Analyst, BI Consultant, BI Developer
▪ Traditional Methods: Data Scientist, Data Analyst
▪ Machine Learning: Data Scientist, Machine Learning Engineer
15. | The Data Science Lifecycle
Problem Understanding • Data Mining • Data Cleaning • Feature Engineering • Predictive Analytics • Visualization
Problem Understanding
The problem should be clear, concise, and measurable.
Basic characteristics of a well-defined data problem:
▪ The solution to the problem is likely to have enough positive impact to justify the effort.
▪ Enough data is available in a usable format.
▪ Stakeholders are interested in applying data science to solve the problem.
16. | The Data Science Lifecycle
Data Mining
In simple words, data mining is defined as a process used to
extract usable data from a larger set of any raw data.
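As a small illustration of pulling usable data out of a larger raw set, the sketch below scans some made-up log text and keeps only the purchase records. The log format, the field names, and the `mine_purchases` helper are all assumptions for this example; real data mining works on far larger and messier sources.

```python
import re

# Made-up raw data: a mix of useful lines and noise.
RAW_LOG = """\
2019-10-01 INFO user=ada action=login
corrupted ### line
2019-10-01 INFO user=bob action=purchase amount=25.50
2019-10-02 DEBUG heartbeat
2019-10-02 INFO user=ada action=purchase amount=12.00
"""

# Pattern describing the only kind of line we consider usable here.
PURCHASE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) INFO user=(?P<user>\w+) "
    r"action=purchase amount=(?P<amount>\d+\.\d+)$"
)


def mine_purchases(raw_text):
    """Return structured purchase records found in the raw text."""
    purchases = []
    for line in raw_text.splitlines():
        match = PURCHASE.match(line)
        if match:
            purchases.append({
                "date": match["date"],
                "user": match["user"],
                "amount": float(match["amount"]),
            })
    return purchases


purchases = mine_purchases(RAW_LOG)
print(purchases)
```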
17. | The Data Science Lifecycle
Data Cleaning
Data cleansing or data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database and refers to identifying incomplete, incorrect, inaccurate or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.
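Here is a minimal sketch of those three actions (deleting, modifying, replacing) on a tiny made-up record set. The field names, the 0 to 120 age range, and the choice of the median as the fill value are assumptions for illustration; the right rules always depend on the data at hand.

```python
from statistics import median

# Made-up records with typical problems.
raw_records = [
    {"id": 1, "age": 34, "city": "Lagos"},
    {"id": 2, "age": None, "city": "Abuja"},    # missing value
    {"id": 3, "age": -5, "city": "Enugu"},      # impossible value
    {"id": 1, "age": 34, "city": "Lagos"},      # duplicate of id 1
    {"id": 4, "age": 29, "city": " lagos "},    # inconsistent formatting
]


def clean(records):
    """Drop duplicates, normalise text, and fill missing or invalid ages."""
    seen_ids = set()
    cleaned = []
    for record in records:
        if record["id"] in seen_ids:
            continue  # delete duplicate records
        seen_ids.add(record["id"])
        age = record["age"]
        if age is not None and not 0 <= age <= 120:
            age = None  # treat impossible ages as missing
        cleaned.append({
            "id": record["id"],
            "age": age,
            "city": record["city"].strip().title(),  # modify messy text
        })
    # Replace missing ages with the median of the known ones.
    fill_value = median(r["age"] for r in cleaned if r["age"] is not None)
    for record in cleaned:
        if record["age"] is None:
            record["age"] = fill_value
    return cleaned


cleaned = clean(raw_records)
print(cleaned)
```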
18. | The Data Science Lifecycle
Feature Engineering
Feature engineering is the process of using domain knowledge to extract
features from raw data via data mining techniques.
Too many cooks spoil the broth.—Old Proverb
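A quick sketch of the idea: a raw timestamp is hard for a model to use directly, but domain knowledge suggests that the hour, the weekend, and late-night activity may matter. The record layout and the chosen features below are assumptions for illustration, and in the spirit of the proverb only a handful of features are kept.

```python
from datetime import datetime


def engineer_features(transaction):
    """Turn one raw transaction into a few model-friendly features."""
    timestamp = datetime.strptime(transaction["timestamp"], "%Y-%m-%d %H:%M")
    return {
        "amount": transaction["amount"],
        "hour": timestamp.hour,
        "is_weekend": timestamp.weekday() >= 5,  # Saturday is 5, Sunday is 6
        "is_night": timestamp.hour < 6 or timestamp.hour >= 22,
    }


# Made-up raw record.
raw_transaction = {"timestamp": "2019-10-05 23:15", "amount": 120.0}
features = engineer_features(raw_transaction)
print(features)
```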
19. | The Data Science Lifecycle
Predictive Analytics
Predictive analytics encompasses a variety of statistical techniques from
data mining, predictive modelling, and machine learning, that analyze
current and historical facts to make predictions about future or
otherwise unknown events.—Wikipedia
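One of the simplest such techniques is fitting a straight line to historical values and extending it forward. The sketch below does ordinary least squares by hand on made-up monthly sales; the numbers and names are purely illustrative, and real forecasts would need more data and some measure of uncertainty.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up historical facts: sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(f"Forecast for month 6: {forecast_month_6:.1f}")
```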
20. | The Data Science Lifecycle
Visualization
Data visualization is the graphical representation of information
and data. By using visual elements like charts, graphs, and maps, data
visualization tools provide an accessible way to see and understand
trends, outliers, and patterns in data.—Tableau
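Even a plain-text chart makes an outlier jump out faster than a column of numbers. This sketch draws a bar chart with `#` characters from made-up monthly figures; the function name and data are assumptions, and a real project would normally use a dedicated charting tool instead.

```python
def text_bar_chart(values, width=40):
    """Render a dict of label -> number as horizontal text bars."""
    largest = max(values.values())
    lines = []
    for label, value in values.items():
        bar = "#" * round(value / largest * width)  # scale to the widest bar
        lines.append(f"{label:<10} {bar} {value}")
    return "\n".join(lines)


# Made-up data: March is the outlier.
monthly_sales = {"Jan": 10, "Feb": 12, "Mar": 30, "Apr": 14}
chart = text_bar_chart(monthly_sales)
print(chart)
```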
22. | Machine Learning & Branches
Machine learning is an application of artificial intelligence (AI) that provides systems the
ability to automatically learn and improve from experience without being explicitly
programmed. (Source: Expertsystem)
Branches of Machine Learning
Supervised Learning • Unsupervised Learning • Reinforcement Learning
Supervised learning is the
machine learning task of learning
a function that maps an input to an
output based on example input-
output pairs.
It infers a function from labeled
training data consisting of a set of
training examples.
Source: Wikipedia
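A minimal sketch of learning from input-output pairs is the nearest-neighbour rule: to label a new input, copy the label of the most similar training example. The data points and labels are made up, and this is just one of many supervised methods.

```python
def nearest_neighbor_predict(training_pairs, new_input):
    """Predict the label of new_input from labeled (input, output) pairs."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(
        training_pairs,
        key=lambda pair: squared_distance(pair[0], new_input),
    )
    return label


# Made-up labeled training data: (input features, output label).
training_pairs = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

prediction = nearest_neighbor_predict(training_pairs, (7.5, 8.0))
print(prediction)
```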
Unsupervised learning is a type of
machine learning algorithm used
to draw inferences from datasets
consisting of input data without
labeled responses. The most
common unsupervised learning
method is cluster analysis, which
is used for exploratory data
analysis to find hidden patterns or
grouping in data.
Source: Mathworks
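Cluster analysis can be sketched with k-means on one-dimensional data: no labels are given, and the algorithm finds the groups on its own. The data, the fixed number of iterations, and the way the starting centers are chosen are simplifying assumptions for this example.

```python
def k_means_1d(points, k, iterations=10):
    """Group unlabeled numbers into k clusters around their means."""
    ordered = sorted(points)
    # Deterministic start: one center from the middle of each equal-sized chunk.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda c: abs(point - centers[c]))
            clusters[nearest].append(point)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[index]
            for index, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Made-up unlabeled data with two obvious groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = k_means_1d(points, k=2)
print(centers, clusters)
```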
Reinforcement learning is the training
of machine learning models to make a
sequence of decisions. The agent
learns to achieve a goal in an uncertain
environment.
An agent gets trained based on a
reward-punishment system for right
and wrong choices respectively. Hence
the right choices are reinforced.
Source: Deepsense.ai
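The reward-punishment idea can be sketched with a two-armed bandit, one of the simplest reinforcement learning settings: the agent repeatedly picks an action, receives +1 or -1, and gradually favours the action that pays off more often. The reward probabilities, the epsilon-greedy strategy, and the fixed random seed are all assumptions chosen for illustration.

```python
import random


def train_bandit(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent learning which action is rewarded most often."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    n_actions = len(reward_probabilities)
    value_estimates = [0.0] * n_actions
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore a random action
        else:
            # Exploit the action that currently looks best.
            action = max(range(n_actions), key=lambda a: value_estimates[a])
        # Reward (+1) for a right choice, punishment (-1) for a wrong one.
        reward = 1.0 if rng.random() < reward_probabilities[action] else -1.0
        counts[action] += 1
        # Running average of the rewards seen for this action.
        value_estimates[action] += (reward - value_estimates[action]) / counts[action]
    return value_estimates, counts


# Made-up environment: action 1 is rewarded far more often than action 0.
estimates, counts = train_bandit([0.2, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, estimates, counts)
```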
Machine learning involves the use of algorithms to detect patterns in large sets of data.
24. | Deep Learning (On a very High Level)
Deep learning is a machine learning technique that teaches computers to do what comes
naturally to humans: learn by example. In Deep Learning, artificial neural networks learn
patterns by propagating forward and backward through the network, updating assumed
weights and biases. It is the key to voice control in consumer devices like phones, tablets,
TVs, and hands-free speakers. (Source: Mathworks)
How Deep Learning Works
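To make "propagating forward and backward" concrete, here is a sketch with a single artificial neuron, the building block that deep networks stack in many layers. It learns the logical OR function: the forward pass computes an output, the backward pass measures the error, and the weights and bias are nudged to reduce it. The task, learning rate, and number of epochs are assumptions for illustration; a real deep network repeats the same idea across many layers using the chain rule.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(examples, epochs=2000, learning_rate=0.5):
    """Train one sigmoid neuron with gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Forward pass: weighted sum, then activation.
            z = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = sigmoid(z)
            # Backward pass: for cross-entropy loss the gradient with
            # respect to z is simply (output - target).
            error = output - target
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Learn logical OR by example.
or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_examples)
for inputs, target in or_examples:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```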
29. | The Place of Math in AI or AI in mAthematIcs…
“A person working in the field of AI who doesn’t know math is like a politician who doesn’t know how to persuade. Both have an inescapable area to work upon!”
—Abhishek Parbhakar
▪ Linear Algebra and Calculus (Multivariate)
▪ Probability (Bayes’ Theorem, Probability Distributions, Conjugate Priors, Random Variables, etc.)
▪ Statistics
▪ Markov Chains - definition, transition matrix, stationarity
▪ Information theory - entropy, cross-entropy, KL
divergence, mutual information
▪ And imo, we should just learn more Math; you never know
when you would need it.
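A few of the items in this list fit in a handful of lines each. The sketch below shows entropy, cross-entropy and KL divergence (in bits), Bayes' theorem for a yes/no test, and the stationary distribution of a small Markov chain found by repeated multiplication with the transition matrix. All the numbers are made up for illustration, and the function names are only suggestions.

```python
import math


def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bayes_posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) by Bayes' theorem."""
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


def stationary_distribution(transition, steps=200):
    """Approximate the stationary distribution by power iteration."""
    n = len(transition)
    dist = [1.0 / n] * n
    for _ in range(steps):
        dist = [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return dist


# Information theory: cross-entropy equals entropy plus KL divergence.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(entropy(p), cross_entropy(p, q), kl_divergence(p, q))

# Bayes' theorem: a rare condition and an imperfect test.
posterior = bayes_posterior(prior=0.01, true_positive_rate=0.9, false_positive_rate=0.05)
print(posterior)

# Markov chain: each row of the transition matrix sums to 1.
transition_matrix = [[0.9, 0.1], [0.5, 0.5]]
stationary = stationary_distribution(transition_matrix)
print(stationary)
```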
31. | Use Cases of Artificial Intelligence
▪ Tesla
▪ Netflix and YouTube
▪ Siri, Alexa, and Amazon Echo
▪ IBM Watson
▪ Retina AI
32. 05 Why and How to Start a Data Science & AI Career
33. | Why Start a Data Science / AI Career
▪ It is dubbed the “sexiest job of the 21st century” by the Harvard
Business Review
▪ The average data scientist salary is $113,436, according to Glassdoor.
Okay, that is in the United States, but the pay is also fairly decent in
other parts of the world if you know your onions
▪ With the astronomical rise in data generation, the demand for data scientists will only go higher. If you would not mind crunching data to solve problems, why restrain yourself from becoming a data scientist?
34. | How to Start a Data Science / AI Career
▪ Choose the role that interests you
▪ Take up a course and complete it
▪ Choose a language and stick to it
▪ Join a peer group
▪ Focus on applications and not just theories
▪ Follow the right resources
▪ Work on communication skills
35. How Long Would it Take Before the Machines Take Over? An AI Apocalypse… or more accurately, the infamous “Technology Singularity”?
666
36. | AI Armageddon or Not AI Armageddon…?
So the question is: do you think AI will get so powerful that it colonizes the planet?
“The pace of progress in artificial
intelligence (I’m not referring to
narrow AI) is incredibly fast. Unless
you have direct exposure to groups
like DeepMind, you have no idea
how fast—it is growing at a pace
close to exponential. The risk of
something seriously dangerous
happening is in the five-year
timeframe. 10 years at most.” —Elon
Musk wrote in a comment on
Edge.org
“The Development of full Artificial
Intelligence could spell the end of the
human race. It would take off on its own,
and re-design itself at an ever increasing
rate. Humans, who are limited by slow
biological evolution, couldn’t compete, and
would be superseded.” —Stephen Hawking
told BBC
37. | Not in a 100 Years!
“The big AI dreams of making
machines that could someday evolve
to do intelligent things like humans
could - I was turned off by that. I
didn't really think that was feasible
when I first joined Stanford.”—
Andrew Ng
My Argument is based on:
▪ Moore’s law would not support rapid demand for AI processing: this
resembles the Computational Complexity argument.
▪ In physics and philosophy, we are still battling to understand
consciousness; to understand existential and emotional intelligence. Those
qualities would be needed by an AI that wishes to take over the world. Like,
can AI appreciate good music yet? No.
▪ Not a very traditional argument: my instincts. It worked for Ramanujan, after all ☺.
I do not think we should fear any super-intelligent AI colonization for at least xx years.
39. | Other Reasons Against a Singularity
There are fundamental limits in the Universe; no signal, for instance, propagates faster than the speed of light. Dunbar’s number comes from the observed correlation between brain size for primates and average social group size. It puts a limit of between 100 and 250 stable relationships on human social groups. There is no proof that AI can maintain a stable relationship.
And how do we forget Vernor Vinge?
41. | Focus on Requirements for Deep Learning
▪ GPU: RTX 2070 or RTX 2080 Ti. GTX 1070, GTX 1080, GTX 1070 Ti, and GTX 1080 Ti are also options.
▪ CPU: 1-2 cores per GPU, depending on how you preprocess data; > 2GHz. The CPU should support the number of GPUs that you want to run. PCIe lanes do not matter.
▪ RAM:
– Clock rates do not matter; buy the cheapest RAM.
– Buy at least as much CPU RAM as the RAM of your largest GPU.
– More RAM can be useful if you frequently work with large datasets.
▪ Hard drive/SSD:
– Hard drive for data (>= 3TB)
– Use SSD for comfort and preprocessing small datasets.
▪ PSU:
– Add up watts of GPUs + CPU. Then multiply the total by 110% for required wattage.
– Get a high efficiency rating if you use multiple GPUs.
– Make sure the PSU has enough PCIe connectors (6+8 pins)
▪ Cooling:
– CPU: get a standard CPU cooler or an all-in-one (AIO) water cooling solution
– GPU: use air cooling
– Get GPUs with “blower-style” fans if you buy multiple GPUs
– Set the coolbits flag in your Xorg config to control fan speeds
▪ Motherboard:
– Get as many PCIe slots as you need for your (future) GPUs (one GPU takes two slots; max 4 GPUs per system)
▪ Monitors:
– An additional monitor might make you more productive than an additional GPU.
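The PSU and RAM rules of thumb above are simple arithmetic, sketched below. The wattage and memory figures are hypothetical placeholders, not values from any spec sheet, so look up the real numbers for your own parts; the function names are also just for this example.

```python
import math


def required_psu_watts(gpu_watts, cpu_watts):
    """Add up GPU and CPU watts, then multiply the total by 110%."""
    total = sum(gpu_watts) + cpu_watts
    return math.ceil(total * 110 / 100)


def minimum_cpu_ram_gb(gpu_memory_gb):
    """At least as much CPU RAM as the memory of the largest GPU."""
    return max(gpu_memory_gb)


# Hypothetical build: two GPUs and one CPU (illustrative numbers only).
gpu_watts = [250, 250]
cpu_watts = 95
gpu_memory_gb = [11, 11]

psu_watts = required_psu_watts(gpu_watts, cpu_watts)
ram_gb = minimum_cpu_ram_gb(gpu_memory_gb)
print(f"PSU: at least {psu_watts} W, RAM: at least {ram_gb} GB")
```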
43. | Tools for Data Science + Deep Learning Framework
General Data Science & Machine Learning Tools
▪ Language: Python, Julia, R, etc.
▪ Platform: Jupyter Notebook, Anaconda, Google Colaboratory, and text editors and IDEs from Atom to VS Code and PyCharm, etc.
▪ Excel, Tableau, Power BI for visualization
Frameworks for Deep Learning
▪ TensorFlow
▪ PyTorch
▪ Keras
▪ MXNet
▪ CNTK (Microsoft Cognitive Toolkit)
▪ Caffe and Caffe2
▪ DeepLearning4J
▪ Chainer
44. ?
Thanks for Listening!!
Any Questions?
Along with this presentation is a 5-month guide to bootstrap a career in Data Science; someone graciously
compiled the document, the Universe bless their souls. Contact me at agbodozie660@gmail.com.