2. Key technology aspects
CRISP-DM
• CRISP-DM stands for Cross-Industry Standard Process for Data Mining.
• Reported to be the most widely used and relied-upon analytics process in the world (Forbes report).
• Consists of 6 stages:
1. Business understanding
2. Data understanding
3. Data preparation
4. Modeling
5. Evaluation
6. Deployment
PROJECT OF IIT DELHI | AIA | DEPT OF HEAVY INDUSTRIES, GOVERNMENT OF INDIA
3. Key technology aspects
Business Understanding
• First phase of CRISP-DM
• For the data scientist
1. The key aspect is to understand the business problem and to generate value for the
company.
2. ‘Value is generated by putting models into context within the business processes of a
company to solve problems’.
• For the business analyst
1. There is often a gap in the understanding of machine learning methods.
Solution: Focus on mapping the problem and consider which solution fits it.
Traditional methods such as Business Intelligence and Six Sigma should be explored before
machine learning.
4. Key technology aspects
Defining Labels
• Map the problem into a data science method.
• Supervised learning takes more effort but is easier to optimize than
unsupervised learning, and is therefore generally recommended.
• Unsupervised learning models do not provide a quantitative measure to tune and
evaluate the model.
• Try to convert the supervised learning problem into a targeting problem.
• When selling your model to the business, use fewer equations and less
mathematical language so that it is easier to trust.
5. Key technology aspects
Defining Labels
• Answer the question: ‘What do you want to ask the data?’
Requirements
• Labels need to match the business needs
The problem being worked upon must be aligned with the business goals.
• Labels need to exist
Situations for which very little data is available cannot be used for machine learning model
training.
• Labels need to be actionable
Once a prediction is made for a given situation, it must be possible to act on it in a way
that generates value.
6. Key technology aspects
Performance Analysis
• Involves measuring the quality of data science algorithms.
• RMSE, AUC, and AUPRC are statistical measures and, on their own, are less suited to
business problems.
Regression Tasks
• Both overestimation and underestimation should be avoided: overestimation leads to
excess material that must be managed, and underestimation leads to lost shipment time.
Classification Tasks
• False positives and false negatives can impact business decisions. Medical tests are a
good example where false positives lead to money loss and false negatives lead to
treatment delay.
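The difference between a statistical measure and a business cost can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names and all cost figures (`over_cost`, `under_cost`, `fp_cost`, `fn_cost`) are assumptions chosen for the example.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error: treats over- and underestimation alike."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def asymmetric_cost(actual, predicted, over_cost, under_cost):
    """Business cost of a forecast when each unit of overestimation
    (excess material) and underestimation (lost shipment time) is
    priced differently. The unit costs are hypothetical."""
    total = 0.0
    for a, p in zip(actual, predicted):
        error = p - a
        total += over_cost * error if error > 0 else under_cost * (-error)
    return total

def classification_cost(actual, predicted, fp_cost, fn_cost):
    """Business cost of a classifier from its false positives and false negatives."""
    fp = sum(1 for a, p in zip(actual, predicted) if p == 1 and a == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if p == 0 and a == 1)
    return fp * fp_cost + fn * fn_cost

# Two forecasts with the same RMSE but different business cost.
demand = [100, 120, 80]
forecast_high = [110, 130, 90]   # always 10 units too high
forecast_low = [90, 110, 70]     # always 10 units too low

print(rmse(demand, forecast_high), rmse(demand, forecast_low))
print(asymmetric_cost(demand, forecast_high, over_cost=2, under_cost=5))
print(asymmetric_cost(demand, forecast_low, over_cost=2, under_cost=5))
```

Both forecasts are equally good by RMSE, yet under the assumed unit costs the forecast that underestimates is the more expensive one.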
7. Key technology aspects
Business Aligned Performance
• Value-based performance measures can be used, i.e., focus on maximizing revenue for the
company rather than on good statistical outcomes.
• Such performance measures for unsupervised learning are harder to find.
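One way a value-based measure might look for a supervised classifier is to pick the decision threshold that maximizes profit instead of accuracy. The sketch below assumes a fixed gain per true positive and a fixed cost per false positive; `gain_tp`, `cost_fp`, and the sample scores are hypothetical.

```python
def profit_at_threshold(scores, labels, threshold, gain_tp, cost_fp):
    """Profit if we act on every case whose score is at or above the threshold."""
    profit = 0.0
    for score, label in zip(scores, labels):
        if score >= threshold:
            profit += gain_tp if label == 1 else -cost_fp
    return profit

def best_threshold(scores, labels, gain_tp, cost_fp):
    """Try each observed score as a threshold and keep the most profitable one."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda t: profit_at_threshold(scores, labels, t, gain_tp, cost_fp),
    )

# Model scores and true outcomes for five cases (illustrative values only).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

threshold = best_threshold(scores, labels, gain_tp=100, cost_fp=30)
print(threshold, profit_at_threshold(scores, labels, threshold, 100, 30))
```

With these numbers the most profitable threshold accepts one false positive in order to capture a third true positive, which a purely statistical criterion might not choose.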
Defining Success Parameters
• Once the project beats a baseline, deploy it and keep improving it in parallel.
• Compare the project against the currently deployed models.
• If no model is deployed yet, use a naïve or simple model as the performance baseline.
• Use proper validation methods.
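A baseline comparison on held-out data could be sketched as below. This is an assumption-laden illustration: the naïve baseline predicts the training mean, the candidate is a least-squares line, and the data and function names are made up for the example.

```python
from statistics import fmean

def mae(actual, predicted):
    """Mean absolute error."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

def compare_to_baseline(train, test):
    """Return (baseline_mae, model_mae), both measured on the held-out test set."""
    train_x, train_y = zip(*train)
    test_x, test_y = zip(*test)
    # Naive baseline: always predict the mean of the training targets.
    baseline_pred = [fmean(train_y)] * len(test_y)
    # Candidate model: a straight line fitted on the training data only.
    slope, intercept = fit_line(train_x, train_y)
    model_pred = [slope * x + intercept for x in test_x]
    return mae(test_y, baseline_pred), mae(test_y, model_pred)

train = [(1, 2), (2, 4), (3, 6), (4, 8)]   # earlier observations
test = [(5, 10), (6, 12)]                  # later observations, never used for fitting

baseline_mae, model_mae = compare_to_baseline(train, test)
print(baseline_mae, model_mae)
```

The candidate is only worth deploying if it clearly beats the baseline on data it has not seen.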
8. Key technology aspects
Profile Generation
• This primarily involves filtering and cleaning the data, scrutinizing its sources,
and preparing it so that results are accurate.
• Detect patterns in the data that can be used to predict labels in the machine learning model.
• Establish the time at which each piece of data actually becomes available, which can
differ from the time at which batch processes run.
• Aggregated data is never as well suited to machine learning as the raw data it was
derived from. To get the best possible results in your project, it is therefore
important to get access to the underlying data and use that for model training
and evaluation.
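The point about data availability can be illustrated with a small filter that keeps only the records that had already been loaded at prediction time. The record layout (`recorded_at`, `available_at`, `value`) is a hypothetical schema assumed for this sketch.

```python
from datetime import datetime

def available_records(records, prediction_time):
    """Keep only records that were actually available when the prediction is made.

    A reading may be recorded on the shop floor hours before a batch job
    loads it, so filtering on the recording time alone would leak data
    the model could not have seen.
    """
    return [r for r in records if r["available_at"] <= prediction_time]

records = [
    # Recorded and streamed in near real time.
    {"recorded_at": datetime(2024, 1, 1, 9, 0),
     "available_at": datetime(2024, 1, 1, 9, 1),
     "value": 0.42},
    # Recorded in the morning, but only loaded by a nightly batch.
    {"recorded_at": datetime(2024, 1, 1, 10, 0),
     "available_at": datetime(2024, 1, 2, 2, 0),
     "value": 0.57},
]

usable = available_records(records, prediction_time=datetime(2024, 1, 1, 18, 0))
print([r["value"] for r in usable])
```

Building training profiles with the same filter that applies in production keeps evaluation results honest.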
9. Business goals
Automation
• Every business carries some inefficiencies that high-performing algorithms
can remove.
Optimization
• Enterprises are using artificial intelligence algorithms to optimize processes that
reduce overheads and improve output.
• Tightening operations means smarter budgeting and more profit.
• AI can autonomously aggregate and crunch data to provide cohesion across
sales and marketing. Data scraping has been democratized and made
accessible by AI.
10. Business goals
• Data unification and customer insight are easily and autonomously
accomplished with AI. The business case for AI rests on its ability to
free up human time.
• Other technologies such as Smart Sensors, Microcontrollers, Real-Time
Dashboards, Augmented Reality using Unity, etc. can be integrated to
achieve smart manufacturing.
Roadblocks in Implementation
Questions
What is in it for me?
Solution: Critically outline the business problems and define the goals.
That’s not my job?
Solution: About 70% of projects are not implemented because of employee
resistance (McKinsey Report). We must, first of all, clarify the goals and
find people with leadership qualities to undertake the project.
11. Business Goals for Indian Manufacturers
• Diagnosing problems and taking corrective action in time.
• Enabling entrepreneurs to effectively manage multiple facilities and make them
consumer centric.
• Predictive maintenance and quality analytics.
• Digital twin implementation (a combination of physics-based modelling, e.g. in Simulink,
and real data from a machine).
• Big data driven processes.
• Process visualization and modular production assets.
12. Elevator Pitch
The Indian manufacturing sector typically suffers from downtime, high latency in the error
rectification process, and low-skilled labour, all of which can be improved by adopting this
approach.
Downtime can be tackled by predictive maintenance and RUL estimation while error
rectification process can be made faster through suggestions based on real time
data monitoring.
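As a rough illustration of RUL (remaining useful life) estimation, the sketch below fits a linear trend to a wear indicator and extrapolates it to a failure threshold. The linear-degradation assumption, the `failure_threshold` value, and the vibration readings are all hypothetical; real machines usually need a more careful degradation model.

```python
from statistics import fmean

def estimate_rul(times, indicator, failure_threshold):
    """Estimate remaining useful life by linear extrapolation.

    Assumes the wear indicator (e.g. vibration amplitude) rises roughly
    linearly until it reaches failure_threshold. Returns the time left
    after the last reading, or None if no upward trend is detected.
    """
    mean_t, mean_v = fmean(times), fmean(indicator)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, indicator)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    if slope <= 0:
        return None  # indicator is not degrading, nothing to extrapolate
    intercept = mean_v - slope * mean_t
    time_of_failure = (failure_threshold - intercept) / slope
    return max(0.0, time_of_failure - times[-1])

hours = [0, 1, 2, 3]
vibration = [1.0, 1.5, 2.0, 2.5]   # illustrative readings

print(estimate_rul(hours, vibration, failure_threshold=5.0))
```

An estimate like this can be used to schedule maintenance before the threshold is reached rather than after a breakdown.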
Higher realized profits and a greater ability to use data are only the bare minimum
benefits that can be gained from this. (Power of 1%!!!)