Improving Casting Porosity with Data Mining
Creed Darling, Marcin Kuta, Kyle Saginus
ENMA 6060 Innovation & Technology
December 7, 2009

Overview
  • 1.0 Business Understanding
  • 2.0 Data Exploration and Description
  • 3.0 Modeling
  • 4.0 Evaluation
  • 5.0 Deployment

1.0 Business Understanding
  • 1.1 Business Objectives
    • 1.1.1 Background
    • 1.1.2 Business Objectives
    • 1.1.3 Business Success Criteria
  • 1.2 Assess Situation
  • 1.3 Risks and Contingencies
  • 1.4 Costs and Benefits
  • 1.5 Data Mining Goals
    • 1.5.1 Data Mining Goals
    • 1.5.2 Data Mining Success Criteria

1.1.1 Background
  • A casting with high gas porosity is considered defective
  • In the casting process, defective parts are put back in the “melting pot”
  • The loss of capital from defects is mostly overhead and reduced throughput
  • Defects have to be reduced to cut waste and increase throughput without increasing foundry capacity

1.1.2 Business Objectives
  • Primary Business Objective: Reduce gas porosity in castings to
    • Improve quality of foundry castings
    • Increase throughput
    • Reduce waste
  • Secondary Business Objective: Compare the data mining results obtained in this project against previous findings

1.1.3 Business Success Criteria
  • Measurable Outcomes:
    • Lower operating cost
    • Improve throughput without additional capital expenditures
    • Reduce waste

1.2 Assess Situation
  • Inventory of Resources
    • 3 students on team
    • Dataset with 39 attributes and 172 samples
    • The team has access to Weka software
  • Requirements, Assumptions, and Constraints
    • The project deadline is December 7, 2009
    • The results need to state which variables should be altered, and to what degree, in order to improve the quality of castings, reduce cost, reduce waste, and improve throughput without increasing foundry capacity

1.3 Risks & Contingencies
  • Communication with project sponsor might be difficult
    • The sponsor has published articles using data mining on our dataset
  • Limited knowledge of casting process
    • Data is data
  • Our tools might limit our findings
    • We have many different algorithms that can be employed

1.4 Costs & Benefits
  • There is no cost to execute this project, as the project sponsor provided the data
  • There is no direct financial benefit to the team; however, the information gained can help the project sponsor understand the critical variables that affect casting quality in a foundry environment, which can then increase throughput for the foundry

1.5.1 Data Mining Goals
  • To determine the variables (of the 39 provided) that have the greatest impact on reducing gas porosity in the sand castings, in order to increase the quality of the cast part

1.5.2 Data Mining Success Criteria
  • Compare our results to the findings of the project sponsor
  • Our results agree with our sponsor's but add something new

2.0 Data Exploration and Description
  • 2.1 Data Collection and Description
  • 2.2 Data Quality
  • 2.3 Data Selection
  • 2.4 Data Integration

2.1 Data Collection and Description
  • Project sponsor collected over 6500 points of data
  • 39 attributes and 172 samples
  • Attribute Types:
    • Elements in the final casting composition
    • Information regarding the casting process
    • Chronological data regarding the cast
  • Data is a mixture of numerical and nominal values

2.2 Data Quality
  • Upon review of a number of articles addressing casting procedures, it appears that the data collected by the team’s sponsor is complete and does not contain any significant errors
  • Further analysis of the provided data may reveal areas of investigation that could be broadened to provide complete findings

2.3 Data Selection
  • An article published by the sponsor found 11 of the attributes to be unnecessary in the model
  • Trials were run using our data mining tools with these attributes included and removed, and only a small difference in error was observed
  • We removed the eleven variables from the dataset

2.3 Data Selection (continued)
  • [Slide figure: list of the 39 attributes; not captured in the text]
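
The attribute removal described on the Data Selection slides can be sketched in a few lines of Python. The eleven names are the ones listed in the editor's notes for this slide; the row layout (one dict per sample), the helper name `drop_attributes`, and the sample values are assumptions made only for illustration, since the team worked in Weka rather than Python.

```python
# The 11 attributes reported as having no influence on the model.
REMOVED_ATTRIBUTES = [
    "Iron and Silicon Amount",
    "Final %Al",
    "Final %P",
    "Nozzle Supplier Code",
    "Pouring Order",
    "Mould Quality",
    "Core Coating Code",
    "Molding Sand Code",
    "Molding Coating Code",
    "Environment Temperature Before Pour",
    "Bar Test Casting Porosity",
]


def drop_attributes(rows, to_remove):
    """Return copies of the rows without the listed attribute keys."""
    removed = set(to_remove)
    return [{k: v for k, v in row.items() if k not in removed} for row in rows]


# Hypothetical sample row; the values are invented for this sketch.
sample = {
    "Pouring Order": 3,
    "Mould Quality": "A",
    "Molding Team Number": 2,
    "Porosity": "low",
}

reduced = drop_attributes([sample], REMOVED_ATTRIBUTES)
remaining_count = 39 - len(REMOVED_ATTRIBUTES)  # attributes left after removal
print(reduced)
print(remaining_count)
```

Returning new dicts instead of deleting keys in place keeps the original dataset available, which matters when trials are run both with and without the attributes.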

2.4 Data Integration
  • Created two new attributes from existing attributes
    • Total Impurities: summation of FeMnSi, FeSi, FeCaSi, and Ca
    • Total Al, Si, P: summation of Al, Si, and P
  • No formatting issues were found with the data
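
As a minimal sketch of the two derived attributes, the function below adds both sums to a sample. The dict-based row layout, the function name `add_integrated_attributes`, and the numeric values are hypothetical; only the attribute names and the summation rule come from the slide.

```python
IMPURITY_KEYS = ("FeMnSi", "FeSi", "FeCaSi", "Ca")
AL_SI_P_KEYS = ("Al", "Si", "P")


def add_integrated_attributes(row):
    """Return a copy of the row with the two summed attributes added."""
    out = dict(row)
    out["Total Impurities"] = sum(row[k] for k in IMPURITY_KEYS)
    out["Total Al, Si, P"] = sum(row[k] for k in AL_SI_P_KEYS)
    return out


# Hypothetical composition values, chosen only to make the sums easy to check.
sample = {
    "FeMnSi": 1.0, "FeSi": 2.0, "FeCaSi": 0.5, "Ca": 0.25,
    "Al": 0.5, "Si": 1.5, "P": 0.25,
}

integrated = add_integrated_attributes(sample)
print(integrated["Total Impurities"])  # 3.75
print(integrated["Total Al, Si, P"])   # 2.25
```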

3.0 Modeling
  • 3.1 Modeling Techniques
  • 3.2 Build Model
  • 3.3 Test Model
  • 3.4 Assess Model

3.1 Modeling Techniques
  • Experiments were conducted with each of the following algorithms to discover our best model
    • One Rule Classifier
    • Naïve Bayes Classifier
    • J48 Decision Tree Classifier
    • Multilayer Perceptron Neural Network
  • Training sets were used with some discretization
  • Both the dataset with the 11 attributes removed and the integrated dataset were used for all of the experiments
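
To give a feel for the simplest of the listed algorithms, here is a small pure-Python sketch of the One Rule idea: for each attribute, predict the majority class for each of its values, then keep the attribute whose rule makes the fewest training errors. This is not Weka's implementation, and the attribute names and rows below are invented toy data, not the project dataset.

```python
from collections import Counter, defaultdict


def one_rule(rows, target):
    """Return (attribute, rule, errors) for the best single-attribute rule."""
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        # Count class labels for every value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # The rule maps each attribute value to its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy data (two nominal attributes, one class label).
rows = [
    {"team": "A", "sand": "x", "porosity": "low"},
    {"team": "A", "sand": "y", "porosity": "low"},
    {"team": "B", "sand": "x", "porosity": "high"},
    {"team": "B", "sand": "y", "porosity": "high"},
    {"team": "A", "sand": "x", "porosity": "low"},
    {"team": "B", "sand": "y", "porosity": "low"},
]

attribute, rule, errors = one_rule(rows, "porosity")
print(attribute, rule, errors)
```

On this toy data the `team` rule misclassifies one row and the `sand` rule two, so `team` is selected. A baseline this simple is useful mainly as a reference point for the more complex classifiers.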
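
3.2 Build Model
  • The experiments using the J48 Decision Tree classifier resulted in the best model for our dataset
  • This algorithm was chosen for analyzing our dataset
  • The following slides show the details of building the model

3.2 Build Model (continued)
  • [Slide figure: J48 model setup in Weka; not captured in the text]

3.2 Build Model (continued)
  • [Slide figure: table of misclassified instances per dataset; not captured in the text]

3.3 Test Model
  • [Slide figure: J48 classification tree; not captured in the text]

3.4 Assess Model
  • From the Weka output we can see that the model misclassified only 2.33% of the instances
  • The classification results are shown in the confusion matrix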
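
The 2.33% figure follows directly from the counts given in the editor's notes (4 misclassified out of 172 samples). A short check, with a hypothetical helper name:

```python
def misclassification_rate(misclassified, total):
    """Fraction of instances the model classified incorrectly."""
    return misclassified / total


rate = misclassification_rate(4, 172)
accuracy = 1 - rate

print(f"{rate:.2%}")      # 2.33%
print(f"{accuracy:.2%}")  # 97.67%
```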

4.0 Evaluation
  • 4.1 Evaluate Results
  • 4.2 Review Process
  • 4.3 Next Steps

4.1 Evaluate Results
  • From the J48 Decision Tree we can see that the attribute with the largest impact on porosity is the “molding team number”
  • This indicates that the casting process is very dependent on the workers
  • This is not a surprise and was already recognized by the sponsor
  • The molding team attribute was removed from the dataset to see which attribute was the next most important

4.1 Evaluate Results (continued)
  • [Slide content not captured in the text]
  • Also confirms that the process is heavily influenced by manufacturing personnel

4.2 Review Process
  • The process seems to have followed all of the correct steps and has yielded acceptable results
  • Results could be improved or new findings made by using a different software package with different algorithms

4.3 Next Steps
  • For continuous monitoring and improvement of casting porosity while limiting excess data acquisition, it is suggested that data pertaining to casting failures be collected only on the attributes appearing in the J48 Decision Tree

5.0 Deployment Plan
  • The completed model and results met our business objectives
  • Send our final report and model to sponsor for
    • Review
    • Monitoring and maintenance

THANK YOU

Editor's notes

  1. Welcome to our presentation on improving casting porosity with data mining. This presentation was completed for Marquette University’s Engineering Management course on Innovation and Technology by Creed Darling, Marcin Kuta, and Kyle Saginus.
  2. Here is a quick overview of the presentation. We will start by first discussing our business understanding of what we planned to accomplish with this project. Then we will discuss the different aspects of the data and the model we developed to analyze it. We will conclude the presentation by evaluating our model in terms of accomplishing our business goals and the deployment of our model.
  3. For the business understanding section of this presentation we will give a short background along with our objectives and success criteria. We will then present our team’s situation including our requirements and resources followed by the risks and contingency plans for the project. Finally we will talk about the costs and benefits of the project and how we planned to accomplish our business objectives through data mining.
  4. Metal castings are used throughout several industries, varying anywhere from the construction industry to the aerospace industry. Depending upon the final application for the casting, overall quality is essential. Among other factors, castings can be considered defective when a certain level of gas porosity is present in the final product. A defective part is typically returned to the melting pot and recast. This might lead you to believe that defective castings don’t have a significant cost associated with them but a casting’s cost is usually about 10% material and 90% overhead, so a defective casting does have a major cost. Also each casting that is considered defective is a product that went through the entire process but did not make it out the door, so the foundry’s throughput is reduced. Reducing defective parts will increase a foundry’s throughput without spending more capital to increase capacity.
  5. Based on the background, our business objectives are simple. Our goal was to reduce the gas porosity in castings to improve the quality of castings, increase the throughput of the foundry, and reduce waste. Our project sponsor has already collected data and analyzed it using data mining techniques. It might seem then that the project is already complete, but there are many different approaches in data mining and a new team can always add new insight. Our secondary business objective was to compare our findings to our sponsor's and hopefully find similarities to ensure our approach is correct, but also to bring something new to the table.
  6. For our business success criteria, based on our business objectives, we are focusing on the following measurable outcomes. By reducing the number of defects, our project will be successful if there is a lower operating cost, improved throughput in the absence of increased capacity, and reduced waste.
  7. The resources available to the team are three students in the innovation and technology course, the dataset collected by the sponsor, and access to Weka data mining freeware. We were required to complete the project by December 7th (today), and the results found needed to meet the business objectives. Since we are not directly linked to the foundry, we needed to make recommendations to our project sponsor as to which attributes in the dataset we were given have the largest impact on high casting porosity. The project sponsor will then employ the recommendations.
  8. A few risks associated with the project were addressed at the outset, and contingency plans were developed. Since our sponsor is from a foreign country, it was assumed that communication with the sponsor would be difficult; however, it was thought that this shouldn’t inhibit our work, as the sponsor had already published numerous articles on this data. Our team had limited knowledge of the casting process, but in one sense it is not necessary to understand the process to analyze the data. Also, when checking whether our recommendations were accurate, the team had the sponsor's findings to see if the analysis was headed in the right direction. The software package that we had available to us may not have the best model for this data, but there are enough algorithms in the software that the team felt it would be able to find a model that is sufficient.
  9. The team will not be compensated in any way for completing the project so there is no cost involved for our analysis, but there is also no direct benefit to the team aside from gaining experience in data mining. The foundry will however receive a benefit from our project by possibly improving the quality of their castings, and identifying areas of process improvement that increase the throughput of the foundry and reduce operating cost.
  10. The team's data mining goal was to find the attribute(s) in the dataset that have the largest impact on high casting porosity, which results in defective parts. With this knowledge the casting process can be altered to reduce the number of defective castings.
  11. The team’s data mining success criteria depend mostly on the sponsor’s findings. We will know that we have found a good model when our results show findings similar to our sponsor’s. We want to add something to our sponsor’s knowledge of the data, so we will also consider the data mining to be successful when we have found something new.
  12. Following the Business Understanding section, this segment of the presentation will focus on Data Exploration and Description. Here we will explore Data Collection and Description, Data Quality, Data Selection and Data Integration.
  13. The team was provided with over 6500 points of data which were collected by the project sponsor. The data set consisted of 39 attributes derived from 172 samples. The attribute types provided insight into the physical nature of the final product by listing elements in the final composition of the casting. Other attributes provided valuable information into the casting process technology by listing information regarding the casting process. The last set of attributes characterized the casting process itself by listing chronological data regarding the cast. The data points were a mixture of numerical and nominal values.
  14. Since the data points were provided by the project sponsor, a data evaluation step was performed by the team in order to ensure adequate data quality. The team focused on a thorough review of published literature related to gas porosity in foundry processes. The article review supported the sponsor’s data definition and collection methods. Furthermore, it led the team to believe that further analysis of the provided data may reveal areas of investigation that could be broadened to provide complete findings.
  15. In parallel to data quality, data selection is a critical step in the data mining analysis. One article published by the project sponsor found that 11 of the 39 attributes provided no insight or value to the model. To confirm this, trials were run using various data mining tools and models with those attributes included and then removed, and only a small difference in error was observed. In conclusion, the 11 attributes, which showed no influence on gas porosity in the final casting, were removed from the data set.
  16. Out of the 39 attributes listed in this section, the following 11 attributes showed no influence on the model. They were labeled “Iron and Silicon Amount,” “Final %Al,” “Final %P,” “Nozzle Supplier Code,” “Pouring Order,” “Mould Quality,” “Core Coating Code,” “Molding Sand Code,” “Molding Coating Code,” “Environment Temperature Before Pour,” and “Bar Test Casting Porosity.”
  17. The data integration and data review led to the development of two new attributes from the data set: “Total Impurities,” the summation of the FeMnSi, FeSi, FeCaSi, and Ca levels, and “Total Al, Si, P,” the summation of the Al, Si, and P levels. No formatting issues were found with the data.
  18. The next step in the project was to develop a proper model that enabled the team to analyze the data points and derive appropriate conclusions about the gas porosity issues found in the casting process. This section of the presentation will focus on Modeling Techniques, Building a Model, Testing a Model, and Assessing the Model.
  19. In an effort to define and build the best model, the team evaluated various algorithms found in the Weka software. A number of simulations were conducted using the One Rule Classifier, Naïve Bayes Classifier, J48 Decision Tree Classifier, and Multilayer Perceptron Neural Network algorithms. In order to generate adequate results, the training sets were used with some level of discretization. The data set with the 11 attributes removed and the integrated data set were used for all of the experiments.
  20. The simulation efforts using a number of different algorithms resulted in the selection of the J48 Decision Tree classifier as the appropriate model for this project. The J48 algorithm generated the best results and yielded the lowest classification error.
  21. This is the J48 model setup. The model was generated using the software defaults, with the exception of “saveInstanceData.” This option was changed to true, which allowed the team to find out how each sample is classified after building the J48 classification tree.
  22. This table lists the number of misclassified instances. The original data set with 11 attributes removed showed the same results as the integrated data set. We can see that the non-discretized sets generated only 4 misclassified instances, whereas the discretized data sets yielded as many as 13 misclassified instances.
  23. This is the J48 classification tree generated by the model using the non-discretized, original dataset. We can see that the tree has 9 leaves, and that the total size of the tree is 23 elements. The size of the tree is important when testing the model. In general, the goal of the tree learner is to classify the most test samples correctly while reducing the tree size and the number of leaves. Considering the number of attributes in this data set, the model did a good job in generating only 9 leaves and 23 elements.
  24. The final step within the modeling phase consisted of model assessment. From the Weka output shown, it was observed that the model misclassified only 2.33% of the instances. This value was derived from the misclassification of 4 out of the 172 samples available.
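
Note 22 compares discretized and non-discretized data sets but does not say which discretization method was applied. As one possible illustration only, the sketch below shows equal-width binning, a common unsupervised scheme; the choice of method, the function name, the bin count, and the temperature values are all assumptions rather than details from the project.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land on index n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute values, invented for this sketch.
temperatures = [10.0, 12.0, 15.0, 19.0, 20.0]

bins = equal_width_bins(temperatures, 2)
print(bins)  # [0, 0, 1, 1, 1]
```

Binning replaces each numeric value with a coarse category, which discards detail within each bin. That loss of detail is one plausible reason a discretized data set could produce more misclassified instances than the raw one.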