FusionOps

PRERNA SHARMA
July 18, 2016
FusionOps
DS Developer Test
FusionOps|7/18/2016
Solution -1
Flow Chart
Sales Data Analysis
Summary:-
       Sales Date   Sales Quantity  Unit Cost    Unit Price   Marketing Spent  Discount Flag
Count  4000         4000            4000         4000         4000             4000
Mean   20132640.4   5047.348838     4.12248      12.1791675   193.22525        0.3215
Std    9731.630793  26503.76132     8.65621441   25.80803025  318.6396473      0.467110584
Min    20120131     9               0.07         0.32         3                0
25%    20121105.25  408.75          0.91         2.9          60               0
50%    20130880.5   963             1.46         4.32         108              0
75%    20140655.25  2117            2.2          6.51         203.25           1
Max    20150430     390630          52.38        166.43       3966             1
TABLE 1:- SUMMARY OF THE SALES LEVEL DATA
Table 1 summarises the central tendency and variation of the sales-level data.
The mean Sales Quantity is 5047.35, while the median (the 50% row) is 963. The mean is therefore well above
the median: a few very large orders (maximum 390630) pull the mean upward.
Likewise, the mean Marketing Spent is 193.23 against a median of 108, so the mean is again higher than the
median.
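The relationship between mean and median in Table 1 can be illustrated on a small made-up sample. The quantities below are hypothetical and are not taken from the data set; the sketch uses only Python's standard `statistics` module.

```python
import statistics

# Hypothetical order quantities: mostly small orders plus one very large one.
quantities = [9, 400, 950, 1000, 2100, 390630]

mean_qty = statistics.mean(quantities)
median_qty = statistics.median(quantities)

# A mean far above the median is a sign of a right-skewed distribution.
right_skewed = mean_qty > median_qty
print(f"mean={mean_qty:.1f} median={median_qty:.1f} right_skewed={right_skewed}")
```

A single extreme value is enough to move the mean a long way while leaving the median almost unchanged, which is why both are worth reporting.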
Uni-Variate and Bi-Variate Analysis:-
FIGURE 1:-PAIR WISE PLOT OF SALES LEVEL ANALYSIS
FIGURE 2:- BARPLOT OF ABC CLASS
Figure 2 gives the total count of sales records in each category of the ABC Class, shown below together with
the number of records that have the Discount Flag set:-
ABC Class  Total Counts  Discount Flags
A          1080          351
B          1480          453
C          1440          482
TABLE 2 CATEGORIES OF CLASS ABC
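As a quick check on Table 2, the share of flagged records per class can be worked out directly from the counts. This is a minimal sketch; it assumes the "Discount Flags" column counts the records whose flag equals 1.

```python
# Counts copied from Table 2: (total records, records with the discount flag set).
abc_counts = {"A": (1080, 351), "B": (1480, 453), "C": (1440, 482)}

# Fraction of records in each class that carry the flag.
discount_share = {cls: flagged / total for cls, (total, flagged) in abc_counts.items()}

for cls, share in discount_share.items():
    print(f"Class {cls}: {share:.1%} of records flagged")

# Class with the largest flagged share.
highest = max(discount_share, key=discount_share.get)
```

These shares agree with the Discount Flag means reported in the per-class summary tables (0.325 for Class A, for instance).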
Summary of the Categories of Class ABC (Product Segmentation):-
       Sales Date   Sales Quantity  Unit Cost    Unit Price   Marketing Spent  Discount Flag
Count  1080         1080            1080         1080         1080             1080
Mean   20132640.4   16049.80648     4.633481481  17.26198148  453.6935185      0.325
Std    9734.921816  49349.07531     8.36362872   35.78886613  520.6330711      0.468591841
Min    20120131     47              0.07         0.32         54               0
25%    20121105.25  1266.75         0.45         1.55         190.5            0
50%    20130880.5   4185.5          1.03         3.12         297              0
75%    20140655.25  10848           2.9          6.725        479.25           1
Max    20150430     390630          32.39        166.43       3966             1
TABLE 3:- SUMMARY OF CATEGORY A
       Sales Date   Sales Quantity  Unit Cost    Unit Price   Marketing Spent  Discount Flag
count  1480         1480            1480         1480         1480             1480
mean   20132640.4   1460.31         4.752783784  12.00458108  132.7682432      0.306081081
std    9733.702872  992.2210921     9.222109011  22.10871481  71.55556198      0.461019588
min    20120131     33.9            0.45         1.78         27               0
25%    20121105.25  783.5           0.8875       2.77         75               0
50%    20130880.5   1394            1.2          3.29         117              0
75%    20140655.25  1937.5          2.32         7.65         173.25           1
max    20150430     6951            39.21        110.8        531              1
TABLE 4:- SUMMARY OF CLASS B
       Sales Date   Sales Quantity  Unit Cost    Unit Price   Marketing Spent  Discount Flag
count  1440         1440            1440         1440         1440             1440
mean   20132640.4   482.1844097     3.091416667  8.546493056  60.01041667      0.334722222
std    9733.794272  294.1944846     8.167083891  18.85617808  34.8026187       0.472057205
min    20120131     9               0.66         3.02         3                0
25%    20121105.25  248             1.29         4.12         33               0
50%    20130880.5   464.5           1.74         5.01         53               0
75%    20140655.25  646.25          2.02         6.4725       78               1
max    20150430     1895            52.38        134.41       216              1
TABLE 5:- SUMMARY OF CLASS C
From the above summaries, we conclude that:-
• Sales Quantity is highest in Class A (mean 16049.81, against 1460.31 for Class B and 482.18 for Class C).
• Marketing Spent is highest in Class A (mean 453.69, against 132.77 for Class B and 60.01 for Class C).
• Class A has the highest mean Unit Price (17.26) and the highest maximum (166.43), although its 75th
percentile (6.725) is below that of Class B (7.65). In other words, the most expensive sales sit in the top
quarter of Class A.
• The Standard Deviation of Marketing Spent in Class A is 520.63, far larger than the respective values for
Class B (71.56) and Class C (34.80).
• The mean Discount Flag is highest in Class C (0.335), compared with Class A (0.325) and Class B (0.306).
• Products in Class A are in much higher demand than those in Class B and Class C, as the total sales
quantities below show.
Class  Total Sales
A      17333791.0
B      2161258.8
C      694345.55
TABLE 6 TOTAL SALES
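The dominance of Class A in Table 6 is easier to see as a share of the grand total. The sketch below only re-uses the three totals from the table; the variable names are illustrative.

```python
# Total sales quantity per class, copied from Table 6.
total_sales = {"A": 17333791.0, "B": 2161258.8, "C": 694345.55}

grand_total = sum(total_sales.values())

# Each class's fraction of the overall sales quantity.
share = {cls: qty / grand_total for cls, qty in total_sales.items()}

for cls, frac in share.items():
    print(f"Class {cls}: {frac:.1%} of total sales quantity")
```

The grand total also matches the mean Sales Quantity in Table 1 multiplied by the 4000 records, which is a useful consistency check.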
BARPLOT OF DISCOUNT FLAGS
FIGURE 3:- BARPLOT OF DISCOUNT FLAG
The bar plot of the Discount Flag shows that more sales records fall under Discount than under Promotion,
as the table below shows:-
Discount Flag  Total Counts
Promotion      1286
Discount       2714
TABLE 7 TOTAL COUNTS OF DISCOUNT FLAGS
Of the 4000 sales records, 67.85 percent are under Discount and the remaining 32.15 percent are under
Promotion.
Correlation Of Sales Level Data:-
The table below shows a notable positive correlation between Marketing Spent and Sales Quantity
(0.622740066): records with higher Marketing Spent tend to have higher Sales Quantity. Correlation alone
does not show that raising Marketing Spent causes the increase.
The correlation between Unit Price and Unit Cost is very strong (0.940819548): products that cost more
to supply are priced higher.
                 Sales Date    Sales Quantity  Unit Cost     Unit Price    Marketing Spent  Discount Flag
Sales Date       1             0.018289693     0.000642644   0.000710024   0.022292488      -0.006249567
Sales Quantity   0.018289693   1               -0.079925357  -0.077920384  0.622740066      0.000442116
Unit Cost        0.000642644   -0.079925357    1             0.940819548   0.106927754      0.00759269
Unit Price       0.000710024   -0.077920384    0.940819548   1             0.237334958      -0.035743076
Marketing Spent  0.022292488   0.622740066     0.106927754   0.237334958   1                -0.150267176
Discount Flag    -0.006249567  0.000442116     0.00759269    -0.035743076  -0.150267176     1
TABLE 8 CORRELATION OF SALES LEVEL DATA
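A correlation matrix like Table 8 can be produced with pandas' `DataFrame.corr()`, which computes Pearson correlations by default. The rows below are hypothetical stand-ins for the sales data, not values from the data set.

```python
import pandas as pd

# Hypothetical sales rows with the same column names as Table 8.
sales = pd.DataFrame({
    "Sales Quantity": [100, 250, 400, 900, 1500],
    "Unit Cost": [2.0, 1.5, 1.2, 0.9, 0.5],
    "Unit Price": [6.1, 4.4, 3.9, 2.8, 1.6],
    "Marketing Spent": [20, 45, 60, 150, 240],
})

# Pairwise Pearson correlation between all numeric columns.
corr = sales.corr()
print(corr.round(3))
```

The result is symmetric with ones on the diagonal, so only the upper or lower triangle needs to be read.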
Product Level Analysis
Summary Of Product Level Data:-
       Unit Cost    Sales Quantity  Unit Price   Discount Flag  Marketing Spent
count  100          100             100          100            100
mean   4.12248      201893.9535     12.1791675   12.86          7729.01
std    8.689409387  1036170.382     25.76945362  1.675853469    11271.81413
min    0.0775       1116            0.3785       10             931
25%    0.9100625    17890           2.892375     11             2897.25
50%    1.43425      38601.5         4.379625     13             4688
75%    2.157125     85749.45        6.6428125    14             8529.75
max    51.201       10290924        147.223      17             78490
TABLE 9 SUMMARY OF THE PRODUCT LEVEL DATA
The above table summarises the data at product level:-
The mean Sales Quantity is 201893.9535, while the median (the 50% row) is 38601.5.
The mean is therefore far above the median: a few very large products (maximum 10290924) pull the mean upward.
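The product-level figures are consistent with the sales-level rows being aggregated per product: the product-level mean Sales Quantity times 100 products equals the sales-level mean times 4000 records, and the Unit Cost means are identical. The source does not state the aggregation used, so the sketch below is an assumption: quantities, discount flags and marketing spend are summed, while unit cost and unit price are averaged. The `Product` column name and the rows are hypothetical.

```python
import pandas as pd

# Hypothetical sales-level rows for two products.
sales = pd.DataFrame({
    "Product": ["P1", "P1", "P2", "P2", "P2"],
    "Sales Quantity": [10, 30, 5, 5, 20],
    "Unit Cost": [1.0, 1.2, 4.0, 4.0, 4.6],
    "Unit Price": [3.0, 3.4, 9.0, 9.0, 9.6],
    "Discount Flag": [0, 1, 1, 0, 1],
    "Marketing Spent": [50, 70, 10, 15, 25],
})

# Assumed aggregation: sum the volumes and counts, average the per-unit values.
product_level = sales.groupby("Product").agg({
    "Unit Cost": "mean",
    "Sales Quantity": "sum",
    "Unit Price": "mean",
    "Discount Flag": "sum",
    "Marketing Spent": "sum",
})
print(product_level)
```

Calling `product_level.describe()` on such a frame would give a summary in the same shape as Table 9.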
Barplot of Class ABC (Product Level)
FIGURE 1 BARPLOT OF CLASS ABC(PRODUCT LEVEL DATA)
From the above figure, the total count of products in each category of the ABC Class is shown below:-
ABC Class  Total Counts
A          27
B          37
C          36
TABLE 10 TOTAL COUNTS OF THE CLASS ABC
Summary of the Class ABC (Product Level Data):-
       Unit Cost    Sales Quantity  Unit Price   Discount Flag  Marketing Spent
count  27           27              27           27             27
mean   4.633481481  641992.2593     17.26198148  13             18147.74074
std    8.509789031  1951783.957     36.22357567  1.61721508     17879.64391
min    0.0775       3933            0.3785       11             7353
25%    0.454375     56306.5         1.671125     11.5           9903.5
50%    1.005        177058          2.99625      13             12979
75%    2.666375     432854          6.612875     14.5           15234
max    31.823       10290924        147.223      16             78490
TABLE 11 SUMMARY OF CLASS A (PRODUCT LEVEL)
       Unit Cost    Sales Quantity  Unit Price   Discount Flag  Marketing Spent
count  37           37              37           37             37
mean   4.752783784  58412.4         12.00458108  12.24324324    5310.72973
std    9.335749421  37122.25589     22.24332308  1.516674092    1464.675687
min    0.485        1754.8          2.1505       10             3181
25%    0.88975      35531           2.81275      11             4255
50%    1.179        61624           3.155        12             5090
75%    2.27975      72011           7.22575      13             5879
max    38.32575     152059          95.622       15             8979
TABLE 12 SUMMARY OF CLASS B (PRODUCT LEVEL)
       Unit Cost    Sales Quantity  Unit Price   Discount Flag  Marketing Spent
count  36           36              36           36             36
mean   3.091416667  19287.37639     8.546493056  13.38888889    2400.416667
std    8.271471763  10679.29319     19.00472153  1.711770642    755.2505876
min    0.716        1116            3.60075      11             931
25%    1.369125     10757.75        4.29525      12             1789.25
50%    1.7075       20211.5         5.016625     13             2385.5
75%    2.0568125    25934           6.2524375    14.25          2921.75
max    51.201       38983           119.047      17             3811
TABLE 13 SUMMARY OF CLASS C (PRODUCT LEVEL)
From the above summaries of the PRODUCT LEVEL DATA, we conclude that:-
• Sales Quantity is highest in Class A (mean 641992.26, against 58412.40 for Class B and 19287.38 for Class C).
• Marketing Spent is highest in Class A (mean 18147.74, against 5310.73 for Class B and 2400.42 for Class C).
• Class A has the highest mean Unit Price (17.26) and the highest maximum (147.223), although its 75th
percentile (6.61) is below that of Class B (7.23). In other words, the most expensive products sit in the top
quarter of Class A.
• The Standard Deviation of Marketing Spent in Class A is 17879.64, far larger than the respective values for
Class B (1464.68) and Class C (755.25).
• The mean Discount Flag count per product is highest in Class C (13.39), compared with Class A (13.00) and
Class B (12.24).
• Products in Class A are in much higher demand than those in Class B and Class C.
Correlation Of Product Level Data
The table below shows a notable positive correlation between Marketing Spent and Sales Quantity (0.694245221)
and a very strong correlation between Unit Price and Unit Cost (0.947731937).
                 Unit Cost     Sales Quantity  Unit Price    Discount Flag  Marketing Spent
Unit Cost        1             -0.082260483    0.947731937   0.092942086    0.12113583
Sales Quantity   -0.082260483  1               -0.080638021  0.017179196    0.694245221
Unit Price       0.947731937   -0.080638021    1             0.130603683    0.258419279
Discount Flag    0.092942086   0.017179196     0.130603683   1              0.053700919
Marketing Spent  0.12113583    0.694245221     0.258419279   0.053700919    1
TABLE 14 CORRELATION OF PRODUCT LEVEL DATA
Solution-2
Time Series Analysis
Product-1
1. Data
2. Models
3. Comparison
Product-2
1. Data
2. Models
3. Comparison
Product-3
1. Data
2. Models
3. Comparison
Product-4
1. Data
2. Models
3. Comparison
Product-5
1. Data
2. Models
3. Comparison
Product-6
1. Data
2. Models
3. Comparison
Product-7
1. Data
2. Models
3. Comparison
Product-8
1. Data
2. Models
3. Comparison
Product-9
1. Data
2. Models
3. Comparison
Product-10
1. Data
2. Models
3. Comparison
Results And Conclusions:-
Across the products, ARIMA stands out as the best of the models we tried. The Time Series Analysis was done
in the R statistical programming language, where the auto.arima() function of the forecast package
automatically selects ‘p’ (the AR order), ‘d’ (the degree of differencing needed to make the series
stationary) and ‘q’ (the MA order) by minimising an information criterion such as the AIC (Akaike
Information Criterion).
Please find the attached Time Series Forecasting results for January, February and March for all the
products.
In the AIC comparison bar plots, ARIMA has the lowest AIC for most products; in the remaining products
Exponential Smoothing is only marginally better than ARIMA.
Product 2, Product 4, Product 5, Product 6 and Product 7 show an increasing trend in Demand, whereas most
of the others show Demand fluctuating around a constant rolling mean.
Interestingly, Product 3 shows a decreasing trend in Demand over time.
Product 4, Product 5, Product 6, Product 7, Product 8 and Product 9 also show seasonal behaviour.
Product  January  February  March
Prod_1   540      476       466
Prod_2   2935     2941      2951
Prod_3   3279     3225      3229
Prod_4   8033     8108      7849
Prod_5   12450    12174     12255
Prod_6   1578     1609      1679
Prod_7   21174    20865     20204
Prod_8   233      284       257
Prod_9   82       125       60
Prod_10  1        0         2
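The idea of choosing a model order by minimum AIC can be sketched in Python with NumPy. This is not `auto.arima()`: it fits pure autoregressive models by least squares, with no differencing or moving-average terms, and uses the Gaussian AIC up to an additive constant. The function name `ar_aic`, the simulated series and the candidate orders are all assumptions made for illustration.

```python
import numpy as np

def ar_aic(y, p, max_p):
    """Fit AR(p) with an intercept by least squares; return a Gaussian AIC.

    All candidate orders are fitted on the same observations (from index
    max_p onward) so that their AIC values are comparable.
    """
    y = np.asarray(y, dtype=float)
    target = y[max_p:]
    n = len(target)
    # Column k holds the series lagged by k steps, aligned with `target`.
    lags = [y[max_p - k:len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = float(np.sum((target - X @ coef) ** 2))
    k_params = p + 2  # intercept, p lag coefficients, noise variance
    return n * np.log(rss / n) + 2 * k_params

# Simulated AR(1) series standing in for one product's demand.
rng = np.random.default_rng(0)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.8 * y[t - 1] + rng.normal()

aics = {p: ar_aic(y, p, max_p=4) for p in range(0, 5)}
best_p = min(aics, key=aics.get)
print(aics, "selected order:", best_p)
```

The penalty term `2 * k_params` is what stops the criterion from always preferring the largest order, since the residual sum of squares can only fall as lags are added.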
Code :
library(forecast)  # ets(), auto.arima(), tbats(), forecast()
library(zoo)       # as.yearmon()
# Import the data for DataSet-2
demandingdata <- read.csv('/home/prerna/Documents/DSDeveloperTest/DataSet-2.csv')
demandingdata <- transform(demandingdata, ym = as.yearmon(as.character(demandingdata$Date), "%Y%m"))
pr <- c()
for (i in unique(demandingdata$Product)) {
  # Subset the rows for the current product
  temp <- demandingdata[demandingdata$Product == i, ]
  # Monthly Demand series starting in January 2013
  timedata <- ts(temp$Demand, start = 2013, frequency = 12)
  plot(timedata)
  # Exponential Smoothing (ETS) model and forecast
  m_ets <- ets(timedata)
  f_ets <- forecast(m_ets, h = 3)  # forecast 3 months into the future
  plot(f_ets)
  # ARIMA model chosen by auto.arima() and its forecast
  m_aa <- auto.arima(timedata)
  f_aa <- forecast(m_aa, h = 3)
  plot(f_aa)
  # TBATS model and forecast
  m_tbats <- tbats(timedata)
  f_tbats <- forecast(m_tbats, h = 3)
  plot(f_tbats)
  # Bar plot comparing the AIC of ETS, ARIMA and TBATS
  barplot(c(ETS = m_ets$aic, ARIMA = m_aa$aic, TBATS = m_tbats$AIC), col = "light blue", ylab = "AIC")
  # Keep the ARIMA point forecasts as the predicted result for this product
  x <- as.data.frame(f_aa)
  pr <- rbind(pr, x$`Point Forecast`)
}
# Export the predictions (one row per product, one column per month) to CSV
write.csv(pr, file = 'predictedresult.csv')
Solution-3
Task-1
1) Creating the Database
a) $ mysql -u root -p ******
b) mysql> create database FusionOps;
2) Pushing CSVs to MySQL Using Python:-
#!/usr/bin/env python3
import pandas as pd
from sqlalchemy import create_engine

# Open database connection. Current pandas writes through an SQLAlchemy engine;
# the old flavor='mysql' argument to to_sql() has been removed.
engine = create_engine("mysql+mysqldb://root:******@localhost/FusionOps")

df = pd.read_csv("/home/prerna/Desktop/SalesData.csv")
df['Sales Order Date'] = pd.to_datetime(df['Sales Order Date'])
# Task-2 queries column names with underscores, so replace the spaces before writing
df.columns = df.columns.str.replace(' ', '_')
df.to_sql(name='SalesData', con=engine, if_exists='replace')
# Print the number of rows loaded
print(pd.read_sql("select count(*) from SalesData", con=engine))

df = pd.read_csv("/home/prerna/Desktop/PurchasingData.csv")
df['Replenishment Date'] = pd.to_datetime(df['Replenishment Date'])
df.columns = df.columns.str.replace(' ', '_')
df.to_sql(name='PurchasingData', con=engine, if_exists='replace')
print(pd.read_sql("select count(*) from PurchasingData", con=engine))
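The same load-and-check pattern can be tried without a MySQL server. The sketch below swaps in Python's built-in `sqlite3` module, which pandas' `to_sql` accepts directly; this is a substitution for illustration, not the setup used in the solution. The rows are hypothetical stand-ins for SalesData.csv, and the column names are rewritten with underscores because Task-2 queries names such as `Sales_Order_Date`.

```python
import sqlite3

import pandas as pd

# Hypothetical rows standing in for SalesData.csv.
df = pd.DataFrame({
    "Part Number": [101, 101, 202],
    "ShopNo": [1, 1, 2],
    "Sales Order Date": ["2015-01-02", "2015-01-05", "2015-01-03"],
    "Sales Quantity": [4, 6, 1],
})

# Parse the dates to validate them, then store ISO strings
# (SQLite has no native date column type).
df["Sales Order Date"] = pd.to_datetime(df["Sales Order Date"]).dt.strftime("%Y-%m-%d")

# Replace spaces so the columns can be queried without quoting.
df.columns = df.columns.str.replace(" ", "_")

conn = sqlite3.connect(":memory:")
df.to_sql("SalesData", conn, if_exists="replace", index=False)

# Confirm what was written: the row count and the column names.
row_count = conn.execute("select count(*) from SalesData").fetchone()[0]
columns = [info[1] for info in conn.execute("pragma table_info(SalesData)")]
print(row_count, columns)
```

Checking the row count and column names straight after the load catches silent problems such as a truncated file or an unexpected header.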
Task-2
1) Creating the DailySalesAndStockData table:-
#!/usr/bin/env python3
import pandas as pd
from sqlalchemy import create_engine, text

# Open database connection
engine = create_engine("mysql+mysqldb://root:******@localhost/FusionOps")
out_cols = ['Date', 'PartNo', 'ShopNo', 'Sales_Quantity', 'Sales_Quantity_Cum', 'End-of-day Stock']
frames = []
# Every (part, shop) combination that has sales
pairs = pd.read_sql('select distinct Part_Number, ShopNo from SalesData', con=engine)
# Bound parameters replace the string-concatenated queries
sale_sql = text('select * from SalesData where Part_Number = :part and ShopNo = :shop')
purchase_sql = text('select * from PurchasingData where Part_Number = :part and ShopNo = :shop')
for _, row in pairs.iterrows():
    key = {'part': str(row['Part_Number']), 'shop': str(row['ShopNo'])}
    df_sale = pd.read_sql(sale_sql, con=engine, params=key)
    df_purchase = pd.read_sql(purchase_sql, con=engine, params=key)
    # Daily calendar from 15 Dec 2014 so that earlier movements feed the cumulative sums
    dates = pd.date_range('2014-12-15', '2015-03-31', freq='D')
    df = pd.DataFrame({'Date': dates, 'PartNo': key['part'], 'ShopNo': key['shop']})
    # Attach the sales and the replenishments for each calendar day
    result = pd.merge(df, df_sale[['Sales_Order_Date', 'Sales_Quantity']], how='left',
                      left_on='Date', right_on='Sales_Order_Date')
    result = pd.merge(result, df_purchase[['Replenishment_Date', 'Quantity_Produced/Bought']], how='left',
                      left_on='Date', right_on='Replenishment_Date')
    # Days with no movement count as zero
    result = result[['Date', 'PartNo', 'ShopNo', 'Sales_Quantity', 'Quantity_Produced/Bought']].fillna(0)
    result['Sales_Quantity_Cum'] = result['Sales_Quantity'].cumsum()
    result['Quantity_Produced/Bought_Cum'] = result['Quantity_Produced/Bought'].cumsum()
    # End-of-day stock = cumulative replenishment minus cumulative sales
    result['End-of-day Stock'] = result['Quantity_Produced/Bought_Cum'] - result['Sales_Quantity_Cum']
    # Report only the days after 31 Dec 2014
    frames.append(result.loc[result['Date'] > pd.Timestamp('2014-12-31'), out_cols])
data_you_need = pd.concat(frames, ignore_index=True)
data_you_need.to_csv('out.csv', date_format='%d %b %Y')
data_you_need.to_sql(name='DailySalesAndStockData', con=engine, if_exists='replace')
2) Please find the attached Solution4.csv
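The core of the Task-2 calculation, end-of-day stock as cumulative replenishment minus cumulative sales, can be checked on a few days of made-up data. The quantities below are hypothetical, and the sketch assumes stock starts at zero before the first day.

```python
import pandas as pd

days = pd.date_range("2015-01-01", "2015-01-05", freq="D")

# Hypothetical daily movements for one part in one shop.
sales = pd.Series([2, 0, 3, 0, 1], index=days)       # units sold per day
purchases = pd.Series([10, 0, 0, 5, 0], index=days)  # units replenished per day

stock = pd.DataFrame({"Sales_Quantity": sales})
stock["Sales_Quantity_Cum"] = sales.cumsum()

# Stock at the end of each day: everything received so far minus everything sold so far.
stock["End-of-day Stock"] = purchases.cumsum() - stock["Sales_Quantity_Cum"]
print(stock)
```

A negative value in the last column would mean more units were sold than had been received, which is a quick way to spot missing replenishment records.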

Recomendados

Busenviron por
BusenvironBusenviron
Busenvironninjaiuj
290 vistas28 diapositivas
Edita Financial Analysis por
Edita Financial Analysis Edita Financial Analysis
Edita Financial Analysis Abdallah Elngar
412 vistas29 diapositivas
Greater Baton Rouge Foreclosure Home Sales Improvement Report Q3 2011 vs Q3 2014 por
Greater Baton Rouge Foreclosure Home Sales Improvement Report Q3 2011 vs Q3 2014Greater Baton Rouge Foreclosure Home Sales Improvement Report Q3 2011 vs Q3 2014
Greater Baton Rouge Foreclosure Home Sales Improvement Report Q3 2011 vs Q3 2014Bill Cobb, Appraiser
809 vistas12 diapositivas
CVP Relationships (Chapter 5 Connect Homework) por
CVP Relationships (Chapter 5 Connect Homework)CVP Relationships (Chapter 5 Connect Homework)
CVP Relationships (Chapter 5 Connect Homework)Emily Bauer
7K vistas27 diapositivas
PrernaSharma_DataScientist por
PrernaSharma_DataScientistPrernaSharma_DataScientist
PrernaSharma_DataScientistPrerna Sharma
146 vistas1 diapositiva
SAS CV of Narayana por
SAS CV of NarayanaSAS CV of Narayana
SAS CV of NarayanaNarayana P
138 vistas2 diapositivas

Más contenido relacionado

Similar a FusionOps

Marcos' Account Audit via Google Data Studio por
Marcos' Account Audit via Google Data StudioMarcos' Account Audit via Google Data Studio
Marcos' Account Audit via Google Data StudioMarcosJrFarnacio
340 vistas9 diapositivas
Marketing Plan Model por
Marketing Plan ModelMarketing Plan Model
Marketing Plan ModelJuan Antonio Benito
3K vistas23 diapositivas
Krokosz lecture3 cost-volume-profit analysis por
Krokosz lecture3 cost-volume-profit analysisKrokosz lecture3 cost-volume-profit analysis
Krokosz lecture3 cost-volume-profit analysisZofia Krokosz-Krynke
33 vistas11 diapositivas
Shield Hand and Body Sanitizing Lotion por
Shield Hand and Body Sanitizing LotionShield Hand and Body Sanitizing Lotion
Shield Hand and Body Sanitizing Lotionaccld2015
1.4K vistas22 diapositivas
Chapter_09.ppt por
Chapter_09.pptChapter_09.ppt
Chapter_09.pptssuserc9c6261
3 vistas23 diapositivas
Thread DB 2-18-SP-Mi e Chegg Study Guided SX Connect × Secure.pdf por
Thread DB 2-18-SP-Mi  e Chegg Study  Guided SX Connect ×  Secure.pdfThread DB 2-18-SP-Mi  e Chegg Study  Guided SX Connect ×  Secure.pdf
Thread DB 2-18-SP-Mi e Chegg Study Guided SX Connect × Secure.pdfdhavalbl38
2 vistas1 diapositiva

Similar a FusionOps(20)

Marcos' Account Audit via Google Data Studio por MarcosJrFarnacio
Marcos' Account Audit via Google Data StudioMarcos' Account Audit via Google Data Studio
Marcos' Account Audit via Google Data Studio
MarcosJrFarnacio340 vistas
Shield Hand and Body Sanitizing Lotion por accld2015
Shield Hand and Body Sanitizing LotionShield Hand and Body Sanitizing Lotion
Shield Hand and Body Sanitizing Lotion
accld20151.4K vistas
Thread DB 2-18-SP-Mi e Chegg Study Guided SX Connect × Secure.pdf por dhavalbl38
Thread DB 2-18-SP-Mi  e Chegg Study  Guided SX Connect ×  Secure.pdfThread DB 2-18-SP-Mi  e Chegg Study  Guided SX Connect ×  Secure.pdf
Thread DB 2-18-SP-Mi e Chegg Study Guided SX Connect × Secure.pdf
dhavalbl382 vistas
Operations performance analysis that facilitates an informed decision making ... por Bojan Mitrovic, M.A.
Operations performance analysis that facilitates an informed decision making ...Operations performance analysis that facilitates an informed decision making ...
Operations performance analysis that facilitates an informed decision making ...
Transforming big data into supply chain analytics por Tristan Wiggill
Transforming big data into supply chain analyticsTransforming big data into supply chain analytics
Transforming big data into supply chain analytics
Tristan Wiggill3.7K vistas
Bach-Business Simulation por Diana Franco
Bach-Business SimulationBach-Business Simulation
Bach-Business Simulation
Diana Franco485 vistas
Comeos Blend360 Retail Value from Data Content v05.pdf por LucienvanderHoeven2
Comeos Blend360 Retail Value from Data Content v05.pdfComeos Blend360 Retail Value from Data Content v05.pdf
Comeos Blend360 Retail Value from Data Content v05.pdf
Slide Makeover #77: When you are forced to show a large spreadsheet por Dave Paradi
Slide Makeover #77: When you are forced to show a large spreadsheetSlide Makeover #77: When you are forced to show a large spreadsheet
Slide Makeover #77: When you are forced to show a large spreadsheet
Dave Paradi43.6K vistas
Vera Bradley Final Presentation por Bashayer Baljon
Vera Bradley Final PresentationVera Bradley Final Presentation
Vera Bradley Final Presentation
Bashayer Baljon699 vistas
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016 por OllieShoresna
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016  DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016
OllieShoresna3 vistas
Value Line Investment Research por Carson Fears
Value Line Investment ResearchValue Line Investment Research
Value Line Investment Research
Carson Fears695 vistas
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016 .docx por simonithomas47935
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016  .docxDataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016  .docx
DataYTD (Jan-June 2016 vs. Jan-June 2015)QTD (Apr, May, Jun 2016 .docx
Data Analysis Project 3Presented By Yiwen LiNational Ec.docx por Jack632244
Data Analysis Project 3Presented By Yiwen LiNational Ec.docxData Analysis Project 3Presented By Yiwen LiNational Ec.docx
Data Analysis Project 3Presented By Yiwen LiNational Ec.docx
Jack6322441 vista

FusionOps

  • 1. PRERNA SHARMA July 18, 2016 FusionOps DS Developer Test
  • 2. 1 FusionOps|7/18/2016 FusionOps DS Developer Test Solution -1 Flow Chart Sales Data Analysis Summary:- Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag Count 4000 4000 4000 4000 4000 4000 Mean 20132640.4 5047.348838 4.12248 12.1791675 193.22525 0.3215 Std 9731.630793 26503.76132 8.65621441 25.80803025 318.6396473 0.467110584 Min 20120131 9 0.07 0.32 3 0 25% 20121105.25 408.75 0.91 2.9 60 0 50% 20130880.5 963 1.46 4.32 108 0 75% 20140655.25 2117 2.2 6.51 203.25 1 Max 20150430 390630 52.38 166.43 3966 1 TABLE 1:- SUMMARY OF THE SALES LEVEL DATA
  • 3. 2 FusionOps|7/18/2016 In the above table, we found the proper details (i.e Central Tendency and Variations) of the given data. As the better result Mean of the Sales Quantity is 5047.34883 and the Median is 6899.864279. It means that the Median of the Sales Quantity is Higher than the Mean of the Sales Quantity. Now, Mean of the Marketing Spent is 193.22525 and Median of the Marketing Spent is 155.981928. It means that Mean of the Marketing Spent is higher than the Median of the Marketing Spent. Uni-Variate and Bi-Variate Analysis:- FIGURE 1:-PAIR WISE PLOT OF SALES LEVEL ANALYSIS
  • 4. 3 FusionOps|7/18/2016 FIGURE 2:- BARPLOT OF ABC CLASS From the above Fiqure:-2 , we found the total count values of each category of the ABC Class as shown below:- ABC Class Total Counts Discount Flags A 1080 351 B 1480 453 C 1440 482 TABLE 2 CATEGORIES OF CLASS ABC
  • 5. 4 FusionOps|7/18/2016 Summary Of the Categories of the Class ABC(Product Segmentation):- TABLE 3:-SUMMARY OF CATEGORY A TABLE 4:- SUMMARY OF CLASS B Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag Count 1080 1080 1080 1080 1080 1080 Mean 20132640.4 16049.80648 4.633481481 17.26198148 453.6935185 0.325 Std 9734.921816 49349.07531 8.36362872 35.78886613 520.6330711 0.468591841 Min 20120131 47 0.07 0.32 54 0 25% 20121105.25 1266.75 0.45 1.55 190.5 0 50% 20130880.5 4185.5 1.03 3.12 297 0 75% 20140655.25 10848 2.9 6.725 479.25 1 Max 20150430 390630 32.39 166.43 3966 1 Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag count 1480 1480 1480 1480 1480 1480 mean 20132640.4 1460.31 4.752783784 12.00458108 132.7682432 0.306081081 std 9733.702872 992.2210921 9.222109011 22.10871481 71.55556198 0.461019588 min 20120131 33.9 0.45 1.78 27 0 25% 20121105.25 783.5 0.8875 2.77 75 0 50% 20130880.5 1394 1.2 3.29 117 0 75% 20140655.25 1937.5 2.32 7.65 173.25 1 max 20150430 6951 39.21 110.8 531 1
  • 6. 5 FusionOps|7/18/2016 Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag count 1440 1440 1440 1440 1440 1440 mean 20132640.4 482.1844097 3.091416667 8.546493056 60.01041667 0.334722222 std 9733.794272 294.1944846 8.167083891 18.85617808 34.8026187 0.472057205 min 20120131 9 0.66 3.02 3 0 25% 20121105.25 248 1.29 4.12 33 0 50% 20130880.5 464.5 1.74 5.01 53 0 75% 20140655.25 646.25 2.02 6.4725 78 1 max 20150430 1895 52.38 134.41 216 1 TABLE 5:- SUMMARY OF CLASS C From the Above Summary, we came to know that :-  Sales Quantity of Class A is more as compared to Class B and Class C.  Marketing Spent of Class A is more as compared to Class B and Class C.  Unit Price of Class A of last 25 percentiles of Sales are more as compared to Class B and Class C. In other words, we can say that A comprises of more numbers of expensive Sales History.  It is very exciting to note that the Standard Deviation of Marketing Spent of the sales of the Class A is 520 which is a way more expensive than respective values of Class B and Class C.  Discount Flag of C is more as compared to Class A and Class B.  Sales Quantity of Class A :- Product in the category Class A are in huge demand as compared to Class B and Class C. Class Total Sales A 17333791.0 B 2161258.8 C 694345.55 TABLE 6 TOTAL SALES
  • 7. 6 FusionOps|7/18/2016 BARPLOT OF DISCOUNT FLAGS FIGURE 3:- BARPLOT OF DISCOUNT FLAG From the above Barplot of Discount Flag, we get that there was more sale of the product under Discount as compared to Promotion as you see in the below table:- Discount Flag Total Counts Promotion 1286 Discount 2714 TABLE 7 TOTAL COUNTS OF DISCOUNT FLAGS As in the above result we see that 67.85 percent of Products are sold under the Discount and the remaining 32.15 percent of the Products are sold under Promotion.
  • 8. 7 FusionOps|7/18/2016 Correlation Of Sales Level Data:- From the above table we see significant Correlation between Marketing Spent and Sales Quantity that is 0.622740066 which means if we increase the Marketing Spent , the Sales Quantity also increase. Correlation between Unit Price and Unit cost is also good that is 0.940819548 which means if we increase the Unit Price , the Unit Cost also increase. Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag Sales Date 1 0.018289693 0.000642644 0.000710024 0.022292488 - 0.006249567 Sales Quantity 0.018289693 1 - 0.079925357 - 0.077920384 0.622740066 0.000442116 Unit Cost 0.000642644 -0.079925357 1 0.940819548 0.106927754 0.00759269 Unit Price 0.000710024 -0.077920384 0.940819548 1 0.237334958 - 0.035743076 Marketing Spent 0.022292488 0.622740066 0.106927754 0.237334958 1 - 0.150267176 Discount Flag - 0.006249567 0.000442116 0.00759269 - 0.035743076 -0.150267176 1 TABLE 8 CORRELATION OF SALES LEVEL DATA
  • 10. 9 FusionOps|7/18/2016 Summary Of Product Level Data:- Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent count 100 100 100 100 100 mean 4.12248 201893.9535 12.1791675 12.86 7729.01 std 8.689409387 1036170.382 25.76945362 1.675853469 11271.81413 min 0.0775 1116 0.3785 10 931 25% 0.9100625 17890 2.892375 11 2897.25 50% 1.43425 38601.5 4.379625 13 4688 75% 2.157125 85749.45 6.6428125 14 8529.75 max 51.201 10290924 147.223 17 78490 TABLE 9 SUMMARY OF THE PRODUCT LEVEL DATA From the above table , we see the Product level of the Data:- As the Mean of the Sales Quantity is 201893.9535 and the Median is 275403.451160. It means that the Median of the Sales Quantity is Higher than the Mean of the Sales Quantity.
  • 11. 10 FusionOps|7/18/2016 Barplot of Class ABC (Product Level) FIGURE 1 BARPLOT OF CLASS ABC(PRODUCT LEVEL DATA) From the above Fiqure , we found the total count values of each category of the ABC Class as shown below:- Abc Class Total Counts A 27 B 37 C 36 TABLE 10 TOTAL COUNTS OF THE CLASS ABC
  • 12. 11 FusionOps|7/18/2016 Summary of The Class ABC(Product Level Data):- Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent count 27 27 27 27 27 mean 4.633481481 641992.2593 17.26198148 13 18147.74074 Std 8.509789031 1951783.957 36.22357567 1.61721508 17879.64391 Min 0.0775 3933 0.3785 11 7353 25% 0.454375 56306.5 1.671125 11.5 9903.5 50% 1.005 177058 2.99625 13 12979 75% 2.666375 432854 6.612875 14.5 15234 Max 31.823 10290924 147.223 16 78490 TABLE 11 SUMMARY OF CLASS A (PRODUCT LEVEL) Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent count 37 37 37 37 37 mean 4.752783784 58412.4 12.00458108 12.24324324 5310.72973 Std 9.335749421 37122.25589 22.24332308 1.516674092 1464.675687 Min 0.485 1754.8 2.1505 10 3181 25% 0.88975 35531 2.81275 11 4255 50% 1.179 61624 3.155 12 5090 75% 2.27975 72011 7.22575 13 5879 Max 38.32575 152059 95.622 15 8979 TABLE 12 SUMMARY OF CLASS B (PRODUCT LEVEL)
  • 13. 12 FusionOps|7/18/2016 Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent count 36 36 36 36 36 mean 3.091416667 19287.37639 8.546493056 13.38888889 2400.416667 std 8.271471763 10679.29319 19.00472153 1.711770642 755.2505876 min 0.716 1116 3.60075 11 931 25% 1.369125 10757.75 4.29525 12 1789.25 50% 1.7075 20211.5 5.016625 13 2385.5 75% 2.0568125 25934 6.2524375 14.25 2921.75 max 51.201 38983 119.047 17 3811 TABLE 13 SUMMARY OF CLASS C (PRODUCT LEVEL) From the Above Summary of the PRODUCT LEVEL DATA, we came to know that :-  Sales Quantity of Class A is more as compared to Class B and Class C.  Marketing Spent of Class A is more as compared to Class B and Class C.  Unit Price of Class A of last 25 percentiles of Sales are more as compared to Class B and Class C. In other words, we can say that A comprises of more numbers of expensive Sales History.  It is very exciting to note that the Standard Deviation of Marketing Spent of the sales of the Class A is 520 which is a way more expensive than respective values of Class B and Class C.  Discount Flag of C is more as compared to Class A and Class B.  Sales Quantity of Class A :- Product in the category Class A are in huge demand as compared to Class B and Class C. Correlation Of Product Level Data From the above table we see the significant Correlation between Marketing Spent and Sales Quantity and Correlation between Unit Price and Unit Cost Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent Unit Cost 1 -0.082260483 0.947731937 0.092942086 0.12113583 Sales Quantity -0.082260483 1 -0.080638021 0.017179196 0.694245221 Unit Price 0.947731937 -0.080638021 1 0.130603683 0.258419279 Discount Flag 0.092942086 0.017179196 0.130603683 1 0.053700919 Marketing Spent 0.12113583 0.694245221 0.258419279 0.053700919 1 TABLE 13 CORRELATION OF PRODUCT LEVEL DATA
  • 44. 43 FusionOps|7/18/2016 Results And Conclusions:- From the Data of all the Product ,ARIMA stands out as the best model we have use in R Statistical Programming Language from Time Series Analysis in which an Auto.Arima() function automatically calculate ‘p’ (lag for AR) , ‘q’ (lag for MA) and ‘d’ (Stationary Flag) based on AIC (Akaike’s An Information Criterion) Please find the attached result of Time Series Forecasting for the month of January , February and March for all the Product. It is interesting to see all the comparison histogram of AIC , ARIMA as the minimum AIC except in products where Exponential Smoothing is just better than ARIMA. In Product 2, Product 4, Product 5, Product 6 and Product 7 shows an incremental trend in the Demand whereas the rest has a fluctuating Demand along a constant rolling mean. Interestingly Product 3 shows a decremented trend in the Demand with respect to Time. The Product 4, Product 5, Product 6, Product 7, Product 8,Product 9 also shows the Seasonal Behaviour. Product January February March Prod_1 540 476 466 Prod_2 2935 2941 2951 Prod_3 3279 3225 3229 Prod_4 8033 8108 7849 Prod_5 12450 12174 12255 Prod_6 1578 1609 1679 Prod_7 21174 20865 20204 Prod_8 233 284 257 Prod_9 82 125 60 Prod_10 1 0 2
  • 45. 44 FusionOps|7/18/2016 Code : library(dataiku) library(forecast) library(dplyr) #importing data for the DataSet-2 demandingdata <- read.csv('/home/prerna/Documents/DSDeveloperTest/DataSet-2.csv') demandingdata= transform(demandingdata, ym = as.yearmon(as.character(demandingdata$Date), "%Y%m")) pr=c() for (i in unique(demandingdata$Product)) { #creating temporary data for the products temp=demandingdata[demandingdata$Product==i,] #time series forecasting for the Demand timedata=ts(temp$Demand,start = 2013,frequency = 12) plot(timedata) #forcast for the Exponential Smooting m_ets = ets(timedata) f_ets = forecast(m_ets, h=3) # forecast 24 months into the future plot(f_ets) #applying auto.arima() function for the forecast m_aa = auto.arima(timedata) f_aa = forecast(m_aa, h=3) plot(f_aa) #TBATS model for the forecast. m_tbats = tbats(timedata) f_tbats = forecast(m_tbats, h=3) plot(f_tbats) #Barplot for the ETS, ARIMA and TBATS barplot(c(ETS=m_ets$aic, ARIMA=m_aa$aic, TBATS=m_tbats$AIC), col="light blue", ylab="AIC") last_date = index(timedata)[length(timedata)] #forecast for the predicted result of the Products forecast_df =f_aa x=as.data.frame(f_aa) pr=rbind(pr,x$`Point Forecast`) } #exporting csv write.csv(pr,file='predictedresult.csv')
  • 46. Solution-3
    Task-1
    1) Creating Data
       a) $ mysql -u root -p ******
       b) mysql> create database FusionOps;
    2) Pushing CSVs to MySQL Using Python:-
    #!/usr/bin/python
    import MySQLdb
    from pandas.io import sql
    import pandas as pd
    # Open database connection
    db = MySQLdb.connect("localhost","root","prerna1289","FusionOps" )
    cursor = db.cursor()
    # Load the sales data and push it to the SalesData table
    df=pd.read_csv("/home/prerna/Desktop/SalesData.csv")
    df['Sales Order Date']=pd.to_datetime(df['Sales Order Date'])
    df.to_sql(con=db, name='SalesData', if_exists='replace', flavor='mysql')
    sd=cursor.execute("select * from SalesData;")
    print sd
    # Load the purchasing data and push it to the PurchasingData table
    df=pd.read_csv("/home/prerna/Desktop/PurchasingData.csv")
    df['Replenishment Date']=pd.to_datetime(df['Replenishment Date'])
    df.to_sql(con=db, name='PurchasingData', if_exists='replace', flavor='mysql')
    pd1=cursor.execute("select * from PurchasingData;")
    print pd1
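As a rough, self-contained illustration of the CSV-to-table step, the sketch below uses Python's built-in sqlite3 module in place of MySQL so that it runs without a database server. This is a swapped-in technique, not the method used in the report. The helper name load_csv and the sample rows are hypothetical; only the table and column names are taken from the queries in this solution.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file, columns):
    """Recreate `table` with the given columns and insert every CSV row."""
    col_defs = ", ".join('"%s"' % c for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Equivalent in spirit to if_exists='replace': drop and recreate the table
    conn.execute('DROP TABLE IF EXISTS "%s"' % table)
    conn.execute('CREATE TABLE "%s" (%s)' % (table, col_defs))
    reader = csv.DictReader(csv_file)
    rows = [tuple(r[c] for c in columns) for r in reader]
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample data standing in for SalesData.csv
sample = io.StringIO(
    "Part_Number,ShopNo,Sales_Order_Date,Sales_Quantity\n"
    "P1,S1,2015-01-02,5\n"
    "P1,S1,2015-01-03,2\n"
)
columns = ["Part_Number", "ShopNo", "Sales_Order_Date", "Sales_Quantity"]
conn = sqlite3.connect(":memory:")
n = load_csv(conn, "SalesData", sample, columns)

# Values are passed as parameters rather than concatenated into the SQL text
rows = conn.execute(
    "SELECT Sales_Quantity FROM SalesData "
    "WHERE Part_Number = ? AND ShopNo = ? ORDER BY Sales_Order_Date",
    ("P1", "S1"),
).fetchall()
print(n, rows)
```

Passing values as query parameters avoids quoting mistakes when part or shop identifiers contain special characters.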
  • 47. Task-2
    1) Creating DailySalesAndStockData table:-
    #!/usr/bin/python
    from datetime import date,datetime
    import MySQLdb
    from pandas.io import sql
    import pandas as pd
    # Open database connection
    db = MySQLdb.connect("localhost","root","prerna1289","FusionOps" )
    cursor = db.cursor()
    # Empty frame that collects the rows for every (part, shop) pair
    data_you_need=pd.DataFrame(columns=['Date','PartNo', 'ShopNo' ,'Sales_Quantity' ,'Sales_Quantity_Cum' ,'End-of-day Stock'])
    # One row per distinct (part, shop) pair
    df1=pd.read_sql('select distinct Part_Number, ShopNo from SalesData;', con=db)
    df1=df1[['Part_Number','ShopNo']]
    for index, row in df1.iterrows():
        part=str(row['Part_Number'])
        shop=str(row['ShopNo'])
        # Sales and purchases for the current part and shop
        df_sale = pd.read_sql("select * from SalesData where Part_Number='"+part+"' and ShopNo='"+shop+"';", con=db)
        df_purchase= pd.read_sql("select * from PurchasingData where Part_Number='"+part+"' and ShopNo='"+shop+"';", con=db)
        # Daily calendar from 15 Dec 2014 to 31 Mar 2015
        date1=pd.date_range(date(2014,12,15), date(2015,3,31), freq='D')
        df=pd.DataFrame(date1, index=date1,columns=['Date'])
        df['PartNo']=part
        df['ShopNo']=shop
        # Attach daily sales; days without sales get 0
        result = pd.merge(df, df_sale[['Sales_Order_Date','Sales_Quantity']], how='left', left_on=['Date'], right_on=['Sales_Order_Date'])
        df = result.fillna(0)
        # Attach daily replenishments; days without replenishment get 0
        result = pd.merge(df, df_purchase[['Replenishment_Date','Quantity_Produced/Bought']], how='left', left_on=['Date'], right_on=['Replenishment_Date'])
        result= result[['Date','PartNo','ShopNo','Sales_Quantity','Quantity_Produced/Bought']].fillna(0)
  • 48. Task-2 code (continued):
        # (loop body, continued) cumulative sales and purchases per day
        result['Sales_Quantity_Cum']=result.Sales_Quantity.cumsum()
        result['Quantity_Produced/Bought_Cum']=result['Quantity_Produced/Bought'].cumsum()
        # End-of-day stock = cumulative purchases minus cumulative sales
        result['End-of-day Stock']=result['Quantity_Produced/Bought_Cum']-result['Sales_Quantity_Cum']
        result=result[['Date','PartNo', 'ShopNo' ,'Sales_Quantity' ,'Sales_Quantity_Cum' ,'End-of-day Stock']]
        # Keep only the rows after 31 Dec 2014
        data_you_need=pd.concat([data_you_need,result[result['Date']>pd.Timestamp('2014-12-31')]],ignore_index=True)
    # After the loop: export the combined result to CSV and to MySQL
    data_you_need.to_csv('out.csv',date_format='%d %b %Y')
    data_you_need.to_sql(con=db, name='DailySalesAndStockData', if_exists='replace', flavor='mysql')
    2) Please find the attached Solution4.csv
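The end-of-day stock calculation in Task-2 can be sketched in plain Python as follows. This is a simplified illustration, assuming the daily sales and purchase quantities are already aligned by day and that the opening stock is zero; the function name and the sample numbers are hypothetical.

```python
from itertools import accumulate


def end_of_day_stock(sales, purchases):
    """Cumulative purchases minus cumulative sales, day by day.

    Assumes both lists are aligned by day and the opening stock is zero.
    """
    cum_sales = list(accumulate(sales))
    cum_purchases = list(accumulate(purchases))
    return [p - s for p, s in zip(cum_purchases, cum_sales)]


# Hypothetical daily quantities for one part in one shop
sales = [0, 3, 2, 0, 4]
purchases = [10, 0, 0, 5, 0]
print(end_of_day_stock(sales, purchases))
```

Because the stock is derived from running totals, the calendar has to start before the first day that is reported, so that earlier replenishments and sales are already included in the cumulative sums.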