Gentle introduction to Machine Learning

We start with a presentation of 1Tap, then give a gentle introduction to Machine Learning.

Published in: Technology
  1. A Gentle Introduction to Machine Learning. Roman Orac, 1Tap, Machine Learning & Data Analysis
  2. 1Tap is a Fully Automated Accounting Platform for the Self Employed*. *Sole Trader, Sole Proprietor, Freelancer, Contractor, Independent, Non-Incorporated Businesses
  3. The Self Employed can’t buy the stuff they want. Topics: Profit… Welfare… Taxes… Credit… Responses: "No idea", "That is a problem for the new year...", "Denied...", "Hopefully I get better real soon..."
  4. Our Mission: Making Self Employment > Employment
  5. 1Tap Receipts: (1) Take a photo, (2) Data extracted, (3) Tax return updated, (4) Customers love it
  6. The foundation of our apps: Ruby on Rails, RESTful JSON API, 4.0 Code Climate GPA
  7. Enough about us … What is Machine Learning anyway?
  8. What is Machine Learning? ● Machine Learning is the science of getting computers to act without being explicitly programmed. [Diagram labels: Training data, Pre-processing, Machine Learning algorithm, Classifier, New samples, Prediction]
  9. Predict survival on the Titanic. In 1912 the Titanic sank, killing 1,502 out of 2,224 passengers and crew. Some groups of people were more likely to survive than others.
  10. Let’s look at the data. Abbreviations: ● Embarked: Port of embarkation ○ C = Cherbourg ○ Q = Queenstown ○ S = Southampton ● Parch: Number of parents/children aboard ● Pclass: Passenger's class ● SibSp: Number of siblings/spouses aboard ● Survived: Survived (1) or died (0) ● Ticket: Ticket number
  11. Understanding the data ● Distributions of the fare of passengers who survived or did not survive ● Many passengers with cheaper fares died ● Is fare a good predictive variable?
  12. Most Important Step: Data preprocessing (original data → preprocessed data) ● Clean the data ● Encode attributes ● Fill in missing values ● Add new attributes
  13. Decision Tree ● Use the training set to build a decision tree model ● Use the model to predict new samples
  14. What types of problems do we solve with ML at 1Tap?
  15. Receipt categorization. Initial receipt categorization was based on the company’s industry: deterministic categorization, many mis-categorizations. The Numbers: 600K categorized receipts, 40K users, 80K new receipts every month
  16. Receipt categorization with ML: categorizing receipts in a smarter and more contextual way
  17. Receipt categorization with ML ● Features: ○ user’s profession ○ vendor name, date, expense total and text ● Preprocessing: ○ Filter receipts ○ Recategorize most obvious receipts ● Train a classifier that categorizes receipts ● This approach improves categorization as receipt text adds more context
  18. Questions?
  19. Come talk to us over pizza! Nejc, Human Resources; Roman, Machine Learning; Vesna, Head of Product
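
The Titanic walkthrough in slides 9 to 13 (preprocess the data, build a decision tree from the training set, predict new samples) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used in the talk: the rows in `RAW` are made up to match the column names on slide 10 (they are not real passenger records), `FamilySize` is a hypothetical example of an "added attribute", and the tree is a small Gini-based learner written for readability rather than the implementation the presenter used.

```python
from collections import Counter
from statistics import median

# Hypothetical rows shaped like the slide's columns. NOT real Titanic records.
RAW = [
    {"Pclass": 3, "SibSp": 1, "Parch": 0, "Fare": 7.25,  "Embarked": "S",  "Survived": 0},
    {"Pclass": 1, "SibSp": 1, "Parch": 0, "Fare": 71.28, "Embarked": "C",  "Survived": 1},
    {"Pclass": 3, "SibSp": 0, "Parch": 0, "Fare": 7.92,  "Embarked": "Q",  "Survived": 0},
    {"Pclass": 1, "SibSp": 1, "Parch": 0, "Fare": 53.10, "Embarked": "S",  "Survived": 1},
    {"Pclass": 3, "SibSp": 0, "Parch": 0, "Fare": 8.05,  "Embarked": None, "Survived": 0},
    {"Pclass": 3, "SibSp": 3, "Parch": 1, "Fare": None,  "Embarked": "S",  "Survived": 0},
    {"Pclass": 2, "SibSp": 0, "Parch": 2, "Fare": 30.07, "Embarked": "C",  "Survived": 1},
    {"Pclass": 2, "SibSp": 1, "Parch": 0, "Fare": 26.00, "Embarked": "S",  "Survived": 1},
]
EMBARKED_CODES = {"C": 0, "Q": 1, "S": 2}  # encode the port of embarkation as an integer

def preprocess(rows):
    """Fill missing values, encode attributes, and add a new attribute."""
    fill_fare = median(r["Fare"] for r in rows if r["Fare"] is not None)
    ports = [r["Embarked"] for r in rows if r["Embarked"] is not None]
    fill_port = Counter(ports).most_common(1)[0][0]
    X, y = [], []
    for r in rows:
        fare = r["Fare"] if r["Fare"] is not None else fill_fare
        port = r["Embarked"] if r["Embarked"] is not None else fill_port
        family_size = r["SibSp"] + r["Parch"] + 1  # hypothetical added attribute
        X.append([r["Pclass"], r["SibSp"], r["Parch"], fare,
                  EMBARKED_CODES[port], family_size])
        y.append(r["Survived"])
    return X, y

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (weighted_gini, feature_index, threshold) of the best split, or None."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, depth=0, max_depth=3):
    """A leaf is a class label; an inner node is (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    split = best_split(X, y)
    if split is None:
        return majority
    _, f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [lab for _, lab in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

X, y = preprocess(RAW)   # training set after preprocessing
tree = build_tree(X, y)  # fit the decision tree on the training set
```

A new sample is classified with `predict(tree, features)`, where `features` lists the values in the same order that `preprocess` produces: Pclass, SibSp, Parch, Fare, encoded Embarked, FamilySize. With real data, a held-out test set would be needed to judge how well the tree generalizes; this toy set is far too small for that.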