Estola meetup big_datacampla_6_14_evan_estola

•

0 likes•477 views

This document summarizes Evan Estola's presentation on machine learning and recommendations at Meetup. It discusses how Meetup uses tools like Hive, Spark, Scala, R and Python for tasks like collaborative filtering, cleaning data, and supervised learning to make recommendations in ways that address the challenges of sparsity, cold starts, and incomplete data for their platform. It also describes how they use ensemble learning and map Facebook likes to Meetup topics to generate additional features to improve their recommendation models. The results showed a positive weight and 1.5% lift in conversions from adding the Facebook-generated topics feature.

Data & Analytics Technology Education

Beyond Collaborative
Filtering: ML &
Recommendations at
Meetup
Evan Estola
Meetup.com
evan@meetup.com
@estola

My Background
● Machine Learning Engineer
● At Meetup since May 2012
● BS Computer Science
○ Information Retrieval
○ Data Mining
○ Math
■ Linear Algebra
■ Graph Theory

Why Meetup data is cool
● Real people meeting up
● Every meetup could change someone's life
● No ads, just do the best thing
● Oh and >125 million rsvps by >17 million
members
● ~3 million rsvps in the last 30 days
○ >1/second

Tools at Meetup
● Hive - SQL on Hadoop
● Spark - Distributed Scala on Hadoop cluster
● Scala - Recommendations service
● R - Data analysis, Model building
● Python - Scripting, Data organizing
● Java - Backend of our web stack

Collaborative Filtering
● Classic recommendations approach
● Users who like this also like this

Weaknesses of CF
● Sparsity
● Cold Start
● Coverage
● Diversity

Why Recs at Meetup are hard
● Incomplete Data (topics)
● Cold start
● Asking user for data is hard
● Going to meetups is scary
● Sparsity
○ Location
○ Groups/person
○ Membership: 0.001%
○ Compare to Netflix: 1%

Cleaning data
● Schenectady
● Beverly Hills
● Fake RSVP boosts (+100 guests!)
● Rsvp hogs

Real data is gross
● Preprocessing is critical!
○ missing data
○ outliers
○ log scale
○ bucketing
○ sampling bias

Supervised Learning/Classification
● “Inferring a function from labeled training
data”
● Joined Meetup group/Didn’t join Meetup
group

Ranking
● Membership << expected error rate
○ Sample to 50/50 join/no-join
● Model output label no longer explicitly true
● Use a classifier that gives you a useful
output

Ensemble Learning
“... use multiple learning algorithms to obtain
better predictive performance than could be
obtained from any of the constituent learning
algorithms”

Ensemble Learning
● Collaborative Filtering on Topics
● Other simple features

Logistic Regression Output
● RsvpScore 0.02
● FbFriends 2.02
● 2ndFbFriends 0.09
● AgeUnmatch -2.40
● GenUnmatch -3.37
● Distance -0.04
● StateMatch 0.54
● CountyMatch 0.41
● ZipScore 0.06
● TopicScore 4.14
● ExtendedTS 0.47
● RelatedTS 0.66
● FbLikeTS 0.78

Facebook Likes
● Lots of information, but how to use?
● Map to topics, let training the model take
care of the rest!
● Bonus: Recommendations server knows
topics, generated topics can be passed in by
request

Mapping FB Likes to Meetup Topics
● Text based?
○ Go(game) vs Go(lang)?
○ Burton?
● Data approach!
○ Grab most popular topics across all members with
the same like

Normalization
● Top topics for Burton-Likers
○ Meeting New People, Coffee, bla bla
○ Most popular still dominates
● Normalize based on expected topic
occurrence in sample

Normalization
● For members with a given Like
● Compare percent with each topic to
expected among total population
● Burton:
○ 20% “Meeting New People”
○ 9% “Snowboarding”
● Total population
○ 20% “Meeting New People”
○ 2% “Snowboarding

Processing
● Load FB Like connections, topics into
Hadoop
● Process with Hive to generate top topics for
each like
● Join with member likes to generate top
topics per member
● Add feature to model using FB-Like-
Generated-Topics crossover with groups...

Results
● Positive weight
○ Very good sign
○ Captured information about member identity, not just
behavior
● Deploy/Split test
○ 1.5% lift in conversion overall
○ (Only have facebook data for ~10% of members)

Thanks!
Smart people come work with me.
http://www.meetup.com/jobs/

Similar to Estola meetup big_datacampla_6_14_evan_estola

Parismlmeetupfinalslides 151209190037-lva1-app6892

mercedes calderon

Customer segmentation scbcn17

Julio Martinez

Life in CSE.pptx

VedVekhande

Computer Science Career Guidance

Deepak Sood

Be Part of a Community

Miriah Peterson

Cepstrum Placement Talk 2022.pptx

gyan98

Data analysis with Pandas and Spark

Felix Crisan

As systems and user bases grow, a once abundant resource can become scarce. While scaling out PlayStation services to millions of users at over a 100,000 requests/second, network throughput became a precious resource to optimize for. Alex and Dustin talk about how the microservices that power Playstation achieved low latency interactions while conserving on precious network bandwidth. These services powered by Amazon Elastic Load Balancing and Amazon DynamoDB benefitted from soft-state optimizations, a pattern that is used in complex interactions such as searching through a user’s social graph in sub 100 ms, or a user’s game library in 7 ms. As a developer utilizing Amazon Web services, you will discover new patterns and implementations which will better utilize your network, instances, and load balancers in order to deliver personalized experiences to millions of users while saving costs.

AWS re:Invent 2016| GAM302 | Sony PlayStation: Breaking the Bandwidth Barrier...

Amazon Web Services

Graph processing at scale using spark & graph frames

Ron Barabash

Mahout is an open source machine learning library from Apache. From its humble beginnings at Apache Lucene, the project has grown into a active community of developers, machine learning experts and enthusiasts. With v0.5 released recently, the project has been focussing full steam on developing stable APIs with an eye on our major milestone of v1.0. The speaker has been with Mahout from his days in college as a computer science student. The talk will focus on the major use cases of Mahout. The design decisions, things that worked, things that didn't, and things to expect in the future releases. http://sdec.kr/

SDEC2011 Mahout - the what, the how and the why

Korea Sdec

Dataiku hadoop summit - semi-supervised learning with hadoop for understand...

Dataiku

Maintaining high quality user generated content through machine learning

Nikhil Dandekar

Better programmer overview

CourseHunt

Until 2014, gutefrage.net was striving towards chaos and chaos had many faces like * Bad answers were shown to the users and hiding the good and helpful ones * Spam was overlooked in the huge amount of generated content * Tags do not always represent the intended topic of the question * The page load time led to aborting users before the site is fully rendered These problems imply bad consequences for the user experience, the manageability of our content, the image of our platform and also on the revenue. To tackle these big problems, we decided to leverage the full power of our data. We put great effort into an automatic rating of the answers and hide the really bad ones from every user, we alert problems with Spammers in realtime to our Community Management, improve the page load time tremendously and currently testing prototypes for (semi-) automatically inferring the topic of a question. I will show you on some examples, how we discovered the problems by making the data visible to everyone in the company, fixing them either by advanced Machine Learning techniques or by relying on the „collective brain“ of our community and improving the user experience step by step until the chaos is finally defeated.

Overcome the Reign of Chaos

Michael Stockerl

District Data Labs Workshop Current Workshop: August 23, 2014 Previous Workshops: - April 5, 2014 Data products are usually software applications that derive their value from data by leveraging the data science pipeline and generate data through their operation. They aren’t apps with data, nor are they one time analyses that produce insights - they are operational and interactive. The rise of these types of applications has directly contributed to the rise of the data scientist and the idea that data scientists are professionals “who are better at statistics than any software engineer and better at software engineering than any statistician.” These applications have been largely built with Python. Python is flexible enough to develop extremely quickly on many different types of servers and has a rich tradition in web applications. Python contributes to every stage of the data science pipeline including real time ingestion and the production of APIs, and it is powerful enough to perform machine learning computations. In this class we’ll produce a data product with Python, leveraging every stage of the data science pipeline to produce a book recommender.

Building Data Apps with Python

Benjamin Bengfort

Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...

Jose Mº Muñoz

This thesis presents the design and architecture of an Active Learning system for Question Answering on Multiparty Dialogue. The goal of this system is to collect a robust Question Answering dataset and to improve the performance of the system on Question Answering challenges on Multiparty Dialogue. The system has an interactive web-based user interface which allows users to challenge the system with their own questions regarding a short passage of dialogues between multiple characters in a TV series. This system makes use of a state-of-art Machine Learning model to predict the answers to users' questions. In the same time, the system learns from users' responses and performs online update on the model. The system uses probability functions to guide user towards contributing data needed most for model improvement. The system is designed to handle heavy internet traffic by efficiently storing data and by carefully synchronizing the shared resources in the web system. The system has shown promising results in guiding users to contribute high quality data useful for model training.

Active Learning on Question Answering with Dialogues

Jinho Choi

Quora ML Workshop: Maintaining High Quality User-Generated Content through Ma...

Quora

Presentation slides at RecSys 2016, Boston. At Quora, our mission is to share and grow the world’s knowledge. Recommender systems are at the core of this mission: we need to recommend the most important questions to people most likely to write great answers, and recommend the best answers to people interested in reading them. Driven by the above mission statement, we have a variety of interesting and challenging recommendation problems and a large, rich data set that we can work with to build novel solutions for them. In this talk, we will describe several of these recommendation problems and present our approaches solving them.

Recommending the world's knowledge

Lei Yang

How we won the 2nd place In KaggleDays Championship.pptx

jficwwc

Similar to Estola meetup big_datacampla_6_14_evan_estola (20)

Parismlmeetupfinalslides 151209190037-lva1-app6892

Customer segmentation scbcn17

Life in CSE.pptx

Computer Science Career Guidance

Be Part of a Community

Cepstrum Placement Talk 2022.pptx

Data analysis with Pandas and Spark

AWS re:Invent 2016| GAM302 | Sony PlayStation: Breaking the Bandwidth Barrier...

Graph processing at scale using spark & graph frames

SDEC2011 Mahout - the what, the how and the why

Dataiku hadoop summit - semi-supervised learning with hadoop for understand...

Maintaining high quality user generated content through machine learning

Better programmer overview

Overcome the Reign of Chaos

Building Data Apps with Python

Codemotion 2017 - "Dime cómo manejas tus datos y te diré qué clase de base de...

Active Learning on Question Answering with Dialogues

Quora ML Workshop: Maintaining High Quality User-Generated Content through Ma...

Recommending the world's knowledge

How we won the 2nd place In KaggleDays Championship.pptx

More from Data Con LA

Data Con LA 2022 Keynotes

Data Con LA

Data Con LA 2022 Keynotes

Data Con LA

Data Con LA 2022 Keynote

Data Con LA

Data Con LA 2022 - Startup Showcase

Data Con LA

Data Con LA 2022 Keynote

Data Con LA

Mike Limcaco, Analytics Specialist / Customer Engineer at Google Measure trends in a particular topic or search term across Google Search across the US down to the city-level. Integrate these data signals into analytic pipelines to drive product, retail, media (video, audio, digital content) recommendations tailored to your audience segment. We'll discuss how Google unique datasets can be used with Google Cloud smart analytic services to process, enrich and surface the most relevant product or content that matches the ever-changing interests of your local customer segment.

Data Con LA 2022 - Using Google trends data to build product recommendations

Data Con LA

Melinda Thielbar, Data Science Practice Lead and Director of Data Science at Fidelity Investments From corporations to governments to private individuals, most of the AI community has recognized the growing need to incorporate ethics into the development and maintenance of AI models. Much of the current discussion, though, is meant for leaders and managers. This talk is directed to data scientists, data engineers, ML Ops specialists, and anyone else who is responsible for the hands-on, day-to-day of work building, productionalizing, and maintaining AI models. We'll give a short overview of the business case for why technical AI expertise is critical to developing an AI Ethics strategy. Then we'll discuss the technical problems that cause AI models to behave unethically, how to detect problems at all phases of model development, and the tools and techniques that are available to support technical teams in Ethical AI development.

Data Con LA 2022 - AI Ethics

Data Con LA

Antje Barth, Principal Developer Advocate, AI/ML at AWS & Chris Fregly, Principal Engineer, AI & ML at AWS The frequency and severity of natural disasters are increasing. In response, governments, businesses, nonprofits, and international organizations are placing more emphasis on disaster preparedness and response. Many organizations are accelerating their efforts to make their data publicly available for others to use. Repositories such as the Registry of Open Data on AWS and Humanitarian Data Exchange contain troves of data available for use by developers, data scientists, and machine learning practitioners. In this session, see how a community of developers came together though the AWS Disaster Response hackathon to build models to support natural disaster preparedness and response.

Data Con LA 2022 - Improving disaster response with machine learning

Data Con LA

Sig Narvaez, Executive Solution Architect at MongoDB MongoDB is now a Developer Data Platform. Come learn whatï¿½s new in the 6.0 release and Atlas following all the recent announcements made at MongoDB World 2022. Topics will include - Atlas Search which combines 3 systems into one (database, search engine, and sync mechanisms) letting you focus on your product's differentiation. - Atlas Data Federation to seamlessly query, transform, and aggregate data from one or more MongoDB Atlas databases, Atlas Data Lake and AWS S3 buckets - Queryable Encryption lets you run expressive queries on fully randomized encrypted data to meet the most stringent security requirements - Relational Migrator which analyzes your existing relational schemas and helps you design a new MongoDB schema. - And more!

Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas

Data Con LA

Jaysen Gillespie, Head of Analytics and Data Science at RTB House 1. Shopkick has over 30M downloads, but the userbase is very heterogeneous. Anecdotal evidence indicated a wide variety of users for whom the app holds long-term appeal. 2. Marketing and other teams challenged Analytics to get beyond basic summary statistics and develop a holistic segmentation of the userbase. 3. Shopkick's data science team used SQL and python to gather data, clean data, and then perform a data-driven segmentation using a k-means algorithm. 4. Interpreting the results is more work -- and more fun -- than running the algo itself. We'll discuss how we transform from ""segment 1"", ""segment 2"", etc. to something that non-analytics users (Marketing, Operations, etc.) could actually benefit from. 5. So what? How did team across Shopkick change their approach given what Analytics had discovered.

Data Con LA 2022 - Real world consumer segmentation

Data Con LA

Ravi Pillala, Chief Data Architect & Distinguished Engineer at Intuit TurboTax is one of the well known consumer software brand which at its peak serves 385K+ concurrent users. In this session, We start with looking at how user behavioral data & tax domain events are captured in real time using the event bus and analyzed to drive real time personalization with various TurboTax data pipelines. We will also look at solutions performing analytics which make use of these events, with the help of Kafka, Apache Flink, Apache Beam, Spark, Amazon S3, Amazon EMR, Redshift, Athena and Amazon lambda functions. Finally, we look at how SageMaker is used to create the TurboTax model to predict if a customer is at risk or needs help.

Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...

Data Con LA

Data Con LA 2022 - Moving Data at Scale to AWS

Data Con LA

Anand Ranganathan, Chief AI Officer at Unscrambl Conversational AI is getting more and more widely used for customer support and employee support use-cases. In this session, I'm going to talk about how it can be extended for data analysis and data science use-cases ... i.e., how users can interact with a bot to ask analytical questions on data in relational databases. This allows users to explore complex datasets using a combination of text and voice questions, in natural language, and then get back results in a combination of natural language and visualizations. Furthermore, it allows collaborative exploration of data by a group of users in a channel in platforms like Microsoft Teams, Slack or Google Chat. For example, a group of users in a channel can ask questions to a bot in plain English like ""How many cases of Covid were there in the last 2 months by state and gender"" or ""Why did the number of deaths from Covid increase in May 2022"", and jointly look at the results that come back. This facilitates data awareness, data-driven collaboration and joint decision making among teams in enterprises and outside. In this talk, I'll describe how we can bring together various features including natural-language understanding, NL-to-SQL translation, dialog management, data story-telling, semantic modeling of data and augmented analytics to facilitate collaborate exploration of data using conversational AI.

Data Con LA 2022 - Collaborative Data Exploration using Conversational AI

Data Con LA

Anil Inamdar, VP & Head of Data Solutions at Instaclustr The most modernized enterprises utilize polyglot architecture, applying the best-suited database technologies to each of their organization's particular use cases. To successfully implement such an architecture, though, you need a thorough knowledge of the expansive NoSQL data technologies now available. Attendees of this Data Con LA presentation will come away with: -- A solid understanding of the decision-making process that should go into vetting NoSQL technologies and how to plan out their data modernization initiatives and migrations. -- They will learn the types of functionality that best match the strengths of NoSQL key-value stores, graph databases, columnar databases, document-type databases, time-series databases, and more. -- Attendees will also understand how to navigate database technology licensing concerns, and to recognize the types of vendors they'll encounter across the NoSQL ecosystem. This includes sniffing out open-core vendors that may advertise as â€œopen source,"" but are driven by a business model that hinges on achieving proprietary lock-in. -- Attendees will also learn to determine if vendors offer open-code solutions that apply restrictive licensing, or if they support true open source technologies like Hadoop, Cassandra, Kafka, OpenSearch, Redis, Spark, and many more that offer total portability and true freedom of use.

Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...

Data Con LA

Zia Khan, Computer Systems Analyst and Data Scientist at LearningFuze Data Science tutorial is designed for people who are new to Data Science. This is a beginner level session so no prior coding or technical knowledge is required. Just bring your laptop with WiFi capability. The session starts with a review of what is data science, the amount of data we generate and how companies are using that data to get insight. We will pick a business use case, define the data science process, followed by hands-on lab using python and Jupyter notebook. During the hands-on portion we will work with pandas, numpy, matplotlib and sklearn modules and use a machine learning algorithm to approach the business use case.

Data Con LA 2022 - Intro to Data Science

Data Con LA

Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment

Data Con LA

Curtis ODell, Global Director Data Integrity at Tricentis Join me to learn about a new end-to-end data testing approach designed for modern data pipelines that fills dangerous gaps left by traditional data management toolsâ€”one designed to handle structured and unstructured data from any source. You'll hear how you can use unique automation technology to reach up to 90 percent test coverage rates and deliver trustworthy analytical and operational data at scale. Several real world use cases from major banks/finance, insurance, health analytics, and Snowflake examples will be presented. Key Learning Objective 1. Data journeys are complex and you have to ensure integrity of the data end to end across this journey from source to end reporting for compliance 2. Data Management tools do not test data, they profile and monitor at best, and leave serious gaps in your data testing coverage 3. Automation with integration to DevOps and DataOps' CI/CD processes are key to solving this. 4. How this approach has impact in your vertical

Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...

Data Con LA

Arif Ansari, Professor at University of Southern California Super Bowl Ad cost $7 million and each year a few Super Bowl ads go viral. The traditional A/B testing does not predict virality. Some highly shared ones reach over 60 million organic views, which can be more valuable than views on TV. Not only are these voluntary, but they are typically without distraction, and win viewer engagement in the form of likes, comments, or shares. A Super Bowl ad that wins 69 million views on YouTube (e.g., Alexa Mind Reader) costs less than 10 cents per quality view! However, the challenge is triggering virality. We developed a method to predict virality and engineer virality into Ads. 1. Prof. Gerard J. Tellis and co-authors recommended that advertisers use YouTube to tease, test, and tweak (TTT) their ads to maximize sharing and viewing. 2022 saw that maxim put into practice. 2. We developed viral Ads prediction using two scientific models: a. Prof. Gerard Tellis et al.'s model for viral prediction b. Deep Learning viral prediction using social media effect 3. The model was able to identify all the top 15 Viral Ads it performed better than the traditional agencies. 4. New proposed method is Tease, Test, Tweak, Target and Spots Ad.

Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...

Data Con LA

Jai Bansal, Senior Manager, Data Science at Aetna This talk describes an internal data product called Member Embeddings that facilitates modeling of member medical journeys with machine learning. Medical claims are the key data source we use to understand health journeys at Aetna. Claims are the data artifacts that result from our members' interactions with the healthcare system. Claims contain data like the amount the provider billed, the place of service, and provider specialty. The primary medical information in a claim is represented in codes that indicate the diagnoses, procedures, or drugs for which a member was billed. These codes give us a semi-structured view into the medical reason for each claim and so contain rich information about members' health journeys. However, since the codes themselves are categorical and high-dimensional (10K cardinality), it's challenging to extract insight or predictive power directly from the raw codes on a claim. To transform claim codes into a more useful format for machine learning, we turned to the concept of embeddings. Word embeddings are widely used in natural language processing to provide numeric vector representations of individual words. We use a similar approach with our claims data. We treat each claim code as a word or token and use embedding algorithms to learn lower-dimensional vector representations that preserve the original high-dimensional semantic meaning. This process converts the categorical features into dense numeric representations. In our case, we use sequences of anonymized member claim diagnosis, procedure, and drug codes as training data. We tested a variety of algorithms to learn embeddings for each type of claim code. We found that the trained embeddings showed relationships between codes that were reasonable from the point of view of subject matter experts. In addition, using the embeddings to predict future healthcare-related events outperformed other basic features, making this tool an easy way to improve predictive model performance and save data scientist time.

Data Con LA 2022- Embedding medical journeys with machine learning to improve...

Data Con LA

Jie Chen, Manager Advisory, KPMG Data is the new oil. However, many organizations have fragmented data in siloed line of businesses. In this topic, we will focus on identifying the legacy patterns and their limitations and introducing the new patterns packed by Kafka's core design ideas. The goal is to tirelessly pursue better solutions for organizations to overcome the bottleneck in data pipelines and modernize the digital assets for ready to scale their businesses. In summary, we will walk through three uses cases, recommend Dos and Donts, Take aways for Data Engineers, Data Scientist, Data architect in developing forefront data oriented skills.

Data Con LA 2022 - Data Streaming with Kafka

Data Con LA

More from Data Con LA (20)

Data Con LA 2022 Keynotes

Data Con LA 2022 Keynote

Data Con LA 2022 - Startup Showcase

Data Con LA 2022 Keynote

Data Con LA 2022 - Using Google trends data to build product recommendations

Data Con LA 2022 - AI Ethics

Data Con LA 2022 - Improving disaster response with machine learning

Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas

Data Con LA 2022 - Real world consumer segmentation

Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...

Data Con LA 2022 - Moving Data at Scale to AWS

Data Con LA 2022 - Collaborative Data Exploration using Conversational AI

Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...

Data Con LA 2022 - Intro to Data Science

Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment

Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...

Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...

Data Con LA 2022- Embedding medical journeys with machine learning to improve...

Data Con LA 2022 - Data Streaming with Kafka

Recently uploaded

Invezz.com - Grow your wealth with trading signals

Invezz1

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage with sex Booking Contact Details :- WhatsApp Chat :- +91-9711199171 27-April-2024(SMW) Best Escorts Service in Delhi Call Us — 9711199171 —Are you Looking for Call Girls in Delhi then This's a perfect place for you. We brings Many Models and Independent Call girls who Can fulfill all your Sexual Desires by Their Complete package of sex Service and It's Absolutely safe and secure. Book Your One night Stand Call Girls : 9711199171 -SERVICES- Our Delhi Call Girls are fully cooperative and understand your needs and they Give you their complete package of Sex service like- Real girl friend experience, Lip Kiss, Lip Lock, Smooch, Sucking without Condom, Oral, Erotic Massage, Lap Dance, Threesome, 69 Licking, Sex in all Position, Anal Sex etc, -AVAILABLE GIRLS- We Bring Many Good looking Slim, Busty, Hot and Sexy Call Girls as per our client's Choice like Housewives, College girls, Russian girls, Muslim girls, Afghani girls, Bengali girls, Working girls, south Indian girls, Punjabi girls, Big Boobs Call Girls etc, -PAYMENT METHOD- C O D. If you Want to Book our Call girls Service in Delhi . We don,t ask anything in Advance, you Can simply pay to her Cash on Delivery or via any upi. In-Call: — You can reach at our Place in Delhi Our place Hotel and Flat Which Is Very Clean Hygiene And 100% safe Accommodation. Out-Call: — You Can pick up the Girl from my Place if not we Provide Door-Step Call Girls Delivery in Delhi. NOTE: — Pic Collectors Time Passers and Bargainers please Stay Away As We Respect The Value For Your Money Time And Expect The Same From You. OUR SERVICES RATE: – let you know the our Call girls price in Delhi are very affordable and Best rate in genuine market of Call girls. But as all you knows that Quality has comes with its own Price tag. One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —9711199171 We are available 24*7 all days of the year. Call us — 9711199171 Thank you for Visiting.

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...

shivangimorya083

Week-01-2.ppt BBB human Computer interaction

fulawalesam

Accredited-Transport-Cooperatives-Jan-2021-Web.pdf

adriantubila

Saudi Arabia [ Abortion pills) Jeddah/riaydh/dammam/+966572737505☎️] cytotec tablets uses abortion pills 💊💊 How effective is the abortion pill? 💊💊 +966572737505) "Abortion pills in Jeddah" how to get cytotec tablets in Riyadh " Abortion pills in dammam*💊💊 The abortion pill is very effective. If you’re taking mifepristone and misoprostol, it depends on how far along the pregnancy is, and how many doses of medicine you take:💊💊 +966572737505) how to buy cytotec pills At 8 weeks pregnant or less, it works about 94-98% of the time. +966572737505[ 💊💊💊 At 8-9 weeks pregnant, it works about 94-96% of the time. +966572737505) At 9-10 weeks pregnant, it works about 91-93% of the time. +966572737505)💊💊 If you take an extra dose of misoprostol, it works about 99% of the time. At 10-11 weeks pregnant, it works about 87% of the time. +966572737505) If you take an extra dose of misoprostol, it works about 98% of the time. In general, taking both mifepristone and+966572737505 misoprostol works a bit better than taking misoprostol only. +966572737505 Taking misoprostol alone works to end the+966572737505 pregnancy about 85-95% of the time — depending on how far along the+966572737505 pregnancy is and how you take the medicine. +966572737505 The abortion pill usually works, but if it doesn’t, you can take more medicine or have an in-clinic abortion. +966572737505 When can I take the abortion pill?+966572737505 In general, you can have a medication abortion up to 77 days (11 weeks)+966572737505 after the first day of your last period. If it’s been 78 days or more since the first day of your last+966572737505 period, you can have an in-clinic abortion to end your pregnancy.+966572737505 Why do people choose the abortion pill? Which kind of abortion you choose all depends on your personal+966572737505 preference and situation. With+966572737505 medication+966572737505 abortion, some people like that you don’t need to have a procedure in a doctor’s office. You can have your medication abortion on your own+966572737505 schedule, at home or in another comfortable place that you choose.+966572737505 You get to decide who you want to be with during your abortion, or you can go it alone. Because+966572737505 medication abortion is similar to a miscarriage, many people feel like it’s more “natural” and less invasive. And some+966572737505 people may not have an in-clinic abortion provider close by, so abortion pills are more available to+966572737505 them. +966572737505 Your doctor, nurse, or health center staff can help you decide which kind of abortion is best for you. +966572737505 More questions from patients: Saudi Arabia+966572737505 CYTOTEC Misoprostol Tablets. Misoprostol is a medication that can prevent stomach ulcers if you also take NSAID medications. It reduces the amount of acid in your stomach, which protects your stomach lining. The brand name of this medication is Cytotec®.+966573737505) Unwanted Kit is a combination of two medici

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec

Abortion pills in Riyadh +966572737505 get cytotec

Mature dropshipping via API with DroFx.pptx

olyaivanovalion

Zuja dropshipping via API with DroFx.pptx

olyaivanovalion

(Vivek)Call Us, 8448380779,Call girls in Delhi NCr – We Offer best in class call girls. escort Service At Affordable Price At low Rate with Space Night 8000 We Are One Of The Oldest Escort and Call girls Agencies in Delhi. You Will Find That Our Female Escorts Are Full Of Fun, Sexy And They Would Love Enjoy Your Company. We Have A Fantastic Selection Of Escort Ladies Available For In-Calls As Well As Out-Calls. Our Escorts Are Not Only Beautiful But All Have Great Personalities Making Them The Perfect Companion For Any Occasion. In-Call:- You Can Come At Our Place in Delhi Our place Which Is Very Clean Hygienic 100% safe Accommodation. Out-Call:- You have To Come Pick The Girl From My Place We Are Also Provide Door Step Services (Delhi Ncr, Noida, Gurgaon, Faridabad, Ghaziabad Note:- Pic Collectors Time Passers Bargainers Stay Away As We Respect The Value For Your Money Time And Expect The Same From You Hygienic:- Full Ac room And Clean Rooms Available In Hotel 24 * 7 Hourly In Delhi NCR More Details, With WhatsApp Number, +91-8448380779

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service

Delhi Call girls

BigBuy dropshipping via API with DroFx.pptx

olyaivanovalion

Carero dropshipping via API with DroFx.pptx

olyaivanovalion

CebaBaby dropshipping via API with DroFX.pptx

olyaivanovalion

Smarteg dropshipping via API with DroFx.pptx

olyaivanovalion

Call Girls In Connaught Place Delhi Call or Whataap 🔝 9953056974 🔝Escorts provide 24×7 Available With Room TIMINGS 24 HOURS OPENS Booking Now Gentleman Only:-Call Now Best High Class Normal Call Girls Escorts Service In Delhi NCR 24-7 Hours Available Service I, provide In Delhi NCR Female Escorts Sex Service 100% Customers Satisfaction Guarantee VIP Profiles Top Grade Service 100% Cooperative All round Service 🔝 9953056974 🔝 InCall: – You Can Reach At Our Place in Delhi Our place Which Is Very Clean Hygienic 100% safe Accommodation OutCall: – Service For Out Call You have To Come Pick The Girl From My Place We Also Provide Door Step Services Note: – Pic Collectors Time Passers Bargainers Stay Away As We Respect The Value For Your Money Time And Expect The Same From You 🔝 9953056974 🔝 Hygienic: – Full Ac Neat And Clean Rooms Available In Hotel 24 * 7 Hrs In Delhi Ncr 🔝 9953056974 🔝 Place: – South Extension Nehru Place Saket Malviya Nagar Munirka Vasant Kunj Safdarjung Katwaria Sarai Lajpat Nagar Kalkaji Hauz Khas Mahipalpur Dwarka Karol Bagh Noida Gurgaon Faridabad All Outcall Only Hotel Service In Delhi Ncr 🔝 9953056974 🔝 We Are Providing : – House Wife’s : – Private Independent House Wife’ : – Private Independent Collage Going Girls : – Corporate MNC Working Profiles : – Call Center Girls: – Live Band Girls : – Foreigners Many More: – Independent Models Service type For Pics And Other Details Pls Whatsapp Me Otherwise Call Me Any Time Incall Outcall Both Are Services Available Door Step, home, Apartment, Guest House, Flate, All Star Hotel Available 99530 vip 56974 THANKS FOR VISITING Booking 24×7 HRS

Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh

9953056974 Low Rate Call Girls In Saket, Delhi NCR

Midocean dropshipping via API with DroFx

olyaivanovalion

BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx

MohammedJunaid861692

Unlock the Secrets of Discriminative Models: Master the Art of Prediction! 🌟 Dive into our latest presentation where we simplify the complex world of machine learning. Discover how discriminative models, from Logistic Regression to Neural Networks, optimize decision-making by focusing on the relationships between observed features and outcomes. Perfect for beginners and experts alike! 🚀 #MachineLearning #DataScience #AI #PredictiveAnalytics #100ConceptsOfAIwithAnupamaK

100-Concepts-of-AI by Anupama Kate .pptx

Anupama Kate

April 2024 - Crypto Market Report's Analysis

manisha194592

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore Escorts Service Booking Contact Details :- WhatsApp Chat :- +91-7737669865 2-May-2024(SMW) Call Girls In Model Towh Bangalore +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in Bangalore NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in Bangalore, Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service Bangalore NCRWelcome To Bangalore Escorts Service – An All Over New Bangalore Very Sexy Hot Call Girls Agency Service Escorts In South BangaloreNCRBangalore’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Bangalore Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...

amitlee9823

Digital advertising, or paid media, encompasses the strategic deployment of online advertisements to reach target audiences efficiently and effectively. This includes any digital platform that supports advertising to deliver unique messages for any objective. Understanding the mechanics of digital advertising platforms, along with insights into audience behaviors and preferences, allows marketers to optimize their ad spend and achieve significant engagement and conversion rates. This lecture is for Advanced Digital & Social Media Strategy (MGMTX 466.05) at UCLA Extension.

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...

Valters Lauzums

Call Girl In Dwarka ☎92055#41914 ¶¶ Indian,Russian Best Quality full Educated And Full Cooperative Independent Call Girls Escort Services In New Delhi- I Have Extremely Beautiful Broad Minded Cute Sexy & Hot Call Girls and Escorts, We Are Located in 3* 4* 5* Hotels in Delhi. Safe & Secure High Class Services Affordable Rate 100% Satisfaction, Unlimited Enjoyment. Any Time for Model/Teens Escort in Delhi High class luxury and premium escorts agency Indian Russian Call Girls In Delhi Booking Good High Profile Escorts (Call Girls) In Delhi 5 Star Hotel ,Incall Service,OutCall Service, We provide services by Call Girls,College Girls,Modals Get High Profile queens,Well Educated,Good Looking,Full Cooperative Model, Russian Models,Punjabi Girls Kashmeri Girls Services etc… We Provide Hottest Female With Safe And Consensual With Most Limits Respected Complete Satisfaction Guaranteed…Service. Call Me Spacial For Including Incall//outcall Service In New Delhi Indian Russian Escorts Service

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...

Delhi Call girls

Recently uploaded (20)

Invezz.com - Grow your wealth with trading signals

Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...

Week-01-2.ppt BBB human Computer interaction

Accredited-Transport-Cooperatives-Jan-2021-Web.pdf

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec

Mature dropshipping via API with DroFx.pptx

Zuja dropshipping via API with DroFx.pptx

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service

BigBuy dropshipping via API with DroFx.pptx

Carero dropshipping via API with DroFx.pptx

CebaBaby dropshipping via API with DroFX.pptx

Smarteg dropshipping via API with DroFx.pptx

Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh

Midocean dropshipping via API with DroFx

BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx

100-Concepts-of-AI by Anupama Kate .pptx

April 2024 - Crypto Market Report's Analysis

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...

Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...

Estola meetup big_datacampla_6_14_evan_estola

1. Beyond Collaborative Filtering: ML & Recommendations at Meetup Evan Estola Meetup.com evan@meetup.com @estola

2. My Background ● Machine Learning Engineer ● At Meetup since May 2012 ● BS Computer Science ○ Information Retrieval ○ Data Mining ○ Math ■ Linear Algebra ■ Graph Theory

3. Meetup what are you

4. Why Meetup data is cool ● Real people meeting up ● Every meetup could change someone's life ● No ads, just do the best thing ● Oh and >125 million rsvps by >17 million members ● ~3 million rsvps in the last 30 days ○ >1/second

6. Tools at Meetup ● Hive - SQL on Hadoop ● Spark - Distributed Scala on Hadoop cluster ● Scala - Recommendations service ● R - Data analysis, Model building ● Python - Scripting, Data organizing ● Java - Backend of our web stack

7. “Everything is a recommendation”

9. Collaborative Filtering ● Classic recommendations approach ● Users who like this also like this

10. Weaknesses of CF ● Sparsity ● Cold Start ● Coverage ● Diversity

11. Why Recs at Meetup are hard ● Incomplete Data (topics) ● Cold start ● Asking user for data is hard ● Going to meetups is scary ● Sparsity ○ Location ○ Groups/person ○ Membership: 0.001% ○ Compare to Netflix: 1%

12. Cleaning data ● Schenectady ● Beverly Hills ● Fake RSVP boosts (+100 guests!) ● Rsvp hogs

13. Real data is gross ● Preprocessing is critical! ○ missing data ○ outliers ○ log scale ○ bucketing ○ sampling bias

14. Supervised Learning/Classification ● “Inferring a function from labeled training data” ● Joined Meetup group/Didn’t join Meetup group

15. Ranking ● Membership << expected error rate ○ Sample to 50/50 join/no-join ● Model output label no longer explicitly true ● Use a classifier that gives you a useful output

16.

17. Topic Match

18. State Match

19. Ensemble Learning “... use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms”

20. Ensemble Learning ● Collaborative Filtering on Topics ● Other simple features

21. Logistic Regression Output ● RsvpScore 0.02 ● FbFriends 2.02 ● 2ndFbFriends 0.09 ● AgeUnmatch -2.40 ● GenUnmatch -3.37 ● Distance -0.04 ● StateMatch 0.54 ● CountyMatch 0.41 ● ZipScore 0.06 ● TopicScore 4.14 ● ExtendedTS 0.47 ● RelatedTS 0.66 ● FbLikeTS 0.78

22. Facebook Likes ● Lots of information, but how to use? ● Map to topics, let training the model take care of the rest! ● Bonus: Recommendations server knows topics, generated topics can be passed in by request

23. Mapping FB Likes to Meetup Topics ● Text based? ○ Go(game) vs Go(lang)? ○ Burton? ● Data approach! ○ Grab most popular topics across all members with the same like

24. Normalization ● Top topics for Burton-Likers ○ Meeting New People, Coffee, bla bla ○ Most popular still dominates ● Normalize based on expected topic occurrence in sample

25. Normalization ● For members with a given Like ● Compare percent with each topic to expected among total population ● Burton: ○ 20% “Meeting New People” ○ 9% “Snowboarding” ● Total population ○ 20% “Meeting New People” ○ 2% “Snowboarding

26. Processing ● Load FB Like connections, topics into Hadoop ● Process with Hive to generate top topics for each like ● Join with member likes to generate top topics per member ● Add feature to model using FB-Like- Generated-Topics crossover with groups...

27. Results ● Positive weight ○ Very good sign ○ Captured information about member identity, not just behavior ● Deploy/Split test ○ 1.5% lift in conversion overall ○ (Only have facebook data for ~10% of members)

28. Thanks! Smart people come work with me. http://www.meetup.com/jobs/

Estola meetup big_datacampla_6_14_evan_estola

Recommended

Recommended

More Related Content

Similar to Estola meetup big_datacampla_6_14_evan_estola

Similar to Estola meetup big_datacampla_6_14_evan_estola (20)

More from Data Con LA

More from Data Con LA (20)

Recently uploaded

Recently uploaded (20)

Estola meetup big_datacampla_6_14_evan_estola