Analyzing Mixpanel Data into Amazon Redshift
This presentation is an overview guide that helps define a process, or data pipeline, for loading data from Mixpanel into Amazon Redshift for further analysis.
We will see how to:
- access and extract data from Mixpanel through its API
- load it into Redshift

This is not a full solution: you will still need to write the code that gets the data, and make sure that the process runs every time new data is generated.
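
As a rough illustration of the extraction step, here is a minimal Python sketch. It only builds the request URL for a raw export and parses a response body. The endpoint URL, the `from_date`/`to_date`/`event` parameter names, and the one-JSON-object-per-line response shape are assumptions based on Mixpanel's raw export API as commonly documented, so check them against the current API reference. Authentication and the HTTP call itself are deliberately left out (any of the clients listed in the slides would do), and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: raw export endpoint and parameter names; verify against
# Mixpanel's current API documentation before relying on them.
EXPORT_ENDPOINT = "https://data.mixpanel.com/api/2.0/export"


def build_export_url(from_date, to_date, event_names=None):
    """Build the URL for a raw-event export over a date range (YYYY-MM-DD)."""
    params = {"from_date": from_date, "to_date": to_date}
    if event_names:
        # Assumption: the optional event filter is passed as a JSON-encoded list.
        params["event"] = json.dumps(event_names)
    return EXPORT_ENDPOINT + "?" + urlencode(params)


def parse_export_body(body):
    """Parse an export response body: one JSON object per non-empty line."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if line:
            events.append(json.loads(line))
    return events


# Hypothetical sample of what two exported events might look like.
SAMPLE_BODY = (
    '{"event": "Signup", "properties": {"distinct_id": "u1", "time": 1451606400}}\n'
    '{"event": "Login", "properties": {"distinct_id": "u2", "time": 1451606460}}\n'
)

events = parse_export_body(SAMPLE_BODY)
```

Keeping URL construction and parsing separate from the network call makes it easy to rerun the same logic on a schedule, which is the part the description says you still have to build yourself.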


Analyzing Mixpanel Data into Amazon Redshift

  1. Analyzing Mixpanel Data into Amazon Redshift
  2. How? Define the data pipeline:
     • Access and extract data from Mixpanel
     • Prepare the data
     • Load the data into Amazon Redshift
  3. Who am I? Kostas Pardalis, Co-founder & CEO, Blendo.co, @KostasPardalis
  4. Why did we build Blendo? The simplest platform to get and remix your data from any source. Make your data available anywhere.
  5. Mixpanel? Mixpanel helps you easily measure what people are doing in your app on iOS, Android, and web.
  6. Amazon Redshift
  7. How to Analyze Mixpanel Data?
  8. Three options:
     • Use the Mixpanel internal reports
     • Write JQL
     • Load the data into a data warehouse for SQL access
  9. How to Extract Data from Mixpanel?
  10. Use Mixpanel’s Export API: https://mixpanel.com/docs/api-documentation/data-export-api
      Access it with:
      • cURL
      • Postman
      • Apache HttpClient for Java
      • Spray-client for Scala
      • Hyper for Rust
      • Ruby rest-client
      • Python http-client
  11. Use Mixpanel’s Export API: https://mixpanel.com/docs/api-documentation/data-export-api
      Or use Mixpanel’s libraries/SDKs:
      • Python
      • PHP
      • Ruby
      • JavaScript
  12. Mixpanel API Resources
      Annotations
      • annotations - list the annotations for a specified date range
      • create - create an annotation
      • update - update an annotation
      • delete - delete an annotation
      Export
      • export - get a "raw dump" of tracked events over a time period
  13. Mixpanel API Resources
      Events
      • events - get total, unique, or average data for a set of events over a time period
      • top - get the top events from the last day
      • names - get the top event names for a time period
      Event Properties
      • properties - get total, unique, or average data from a single event property
      • top - get the top properties for an event
      • values - get the top values for a single event property
  14. Mixpanel API Resources
      Funnels
      • funnels - get data for a set of funnels over a time period
      • list - get a list of the names of all the funnels
      Segmentation
      • segmentation - get data for an event, segmented and filtered by properties over a time period
      • numeric - get numeric data, divided up into buckets, for an event segmented and filtered by properties over a time period
      • sum - get the sum of a segment's values per time unit
      • average - get the average of a segment's values per time unit
      • Segmentation Expressions - a detailed overview of what a segmentation expression consists of
  15. Mixpanel API Resources
      Retention
      • retention - get data about how often people are coming back (cohort analysis)
      • addiction - get data about how frequently people are performing events
      People Analytics
      • engage - get data from People Analytics
  16. Mixpanel API Resources
      Let’s assume that we want to export our raw data from Mixpanel. To do so, we’ll need to execute requests to the export endpoint, e.g. “a request that would get us back raw events from Mixpanel”.
  17. Prepare Mixpanel Data for Amazon Redshift
  18. Prepare Mixpanel Data for Amazon Redshift
      • Follow the Amazon Redshift data model
      • Map your data into tables and columns
      • Adhere to the data types that are supported by Redshift*
      • Keep in mind the best practices that Amazon has published regarding the design of a Redshift database
      Amazon Redshift is built around industry-standard SQL, with added functionality to manage very large datasets and high-performance analysis.
      * Your data probably arrives in a representation like JSON, which supports a much smaller range of data types, so you have to be careful about what data you feed into Redshift and make sure that you have mapped your types.
  19. Load Data from Mixpanel to Amazon Redshift
  20. Put the data in a source that Redshift can pull it from:
      • Amazon S3
      • Amazon DynamoDB
      • Amazon Kinesis Firehose
  21. Amazon S3
      1. Use the AWS REST API
  22. Amazon S3
      2. Create a bucket
      Execute an HTTP PUT on the AWS REST API endpoints for S3. (Use cURL or Postman, or use the libraries provided by Amazon.)*
      * You can find more information in the API reference for the Bucket operations in the AWS documentation.
  23. Amazon S3
      3. Start sending your data to Amazon S3
      • Use the same AWS REST API
      • Use the endpoints for Object operations
  24. Amazon DynamoDB
      • DynamoDB imports data from S3
      • Adds another step between S3 and Amazon Redshift
  25. Amazon Kinesis Firehose
      1. Create a delivery stream
      2. Add data to the stream
      * Whenever you add new data to the stream, Kinesis takes care of adding that data to S3 or Redshift. Going through S3 in this case is redundant if your goal is to move your data to Redshift.
      Amazon Kinesis Firehose offers a real-time streaming approach to data importing.
      • Use the same AWS REST API
      • Push by using a Kinesis Agent
  26. Load Data into Redshift #1: INSERT
      1. Connect to your Amazon Redshift instance with your client (JDBC or ODBC)
      2. Perform an INSERT command for your data
      For more information, check the INSERT examples page in the Amazon Redshift documentation.
  27. Load Data into Redshift #2: COPY
      1. Connect to your Amazon Redshift instance with your client (JDBC or ODBC)
      2. Perform a COPY command for your data
      For more examples of how to invoke a COPY command, check the COPY examples page in the Amazon Redshift documentation.
  28. An even easier way?
  29. Blendo
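
The preparation and COPY steps from slides 18 and 27 can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the presenter's implementation. The target table `mixpanel_events` and its four columns are hypothetical, the event shape (an `event` name plus a `properties` object carrying `distinct_id` and an epoch-seconds `time`) is an assumption about Mixpanel's raw export, and the epoch is treated as UTC, which may not match your project's timezone setting. The sketch only builds the COPY statement as a string. Uploading the file to S3 and executing the statement over JDBC/ODBC are left out.

```python
import json
from datetime import datetime, timezone

# Hypothetical target table:
#   mixpanel_events(event VARCHAR, distinct_id VARCHAR,
#                   event_time TIMESTAMP, properties VARCHAR)


def event_to_row(event):
    """Map one raw event onto explicitly typed columns."""
    props = dict(event.get("properties", {}))
    distinct_id = props.pop("distinct_id", None)
    epoch = props.pop("time", None)
    event_time = None
    if epoch is not None:
        # Assumption: "time" is epoch seconds, interpreted here as UTC.
        moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        event_time = moment.strftime("%Y-%m-%d %H:%M:%S")
    return {
        "event": event.get("event"),
        "distinct_id": None if distinct_id is None else str(distinct_id),
        "event_time": event_time,
        # Remaining properties are kept as a JSON string in a single column.
        "properties": json.dumps(props, sort_keys=True),
    }


def rows_to_json_lines(rows):
    """Serialize rows as newline-delimited JSON, ready to upload to S3."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows) + "\n"


def build_copy_statement(table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement for a JSON file staged in S3.

    The arguments are interpolated directly, so pass trusted values only.
    """
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS JSON 'auto' TIMEFORMAT 'auto';"
    )


# Hypothetical sample event, bucket, and role.
row = event_to_row(
    {"event": "Signup",
     "properties": {"distinct_id": "u1", "time": 1451606400, "plan": "free"}}
)
payload = rows_to_json_lines([row])
copy_sql = build_copy_statement(
    "mixpanel_events",
    "s3://example-bucket/mixpanel/2016-01-01.json",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
```

With `JSON 'auto'`, COPY matches the JSON keys to column names, which is why each row is keyed by the column names of the hypothetical table. For large volumes, COPY from S3 is generally preferable to row-by-row INSERT.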
