Dharmesh Vaya
@DRVaya
http://drvaya.wordpress.com/
Agenda
● What is Big Data ?
● Available Big Data Solutions & Issues
● Why Google BigQuery ?
● Inside BigQuery
● Features & Components
● RESTful API
● Development with BigQuery (Live Demo)
○ Query History, Projects, DataSets, Public Datasets, Table Details, Writing
Queries, Save Results.
○ Integration with Applications.
● BigQuery Tools
● Big Data Solution with BigQuery & Google Cloud Platform
● Pricing Model
● Any questions ?
What is Big Data ?
Is it a Data Type ?
No
It's a buzzword: a massive volume of
structured and/or unstructured data.
It is so large that it is difficult to
process/analyze using traditional databases.
What is Big Data ?
Data that has following attributes can be ‘Big Data’
So how Big is B - I - G ?
● Library of Congress - Textual Data: 20 Terabytes (20 000 000 000 000 bytes)
● Amazon.com - Inventory & Customer Data: 42 Terabytes (42 000 000 000 000 bytes)
● YouTube.com - Media Data: 100+ Terabytes (100 000 000 000 000 bytes)
● Google.com - Search, Mail, Media & anything you can think of !!: 850+ Terabytes (850 000 000 000 000 bytes) (Speculated Figures)
● World Data Center for Climate - Meteorology Data: 6.2 Petabytes (6 200 000 000 000 000 bytes)
Available Big Data Solutions & Issues
- Highly Scalable and Distributed Computing.
- Storage (HDFS) optimized for high throughput
- Security is disabled by default
- MapReduce is batch-based, hence no real-time
operations.
- Costly to maintain.
- Highly scalable; said to handle petabytes
- Elastic set of resources to return result sets
- Almost 10x faster than Hadoop.
- High costs of Data Migration and integration
- Operations/Maintenance cost may shoot up
Why Google BigQuery ?
[Figure: Hadoop (with Hive), Amazon Redshift and Google BigQuery compared; data = 1.4 TB]
On average, it's within 8-10 seconds !!
Inside Google BigQuery
● BigQuery is based on Dremel, a technology pioneered by Google & used extensively
within Google.
● It uses columnar storage & multi-level execution trees to achieve interactive
performance for queries against multi-terabyte datasets.
● BigQuery's performance advantage comes from its parallel processing architecture.
● The query is processed by thousands of servers in a multi-level execution tree
structure, with the final results aggregated at the root. BigQuery stores the data in a
columnar format so that only data from the columns being queried is read.
● All this & more is now offered as a publicly available service for any business
or developer to use. This release made it possible for those outside of Google to
utilize the power of Dremel for their Big Data processing requirements.
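The columnar point above can be illustrated with a toy sketch. This is not BigQuery's storage format; the sample rows, column names and both helper functions are hypothetical and only count how many stored values a query would have to touch under a simple row layout versus a simple column layout.

```python
# Toy illustration only: a tiny in-memory table, not BigQuery's real storage format.
rows = [
    {"word": "the", "corpus": "hamlet", "word_count": 995},
    {"word": "king", "corpus": "hamlet", "word_count": 172},
    {"word": "the", "corpus": "macbeth", "word_count": 620},
]


def to_columns(records):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_touched_row_layout(records):
    """In this simplified row layout, a scan reads every field of every row."""
    return sum(len(record) for record in records)


def values_touched_column_layout(columns, wanted):
    """In a column layout, a scan reads only the columns the query names."""
    return sum(len(columns[name]) for name in wanted)


columns = to_columns(rows)
wanted = ["word_count"]  # think: SELECT SUM(word_count) FROM some_table
total = sum(columns["word_count"])
row_reads = values_touched_row_layout(rows)
col_reads = values_touched_column_layout(columns, wanted)
print(total, row_reads, col_reads)
```

With three columns and one of them queried, the column layout touches a third of the values; the gap widens as tables get wider.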
Columnar Storage & Trees
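A minimal sketch of the execution-tree idea, assuming a simple SUM/COUNT query: leaf servers scan their own shard, intermediate servers merge partial results, and the root holds the final answer. The shard contents, the `fan_out` parameter and the function names are illustrative assumptions, not Dremel internals.

```python
def leaf_scan(shard):
    """Leaf server: scan one shard and return a partial (sum, count)."""
    return (sum(shard), len(shard))


def merge(partials):
    """Intermediate or root server: combine the partial results of its children."""
    return (sum(p[0] for p in partials), sum(p[1] for p in partials))


def run_tree(shards, fan_out=2):
    """Merge leaf results level by level until a single root result remains.

    Expects at least one shard.
    """
    level = [leaf_scan(shard) for shard in shards]
    while len(level) > 1:
        level = [merge(level[i:i + fan_out]) for i in range(0, len(level), fan_out)]
    return level[0]


# Five hypothetical shards, each of which would live on a different server.
shards = [[3, 5], [8], [1, 1, 2], [10, 4], [6]]
total, count = run_tree(shards)
print(total, count, total / count)
```

Each level only passes small partial results upward, which is why the work can be spread across many servers in parallel.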
Inside Google BigQuery
Hey, you mentioned Dremel, isn't MapReduce based on it ?
There's a difference
● Dremel is designed as an interactive
data analysis tool for large datasets.
● MapReduce is designed as a
programming framework to batch
process large datasets
Features & Components
Features:
● Web GUI for BigQuery
● Affordable
● Run in Background
● Easy Data Importation
● Flexible (Addition of Columns, Native Support For Timestamp Type
Of Data)
● REST API Support
● More than just Standard SQL
Components:
● Project
● Tables
● DataSets
● Jobs
RESTful API
For Datasets
Method   HTTP Request
delete   DELETE /projects/projectId/datasets/datasetId
get      GET /projects/projectId/datasets/datasetId
insert   POST /projects/projectId/datasets
list     GET /projects/projectId/datasets
patch    PATCH /projects/projectId/datasets/datasetId
update   PUT /projects/projectId/datasets/datasetId
RESTful API
For Jobs
Method            HTTP Request
get               GET /projects/projectId/jobs/jobId
getQueryResults   GET /projects/projectId/queries/jobId
insert            POST https://www.googleapis.com/upload/bigquery/v2/projects/projectId/jobs
                  and POST /projects/projectId/jobs
list              GET /projects/projectId/jobs
query             POST /projects/projectId/queries
Similar methods for -
● Projects
● Tables
● TableData
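As a rough sketch of how the dataset routes above turn into concrete requests, the helper below maps a method name to an HTTP verb and a full URL. It assumes the v2 base URL `https://www.googleapis.com/bigquery/v2`; the project and dataset names are hypothetical, and actually sending the request (with an OAuth 2.0 access token) is left out.

```python
from urllib.parse import quote

# Assumed base URL for the v2 REST API; paths below mirror the Datasets table.
BASE_URL = "https://www.googleapis.com/bigquery/v2"

DATASET_METHODS = {
    "delete": ("DELETE", "/projects/{projectId}/datasets/{datasetId}"),
    "get": ("GET", "/projects/{projectId}/datasets/{datasetId}"),
    "insert": ("POST", "/projects/{projectId}/datasets"),
    "list": ("GET", "/projects/{projectId}/datasets"),
    "patch": ("PATCH", "/projects/{projectId}/datasets/{datasetId}"),
    "update": ("PUT", "/projects/{projectId}/datasets/{datasetId}"),
}


def build_request(method_name, **path_params):
    """Return (HTTP verb, URL) for a dataset method, URL-encoding path values."""
    verb, template = DATASET_METHODS[method_name]
    safe_params = {key: quote(str(value), safe="") for key, value in path_params.items()}
    return verb, BASE_URL + template.format(**safe_params)


# "my-project" and "sales" are made-up identifiers for illustration.
print(build_request("get", projectId="my-project", datasetId="sales"))
print(build_request("list", projectId="my-project"))
```

The same table-driven approach would extend to Jobs, Projects, Tables and TableData by adding their routes to a similar dictionary.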
Demo using Web Interface
Demo : Excel Connector
BigQuery Tools
● BigQuery Browser Tool
● bq Command Line
● BigQuery Excel Connector
● Visualization & BI Tools
● ETL Tools
● ODBC Connector
Big Data Solution with BigQuery
Data Pipeline - transforming and loading data into BigQuery
Uploading data into BigQuery via the Google Cloud Platform involves uploading CSV or
JavaScript Object Notation (JSON) files to Google Cloud Storage and then loading the data
into BigQuery. Alternatively, the REST API can be used to integrate programmatically with
the current computing environment.
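The load step can be sketched as the request body for a load job (the `insert` method on Jobs). This is a minimal sketch under assumptions: the bucket, project, dataset and table names are hypothetical, the CSV files are assumed to have one header row, and a schema would also be needed if the destination table does not exist yet.

```python
import json


def make_load_job(project_id, dataset_id, table_id, gcs_uris, source_format="CSV"):
    """Build the JSON body of a load job that reads files from Google Cloud Storage."""
    if source_format not in ("CSV", "NEWLINE_DELIMITED_JSON"):
        raise ValueError("source_format must be CSV or NEWLINE_DELIMITED_JSON")
    load = {
        "sourceUris": list(gcs_uris),
        "sourceFormat": source_format,
        "destinationTable": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
    }
    if source_format == "CSV":
        # Assumption: each CSV file starts with a single header row.
        load["skipLeadingRows"] = 1
    return {"configuration": {"load": load}}


# Hypothetical names; the body would be POSTed to /projects/my-project/jobs.
job = make_load_job("my-project", "sales", "orders", ["gs://my-bucket/orders-2014.csv"])
print(json.dumps(job, indent=2))
```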
Data Visualization - performing data analysis on BigQuery and visualizing the results
A custom, web-based dashboard can be built on Google App Engine, using the BigQuery REST
API to execute the queries and Google Chart Tools to visualize the results.
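On the dashboard side, a query response has to be reshaped before charting. The sketch below assumes the response layout of the `query` method (a `schema.fields` list plus `rows` of `f`/`v` cells, with values returned as strings) and turns it into a header row followed by data rows, the array-of-arrays shape a charting library such as Google Chart Tools can consume. The sample response is made up.

```python
# Casts for the column types this sketch handles; other types stay as strings.
CASTS = {"INTEGER": int, "FLOAT": float}


def response_to_chart_rows(response):
    """Convert a query response into [header, row, row, ...] lists."""
    fields = response["schema"]["fields"]
    table = [[field["name"] for field in fields]]
    for row in response.get("rows", []):
        cells = []
        for field, cell in zip(fields, row["f"]):
            value = cell["v"]
            cast = CASTS.get(field["type"])
            # NULL cells arrive as None and are passed through unchanged.
            cells.append(cast(value) if cast and value is not None else value)
        table.append(cells)
    return table


# Hypothetical response for: SELECT corpus, SUM(word_count) AS words ... GROUP BY corpus
sample_response = {
    "jobComplete": True,
    "schema": {"fields": [
        {"name": "corpus", "type": "STRING"},
        {"name": "words", "type": "INTEGER"},
    ]},
    "rows": [
        {"f": [{"v": "hamlet"}, {"v": "32446"}]},
        {"f": [{"v": "macbeth"}, {"v": "18439"}]},
    ],
}
chart_rows = response_to_chart_rows(sample_response)
print(chart_rows)
```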
Pricing Model
Action           Example
Loading Data     Loading files/data into BigQuery
Exporting Data   Exporting data, saving results from BigQuery
Table Reads      Browsing through data
Table Copies     Copy existing table to new table

Storage Action      Cost
Storage             $0.020 per GB, per month
Streaming Inserts   Free until January 1, 2015; after January 1, 2015, $0.01 per 100,000 rows

Query Pricing       Cost
On-demand           $5 per TB
Reserved Capacity   5 GB per second: $20k/month
Wow that’s like 800MB for 1 Rupee,
even Internet ain’t that cheap here.
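A small estimator makes the rates above concrete. It uses only the storage, on-demand query and post-January-2015 streaming prices from the slide; the workload numbers are hypothetical. The rupee helper is one possible reading of the "800MB for 1 Rupee" remark: it assumes the remark refers to monthly storage, an exchange rate of roughly 60 INR per USD, and decimal units (1 GB = 1000 MB).

```python
# Rates as listed on the pricing slide.
STORAGE_USD_PER_GB_MONTH = 0.020
QUERY_USD_PER_TB = 5.0
STREAMING_USD_PER_100K_ROWS = 0.01  # applies after January 1, 2015


def monthly_cost_usd(stored_gb, queried_tb, streamed_rows=0):
    """Estimate one month's bill from storage, on-demand queries and streaming."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    queries = queried_tb * QUERY_USD_PER_TB
    streaming = streamed_rows / 100_000 * STREAMING_USD_PER_100K_ROWS
    return storage + queries + streaming


def storage_mb_per_rupee(inr_per_usd):
    """MB of storage per month that one rupee buys (assumes 1 GB = 1000 MB)."""
    gb_per_usd = 1 / STORAGE_USD_PER_GB_MONTH
    return gb_per_usd * 1000 / inr_per_usd


# Hypothetical workload: 100 GB stored, 2 TB scanned, 1 million rows streamed.
print(round(monthly_cost_usd(100, 2, 1_000_000), 2))
# Assumed exchange rate of 60 INR per USD.
print(round(storage_mb_per_rupee(60)))
```

Under those assumptions the result is a little over 800 MB per rupee, which is consistent with the figure quoted on the slide.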
Where to use ?
● Not a replacement for traditional systems, but it complements the ecosystem !!
● Major strength is Handling Large DataSets
● Major usage in Data Analytics
● Important component of Google Cloud Platform
● People are interested in numbers/data, and they want them quickly…
Google BigQuery is the future of Analytics!!
Any questions ?
What we covered ...
✓ What is Big Data ?
✓ Available Big Data Solutions & Issues
✓ Why Google BigQuery ?
✓ Features, Components & Tools
✓ RESTful API
✓ Demo using Web Interface
✓ BigQuery Tools
✓ Big Data Solution with BigQuery
✓ Pricing Model
✓ Usage
https://bigquery.cloud.google.com
No registration, just sign-in with your Google account
About the presenter
Follow Dharmesh Vaya on @DRVaya
or subscribe to my blog at http://drvaya.wordpress.com/
You can also add me on +DharmeshVaya
https://cloud.google.com/developers/articles/getting-started-with-google-bigquery
https://cloud.google.com/files/Redbus.pdf
http://www.reddit.com/r/bigquery/comments/28ialf/173_million_2013_nyc_taxi_rides_shared_on_bigquery/
http://www.datawrangling.com/some-datasets-available-on-the-web/
http://bigqueri.es/
https://developers.google.com/bigquery/pricing#data

Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Exploring BigData with Google BigQuery

  • 2. Agenda ● What is Big Data ? ● Available Big Data Solutions & Issues ● Why Google BigQuery ? ● Inside BigQuery ● Features & Components ● RESTful API ● Development with BigQuery (Live Demo) ○ Query History, Projects, DataSets, Public Datasets, Table Details, Writing Queries, Save Results. ○ Integration with Applications. ● BigQuery Tools ● Big Data Solution with BigQuery & Google Cloud Platform ● Pricing Model ● Any questions ?
  • 3. What is Big Data ? Is it a Data Type ? No. It's a buzzword for a massive volume of structured and/or unstructured data. It is so large that it is difficult to process/analyze using traditional databases.
  • 4. What is Big Data ? Data that has the following attributes can be ‘Big Data’
  • 5. So how Big is B - I - G ?
  • 6. So how Big is B - I - G ? Library of Congress - Textual Data 20 Terabytes (20 000 000 000 000 bytes)
  • 7. So how Big is B - I - G ? Amazon.com - Inventory & Customer Data 42 Terabytes (42 000 000 000 000 bytes)
  • 8. So how Big is B - I - G ? YouTube.com - Media Data 100+ Terabytes (100 000 000 000 000 bytes)
  • 9. So how Big is B - I - G ? Google.com - Search, Mail, Media & anything you can think of !! 850+ Terabytes (850 000 000 000 000 bytes) (Speculated Figures)
  • 10. So how Big is B - I - G ? World Data Center for Climate - Meteorology Data 6.2 Petabytes (approximately 7 000 000 000 000 000 bytes)
  • 11. Available Big Data Solutions & Issues - Highly scalable, distributed computing. - Storage (HDFS) optimized for high throughput. - Security is disabled by default. - MapReduce is batch based, hence no real-time operations. - Costly to maintain. - Highly scalable, claims to handle petabytes. - Elastic set of resources to return result sets. - Almost 10x faster than Hadoop. - High costs of data migration and integration. - Operations/maintenance costs may shoot up.
  • 12. Why Google BigQuery ? Hadoop (with Hive) Amazon Redshift Google BigQuery = 1.4 TB On average it's within 8-10 seconds !!
  • 13. Inside Google BigQuery ● BigQuery is based on Dremel, a technology pioneered by Google & used extensively within the company. ● It uses columnar storage & multi-level execution trees to achieve interactive performance for queries against multi-terabyte datasets. ● BigQuery's performance advantage comes from its parallel processing architecture. ● The query is processed by thousands of servers in a multi-level execution tree structure, with the final results aggregated at the root. BigQuery stores the data in a columnar format so that only data from the columns being queried is read. ● All this & more is now available as a public service for any business or developer to use. This release made it possible for those outside of Google to utilize the power of Dremel for their Big Data processing requirements.
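The two ideas on this slide, columnar storage and a multi-level execution tree, can be illustrated with a toy sketch. This is not how Dremel or BigQuery is implemented; the sample rows, the function names and the `fanout` parameter are all made up for illustration. It only shows why a column layout touches fewer values and how partial results can be merged level by level up to a root.

```python
# Toy illustration only: sample data and all names here are hypothetical.
rows = [
    {"title": "a", "views": 10, "language": "en"},
    {"title": "b", "views": 25, "language": "de"},
    {"title": "c", "views": 5, "language": "en"},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(records, column):
    """Row layout: whole rows are loaded, so every field counts as read."""
    values_read = sum(len(record) for record in records)
    total = sum(record[column] for record in records)
    return total, values_read


def sum_column_store(columns, column):
    """Column layout: only the queried column is read."""
    values = columns[column]
    return sum(values), len(values)


def tree_sum(shards, fanout=2):
    """Leaves sum their own shard; each level above merges `fanout` partial sums.

    Expects at least one shard.
    """
    level = [sum(shard) for shard in shards]
    while len(level) > 1:
        level = [sum(level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return level[0]


columns = to_columns(rows)
print(sum_row_store(rows, "views"))        # (40, 9): same total, 9 values read
print(sum_column_store(columns, "views"))  # (40, 3): same total, 3 values read
print(tree_sum([[10, 25], [5], [1, 2, 3]]))  # 46, merged at the root
```

Both layouts return the same total; the difference is only in how many values had to be read to get it.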
  • 15. Inside Google BigQuery "Hey, you mentioned Dremel, isn't MapReduce based on it ?" There's a difference: ● Dremel is designed as an interactive data analysis tool for large datasets. ● MapReduce is designed as a programming framework to batch-process large datasets.
  • 16. Features & Components Features: ● Web GUI for BigQuery ● Affordable ● Run in Background ● Easy Data Importation ● Flexible (Addition of Columns, Native Support For Timestamp Type Of Data) ● REST API Support ● More than just Standard SQL Components: ● Project ● Tables ● DataSets ● Jobs
  • 17. RESTful API (for Datasets)
        Method   HTTP Request
        delete   DELETE /projects/projectId/datasets/datasetId
        get      GET /projects/projectId/datasets/datasetId
        insert   POST /projects/projectId/datasets
        list     GET /projects/projectId/datasets
        patch    PATCH /projects/projectId/datasets/datasetId
        update   PUT /projects/projectId/datasets/datasetId
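As a rough sketch, the dataset methods listed on this slide can be expressed as a small route table that turns a method name into an HTTP verb and a full URL. The base URL `https://www.googleapis.com/bigquery/v2`, the helper name `dataset_request` and the example project and dataset IDs are assumptions for illustration; nothing is sent over the network.

```python
# Assumed base URL for the v2 REST API; the slide only shows relative paths.
BASE_URL = "https://www.googleapis.com/bigquery/v2"

# Method name -> (HTTP verb, path template), copied from the slide's table.
DATASET_ROUTES = {
    "delete": ("DELETE", "/projects/{projectId}/datasets/{datasetId}"),
    "get": ("GET", "/projects/{projectId}/datasets/{datasetId}"),
    "insert": ("POST", "/projects/{projectId}/datasets"),
    "list": ("GET", "/projects/{projectId}/datasets"),
    "patch": ("PATCH", "/projects/{projectId}/datasets/{datasetId}"),
    "update": ("PUT", "/projects/{projectId}/datasets/{datasetId}"),
}


def dataset_request(method, **params):
    """Return (verb, url) for a dataset method; raises KeyError if an ID is missing."""
    verb, template = DATASET_ROUTES[method]
    return verb, BASE_URL + template.format(**params)


# Hypothetical IDs, used only to show the resulting URLs.
print(dataset_request("list", projectId="my-project"))
print(dataset_request("get", projectId="my-project", datasetId="sales"))
```

Keeping the routes in a table keeps the verb and path for each method in one place, which makes the symmetry between `get`, `patch`, `update` and `delete` easy to see.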
  • 18. RESTful API (for Jobs)
        Method            HTTP Request
        get               GET /projects/projectId/jobs/jobId
        getQueryResults   GET /projects/projectId/queries/jobId
        insert            POST https://www.googleapis.com/upload/bigquery/v2/projects/projectId/jobs and POST /projects/projectId/jobs
        list              GET /projects/projectId/jobs
        query             POST /projects/projectId/queries
    Similar methods for - ● Projects ● Tables ● TableData
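To make the `query` method above more concrete, here is a minimal sketch that builds, but does not send, the corresponding POST request using only the Python standard library. The base URL, the project ID, the placeholder access token and the sample SQL against a public sample table are assumptions for illustration; a real call would also need a valid OAuth 2.0 token.

```python
import json
import urllib.request

# Assumed base URL for the v2 REST API; the slide only shows the relative path.
BASE_URL = "https://www.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, access_token):
    """Build the POST /projects/projectId/queries request without sending it."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/projects/{project_id}/queries",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


# Hypothetical values: project ID, token and query are placeholders.
request = build_query_request(
    "my-project",
    "SELECT word, word_count FROM [publicdata:samples.shakespeare] LIMIT 5",
    "ACCESS_TOKEN_PLACEHOLDER",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would perform the call; it is left out here so the sketch runs without credentials or network access.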
  • 19. Demo using Web Interface
  • 20. Demo : Excel Connector +
  • 21. BigQuery Tools ● BigQuery Excel Connector ● bq Command Line ● BigQuery Browser Tool ● Virtualization & BI Tools ● ETL Tools ● ODBC Connector
  • 22. Big Data Solution with BigQuery
  • 23. Big Data Solution with BigQuery Data Pipeline - transforming and loading data into BigQuery: Loading data through the Google Cloud Platform involves uploading CSV or JavaScript Object Notation (JSON) files to Google Cloud Storage and then loading them from there into BigQuery. Alternatively, the REST API can be used for programmatic integration with the existing computing environment. Data Visualization - performing data analysis on BigQuery and visualizing the results: A custom, web-based dashboard can be built on Google App Engine, using the BigQuery REST API to execute the queries and Google Chart Tools to visualize the results.
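The second step of the pipeline, loading a file that already sits in Google Cloud Storage, comes down to submitting a load job. Below is a minimal sketch of the job body, assuming the v2 `jobs.insert` request shape (`configuration.load` with `sourceUris`, `sourceFormat` and `destinationTable`). The helper name and the bucket, project, dataset and table names are hypothetical, and the body is only printed, not submitted.

```python
import json

# Source formats matching the CSV and JSON file types named on the slide.
SUPPORTED_FORMATS = ("CSV", "NEWLINE_DELIMITED_JSON")


def build_load_job(project_id, dataset_id, table_id, gcs_uri, source_format="CSV"):
    """Return a load-job body for a file already uploaded to Cloud Storage."""
    if source_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported source format: {source_format}")
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI starting with gs://")
    return {
        "configuration": {
            "load": {
                "sourceUris": [gcs_uri],
                "sourceFormat": source_format,
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
            }
        }
    }


# Hypothetical names, used only to show the resulting JSON body.
job = build_load_job("my-project", "sales", "orders", "gs://my-bucket/orders.csv")
print(json.dumps(job, indent=2))
```

A body like this would be sent with the `insert` method for Jobs (POST /projects/projectId/jobs).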
  • 24. Pricing Model
        Action           Example
        Loading Data     Loading files/data into BigQuery
        Exporting Data   Exporting data, saving results from BigQuery
        Table Reads      Browsing through data
        Table Copies     Copy existing table to new table

        Storage
        Action              Cost
        Storage             $0.020 per GB, per month
        Streaming Inserts   Free until January 1, 2015. After January 1, 2015, $0.01 per 100,000 rows

        Query Pricing                        Cost
        On-demand                            $5 per TB
        Reserved Capacity, 5 GB per second   $20k/month

    Wow, that's like 800 MB for 1 Rupee, even Internet ain't that cheap here.
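The prices on this slide lend themselves to a quick back-of-the-envelope calculator. The sketch below uses only the listed rates; the decimal conversion (1 TB = 1000 GB), the exchange rate of 60 rupees per US dollar and all function names are assumptions. Under that assumed rate, one rupee buys roughly 830 MB of storage for a month, which is in line with the slide's "800 MB for 1 Rupee" remark.

```python
# Rates taken from the slide.
STORAGE_USD_PER_GB_MONTH = 0.020
QUERY_USD_PER_TB = 5.0
STREAMING_USD_PER_100K_ROWS = 0.01

# Assumptions, not from the slide: decimal units and an illustrative exchange rate.
MB_PER_GB = 1000
INR_PER_USD = 60.0


def storage_cost_usd(gigabytes, months=1):
    """Storage cost for the given volume held for `months` months."""
    return gigabytes * STORAGE_USD_PER_GB_MONTH * months


def query_cost_usd(terabytes_scanned):
    """On-demand query cost for the given volume of data scanned."""
    return terabytes_scanned * QUERY_USD_PER_TB


def streaming_cost_usd(rows):
    """Streaming insert cost at the post-January-1-2015 rate."""
    return rows / 100_000 * STREAMING_USD_PER_100K_ROWS


def storage_mb_per_rupee(inr_per_usd=INR_PER_USD):
    """Megabytes of storage that one rupee pays for, per month."""
    usd_per_rupee = 1 / inr_per_usd
    return usd_per_rupee / STORAGE_USD_PER_GB_MONTH * MB_PER_GB


print(f"Store 500 GB for a month: ${storage_cost_usd(500):.2f}")
print(f"Scan 1.4 TB on demand:    ${query_cost_usd(1.4):.2f}")
print(f"Stream 1,000,000 rows:    ${streaming_cost_usd(1_000_000):.2f}")
print(f"Storage per rupee:        {storage_mb_per_rupee():.0f} MB per month")
```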
  • 25. Where to use ? ● Not a replacement for traditional systems, but it complements the ecosystem !! ● Major strength is handling large datasets ● Major usage in data analytics ● Important component of Google Cloud Platform ● People are interested in numbers/data, and they want them quickly. Google BigQuery is the future of Analytics!!
  • 26. Any questions ? What we covered ... ✓ What is Big Data ? ✓ Available Big Data Solutions & Issues ✓ Why Google BigQuery ? ✓ Features, Components & Tools ✓ RESTful API ✓ Demo using Web Interface ✓ BigQuery Tools ✓ Big Data Solution with BigQuery ✓ Pricing Model ✓ Usage
  • 27. https://bigquery.cloud.google.com No registration, just sign in with your Google account. About the presenter: Follow Dharmesh Vaya on @DRVaya or subscribe to http://drvaya.wordpress.com/ You can also add me on +DharmeshVaya