Build an AI/ML-driven image
archive processing workflow:
Image archive, analysis & report generation
Silicon Valley :: Fall 2023
Wesley Chun
Principal, CyberWeb
Principal, CyberWeb Consulting
● Mission: produce accelerated Python
developers, enable developers to be
successful using Google Cloud and
other Google developer tools & APIs
● Focus: Python, Google Cloud (GCP)
and Google Workspace (GWS) APIs;
Google cross-product solutions
● Services: technical consulting,
training, engineering, speaking, code
samples, hands-on tutorials, public
technical content (blogs, social, etc.)
About the speaker
Previous experience / background
● Software Engineer & Developer Advocate
○ Google, Sun, HP, Cisco, EMC, Xilinx &
○ Original Yahoo!Mail engineer/SWE
● Technical trainer, teacher, instructor
○ Teaching Math, Linux, Python since '83
○ Adjunct CS Faculty at local SV college
● Python community member
○ Popular Core Python series author
○ Python Software Foundation Fellow
● AB (Math/CS) & CMP (Music/Piano), UC
Berkeley and MSCS, UC Santa Barbara
● Adjunct Computer Science Faculty, Foothill
College (Silicon Valley)
Why and Agenda
● Organizations have real-life business problems seeking solutions
● Google Cloud (the organization) produces 2 main product groups
○ Google Cloud (GCP) and Google Workspace (GWS; ex-G Suite)
● May know GCP for compute, storage, data & AI/ML cloud services
● While GWS known for its apps (Gmail, Drive,...) also for developers(!)
● Use both to build novel solutions to unique business problems
1. Using Google APIs
2. GCP & GWS APIs
3. AI/ML-driven img processing workflow
4. Wrap-up
Using Google APIs
Getting started & the nuts-n-bolts
General steps
1. Go to Cloud Console
2. Login to Google/Gmail account
(Workspace domain may require admin approval)
3. Create project (per application)
4. Enable APIs to use
5. Enable billing (CC, Free Trial, etc.)
6. Download client library(ies)
7. Create & download credentials
8. Write code*
9. Run code (may need to authorize)
Google APIs: how to use
Costs and pricing
● GCP: pay-per-use
● Google Workspace: subscription
● GCP Free Trial ($300/1Q, CC req'd)
● GCP "Always Free" tier
○ Some products have free tier
○ Daily or monthly quota
○ Must exceed to incur billing
● More on both programs at
*In your code
1. Import API client library
2. Create API client object
3. Use client to make API Calls
Cloud/GCP console
● Hub of all developer activity
● Applications == projects
○ New project for new apps
○ Projects have a billing acct
● Manage billing accounts
○ Financial instrument required
○ Personal or corporate credit cards,
Free Trial, and education grants
● Access GCP product settings
● Manage users & security
● Manage APIs in devconsole
● View application statistics
● En-/disable Google APIs
● Obtain application credentials
Using Google APIs
API manager aka Developers Console (devconsole)
Three different credentials types
● Simple: API keys (to access public data)
○ Simplest form of authorization: an API key; tied to a project
○ Allows access to public data
○ Do not put in code, lose, or upload to GitHub! (can be restricted however)
○ Supported by: Google Maps, (some) YouTube, (some) GCP, etc.
● Authorized: OAuth client IDs (to access data owned by [human] user)
○ Provides additional layer of security via OAuth2 (RFC 6749)
○ Owner must grant permission for your app to access their data
○ Access granularity determined by requested permissions (user scopes)
○ Supported by: Google Workspace, (some) YouTube, (some) GCP, etc.
● Authorized: service accounts (to access data owned by an app/robot user)
○ Provides additional layer of security via OAuth2 or JWT (RFC 7519)
○ Project owning data grants permission implicitly; requires public-private key-pair
○ Access granularity determined by Cloud IAM permissions granted to service account key-pair
○ Supported by: GCP, (some) Google Workspace, etc.
Blog series:
Two different client library "styles"
● "Platform-level" client libraries (lower-level)
○ Supports multiple products as a "lowest-common denominator"
○ Manage API service endpoints (setup & use)
○ Manage authorization (API keys, OAuth client IDs, service accounts)
○ Google Workspace, Google Analytics, YouTube, Google Ads APIs, GCP, etc.
○ Install:
● "Product-level" client libraries (higher-level)
○ Custom client libraries made specifically for each product
○ Managing API service endpoints & security mostly taken care of
○ Only need to create a "client" to use API services
○ Install (Cloud/GCP & Firebase):
○ Install (Maps):
● Some Google API families support both, e.g., Cloud
Google APIs client
libraries for common
languages; demos in
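As a rough illustration of the two styles, the sketch below lists the objects in a Cloud Storage bucket both ways. It assumes the google-api-python-client and google-cloud-storage packages are installed and that Application Default Credentials are configured; the function names and the bucket argument are illustrative and not from the talk.

```python
def list_objects_platform(bucket):
    """Lower-level, platform-style client: build a generic service endpoint."""
    from googleapiclient import discovery  # pip install google-api-python-client
    gcs = discovery.build('storage', 'v1')  # picks up Application Default Credentials
    rsp = gcs.objects().list(bucket=bucket, fields='items(name,size)').execute()
    # the JSON API returns sizes as strings, so convert them
    return [(obj['name'], int(obj['size'])) for obj in rsp.get('items', [])]


def list_objects_product(bucket):
    """Higher-level, product-style client: endpoint and auth are handled for you."""
    from google.cloud import storage  # pip install google-cloud-storage
    client = storage.Client()  # picks up Application Default Credentials
    return [(blob.name, blob.size) for blob in client.list_blobs(bucket)]
```

Both functions should return the same (name, size) pairs; the product-level version simply hides the endpoint setup and response parsing.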
(User-)authorized API access(lower-level, older, generic)
OAuth boilerplate
from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools
SCOPES = ...  # at least one (string or array of strings)
# 'storage.json' - where to store OAuth2 tokens from API
# 'client_secret.json' - OAuth2 client ID & secret (download from DevConsole)
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
# create API service endpoint; for example: API='sheets', VERSION='v4'
SERVICE = discovery.build(API, VERSION, http=creds.authorize(Http()))
(User-)authorized API access(lower-level, newer, generic)
import os
from googleapiclient import discovery
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2 import credentials
SCOPES = ...  # at least one (string or array of strings)
# 'storage.json' - where to store OAuth2 tokens from API
# 'client_secret.json' - OAuth2 client ID & secret (download from DevConsole)
TOKENS = 'storage.json'  # OAuth2 token storage
creds = None
if os.path.exists(TOKENS):
    creds = credentials.Credentials.from_authorized_user_file(TOKENS)
if not (creds and creds.valid):
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file('client_secret.json', SCOPES)
        creds = flow.run_local_server()
    with open(TOKENS, 'w') as token:
        token.write(creds.to_json())
# create API service endpoint; for example: API='sheets', VERSION='v4'
# (the newer credentials object is passed directly; it has no authorize() method)
SERVICE = discovery.build(API, VERSION, credentials=creds)
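The credential check in the newer boilerplate boils down to a three-way decision: use the cached tokens, refresh them, or send the user through the authorization flow. The sketch below mirrors only that decision logic; `FakeCreds` and `next_auth_step` are hypothetical stand-ins so it runs without any Google libraries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeCreds:
    """Hypothetical stand-in exposing the attributes the boilerplate checks."""
    valid: bool
    expired: bool
    refresh_token: Optional[str]


def next_auth_step(creds: Optional[FakeCreds]) -> str:
    """Return which branch of the boilerplate would run for these credentials."""
    if creds and creds.valid:
        return 'use'        # cached tokens are still good
    if creds and creds.expired and creds.refresh_token:
        return 'refresh'    # creds.refresh(Request()) in the real code
    return 'authorize'      # run InstalledAppFlow in the browser


print(next_auth_step(None))                          # no token file yet
print(next_auth_step(FakeCreds(True, False, 'r')))   # valid tokens
print(next_auth_step(FakeCreds(False, True, 'r')))   # expired but refreshable
print(next_auth_step(FakeCreds(False, True, None)))  # expired, no refresh token
```

In the real code, both the refresh and authorize branches end by writing the new tokens back to storage.json.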
[Diagram: application calling HTTP-based REST APIs, authorized with OAuth2 or an API key]
Google APIs request-response workflow
● Application makes request
● Request received by service
● Process data, return response
● Results sent to application
(typical client-server model)
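Under the client libraries, each call is an ordinary HTTP request. The sketch below builds, but does not send, a label-detection request to the Cloud Vision REST endpoint using only the standard library and an API key. The key and image URI are placeholders, and `build_annotate_request` is an illustrative name.

```python
import json
import urllib.request

VISION_URL = 'https://vision.googleapis.com/v1/images:annotate'


def build_annotate_request(api_key, image_uri):
    """Build (but do not send) a label-detection request to the Vision REST API."""
    body = {'requests': [{
        'image': {'source': {'imageUri': image_uri}},
        'features': [{'type': 'LABEL_DETECTION'}],
    }]}
    return urllib.request.Request(
        '%s?key=%s' % (VISION_URL, api_key),
        data=json.dumps(body).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST')


req = build_annotate_request('YOUR_API_KEY', 'gs://YOUR_BUCKET/YOUR_IMAGE.jpg')
print(req.get_method(), req.full_url)
# sending it would look like: json.load(urllib.request.urlopen(req))
```

The client libraries wrap this same request and response cycle, adding authorization and response parsing.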
Examples of using Google APIs
● GWS developer home:
● GWS developer intro:
● GWS Apps Script:
● Comprehensive overview:
Google Workspace
(formerly G Suite and Google Apps)
Google Cloud Platform vs. Google Workspace
● SaaS (Software as a Service)
○ Google Workspace (was G Suite/Google Apps)
○ Yahoo!Mail, Hotmail, Salesforce, Netsuite, Office 365
● PaaS (Platform as a Service)
○ Google App Engine, Cloud Functions
○ Heroku, Cloud Foundry, Engine Yard, AWS Lambda
● IaaS (Infrastructure as a Service)
○ Google Compute Engine, Cloud Storage
○ AWS EC2 & S3; Rackspace; Joyent
● Google Apps Script
● Google BigQuery, Cloud SQL, Vertex AI, Cloud Firestore, NL, Vision, Pub/Sub
○ AWS Kinesis, RDS; Windows Azure SQL, Docker
List (first 100) files/folders in Drive (older, OAuth2)
from __future__ import print_function
from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools
SCOPES = ...  # a Drive scope that permits reading file metadata
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
files = DRIVE.files().list().execute().get('files', [])
for f in files:
    print(f['name'], f['mimeType'])
Listing your files
Migrate SQL data to a Sheet
# read SQL data then create new spreadsheet & add rows into it
# (SHEETS is a Sheets API v4 service endpoint created as in the boilerplate)
import sqlite3
FIELDS = ('ID', 'Customer Name', 'Product Code',
          'Units Ordered', 'Unit Price', 'Status')
cxn = sqlite3.connect('db.sqlite')
cur = cxn.cursor()
rows = cur.execute('SELECT * FROM orders').fetchall()
cxn.close()
rows.insert(0, FIELDS)
DATA = {'properties': {'title': 'Customer orders'}}
SHEET_ID = SHEETS.spreadsheets().create(body=DATA,
        fields='spreadsheetId').execute().get('spreadsheetId')
SHEETS.spreadsheets().values().update(spreadsheetId=SHEET_ID, range='A1',
        body={'values': rows}, valueInputOption='RAW').execute()
Migrate SQL data
to Sheets
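The data-preparation half of the migration can be tried without any Google credentials. The sketch below uses an in-memory SQLite database with two made-up orders in place of db.sqlite, and builds the same `values` payload that would be sent to the Sheets API. The table columns and sample rows are assumptions for illustration only.

```python
import json
import sqlite3

FIELDS = ('ID', 'Customer Name', 'Product Code',
          'Units Ordered', 'Unit Price', 'Status')

# in-memory stand-in for db.sqlite holding two made-up orders
cxn = sqlite3.connect(':memory:')
cxn.execute('CREATE TABLE orders (id INTEGER, name TEXT, code TEXT, '
            'units INTEGER, price REAL, status TEXT)')
cxn.executemany('INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)', [
    (1, 'Example Co.', 'A-100', 3, 9.99, 'SHIPPED'),
    (2, 'Sample LLC', 'B-200', 1, 24.50, 'PENDING'),
])
rows = cxn.execute('SELECT * FROM orders ORDER BY id').fetchall()
cxn.close()

rows.insert(0, FIELDS)                  # header row goes first
body = {'values': rows}                 # payload for values().update()
payload = json.loads(json.dumps(body))  # tuples serialize as JSON arrays
print(len(payload['values']), payload['values'][0])
```

Because each row is a tuple and the payload is a list of rows, the body already has the two-dimensional shape the Sheets API expects.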
Storage: listing buckets
from __future__ import print_function
from googleapiclient import discovery
GCS = discovery.build('storage', 'v1')
BUCKET = 'YOUR_BUCKET'
# send bucket name & return fields to API, display results
print('\n** Objects in bucket %r...' % BUCKET)
FIELDS = 'items(name,size)'
files = GCS.objects().list(bucket=BUCKET, fields=FIELDS
        ).execute().get('items') or [{'name': '(none)', 'size': 'NaN'}]
for f in files:
    print('    %s (%s)' % (f['name'], f['size']))
IMG = 'gs://cloud-samples-data/vision/using_curl/shanghai.jpeg'
body = {'requests': [{
    'image': {'source': {'imageUri': IMG}},
    'features': [{'type': 'LABEL_DETECTION'}],
}]}
VISION = discovery.build('vision', 'v1', developerKey=API_KEY)
labeling = VISION.images().annotate(body=body).execute().get('responses')
for labels in labeling:
    if 'labelAnnotations' in labels:
        print('** Labels detected (and confidence score):')
        for label in labels['labelAnnotations']:
            print(label['description'], '(%.2f%%)' % (label['score']*100.))
Vision: label annotation/object detection
$ python3
Labels (and confidence score):
People (95.05%)
Street (89.12%)
Mode of transport (89.09%)
Transport (85.13%)
Vehicle (84.69%)
Snapshot (84.11%)
Urban area (80.29%)
Infrastructure (73.14%)
Road (72.74%)
Pedestrian (68.90%)
Vision: label annotation/object detection
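The formatting step can be isolated from the API call. The sketch below takes a response dictionary in the shape returned by images().annotate() and produces the same "description (score%)" lines; the sample response reuses the first two labels from the output above, and `format_labels` is an illustrative helper name.

```python
def format_labels(response):
    """Return 'description (score%)' lines for one annotate response."""
    return ['%s (%.2f%%)' % (label['description'], label['score'] * 100.)
            for label in response.get('labelAnnotations', [])]


# sample response shaped like one entry of the API's 'responses' list
sample = {'labelAnnotations': [
    {'description': 'People', 'score': 0.9505},
    {'description': 'Street', 'score': 0.8912},
]}
for line in format_labels(sample):
    print(line)
print(format_labels({}))  # an image with no labels yields an empty list
```

Using `.get('labelAnnotations', [])` keeps the helper from failing when Vision finds nothing to label.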
AI/ML-driven image
processing workflow
Archive and analyze GWS images with GCP
Image: Gerd Altmann from Pixabay
Cloud image processing workflow
[Diagram: Google Workspace (Drive, Sheets) and GCP (Cloud Storage, Cloud Vision): archive image, categorize image, record results]
from __future__ import print_function
import argparse, base64, io, webbrowser
from googleapiclient import discovery, http
from httplib2 import Http
from oauth2client import file, client, tools
k_ize = lambda b: '%6.2fK' % (b/1000.)  # bytes to kBs
FILE = 'YOUR_IMG_ON_DRIVE'
BUCKET = 'YOUR_BUCKET_NAME'
SHEET = 'YOUR_SHEET_ID'
TOP = 5  # get top Vision API labels
DEBUG = False
# process credentials for OAuth2 tokens
SCOPES = ...  # one scope each for Drive, Cloud Storage, Vision, and Sheets
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets(
            'client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
# create API service endpoints
HTTP = creds.authorize(Http())
DRIVE = discovery.build('drive', 'v3', http=HTTP)
GCS = discovery.build('storage', 'v1', http=HTTP)
VISION = discovery.build('vision', 'v1', http=HTTP)
SHEETS = discovery.build('sheets', 'v4', http=HTTP)
Cloud image processing workflow
def drive_get_file(fname):
    # request modifiedTime explicitly; it is not among the default fields
    rsp = DRIVE.files().list(q="name='%s'" % fname,
            fields='files(id,name,mimeType,modifiedTime)'
            ).execute().get('files')[0]
    fileId, fname, mtype = rsp['id'], rsp['name'], rsp['mimeType']
    blob = DRIVE.files().get_media(fileId=fileId).execute()
    return fname, mtype, rsp['modifiedTime'], blob
def gcs_blob_upload(fname, bucket, blob, mimetype):
    body = {'name': fname, 'uploadType': 'multipart',
            'contentType': mimetype}
    # API methods take keyword arguments only; wrap the bytes for upload
    media = http.MediaIoBaseUpload(io.BytesIO(blob), mimetype)
    return GCS.objects().insert(bucket=bucket, body=body,
            media_body=media).execute()
def vision_label_img(img, top):
    # img is the base64-encoded image as text, not raw bytes
    body = {'requests': [{'image': {'content': img}, 'features':
            [{'type': 'LABEL_DETECTION', 'maxResults': top}]}]}
    rsp = VISION.images().annotate(
            body=body).execute().get('responses', [{}])[0]
    return ', '.join('%s (%.2f%%)' % (label['description'],
            label['score']*100.) for label in rsp['labelAnnotations'])
def sheet_append_row(sheet, row):
    # 'values' is a list of rows, and valueInputOption is required
    rsp = SHEETS.spreadsheets().values().append(
            spreadsheetId=sheet, range='Sheet1',
            valueInputOption='USER_ENTERED',
            body={'values': [row]}).execute()
    return rsp.get('updates').get('updatedCells')
def main(fname, bucket, sheet_id, top):
    fname, mtype, ftime, data = drive_get_file(fname)
    gcs_blob_upload(fname, bucket, data, mtype)
    rsp = vision_label_img(base64.b64encode(data).decode('utf-8'), top)
    sheet_append_row(sheet_id, [fname, mtype,
            ftime, len(data), rsp])
API method calls in Bold
Driver calls in Bold Italics
Cloud image processing workflow
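def drive_get_file(fname):
    # request modifiedTime explicitly; it is not among the default fields
    rsp = DRIVE.files().list(
            q="name='%s'" % fname,
            fields='files(id,name,mimeType,modifiedTime)'
            ).execute().get('files')[0]
    fileId, fname, mtype = (
            rsp['id'], rsp['name'], rsp['mimeType'])
    blob = DRIVE.files().get_media(fileId=fileId).execute()
    return fname, mtype, rsp['modifiedTime'], blob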
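One caveat with building the Drive query by plain string formatting: a file name containing a single quote breaks the query. A minimal sketch of an escaping helper follows; `drive_name_query` and the sample file names are hypothetical, and the escaping rule (backslash before single quotes and backslashes) follows the Drive search-query documentation.

```python
def drive_name_query(fname):
    """Build a Drive `q` string matching an exact file name."""
    # escape backslashes first, then single quotes, as Drive queries require
    escaped = fname.replace('\\', '\\\\').replace("'", "\\'")
    return "name = '%s'" % escaped


print(drive_name_query('photo.jpg'))
print(drive_name_query("Valentine's Day.png"))
```

The result can be passed as the `q` argument of DRIVE.files().list() in place of the inline format string.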
Cloud image processing workflow
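def gcs_blob_upload(fname, bucket, blob, mimetype):
    body = {
        'name': fname,
        'uploadType': 'multipart',
        'contentType': mimetype
    }
    # API methods take keyword arguments only; wrap the bytes for upload
    media = http.MediaIoBaseUpload(io.BytesIO(blob), mimetype)
    return GCS.objects().insert(
            bucket=bucket, body=body, media_body=media).execute()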
Cloud image processing workflow
def vision_label_img(img, top):
    # img is the base64-encoded image as text, not raw bytes
    body = [{'image': {'content': img},
             'features': [{
                 'type': 'LABEL_DETECTION',
                 'maxResults': top,
             }]
    }]
    rsp = VISION.images().annotate(
            body={'requests': body}).execute()['responses'][0]
    return ', '.join('%s (%.2f%%)' % (
            label['description'], label['score']*100.)
            for label in rsp['labelAnnotations'])
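The `content` field must carry the image as base64-encoded text, because the request body is serialized to JSON. The sketch below shows that encoding step on its own; `build_label_request` is an illustrative name and the bytes are a made-up placeholder rather than a real image.

```python
import base64
import json


def build_label_request(img_bytes, top):
    """Return an images().annotate() body for raw image bytes."""
    content = base64.b64encode(img_bytes).decode('utf-8')  # bytes -> text
    return {'requests': [{
        'image': {'content': content},
        'features': [{'type': 'LABEL_DETECTION', 'maxResults': top}],
    }]}


body = build_label_request(b'\x89PNG placeholder bytes', 5)
serialized = json.dumps(body)  # works; raw bytes here would raise TypeError
print(len(serialized))
```

Doing the encoding in one place means the same downloaded bytes can go to Cloud Storage untouched and to Vision as text.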
Cloud image processing workflow
def sheet_append_row(sheet, row):
    # 'values' is a list of rows, and valueInputOption is required
    rsp = SHEETS.spreadsheets().values().append(
            spreadsheetId=sheet, range='Sheet1',
            valueInputOption='USER_ENTERED',
            body={'values': [row]}).execute()
    return rsp.get('updates').get('updatedCells')
Cloud image processing workflow
def main(fname, bucket, sheet_id, top):
    fname, mtype, ftime, data = drive_get_file(fname)
    gcs_blob_upload(fname, bucket, data, mtype)
    rsp = vision_label_img(base64.b64encode(data).decode('utf-8'), top)
    sheet_append_row(sheet_id,
            [fname, mtype, ftime, len(data), rsp])
● Project goal: Imagine an actual enterprise use case and solve it!
● Specific goals: free-up highly-utilized resource, archive data to
colder/cheaper storage, analyze images, generate report for mgmt
● Download image binary from Google Drive
● Upload object to Cloud Storage bucket
● Send payload for analysis by Cloud Vision
● Write back-up location & analysis results into Google Sheets
● Blog post: (original post); Cloud X-post
● Codelab: free, online, self-paced, hands-on tutorial
● Application source code
App summary
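The four steps in the app summary can be dry-run without cloud access by injecting the step functions. In the sketch below, `process_image` and the stub lambdas are hypothetical. The `gs://` back-up location column is an assumption added to match the "back-up location" bullet, since the slide code's row does not include it.

```python
import base64


def process_image(fname, bucket, download, upload, label, record, top=5):
    """Run the four workflow steps using injected step functions."""
    fname, mtype, ftime, data = download(fname)        # Drive: fetch binary
    upload(fname, bucket, data, mtype)                 # Cloud Storage: archive
    encoded = base64.b64encode(data).decode('utf-8')   # Vision wants base64 text
    labels = label(encoded, top)                       # Vision: analyze
    # back-up location column is an assumption, not in the slide code
    row = [fname, mtype, ftime, len(data),
           'gs://%s/%s' % (bucket, fname), labels]
    record(row)                                        # Sheets: report
    return row


# hypothetical stubs standing in for the real API wrappers
archive, report = {}, []
row = process_image(
    'photo.jpg', 'my-bucket',
    download=lambda f: (f, 'image/jpeg', '2023-10-01T00:00:00Z', b'abc'),
    upload=lambda f, b, d, m: archive.update({(b, f): d}),
    label=lambda img, top: 'Street (89.12%)',
    record=report.append)
print(row)
```

Swapping the stubs for the real Drive, Cloud Storage, Vision, and Sheets wrappers gives the full workflow, and the stubs double as a cheap test harness.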
Summary & resources
Session Summary
● Google provides more than just apps
○ More than search, YouTube, Android, Chrome, and Gmail/Docs
○ "Much" Google technology available to developers via APIs
● Google Cloud vs. non-Cloud Google APIs
○ Alas, developer experience differs between product families
○ Some products have higher-level product client libraries
○ Others require use of lower-level client libraries
■ Lower-level may be useful as lowest common denominator
● Inconsistency due to various reasons: budget, timeframe, priorities, etc.
● Interesting possibilities using multiple Google product APIs
Other Google APIs & platforms
● Firebase (mobile development platform + RT DB; ML Kit)
○ &
● Google Looker/Data Studio (data visualization, dashboards, etc.)
● Actions on Google/Assistant/DialogFlow (voice apps)
● YouTube (Data, Analytics, and Livestreaming APIs)
● Google Maps (Maps, Routes, and Places APIs)
● Flutter (native apps [Android, iOS, web] w/1 code base[!])
● Documentation
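○ GCP:{docs,vision,automl,storage,language,speech,translate,firestore,sql,video-intelligence,bigquery,filestore,identity-platform,vertex-ai,kubernetes,compute,gpu,tpu}
○ GWS & other non-GCP:{gsuite,gmail,drive,calendar,docs,sheets,slides,forms,classroom,chat,apps-script,maps,youtube,analytics,cast,actions,people,ar,books}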
● Introductory "codelabs" ([free] self-paced, hands-on tutorials)
○ GWS APIs: (featuring Drive API)
○ Cloud Vision API: (or C#)
○ All other codelabs: (all Google APIs, all levels)
● Videos
○ GWS:, Drive:, Sheets:, GCP:
● Code: &
● GCP Free Trial (new users) and Always Free tier:
● Compare GCP to AWS and Azure:
Online resources
Bring me to your organization
... it is my job to help you!
● "Transfer of Info" tech talks
● Half- or full-day seminars
● Hands-on "codelab" workshops
● Multi-day training courses
● Engineering consulting
● Migration strategy & planning
Slides: you are
looking at them now
App blog post:
GCP+GWS 101 (2019):
...& 102 talks (2023):
Progress bars:
Thank you! Questions?
Wesley Chun
Principal Consultant, CyberWeb
Python, GCP & GWS specialist
@wescpy (Tw/X, SO, GH, IG, LI)
