12. Unstructured data
• Page scraping
• select content from html where
url="http://in.news.yahoo.com/murray-takes-two-set-lead-final-against-djokovic-153758871.html"
and xpath='//*[@id="mediaarticlebody"]/div/p[29]'
• XPath – path to a node in an XML document
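To make the XPath idea concrete, here is a minimal Python sketch that applies a path of the same shape to a small document. The page content is a hypothetical stand-in, not the real article. The standard library's ElementTree supports only a subset of XPath and needs well-formed XML, so treat this as an illustration of the path syntax rather than of how YQL itself parses HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for an article page (not the real Yahoo page).
PAGE = """
<html>
  <body>
    <div id="mediaarticlebody">
      <div>
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
        <p>Third paragraph.</p>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Same shape as the slide's path: any element with this id, then its div child,
# then the Nth p child. Positions are 1-based, as in XPath.
node = root.find(".//*[@id='mediaarticlebody']/div/p[2]")
print(node.text)
```

Changing `p[2]` to another index picks a different paragraph, which is what `p[29]` does in the slide's query.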
13. Unstructured data
• Reading Google spreadsheets
• https://docs.google.com/spreadsheet/pub?key=0AgGxPO1AxEhldFZDNzAzQldLSGp2MzVGVXdlUnIxeUE&output=csv
• select * from csv where url = ""
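As a rough picture of what selecting rows from a CSV source amounts to, the Python sketch below turns CSV text into one record per row. The CSV text and its column names are hypothetical, and the sketch parses a local string instead of fetching the spreadsheet URL. It is an analogy, not a description of how the YQL `csv` table is implemented.

```python
import csv
import io

# Hypothetical CSV text, standing in for what a published spreadsheet
# might return when requested with output=csv.
CSV_TEXT = "name,city\nAsha,Pune\nRavi,Chennai\n"

def rows_from_csv(text):
    """Return each data row as a dict keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

rows = rows_from_csv(CSV_TEXT)
for row in rows:
    print(row["name"], row["city"])
```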
14. Why
• Unstructured data
– Yes, YQL is cool. But …
• Why use YQL
– When web services are already available?
• Let's see why, via an example
15. Example – Profile, Flickr
• !YQL
– Get a user profile
• http://social.yahooapis.com/v1/user/{guid}/profile
– Search for photos in Flickr
• http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=…&text=djokovic&format=rest
16. Example – Profile, Flickr
• YQL
– Get a user profile
• select * from social.profile where guid = me
– Search for photos in Flickr
• select * from flickr.photos.search where
api_key="..." and text="san francisco"
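The contrast between the two slides is that each native API has its own URL scheme, while both YQL statements can be sent to one endpoint as a query string. A minimal Python sketch of that idea follows. It assumes the public endpoint that appears in this deck's PHP example. The helper name `yql_url` is hypothetical, and authentication (which some tables may require) is not covered.

```python
from urllib.parse import quote

# Endpoint taken from the PHP example in this deck.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"

def yql_url(query, fmt="json"):
    """Hypothetical helper: percent-encode a YQL statement into a request URL."""
    return f"{YQL_ENDPOINT}?q={quote(query, safe='')}&format={fmt}"

queries = [
    'select * from social.profile where guid = me',
    'select * from flickr.photos.search where api_key="..." and text="san francisco"',
]

urls = [yql_url(q) for q in queries]
for url in urls:
    print(url)
```

Only the `q` parameter differs between the two requests. The endpoint and the response format parameter stay the same.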
24. SQL-like… JOINs?
• A small demo
– http://doc1.ydn.gq1.yahoo.com/mybloglog_test/yqljoin.html
– JOINs don’t mean a single API call
• YQL still makes multiple calls
– Only one IN allowed per select
• Sub-select can also have one IN
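The point that a join still costs several calls can be simulated in a few lines of Python. Everything here is hypothetical: the two in-memory "tables", the function names, and the assumption that the outer table is fetched once per key returned by the sub-select. It is a sketch of the call pattern, not of YQL's actual execution engine.

```python
# Hypothetical in-memory stand-ins for two web-service-backed tables.
CONTACTS = {"alice": ["u1", "u2"]}  # owner -> contact ids
PROFILES = {"u1": "Bob", "u2": "Carol", "u3": "Dave"}

calls = []  # records every simulated backend call

def fetch_contact_ids(owner):
    calls.append(("contacts", owner))
    return CONTACTS.get(owner, [])

def fetch_profile(user_id):
    calls.append(("profile", user_id))
    return {"id": user_id, "name": PROFILES[user_id]}

def select_profiles_in_contacts(owner):
    """Mimics: select * from profiles where id in
    (select id from contacts where owner = ...)"""
    ids = fetch_contact_ids(owner)          # the sub-select runs first
    return [fetch_profile(i) for i in ids]  # assumed: one outer call per key

result = select_profiles_in_contacts("alice")
print([r["name"] for r in result])
print(len(calls), "backend calls for one statement")
```

One statement from the caller's side turns into one call for the sub-select plus one per returned key in this simulation.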
25. How? – Devil is in the details
http://www.flickr.com/photos/prodiffusion/8267223638/
26. How to use
• PHP
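// Build the query and URL-encode it for the public endpoint
$yql_query = "select * from answers.getbycategory where category_id=2115500137";
$yql_url = "http://query.yahooapis.com/v1/public/yql?q=" . rawurlencode($yql_query) . "&format=json";
// Return the response as a string instead of printing it
$session = curl_init($yql_url);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
$json = curl_exec($session);
curl_close($session);
// Decode the JSON body into an associative array
$data = json_decode($json, true);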
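The PHP snippet leaves the raw response body in `$json`. As a sketch of what a caller might do next, the Python below pulls the results out of a response. The sample body is hypothetical, and the envelope shape (a top-level `query` object with `count` and `results`) is an assumption, as is the helper name `extract_results`.

```python
import json

# Hypothetical response body; the envelope shape is an assumption.
SAMPLE = '{"query": {"count": 1, "results": {"Question": {"Subject": "example"}}}}'

def extract_results(body):
    """Return the results object, or None when the query matched nothing."""
    envelope = json.loads(body).get("query", {})
    if not envelope.get("count"):
        return None
    return envelope.get("results")

print(extract_results(SAMPLE))
```

Checking the count before reading the results avoids treating an empty response as data.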
39. WOEID
SELECT placeTypeName, name FROM geo.places.ancestors WHERE
descendant_woeid = "55925520"