memories of tumblr gear & Tumblrowl

@honishi
tumblr developer’s meetup jp 2011
17 dec 2011

@honishi
hiroyuki onishi
honishi.tumblr.com
since mid 2008

• not a truly seasoned programmer
• writing code just for fun
• consultant for FAST Search Server at Microsoft

“honishi” is my secondary identity on the web,
my primary ones are:

@notomamiko
fuckyeahnotomamiko.tumblr.com

@kugimiyarie
fuckyeahkugimiyarie.tumblr.com

summary
✴ living in yukari kingdom
✴ a dedicated notomamist
✴ a patient with kugimiya disease (type: n)

today’s topic
✓ tumblr gear for iPhone
✓ Tumblrowl for Mac OS X

tumblr gear?

ancient tumblr client for iPhone

core concept
• prerequisite
  • no api, except login
  • poor performance of iPhone 3g
• scraping
  • text-based scraping
  • not xml-based:
    • dom ... fat? slow?
    • sax ... complex?
• as fast as possible
  • minimize processing
  • minimize network traffic

main user interface
• lots of webviews...

UIWebView x11 for browsing (1 unhidden, 10 hidden)
UIWebView x2 for reblogging (always hidden)

main user interface
(cont’d)
• it’s slow to start rendering all webviews at one time
• so the webviews are warmed up gradually

(debug view)

days of fixing app
• initial release ... jun 2009
• released after 4 rejections by Apple
• days of fixing app ... after release
• every little modification on dashboard
affects app’s scraping logic

an opinion from
the opinion leader
• scraping should be executed on server side
  • when the structure of html changes, there is a need to modify scraping logic
  • if the logic is implemented within client application, it usually takes a long time to release the fixed app: submit the build, wait for Apple’s review, being reviewed by Apple...
  • it’s also better for cross-platform application provisioning

weakness of
server side scraping
• scalability?
  • all connections & accesses in single point
  • need to invest in computing resources there
• possibility of ban?
  • service provider can easily identify massive transactions from one location
  • once banned, it’s over
• security?
  • no oauth provided at that time
  • so need to have & use user’s password at server side

still yet
client side scraping...

restructuring for
fault-tolerance
• splitting the scraping processes into 2 blocks:
  • logic for scraping
  • metadata for above
• store them in different places:
  • logic inside of the app
  • metadata outside of the app, s3
• metadata is read from the app at the time of startup.
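
A minimal sketch of the startup read, in Python rather than the app's own language. The fallback to bundled defaults and the dictionary layout of the plist are assumptions made for illustration; the slides only say that the metadata is a property list on S3 read at startup.

```python
import plistlib
from urllib.request import urlopen

# Hypothetical defaults shipped inside the app, used when the remote copy is unavailable.
BUNDLED_METADATA = {"baseUrl": "http://www.tumblr.com"}


def fetch_from_s3(url="http://s3.amazonaws.com/tumblrgear/parsemeta.plist"):
    """Download the raw plist bytes (URL taken from the slides)."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def load_metadata(fetch=fetch_from_s3, fallback=BUNDLED_METADATA):
    """Return scraping metadata parsed from the plist bytes produced by fetch()."""
    try:
        return plistlib.loads(fetch())
    except Exception:
        # Network failure or malformed plist: stay usable with the bundled rules.
        return dict(fallback)
```

Because the scraping rules arrive as data, a dashboard html change can be handled by editing the plist, with no new build to submit.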

logic & metadata
logic (process), inside app            metadata, outside app
1. read dashboard                      base url?
2. pre-process                         target? how?
3. split posts                         boundaries for: html header? footer? post?
4. find next link (then back to 1.)    base url? elements for the url?

scraping metadata
• simple property list
• almost all rules are written in simple string
or regular expression
• located on amazon s3
• http://s3.amazonaws.com/tumblrgear/parsemeta.plist

scraping metadata
(cont’d)

overview: dashboard
(figure labels: dashboard html, preprocessed html)
html header
post #1
post #2
:
post #9
post #10
(next link)
html footer

#1. read dashboard
• login, v1 api
• read regular html from the url defined in metadata
# key value
1 baseUrl http://www.tumblr.com

#2. pre-process
• regular expression (replace)
• reduce the weight of html
  • disable unwanted images
  • disable javascripts
  • change img src to *_250.[jpg|gif] etc...
# key value
1 pageReplace src="http://assets.tumblr.com/images/.*" ;; removed
2 pageReplace <(script .*?</script)> ;;
3 pageReplace src=(".*?)_(400|500).(jpg|png|gif)(") ;; ORIGINAL_SRC=$1_$2.$3$4 src=$1_250.$3$4
: :
( A ;; B ... replace A with B )
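
The "A ;; B" rules can be applied with a few lines of code. This is a sketch in Python, not the app's actual implementation: `parse_rule` and `preprocess` are hypothetical names, and translating `$1` into Python's `\g<1>` back-reference syntax is an assumption about how the original regex engine read the replacement.

```python
import re


def parse_rule(rule):
    """Split an 'A ;; B' rule into (compiled pattern A, replacement B)."""
    pattern, _, replacement = rule.partition(";;")
    # The slides write back-references as $1; Python's re module uses \g<1>.
    replacement = re.sub(r"\$(\d)", r"\\g<\1>", replacement.strip())
    return re.compile(pattern.strip()), replacement


def preprocess(html, rules):
    """Apply every replace rule to the html, in the order listed in the metadata."""
    for rule in rules:
        pattern, replacement = parse_rule(rule)
        html = pattern.sub(replacement, html)
    return html


# Rule 3 from the table: keep the original src, then point src at the 250px image.
rules = [
    'src=(".*?)_(400|500).(jpg|png|gif)(") ;; '
    "ORIGINAL_SRC=$1_$2.$3$4 src=$1_250.$3$4"
]
print(preprocess('<img src="http://example.com/a_500.jpg">', rules))
```

The example url is made up. Keeping the original address in `ORIGINAL_SRC` is what lets a later rule restore the high resolution image.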

#2. pre-process(cont’d)
• override css for iPhone
• high reso, if required
  • re-replace img src to *_500.[jpg|gif]
# key value
(</head>) ;; <style type="text/css">

</style>
<meta name="viewport" content="width=320">
$1

2 highResoReplace ORIGINAL_SRC="(.*)" src=".*?" ;; HIGH_RESO src="$1"

3 : :

#3. split post
• detect boundaries in the html
• then split it into header, footer and posts
# key value
1 pageHeaderSplitter 
2 pageFooterSplitter 
3 postBeginSplitter <li id="post_
4 postEndSplitter
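
A sketch of the split step in Python, using plain string boundaries. Only the `postBeginSplitter` value survives in the table above, so the header and footer markers in the example (`<ol id="posts">` and `</ol>`) are hypothetical stand-ins, and `postEndSplitter` is not modelled at all.

```python
def split_page(html, header_end, footer_begin, post_begin):
    """Split a page into (header, posts, footer) using literal boundary strings."""
    header, _, rest = html.partition(header_end)
    body, _, footer = rest.partition(footer_begin)
    # Every chunk after the first starts one post; re-attach the marker split() consumed.
    chunks = body.split(post_begin)
    posts = [post_begin + chunk for chunk in chunks[1:]]
    return header + header_end, posts, footer_begin + footer


page = '<html><ol id="posts"><li id="post_1">a</li><li id="post_2">b</li></ol></html>'
header, posts, footer = split_page(page, '<ol id="posts">', "</ol>", '<li id="post_')
print(len(posts))
```

Plain string search fits the "as fast as possible" goal: no dom is built and the page is scanned only a few times.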

#4. find next link
• find next link elements
• assemble the next link using elements
# key value
1 nextLinkUrl http://www.tumblr.com{1}
2 nextLinkElements <a id="next_page_link" href="(.*)">

• then read next page
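
A sketch of the next link lookup in Python. It assumes that `{1}` in `nextLinkUrl` stands for the first capture group of `nextLinkElements`; the function name and the sample href are hypothetical.

```python
import re


def find_next_url(html, element_pattern, url_template):
    """Return the next page url, or None when the page has no next link."""
    match = re.search(element_pattern, html)
    if match is None:
        return None
    url = url_template
    # {1}, {2}, ... are replaced by the capture groups, in order.
    for index, value in enumerate(match.groups(), start=1):
        url = url.replace("{%d}" % index, value)
    return url


html = '<a id="next_page_link" href="/dashboard/2/123">next</a>'
next_url = find_next_url(
    html,
    '<a id="next_page_link" href="(.*)">',
    "http://www.tumblr.com{1}",
)
print(next_url)
```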

stored posts
split html: html header, post #1, post #2, ..., post #9, post #10, html footer
stored separately: header, footer, posts array
concatenate on demand: header + post #n + footer
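
The storage idea can be sketched as a small Python class. `StoredPage` and `document_for` are hypothetical names; the slides only state that the pieces are stored separately and concatenated on demand.

```python
class StoredPage:
    """Keep header, footer and posts apart; build a full document only when asked."""

    def __init__(self, header, posts, footer):
        self.header = header
        self.posts = list(posts)
        self.footer = footer

    def document_for(self, index):
        # A complete html document holding just one post, ready for one webview.
        return self.header + self.posts[index] + self.footer


stored = StoredPage("<html><body>", ["<p>post 1</p>", "<p>post 2</p>"], "</body></html>")
print(stored.document_for(1))
```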

reblog
• detect reblog url of the post
# key value
1 reblogUrl http://www.tumblr.com{1}
2 reblogElements <a href="(/reblog/.*?)">

• get the raw html from the url

reblog (cont’d)
• preprocess the html (disable img src etc...)
# key value
1 reblogReplace <(script .*?</script)> ;; 
2 reblogReplace <link ;; <disabled_link
3 reblogReplace <img ;; <disabled_img

• send the html to webview for reblogging

reblog (cont’d)
• do the javascript things
  • put the comment into text area, if provided
  • push the submit button
# key value
1 reblogAddCommentJS (javascript here ... snip)
2 reblogSubmitJS (javascript here ... snip)

• wait for redirect back to dashboard
# key value
1 reblogRedirectUrl http://www.tumblr.com/dashboard

• done

like
• detect like url of the post
# key value
1 likeUrl http://www.tumblr.com/like/{2}?form_key={3}&id={1}
2 likeElements type="hidden" name="id" value="(.*?)"
3 likeElements action="/like/(.*?)"
4 likeElements name="form_key"\s+value="(.*?)"

• do the simple post
• wait for response code 200
• done
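
A sketch of how the like url might be assembled, in Python. It assumes that `{1}`, `{2}` and `{3}` refer to the first capture group of the first, second and third `likeElements` rows, and that the third pattern separates the attributes with `\s+`. The sample form html and the function name are made up.

```python
import re


def build_url(html, template, element_patterns):
    """Fill {1}, {2}, ... in template with the first group of each pattern, in order."""
    url = template
    for index, pattern in enumerate(element_patterns, start=1):
        match = re.search(pattern, html)
        if match is None:
            return None  # the page layout changed; the metadata needs an update
        url = url.replace("{%d}" % index, match.group(1))
    return url


post_html = (
    '<form action="/like/abc123">'
    '<input type="hidden" name="id" value="42"/>'
    '<input type="hidden" name="form_key" value="k9"/>'
    "</form>"
)
like_url = build_url(
    post_html,
    "http://www.tumblr.com/like/{2}?form_key={3}&id={1}",
    [
        r'type="hidden" name="id" value="(.*?)"',
        r'action="/like/(.*?)"',
        r'name="form_key"\s+value="(.*?)"',
    ],
)
print(like_url)
```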

sales & trends
• average 1,800 downloads/week?

sales & trends (cont’d)
• US market is now 3 times larger than Japanese one?

recommended
migration path
• for iOS users ... Tumbletail
• for Android users ... Tumblife

conclusion
• i myself currently do not use this app, because:
  • softbank’s very poor signal everywhere
  • reducing number of following accounts, so it’s enough for me to check the dashboard using pc in the bed

Tumblrowl?

Growl-like dashboard application for Mac OS X

motivation to build
• recently...
• i don’t wanna do anything...
• just wanna watch niconama...

hataratti aka

motivation to build
(cont’d)
• i don’t wanna do anything (reprise)

tumblr?
tired of pressing j & k,
but missing...

my requirements
• no user input required
• effective utilization of screen

yorufukurou chrome here!

architecture
• Growl, forked
• network
  • OAuthConsumer for v2 api
  • ASIHTTPRequest
• misc
  • RegexKitLite
  • JSON Framework
  • Sparkle

overview
dashboard api (w/ since_id) → post queue (mutable array) → display queue (mutable array) → display (nswindow)
• polling every 10 sec: dashboard api → post queue
• polling every 2 sec, 1 post / dequeue: post queue → display queue
• suspend?
• on display:
  • open post?
  • reblog?
  • like?
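
The two queues can be sketched in Python as below. The class and method names are hypothetical, timers are left out (the caller would run `poll_api` every 10 sec and `poll_queue` every 2 sec), and the mapping of each polling interval to each hop is one reading of the diagram.

```python
from collections import deque


class Pipeline:
    def __init__(self, fetch_dashboard):
        # fetch_dashboard(since_id) -> list of post dicts, oldest first (assumed shape)
        self.fetch_dashboard = fetch_dashboard
        self.since_id = None
        self.post_queue = deque()
        self.display_queue = deque()

    def poll_api(self):
        """Slow tick: pull only the posts newer than since_id."""
        posts = self.fetch_dashboard(self.since_id)
        if posts:
            self.since_id = max(post["id"] for post in posts)
            self.post_queue.extend(posts)

    def poll_queue(self):
        """Fast tick: move exactly one post to the display queue."""
        if self.post_queue:
            self.display_queue.append(self.post_queue.popleft())


feed = [{"id": 1}, {"id": 2}, {"id": 3}]
pipeline = Pipeline(
    lambda since_id: [p for p in feed if since_id is None or p["id"] > since_id]
)
pipeline.poll_api()
pipeline.poll_api()  # nothing new, so the queue does not grow
pipeline.poll_queue()
print(len(pipeline.post_queue), len(pipeline.display_queue))
```

Moving one post per fast tick spreads a burst of new posts over time, so the notifications do not all appear at once.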

Growl, forked
• extracting the displaying module from Growl
• extending the display window
out of box window: x, icon, title, description
extended window: x r o l, avatar, blog name, image area, upper text area, lower text area, title, source

miscellaneous
• oauth & webview
  • all cookies are shared in safari & all instances of webview (default behavior)
  • so the login sequence to get authorized doesn’t work as expected
  • need to override delegate in webview to handle cookie container manually
• xauth...?

miscellaneous (cont’d)
• avoiding reblog storm
  • display limit
• implement free space seeking logic
• by hooking

conclusion
• i myself currently do not use this app, because i find it distracting... seriously...

icons for tumblr gear
• designed by charactoy
• 3,000- for each

• http://www.charactoy.com/

icons for Tumblrowl
• designed by diwakar ganesh (designcrowd)
• $365.65-

• http://www.designcrowd.com/

thank you.

@honishi
onishi.hiroyuki@gmail.com
special thanks:
inu(nihon henshu ongaku kyokai), nonSectRadicals, mamiko noto,
shingo yamanaka, jeffrey kuo, midori yokoyama, naoto ohara, masami iwasawa
