Cluster Computing Web2 Sept2009
1. Integrating Web 2.0 Technologies with Scientific Simulation Codes for Real-Time Collaboration. Gabrielle Allen (LSU), Frank Loeffler (LSU), Thomas Radke (AEI), Erik Schnetter (LSU), Edward Seidel (NSF/LSU). IEEE Cluster Computing, New Orleans, September 2009.
2. Gravitational Wave Physics: Models; Analysis & Insight; Observations. Petascale problems: full 3D general relativistic models of binary systems, supernovae, gamma-ray bursts.
8. Cactus Application Environment: Individual research groups. Domain-specific shared infrastructure. Flesh: APIs, information, orchestration. Adaptive mesh refinement, parallel I/O, interaction, …
9. Typical Black Hole Simulations: At LSU … 300 Cactus thorns; 10,000 potential parameters; 20 different supercomputers; 100-2000 cores; days/weeks to run (checkpoint/restart); GBs to TBs of data (HDF5, ASCII, jpeg).
10. Collaborative Technologies: Technologies to share simulation-related information, developed in our group from the early 1990s. Essential to support the scientific research. Review historical evolution of these technologies. Show how Web 2.0 provides new tools to enable old scenarios.
11. Web-based Mail Lists: Mosaic web browser (1993, NCSA). Seidel’s group at NCSA worries about content: http://archive.ncsa.illinois.edu/Cyberia/NumRel/GravWaves.html (1995). Collaborative Cork Board (CoCoBoard) (mid 90s): researchers have web-based “project pages”; could attach images!! (usually 1-D plots of results); used till late 90s. Currently: project-based private wikis (parameter/output files, figures); organize material for weekly project conference calls. Cons: network needed to access/edit wiki, editing slow.
13. Simulation Web Interfaces: Thorn “httpd”. First collaborative tool fundamentally integrated into Cactus. Werner Benger (1999), visiting NCSA from Germany (7 hr time difference and email). Used socket library developed for remote viz (John Shalf & TIKSL project). Thorn “HTTPD” in standard toolkit (2000). Simulation status, variables, timing, viewport, output files, parameter steering, etc. Thorns can include their own web content.
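The idea behind the httpd thorn, a web server living inside the running simulation and reporting its state, can be illustrated with a minimal Python sketch. This is not the thorn's code (Cactus thorns are not written this way); the names `status_page` and `make_server` and the `state` dictionary are hypothetical, and parameter steering and authentication are left out.

```python
import html
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def status_page(state):
    """Render the simulation state (a dict) as a small HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(str(key))}</td><td>{html.escape(str(value))}</td></tr>"
        for key, value in state.items()
    )
    return f"<html><body><h1>Simulation status</h1><table>{rows}</table></body></html>"


def make_server(state, port=0):
    """Build an HTTP server that reports `state`; port 0 lets the OS pick a free port."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # The page is rendered on every request, so it shows the current state.
            body = status_page(state).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the simulation's own output free of access logs

    return ThreadingHTTPServer(("127.0.0.1", port), Handler)


if __name__ == "__main__":
    state = {"iteration": 0, "time": 0.0}
    server = make_server(state)
    # Serve in a background thread so the evolution loop is not blocked.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print("status page at http://127.0.0.1:%d/" % server.server_address[1])
    for i in range(1, 4):  # stand-in for the real evolution loop
        state["iteration"] = i
        state["time"] = i * 0.1
    server.shutdown()
    server.server_close()
```

Binding to 127.0.0.1 in the sketch mirrors the deployment problem on compute nodes: a page served there is only reachable from outside through something like port forwarding.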
14. Issues: Authorization to web pages (username/password in parameter file is insecure and awkward; newer version uses https and can also use X.509). Browsers can display images in certain formats; a Visualization thorn uses gnuplot to include e.g. performance over time, physical parameters. Problem deploying on compute nodes where the web server cannot be directly accessed (port forwarding, firewalls). How to find and track the simulations, publicize existence to a collaboration?
17. Simulation Reports and Email: Readable report automatically generated for each simulation (computation and physics). Prototyped 2001 but not used (?). How to collect reports in one place? Mail Thorn (sendmail). Email reliable and fault tolerant (spool). Supercomputers do not allow mail to be sent from compute nodes.
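A report mailed through sendmail might look like the following Python sketch. It is an illustration only: the function names, the report fields, and the sendmail path `/usr/sbin/sendmail` are assumptions, and the slides do not describe the Mail thorn's actual message format.

```python
import subprocess
from email.message import EmailMessage


def build_report(sender, recipient, run_name, summary):
    """Assemble a plain-text report email from a dict of summary values."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[simulation report] {run_name}"
    lines = [f"{key}: {value}" for key, value in summary.items()]
    msg.set_content("\n".join(lines))
    return msg


def send_report(msg, sendmail="/usr/sbin/sendmail"):
    """Hand the message to the local sendmail binary (path is an assumption).

    -t reads the recipients from the message headers;
    -oi stops a line containing only "." from ending the input.
    """
    subprocess.run([sendmail, "-t", "-oi"], input=msg.as_bytes(), check=True)
```

Handing the message to the local spool is what makes email fault tolerant here, and it is also why the approach fails on compute nodes that are not allowed to send mail.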
19. Announcing and Grid Portals: Collaborations need reliable, live information about long running simulations. NSF Astrophysics Simulation Collaboratory (ASC), 1999. Grid Portal provided centralized, collaborative interface to submit, monitor and archive simulations. Java, JSP, JavaScript with back-end database; contributed to GridSphere design (GridLab). JavaCOG to submit jobs and basic monitoring. [Figure: ASC Portal (2002)]
20. Announcing Simulation Info: Publish (application provided) simulation information. Thorn Announce, in prototype Cactus Worm scenario (2001). Message from Flesh/Thorn info. Transport: XML-RPC to remote socket (portal). Issues: job IDs; security, mapping users; cumbersome user-set parameters (portal location, visibility of job, notification needs). [Figure: Announcing to ASC Portal (2002)]
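The XML-RPC transport used for announcing can be sketched with Python's standard `xmlrpc.client` module. The method name `portal.announce` and the fields of the announcement are hypothetical; the slides do not give the interface that the ASC Portal actually exposed.

```python
import xmlrpc.client


def build_announcement(job_id, user, thorns, iteration):
    """Serialize one announcement as an XML-RPC method call (no network needed)."""
    info = {
        "job_id": job_id,        # the slides list job IDs as an open issue
        "user": user,            # mapping simulation users to portal users is another
        "thorns": thorns,
        "iteration": iteration,
    }
    # "portal.announce" is a made-up method name used only for illustration.
    return xmlrpc.client.dumps((info,), methodname="portal.announce")


def send_announcement(portal_url, info):
    """Send the same call to a live XML-RPC endpoint at portal_url."""
    proxy = xmlrpc.client.ServerProxy(portal_url)
    return proxy.portal.announce(info)


if __name__ == "__main__":
    print(build_announcement("job-001", "alice", ["HTTPD", "Announce"], 128))
```

Printing the payload shows the XML document that would travel to the portal's socket; `send_announcement` is the networked variant and needs a real endpoint.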
21. Notification: Portal notification service. Portal users configure at portal, simulations configure in parameter file. Email, SMS, Instant Message. Initial experiments generated large telecom bills! Cool and useful, but lots of work (FTE) to develop and modify portal service, difficult to configure.
22. Web 2.0 Technologies: Use for collaborative, simulation-level messaging and information archiving. Reliable, persistent, well-documented, user-configurable, cheap, well supported, good APIs.
23. Twitter: March 2006. Real-time short messaging system. Users send and receive each other’s updates (tweets). Wide range of devices and rudimentary social networking. Receivers can filter messages they see and specify how they receive them. Twitter API (e.g. post a new Twitter message from a user). Free.
24. Thorn Twitter: Uses libcurl. Cactus parameters for twitter username/password. Twitter API: statuses/update. At LSU “numrel” group account. Messages when simulation starts and at different stages.
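The request the thorn issues through libcurl can be approximated in Python as below. This is a sketch under assumptions: the base URL is a placeholder, the `.xml` suffix and the `status` form field follow the statuses/update call as it was documented around 2009, and username/password (HTTP Basic) authentication reflects the Cactus parameters named above. The present-day Twitter API uses different authentication, so the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request

# Placeholder host, an assumption for illustration; not a real API endpoint.
API_BASE = "https://api.example.invalid"


def build_status_update(username, password, message, api_base=API_BASE):
    """Build (but do not send) a POST request for a statuses/update call."""
    status = message[:140]  # tweets were limited to 140 characters at the time
    body = urllib.parse.urlencode({"status": status}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{api_base}/statuses/update.xml", data=body, method="POST"
    )
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    req = build_status_update("numrel", "not-a-real-password", "Simulation started")
    print(req.get_method(), req.full_url)
```

Keeping the password in a parameter has the same weakness noted earlier for the httpd thorn: anyone who can read the parameter file can read the credentials.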
25. Flickr: 2004, image hosting website for digital photographs (and now videos). Bought by Yahoo (2005). Professional account ($25/yr) for unlimited use. Web service API for uploading and manipulating images. Group images into Sets and Collections. Tags, title, description, metadata from EXIF headers. Social networking: users can comment on images, flag them, order by popularity, etc. Public/Private/Friends/Family. Blogs. RSS feed allows quick previewing.
26. Thorn Flickr: Send images from running simulation. Uses: flickcurl, libcurl, libxml2, openssl. Authentication more complex (api key, shared secret). Thorn uploads images that are generated by Cactus (and known to I/O layer), e.g. IoJpeg. Each simulation given its own Flickr set.
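The role of the API key and shared secret can be illustrated with the MD5 request signature used by Flickr's original authentication scheme: the shared secret followed by every parameter name and value, sorted by name, is hashed and sent as `api_sig`. The thorn delegates this to flickcurl, so the Python sketch below only illustrates the idea; the key, secret and parameter values are made up.

```python
import hashlib


def sign_request(params, shared_secret):
    """Return the MD5 signature of the request parameters.

    The string to hash is the shared secret followed by each parameter
    name and value, concatenated in order of parameter name.
    """
    base = shared_secret + "".join(f"{name}{params[name]}" for name in sorted(params))
    return hashlib.md5(base.encode("utf-8")).hexdigest()


def signed_params(params, api_key, shared_secret):
    """Return a copy of params with api_key and api_sig added."""
    out = dict(params, api_key=api_key)
    out["api_sig"] = sign_request(out, shared_secret)
    return out


if __name__ == "__main__":
    # Hypothetical call that creates one set per simulation; all values are made up.
    call = {
        "method": "flickr.photosets.create",
        "title": "bbh_run_042",
        "primary_photo_id": "1234567890",
    }
    print(signed_params(call, api_key="EXAMPLE_KEY", shared_secret="EXAMPLE_SECRET"))
```

Because the secret enters only the hash and never the request itself, the server can verify that the caller knows it without the secret being transmitted.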
27. Future Work: Extend capabilities, production testing. Common authentication mechanism. Social networking model (individual/shared accounts). Development of common tags, more metadata etc. Storing videos (Flickr, YouTube, Vimeo). Advantage for scientists presenting. Lots of other possibilities: Dropbox to publish files across a collaboration, WordPress for simulation reports/blogs, Facebook to replace grid portals and aggregate services, Cloud computing APIs for “grid” scenarios, …
28. Conclusions: Started as a fun project (undergrad). Web 2.0 provides reliable delivery, storage, access, and flexible collaborative features. Can use Web 2.0 to easily prototype new interactive and collaborative scenarios (have really missed this). Small groups and individuals can do this too!! Target standard of ease-of-use for cyberinfrastructure development. For real use need unified authentication, clear policies on data, site versions.