A little bit about me: over 10 years in the industry; 3 years in the cloud; former GigaSpaces Cloud Technical Director; leading the Yahoo! Integrated Cloud product line; based at Yahoo!’s headquarters in Sunnyvale, CA.
Gartner presentation?? May
Up arrow: more updates at the same quality; this can only be done by moving the curve. Right arrow: more quality without losing features. Quicker at the same quality. Y axis = “Agility”, measured by how many releases we can have in a year. X axis = “Quality”, measured by uptime.
Why Pyramid? Everything is targeted toward the users. Every $ you spend in the lower layers yields $$$ at the upper layers. Every layer is built on top of the one below it and adds only the “marginal” development effort.
Yahoo!’s unique cloud play is that while other companies have focused on building out and exposing services at the infrastructure level (in terms of external products; think Amazon), Yahoo! has decided to focus on building out and exposing what we call our “functional cloud services”. These are services aimed at helping developers, partners, advertisers and Yahoo! itself create and deliver consumer-focused product innovations by creating open services with access to Yahoo! data and computing resources. This is not to say we don’t invest and lead in infrastructure cloud technology (in fact, with Hadoop and other tech that drives the Internet, we do lead), but rather that on that front our focus is not on productization of infrastructure as much as on open source and industry/academic collaboration. More like basic research.
YQL: The YQL platform provides a mediator service that enables developers to query, filter and combine data across Yahoo! and beyond. YQL exposes a SQL-like SELECT syntax that is both familiar to developers and expressive enough for getting the right data. Through the SHOW and DESC commands we attempt to make YQL self-documenting, enabling developers to discover the available data sources and structure without opening another web browser or reading a manual. The YQL Web Service exposes just a single URL, http://query.yahooapis.com/v1/?q=[command], where [command] is compiled for each query. We perform rudimentary analysis on the query to determine how to factor it across one or more web services. As much of the query as possible is reworked into Yahoo! web service REST calls, and the remaining aspects are performed by the YQL service itself.
Think of a better name for “horizontal”.
Open PaaS for building scalable websites, NOT IaaS. Find better wording than “structured”.
Inquisitor: Today, Yahoo! Search is embracing the Mac community and offering similar search assistance features with the acquisition of Inquisitor, a Safari browser plug-in. Inquisitor 3, a search technology that auto-completes queries and delivers results right in the Safari web browser, is similar to Yahoo!’s existing Search Assist technology. Simply type in your query and websites will appear immediately, along with suggestions for refining your search. Just as with Search Assist, the goal with Inquisitor is to help users find exactly the site they’re looking for as quickly as possible.
The Yahoo! Infrastructure Cloud is a comprehensive bundle of services (not only Hadoop). “Simple APIs” = an integrated, simple API to “consume” the core services, targeted at developers of functional services. Unstructured storage (blobs) = images, text, video, binaries, and so on. Structured storage (information) = very dynamic, queryable. Online serving: a key aspect of the “power of data within Yahoo!”. These are the foundations on which tenants build functional services. They are not tied to specific app logic, although some provide the ability to inject application logic through well-defined APIs. They are broadly applicable. They are fault-tolerant over commodity hardware: built using inexpensive commodity hardware, they should mask component failures. The Integrated Cloud is the key: loosely coupled services that collectively make it easy to quickly develop and operate functional services.
Commodity HW + horizontal scaling: add inexpensive servers with JBODs; storage servers and their disks are not assumed to be highly reliable and available; use replication across servers to deal with unreliable storage/servers. Metadata-data separation, a simple design: storage scales horizontally; metadata scales vertically (today). Slightly restricted file semantics: the focus is mostly on sequential access; no file locking features. Support for moving computation close to the data, i.e. servers have two purposes: data storage and computation. Simplicity of design is why a small team could build such a large system in the first place.
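To make the replication and metadata-data separation points concrete, here is a minimal sketch. It assumes a replication factor of 3 and a simple round-robin placement; both are illustrative choices, not the placement policy of any real system, and the names `place_blocks` and `readable_blocks` are hypothetical.

```python
REPLICATION_FACTOR = 3  # assumed value; the notes do not state one


def place_blocks(block_ids, servers, replication=REPLICATION_FACTOR):
    """Metadata side: map each block id to the servers that hold a copy."""
    if replication > len(servers):
        raise ValueError("need at least as many servers as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        # Round-robin starting point so blocks spread evenly across servers.
        placement[block] = [
            servers[(i + k) % len(servers)] for k in range(replication)
        ]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one copy on a live server."""
    return {
        block
        for block, holders in placement.items()
        if any(server not in failed for server in holders)
    }


servers = ["s1", "s2", "s3", "s4"]
placement = place_blocks(["b0", "b1", "b2", "b3"], servers)

# Two of the four unreliable servers fail; every block keeps a live replica.
survivors = readable_blocks(placement, failed={"s1", "s2"})
print(sorted(survivors))
```

The placement map is the only metadata here, which is why it can stay small and live on one machine while the block copies spread across many inexpensive servers.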
BOSS: BOSS (Build your Own Search Service) is Yahoo!'s open search web services platform. The goal of BOSS is simple: to foster innovation in the search industry. Developers, start-ups, and large Internet companies can use BOSS to build and launch web-scale search products that utilize the entire Yahoo! Search index. BOSS gives you access to Yahoo!'s investments in crawling and indexing, ranking and relevancy algorithms, and powerful infrastructure. By combining your unique assets and ideas with our search technology assets, BOSS is a platform for the next generation of search innovation, serving hundreds of millions of users across the Web.
YQL: The YQL platform provides a mediator service that enables developers to query, filter and combine data across Yahoo! and beyond. YQL exposes a SQL-like SELECT syntax that is both familiar to developers and expressive enough for getting the right data. Through the SHOW and DESC commands we attempt to make YQL self-documenting, enabling developers to discover the available data sources and structure without opening another web browser or reading a manual. The YQL Web Service exposes just a single URL, http://query.yahooapis.com/v1/?q=[command], where [command] is compiled for each query. We perform rudimentary analysis on the query to determine how to factor it across one or more web services. As much of the query as possible is reworked into Yahoo! web service REST calls, and the remaining aspects are performed by the YQL service itself.
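The single-URL design described for YQL can be sketched as follows. The base URL is copied as written in these notes and the deployed endpoint path may differ; the table name `example.table` and the helper `build_yql_url` are hypothetical. The code only builds URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Base URL exactly as given in the notes; the real endpoint path may differ.
YQL_BASE_URL = "http://query.yahooapis.com/v1/"


def build_yql_url(command: str) -> str:
    """Return the single-endpoint URL carrying a YQL command in the q parameter."""
    return YQL_BASE_URL + "?" + urlencode({"q": command})


# Two discovery commands and one query; the table name is a placeholder.
commands = [
    "SHOW TABLES",
    "DESC example.table",
    "SELECT * FROM example.table WHERE id = 1",
]
urls = [build_yql_url(command) for command in commands]
for url in urls:
    print(url)
```

Every command, whether discovery (SHOW, DESC) or data retrieval (SELECT), goes to the same address; only the value of `q` changes.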
YDN is the new thing, exposing cloud-based services to the outside world. SaaS-like services (the upper layer) have been around at Yahoo! for a long time. We need to work with YDN to crystallize the difference between web services and functional cloud services.
M45: more than 27 trillion calculations per second. Carnegie Mellon University; the University of California at Berkeley; Cornell University; the University of Massachusetts at Amherst. Joined Open Cirrus: the first Eastern European institution, the Russian Academy of Sciences; the Korean Electronics and Telecommunications Research Institute (ETRI); the Malaysian Institute of Microelectronic Systems (MIMOS). The University of Illinois at Urbana-Champaign; the Infocomm Development Authority (IDA) in Singapore; the Karlsruhe Institute of Technology (KIT) in Germany.
Q: What is Yahoo!’s cloud computing infrastructure and how is it unique?
A: As one of the largest providers of consumer Internet services in the world, Yahoo!’s cloud operates at virtually unprecedented scale, making it a unique environment and testing ground for cloud computing technologies. Yahoo! has more than 500M unique users per month across the world. We store and deliver hundreds of petabytes of data and hundreds of billions of objects, and we serve hundreds of thousands of requests per second. All of this activity is processed across a diverse footprint of distributed data centers while seamlessly balancing highly variable usage patterns across a global audience at low latencies. Almost no other company has to tune its infrastructure to deal with such a range of technical requirements and high standards of performance. To meet this challenge, Yahoo!’s cloud includes a collection of infrastructure and functional services targeted at dramatically improving the company’s efficiency throughout the entire product development cycle, from gathering user feedback and insight, to feature testing and iteration, to ongoing product operations.
Q: How should we categorize Yahoo!’s cloud computing offerings? Is it IaaS, PaaS, SaaS or something else?
A: For the time being, Yahoo!’s cloud computing focus is on its internal offerings, in service of making the Yahoo! experience as extraordinary, effective and productive as possible for consumers and advertisers across the world. We see this as a multi-year effort that will provide significant advantages for Yahoo! now and in the future. Over time, we can envision exposing some of these cloud technologies and services externally. In such a scenario, we’d likely focus on more “functional” cloud services (more PaaS than IaaS) that could help developers leverage Yahoo!’s massive scale to innovate and deliver new, more richly integrated user experiences.
Q: What are Yahoo!’s future plans in cloud computing?
A: We are investing in further building out and deploying cloud computing technologies and services across the global Yahoo! operation so as to help our product teams innovate faster and deliver high-quality experiences to our customers across the globe. One area where this effort has been particularly notable is in the development and delivery of the Yahoo! Open Strategy and Y!OS platform. Overall, we will continue to actively collaborate with the industry, academia and the open source community, including through our Open Cirrus consortium, involvement with Hadoop and Pig, and support of Apache. Over time, we may consider exposing our cloud services in a more comprehensive manner through the Yahoo! Developer Network, which serves as Yahoo!’s front door for third parties seeking to engage with our developer tools and web services, including such popular offerings as BOSS, Flickr, YQL, YUI, and Y!OS. However, we have nothing specific to share at this time.