Data Applications and Infrastructure at LinkedIn (Hadoop Summit 2010)

Yahoo Developer Network

Hadoop Summit 2010, application track: Data Applications and Infrastructure at LinkedIn. Jay Kreps, LinkedIn.

Plan

Data-centric engineering at LinkedIn

People You May Know

Other products

Relevance Products

Infrastructure as an Ecosystem

Open Source
Zoie – Real-time search indexing
Bobo – Faceted search
Decomposer – Very large matrix decomposition routines (now in Mahout)
Norbert – Partition-aware cluster management & RPC
Voldemort – Key/value storage
Kamikaze – Compression package
Sensei – Distributed search
Azkaban – Hadoop workflow

Azkaban workflow = cron + make
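The "cron + make" idea can be illustrated with a small sketch: a scheduler triggers a workflow at a set time (the cron part), and the workflow runs its jobs in dependency order (the make part). The Python below is only an illustration of that idea and is not Azkaban's API or job file format; the job names and the `run_workflow` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each job maps to the set of jobs it depends on.
JOBS = {
    "extract": set(),
    "clean": {"extract"},
    "train_model": {"clean"},
    "score": {"clean", "train_model"},
    "push_to_site": {"score"},
}


def run_order(jobs):
    """Return an order in which every job comes after its dependencies."""
    # TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(jobs).static_order())


def run_workflow(jobs, run_job):
    """Run each job exactly once, in dependency order; return the jobs run."""
    done = []
    for name in run_order(jobs):
        run_job(name)
        done.append(name)
    return done


if __name__ == "__main__":
    # A scheduler (the "cron" half) would invoke this at a fixed time.
    run_workflow(JOBS, lambda name: print("running", name))
```

A real workflow system would also need to handle failures, retries, and logging, which this sketch leaves out.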

Azkaban workflow : Hadoop :: web framework : webapp

Azkaban

Azkaban Examples: example workflow UI

Workflow

Data Deployment: How do you get your multi-billion-edge probabilistic relationship graph to the live website to serve queries?

Voldemort

Voldemort Data Deployment

Questions?

More Related Content

What's hot

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

What's new in SQL on Hadoop and Beyond

What's new in SQL on Hadoop and Beyond

What's new in SQL on Hadoop and Beyond

DataWorks Summit/Hadoop Summit

Lambda-less Stream Processing @Scale in LinkedIn

Lambda-less Stream Processing @Scale in LinkedIn

Lambda-less Stream Processing @Scale in LinkedIn

DataWorks Summit/Hadoop Summit

Big Data Ready Enterprise

Big Data Ready Enterprise

Big Data Ready Enterprise

DataWorks Summit/Hadoop Summit

What is an Open Data Lake? - Data Sheets | Whitepaper

What is an Open Data Lake? - Data Sheets | Whitepaper

What is an Open Data Lake? - Data Sheets | Whitepaper

Discovery & Consumption of Analytics Data @Twitter

Discovery & Consumption of Analytics Data @Twitter

Discovery & Consumption of Analytics Data @Twitter

Benefits of Hadoop as Platform as a Service

Benefits of Hadoop as Platform as a Service

Benefits of Hadoop as Platform as a Service

DataWorks Summit/Hadoop Summit

Machine learning at scale challenges and solutions

Machine learning at scale challenges and solutions

Machine learning at scale challenges and solutions

Stavros Kontopoulos

Yahoo Mail has 200+ million users a month and generates hundreds of terabytes of data per day, which continues to grow steadily. The nature of email messages has also evolved: for example, today the majority of them are generated by machines, consisting of newsletters, social media notifications, purchase invoices, travel bookings, and the like, which drove innovations in product development to help users organize their inboxes. Since 2014, the Yahoo Mail Data Engineering team took on the task of revamping the Mail data warehouse and analytics infrastructure in order to drive the continued growth and evolution of Yahoo Mail. Along the way we have built a 50 PB Hadoop warehouse, and surrounding analytics and machine learning programs that have transformed the way data plays in Yahoo Mail. In this session we will share our experience from this 3 year journey, from the system architecture, analytics systems built, to the learnings from development and drive for adoption.

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

DataWorks Summit

The Past, Present and Future of Big Data @LinkedIn

The Past, Present and Future of Big Data @LinkedIn

The Past, Present and Future of Big Data @LinkedIn

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

Infochimps, a CSC Big Data Business

I will share the vision and the production journey of how we build enterprise shared AI As A Service platforms with distributed deep learning technologies. Including those topics: 1) The vision of Enterprise Shared AI As A Service and typical AI services use cases at FinTech industry 2) The high level architecture design principles for AI As A Service 3) The technical evaluation journey to choose an enterprise deep learning framework with comparisons, such as why we choose Deep learning framework based on Spark ecosystem 4) Share some production AI use cases, such as how we implemented new Users-Items Propensity Models with deep learning algorithms with Spark,improve the quality , performance and accuracy of offer and campaigns design, targeting offer matching and linking etc. 5) Share some experiences and tips of using deep learning technologies on top of Spark , such as how we conduct Intel BigDL into a real production.

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

Netflix processes trillions of events and petabytes of data a day in the Keystone data pipeline, which is built on top of Apache Flink. As Netflix has scaled up original productions annually enjoyed by more than 150 million global members, data integration across the streaming service and the studio has become a priority. Scalably integrating data across hundreds of different data stores in a way that enables us to holistically optimize cost, performance and operational concerns presented a significant challenge. Learn how we expanded the scope of the Keystone pipeline into the Netflix Data Mesh, our real-time, general-purpose, data transportation platform for moving data between Netflix systems. The Keystone Platform’s unique approach to declarative configuration and schema evolution, as well as our approach to unifying batch and streaming data and processing will be covered in depth.

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Druid is a high performance, column-oriented distributed data store that is widely used at Oath for big data analysis. Druid has a JSON schema as its query language, making it difficult for new users unfamiliar with the schema to start querying Druid quickly. The JSON schema is designed to work with the data ingestion methods of Druid, so it can provide high performance features such as data aggregations in JSON, but many are unable to utilize such features, because they not familiar with the specifics of how to optimize Druid queries. However, most new Druid users at Yahoo are already very familiar with SQL, and the queries they want to write for Druid can be converted to concise SQL. We found that our data analysts wanted an easy way to issue ad-hoc Druid queries and view the results in a BI tool in a way that's presentable to nontechnical stakeholders. In order to achieve this, we had to bridge the gap between Druid, SQL, and our BI tools such as Apache Superset. In this talk, we will explore different ways to query a Druid datasource in SQL and discuss which methods were most appropriate for our use cases. We will also discuss our open source contributions so others can utilize our work. GURUGANESH KOTTA, Software Dev Eng, Oath and JUNXIAN WU, Software Engineer, Oath Inc.

Querying Druid in SQL with Superset

Querying Druid in SQL with Superset

Querying Druid in SQL with Superset

DataWorks Summit

Gobblin' Big Data With Ease @ QConSF 2014

Gobblin' Big Data With Ease @ QConSF 2014

Gobblin' Big Data With Ease @ QConSF 2014

Spark and Couchbase– Augmenting the Operational Database with Spark

Spark and Couchbase– Augmenting the Operational Database with Spark

Spark and Couchbase– Augmenting the Operational Database with Spark

Matt Ingenthron

Schema-on-Read vs Schema-on-Write

Schema-on-Read vs Schema-on-Write

Schema-on-Read vs Schema-on-Write

Data Infrastructure at LinkedIn

Data Infrastructure at LinkedIn

Data Infrastructure at LinkedIn

While we frequently talk about how to build interesting products on top of machine and event data, the reality is that collecting, organizing, providing access to, and managing this data is where most people get stuck. In this session, we’ll follow the flow of data through an end to end system built to handle tens of terabytes per day of event-oriented data, providing real time streaming, in-memory, SQL, and batch access to this data. We’ll go into detail on how open source systems such as Hadoop, Kafka, Solr, and Impala/Hive are actually stitched together; describe how and where to perform data transformation and aggregation; provide a simple and pragmatic way of managing event metadata; and talk about how applications built on top of this platform get access to data and extend its functionality. This session is especially recommended for data infrastructure engineers and architects planning, building, or maintaining similar systems.

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

What's hot (20)

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

WhereHows: Taming Metadata for 150K Datasets Over 9 Data Platforms

What's new in SQL on Hadoop and Beyond

What's new in SQL on Hadoop and Beyond

What's new in SQL on Hadoop and Beyond

Lambda-less Stream Processing @Scale in LinkedIn

Lambda-less Stream Processing @Scale in LinkedIn

Lambda-less Stream Processing @Scale in LinkedIn

Big Data Ready Enterprise

Big Data Ready Enterprise

Big Data Ready Enterprise

What is an Open Data Lake? - Data Sheets | Whitepaper

What is an Open Data Lake? - Data Sheets | Whitepaper

What is an Open Data Lake? - Data Sheets | Whitepaper

Discovery & Consumption of Analytics Data @Twitter

Discovery & Consumption of Analytics Data @Twitter

Discovery & Consumption of Analytics Data @Twitter

Benefits of Hadoop as Platform as a Service

Benefits of Hadoop as Platform as a Service

Benefits of Hadoop as Platform as a Service

Machine learning at scale challenges and solutions

Machine learning at scale challenges and solutions

Machine learning at scale challenges and solutions

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse

The Past, Present and Future of Big Data @LinkedIn

The Past, Present and Future of Big Data @LinkedIn

The Past, Present and Future of Big Data @LinkedIn

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

AI as a Service, Build Shared AI Service Platforms Based on Deep Learning Tec...

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...

Querying Druid in SQL with Superset

Querying Druid in SQL with Superset

Querying Druid in SQL with Superset

Gobblin' Big Data With Ease @ QConSF 2014

Gobblin' Big Data With Ease @ QConSF 2014

Gobblin' Big Data With Ease @ QConSF 2014

Spark and Couchbase– Augmenting the Operational Database with Spark

Spark and Couchbase– Augmenting the Operational Database with Spark

Spark and Couchbase– Augmenting the Operational Database with Spark

Schema-on-Read vs Schema-on-Write

Schema-on-Read vs Schema-on-Write

Schema-on-Read vs Schema-on-Write

Data Infrastructure at LinkedIn

Data Infrastructure at LinkedIn

Data Infrastructure at LinkedIn

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...

Viewers also liked

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Graph db

GraphDB Connectors – Powering Complex SPARQL Queries

GraphDB Connectors – Powering Complex SPARQL Queries

GraphDB Connectors – Powering Complex SPARQL Queries

LinkedIn Data Infrastructure Slides (Version 2)

LinkedIn Data Infrastructure Slides (Version 2)

LinkedIn Data Infrastructure Slides (Version 2)

Shirshanka Das and Yael Garten describe how LinkedIn redesigned its data analytics ecosystem in the face of a significant product rewrite, covering the infrastructure changes that enable LinkedIn to roll out future product innovations with minimal downstream impact. Shirshanka and Yael explore the motivations and the building blocks for this reimagined data analytics ecosystem, the technical details of LinkedIn’s new client-side tracking infrastructure, its unified reporting platform, and its data virtualization layer on top of Hadoop and share lessons learned from data producers and consumers that are participating in this governance model. Along the way, they offer some anecdotal evidence during the rollout that validated some of their decisions and are also shaping the future roadmap of these efforts.

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Architecture of a Kafka camus infrastructure

Architecture of a Kafka camus infrastructure

Architecture of a Kafka camus infrastructure

Netflix Data Pipeline With Kafka

Netflix Data Pipeline With Kafka

Netflix Data Pipeline With Kafka

Allen (Xiaozhong) Wang

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

LinkedIn has several data driven products that improve the experience of its users -- whether they are professionals or enterprises. Supporting this is a large ecosystem of systems and processes that provide data and insights in a timely manner to the products that are driven by it. This talk provides an overview of the various components of this ecosystem which are: - Hadoop - Teradata - Kafka - Databus - Camus - Lumos etc.

The Big Data Analytics Ecosystem at LinkedIn

The Big Data Analytics Ecosystem at LinkedIn

The Big Data Analytics Ecosystem at LinkedIn

Apache Kafka

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Text Analytics & Linked Data Management As-a-Service

Text Analytics & Linked Data Management As-a-Service

Text Analytics & Linked Data Management As-a-Service

Realtime streaming architecture in INFINARIO

Realtime streaming architecture in INFINARIO

Realtime streaming architecture in INFINARIO

Non-interactive big-data analysis prohibits experimentation and can interrupt the analyst’s train of thoughts but analyzing and drawing insights in real time is no easy task with jobs often taking minutes/hours to complete. What if you want to put a interactive interface in front of that data that allows iterative insights? What if you need that interactive experience to be sub second? Traditional SQL and most MPP/NoSQL databases cannot run complex calculations over large data in a performant manner. Popular distributed systems such as Hadoop or Spark can execute jobs but their job overhead prohibits sub second response times. Learn how an in-memory computing framework enabled us to perform complex analysis jobs on massive data points with sub second response times — allowing us to plug it into a simple, drag-and-drop web 2.0 interface.

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

In-Memory Computing Summit

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Bringing OLTP woth OLAP: Lumos on Hadoop

Bringing OLTP woth OLAP: Lumos on Hadoop

Bringing OLTP woth OLAP: Lumos on Hadoop

DataWorks Summit

Comparação de desempenho entre SQL e NoSQL

Comparação de desempenho entre SQL e NoSQL

Comparação de desempenho entre SQL e NoSQL

Free Code Friday - Spark Streaming with HBase

Free Code Friday - Spark Streaming with HBase

Free Code Friday - Spark Streaming with HBase

MapR Technologies

Real-time Analytics with Apache Flink and Druid

Real-time Analytics with Apache Flink and Druid

Real-time Analytics with Apache Flink and Druid

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Viewers also liked (20)

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn

Graph db

GraphDB Connectors – Powering Complex SPARQL Queries

GraphDB Connectors – Powering Complex SPARQL Queries

GraphDB Connectors – Powering Complex SPARQL Queries

LinkedIn Data Infrastructure Slides (Version 2)

LinkedIn Data Infrastructure Slides (Version 2)

LinkedIn Data Infrastructure Slides (Version 2)

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem

Architecture of a Kafka camus infrastructure

Architecture of a Kafka camus infrastructure

Architecture of a Kafka camus infrastructure

Netflix Data Pipeline With Kafka

Netflix Data Pipeline With Kafka

Netflix Data Pipeline With Kafka

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

NoSQL x SQL: Bancos de Dados em Nuvens Computacionais

The Big Data Analytics Ecosystem at LinkedIn

The Big Data Analytics Ecosystem at LinkedIn

The Big Data Analytics Ecosystem at LinkedIn

Apache Kafka

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Bigger Faster Easier: LinkedIn Hadoop Summit 2015

Text Analytics & Linked Data Management As-a-Service

Text Analytics & Linked Data Management As-a-Service

Text Analytics & Linked Data Management As-a-Service

Realtime streaming architecture in INFINARIO

Realtime streaming architecture in INFINARIO

Realtime streaming architecture in INFINARIO

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

IMCSummit 2015 - Day 2 IT Business Track - Real-time Interactive Big Data Ana...

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Taming the ETL beast: How LinkedIn uses metadata to run complex ETL flows rel...

Bringing OLTP woth OLAP: Lumos on Hadoop

Bringing OLTP woth OLAP: Lumos on Hadoop

Bringing OLTP woth OLAP: Lumos on Hadoop

Comparação de desempenho entre SQL e NoSQL

Comparação de desempenho entre SQL e NoSQL

Comparação de desempenho entre SQL e NoSQL

Free Code Friday - Spark Streaming with HBase

Free Code Friday - Spark Streaming with HBase

Free Code Friday - Spark Streaming with HBase

Real-time Analytics with Apache Flink and Druid

Real-time Analytics with Apache Flink and Druid

Real-time Analytics with Apache Flink and Druid

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Databus: LinkedIn's Change Data Capture Pipeline SOCC 2012

Similar to Data Applications and Infrastructure at LinkedIn__HadoopSummit2010

UnConference for Georgia Southern Computer Science March 31, 2015

UnConference for Georgia Southern Computer Science March 31, 2015

UnConference for Georgia Southern Computer Science March 31, 2015

Christopher Curtin

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Hadoop and Voldemort @ LinkedIn

Hadoop and Voldemort @ LinkedIn

Hadoop and Voldemort @ LinkedIn

Hadoop User Group

Os Solomon

Super Sizing Youtube with Python

Super Sizing Youtube with Python

Super Sizing Youtube with Python

scale_perf_best_practices

scale_perf_best_practices

scale_perf_best_practices

Bhupeshbansal bigdata

Bhupeshbansal bigdata

Bhupeshbansal bigdata

Front Range PHP NoSQL Databases

Front Range PHP NoSQL Databases

Front Range PHP NoSQL Databases

Tackling the challenge of designing a machine learning model and putting it into production is the key to getting value back – and the roadblock that stops many promising machine learning projects. After the data scientists have done their part, engineering robust production data pipelines has its own set of challenges. Syncsort software helps the data engineer every step of the way. Building on the process of finding and matching duplicates to resolve entities, the next step is to set up a continuous streaming flow of data from data sources so that as the sources change, new data automatically gets pushed through the same transformation and cleansing data flow – into the arms of machine learning models. Some of your sources may already be streaming, but the rest are sitting in transactional databases that change hundreds or thousands of times a day. The challenge is that you can’t affect performance of data sources that run key applications, so putting something like database triggers in place is not the best idea. Using Apache Kafka or similar technologies as the backbone to moving data around doesn’t solve the problem of needing to grab changes from the source pushing them into Kafka and consuming the data from Kafka to be processed. If something unexpected happens – like connectivity is lost on either the source or the target side, you don’t want to have to fix it or start over because the data is out of sync. View this 15-minute webcast on-demand to learn how to tackle these challenges in large scale production implementations.

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Drupal Perfomance. Talk given at DrupalCamp North, 25th July 2015. This session looked at tools you can use to analyse the performance and benchmark a Drupal site. It then looked at tools and techniques that can be used to improve the site performance. The session also included a case study about the Drupal based BAFTA website that was built by Access. Focusing on the recent Film and TV awards, which saw a large amount of traffic in a short amount of time.

Drupal Performance : DrupalCamp North

Drupal Performance : DrupalCamp North

Drupal Performance : DrupalCamp North

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Financial industry companies need Java EE to power for its business today. Rakuten Card, one of the largest credit card companies in Japan, adopted Java EE 7 for its credit card core systems architecture, from one of the oldest COBOL based mainframe in Japan. Additionally, we chose Apache Spark for super rapid batch execution platform. We completed this big core system migration project successfully. You can learn why we choose Java EE, and Apache Spark for super rapid batch execution, and our experiences and lessons we learned. How to start such a the big project? Why we choose it, how we ported, how use Apache Spark for performance improvements, and launched with? We’ll answer these questions and any that you may have. Additionally, we are going to unveil our future roadmap for expanding our systems as well, with the cutting edge technology and standards.

Java ee7 with apache spark for the world's largest credit card core systems, ...

Java ee7 with apache spark for the world's largest credit card core systems, ...

Java ee7 with apache spark for the world's largest credit card core systems, ...

Rakuten Group, Inc.

Stream based data / event / message processing becomes preferred way of achieving interoperability and real-time communication in distributed SOA / microservice / database architectures. Beside lambdas, Java 8 introduced two new APIs explicitly dealing with stream data processing: - Stream - which is PULL-based and easily parallelizable; - CompletableFuture / CompletionStage - which allow composition of PUSH-based, non-blocking, asynchronous data processing pipelines. Java 9 will provide further support for stream-based data-processing by extending the CompletableFuture with additional functionality – support for delays and timeouts, better support for subclassing, and new utility methods. More, Java 9 provides new java.util.concurrent.Flow API implementing Reactive Streams specification that enables reactive programming and interoperability with libraries like Reactor, RxJava, RabbitMQ, Vert.x, Ratpack, and Akka. The presentation will discuss the novelties in Java 8 and Java 9 supporting stream data processing, describing the APIs, models and practical details of asynchronous pipeline implementation, error handling, multithreaded execution, asyncronous REST service implementation, interoperability with existing libraries. There are provided demo examples (code on GitHub) using Completable Future and Flow with: - JAX-RS 2.1 AsyncResponse, and more importantly unit-testing the async REST service method implementations; - CDI 2.0 asynchronous observers (fireAsync / @ObservesAsync);

Stream Processing with CompletableFuture and Flow in Java 9

Stream Processing with CompletableFuture and Flow in Java 9

Stream Processing with CompletableFuture and Flow in Java 9

Final deck

Beat the devil: towards a Drupal performance benchmark

Beat the devil: towards a Drupal performance benchmark

Beat the devil: towards a Drupal performance benchmark

Pedro González Serrano

Performance Analysis of Idle Programs

Performance Analysis of Idle Programs

Performance Analysis of Idle Programs

Real time analytics

Real time analytics

Real time analytics

Leandro Totino Pereira

Apache Kafka® and the Data Mesh

Apache Kafka® and the Data Mesh

Apache Kafka® and the Data Mesh

Similar to Data Applications and Infrastructure at LinkedIn__HadoopSummit2010 (20)

UnConference for Georgia Southern Computer Science March 31, 2015

UnConference for Georgia Southern Computer Science March 31, 2015

UnConference for Georgia Southern Computer Science March 31, 2015

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010

Hadoop and Voldemort @ LinkedIn

Hadoop and Voldemort @ LinkedIn

Hadoop and Voldemort @ LinkedIn

Os Solomon

Super Sizing Youtube with Python

Super Sizing Youtube with Python

Super Sizing Youtube with Python

scale_perf_best_practices

scale_perf_best_practices

scale_perf_best_practices

Bhupeshbansal bigdata

Bhupeshbansal bigdata

Bhupeshbansal bigdata

Front Range PHP NoSQL Databases

Front Range PHP NoSQL Databases

Front Range PHP NoSQL Databases

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...

Drupal Performance : DrupalCamp North

Drupal Performance : DrupalCamp North

Drupal Performance : DrupalCamp North

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Web20expo Scalable Web Arch

Java ee7 with apache spark for the world's largest credit card core systems, ...

Java ee7 with apache spark for the world's largest credit card core systems, ...

Java ee7 with apache spark for the world's largest credit card core systems, ...

Stream Processing with CompletableFuture and Flow in Java 9

Stream Processing with CompletableFuture and Flow in Java 9

Stream Processing with CompletableFuture and Flow in Java 9

Final deck

Beat the devil: towards a Drupal performance benchmark

Beat the devil: towards a Drupal performance benchmark

Beat the devil: towards a Drupal performance benchmark

Performance Analysis of Idle Programs

Performance Analysis of Idle Programs

Performance Analysis of Idle Programs

Real time analytics

Real time analytics

Real time analytics

Apache Kafka® and the Data Mesh

Apache Kafka® and the Data Mesh

Apache Kafka® and the Data Mesh

Editor's Notes

Why LinkedIn cares about derived data. Why it is hard.
Talk about what you can do
If you get bad results, I claim you are in an unsuccessful test! Still a small percentage of the quadrillion possible relationships (pairwise is hard).
What we learned
Azkaban is a workflow scheduler. What is a workflow?
Samurai rule. Logic is in jobs, not the job descriptor. Jobs are independent. Work: viz, polish.
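The notes on workflow ("cron + make", logic in jobs rather than the job descriptor, independent jobs) can be illustrated with a minimal sketch. This is not Azkaban's API or file format: the job names, the `dependencies` mapping, and the `run_order` and `run_workflow` helpers are all hypothetical, and the ordering uses Python's standard `graphlib` module (Python 3.9+) as a stand-in for the "make" half of the idea.

```python
from graphlib import TopologicalSorter

# Hypothetical descriptor: it only names jobs and what each one depends on.
# Each key is a job; its value is the set of jobs that must finish first.
dependencies = {
    "extract": set(),
    "compute_scores": {"extract"},
    "build_store": {"compute_scores"},
    "notify": {"build_store"},
}


def run_order(deps):
    """Return one valid order in which every job follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def run_workflow(deps, jobs):
    """Call each job once, in dependency order, and return the names run."""
    executed = []
    for name in run_order(deps):
        jobs[name]()  # the logic lives in the job, not in the descriptor
        executed.append(name)
    return executed


# Stand-in jobs that only record that they ran (assumption for illustration).
log = []
jobs = {name: (lambda n=name: log.append(n)) for name in dependencies}
order = run_workflow(dependencies, jobs)
print(order)
```

Because the descriptor holds only names and dependencies, a job can be swapped or rerun on its own without touching the others. A real scheduler would add the "cron" half (time-based triggering) plus retries and logging.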