Hadoop for the disillusioned

This document discusses Hadoop and its use for managing large and growing amounts of data beyond what traditional systems can handle. It outlines the different layers, technologies, and distributions that make up Hadoop platforms today. It notes that while many organizations initially adopt Hadoop to save money on data management and queries, they then face the challenge of determining the next steps. It recommends asking domain experts what unanswered questions they have and finding ways to obtain the necessary data to answer those questions, either from within or outside the organization. Building data products is also presented as a way for organizations to explore their data assets. A few examples of real-world Hadoop uses are briefly described.

Hadoop for the disillusioned
Steve Watt, Red Hat

CC flickr rubenswieringa

@wattsteve

Hadoop in 2013

Platform Layers and Technologies

Computational Runtimes: YARN, GiRAPH, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger & more
FileSystems: Azure, CassandraFS, CephFS, CleverSafe, GlusterFS, GridGain, HDFS, Lustre, MapR FS, S3, SWIFT, Quantcast FS, Symantec VCFS & more
Infrastructures: System on a Chip, x86, Virtualization and Cloud
Distributions: Cloudera, Hortonworks, IBM, Intel, MapR, WanDisco

CC flickr lowfatbrains

Your data is growing beyond your ability to manage & query it

CC flickr kakadu

Save money when asking the same questions of your data

CC flickr martijnsnels

Hadoop Customer: “Great, but now what?”

[Figure: Geoffrey Moore’s Technology Adoption Lifecycle: Innovators, Early Adopters, CHASM, Early Majority, Late Majority, Laggards]

new
and build data products

CC flickr cbcastro

• Ask your domain experts and line-of-business (LOB) folks what unanswered questions they have.
• Where can you get the data you need to answer each question? (Domain experts should know where to get it.)
• Some of this data may be outside your organization (social media, sensor data, data brokerages/marketplaces, web pages) and some of it may be inside.
• If the data for the query doesn’t exist, figure out how to instrument or gather it.
• Pair your domain experts with your data engineers so they can work out how to obtain and massage the data given the types of queries desired.

CC flickr birdwatcher63

• Building data products is a similar exercise, except that it involves typical product planning, such as identifying a market.
• This is also a great way for an organization to explore what assets it has within its data.

CC flickr syume

Mapping the night sky

CC flickr bobfamiliar

Analyzing farm soil content to predict human conflict

CC flickr oxfam

Crisis Management for the Chilean Earthquake

CC flickr flodigrip

Thanks for listening

Steve Watt

swatt@redhat.com

@wattsteve

Hadoop for the disillusioned

1. Hadoop for the disillusioned Steve Watt, Red Hat CC flickr rubenswieringa @wattsteve

2. @wattsteve

3. Wired Magazine - July 2008 @wattsteve

4. Hadoop in 2013 (platform layers and their technologies)
• Computational Runtimes: YARN, GiRAPH, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger & more
• FileSystems: Azure, CassandraFS, CephFS, CleverSafe, GlusterFS, GridGain, HDFS, Lustre, MapR FS, S3, SWIFT, Quantcast FS, Symantec VCFS & more
• Infrastructures: System on a Chip, x86, Virtualization and Cloud
• Distributions: Cloudera, Hortonworks, IBM, Intel, MapR, WanDisco
CC flickr lowfatbrains @wattsteve

5. Source: Gartner Hype Cycle @wattsteve

6. Your data is growing beyond your ability to manage & query it CC flickr kakadu @wattsteve

7. Save money when asking the same questions of your data CC flickr martijnsnels @wattsteve

8. Hadoop Customer, “Great, but now what?” [Figure: Geoffrey Moore’s Technology Adoption Lifecycle, showing Innovators, Early Adopters, the chasm, Early Majority, Late Majority, Laggards] @wattsteve

9. new and build data products CC flickr cbcastro @wattsteve

10.
• Ask your domain experts and line-of-business (LOB) folks what unanswered questions they have.
• Where can you get the data you need to answer that question? (Domain experts should know where to get it.)
• Some of this data may be outside your organization (social media, sensor data, data brokerages/marketplaces, web pages) and some of it may be inside.
• If the data for the query doesn’t exist, figure out how to instrument or gather it.
• Pair your domain experts with your data engineers so they can work out how to obtain and massage the data given the types of queries desired.
CC flickr birdwatcher63 @wattsteve

11.
• Building data products is a similar exercise, except that it involves typical product planning, such as identifying a market.
• This is also a great way for an organization to explore what assets it has within its data.
CC flickr syume @wattsteve

12. Mapping the night sky CC flickr bobfamiliar @wattsteve

13. Analyzing farm soil content to predict human conflict CC flickr oxfam @wattsteve

14. Crisis Management for the Chilean Earthquake CC flickr flodigrip @wattsteve

15. Thanks for listening Steve Watt swatt@redhat.com @wattsteve

Editor's notes

Hadoop is not new. NY Times source: http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/
Wired source: http://www.wired.com/wired/issue/16-07
Source: Gartner Hype Cycle, http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp. Typical objections: “Big Data is a fad”, “It’s just BI 2.0”, “This is all just hype”, “We can’t figure out how to use it”, “There’s nothing new here”, “It’s not ready”, “Too few support options”, “It’s too hard”
- You’re sharding your RDBMS infrastructure and it’s becoming brittle and a nightmare to maintain.
- Twitter has a good quote where they stated it used to take them 2 weeks to run an ALTER TABLE statement.
Using Hadoop for ETL to save money by displacing ETL vendors. Using Hive to offload datasets and their corresponding queries from your EDW and lower your EDW bill.
A great way to competitively differentiate with arbitrarily structured data
Hadoop’s power is in its single storage repository and its support for arbitrary data structures. You have the technology to ask any question if you just have the data.
http://escience.washington.edu/get-help-now/astronomical-image-processing-hadoop
http://strataconf.com/stratany2013/public/schedule/detail/30810
http://vimeo.com/16861296
