Brisk hadoop june2011

Brisk: truly peer-to-peer Hadoop. Brisk is an open-source Hadoop & Hive distribution that uses Apache Cassandra for its core services and storage. Brisk makes it possible to run Hadoop MapReduce on top of CassandraFS, an HDFS-compatible storage layer. By replacing HDFS with CassandraFS, users run MapReduce jobs on Cassandra's peer-to-peer, fault-tolerant and scalable architecture. With CassandraFS all nodes are peers: data files can be loaded through any node in the cluster, and any node can serve as the JobTracker for MapReduce jobs. The Hive MetaStore is stored & accessed as just another column family (table) on the distributed data store. Brisk makes Hadoop truly peer-to-peer. We demonstrate visualisation & monitoring of Brisk using OpsCenter. The operational simplicity of Cassandra's multi-datacenter & multi-region aware replication makes Brisk well suited to a rich set of applications and use cases. And because HDFS & online data can be stored and isolated within the same cluster, Brisk makes analytics possible without ETL! LA Scalability Talk, Mahalo, May 31, 2011.

Brisk: Truly peer-to-peer Hadoop


  srisatish.ambati AT gmail.com
  DataStax/OpenJDK
  @srisatish

Brisk: Hive + Hadoop + Cassandra

Have large sets of data & you can
work on small pieces in parallel.
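
As a rough illustration of the idea on this slide, here is a minimal single-machine sketch in plain Java (not Hadoop's API). The class name `WordCountSketch` and the sample input are hypothetical. The "map" step splits each line into words and the "reduce" step sums the counts per word.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: map/reduce in miniature, using Java streams.
public class WordCountSketch {

    // Map: split every line into words. Reduce: count occurrences per word.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.parallelStream()   // lines may be processed in parallel
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        word -> word,           // group by the word itself
                        TreeMap::new,           // sorted map for stable output
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "brisk runs hadoop",
                "hadoop runs on cassandra");
        System.out.println(wordCount(lines));
    }
}
```

A real MapReduce job spreads the same two phases across many machines. The sketch only shows why independent small pieces make parallel work straightforward.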

Multicore map reduce framework,
Kunle, et al

Write-once-read-many!
A file, once created, written & closed, need not change

DataNodes: Read, Write Blocks

NameNode:
Single Master node
Single Machine Address space
Single Point of failure

When “it” does not fit in a single node!
… Enter the distributed dragon!

Enter the Cassandra:
High Scale
Peer-to-peer

Cassandra:
High Scale
Peer-to-peer

Portfolio Demo
Low latency
Live tick prices for stocks.
Batch Analytics
Historical EOD prices.
Value at Risk.

http://www.datastax.com/docs/0.8/brisk/brisk_demo

Demo URLs (good for this demo only)

http://ec250194143.compute1.amazonaws.com:8888/opscenter/index.html
http://ec26720212176.compute1.amazonaws.com:50030/jobdetails.jsp?job
http://ec250194143.compute1.amazonaws.com:8983/portfolio/

Dynamo, 2007
Bigtable, 2006

OSS, 2008

Incubator, 2009 TLP, 2010

Cassandra:
High Scale
Peer-to-peer
No SPOF

[Diagram: nodes labeled Y, A, W, U, F, T, L, P; Key “C”]

YDH security edition (soon to be Apache)
Apache Hive: SQL-like access
CassandraHandler
Cassandra 0.8

Use ColumnFamilies
inode
sblock

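import org.apache.cassandra.thrift.CfDef;

// Column-family definitions for CassandraFS (fragment, as shown on the slide)
String keyspace = "cfs";
CfDef cf = new CfDef();
cf.setName(inodeDefaultCf);
cf.setComparator_type("BytesType");
…

cf.setName(sblockDefaultCf);
cf.setKey_cache_size(1000000); // written as "1M" on the slide
cf.setComment("Stores blocks of information associated with a inode");

cf.setKeyspace(keyspace);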

Consistency: R + W > N

"brisk.consistencylevel.read", "QUORUM";
"brisk.consistencylevel.write", "QUORUM";
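
To make the R + W > N rule concrete, here is a small arithmetic sketch in Java. The class name `QuorumSketch` and the replication factor of 3 are assumptions for illustration; they are not taken from Brisk's code. It shows why QUORUM reads and QUORUM writes always overlap on at least one replica.

```java
// Hypothetical illustration of the R + W > N consistency rule.
public class QuorumSketch {

    // Quorum for n replicas: a strict majority, floor(n / 2) + 1.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // True when any r replicas read and any w replicas written must share a node.
    static boolean overlaps(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;              // assumed replication factor
        int q = quorum(n);      // 2 when n is 3
        System.out.println("quorum = " + q);
        System.out.println("QUORUM/QUORUM overlaps: " + overlaps(q, q, n)); // 2 + 2 > 3
        System.out.println("ONE/ONE overlaps: " + overlaps(1, 1, n));       // 1 + 1 <= 3
    }
}
```

With both settings at QUORUM, a read contacts at least one replica that acknowledged the latest write. Weaker levels trade that guarantee for latency.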

Hadoop:
job tracker, task tracker

BriskSnitch:
brisk nodes, cassandra nodes

BriskSimpleSnitch.java

if (TrackerInitializer.isTrackerNode) {
    myDC = BRISK_DC;
    logger.info("Detected Hadoop trackers are enabled, setting my DC to " + myDC);
} else {
    myDC = CASSANDRA_DC;
    logger.info("Looks like Vanilla Cassandra nodes, setting my DC to " + myDC);
}

Hive: SQL-like access
cli, hwi, jdbc, metastore
Pushdown predicates (v beta2)

hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);

hive> LOAD DATA LOCAL INPATH '$BRISK_HOME/resources/hive/examples/files/kv2.txt'
      OVERWRITE INTO TABLE invites PARTITION (ds='20080815');

hive> SELECT count(*), ds FROM invites GROUP BY ds;

http://www.datastax.com/docs/0.8/brisk/about_hive

ETL
Realtime
Cassandra CFs
DataCenters
Scale

No me in team!
● Ben Coverston ● Michael Allen
● Ben Werther ● Mike Bulman
● Brandon Williams ● Michael Weir
● Cathy Daw ● Nate McCall
● Daria Hutchinson ● Nick M Bailey
● Jackson Chung ● Patricio Echague
● Jake Luciani ● Tyler Hobbs
● Joaquin Casares ● SriSatish Ambati
● Jonathan Ellis ● Yewei Zhang

100-node Brisk cluster on OpsCenter

Dynamo, 2007 + Bigtable, 2006

Cassandra: OSS, 2008; Incubator, 2009; TLP, 2010

Cassandra + Hadoop + Hive = Brisk

git clone git@github.com:riptano/brisk.git
http://www.datastax.com/product/brisk
Getting Started via Brisk AMI.
Mahalo. Thank You.

References
● MapReduce: Simplified Data Processing on Large Clusters, Jeffrey Dean and Sanjay Ghemawat, 2004. http://bit.ly/googmr_pdf
● Multicore MapReduce, Kunle, et al. http://bit.ly/iRJd1n

When Mifepristone and Misoprostol are used, 50% of patients complete in 4 to 6 hours; 75% to 80% in 12 hours; and 90% in 24 hours. We use a regimen that allows for completion without the need for surgery 99% of the time. All advanced second trimester and late term pregnancies at our Tampa clinic (17 to 24 weeks or greater) can be completed within 24 hours or less 99% of the time without the need surgery. The procedure is completed with minimal to no complications. Our Women's Health Center located in Abu Dhabi, United Arab Emirates, uses the latest medications for medical abortions (RU-486, Mifeprex, Mifegyne, Mifepristone, early options French abortion pill), Methotrexate and Cytotec (Misoprostol). The safety standards of our Abu Dhabi, United Arab Emirates Abortion Doctors remain unparalleled. They consistently maintain the lowest complication rates throughout the nation. Our Physicians and staff are always available to answer questions and care for women in one of the most difficult times in their lives. The decision to have an abortion at the Abortion Cl

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@

Scaling API-first – The story of a global engineering organization Ian Reasor, Senior Computer Scientist - Adobe Radu Cotescu, Senior Computer Scientist - Adobe Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024) ------ Check out our conferences at https://www.apidays.global/ Do you want to sponsor or talk at one of our conferences? https://apidays.typeform.com/to/ILJeAaV8 Learn more on APIscene, the global media made by the community for the community: https://www.apiscene.io Explore the API ecosystem with the API Landscape: https://apilandscape.apiscene.io/

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe

apidays

Three things you will take away from the session: • How to run an effective tenant-to-tenant migration • Best practices for before, during, and after migration • Tips for using migration as a springboard to prepare for Copilot in Microsoft 365 Main ideas: Migration Overview: The presentation covers the current reality of cross-tenant migrations, the triggers, phases, best practices, and benefits of a successful tenant migration Considerations: When considering a migration, it is important to consider the migration scope, performance, customization, flexibility, user-friendly interface, automation, monitoring, support, training, scalability, data integrity, data security, cost, and licensing structure Next Wave: The next wave of change includes the launch of Copilot, which requires businesses to be prepared for upcoming changes related to Copilot and the cloud, and to consolidate data and tighten governance ShareGate: ShareGate can help with pre-migration analysis, configurable migration tool, and automated, end-user driven collaborative governance

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff

sammart93

Automating Google Workspace (GWS) & more with Apps Script

wesley chun

Brisk hadoop june2011

1. Brisk: Truly peer-to-peer Hadoop. srisatish.ambati AT gmail.com, DataStax/OpenJDK, @srisatish

2. Brisk: Hive + Hadoop + Cassandra

3. MapReduce

4. When you have large sets of data, you can work on small pieces in parallel.
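
The idea on slide 4 can be illustrated with a minimal single-machine sketch. This is not Hadoop or Brisk code: the class name `WordCountSketch` and the sample input are hypothetical, and Java parallel streams stand in for a cluster. Each line is split independently (the map step), and the per-word counts are then merged (the reduce step).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Map step: split each line into words, independently and in parallel.
    // Reduce step: group identical words and count them.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Hypothetical input, standing in for blocks of a large file.
        List<String> lines = Arrays.asList("brisk runs hadoop", "hadoop runs hive");
        System.out.println(wordCount(lines));
    }
}
```

In a real MapReduce job the lines would live in blocks spread across nodes, and the framework would schedule the map work next to the data rather than moving the data to one machine.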

5. MapReduce

6. Multicore MapReduce framework, Kunle, et al.

7. Parallel Execution View

8.

9.

10. JobTracker, NameNode, HDFS

11. Write-once-read-many! A file, once created, written, and closed, need not change.

12. Move computation, not data

13.

14. DataNodes: read and write blocks

15. NameNode: single master node, single machine address space, single point of failure

16. When "it" does not fit in a single node! … Enter the distributed dragon! Enter the Cassandra: high scale, peer-to-peer

17. NameNode, DataNodes

18. One kind of node!

19. Cassandra: high scale, peer-to-peer

20. Portfolio Demo. Low latency: live tick prices for stocks. Batch analytics: historical EOD prices, Value at Risk. http://www.datastax.com/docs/0.8/brisk/brisk_demo

21. Demo URLs (good for this demo only) http://ec250194143.compute1.amazonaws.com:8888/opscenter/index.html http://ec26720212176.compute1.amazonaws.com:50030/jobdetails.jsp?job http://ec250194143.compute1.amazonaws.com:8983/portfolio/

22. Dynamo, 2007; Bigtable, 2006; OSS, 2008; Incubator, 2009; TLP, 2010

23. Cassandra: high scale, peer-to-peer, no SPOF. [Ring diagram: key "C" placed on a ring of nodes labeled A, F, L, P, T, U, W, Y]

24.

25.

26. Brisk

27. Brisk: HowStuffWorks version

28. YDH security edition (soon to be Apache). Apache Hive: SQL-like access. CassandraHandler. Cassandra 0.8.

29. Use ColumnFamilies: inode, sblock

30.
    String keyspace = "cfs";
    CfDef cf = new CfDef();
    cf.setName(inodeDefaultCf);
    cf.setComparator_type("BytesType");
    …
    cf.setName(sblockDefaultCf);
    cf.setKey_cache_size(1000000); // written as "1M" on the slide
    cf.setComment("Stores blocks of information associated with a inode");
    cf.setKeyspace(keyspace);

31. Consistency: R + W > N
    "brisk.consistencylevel.read", "QUORUM";
    "brisk.consistencylevel.write", "QUORUM";
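
The rule on slide 31 can be checked with simple arithmetic. The sketch below is illustrative only and is not part of Brisk: the class `QuorumCheck` and its method names are hypothetical, and the replication factor of 3 is an assumption. It shows why QUORUM reads combined with QUORUM writes satisfy R + W > N, so that every read overlaps at least one replica that holds the latest write.

```java
public class QuorumCheck {
    // Quorum size for replication factor n: a strict majority of replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // True when any r replicas read must overlap any w replicas written (R + W > N).
    static boolean overlaps(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3; // assumed replication factor, not taken from the slides
        int q = quorum(n);
        System.out.println("N=" + n + " QUORUM=" + q + " overlaps=" + overlaps(q, q, n));
        System.out.println("ONE/ONE overlaps=" + overlaps(1, 1, n));
    }
}
```

With N = 3, a quorum is 2 replicas, and 2 + 2 > 3 holds. Reading and writing at a single replica each gives 1 + 1 = 2, which does not exceed 3, so a read may miss the latest write.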

32. Hadoop: JobTracker, TaskTracker

33. BriskSnitch: Brisk nodes, Cassandra nodes

34. BriskSimpleSnitch.java
    if (TrackerInitializer.isTrackerNode) {
        myDC = BRISK_DC;
        logger.info("Detected Hadoop trackers are enabled, setting my DC to " + myDC);
    } else {
        myDC = CASSANDRA_DC;
        logger.info("Looks like Vanilla Cassandra nodes, setting my DC to " + myDC);
    }

35. Hive: SQL-like access. cli, hwi, jdbc, metastore. Pushdown predicates (v beta2)

36.
    hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
    hive> LOAD DATA LOCAL INPATH '$BRISK_HOME/resources/hive/examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='20080815');
    hive> SELECT count(*), ds FROM invites GROUP BY ds;
    http://www.datastax.com/docs/0.8/brisk/about_hive

37. ETL, realtime, Cassandra CFs, datacenters, scale

38.

39. No me in team!
    ● Ben Coverston
    ● Michael Allen
    ● Ben Werther
    ● Mike Bulman
    ● Brandon Williams
    ● Michael Weir
    ● Cathy Daw
    ● Nate McCall
    ● Daria Hutchinson
    ● Nick M Bailey
    ● Jackson Chung
    ● Patricio Echague
    ● Jake Luciani
    ● Tyler Hobbs
    ● Joaquin Casares
    ● SriSatish Ambati
    ● Jonathan Ellis
    ● Yewei Zhang

40. 100-node Brisk cluster on OpsCenter

41. Dynamo, 2007 + Bigtable, 2006; OSS, 2008; Incubator, 2009; TLP, 2010; Cassandra + + Brisk

42. Getting started:
    git clone git@github.com:riptano/brisk.git
    http://www.datastax.com/product/brisk
    Getting started via the Brisk AMI.
    Mahalo. Thank you.

43. References
    ● Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, 2004. http://bit.ly/googmr_pdf
    ● Kunle, et al., Multicore MapReduce. http://bit.ly/iRJd1n
