HADOOP & DISTRIBUTED CLOUD COMPUTING: DATA PROCESSING IN CLOUD
Presentation by: Rajan Kumar Upadhyay || firstname.lastname@example.org
CLOUD COMPUTING?
Cloud computing is a virtual setup that includes the following:
- Delivery of computing as a service rather than as a product
- Shared resources (software, utilities, hardware) provided over a network (typically the Internet)
[Diagram: delivery of computing, public utilities, shared resources]
DISTRIBUTED CLOUD COMPUTING
As the name explains: distributed computing in the cloud.
Examples:
• Distributed computing means using many networked computers to partition a question or problem (split it into many smaller pieces) and letting the network solve it piecemeal.
• Software like Hadoop. Written in Java, Hadoop is a scalable, efficient, distributed software platform designed to process enormous amounts of data. Hadoop can scale to thousands of computers across many clusters.
• Another instance of distributed computing, for storage instead of processing power, is BitTorrent. A torrent is a file that is split into many pieces and stored on many computers around the Internet. When a local machine wants to access that file, the small pieces are retrieved and reassembled.
• P2P networks, which split communication or data packages into multiple pieces, send them across multiple network routes, and then reassemble them at the receiver's end.
Distributed computing on the cloud is a next-generation framework for getting the maximum value out of resources spread over a distributed architecture.
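To make the "partition the problem and solve it piecemeal" idea concrete, here is a minimal sketch. It is only an illustration: worker threads on one machine stand in for networked computers, and the function names `split_into_chunks` and `distributed_sum` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Partition a list into at most num_chunks contiguous pieces."""
    size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def distributed_sum(numbers, workers=4):
    """Sum a list by handing each chunk to a separate worker.

    Assumption: threads play the role of the networked machines.
    """
    chunks = split_into_chunks(numbers, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker solves its own small piece of the problem.
        partial_sums = list(pool.map(sum, chunks))
    # Combine the partial answers into the final answer.
    return sum(partial_sums)


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101))))
```

The same shape (split, solve the pieces independently, combine) is what Hadoop applies at the scale of a cluster.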
WHAT IS HADOOP
Flexible infrastructure for large-scale computation and data processing on a network of commodity hardware.
Why Hadoop?
A common infrastructure pattern extracted from building distributed systems:
• Scale
• Incremental growth
• Cost
• Flexibility
• Distributed File System
• Distributed Processing Framework
• Apache.org Open Source project
• Yahoo!, Facebook, Google, Fox, Amazon, IBM, and the NY Times use it for their core infrastructure
• Widely adopted: a valuable and reusable skill set
  - Taught at major universities
  - Easier to hire for
  - Easier to train on
  - Portable across projects and groups
HOW IT WORKS
HDFS: Hadoop Distributed File System
A distributed file system for large data
• Your data in triplicate (one local and two remote copies)
• Built-in redundancy and resiliency to large-scale failures (automated restart and re-allocation)
• Intelligent distribution, striping across racks
• Accommodates very large data sizes on commodity hardware
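The "data in triplicate" idea can be pictured with a toy model like the one below. This is not the HDFS API or its real placement policy; the block size, the rack layout, and the names `split_into_blocks` and `place_replicas` are assumptions chosen to keep the example small.

```python
BLOCK_SIZE = 4  # bytes; unrealistically small, for illustration only


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(local_node, racks):
    """Pick one local copy plus two remote copies for a block.

    racks maps a rack name to the list of nodes on that rack.
    Assumption: there is at least one other rack with two or more nodes.
    """
    local_rack = next(r for r, nodes in racks.items() if local_node in nodes)
    remote_rack = next(r for r in racks if r != local_rack)
    # One local copy, then two copies on a different rack.
    return [local_node] + racks[remote_rack][:2]


if __name__ == "__main__":
    racks = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
    for block in split_into_blocks(b"hello hadoop"):
        print(block, place_replicas("n1", racks))
```

Keeping the remote copies on another rack is what lets the data survive the loss of a whole rack, not just a single machine.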
PROGRAMMING MODEL
There are various programming models for Hadoop development. I personally like, and have experience with, Map/Reduce.
Why Map/Reduce:
• Simple programming technique:
  • Map(anything) -> key, value
  • Sort, partition on key
  • Reduce(key, value) -> key, value
• No parallel-processing or message-passing semantics for the programmer to manage
• Programmable in Java or any other language
Continued …
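The three steps above (map, sort/partition on key, reduce) can be sketched in plain Python with a word count. This is a single-process illustration, not Hadoop's API; the function names `map_words`, `shuffle`, `reduce_counts`, and `run_job` are made up for this example.

```python
from collections import defaultdict


def map_words(line):
    """Map(anything) -> key, value: emit (word, 1) for every word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Sort and partition on key: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(key, values):
    """Reduce(key, values) -> key, value: add up the counts for one word."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_words(line))
    return [reduce_counts(key, values) for key, values in shuffle(mapped)]


if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

Notice that neither `map_words` nor `reduce_counts` contains any threading or messaging code. In a real cluster the framework supplies that part, which is the point of the "no parallel processing / message passing semantics" bullet.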
PROGRAMMING MODEL
[Flow diagram, stages in order]
• Create/allocate cluster
• Put data into file system: data is split into blocks, stored in triplicate across your cluster
• Move computation to data: your Map code is copied to the allocated nodes, preferring nodes that contain copies of your data
• Program execution
• Gather output of map, sort or partition on key
• Run reduce task
• Results of job stored on HDFS
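The "move computation to data" stage can be illustrated with a small greedy sketch that prefers nodes already holding a copy of a block. This is a simplification and not Hadoop's actual scheduler; the function name `assign_map_tasks`, the one-task-per-node rule, and the node names are assumptions for this example.

```python
def assign_map_tasks(block_locations, free_nodes):
    """Assign each block's map task to a node, preferring data-local nodes.

    block_locations maps a block id to the nodes holding a replica of it.
    free_nodes lists the nodes with a free slot (one task per node here).
    Returns {block id: (chosen node, whether the node holds the data)}.
    """
    available = list(free_nodes)
    assignment = {}
    for block, holders in block_locations.items():
        if not available:
            raise RuntimeError("no free node left for block %s" % block)
        # Prefer a free node that already stores a replica of this block.
        local = [node for node in holders if node in available]
        chosen = local[0] if local else available[0]
        assignment[block] = (chosen, chosen in holders)
        available.remove(chosen)
    return assignment


if __name__ == "__main__":
    blocks = {
        "b1": ["n1", "n2", "n3"],
        "b2": ["n1", "n2", "n4"],
        "b3": ["n1", "n5", "n6"],
    }
    print(assign_map_tasks(blocks, ["n1", "n2", "n7"]))
```

When no data-local node is free, the task falls back to any free node and the block has to be read over the network, which is the cost that triplicate storage helps to avoid.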
PRACTICES
• Put large data source into HDFS
• Perform aggregations, transformations, and normalizations on the data
• Load into RDBMS
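The last two practices (aggregate and normalize, then load into an RDBMS) can be sketched as below. This is an assumption-heavy miniature: an in-memory SQLite database stands in for the RDBMS, a Python list stands in for data read from HDFS, and the record format, table name `category_totals`, and function names are invented for the example.

```python
import sqlite3
from collections import defaultdict


def aggregate(records):
    """Normalize category names and total the amounts per category.

    Each record is a hypothetical "date,category,amount" text line.
    """
    totals = defaultdict(float)
    for record in records:
        _date, category, amount = record.split(",")
        totals[category.strip().lower()] += float(amount)
    return sorted(totals.items())


def load_into_rdbms(rows, conn):
    """Load the aggregated (category, total) rows into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS category_totals "
        "(category TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO category_totals VALUES (?, ?)", rows
    )
    conn.commit()


if __name__ == "__main__":
    raw = ["2012-01-01,Books,10.0", "2012-01-02,books,5.0", "2012-01-02,Music,2.5"]
    connection = sqlite3.connect(":memory:")
    load_into_rdbms(aggregate(raw), connection)
    print(connection.execute("SELECT * FROM category_totals").fetchall())
```

The design idea is that the heavy lifting happens on the large raw data before the load, so the RDBMS only receives the small, already aggregated result.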
THANK YOU
Thank you for reading this. I hope you find it useful. Please contact me at email@example.com if you have any queries or feedback.
My name is Rajan Kumar Upadhyay, and I have more than 10 years of collective IT experience as a techie. If you have anything to share or are looking for consulting, please feel free to contact me.