Distributed Data Processing Workshop - SBU
1. Distributed Data Processing Workshop
Shahid Beheshti University Campus
Faculty of Computer Science and Engineering
Course: Distributed Databases
Instructor: Dr. Hadi Tabatabaei
Presenter: Abolfazl Sedighi
Aban 1393 (October/November 2014)
2. Distributed Data Processing
School of Computer Science and Engineering
A. Sedighi
@amirsedighi
Hexican.com
sedighi@gmail.com
9. Introduction
This is a four-session, hands-on, step-by-step
tutorial on setting up a Linux cluster on your
machine (notebook or PC) to try out a few
big-data processing frameworks and tools.
10. What are we going to do?
● A notebook or a PC is enough to get started.
– Setting up your Linux cluster.
● Distributed log management and real-time search engines
– What is Elasticsearch?
– Elasticsearch on the cluster.
– Monitoring and usage.
● The most popular distributed data processing framework
– What is Apache Hadoop?
– Apache Hadoop on the cluster.
– Usage scenarios.
11. What will we learn?
● Extending our knowledge of big data.
● Getting familiar with distributed data processing.
● Maximizing availability and reliability.
● Increasing data storage capacity.
● Improving data processing performance.
● Data locality is a silver bullet.
● Increasing cluster utilization.
● Taming giants by giving them a try.
17. Preparing the Cluster – First Node
● Create a Linux machine inside VirtualBox.
● Install Linux. (I've used Ubuntu 12.04.)
– Check Samba
– Check OpenSSH
● Give the first node everything it needs.
– An “install” folder on it.
– Prerequisites such as Java installed on it.
● Shut down the first node.
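Before shutting the first node down, it can help to confirm that the prerequisites are really there, since every clone inherits whatever is missing. The following is a minimal sketch, not part of the original workshop: the helper name check_tools is hypothetical, and it assumes that sshd (OpenSSH server), smbd (Samba) and java are on the PATH when installed.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named command is on PATH.
# Returns the number of missing tools as its exit status.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Assumed command names for the tools the slide mentions.
check_tools sshd smbd java || echo "install the missing packages before cloning"
```

On Ubuntu the corresponding packages would typically be openssh-server, samba, and a JDK; check the exact package names for your release.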
18. Preparing the Cluster – Cloning, the VirtualBox Side
● Cloning the first node. (tutorial)
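Cloning can also be done from the host's command line instead of the VirtualBox GUI. The sketch below only prints the command so it can be reviewed first; the VM names node1 and node2 and the helper clone_cmd are assumptions for illustration, while clonevm, --name and --register are standard VBoxManage options. The source VM should be powered off, as on the previous slide.

```shell
#!/bin/sh
# Hypothetical helper: print the VBoxManage command that clones
# BASE_VM into a new VM called NEW_VM and registers it with VirtualBox.
clone_cmd() {
    base_vm=$1
    new_vm=$2
    printf 'VBoxManage clonevm "%s" --name "%s" --register\n' "$base_vm" "$new_vm"
}

# Assumed VM names; replace them with the names shown by "VBoxManage list vms".
clone_cmd node1 node2
```

Running the printed command on the host performs the clone. Whether the clone gets new MAC addresses depends on the VirtualBox version and clone options, so it is worth checking in the VM's network settings.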
19. Preparing the Cluster – Cloning, the Linux Side
● Turning the new node on.
● Network configuration
– sudo nano /etc/hosts
– sudo nano /etc/hostname
– sudo nano /etc/network/interfaces
– sudo rm /etc/udev/rules.d/70-persistent-net.rules
● sudo reboot
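The files edited above need consistent content on every node. As a rough sketch under stated assumptions (nodes named node1, node2, ...; a 192.168.56.0/24 network where node N gets 192.168.56.(100+N); interface eth0), the helpers below print what /etc/hosts and /etc/network/interfaces might contain. None of these names or addresses come from the original slides. The persistent-net rules file is removed because it pins the original node's MAC address to eth0; if the clone has a new MAC, its interface would otherwise come up under a different name.

```shell
#!/bin/sh
# Assumed addressing scheme: node N gets 192.168.56.(100+N).
node_ip() {
    echo "192.168.56.$((100 + $1))"
}

# Print /etc/hosts lines for nodes 1..COUNT (same on every node).
hosts_entries() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        echo "$(node_ip "$i") node$i"
        i=$((i + 1))
    done
}

# Print a static stanza for /etc/network/interfaces on node N.
iface_stanza() {
    printf 'auto eth0\niface eth0 inet static\n    address %s\n    netmask 255.255.255.0\n' "$(node_ip "$1")"
}

hosts_entries 3
iface_stanza 2
```

The output is meant to be pasted into the files by hand while editing them with nano; /etc/hostname simply holds the node's own name, for example node2.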
21. Preparing the Cluster – Distributed Shell
● Do it like a Commander
– Installing DSH (Optional)
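DSH reads its target hosts from a machines list, one hostname per line. The sketch below generates such a list for nodes named node1..nodeN (an assumed naming scheme, with a hypothetical helper machines_list) and shows, as comments, how it might be installed and used. It also assumes that passwordless SSH between the nodes is already set up.

```shell
#!/bin/sh
# Hypothetical helper: print one hostname per line, the format DSH
# expects in its machines.list file.
machines_list() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        echo "node$i"
        i=$((i + 1))
    done
}

machines_list 3

# On the node you control the cluster from (not executed here):
#   sudo apt-get install dsh
#   mkdir -p ~/.dsh && machines_list 3 > ~/.dsh/machines.list
#   dsh -a -M -c -r ssh -- uptime
```

Here -a targets all listed machines, -M prefixes each output line with the machine name, -c runs the command concurrently, and -r ssh selects SSH as the remote shell.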
22. Preparing the Cluster – Enjoy It
● To scale your cluster, just repeat the cloning
step.
23. Next?
● An introduction to distributed log management
and analytical search engines.
– How does Elasticsearch work?
– Workshop.
● An introduction to Apache Hadoop
– How does Apache Hadoop work?
– Workshop.