An introduction to the Apache Hadoop command set.
What commands are available and what do they do? A
brief introduction to each command, without in-depth
detail.
An introduction to the Apache Hadoop command set
1. Apache Hadoop Command Set
● What types ?
● What are they ?
● What do they do ?
● Environment
● Configuration
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
2. Hadoop commands – What types ?
● User commands
● Administration commands
● Generic options for all commands
● Configuration options
● Environment
– Variables e.g. HADOOP_PREFIX
– Aliases e.g. hls = hadoop fs -ls
3. Hadoop commands – What are they ?
User Commands
● archive – save files to a har archive
● distcp – copy files or directories recursively
● fs – file system commands
– cat – copies file to stdout
– chgrp – change group associated with file
– chmod – change file permissions
– chown – change file ownership
– copyFromLocal – copy from local file reference
– copyToLocal – copy to local file reference
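The fs subcommands listed so far might be used as in the sketch below. The `hdo` helper, the `DRY_RUN` switch, and all paths and names (`/user/demo`, `notes.txt`, `demo`, `hadoop` as a group) are illustrative assumptions, not part of the original slides. By default the helper only prints each command and runs nothing.

```shell
# hdo: hypothetical helper that prints a hadoop command and only executes it
# when DRY_RUN=0 is set and a hadoop binary is on the PATH.
hdo() {
  echo "+ hadoop $*"
  if [ "${DRY_RUN:-1}" = "0" ] && command -v hadoop >/dev/null 2>&1; then
    hadoop "$@"
  fi
}

# Copy a local file into HDFS (all paths are made-up examples)
hdo fs -copyFromLocal notes.txt /user/demo/notes.txt
# Print the HDFS file to stdout
hdo fs -cat /user/demo/notes.txt
# Change permissions, owner and group
hdo fs -chmod 644 /user/demo/notes.txt
hdo fs -chown demo /user/demo/notes.txt
hdo fs -chgrp hadoop /user/demo/notes.txt
# Copy the HDFS file back to the local file system
hdo fs -copyToLocal /user/demo/notes.txt notes.copy.txt
```

Running the script with `DRY_RUN=0` on a machine with Hadoop installed would execute the commands instead of only printing them.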
4. Hadoop commands – What are they ?
User Commands
● fs – file system commands
– count – count of dir / files/ bytes
– cp – copy files
– du – size of files and directories
– dus – display a summary of file lengths
– expunge – empty trash
– get – copy files to local file system
– getmerge – get but merge files
– ls – file listing
– lsr – recursive ls
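As a rough illustration of the listing and sizing commands on this slide, the sketch below prints the commands one might use to inspect a single HDFS directory. The helper name `dir_cmds`, the directory `/user/demo` and the file name `merged.txt` are assumptions made for this example.

```shell
# dir_cmds: hypothetical helper that prints the hadoop commands
# for looking at one HDFS directory; nothing is executed here.
dir_cmds() {
  dir="$1"
  echo "hadoop fs -ls $dir"                      # list the directory
  echo "hadoop fs -lsr $dir"                     # list it recursively
  echo "hadoop fs -count $dir"                   # directory, file and byte counts
  echo "hadoop fs -du $dir"                      # size of each entry
  echo "hadoop fs -dus $dir"                     # summary size of the whole directory
  echo "hadoop fs -getmerge $dir ./merged.txt"   # merge its files into one local file
}

dir_cmds /user/demo

# To run the printed commands on a machine with Hadoop installed:
#   dir_cmds /user/demo | sh
```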
5. Hadoop commands – What are they ?
User Commands
● fs – file system commands
– mkdir – make directory
– moveFromLocal – put with delete of origin
– mv – move from source to destination
– put – copy from local file system to destination file system
– rm – remove a file
– rmr – recursive delete
– setrep – change file replication factor
– stat – returns file stat information
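A typical sequence using the commands on this slide could look like the sketch below. The `HADOOP_CMD` variable and every path and file name are hypothetical; the variable defaults to `echo hadoop`, so the script only prints what it would do.

```shell
# HADOOP_CMD (an assumed name) defaults to "echo hadoop", so commands are
# only printed. Export HADOOP_CMD=hadoop to run them for real.
HADOOP_CMD="${HADOOP_CMD:-echo hadoop}"

$HADOOP_CMD fs -mkdir /user/demo/in                              # make a directory
$HADOOP_CMD fs -put data.csv /user/demo/in                       # upload a local file
$HADOOP_CMD fs -setrep 2 /user/demo/in/data.csv                  # set replication factor to 2
$HADOOP_CMD fs -stat /user/demo/in/data.csv                      # show stat information
$HADOOP_CMD fs -mv /user/demo/in/data.csv /user/demo/in/old.csv  # rename within HDFS
$HADOOP_CMD fs -rm /user/demo/in/old.csv                         # remove a single file
$HADOOP_CMD fs -rmr /user/demo/in                                # remove the directory recursively
```

Note that `$HADOOP_CMD` is left unquoted on purpose so that the default value splits into `echo` and `hadoop`.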
6. Hadoop commands – What are they ?
User Commands
● fs – file system commands
– tail – display end of file
– test – check file existence / type
– text – output file as text
– touchz – create zero length file
● fsck – HDFS file system check
● fetchdt – get delegation token from name node
● jar – run jar file
● job – manage mapreduce jobs
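The `test` subcommand is mostly useful in scripts. The sketch below assumes the documented behaviour that `hadoop fs -test -e` exits with status 0 when the path exists; the helper `hdfs_exists` and the paths are invented for illustration, and the guard skips everything on a machine without Hadoop.

```shell
# hdfs_exists: hypothetical helper; succeeds when the HDFS path exists.
# "hadoop fs -test -e" reports its result through the exit status.
hdfs_exists() {
  hadoop fs -test -e "$1"
}

# Only touch the cluster when a hadoop binary is actually available.
if command -v hadoop >/dev/null 2>&1; then
  # Create an empty marker file unless it is already there
  if ! hdfs_exists /user/demo/_READY; then
    hadoop fs -touchz /user/demo/_READY
  fi
  hadoop fs -tail /user/demo/notes.txt     # show the end of a file
  hadoop fsck /user/demo -files -blocks    # check the health of a subtree
fi
```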
7. Hadoop commands – What are they ?
User Commands
● pipes – run a pipes job
● queue – interact and view job queue
● version – get Hadoop version
● CLASSNAME – run class named CLASSNAME
● classpath – print the class path
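A short sketch of the informational commands follows. It assumes that the first line of `hadoop version` has the form `Hadoop <version>`; the helper name `hadoop_version` is made up for this example, and the guard skips the calls when Hadoop is not installed.

```shell
# hadoop_version: hypothetical helper that extracts the version number,
# assuming the first output line looks like "Hadoop 1.0.4".
hadoop_version() {
  hadoop version | awk 'NR == 1 { print $2 }'
}

if command -v hadoop >/dev/null 2>&1; then
  echo "Hadoop version: $(hadoop_version)"
  hadoop classpath | tr ':' '\n' | head -n 5   # first few class path entries
  hadoop job -list                             # jobs known to the job tracker
  hadoop queue -list                           # configured job queues
fi
```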
8. Hadoop commands – What are they ?
Administration Commands
● balancer – run cluster balancing
● daemonlog – get/set daemon log level
● datanode – run hdfs data node
● dfsadmin – run dfsadmin client
● mradmin – run map reduce admin client
● jobtracker – run mr jobtracker node
● namenode – runs the name node
● secondarynamenode – run secondary name node
● tasktracker – run task tracker node
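Some of the administration commands take subcommands of their own. The sketch below maps a few short verbs to full command lines; the wrapper name `hadm` and its verbs are assumptions for this example, and it only prints the command rather than running it.

```shell
# hadm: hypothetical wrapper mapping short verbs to administration commands.
# It prints the command line; pipe the output to sh to execute it.
hadm() {
  case "$1" in
    report)   echo "hadoop dfsadmin -report" ;;              # capacity and data node status
    safemode) echo "hadoop dfsadmin -safemode get" ;;        # is the name node in safe mode?
    balance)  echo "hadoop balancer -threshold ${2:-10}" ;;  # rebalance blocks, threshold in percent
    queues)   echo "hadoop mradmin -refreshQueues" ;;        # reload the queue configuration
    *)        echo "usage: hadm {report|safemode|balance [pct]|queues}" >&2
              return 1 ;;
  esac
}

hadm report
hadm balance 5
```

The daemon commands (namenode, datanode, jobtracker and so on) are normally started through the start and stop scripts rather than typed by hand.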
9. Hadoop Environment
See the .bashrc for environment set up
##export HADOOP_HOME=/usr/local/hadoop ## deprecated
export HADOOP_PREFIX=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
unalias hfs &> /dev/null
alias hfs="hadoop fs"
unalias hls &> /dev/null ; alias hls="hfs -ls"
unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./start-dfs.sh"
unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./start-mapred.sh"
unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./stop-mapred.sh"
unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./stop-dfs.sh"
# if using LZO compression then add entry here for viewing
# LZO compressed files
##PATH=$PATH:$HADOOP_HOME/bin ## deprecated
PATH=$PATH:$HADOOP_PREFIX/bin
PATH=$PATH:$JAVA_HOME/bin
export PATH
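After editing .bashrc it can help to confirm that the variables point somewhere sensible. The following is a minimal sketch; the function name `check_env` and its messages are assumptions, and it only checks that each variable is set and has a bin directory.

```shell
# check_env: hypothetical helper that reports whether the variables
# set in .bashrc are defined and contain a bin directory.
check_env() {
  status=0
  for name in HADOOP_PREFIX JAVA_HOME; do
    eval "value=\${$name:-}"      # read the variable whose name is in $name
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value/bin" ]; then
      echo "$name=$value has no bin directory"
      status=1
    else
      echo "$name=$value looks fine"
    fi
  done
  return "$status"
}

check_env || echo "fix the environment before starting Hadoop"
```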
10. Hadoop Configuration
● Configuration files under $HADOOP_PREFIX/conf
● Initial set up in
– core-site.xml
– hdfs-site.xml
– mapred-site.xml
● Example from core-site.xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
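To check which value a property currently has, the file can be read from the shell. This is a rough sketch only: the helper `get_prop` is invented here, it assumes one tag per line as in the example above, and it is not a general XML parser. The fallback path `/usr/local/hadoop` is the value used for HADOOP_PREFIX earlier in these slides.

```shell
# get_prop FILE NAME: hypothetical helper that prints the <value> line
# following <name>NAME</name>. Assumes one tag per line.
get_prop() {
  awk -v want="$2" '
    index($0, "<name>" want "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")       # strip everything up to the opening tag
      sub(/<\/value>.*/, "")     # strip the closing tag and anything after it
      print
      exit
    }
  ' "$1"
}

conf="${HADOOP_PREFIX:-/usr/local/hadoop}/conf/core-site.xml"
if [ -f "$conf" ]; then
  get_prop "$conf" hadoop.tmp.dir
fi
```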
11. Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve them