RHive tutorial - HDFS functions
Hive uses Hadoop's distributed file system (HDFS) to process data.
Thus, in order to use Hive and RHive effectively,
you must be able to put, get, and remove big data on HDFS.
RHive provides functions that correspond to what the "hadoop fs"
command supports.
Using these functions, a user can handle HDFS from within the R environment
without using the Hadoop CLI (command-line interface) or the Hadoop HDFS library.
If you are more comfortable with the Hadoop CLI or the Hadoop library,
it is perfectly fine to use them instead.
But if you work from RStudio Server or are not comfortable working from a terminal,
the RHive HDFS functions should prove to be an easy-to-use way for R users
to handle HDFS.
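
As a quick orientation before the detailed sections, each rhive.hdfs.* function covered in this tutorial maps onto a familiar "hadoop fs" subcommand. A typical session looks like the following sketch (the paths are illustrative, not part of this tutorial's data):

```r
library(RHive)
rhive.connect()                               # also connects to HDFS automatically

rhive.hdfs.ls("/")                            # hadoop fs -ls /
rhive.hdfs.put("/tmp/messages", "/messages")  # hadoop fs -put
rhive.hdfs.get("/messages", "/tmp/copy")      # hadoop fs -get
rhive.hdfs.rename("/messages", "/renamed")    # hadoop fs -mv
rhive.hdfs.mkdirs("/newdir/newsubdir")        # hadoop fs -mkdir
rhive.hdfs.rm("/renamed")                     # hadoop fs -rm
rhive.hdfs.exists("/renamed")                 # no direct hadoop fs counterpart
rhive.hdfs.close()
```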

Before Emulating this Example
The rhive.hdfs.* functions work only after RHive has been successfully installed
and library(RHive) and rhive.connect() have been successfully executed.
Do not forget to do the following before trying the examples.

# Open R
library(RHive)
rhive.connect()


rhive.hdfs.connect
In order to use the RHive functions for HDFS, a connection to HDFS must be
established.
But if the Hadoop configuration for HDFS is properly set, this connection is
made automatically when the rhive.connect function is executed, so there is
normally no need to establish it separately.

If you need to connect to a different HDFS, you can do it like this:

rhive.hdfs.connect("hdfs://10.1.1.1:9000")
[1] "Java-Object{DFS[DFSClient[clientName=DFSClient_630489789, ugi=root]]}"

The connection will fail if you do not supply the exact hostname and port
number of the HDFS service.
Ask your system administrator if you do not have this information.

rhive.hdfs.ls
This does the same thing as "hadoop fs -ls" and is used like this:

rhive.hdfs.ls("/")
  permission owner      group   length modify-time      file
1  rwxr-xr-x  root supergroup        0 2011-12-07 14:27 /airline
2  rwxr-xr-x  root supergroup        0 2011-12-07 13:16 /benchmarks
3  rw-r--r--  root supergroup 11186419 2011-12-06 03:59 /messages
4  rwxr-xr-x  root supergroup        0 2011-12-07 22:05 /mnt
5  rwxr-xr-x  root supergroup        0 2011-12-13 20:24 /rhive
6  rwxr-xr-x  root supergroup        0 2011-12-07 20:19 /tmp
7  rwxr-xr-x  root supergroup        0 2011-12-14 01:14 /user

This is the same as the following command using the Hadoop CLI.

hadoop fs -ls /
  


rhive.hdfs.get
The rhive.hdfs.get function downloads data from HDFS to the local file system.
It works in the same way as "hadoop fs -get".
The next example takes the messages data in HDFS, saves it to /tmp/messages on
the local system, and then checks the number of records.

rhive.hdfs.get("/messages", "/tmp/messages")
[1] TRUE
system("wc -l /tmp/messages")
145889 /tmp/messages
  


rhive.hdfs.put
The rhive.hdfs.put function uploads local data to HDFS.
It works like "hadoop fs -put", the opposite of rhive.hdfs.get.
The following example uploads /tmp/messages on the local system to
/messages_new in HDFS.

rhive.hdfs.put("/tmp/messages", "/messages_new")
rhive.hdfs.ls("/")
  permission owner      group   length modify-time      file
1  rwxr-xr-x  root supergroup        0 2011-12-07 14:27 /airline
2  rwxr-xr-x  root supergroup        0 2011-12-07 13:16 /benchmarks
3  rw-r--r--  root supergroup 11186419 2011-12-06 03:59 /messages
4  rw-r--r--  root supergroup 11186419 2011-12-14 02:02 /messages_new
5  rwxr-xr-x  root supergroup        0 2011-12-07 22:05 /mnt
6  rwxr-xr-x  root supergroup        0 2011-12-13 20:24 /rhive
7  rwxr-xr-x  root supergroup        0 2011-12-14 01:14 /user
  

You can see that a new file, "/messages_new", now appears in HDFS.

rhive.hdfs.rm
This does the same thing as "hadoop fs -rm", deleting files in HDFS.
rhive.hdfs.rm("/messages_new")
rhive.hdfs.ls("/")
  permission owner      group   length modify-time      file
1  rwxr-xr-x  root supergroup        0 2011-12-07 14:27 /airline
2  rwxr-xr-x  root supergroup        0 2011-12-07 13:16 /benchmarks
3  rw-r--r--  root supergroup 11186419 2011-12-06 03:59 /messages
4  rwxr-xr-x  root supergroup        0 2011-12-07 22:05 /mnt
5  rwxr-xr-x  root supergroup        0 2011-12-13 20:24 /rhive
6  rwxr-xr-x  root supergroup        0 2011-12-14 01:14 /user
  

You can see the "/messages_new" file has been deleted from HDFS.

rhive.hdfs.rename
This does the same thing as "hadoop fs -mv".
That is, it renames files in HDFS or moves directories.

rhive.hdfs.rename("/messages", "/messages_renamed")
[1] TRUE
rhive.hdfs.ls("/")
  permission owner      group   length modify-time      file
1  rwxr-xr-x  root supergroup        0 2011-12-07 14:27 /airline
2  rwxr-xr-x  root supergroup        0 2011-12-07 13:16 /benchmarks
3  rw-r--r--  root supergroup 11186419 2011-12-06 03:59 /messages_renamed
4  rwxr-xr-x  root supergroup        0 2011-12-07 22:05 /mnt
5  rwxr-xr-x  root supergroup        0 2011-12-13 20:24 /rhive
6  rwxr-xr-x  root supergroup        0 2011-12-14 01:14 /user
  




rhive.hdfs.exists
This checks whether a file exists in HDFS. There is no corresponding
hadoop command that serves as a counterpart.

rhive.hdfs.exists("/messages_renamed")
[1] TRUE
rhive.hdfs.exists("/foobar")
[1] FALSE
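
Since there is no hadoop counterpart, rhive.hdfs.exists is mostly useful as a guard inside R code. For example, the following small sketch (using only functions introduced in this tutorial) removes a file only when it is actually present:

```r
# Remove the file only if it exists in HDFS, so we never
# attempt to delete a path that is not there.
if (rhive.hdfs.exists("/messages_new")) {
  rhive.hdfs.rm("/messages_new")
}
```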
  


rhive.hdfs.mkdirs
This does the same thing as "hadoop fs -mkdir".
It makes directories in HDFS, including nested subdirectories.

rhive.hdfs.mkdirs("/newdir/newsubdir")
[1] TRUE
rhive.hdfs.ls("/")
  permission owner      group   length modify-time      file
1  rwxr-xr-x  root supergroup        0 2011-12-07 14:27 /airline
2  rwxr-xr-x  root supergroup        0 2011-12-07 13:16 /benchmarks
3  rw-r--r--  root supergroup 11186419 2011-12-06 03:59 /messages_renamed
4  rwxr-xr-x  root supergroup        0 2011-12-07 22:05 /mnt
5  rwxr-xr-x  root supergroup        0 2011-12-14 02:13 /newdir
6  rwxr-xr-x  root supergroup        0 2011-12-13 20:24 /rhive
7  rwxr-xr-x  root supergroup        0 2011-12-14 01:14 /user
rhive.hdfs.ls("/newdir")
  permission owner      group length modify-time      file
1  rwxr-xr-x  root supergroup      0 2011-12-14 02:13 /newdir/newsubdir
  


rhive.hdfs.close
This closes the connection when you have finished working with HDFS
and no longer need it.

rhive.hdfs.close()

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

  permission owner      group   length       modify-time         file
1  rwxr-xr-x  root supergroup        0  2011-12-07 14:27     /airline
2  rwxr-xr-x  root supergroup        0  2011-12-07 13:16  /benchmarks
3  rw-r--r--  root supergroup 11186419  2011-12-06 03:59    /messages
4  rwxr-xr-x  root supergroup        0  2011-12-07 22:05         /mnt
5  rwxr-xr-x  root supergroup        0  2011-12-13 20:24       /rhive
6  rwxr-xr-x  root supergroup        0  2011-12-07 20:19         /tmp
7  rwxr-xr-x  root supergroup        0  2011-12-14 01:14        /user

This is the same as the equivalent Hadoop CLI command:

hadoop fs -ls /

rhive.hdfs.get
The rhive.hdfs.get Function brings data from HDFS to the local file system.
It functions in the same way as "hadoop fs -get".
The next example takes the messages data in HDFS, saves it to /tmp/messages on the local system, and then checks the number of records.

rhive.hdfs.get("/messages", "/tmp/messages")
[1] TRUE
system("wc -l /tmp/messages")
145889 /tmp/messages

rhive.hdfs.put
The rhive.hdfs.put Function uploads data from the local file system to HDFS.
It functions like "hadoop fs -put" and is the opposite of rhive.hdfs.get.
The following example uploads "/tmp/messages" on the local system to "/messages_new" in HDFS.

rhive.hdfs.put("/tmp/messages", "/messages_new")
rhive.hdfs.ls("/")
  permission owner      group   length       modify-time           file
1  rwxr-xr-x  root supergroup        0  2011-12-07 14:27       /airline
2  rwxr-xr-x  root supergroup        0  2011-12-07 13:16    /benchmarks
3  rw-r--r--  root supergroup 11186419  2011-12-06 03:59      /messages
4  rw-r--r--  root supergroup 11186419  2011-12-14 02:02  /messages_new
5  rwxr-xr-x  root supergroup        0  2011-12-07 22:05           /mnt
6  rwxr-xr-x  root supergroup        0  2011-12-13 20:24         /rhive
7  rwxr-xr-x  root supergroup        0  2011-12-14 01:14          /user

You can see that a new file, "/messages_new", now appears in HDFS.

rhive.hdfs.rm
This does the same thing as "hadoop fs -rm", deleting files in HDFS.
rhive.hdfs.rm("/messages_new")
rhive.hdfs.ls("/")
  permission owner      group   length       modify-time         file
1  rwxr-xr-x  root supergroup        0  2011-12-07 14:27     /airline
2  rwxr-xr-x  root supergroup        0  2011-12-07 13:16  /benchmarks
3  rw-r--r--  root supergroup 11186419  2011-12-06 03:59    /messages
4  rwxr-xr-x  root supergroup        0  2011-12-07 22:05         /mnt
5  rwxr-xr-x  root supergroup        0  2011-12-13 20:24       /rhive
6  rwxr-xr-x  root supergroup        0  2011-12-14 01:14        /user

You can see the "/messages_new" file has been deleted from HDFS.

rhive.hdfs.rename
This does the same thing as "hadoop fs -mv". That is, it renames files in HDFS or moves directories.

rhive.hdfs.rename("/messages", "/messages_renamed")
[1] TRUE
rhive.hdfs.ls("/")
  permission owner      group   length       modify-time               file
1  rwxr-xr-x  root supergroup        0  2011-12-07 14:27           /airline
2  rwxr-xr-x  root supergroup        0  2011-12-07 13:16        /benchmarks
3  rw-r--r--  root supergroup 11186419  2011-12-06 03:59  /messages_renamed
4  rwxr-xr-x  root supergroup        0  2011-12-07 22:05               /mnt
5  rwxr-xr-x  root supergroup        0  2011-12-13 20:24             /rhive
6  rwxr-xr-x  root supergroup        0  2011-12-14 01:14              /user

rhive.hdfs.exists
This checks whether a file exists within HDFS.
There is no corresponding hadoop command that serves as a counterpart.

rhive.hdfs.exists("/messages_renamed")
[1] TRUE
rhive.hdfs.exists("/foobar")
[1] FALSE

rhive.hdfs.mkdirs
This does the same thing as "hadoop fs -mkdir". It makes directories in HDFS, including subdirectories.

rhive.hdfs.mkdirs("/newdir/newsubdir")
[1] TRUE
rhive.hdfs.ls("/")
  permission owner      group   length       modify-time               file
1  rwxr-xr-x  root supergroup        0  2011-12-07 14:27           /airline
2  rwxr-xr-x  root supergroup        0  2011-12-07 13:16        /benchmarks
3  rw-r--r--  root supergroup 11186419  2011-12-06 03:59  /messages_renamed
4  rwxr-xr-x  root supergroup        0  2011-12-07 22:05               /mnt
5  rwxr-xr-x  root supergroup        0  2011-12-14 02:13            /newdir
6  rwxr-xr-x  root supergroup        0  2011-12-13 20:24             /rhive
7  rwxr-xr-x  root supergroup        0  2011-12-14 01:14              /user

rhive.hdfs.ls("/newdir")
  permission owner      group  length       modify-time               file
1  rwxr-xr-x  root supergroup       0  2011-12-14 02:13  /newdir/newsubdir

rhive.hdfs.close
This is used to close the connection when you have finished using HDFS and no longer need it.

rhive.hdfs.close()
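Putting the pieces together, the functions covered above can be combined into a short maintenance script. The sketch below is illustrative, not a tested session: it assumes RHive is installed, rhive.connect() can reach your Hive/Hadoop setup, the local file /tmp/messages from the earlier example exists, and the path "/messages_bak" is a hypothetical name chosen for this example.

```r
# Illustrative sketch: upload a local file, verify it, rename it, then
# clean up. Assumes a working rhive.connect() as shown at the start.
library(RHive)
rhive.connect()

# Upload the local file to HDFS (like: hadoop fs -put).
rhive.hdfs.put("/tmp/messages", "/messages_new")

# Guard the rename with an existence check; rhive.hdfs.exists has no
# direct "hadoop fs" counterpart. "/messages_bak" is a hypothetical path.
if (rhive.hdfs.exists("/messages_new")) {
  rhive.hdfs.rename("/messages_new", "/messages_bak")  # like: hadoop fs -mv
}

# Inspect the result (like: hadoop fs -ls /), then remove the file
# (like: hadoop fs -rm) and close the HDFS connection.
print(rhive.hdfs.ls("/"))
rhive.hdfs.rm("/messages_bak")
rhive.hdfs.close()
```

Checking rhive.hdfs.exists before a rename or remove is a simple way to keep scripted HDFS housekeeping from failing midway when a path is absent.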