Reference table for Hadoop and Big Data. Covers data sizes from bytes through megabytes, terabytes, and petabytes, as well as key Big Data terms such as HDFS, HBase, and Hadoop.
Glossary of Big Data Terms
2. Top Big Data Terms
Term | Definition
Hadoop | Open-source software framework that supports running applications on large clusters of commodity hardware. Hadoop is written in Java.
HDFS | Stands for Hadoop Distributed File System. HDFS is a distributed file system that stores large files across multiple machines. It replicates data across machines so that files remain available when an individual machine fails, and it keeps track of where each block of a file is stored.
MapReduce | A programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Its Map() procedure filters and sorts, and its Reduce() procedure performs summary operations.
Hive | A data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and analysis.
HBase | An open-source, non-relational, distributed database that runs on top of HDFS.
Cassandra | Apache Cassandra is an open-source distributed database management system designed to handle very large amounts of data spread out across many commodity servers.
Source: Wikipedia (mainly)
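The MapReduce entry above can be made concrete with a word count, the usual introductory example. The following is a minimal single-process sketch in plain Python, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, which a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: summarize all values for one key (here, by summing)."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling word_count(["big data", "big clusters"]) counts "big" twice and the other two words once each. The map step here only emits pairs; the grouping by key is what lets each reduce call work on one word independently of the others.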
3. Sizes that Matter
Name | Value | Example
1 bit | - | The smallest unit of data that a computer uses. It can represent two states of information, such as Yes or No.
1 byte | 8 bits | A byte can represent 256 states of information. 1 byte could hold one character, 10 bytes a word, and 100 bytes an average sentence.
1 kilobyte (kB) | 1024 bytes | 1 kilobyte would be equal to a paragraph.
1 megabyte (MB) | 1024 kB | A 3.5-inch floppy disk can hold 1.44 megabytes, or the equivalent of a small book. About 600 megabytes of data will fit on a CD-ROM.
1 gigabyte (GB) | 1024 MB | 1 GB could hold the contents of about 10 yards of books on a shelf.
1 terabyte (TB) | 1024 GB | 1 TB could hold 1,000 copies of the Encyclopedia Britannica.
1 petabyte (PB) | 1024 TB | Roughly 500 million floppy disks.
1 exabyte (EB) | 1024 PB | 5 exabytes could hold all of the words ever spoken by mankind.
1 zettabyte (ZB) | 1024 EB | ?
Source: http://www.whatsabyte.com/
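Each unit in the table is 1024 times the previous one, so the sizes can be computed rather than memorized. The sketch below assumes the table's binary convention (factors of 1024, not 1000); the names UNITS, unit_size, and human_readable are hypothetical helpers introduced here for illustration.

```python
# Unit abbreviations in ascending order, following the table above.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def unit_size(unit):
    """Return the number of bytes in one `unit`, using a factor of 1024 per step."""
    return 1024 ** UNITS.index(unit)


def human_readable(num_bytes):
    """Format a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024
```

For instance, human_readable(1536) returns "1.50 kB", and unit_size("ZB") equals 1024 * unit_size("EB"). Dividing unit_size("PB") by 1.44 * unit_size("MB") gives about 746 million, so the floppy-disk example in the table is best read as an order-of-magnitude illustration.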