Data representation

Data Representation
• Data representation is how information is conceived, manipulated, and recorded. It can also be defined as the form in which data and information are kept in a given environment. How data is stored varies from one environment to another, and each environment has its own set of rules and standards.

Data Representation
• Converting the complex data structures used by an application (strings, integers, structures, etc.) into a byte stream that is transmitted across the network.
• Representing information in such a way that communicating peers agree on the format of the data being exchanged. E.g., how many bits does an integer contain? Is the character set ASCII or EBCDIC?

Data Representation
• The problem is that a file containing the bytes 108, 97, 110 would read as “lan” on an ASCII system, but as “%/>” on an EBCDIC system.
• In ASCII, the value 108 means the character 'l'.
• In EBCDIC, the value 108 means the character '%'.
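
The mismatch described above can be reproduced with a minimal Python sketch. It assumes that Python's built-in `cp037` codec (one common EBCDIC code page) stands in for "an EBCDIC system"; other EBCDIC code pages may map some values differently.

```python
# The same three bytes, interpreted under two character sets.
raw = bytes([108, 97, 110])

as_ascii = raw.decode("ascii")   # ASCII interpretation
as_ebcdic = raw.decode("cp037")  # EBCDIC interpretation (code page 037 is an assumption)

print(as_ascii)   # lan
print(as_ebcdic)  # %/>
```

The bytes never change; only the agreed character set does, which is why both peers must agree on it.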

• ASCII - American Standard Code for Information Interchange, used to represent English text on microcomputers and most minicomputers.
• EBCDIC - Extended Binary Coded Decimal Interchange Code, used to represent English text on IBM mainframes.
• Shift-JIS - an encoding for Japanese characters.

Data Representation
 Popular network data representations
include:
◦ASN.1 (Abstract Syntax Notation
One) - an ISO standard
◦XDR (External Data
Representation)
- used with SunRPC

ASN.1
• Abstract Syntax Notation One (ASN.1) is a standard notation that describes rules and structures for representing, encoding, transmitting, and decoding data. It consists of two parts:
1. An abstract syntax that describes data structures in an unambiguous way, in terms of “integers”, “character strings”, and “structures” rather than bits and bytes.
2. A transfer syntax that describes the bit-stream encoding of ASN.1 data objects.

ASN.1
 The standard ASN.1 encoding rules include:
- Basic Encoding Rules (BER)
- Canonical Encoding Rules (CER)
- Distinguished Encoding Rules(DER)
- XML Encoding Rules (XER)
- Packed Encoding Rules (PER)
• These encoding rules facilitate the exchange of structured data, especially between application programs over networks, by describing data structures in a way that is independent of machine architecture and implementation language.

ASN.1
• Example of ASN.1's abstract syntax:
Student ::= SEQUENCE {
    name [0] IMPLICIT OCTET STRING OPTIONAL,
    grad [1] IMPLICIT BOOLEAN DEFAULT FALSE,
    gpa  [2] IMPLICIT REAL OPTIONAL,
    id   [3] IMPLICIT INTEGER,
    bday [4] IMPLICIT OCTET STRING OPTIONAL
}
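
To make the split between abstract syntax and transfer syntax concrete, here is a minimal Python sketch that hand-encodes one Student value as tag-length-value bytes in the style of BER/DER. The `tlv` helper and the sample values are illustrative assumptions, only short lengths (under 128 bytes) are handled, and a real application would use an ASN.1 library instead.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Encode one tag-length-value triple (short-form length only)."""
    if len(value) >= 0x80:
        raise ValueError("this sketch only handles values shorter than 128 bytes")
    return bytes([tag, len(value)]) + value

# Context-specific primitive tags are 0x80 | n for "[n] IMPLICIT".
name = tlv(0x80, b"Al")             # name [0] IMPLICIT OCTET STRING (sample value)
student_id = tlv(0x83, bytes([5]))  # id   [3] IMPLICIT INTEGER, sample value 5

# 0x30 is the universal tag for a constructed SEQUENCE.
# The optional fields (grad, gpa, bday) are simply left out.
student = tlv(0x30, name + student_id)

print(student.hex(" "))  # 30 07 80 02 41 6c 83 01 05
```

Each field carries its own tag in the byte stream, so the receiver can tell which optional fields are present.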

ASN.1
• Though initially used for specifying the email protocol within the Open Systems Interconnection environment, ASN.1 has since been adopted for a wide range of other applications, such as network management, secure email, cellular telephony, air traffic control, and voice and video over the Internet.

Current Uses of ASN.1
 Audio & Video over the Internet
AT&T, Intel, IBM, Microsoft, 3COM
 Electronic Commerce
American Express, GTE, MasterCard, VISA
 Telephony
AT&T, MCI, Motorola, Nokia, Sprint
 Aviation
FAA, ICAO
 Manufacturing
Ford, Mercedes Benz, Mitsubishi
 Network Management
Bull, Compaq, Hewlett-Packard, Sun
 Routers
Bay Networks, Cisco, Racal, Xyplex

XDR
• Sun Microsystems' External Data Representation (XDR) is much simpler than ASN.1, but less powerful. For instance:
1. XDR uses implicit typing. Communicating peers must know the type of any exchanged data. In contrast, ASN.1 uses explicit typing; it includes type information as part of the transfer syntax.
2. In XDR, all data is transferred in units of 4 bytes. Numbers are transferred in network byte order, most significant byte first.

XDR
 4 bytes of XDR message:

XDR (cont.)
3. Strings consist of a 4-byte length, followed by the data, padded with zero bytes to a multiple of 4 bytes.
4. Defined types include integer, enumeration, boolean, floating point, fixed-length array, and structure, plus others.
• One advantage that XDR has over ASN.1 is that current implementations of ASN.1 execute significantly slower than XDR.
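
The XDR rules listed above (4-byte units, big-endian integers, length-prefixed strings padded to a 4-byte boundary) can be sketched with Python's `struct` module. The helper names `xdr_int` and `xdr_string` are hypothetical and written by hand for illustration; this is not a complete XDR implementation.

```python
import struct

def xdr_int(value: int) -> bytes:
    """Encode a signed 32-bit integer, most significant byte first."""
    return struct.pack(">i", value)

def xdr_string(text: str) -> bytes:
    """Encode a string as a 4-byte length, the data, then zero padding."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding

# No type tags are written: the receiver must already know that the
# message is "one integer followed by one string" (implicit typing).
message = xdr_int(1) + xdr_string("hello")
print(message.hex(" "))
```

Because of the padding, the encoded message length is always a multiple of 4 bytes.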

MIME
• The message “£100 is about €150” could become
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=ISO-8859-15
MIME-Version: 1.0

=A3100 is about =A4150

MIME
• or
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset=ISO-8859-15
MIME-Version: 1.0

ozEwMCBpcyBhYm91dCCkMTUwCg==
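
Both transfer encodings can be reproduced with Python's standard `quopri` and `base64` modules. This is a minimal sketch that assumes the message text is “£100 is about €150” in ISO-8859-15; a trailing newline is appended before base64 encoding because the base64 body on the slide appears to encode one.

```python
import base64
import quopri

text = "£100 is about €150"
raw = text.encode("iso-8859-15")  # £ becomes byte 0xA3, € becomes byte 0xA4

qp = quopri.encodestring(raw)         # quoted-printable: non-ASCII bytes become =XX
b64 = base64.b64encode(raw + b"\n")   # base64 of the message plus a newline

print(qp.decode("ascii"))
print(b64.decode("ascii"))
```

Quoted-printable keeps mostly-ASCII text readable, while base64 treats the message as opaque bytes.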

Data Compression
• Data compression reduces resource usage, such as data storage space or transmission capacity.
• Lossless compression – involves no loss of information. If data have been losslessly compressed, the original data can be recovered exactly from the compressed data. It is generally used for applications that cannot tolerate any difference between the original and the reconstructed data.
• Lossy compression – involves some loss of information; data that have been compressed using lossy techniques generally cannot be recovered or reconstructed exactly. In return for accepting this distortion in the reconstruction, we can generally obtain much higher compression ratios than are possible with lossless compression.
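
The difference between the two families can be seen in a short Python sketch. The lossless half uses the standard `zlib` module; the lossy half is a toy stand-in (rounding sample values to the nearest multiple of 10) chosen only for illustration, not a real lossy codec.

```python
import zlib

# Lossless: the round trip returns exactly the original bytes.
original = b"ABABABABABABABABABAB" * 50
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed), restored == original)

# Toy lossy step: keep only the nearest multiple of 10.
samples = [12, 17, 23, 38, 41]
coarse = [round(s / 10) for s in samples]   # smaller numbers to store
approx = [c * 10 for c in coarse]           # reconstruction is only approximate
print(approx)  # [10, 20, 20, 40, 40]
```

The lossy reconstruction is close to the input but cannot be turned back into it.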

Steps of Data Compression
• The compression of still images, audio, and video data streams follows four steps:
1. Picture preparation – generates an appropriate digital representation of the information in the medium being compressed.
2. Picture processing – is the first step that makes use of the various compression algorithms.
3. Quantization – values determined in the previous step cannot and should not be processed with full exactness; instead, they are quantized according to a specific resolution and characteristic curve.
4. Entropy encoding – starting with a sequential data stream of individual bits and bytes, different techniques are used to perform a final, lossless compression.
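
The four steps can be lined up as a toy pipeline in Python. Every stage below is a deliberately simple stand-in chosen for illustration (clamping for preparation, neighbor differences for processing, a uniform step for quantization, `zlib` for entropy encoding); real codecs use far more elaborate versions of each stage.

```python
import zlib

def prepare(samples):
    """Step 1: bring raw values into a digital form (integers 0..255)."""
    return [max(0, min(255, int(s))) for s in samples]

def process(values):
    """Step 2: keep the first value, then differences between neighbors."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def quantize(deltas, step=4):
    """Step 3: reduce precision with a uniform step (this is the lossy part)."""
    return [round(d / step) for d in deltas]

def entropy_encode(symbols):
    """Step 4: final lossless stage; symbols are shifted into byte range first."""
    return zlib.compress(bytes(s + 128 for s in symbols))

raw_samples = [10.7, 12.2, 15.9, 15.1, 300]
encoded = entropy_encode(quantize(process(prepare(raw_samples))))
print(len(encoded))
```

Only the quantization step discards information; the other stages are reversible.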

Steps of Data Compression
Uncompressed picture → Picture preparation → Picture processing → Quantization → Entropy coding → Compressed picture
Figure: Major steps of image compression; they can also be applied to audio and video data.

Image Compression
• The goal is to represent images with less data in order to save storage costs or transmission time.
• It is often possible to reduce the file size to 10% of the original without noticeable loss in quality.
• Image compression can be lossless or lossy.

Image Compression
• Lossless
- Image quality is not reduced.
Used for: artificial images that contain sharp-edged lines, such as technical drawings, textual graphics, comics, maps, or logos.
Methods: run-length encoding (RLE), entropy coding (Huffman coding), and dictionary coders (LZW).
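
Run-length encoding, the simplest of the lossless methods listed above, can be sketched in a few lines of Python. The function names and the sample pixel row are illustrative assumptions.

```python
def rle_encode(pixels):
    """Collapse runs of equal values into (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [(value, count) for value, count in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [255] * 6 + [0] * 3 + [255] * 5   # one row of a black-and-white drawing
runs = rle_encode(row)
print(runs)  # [(255, 6), (0, 3), (255, 5)]
```

Images with long uniform runs, such as line drawings and logos, benefit most, which is why RLE suits artificial images.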

Image Compression
• Lossy
- Reduces image quality: some information is lost and the original image cannot be recovered exactly.
Used for: natural images, such as photos of landscapes.
Methods: discrete cosine transform (DCT, used in JPEG), wavelet transform (used in JPEG 2000), and color quantization.

Comparison of graphics file formats

Format | File extension | Type of compression | Methods | Usage
BMP (bitmap) | .bmp | Can be compressed considerably with lossless methods | ZIP | Used to store bitmap digital images
JPEG (Joint Photographic Experts Group) | .jpg, .jpeg, .jpe | Lossy, lossless | Discrete Cosine Transform (DCT) and chroma subsampling; run-length encoding (RLE) | Natural images
GIF (Graphics Interchange Format) | .gif, .giff, .gfa | Lossless | LZW (Lempel-Ziv-Welch) | Artificial images (sharp-edged lines and few colors); supports animation
PNG (Portable Network Graphics) | .png | Lossless | DEFLATE | Better compression and features than GIF, but does not support animation
TIFF (Tagged Image File Format) | .tiff, .tif | Lossless | RLE / LZW / DEFLATE / ZIP | Flexible file format; can store multiple images in a single file
JPEG 2000 | .jp2, .j2c, .jpc, .j2k, .jpx | Lossy and lossless | Discrete Wavelet Transform (DWT) | Better image quality than JPEG (up to 20%); not widely used because of some patent issues

Block Diagram of JPEG Compression
Source image → DCT → Quantization → Encoding → Compressed image
• DCT: transformation coding performed using the Discrete Cosine Transform (DCT)
• Quantization: quantization of all DCT coefficients (a lossy process)
• Encoding: Huffman coding and arithmetic coding as entropy encoding methods
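
The first two blocks of the diagram can be illustrated with a simplified Python sketch. It is an assumption-laden toy: it uses a one-dimensional 8-sample DCT and a single uniform quantizer step, whereas JPEG itself works on two-dimensional 8×8 blocks with a quantization table. The sample values and step size are made up.

```python
import math

def dct_1d(block):
    """Orthonormal DCT-II of a list of samples."""
    n = len(block)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(block))
        coeffs.append(scale * total)
    return coeffs

def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples

def quantize(coeffs, step):
    """Lossy step: store each coefficient as a small integer level."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    return [q * step for q in levels]

block = [52, 55, 61, 66, 70, 61, 64, 73]   # assumed row of pixel values
step = 10                                   # assumed quantizer step
levels = quantize(dct_1d(block), step)
restored = idct_1d(dequantize(levels, step))
print(levels)
print([round(v) for v in restored])
```

The integer levels are what an entropy coder would then compress; the restored row is close to, but not identical to, the input.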

Audio Compression
 A form of data compression designed to
reduce the size of audio files
 Audio compression can be lossless or
lossy
 Audio compression algorithms are
typically referred to as audio codecs.

Audio Compression
• Lossless - allows one to preserve an exact copy of one's audio files.
Usage: archival purposes, editing, and preserving audio quality.
Codecs:
 Free Lossless Audio Codec (FLAC)
 Apple Lossless
 MPEG-4 ALS
 Monkey's Audio
 Lossless Predictive Audio Compression (LPAC)
 Lossless Transform Audio Compression (LTAC)

Audio Compression
• Lossy - makes irreversible changes but achieves far greater compression; it uses psychoacoustics to exploit the fact that not all data in an audio stream can be perceived by the human auditory system.
Usage: distribution of streaming audio, or interactive applications.
Codecs:
• MP2 - MPEG-1 Layer 2 audio codec
• MP3 - MPEG-1 Layer 3 audio codec
• MPC - Musepack
• Vorbis - Ogg Vorbis
• AAC - Advanced Audio Coding (MPEG-2 and MPEG-4)
• WMA - Windows Media Audio
• AC3 - AC-3 or Dolby Digital A/52

MPEG
• Stands for Moving Picture Experts Group. MPEG is an ISO/IEC working group, established in 1988 to develop standards for digital audio and video formats.
• MPEG-1
Designed for up to 1.5 Mbit/s. Standard for the compression of moving pictures and audio. Its most popular part is Layer 3 of MPEG-1 audio (MP3). MPEG-1 is the compression standard for Video CD.
• MPEG-2
Designed for between 1.5 and 15 Mbit/s. Standard on which digital television set-top boxes and DVD compression are based. Designed for the compression and transmission of digital broadcast television.

MPEG (cont.)
• MPEG-4
Integrates several different audio components into one standard: speech compression, perceptually based coders, text-to-speech, and MIDI. MPEG-4 AAC (Advanced Audio Coding) is similar to the MPEG-2 AAC standard, with some minor changes.
• MPEG-7 (under development) - also called the Multimedia Content Description Interface. In terms of audio, it facilitates the representation of and search for sound content. Example application supported by MPEG-7: automatic speech recognition (ASR).

MPEG Audio Encoding
Uncompressed audio signal → Division into 32 frequency bands → Quantization (if applicable) → Entropy encoding → Compressed audio data
Psychoacoustic model: controls the quantization.

Audio Compression Format-MP3
• Played by almost every portable digital audio device and many DVD players, MP3 is still hard to beat if you are looking for maximum compatibility for your files.
• Although you can get much better compression from other formats, hard disks and blank CDs are cheap enough to justify the extra file size.
• Stereo imaging is not terrific, and encoding quality differs from one software package to another.
 Compression: 5.
 Quality: 7.
 Compatibility: 10.
 Overall: 7.5.

Audio Compression Format-WMA
• Windows Media Audio is Microsoft's contribution to high-quality, lossy audio compression. Like most other newer formats, it outperforms MP3 in terms of quality and compression, particularly at lower bitrates.
• WMA is probably the format of choice for streaming at low bandwidths. Like MP3, however, its stereo imaging is not very accurate.
• WMA tends to overcompensate for its high compression with what is often called 'overbrightness'.
 Compression: 8.
 Quality: 7.
 Overall: 8.

Audio Compression Format-Ogg Vorbis
• Ogg Vorbis is a project attempting to replace all proprietary audio formats with an open-standard, free codec. Version one was released in the past fortnight (at the time of writing) and has been demonstrated to be of very high quality, outperforming MP3 by a long shot.
• At low bitrates it does not compete with WMA, and at high bitrates it falls short of MPC. Given that it is a work in progress, however, it has strong potential to become a widely used audio codec.
• Some portable device manufacturers are promising to support Ogg Vorbis in future software releases.
 Compression: 8.
 Quality: 7.
 Overall: 7.

Video Compression
• Storing and transmitting raw, uncompressed video is inefficient because it needs large amounts of storage and bandwidth.
• DVD, DSS, and Internet video all use digital data → it takes a lot of space to store and a lot of bandwidth to transmit.
• Video compression techniques are used to compress the data for these applications → less storage space and less bandwidth are needed.

Video Compression
• Videos are sequences of images displayed at a high rate. Each of these images is called a frame.
• The human eye cannot notice small changes between frames, such as a slight difference in color.
• Therefore, video compression standards do not encode all the details, and some of the less important video details are lost. Lossy compression is used because it achieves very high compression ratios.
• Typically, 30 frames are displayed on the screen every second.
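
A back-of-the-envelope calculation shows why raw video is impractical. The frame size (640×480) and color depth (24 bits per pixel) below are assumptions chosen for illustration; only the 30 frames per second comes from the text above.

```python
width, height = 640, 480     # assumed frame size in pixels
bytes_per_pixel = 3          # assumed 24-bit color
frames_per_second = 30       # frame rate quoted above

bytes_per_frame = width * height * bytes_per_pixel
bits_per_second = bytes_per_frame * frames_per_second * 8

print(bytes_per_frame)              # 921600
print(bits_per_second / 1_000_000)  # 221.184 (Mbit/s)
```

Under these assumptions, uncompressed video needs roughly 221 Mbit/s, far above the 1.5 Mbit/s that MPEG-1 was designed for.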

Video Compression Process
1. Start by encoding the first frame using a still-image compression method.
2. Then encode each successive frame by identifying the differences between the frame and its predecessor, and encoding these differences. If the frame is very different from its predecessor, it should be coded independently of any other frame.
3. In the video compression literature, a frame that is coded using its predecessor is called an inter frame (or just inter), while a frame that is coded independently is called an intra frame (or just intra).
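
The process above can be sketched in Python with frames modeled as equal-length lists of pixel values. This is a toy under stated assumptions: the `threshold` parameter (the fraction of changed pixels that triggers an intra frame) is hypothetical, and intra frames are stored raw here, whereas a real encoder would apply still-image compression to them.

```python
def encode_video(frames, threshold=0.5):
    """Encode frames as ('intra', pixels) or ('inter', [(index, new_value), ...])."""
    encoded = []
    previous = None
    for frame in frames:
        if previous is None:
            encoded.append(("intra", list(frame)))      # first frame: always intra
        else:
            changes = [(i, new) for i, (old, new) in enumerate(zip(previous, frame))
                       if old != new]
            if len(changes) > threshold * len(frame):
                encoded.append(("intra", list(frame)))  # too different: code independently
            else:
                encoded.append(("inter", changes))      # store only the differences
        previous = frame
    return encoded

def decode_video(encoded):
    """Rebuild every frame from the intra frames and the stored differences."""
    frames = []
    for kind, payload in encoded:
        if kind == "intra":
            frame = list(payload)
        else:
            frame = list(frames[-1])
            for index, value in payload:
                frame[index] = value
        frames.append(frame)
    return frames

frames = [[1, 1, 1, 1], [1, 1, 2, 1], [9, 8, 7, 6]]
encoded = encode_video(frames)
print([kind for kind, _ in encoded])  # ['intra', 'inter', 'intra']
```

The second frame differs in one pixel and is stored as a single change; the third differs everywhere and is coded as a new intra frame.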

Video Compression Techniques
 Flow Control and Buffering
 Temporal Compression
 Spatial Compression
 Discrete Cosine Transform (DCT)
 Vector Quantization (VQ)
 Fractal Compression
 Discrete Wavelet Transform (DWT).

Video Compression Formats
• ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) have a joint group called the Moving Picture Experts Group, or MPEG. MPEG is responsible for the familiar compression formats MPEG-1, MPEG-2, and MPEG-4.
• ITU-T is the standardization sector of the International Telecommunication Union, a United Nations agency. Popular ITU-T compression formats include H.261 and H.264.
• Other formats in common use include Intel Indeo, RealVideo (based on the ITU-T H.263 codec), AVI, DivX, QuickTime, and Windows Media Video (WMV).

Encryption
• To carry sensitive information, a system must
be able to assure privacy.
• As the number of attacks increases and as the public Internet is used to transmit private data, it is increasingly difficult to protect information.
• One way to safeguard data from attacks is to encrypt it.
• In practice, encryption is well suited to the presentation layer, although it can also be done at the transport and physical layers.

Encryption
 Encryption – the conversion of data
into a form, called a ciphertext, that
cannot be easily understood by
unauthorized people.
 Decryption – the process of
converting encrypted data back into
its original form, so it can be
understood.

Example of Encryption / Decryption Process

Basic Terms and Concepts
• Cryptography – the science of encrypting or hiding secrets.
• Cryptosystem – a system that disguises messages so that only selected people can see through the disguise.
• Cryptanalysis – the science of decrypting messages or breaking codes and ciphers.
• Key – a value that is used by an algorithm to encrypt and decrypt a message.
• Cipher – an encryption/decryption algorithm used to create encrypted or decrypted text.

Encryption/Decryption Keys
• Symmetric keys – also called secret-key encryption. A single key is used to both encrypt and decrypt the message. This means the person encrypting the message must give that key to the recipient before the recipient can decrypt it.
E.g.: Data Encryption Standard (DES), Triple DES (3DES), Advanced Encryption Standard (AES)
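
The single-shared-key idea can be shown with a toy Python sketch. It uses a one-time-pad style XOR, chosen only because it needs nothing beyond the standard library; it is not DES, 3DES, or AES, and the sample message is made up.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key byte at the same position."""
    return bytes(d ^ k for d, k in zip(data, key))

message = b"transfer 100 to account 42"
key = secrets.token_bytes(len(message))   # the one secret both sides must share

ciphertext = xor_bytes(message, key)      # sender encrypts with the key
recovered = xor_bytes(ciphertext, key)    # recipient decrypts with the same key

print(recovered == message)  # True
```

The same function and the same key serve for both directions, which is exactly why the key has to reach the recipient securely first.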

• Asymmetric keys – also called public-key encryption. Two different keys are used: a public key to encrypt the message and a private key to decrypt it. A message encrypted with the public key can be decrypted only with the matching private key.
E.g.: RSA and Diffie-Hellman
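
The two-key idea can be illustrated with textbook RSA on deliberately tiny numbers. The primes, exponent, and message below are assumptions chosen for readability; real RSA uses keys of thousands of bits plus padding, so this sketch is not secure.

```python
# Key generation with toy primes.
p, q = 61, 53
n = p * q                  # 3233, part of both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: (n, e) is the public key
d = pow(e, -1, phi)        # private exponent: (n, d) is the private key

message = 65
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
recovered = pow(ciphertext, d, n)  # only the private key holder can decrypt

print(d, ciphertext, recovered)  # 2753 2790 65
```

The three-argument `pow` with a negative exponent (modular inverse) requires Python 3.8 or later.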

How Encryption Protects
 Because cryptography is concerned with the storage or
transmission of information, five key security functions
need to be fulfilled:
Protection | Description
Confidentiality | Allow only authorized users to access information.
Authentication | Verify who the sender was and trust that the sender is who they claim to be.
Integrity | Trust that the information has not been altered.
Nonrepudiation | Ensure that the sender or receiver cannot deny that a message was sent or received.
Access Control | Restrict availability to information.
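
Two of these functions, integrity and authentication, can be illustrated with a keyed hash from Python's standard `hmac` and `hashlib` modules. The key and messages are made-up examples. Because both parties share the same key, this sketch does not provide nonrepudiation, which would need a digital signature.

```python
import hashlib
import hmac

key = b"shared-secret"           # hypothetical key known to sender and receiver
message = b"pay 100 to alice"

# Sender attaches a tag computed from the key and the message.
tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))              # True
print(verify(key, b"pay 900 to alice", tag))  # False
```

A changed message or a wrong key produces a different tag, so tampering is detected on receipt.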

Advantages of Encryption
• If a file is encrypted, the device that stores it or the channel that carries it does not itself need to be secure, which saves money on extra protection software.
• Having the data encrypted takes away much of the pain and worry associated with data breaches and the protection of intellectual property.
• Encryption keeps data from snoopers without compromising systems or storage devices.

Disadvantages of Encryption
• Drawbacks include the complexity of computer encryption, its usually high cost, the ease with which it can be changed, and its inability to organize the data that has been encoded. Even though the encryption means the data needs less additional protection, it puts a lot of pressure on IT employees.
• Encryption takes a lot of processing, energy, and computing power. This means that even though the data is protected, the overall performance of the computer could drop.
• Encryption won't stop hackers or viruses, and it may make the encrypted file hard to use, because some restrictions may have been placed on it.
