ABAP workshop
Unicode and File Handling
Topics
•   Characters and Encoding
•   ASCII standards
•   Glyphs and Fonts
•   Extended ASCII and issues
•   Character Sets and Code Pages
•   Little and Big Endian
•   Unicode
•   Unicode Transformation Formats [UTF-8, etc]
•   Unicode SAP system
•   SAP Unicode Overhead
•   SAP File Interface
•   SAP Authorization for File Access
•   Files on the Application Server
•   File Interface Statements (Open, Transfer, Read, Get, Set, etc)
•   Error Handling
•   Attributes and Other commands
•   Files on the Presentation Server


                                                                      2
Characters and Encoding
• Characters are represented by character
  codes
• This mapping is called a character encoding
• Character codes are generated and stored
  when a user enters and saves a document
• When the system reads a document, it
  interprets the stored character codes and
  displays them as characters in a form
  that we understand
                                           3
ASCII standards
• The American National Standards Institute (ANSI)
  created the American Standard Code for Information
  Interchange (ASCII) standard
• For example in ASCII, character ‘A’ is represented by
  decimal code 65 or hexadecimal code 41 and is stored
  as binary code 01000001
• Single-Byte character sets provide 256 character codes.
  This is an adequate number to encode most of the
  characters needed for Western Europe
• BTW: Extended Binary Coded Decimal Interchange
  Code [EBCDIC], an 8-bit character encoding used
  on IBM mainframe operating systems, is not
  discussed here
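The 'A' example above can be checked directly. A minimal Python sketch (the variable names are only illustrative) that shows the same code in decimal, hexadecimal and binary:

```python
# Inspect the ASCII code of 'A' in decimal, hexadecimal and binary.
ch = "A"
code = ord(ch)                  # integer character code
as_hex = format(code, "02X")    # two hexadecimal digits
as_bits = format(code, "08b")   # eight binary digits

print(code, as_hex, as_bits)    # 65 41 01000001

# The same value is what ends up in the single byte written to storage.
stored = ch.encode("ascii")
print(list(stored))             # [65]
```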

                                                            4
Glyphs and Fonts
• A Glyph (pronounced "glif") is a visual representation of a
  character – example: A A A A A A A A (the same letter
  in different typefaces)
• Users don't view or print characters; they view
  or print Glyphs
• The character "Capital Letter A" represented by
  the Glyph in Times New Roman Bold is different
  from the Glyph in Arial Bold (each Glyph looks
  visually different)
• A single character can be represented by
  several different Glyphs in a Font
• A Font is a collection of glyphs
                                                5
Extended ASCII and issues
•   ASCII represents every printable character using a number between 32 and 127
    (codes 0 to 31 are control characters). Space is 32, the letter "A" is 65, etc.
    This fits conveniently in 7 bits because there are only 128 codes in total (2^7)
•   Historically most computers used 8-bit bytes, therefore there was still 1 bit
    to spare
•   Extended ASCII that made use of this spare bit was not standardized all
    over the world
•   The IBM-PC had something that came to be known as the OEM [Original
    Equipment Manufacturer] character set which provided some accented
    characters for European languages and text-mode PCs could display and
    print vertical and horizontal line drawing characters
•   An assortment of 256-character Windows ANSI character sets cover all the
    8-bit languages targeted by Windows
•   Programmers in Israel, Russia (USSR), and Asia used the 8th bit to represent
    their own language characters, so there was no universal standard left for
    the characters from 128 and up – confusion prevailed with the 8th bit
•   Something was required to map the various character codes created and used,
    not only for Extended ASCII but also for any new mapping developed
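To make the "8th bit" confusion concrete, here is a small Python sketch that interprets one and the same byte value (128, hex 80) under four 8-bit code pages. The choice of code pages is only an illustration; the codec names are those of Python's standard library.

```python
import unicodedata

# One byte value above 127, interpreted under four different 8-bit code pages.
raw = bytes([0x80])

code_pages = {
    "cp437": "IBM PC OEM (US)",
    "cp1252": "Windows Western European",
    "cp1251": "Windows Cyrillic",
    "cp862": "DOS Hebrew",
}

# Decode the same byte once per code page.
decoded = {name: raw.decode(name) for name in code_pages}

for name, label in code_pages.items():
    ch = decoded[name]
    print(f"{name:7} {label:26} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Each code page yields a different character for the same stored byte, which is exactly why a file is unreadable without knowing which code page wrote it.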

                                                                                6
Character Sets and Code Pages
•   A Character Set is any specific collection of characters
•   Code Page is a list of selected character codes for a Character Set
    in a particular order
•   Code Page is another name for encoding of each character in a
    Character Set (Fonts could have their own Character Set)
•   Code Page is a character set encoding that can include numbers,
    punctuation marks, and other glyphs. Code Pages are not the same
    for each language
•   Many Code Pages are single-byte Character Sets - that is, they
    contain no more than 256 characters.
•   A Code Page is a representation of Character Set used by a
    computer (OS) to support a specific language or set of languages.
            Character Sets   Windows Code Page
            US-ASCII         20127
            German (IA5)     20106
            Korean (ISO)     50225
•   Some languages, such as Japanese have multi-byte characters,
    while others, like English and German, only need one byte to
    represent each character
                                                                          7
Character Sets and Code Page
           (cont…)




 Within each Code Page,
 the Characters from Character Set
 are mapped to the Character Codes
 (Encoded)

                                     8
Character Sets and CodePage
           (cont…)




So potentially we could have hundreds of
Character Sets and these have to be mapped
to numerous Code Pages which is a
maintenance nightmare

                                             9
Character Sets and CodePage
              (cont…)
• Not all Code Pages exist on every
  computer; they can differ between
  computers, and they can be changed
  on a single computer.
• This will result in confusion and emails like
  these:
  – Dear □ □ ??? Thank □□□ █ █ █ █ ???
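A garbled email like the one above can be reproduced in a few lines. This is a sketch under the assumption that the sender stores text in Windows-1252 and the receiver decodes it with the IBM PC OEM code page 437; the sample word is made up.

```python
text = "Grüße"                      # what the sender typed
sent = text.encode("cp1252")        # stored with the sender's code page
garbled = sent.decode("cp437")      # receiver assumes a different code page

print(text, "->", garbled)

# The bytes were never damaged; decoding with the right code page
# recovers the original text.
recovered = garbled.encode("cp437").decode("cp1252")
print(recovered == text)            # True
```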



                                              10
Little and Big Endian
•     Some examples of ABAP built-in Data Types are:
      b    1 Byte - 1 byte Integer (internal)
      i    4 Bytes - 4 byte integer
      f    8 Bytes - Floating point number

•     Question: For the multi-byte data (say, i or f shown above), where does the biggest
      (most significant or highest-order) byte appear in the memory?

•     Little Endian: as used in Intel processors, stores the low-order byte of a number in
      memory at the lowest address

•     Big Endian: as used by Motorola processors, IBM's 370 mainframes, and most
      RISC-based computers, stores the high-order byte of a number in memory at the
      lowest address

(Example 1: 4 byte Long Int [Byte3 Byte2 Byte1 Byte0], where Byte3 is the high-order byte. In the memory the arrangement is as shown)

                     Base Address+0   +1      +2      +3

    Little Endian    Byte0            Byte1   Byte2   Byte3
    Big Endian       Byte3            Byte2   Byte1   Byte0


                                                                                                11
Little and Big Endian (cont..)
•   Example 2: to store two bytes required for the hexadecimal number 4F52, the
    following shows the representation by the two methods (BTW: this is equal to 2*16^0
    + 5*16^1 + 15*16^2 + 4*16^3 = 20306 in decimal)

•   Little Endian – representation in memory:
       Base Address+0 52
       Base Address+1 4F

•   Big Endian – representation in memory:
       Base Address+0 4F
       Base Address+1 52

•   Big Endian is easy to understand, because it is consistent with the order we use
    naturally - when we read and write text and numbers.

•   Endianness only concerns the order of BYTES. Within each byte, the BIT order is
    always written most significant bit first:
          01001001 = (0 + 2^6 + 0 + 0 + 2^3 + 0 + 0 + 2^0 = 64 + 8 + 1 = 73)
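Both examples can be replayed in Python, which lets you choose the byte order explicitly. This is only a sketch to check the arithmetic; the 4-byte value 0x0A0B0C0D is an arbitrary illustration, not taken from the slides.

```python
import struct
import sys

value = 0x4F52
print(value)                                   # 20306

# Two-byte representation in each byte order.
little = value.to_bytes(2, byteorder="little")
big = value.to_bytes(2, byteorder="big")
print("little endian:", little.hex())          # 524f
print("big endian:   ", big.hex())             # 4f52

# Four-byte integer: "<" means little endian, ">" big endian, "i" a 4-byte int.
word = 0x0A0B0C0D
little4 = struct.pack("<i", word)
big4 = struct.pack(">i", word)
print(little4.hex(), big4.hex())               # 0d0c0b0a 0a0b0c0d

# Byte order of the machine running this script.
print("this machine is", sys.byteorder, "endian")

# Bit order inside a byte: most significant bit is written first.
bits = int("01001001", 2)
print(bits)                                    # 73
```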



                                                                                       12
Need for Standards - Unicode
•   We have seen the confusion that arises when each entity including
    hardware manufacturers, Software companies, Regions, Countries, Groups
    create Code Pages as per their own requirements and for their own
    Character sets
•   Without any set standards, and with the advent of internet, sharing of
    information could be almost impossible
•   What if we have one standard Code Page, having a set of all possible
    character codes that any computer or software could decipher?
•   Well, Unicode is the answer. It is not a Code Page, but more like a “meta-
    Code Page”
•   Unicode is a brave effort to create a single character set that includes every
    reasonable writing system on the planet
•   Think of Unicode as a set of all possible character codes.
•   Unicode is a single very large (and still growing) character set and
    encoding, which encompasses essentially all the standard computer
    character sets that predated it.



                                                                                13
Unicode
• Unicode provides a unique number (a code point)
  for every character
                     NO matter what the platform
                     NO matter what the program
                     NO matter what the language

• Unicode is an international standard that assigns a
  unique number to characters from virtually every
  language and script

• Unicode defined more than 90,000 characters when these
  slides were written (later versions define well over
  100,000), with room for more than 1 million. With
  Unicode, all characters used in business-relevant
  languages can be represented
                                                        14
Unicode (cont…)
•   Almost any computer Code Page can be mapped to Unicode and
    back. However, in computer systems Unicode is largely replacing
    Code Page based approaches

•   Instead of having dozens of Code Pages each using and re-using
    the same numbered slots for different characters, each character
    gets its own unique numbered slot in Unicode

•   Think of Unicode as a label attached to the character via which the
    character can be accessed by applications and operating systems

•   Example: The English letter A is U+0041, Hebrew letter alef is
    U+05D0, Greek letter alpha (α) would be U+03B1, etc – basically we
    have covered them all
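The three code points in the example can be looked up with Python's built-in Unicode database. A minimal sketch; the dictionary name is only illustrative.

```python
import unicodedata

# English A, Hebrew alef, Greek alpha (written as escapes to stay ASCII-safe).
chars = ["A", "\u05D0", "\u03B1"]

# Map each code point label to the official Unicode character name.
code_points = {f"U+{ord(ch):04X}": unicodedata.name(ch) for ch in chars}

for label, name in code_points.items():
    print(label, name)
```

The numbers are the same on every platform and in every program, which is the point of the "unique numbered slot".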



                                                                          15
Does Unicode encode Language,
    Font, Size, Positioning, Glyphs?
•   The Unicode Standard does not attempt to encode features such as
    language, font, size, positioning, glyphs, and so forth. For example, it does
    not preserve language as a part of character encoding: French i
    grec, German ypsilon, and English wye are all represented by the same
    character code, U+0059 “Y”. The Unicode Standard deals only with
    character codes.
•   Glyphs represent the shapes that characters can have when they are
    rendered or displayed. In contrast to characters, glyphs appear on the
    screen or paper as particular representations of one or more characters. A
    repertoire of glyphs makes up a font. Glyph shape and methods of
    identifying and selecting glyphs are the responsibility of individual font
    vendors and of appropriate standards and are not part of the Unicode
    Standard.
• AAAAAAAA                             All represented by Latin capital letter A (U+0041)

• aaaaaaaaaa                               All represented by Latin small letter a (U+0061)




                                                                                              16
Unicode Challenges
• But have we addressed all the issues?
• Of course not. Unicode has mapped all the
  characters uniquely, but how do we store them in
  memory or represent them in an email message?
  The English letter A is U+0041, but in
  memory should it be stored as [00 41] or as [41
  00] (Endianness)?
• What about all those zeros. Are we doubling the
  disk space, resulting in more cooling costs and
  more greenhouse issues? [TX okay, but CA?]
• Welcome to the UTF-8 Standards!

                                                17
Unicode UTF-8 standard
•   UTF-8 (8-bit UCS/UTF) is a variable-length character encoding for
    Unicode. In UTF-8, every code point from 0-127 is stored in a single
    byte. Code points 128 and above are stored using 2, 3, or 4 bytes
    (the original design allowed up to 6; the current standard stops at 4)
•   If a legacy system can understand ASCII, it can understand the
    English portion of UTF-8, so old programs can still
    decipher English text from UTF-8. They cannot decipher any other
    language in UTF-8 that needs two or more bytes per character (they were
    not designed to read other languages, so they are basically not affected)
•   With UTF-8, memory and disk space are conserved for text that is mostly ASCII
•   UTF-8 is interpreted as a sequence of bytes, so there is no endian
    problem as there is for encoding forms that use 16-bit or 32-bit code
    units.
•   UCS stands for Universal Character Set
•   UTF stands for Unicode Transformation Format
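The variable length of UTF-8 is easy to observe. The following Python sketch encodes a handful of sample characters (chosen here only for illustration) and reports how many bytes each one needs:

```python
# Sample characters and a short description of each.
samples = {
    "A": "Latin capital letter A, U+0041",
    "\u03B1": "Greek small letter alpha, U+03B1",
    "\u05D0": "Hebrew letter alef, U+05D0",
    "\u20AC": "euro sign, U+20AC",
    "\U0001F600": "emoji, U+1F600",
}

# Number of bytes each character occupies in UTF-8.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, label in samples.items():
    encoded = ch.encode("utf-8")
    print(f"{label:34} {len(encoded)} byte(s): {encoded.hex()}")

# Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8,
# which is why legacy ASCII programs can still read it.
same = "ABAP".encode("utf-8") == "ABAP".encode("ascii")
print(same)   # True
```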


                                                                       18
Unicode other standards
• UCS-2 (fixed 2 bytes per character) and UTF-16 (16-bit code
  units; two units for characters beyond U+FFFF)
   – Each exists in a big-endian and a little-endian byte order
• UTF-7 (similar to UTF-8 but guarantees that the high bit
  will always be zero, for compatibility with old programs'
  requirements)
• UTF-32 (32 bits)
• UTF-8 is the most popular standard today
• A byte order mark (BOM) consists of the character code
  U+FEFF at the beginning of a data stream, where it can
  be used as a signature defining the byte order
• Where a BOM is used with UTF-8, it is only used as an
  encoding signature to distinguish UTF-8 from other
  encodings — it has nothing to do with byte order
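The byte order question for U+0041 ([00 41] or [41 00]) and the role of the BOM can be sketched in Python as follows. The helper `sniff_bom` is a hypothetical name introduced for this illustration, and it deliberately ignores UTF-32, whose little-endian BOM starts with the same two bytes as the UTF-16 one.

```python
import codecs

letter = "A"                       # U+0041
be = letter.encode("utf-16-be")    # bytes 00 41
le = letter.encode("utf-16-le")    # bytes 41 00
print(be.hex(), le.hex())          # 0041 4100

def sniff_bom(data: bytes) -> str:
    """Guess the encoding from a leading byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 (signature only, no byte order)"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16 big endian"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16 little endian"
    return "no BOM found"

print(sniff_bom(codecs.BOM_UTF16_BE + be))
print(sniff_bom(codecs.BOM_UTF16_LE + le))
print(sniff_bom(letter.encode("utf-8-sig")))   # "utf-8-sig" prepends the UTF-8 BOM
print(sniff_bom(letter.encode("utf-8")))
```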

                                                         19
Conveying the Encoding used
• How do we preserve this information about what
  encoding a string uses?
   – For an email message, you are expected to have a string in the
     header of the form
            Content-Type: text/plain; charset="UTF-8"
   – For an HTML page, a meta tag in the head declares the encoding.
            <html>
            <head>
            <meta http-equiv="Content-Type" content="text/html; charset=utf-8">


• For the most consistent results, any new applications
  developed should use Unicode, such as UTF-8 or UTF-
  16, instead of a specific code page

• For Unicode UTF-8, the Windows Code Page is 65001
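A receiving program reads the declared charset first and only then decodes the bytes. As a sketch, Python's standard `email` package can extract the charset from the header line shown above; the message body is made up for the example.

```python
from email import message_from_string

# A minimal message with the header from the slide and a one-line body.
raw_message = (
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "Hello\n"
)

msg = message_from_string(raw_message)
content_type = msg.get_content_type()     # "text/plain"
charset = msg.get_content_charset()       # lower-cased by the email package
print(content_type, charset)

# The declared charset tells the receiver how to turn bytes back into text.
payload_bytes = "Grüße".encode("utf-8")
decoded_text = payload_bytes.decode(charset)
print(decoded_text)
```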
                                                                                  20
Unicode SAP system
• Enables you to harness Internet technologies better
• Allows better integration with non-SAP products and
  seamless integration with existing SAP systems
• Offers a superior platform for collaborative, cross-system
  business applications
• Work with all languages and language combinations in
  the world
• Allows you to install a central system for worldwide
  business processes, e.g. to gather and store aggregate
  customer data
• Enables you to optimize your system landscape and
  reduce your costs
                                                          21
Unicode SAP system (cont…)
• Unicode Program: A Unicode program is an
  ABAP program in which the Unicode checks are
  in effect and in which certain statements
  have different semantics from those that apply
  in a non-Unicode program.
• Unicode System: Single-code-page system in
  which characters are coded in Unicode
  character representation.
• The Unicode check was tightened as of Release
  6.10

                                                  22
SAP Unicode Overhead
• Main Memory:
  – Average increase +40...50% -> Reason: Application
    servers are based on UTF-16

• Network load:
  – ~0% -> Almost no change due to efficient
    compression.

• Database size: Average increase
  – UTF-8: +10% (smaller systems (< 200GB) might grow
    more)
  – UTF-16: +20...60%
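The direction of these figures follows from the raw encodings. The Python sketch below compares the size of one made-up text value in a single-byte code page, in UTF-8 and in UTF-16. It only illustrates per-character cost; it does not reproduce the averages above, which also depend on non-character data and compression.

```python
text = "Müller GmbH, Hauptstraße 5"   # illustrative master-data value

single_byte = text.encode("cp1252")   # non-Unicode system, Western code page
utf8 = text.encode("utf-8")           # one byte for ASCII, two for ü and ß
utf16 = text.encode("utf-16-le")      # two bytes for every character here

print(len(single_byte), len(utf8), len(utf16))
```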
                                                        23
SAP File Interface and Unicode
• It is possible to exchange files between Unicode and non-
  Unicode systems, between different Unicode systems
  and between different non-Unicode systems with
  different code pages
• Instead of relying on implicit default settings over
  which we have no control, programmers are required to
  program explicitly and specify all important parameters
  (stringent requirements that enforce good programming
  practice)
• Examples of explicit programming: a file must be
  opened before each read/write, the access type and type of
  data storage must be specified, a file opened with
  read-only access remains that way throughout the
  program, a file opened as text can contain text only, etc

                                                         24
SAP Authorization for File
             Access
• Operating system check
 System automatically checks the entries in the database
 table SPTH for access to individual files - none of the
 following (S_PATH / S_DATASET) can override this.
• Program independent authorization check
 The check against the authorization object S_PATH is independent of ABAP
 program used and is not restricted to an individual file but all files in the
 PATH/folder.

• User and program authorization check
 The check against the authorization object S_DATASET is based on
 the program name, filename and activity (Delete, Read, Write, Read with
 Filter and Write with Filter).


                                                                            25
File Interface Statements
• OPEN DATASET
• TRANSFER
• READ DATASET
• GET DATASET
• SET DATASET
• TRUNCATE DATASET
• CLOSE DATASET
• DELETE DATASET
                             26
Opening a File
• OPEN DATASET dset FOR access IN mode
  [position] [os_addition] [error_handling].
  – dset is the file name including path (/usr/tmp/test.dat)
  – access can be
     • INPUT (opens the file for reading only; the file pointer is set at the start of
       the file; if the file does not exist, sy-subrc is set to 8. In a Unicode
       program it is not possible to write to a file opened for reading, whereas a
       non-Unicode program allows both)
     • OUTPUT (opens a new file for writing; if the file already exists, its content is
       deleted. Read access is permitted)
     • APPENDING (opens the file for appending, with the file pointer set at the end
       of the file; if the file does not exist, it is created. A read attempt fails and
       sy-subrc is set to 4)
     • UPDATE (opens the file for updating, with the file pointer set at the start of
       the file; if the file does not exist, sy-subrc is set to 8)




                                                                                       27
Opening a File (continued)
– Syntax of mode
   • BINARY MODE (opens the file as a binary file; the
     binary content of a data object is transferred unchanged)
   • TEXT MODE ENCODING code (opens the file as a text file.
     When writing, the content of a data object is converted
     to the representation specified after code [UTF-8 or non-
     Unicode] and transferred to the file. Trailing blanks are
     truncated for fixed-length character fields, but not for
     strings. When reading, the content of the file is read up
     to the next end-of-line marker, converted from the format
     specified after code [UTF-8, or the non-Unicode code page
     specified in database table TCP0C] into the current
     character format and transferred to a data object)
   • LEGACY BINARY MODE [endian] [codepage]
   • LEGACY TEXT MODE [endian] [codepage]


                                                                  28
Opening a File (continued)
• AT POSITION pos
 When the file is opened with this option, pos defines where the file
 pointer is positioned, in bytes (0 means start of file, -1 means end
 of file, and any value i means i bytes from the start of the file)
• TYPE attr
 On non-MS O/S, attr can contain O/S-specific parameters for the
 file to be opened (OS/400 'blksize=8000', etc). On MS O/S, if attr
 contains 'NT' the end-of-line is marked by "CRLF", and if it
 contains 'UNIX' the end-of-line is marked by "LF".
• FILTER opcom
 With the FILTER option, opcom can be an O/S command that is started
 when OPEN DATASET is executed, for example FILTER 'compress' or
 FILTER 'uncompress'
     OPEN DATASET filexyz FOR OUTPUT IN BINARY MODE FILTER 'compress'.
     OPEN DATASET filexyz FOR INPUT IN BINARY MODE FILTER 'uncompress'.


                                                                          29
Error Handling
• [MESSAGE msg]
 When an error occurs, the O/S error message is assigned to the data
 object msg so that the ABAP program can display it to the user

• [IGNORING CONVERSION ERRORS]
 This addition suppresses the catchable exception of class
 CX_SY_CONVERSION_CODEPAGE; each unconvertible character is
 replaced by the literal '#'

• [REPLACEMENT CHARACTER rc]
 Same as above, except that each unconvertible character is
 replaced by the single character specified by rc – not applicable
 for binary files
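The idea of a replacement character is not specific to ABAP. The following is a Python analogy only, not the SAP kernel's conversion and not ABAP syntax: the hypothetical helper `to_code_page` substitutes a chosen character for anything the target code page cannot represent.

```python
def to_code_page(text: str, code_page: str, replacement: str = "#") -> bytes:
    """Encode text, substituting `replacement` for unconvertible characters."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(code_page)
        except UnicodeEncodeError:
            # Character does not exist in the target code page.
            out += replacement.encode(code_page)
    return bytes(out)

sample = "Preis: 5 \u20AC, \u03B1"           # euro sign and Greek alpha
default_result = to_code_page(sample, "latin-1")
custom_result = to_code_page(sample, "latin-1", replacement="?")
print(default_result)   # b'Preis: 5 #, #'
print(custom_result)    # b'Preis: 5 ?, ?'
```

Without such handling, the conversion raises an error instead, which corresponds to the exception mentioned above.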


                                                                     30
TRANSFER and READ
            Commands
• TRANSFER dobj TO dset [LENGTH len]
  [NO END OF LINE]
 The content of dobj is written to the file from the current file
 pointer. LENGTH determines how many characters/bytes are
 written to the file; NO END OF LINE prevents the end-of-line
 marker from being appended to the data transferred
• READ DATASET dset INTO dobj [MAXIMUM
  LENGTH mlen] [[ACTUAL] LENGTH alen]
 This reads data from the file specified in dset into the
 data object dobj, starting from the current file pointer. With
 the MAXIMUM LENGTH addition, the number of characters or
 bytes to be read from the file can be limited. With
 ACTUAL LENGTH, the number of characters or bytes actually
 read can be determined (mlen can be 100, but if the file
 holds only 60, alen is returned with 60)

                                                             31
GET and SET Commands
• GET DATASET dset [POSITION pos]
  [ATTRIBUTES attr]
 POSITION returns the current position of the file pointer in pos.
 ATTRIBUTES lets us read the values of the fixed and changeable file
 attributes
• SET DATASET dset [POSITION pos|{END OF
  FILE}] [ATTRIBUTES attr]
 Position sets the position of the file pointer to new position
 indicated by pos. Attributes enables us to update the value
 of changeable file attributes


                                                                       32
ATTRIBUTES
•   Fixed Attributes
     – Indicator (sub-structure with the following fields and indicates ‘X’ if the following
       are significant)
     – Mode (Text (T), Binary (B), Legacy Binary (LB) and Legacy Text (LT))
     – Access_type (Reading (I), writing (O), appending (A) and editing (U))
     – Encoding (UTF-8 and NON-UNICODE)
     – Filter (filter command, example ‘compress’)
•   Changeable Attributes
     – Indicator (sub-structure with the following fields and indicates ‘X’ if the following
       are significant)
     – Repl_char (replacement character rc)
     – Conv_error (contains 'I' if the IGNORING CONVERSION ERRORS addition was used, 'R'
       otherwise)
     – Code_page (code page that was specified, initial otherwise)
     – Endian (B for Big Endian, L for Little Endian, initial otherwise)

     Example:
     DATA attr TYPE dset_attributes. " dset_attributes is defined by SAP in type group DSET
     GET DATASET dset ATTRIBUTES attr.
     IF attr-fixed-indicator-filter <> 'X'.
       …
     ENDIF.

                                                                                          33
Other commands
• TRUNCATE DATASET dset AT
  {CURRENT POSITION} | {POSITION pos}
 The file size is modified by setting the end of file at the
 current position or at pos. When shortened, the file is truncated after
 the new end of file; when extended (pos > current file size), the
 file is filled with hexadecimal zeros from the old to the new end of
 file.

• CLOSE DATASET dset
 Closes file on the application server.

• DELETE DATASET dset
 Deletes file on the application server.

                                                                     34
Files on the Presentation
               Server
• The CL_GUI_FRONTEND_SERVICES class of the
  class library contains the required methods for
  processing files on the presentation server
  (client/PC). There are no ABAP statements
  available for processing files here.
  – GUI_DOWNLOAD for writing files
  – GUI_UPLOAD for reading files
  – DIRECTORY_CREATE and DIRECTORY_DELETE for
    creating and deleting a directory
  – FILE_DELETE, FILE_COPY, FILE_EXIST, etc., for file
    operations
• The above is the class, but function modules
  GUI_DOWNLOAD and GUI_UPLOAD can also be
  used.

                                                         35

Más contenido relacionado

La actualidad más candente

Unicode Encoding Forms
Unicode Encoding FormsUnicode Encoding Forms
Unicode Encoding FormsMehdi Hasan
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6Andrei Zmievski
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesJaphet Munnah
 
College forum software
College forum softwareCollege forum software
College forum softwareRahul E
 

La actualidad más candente (9)

Character Sets
Character SetsCharacter Sets
Character Sets
 
Unicode Encoding Forms
Unicode Encoding FormsUnicode Encoding Forms
Unicode Encoding Forms
 
Base-64 Presentation
Base-64 PresentationBase-64 Presentation
Base-64 Presentation
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
 
Ascii 03
Ascii 03Ascii 03
Ascii 03
 
Uncdtalk
UncdtalkUncdtalk
Uncdtalk
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codes
 
Storing text
Storing textStoring text
Storing text
 
College forum software
College forum softwareCollege forum software
College forum software
 

Destacado

Abap course chapter 3 basic concepts
Abap course   chapter 3 basic conceptsAbap course   chapter 3 basic concepts
Abap course chapter 3 basic conceptsMilind Patil
 
Lecture04 abap on line
Lecture04 abap on lineLecture04 abap on line
Lecture04 abap on lineMilind Patil
 
Abap course chapter 1 introduction and first program
Abap course   chapter 1 introduction and first programAbap course   chapter 1 introduction and first program
Abap course chapter 1 introduction and first programMilind Patil
 
Abap course chapter 4 database accesses
Abap course   chapter 4 database accessesAbap course   chapter 4 database accesses
Abap course chapter 4 database accessesMilind Patil
 
Abap course chapter 2 tools in the development environment
Abap course   chapter 2 tools in the development environmentAbap course   chapter 2 tools in the development environment
Abap course chapter 2 tools in the development environmentMilind Patil
 
SAP ABAP Lock concept and enqueue
SAP ABAP Lock concept and enqueueSAP ABAP Lock concept and enqueue
SAP ABAP Lock concept and enqueueMilind Patil
 
Abap course chapter 5 dynamic programs
Abap course   chapter 5 dynamic programsAbap course   chapter 5 dynamic programs
Abap course chapter 5 dynamic programsMilind Patil
 
Abap course chapter 6 specialities for erp software
Abap course   chapter 6 specialities for erp softwareAbap course   chapter 6 specialities for erp software
Abap course chapter 6 specialities for erp softwareMilind Patil
 
Lecture01 abap on line
Lecture01 abap on lineLecture01 abap on line
Lecture01 abap on lineMilind Patil
 
Sap abap ppt
Sap abap pptSap abap ppt
Sap abap pptvonline
 
Abap course chapter 7 abap objects and bsp
Abap course   chapter 7 abap objects and bspAbap course   chapter 7 abap objects and bsp
Abap course chapter 7 abap objects and bspMilind Patil
 
Abap slides user defined data types and data
Abap slides user defined data types and dataAbap slides user defined data types and data
Abap slides user defined data types and dataMilind Patil
 
Introduction to ABAP
Introduction to ABAPIntroduction to ABAP
Introduction to ABAPsapdocs. info
 

Destacado (14)

Abap course chapter 3 basic concepts
Abap course   chapter 3 basic conceptsAbap course   chapter 3 basic concepts
Abap course chapter 3 basic concepts
 
Unicode In Sap Abap1
Unicode In Sap Abap1Unicode In Sap Abap1
Unicode In Sap Abap1
 
Lecture04 abap on line
Lecture04 abap on lineLecture04 abap on line
Lecture04 abap on line
 
Abap course chapter 1 introduction and first program
Abap course   chapter 1 introduction and first programAbap course   chapter 1 introduction and first program
Abap course chapter 1 introduction and first program
 
Abap course chapter 4 database accesses
Abap course   chapter 4 database accessesAbap course   chapter 4 database accesses
Abap course chapter 4 database accesses
 
Abap course chapter 2 tools in the development environment
Abap course   chapter 2 tools in the development environmentAbap course   chapter 2 tools in the development environment
Abap course chapter 2 tools in the development environment
 
SAP ABAP Lock concept and enqueue
SAP ABAP Lock concept and enqueueSAP ABAP Lock concept and enqueue
SAP ABAP Lock concept and enqueue
 
Abap course chapter 5 dynamic programs
Abap course   chapter 5 dynamic programsAbap course   chapter 5 dynamic programs
Abap course chapter 5 dynamic programs
 
Abap course chapter 6 specialities for erp software
Abap course   chapter 6 specialities for erp softwareAbap course   chapter 6 specialities for erp software
Abap course chapter 6 specialities for erp software
 
Lecture01 abap on line
Lecture01 abap on lineLecture01 abap on line
Lecture01 abap on line
 
Sap abap ppt
Sap abap pptSap abap ppt
Sap abap ppt
 
Abap course chapter 7 abap objects and bsp
Abap course   chapter 7 abap objects and bspAbap course   chapter 7 abap objects and bsp
Abap course chapter 7 abap objects and bsp
 
Abap slides user defined data types and data
Abap slides user defined data types and dataAbap slides user defined data types and data
Abap slides user defined data types and data
 
Introduction to ABAP
Introduction to ABAPIntroduction to ABAP
Introduction to ABAP
 

Similar a Abap slide class4 unicode-plusfiles

Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptAlula Tafere
 
Understanding Character Encodings
Understanding Character EncodingsUnderstanding Character Encodings
Understanding Character EncodingsMobisoft Infotech
 
Comprehasive Exam - IT
Comprehasive Exam - ITComprehasive Exam - IT
Comprehasive Exam - ITguest6ddfb98
 
Internationalisation And Globalisation
Internationalisation And GlobalisationInternationalisation And Globalisation
Internationalisation And GlobalisationAlan Dean
 
Character sets and alphabets
Character sets and alphabetsCharacter sets and alphabets
Character sets and alphabetsRazinaShamim
 
chapter-2.pptx
chapter-2.pptxchapter-2.pptx
chapter-2.pptxRithinA1
 
PDT DC015 Chapter 2 Computer System 2017/2018 (e)
PDT DC015 Chapter 2 Computer System 2017/2018 (e)PDT DC015 Chapter 2 Computer System 2017/2018 (e)
PDT DC015 Chapter 2 Computer System 2017/2018 (e)Fizaril Amzari Omar
 
Compgenerations pented
Compgenerations pentedCompgenerations pented
Compgenerations pentedSajib
 
Unicode - Hacking The International Character System
Unicode - Hacking The International Character SystemUnicode - Hacking The International Character System
Unicode - Hacking The International Character SystemWebsecurify
 
4 character encoding
4 character encoding4 character encoding
4 character encodingirdginfo
 
Xml For Dummies Chapter 6 Adding Character(S) To Xml
Xml For Dummies   Chapter 6 Adding Character(S) To XmlXml For Dummies   Chapter 6 Adding Character(S) To Xml
Xml For Dummies Chapter 6 Adding Character(S) To Xmlphanleson
 
Introduction to computers
Introduction to computersIntroduction to computers
Introduction to computersLearn By Watch
 

Abap slide class4 unicode-plusfiles

  • 2. Topics • Characters and Encoding • ASCII standards • Glyphs and Fonts • Extended ASCII and issues • Character Sets and Code Pages • Little and Big Endian • Unicode • Unicode Transformation Formats [UTF-8, etc] • Unicode SAP system • SAP Unicode Overhead • SAP File Interface • SAP Authorization for File Access • Files on the Application Server • File Interface Statements (Open, Transfer, Read, Get, Set, etc) • Error Handling • Attributes and Other commands • Files on the Presentation Server 2
  • 3. Characters and Encoding • Characters are represented by character codes • This coding is called character encoding • Character codes are generated and stored when a user inputs and saves a document • When a document is read by the system, it interprets the stored character codes and displays them as characters in a format that we understand
  • 4. ASCII standards • The American Standard Code for Information Interchange (ASCII) was created by the American Standards Association, which later became the American National Standards Institute (ANSI) • For example, in ASCII the character ‘A’ is represented by decimal code 65 or hexadecimal code 41 and is stored as binary code 01000001 • Single-byte character sets provide 256 character codes. This is an adequate number to encode most of the characters needed for Western Europe • BTW: Extended Binary Coded Decimal Interchange Code [EBCDIC], an 8-bit character encoding that IBM developed at around the same time as ASCII and uses on its mainframe operating systems, is not discussed here
  • 5. Glyphs and Fonts • A Glyph (pronounced "glif") is a visual representation of a character – example: A A A A A A A A (the same letter shown in different typefaces) • Users don't view or print characters; they view or print Glyphs • The character "Capital Letter A" represented by the Glyph in Times New Roman Bold is different from the Glyph in Arial Bold (each Glyph looks visually different) • A single character can be represented by several different Glyphs in a Font • A Font is a collection of Glyphs
  • 6. Extended ASCII and issues • ASCII represents every character using a number between 0 and 127; the printable characters occupy 32 to 126. Space is 32, the letter "A" is 65, etc. This can conveniently be stored in 7 bits because there are only 128 (2^7) codes in total • Historically most computers used 8-bit bytes, so there was still 1 bit to spare • Extended ASCII, which made use of this spare bit, was not standardized across the world • The IBM PC had something that came to be known as the OEM [Original Equipment Manufacturer] character set, which provided some accented characters for European languages as well as vertical and horizontal line-drawing characters that text-mode PCs could display and print • An assortment of 256-character Windows ANSI character sets covers all the 8-bit languages targeted by Windows • Programmers in Israel, Russia (USSR) and Asia used the 8th bit to represent the characters of their own languages, so there was no universal standard left for the codes from 128 upwards and confusion prevailed over the 8th bit • Something was required to map the various character codes created and used, not only for Extended ASCII but also for any new mapping developed
  • 7. Character Sets and Code Pages • A Character Set is any specific collection of characters • A Code Page is a list of selected character codes for a Character Set in a particular order • Code Page is another name for the encoding of each character in a Character Set (Fonts could have their own Character Set) • A Code Page is a character set encoding that can include numbers, punctuation marks, and other glyphs. Code Pages are not the same for each language • Many Code Pages are single-byte Character Sets, that is, they contain no more than 256 characters • A Code Page is a representation of a Character Set used by a computer (OS) to support a specific language or set of languages. Examples (Character Set: Windows Code Page): US-ASCII: 20127; German (IA5): 20106; Korean (ISO): 50225 • Some languages, such as Japanese, have multi-byte characters, while others, like English and German, need only one byte to represent each character
  • 8. Character Sets and Code Pages (cont…) Within each Code Page, the characters of the Character Set are mapped to character codes (encoded)
  • 9. Character Sets and Code Pages (cont…) So potentially we could have hundreds of Character Sets, and these have to be mapped to numerous Code Pages, which is a maintenance nightmare
  • 10. Character Sets and Code Pages (cont…) • Not every Code Page exists on every computer; Code Pages can differ between computers, and they can be changed on a single computer • This results in confusion and emails like these: – Dear □ □ ??? Thank □□□ █ █ █ █ ???
  • 11. Little and Big Endian • Some examples of ABAP built-in Data Types are: b: 1 byte (1-byte integer, internal); i: 4 bytes (4-byte integer); f: 8 bytes (floating point number) • Question: For multi-byte data (say, i or f shown above), where does the biggest (most significant or highest-order) byte appear in memory? • Little Endian: as used in Intel processors, stores the low-order byte of a number in memory at the lowest address • Big Endian: as used by Motorola processors, IBM's 370 mainframes, and most RISC-based computers, stores the high-order byte of a number in memory at the lowest address • Example 1: a 4-byte long integer [Byte3 Byte2 Byte1 Byte0] is arranged in memory as follows. Little Endian: Base Address+0 = Byte0, +1 = Byte1, +2 = Byte2, +3 = Byte3. Big Endian: Base Address+0 = Byte3, +1 = Byte2, +2 = Byte1, +3 = Byte0
  • 12. Little and Big Endian (cont…) • Example 2: to store the two bytes required for the hexadecimal number 4F52, the two methods use the following representations (BTW: this is equal to 2*16^0 + 5*16^1 + 15*16^2 + 4*16^3 = 20306 in decimal) • Little Endian representation in memory: Base Address+0 = 52, Base Address+1 = 4F • Big Endian representation in memory: Base Address+0 = 4F, Base Address+1 = 52 • Big Endian is easy to understand because it is consistent with the order we use naturally when we read and write text and numbers • Irrespective of the BYTE order, which depends on the Big Endian or Little Endian representation, the BIT order within each byte is always big-endian. Example: 01001001 = 0 + 2^6 + 0 + 0 + 2^3 + 0 + 0 + 2^0 = 64 + 8 + 1 = 73
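The byte-order rule from the two endianness slides can be tried out in ABAP. The following is a minimal sketch, assuming a release that provides the classes cl_abap_conv_out_ce and cl_abap_char_utilities; the report name zdemo_endian is hypothetical. It writes the integer 20306 (hex 4F52) from Example 2 into a byte buffer once per byte order, so by the rule on the slide the little-endian buffer would be expected to begin with byte 52 and the big-endian buffer to end with it.

```abap
REPORT zdemo_endian.

" Hypothetical demo report: serialize the 4-byte integer 20306 (hex 4F52)
" once in little-endian and once in big-endian byte order.
DATA lv_int    TYPE i VALUE 20306.
DATA lo_conv   TYPE REF TO cl_abap_conv_out_ce.
DATA lv_little TYPE xstring.
DATA lv_big    TYPE xstring.

" 'L' requests little-endian output for numeric data
lo_conv = cl_abap_conv_out_ce=>create( endian = 'L' ).
lo_conv->write( data = lv_int ).
lv_little = lo_conv->get_buffer( ).

" 'B' requests big-endian output for numeric data
lo_conv = cl_abap_conv_out_ce=>create( endian = 'B' ).
lo_conv->write( data = lv_int ).
lv_big = lo_conv->get_buffer( ).

" Byte-like fields are printed in hexadecimal by WRITE
WRITE: / 'Little endian:', lv_little,
       / 'Big endian:   ', lv_big,
       / 'Byte order of this application server:',
         cl_abap_char_utilities=>endian.
```

The constant cl_abap_char_utilities=>endian reports the byte order of the current application server as 'B' or 'L', which is useful when a file has to be exchanged with a system of the other kind.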
  • 13. Need for Standards - Unicode • We have seen the confusion that arises when each party (hardware manufacturers, software companies, regions, countries, groups) creates Code Pages for its own requirements and its own Character Sets • Without any set standards, and with the advent of the internet, sharing information could become almost impossible • What if we had one standard Code Page, containing all possible character codes, that any computer or software could decipher? • Well, Unicode is the answer. It is not a Code Page, but more like a “meta-Code Page” • Unicode is a brave effort to create a single character set that includes every reasonable writing system on the planet • Think of Unicode as a set of all possible character codes • Unicode is a single very large (and still growing) character set and encoding, which encompasses essentially all the standard computer character sets that predated it
  • 14. Unicode • Unicode provides a unique number (a code point) for every character: NO matter what the platform, NO matter what the program, NO matter what the language • Unicode is an international standard that assigns a unique number to characters from virtually every language and script • Unicode defines well over 100,000 characters (these slides cited more than 90,000 when they were written), with room for more than 1 million characters. With Unicode, all characters used in business-relevant languages can be represented
  • 15. Unicode (cont…) • Almost any computer Code Page can be mapped to Unicode and back. However, in computer systems Unicode is largely replacing Code Page based approaches • Instead of having dozens of Code Pages each using and re-using the same numbered slots for different characters, each character gets its own unique numbered slot in Unicode • Think of Unicode as a label attached to the character via which the character can be accessed by applications and operating systems • Example: the English letter A is U+0041, the Hebrew letter alef is U+05D0, the Greek letter alpha (α) is U+03B1, etc. Basically we have covered them all
  • 16. Does Unicode encode Language, Font, Size, Positioning, Glyphs? • The Unicode Standard does not attempt to encode features such as language, font, size, positioning, glyphs, and so forth. For example, it does not preserve language as a part of character encoding: French i grec, German ypsilon, and English wye are all represented by the same character code, U+0059 “Y”. The Unicode Standard deals only with character codes • Glyphs represent the shapes that characters can have when they are rendered or displayed. In contrast to characters, glyphs appear on the screen or paper as particular representations of one or more characters. A repertoire of glyphs makes up a font. Glyph shape and methods of identifying and selecting glyphs are the responsibility of individual font vendors and of appropriate standards and are not part of the Unicode Standard • AAAAAAAA All represented by Latin capital letter A (U+0041) • aaaaaaaaaa All represented by Latin small letter a (U+0061)
  • 17. Unicode Challenges • But have we addressed all the issues? • Of course not. Unicode has mapped all the characters uniquely, but how do we store a code point in memory or represent it in an email message? The English letter A is U+0041, but should it be stored in memory as [00 41] or as [41 00] (the endianness question)? • What about all those zeros? Are we doubling the disk space, resulting in more cooling costs and more greenhouse issues? [TX okay, but CA?] • Welcome to the UTF-8 standard!
  • 18. Unicode UTF-8 standard • UTF-8 (8-bit UCS/UTF) is a variable-length character encoding for Unicode. In UTF-8, every code point from 0 to 127 is stored in a single byte. Code points 128 and above are stored using 2, 3 or 4 bytes (the original design allowed up to 6 bytes, but RFC 3629 restricts UTF-8 to a maximum of 4) • If a legacy system can understand ASCII, it can understand the English portion of UTF-8, so old programs can still decipher English text in UTF-8. They cannot decipher text in other languages that takes two or more bytes per character (they were not designed to read other languages, so they are basically not affected) • With the UTF-8 standard, memory and disk space are conserved • UTF-8 is interpreted as a sequence of bytes, so there is no endian problem as there is for encoding forms that use 16-bit or 32-bit code units • UCS stands for Universal Character Set • UTF stands for Unicode Transformation Format
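The variable length of UTF-8 can be made visible with a few code points. The sketch below assumes a Unicode system in which cl_abap_conv_in_ce=>uccp and cl_abap_codepage=>convert_to are available; the report name and variable names are hypothetical. UTF-8 defines one byte (41) for U+0041, two bytes (CE B1) for U+03B1 and three bytes (E2 82 AC) for U+20AC, so those are the lengths the loop would be expected to report.

```abap
REPORT zdemo_utf8_length.

" Hypothetical demo report: encode three code points in UTF-8
" and print the resulting bytes and their count.
TYPES ty_code_point TYPE c LENGTH 4.

DATA lt_code_points TYPE STANDARD TABLE OF ty_code_point WITH DEFAULT KEY.
DATA lv_code_point  TYPE ty_code_point.
DATA lv_char        TYPE string.
DATA lv_bytes       TYPE xstring.
DATA lv_len         TYPE i.

APPEND '0041' TO lt_code_points.   " Latin capital letter A
APPEND '03B1' TO lt_code_points.   " Greek small letter alpha
APPEND '20AC' TO lt_code_points.   " Euro sign

LOOP AT lt_code_points INTO lv_code_point.
  " Turn the hexadecimal code point into the character itself
  lv_char  = cl_abap_conv_in_ce=>uccp( lv_code_point ).
  " Encode that character as UTF-8 bytes
  lv_bytes = cl_abap_codepage=>convert_to( source   = lv_char
                                           codepage = `UTF-8` ).
  lv_len   = xstrlen( lv_bytes ).
  WRITE: / 'U+', lv_code_point, 'UTF-8 bytes:', lv_bytes, 'length:', lv_len.
ENDLOOP.
```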
  • 19. Unicode other standards • UCS-2 (always 2 bytes per character) and UTF-16 (16-bit code units, with pairs of units for characters beyond the first 65,536 code points), each in a big-endian or a little-endian variant • UTF-7 (similar to UTF-8 but guarantees that the high bit will always be zero, to be consistent with the requirements of old programs) • UTF-32 (32 bits) • UTF-8 is the most popular standard today • A byte order mark (BOM) consists of the character code U+FEFF at the beginning of a data stream, where it can be used as a signature defining the byte order • Where a BOM is used with UTF-8, it is only used as an encoding signature to distinguish UTF-8 from other encodings; it has nothing to do with byte order
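In ABAP the byte order marks are available as constants, which gives a quick way to see how U+FEFF looks in each encoding. This is a small sketch, assuming the constants of cl_abap_char_utilities named below exist in the release at hand; the report name and the file path are only illustrative.

```abap
REPORT zdemo_bom.

" Hypothetical demo report: print the byte order marks in hexadecimal.
WRITE: / 'UTF-8 signature:     ', cl_abap_char_utilities=>byte_order_mark_utf8,
       / 'UTF-16 little endian:', cl_abap_char_utilities=>byte_order_mark_little,
       / 'UTF-16 big endian:   ', cl_abap_char_utilities=>byte_order_mark_big.

" A UTF-8 text file can be written with the signature in front
" (the path is only an example and must be writable on the server).
DATA lv_file TYPE string VALUE '/usr/tmp/test.dat'.

OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE
     ENCODING UTF-8 WITH BYTE-ORDER MARK.
IF sy-subrc = 0.
  TRANSFER `Text with a UTF-8 signature` TO lv_file.
  CLOSE DATASET lv_file.
ENDIF.
```

U+FEFF is stored as EF BB BF in UTF-8, FF FE in little-endian UTF-16 and FE FF in big-endian UTF-16, so a reader can tell the encodings apart from the first bytes of the stream.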
  • 20. Conveying the Encoding used • How do we preserve the information about which encoding a string uses? – For an email message, you are expected to have a string in the header of the form Content-Type: text/plain; charset="UTF-8" – For an HTML page, by using a special tag: <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> • For the most consistent results, any new applications developed should use Unicode, such as UTF-8 or UTF-16, instead of a specific code page • For Unicode UTF-8, the Windows Code Page is 65001
  • 21. Unicode SAP system • Enables you to harness Internet technologies better • Allows better integration with non-SAP products and seamless integration with existing SAP systems • Offers a superior platform for collaborative, cross-system business applications • Work with all languages and language combinations in the world • Allows you to install a central system for worldwide business processes, e.g. to gather and store aggregate customer data • Enables you to optimize your system landscape and reduce your costs 21
  • 22. Unicode SAP system (cont…) • Unicode Program: A Unicode program is an ABAP program in which the Unicode checks are run effectively and in which certain statements involve different semantics from those that apply in non-Unicode program. • Unicode System: Single-code-page system in which characters are coded in Unicode character representation. • The Unicode check was tightened as of Release 6.10 22
  • 23. SAP Unicode Overhead • Main Memory: – Average increase +40...50% -> Reason: Application servers are based on UTF-16 • Network load: – ~0% -> Almost no change due to efficient compression. • Database size: Average increase – UTF-8: +10% (smaller systems (< 200GB) might grow more) – UTF-16: +20...60% 23
  • 24. SAP File Interface and Unicode • It is possible to exchange files between Unicode and non-Unicode systems, between different Unicode systems, and between different non-Unicode systems with different code pages • Instead of implicit programming with standard settings over which we have no control, programmers are required to program explicitly and to specify all important parameters (with stringent requirements to maintain good programming practice) • Examples of explicit programming: a file must be opened before each read/write; the access type and the type of data storage need to be specified; a file opened with read-only access remains that way throughout the program; a file opened as text can contain text only, etc.
  • 25. SAP Authorization for File Access • Operating system check: the system automatically checks the entries in the database table SPTH for access to individual files. Neither of the following checks (S_PATH / S_DATASET) can override this • Program-independent authorization check: the check against the authorization object S_PATH is independent of the ABAP program used and is not restricted to an individual file but applies to all files in the path/folder • User and program authorization check: the check against the authorization object S_DATASET is based on the program name, file name and activity (Delete, Read, Write, Read with Filter and Write with Filter)
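A program can test the S_DATASET authorization itself before it opens a file. The sketch below assumes the function module AUTHORITY_CHECK_DATASET and the activity constants of type pool SABC are available in the system; the report name is hypothetical and the path is the example path used on the slides.

```abap
REPORT zdemo_file_auth.

TYPE-POOLS sabc.   " activity constants such as sabc_act_read

" Typed like the function module parameters to avoid type conflicts
DATA lv_program TYPE authb-program.
DATA lv_file    TYPE authb-filename VALUE '/usr/tmp/test.dat'.

lv_program = sy-repid.

" Check S_DATASET for read access to the file from this program
CALL FUNCTION 'AUTHORITY_CHECK_DATASET'
  EXPORTING
    program          = lv_program
    activity         = sabc_act_read
    filename         = lv_file
  EXCEPTIONS
    no_authority     = 1
    activity_unknown = 2
    OTHERS           = 3.

IF sy-subrc = 0.
  WRITE / 'Read access is authorized.'.
ELSE.
  WRITE / 'No authorization to read this file.'.
ENDIF.
```

Checking first lets the program show a message of its own instead of running into an authorization exception at OPEN DATASET.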
  • 26. File Interface Statements • OPEN DATASET • TRANSFER • READ DATASET • GET DATASET • SET DATASET • TRUNCATE DATASET • CLOSE DATASET • DELETE DATASET 26
  • 27. Opening a File • OPEN DATASET dset FOR access IN mode [position] [os_addition] [error_handling]. – dset is the file name including the path (/usr/tmp/test.dat) – access can be • INPUT (opens the file for reading only, with the file pointer set at the start of the file; if the file does not exist, sy-subrc is set to 8. In a Unicode program it is not possible to write to a file opened for reading, whereas a non-Unicode program allows both) • OUTPUT (opens a new file for writing; if the file already exists, its content is deleted. Read access is permitted) • APPENDING (opens the file for appending, with the file pointer set at the end of the file; if the file does not exist, it is created. A read attempt fails and sy-subrc is set to 4) • UPDATE (opens the file for updating, with the file pointer set at the start of the file; if the file does not exist, sy-subrc is set to 8)
  • 28. OPEN DATASET (continued) – Syntax of mode • BINARY MODE (opens the file as a binary file; the binary content of a data object is transferred unchanged) • TEXT MODE ENCODING code (opens the file as a text file. When writing, the content of a data object is converted to the representation specified by code [UTF-8 or NON-UNICODE] and transferred to the file. Trailing blanks are truncated for character-type fields, but not for strings. When reading, the content of the file is read up to the next end-of-line marker, converted from the format specified by code into the current character format [UTF-8 or the non-Unicode code page specified in database table TCP0C], and transferred to a data object) • LEGACY BINARY MODE [endian] [codepage] • LEGACY TEXT MODE [endian] [codepage]
  • 29. OPEN DATASET (continued) • AT POSITION pos: when a file is opened with this addition, pos defines the position of the file pointer in bytes (0 means the start of the file, -1 means the end of the file, and any other value i means i bytes from the start of the file) • TYPE attr: on non-Microsoft operating systems, attr can contain OS-specific parameters for the file to be opened (for example 'blksize=8000' on OS/400). On Microsoft operating systems, if attr contains 'NT' the end of line is marked by CRLF, and if it contains 'UNIX' the end of line is marked by LF. • FILTER opcom: with the FILTER addition, opcom can be an OS command that is started when OPEN DATASET is executed, for example FILTER 'compress' or FILTER 'uncompress': OPEN DATASET filexyz FOR OUTPUT IN BINARY MODE FILTER 'compress'. OPEN DATASET filexyz FOR INPUT IN BINARY MODE FILTER 'uncompress'.
  • 30. Error Handling • [MESSAGE msg] When an error occurs, the operating system's error message is assigned to the data object msg, so that the ABAP program can display it to the user • [IGNORING CONVERSION ERRORS] This addition suppresses the treatable exception defined by class CX_SY_CONVERSION_CODEPAGE; each unconvertible character is replaced by the literal '#' • [REPLACEMENT CHARACTER rc] Same as above, except that each unconvertible character is replaced by the single character specified in rc. Not applicable to binary files
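The three error-handling additions can be combined in one OPEN DATASET statement. A minimal sketch, assuming a hypothetical report name and file path, with '?' chosen arbitrarily as the replacement character:

```abap
REPORT zdemo_open_errors.

DATA: lv_file TYPE string VALUE `/usr/tmp/test.dat`, " hypothetical path
      lv_msg  TYPE string,
      lv_line TYPE string.

" Capture the OS error text and replace unconvertible characters with '?'.
OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING UTF-8
     MESSAGE lv_msg
     IGNORING CONVERSION ERRORS
     REPLACEMENT CHARACTER '?'.

IF sy-subrc <> 0.
  " lv_msg holds the message supplied by the operating system.
  WRITE: / 'Open failed:', lv_msg.
ELSE.
  READ DATASET lv_file INTO lv_line.
  WRITE: / lv_line.
  CLOSE DATASET lv_file.
ENDIF.
```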
  • 31. TRANSFER and READ Commands • TRANSFER dobj TO dset [LENGTH len] [NO END OF LINE] The content of dobj is written to the file starting at the current file pointer. LENGTH determines how many characters or bytes are written to the file; NO END OF LINE prevents an end-of-line marker from being appended to the transferred data • READ DATASET dset INTO dobj [MAXIMUM LENGTH mlen] [[ACTUAL] LENGTH alen] This reads data from the file specified in dset into the data object dobj, starting at the current file pointer. The MAXIMUM LENGTH addition limits the number of characters or bytes read from the file. The ACTUAL LENGTH addition returns the number of characters or bytes actually read (mlen can be 100, but if the file holds only 60, alen is returned as 60)
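A write followed by a read loop shows how the two statements fit together. This is a minimal sketch, assuming a hypothetical report name and file path; the maximum length of 100 is arbitrary.

```abap
REPORT zdemo_transfer_read.

DATA: lv_file TYPE string VALUE `/usr/tmp/test.dat`, " hypothetical path
      lv_line TYPE string,
      lv_len  TYPE i.

" Write two lines; TRANSFER appends an end-of-line marker in text mode.
OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE ENCODING UTF-8.
IF sy-subrc = 0.
  lv_line = `first line`.
  TRANSFER lv_line TO lv_file.
  lv_line = `second line`.
  TRANSFER lv_line TO lv_file.
  CLOSE DATASET lv_file.
ENDIF.

" Read the file back line by line until the end of the file is reached.
OPEN DATASET lv_file FOR INPUT IN TEXT MODE ENCODING UTF-8.
IF sy-subrc = 0.
  DO.
    READ DATASET lv_file INTO lv_line
         MAXIMUM LENGTH 100
         ACTUAL LENGTH lv_len.
    IF sy-subrc <> 0.
      EXIT. " end of file
    ENDIF.
    WRITE: / lv_len, lv_line.
  ENDDO.
  CLOSE DATASET lv_file.
ENDIF.
```

The loop leaves on the first non-zero sy-subrc, which is enough here because every line was written with an end-of-line marker; a file whose last line lacks one would need that final read handled as well.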
  • 32. GET and SET Commands • GET DATASET dset [POSITION pos] [ATTRIBUTES attr] POSITION returns the current position of the file pointer in pos. ATTRIBUTES reads the values of the fixed and changeable file attributes • SET DATASET dset [POSITION pos|{END OF FILE}] [ATTRIBUTES attr] POSITION moves the file pointer to the new position given in pos. ATTRIBUTES updates the values of the changeable file attributes
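The POSITION additions can be sketched as follows, assuming a hypothetical report name and an existing binary file at a hypothetical path:

```abap
REPORT zdemo_get_set_dataset.

DATA: lv_file TYPE string VALUE `/usr/tmp/test.dat`, " hypothetical path
      lv_pos  TYPE i,
      lv_data TYPE x LENGTH 4.

OPEN DATASET lv_file FOR INPUT IN BINARY MODE.
IF sy-subrc = 0.
  " Reading moves the file pointer forward.
  READ DATASET lv_file INTO lv_data.

  " Query where the file pointer is now.
  GET DATASET lv_file POSITION lv_pos.
  WRITE: / 'Position after read:', lv_pos.

  " Rewind to the start, then jump to the end of the file.
  SET DATASET lv_file POSITION 0.
  SET DATASET lv_file POSITION END OF FILE.
  GET DATASET lv_file POSITION lv_pos.
  WRITE: / 'Position at end of file:', lv_pos.

  CLOSE DATASET lv_file.
ENDIF.
```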
  • 33. ATTRIBUTES • Fixed Attributes – Indicator (sub-structure with one field per attribute below; 'X' indicates that the attribute is significant) – Mode (Text (T), Binary (B), Legacy Binary (LB), and Legacy Text (LT)) – Access_type (reading (I), writing (O), appending (A), and updating (U)) – Encoding (UTF-8 and NON-UNICODE) – Filter (filter command, for example 'compress') • Changeable Attributes – Indicator (sub-structure with one field per attribute below; 'X' indicates that the attribute is significant) – Repl_char (replacement character rc) – Conv_error (contains 'I' if the IGNORING CONVERSION ERRORS addition was used, 'R' otherwise) – Code_page (code page that was specified, initial otherwise) – Endian (B for Big Endian, L for Little Endian, initial otherwise) Example (dset_attributes is defined by SAP in type group DSET): DATA attr TYPE dset_attributes. GET DATASET dset ATTRIBUTES attr. IF attr-fixed-indicator-filter <> 'X'. … ENDIF.
  • 34. Other commands • TRUNCATE DATASET dset AT {CURRENT POSITION}|{POSITION pos} Changes the file size by setting the end of the file at the current position or at pos. When the file is shortened, everything after the new end of file is cut off; when it is extended (pos > current file size), the file is filled with hexadecimal zeros from the old to the new end of file. • CLOSE DATASET dset Closes the file on the application server. • DELETE DATASET dset Deletes the file on the application server.
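The three statements are often used together when cleaning up a work file. A minimal sketch, assuming a hypothetical report name, file path, and an arbitrary truncation position of 10 bytes:

```abap
REPORT zdemo_truncate_delete.

DATA lv_file TYPE string VALUE `/usr/tmp/test.dat`. " hypothetical path

" The file has to be open for writing before it can be truncated.
OPEN DATASET lv_file FOR UPDATE IN BINARY MODE.
IF sy-subrc = 0.
  " Keep only the first 10 bytes (or pad with zeros if the file is shorter).
  TRUNCATE DATASET lv_file AT POSITION 10.
  CLOSE DATASET lv_file.
ENDIF.

" Remove the file from the application server.
DELETE DATASET lv_file.
IF sy-subrc <> 0.
  WRITE: / 'File could not be deleted:', lv_file.
ENDIF.
```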
  • 35. Files on the Presentation Server • The class CL_GUI_FRONTEND_SERVICES of the class library contains the methods needed to process files on the presentation server (client/PC). There are no ABAP statements for processing files there. – GUI_DOWNLOAD for writing files – GUI_UPLOAD for reading files – DIRECTORY_CREATE and DIRECTORY_DELETE for creating and deleting a directory – FILE_DELETE, FILE_COPY, FILE_EXIST, etc., for file operations • The methods above belong to the class, but the function modules GUI_DOWNLOAD and GUI_UPLOAD can also be used.
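Writing and reading a text file on the client PC might look like the sketch below. It assumes a SAP GUI session is running and uses a hypothetical report name and Windows path; only a few of the methods' parameters are supplied, and all exceptions are mapped to OTHERS for brevity.

```abap
REPORT zdemo_frontend_file.

DATA: lt_lines TYPE TABLE OF string,
      lv_line  TYPE string.

APPEND `first line`  TO lt_lines.
APPEND `second line` TO lt_lines.

" Write the table to a text file on the presentation server.
CALL METHOD cl_gui_frontend_services=>gui_download
  EXPORTING
    filename = `C:\temp\test.txt`   " hypothetical path
    filetype = 'ASC'
  CHANGING
    data_tab = lt_lines
  EXCEPTIONS
    OTHERS   = 1.
IF sy-subrc <> 0.
  WRITE: / 'Download failed'.
ENDIF.

" Read the same file back into the table.
CLEAR lt_lines.
CALL METHOD cl_gui_frontend_services=>gui_upload
  EXPORTING
    filename = `C:\temp\test.txt`
    filetype = 'ASC'
  CHANGING
    data_tab = lt_lines
  EXCEPTIONS
    OTHERS   = 1.
IF sy-subrc = 0.
  LOOP AT lt_lines INTO lv_line.
    WRITE: / lv_line.
  ENDLOOP.
ENDIF.
```

Because these methods go through the SAP GUI, they are not suitable for background jobs; the application-server statements remain the option there.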