Digitizing books
STEPS, HINTS, VIDEOS

What is digitization?

• Digitization is the process of converting information into a digital format. In this format, information is organized into discrete units of data that can be separately addressed. This is the binary data that computers and many devices with computing capacity can process.
• Source: http://whatis.techtarget.com/definition/0,,sid9_gci896692,00.html
• See also: http://en.wikipedia.org/wiki/Digitizing

Steps of digitization

1. Choose the book you want to digitize.
2. Choose OCR software (see the slide on choosing OCR software).
3. Scan your book. Choose the device: scanner, compact device, digital camera, or IRIScan (see the slide on how to scan the book).
4. Run optical character recognition (see the Optical Character Recognition image slide).
5. Correct the recognized text (see the two Correction image slides).
6. Save the result as a text-searchable PDF document.

See other versions:
http://www.inquisition.ca/en/info/artic/comment_numeriser.htm
http://dlg.galileo.usg.edu/guide.html#01

Text and images

• Text and images can be digitized similarly: a scanner captures an image (which may be an image of text) and converts it to an image file, such as a bitmap. An optical character recognition (OCR) program analyzes a text image for light and dark areas in order to identify each alphabetic letter or numeric digit, and converts each character into an ASCII code.
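
The light-and-dark analysis described above can be sketched in a few lines. This is only a toy illustration, not how any real OCR product works: the 3x3 glyph templates, the threshold of 128, and the function names `binarize` and `recognize` are all assumptions made up for this example.

```python
# Toy sketch only: real OCR engines are far more sophisticated.
# The 3x3 templates and the threshold value are made-up assumptions.

THRESHOLD = 128  # assumed cutoff: pixel values below this count as dark (ink)

# Hypothetical templates: 1 = dark cell, 0 = light cell.
TEMPLATES = {
    "I": ((0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)),
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
}


def binarize(gray):
    """Turn grayscale rows (0 = black, 255 = white) into dark/light flags."""
    return tuple(tuple(1 if px < THRESHOLD else 0 for px in row) for row in gray)


def recognize(gray):
    """Return the template character that differs from the scan in the fewest cells."""
    bits = binarize(gray)

    def distance(template):
        return sum(
            b != t
            for bit_row, tmpl_row in zip(bits, template)
            for b, t in zip(bit_row, tmpl_row)
        )

    return min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch]))


# A slightly noisy "scan" of the letter L (bottom-right cell came out too light).
scan = [
    [20, 240, 250],
    [35, 230, 245],
    [10, 60, 140],
]

char = recognize(scan)
print(char, ord(char))  # prints: L 76
```

The last line shows the final step from the slide: once a character is identified, it is stored as its ASCII code (76 for "L") rather than as a picture.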

Choose OCR software

• Many programs are available for digitizing your documents.
• Wikipedia has a comparison list of optical character recognition software. Check it out! http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
• (I recommend ABBYY FineReader.)
• If you don't want to buy (or download) software, here's a free online OCR service: http://www.newocr.com/

What is OCR?

• OCR (optical character recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.
• Source: http://searchciomidmarket.techtarget.com/definition/OCR
• Read more: http://en.wikipedia.org/wiki/Optical_character_recognition
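
To make "character codes" concrete, here is a small sketch of what recognized text looks like once it is stored as ASCII. The sample string and the helper names `to_codes` and `to_bits` are assumptions for illustration; `ord`, `chr`, `format`, and `str.encode` are standard Python built-ins.

```python
def to_codes(text):
    """Return the numeric character code of each character."""
    return [ord(ch) for ch in text]


def to_bits(text):
    """Return each character code as a 7-digit binary string."""
    return [format(code, "07b") for code in to_codes(text)]


recognized = "Book 1"  # hypothetical OCR output

print(to_codes(recognized))  # [66, 111, 111, 107, 32, 49]
print(to_bits(recognized))   # ['1000010', '1101111', '1101111', '1101011', '0100000', '0110001']

# The codes convert back to the same text.
print("".join(chr(code) for code in to_codes(recognized)))  # Book 1

# encode("ascii") fails with UnicodeEncodeError if the text contains a
# character outside ASCII, such as an accented letter.
print(recognized.encode("ascii"))  # b'Book 1'
```

Note that plain ASCII has no codes for accented letters, so OCR output in many languages is normally saved in a wider encoding instead.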

What is ASCII?

• ASCII (American Standard Code for Information Interchange) is the most common format for text files in computers and on the Internet. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit binary number (a string of seven 0s or 1s). 128 possible characters are defined.
• Source: http://searchciomidmarket.techtarget.com/definition/ASCII

How to scan the book

• With a scanner: http://www.wikihow.com/Scan-a-Book and http://www.proportionalreading.com/scan.html
• With a compact device: http://www.ehow.com/how_6950098_scan-bookpdf-format.html
• With a digital camera: http://www.wikihow.com/Scan-a-Book-With-a-Digital-Camera
• With IRIScan: http://www.youtube.com/watch?v=9bgcDHLe3Xg

[Image slide: Optical Character Recognition]
[Image slide: Correction image 1]
[Image slide: Correction image 2]

Videos

• How to digitize a book: http://www.youtube.com/watch?v=-M95Ob4kIak
• How to chop and scan a book: http://www.youtube.com/watch?v=8tx2JmW_p4c
• Scanning text using OCR software: http://www.youtube.com/watch?v=_SwrGtSY4-c
• How to OCR PDFs easily with Acrobat Batch OCR: http://www.youtube.com/watch?v=V6Iz3U5X-SU
• How to digitize a million books: http://www.youtube.com/watch?v=OlKhKyTS23E

How to put a scanned doc into PDF format

• http://www.ehow.com/how_8563246_put-scanned-document-pdf-format.html
• Some OCR programs can save the result directly in PDF format.
• Enjoy reading on your digital device!

Made by Mario Laskovics (2012.04.03)
