SlideShare una empresa de Scribd logo
1 de 13
Descargar para leer sin conexión
pdf.js
Julian Viereck
  @jviereck
Overview
• What is pdf.js
• How PDF is structured
• Processing in pdf.js
• Images & Fonts
• Problems
• Todo
• Demo
What is pdf.js

•   building faithful & efficient PDF renderer
•   HTML5 technology experiment
•   no native code
•   secure (web sandbox)
•   Mozilla Labs Project - Open Source
How PDF is structured
 Header      PDF version

             sequence of objets
   Body

 [Objects]   fonts, drawing cmds, images,
             words, bookmarks, form fields
xRef Table   mapping objID     byte offset
  Trailer    root objID, xRef byte offset
 PDF file     root obj = ref to pages catalog
Processing in pdf.js
• get plain Uint8Array via XHR2, build Stream
• new PDFDoc(stream): read xRef, root object
• page = PDFDoc.getPage(N)                    Internal

• page.startRendering(graphics)            Representation




 • read & convert all PDF cmds ➟ IR PartialEvaluator
 • load required objects (fonts, images)
 • graphics.executeIR(IR)                CanvasGraphics
1. page=PDFDoc.getPage(2)     5 0 obj
                                                stream maybe
   ➟ obj#3                    <<                   encoded!
                               /Length 8 0 R
2. page.startRendering(...)   >>
   ➟ obj#4, obj#5              stream
                                 /GS1 gs
                                 /F0 12 Tf
 3 0 obj                         BT
 <<                                100 700 Td
 /Type /Page                       (Hello World!) Tj
 /MediaBox 	

 0 612 792]
                [0               ET
 /Resources 	

 4 0 R            50 600 m
 /Parent 	

	

 2 0 R            400 600 l
 /Contents 	

5 0 R              S
 >>                            endstream
 endobj                       endobj
xRef, catalog,                                  IR
5 0 obj           + resources            PartialEvaluator         Form
<<
 /Length 8 0 R
>>                                  setGState: 	

   [ LW: 10 ]
 stream                             dependency:	

   [ font0 ]
   /GS1 gs                          setFont: 	

     font0, 12
   /F0 12 Tf                        beginText
   BT                               moveText: 	

    100, 700
     100 700 Td                     showText: 	

    “Hello World!”
     (Hello World!) Tj              endText
   ET                               moveTo: 	

      50, 600
   50 600 m                         lineTo: 	

      400, 600
   400 600 l                        stroke
   S
 endstream
endobj                                  CanvasGraphics
Images
• JPEG streams:
 • DOMImg.src = 'data:image/jpeg;base64,'
    + window.btoa(bytesToString(bytes));
• If not JPEG stream:
 • read bytes, convert to colorspace
 • imgData = canvas.getImageData()
 • fillWithPixelData(bytes, imgData)
 • canvas.putImageData(imgData)
Fonts
• There are lots of different font formats!
 • fonts are converted to OpenType
 • use CSS:
      @font-face { font-family:'font0'; src:url
    (data:font/opentype;base64, ...)
• some fonts can’t be converted :(
 • use drawing commands?
Problems
                                       platform =
                                     browser + OS

• No way to detect font is loaded (hacks)
• Font width (wrong on some platforms)
• Subpixel font size depending on platform
• Text selection
• Printing
• Speed
 • use workers (postMessage lose shape)
 • partial rendering
Todo
• more font work, printing, speed
• support more rendering spec
• explore using SVG
• PDF forms, “advanced PDF features”
• infrastructure: automated testing, requireJS
• test more PDF (need your help!)
Demo
Contact
Github:
 https://github.com/andreasgal/pdf.js
Mailing list:
 https://groups.google.com/group/
mozilla.dev.pdf-js/topics
IRC:
 irc.mozilla.org #pdfjs

Más contenido relacionado

La actualidad más candente

MongoDB at MercadoLibre
MongoDB at MercadoLibreMongoDB at MercadoLibre
MongoDB at MercadoLibrePablo Molnar
 
CouchDB Open Source Bridge
CouchDB Open Source BridgeCouchDB Open Source Bridge
CouchDB Open Source BridgeChris Anderson
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellScott Hernandez
 
Shankar's mongo db presentation
Shankar's mongo db presentationShankar's mongo db presentation
Shankar's mongo db presentationShankar Kamble
 
JSONSchema with golang
JSONSchema with golangJSONSchema with golang
JSONSchema with golangSuraj Deshmukh
 
Apache AVRO (Boston HUG, Jan 19, 2010)
Apache AVRO (Boston HUG, Jan 19, 2010)Apache AVRO (Boston HUG, Jan 19, 2010)
Apache AVRO (Boston HUG, Jan 19, 2010)Cloudera, Inc.
 
Dirty - How simple is your database?
Dirty - How simple is your database?Dirty - How simple is your database?
Dirty - How simple is your database?Felix Geisendörfer
 
PhpstudyTokyo MongoDB PHP CakePHP
PhpstudyTokyo MongoDB PHP CakePHPPhpstudyTokyo MongoDB PHP CakePHP
PhpstudyTokyo MongoDB PHP CakePHPichikaway
 
Shell Tips & Tricks
Shell Tips & TricksShell Tips & Tricks
Shell Tips & TricksMongoDB
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMetatagg Solutions
 
Getting Started with MongoDB
Getting Started with MongoDBGetting Started with MongoDB
Getting Started with MongoDBMichael Redlich
 
javaScript.ppt
javaScript.pptjavaScript.ppt
javaScript.pptsentayehu
 
Apache CouchDB Presentation @ Sept. 2104 GTALUG Meeting
Apache CouchDB Presentation @ Sept. 2104 GTALUG MeetingApache CouchDB Presentation @ Sept. 2104 GTALUG Meeting
Apache CouchDB Presentation @ Sept. 2104 GTALUG MeetingMyles Braithwaite
 
NoSQL - An introduction to CouchDB
NoSQL - An introduction to CouchDBNoSQL - An introduction to CouchDB
NoSQL - An introduction to CouchDBJonathan Weiss
 
MongoDb In Action
MongoDb In ActionMongoDb In Action
MongoDb In Actionfuchaoqun
 

La actualidad más candente (20)

MongoDB at MercadoLibre
MongoDB at MercadoLibreMongoDB at MercadoLibre
MongoDB at MercadoLibre
 
CouchDB Open Source Bridge
CouchDB Open Source BridgeCouchDB Open Source Bridge
CouchDB Open Source Bridge
 
Mastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript ShellMastering the MongoDB Javascript Shell
Mastering the MongoDB Javascript Shell
 
MongoDB & PHP
MongoDB & PHPMongoDB & PHP
MongoDB & PHP
 
CouchDB in The Room
CouchDB in The RoomCouchDB in The Room
CouchDB in The Room
 
Shankar's mongo db presentation
Shankar's mongo db presentationShankar's mongo db presentation
Shankar's mongo db presentation
 
faastCrystal
faastCrystalfaastCrystal
faastCrystal
 
JSONSchema with golang
JSONSchema with golangJSONSchema with golang
JSONSchema with golang
 
Apache AVRO (Boston HUG, Jan 19, 2010)
Apache AVRO (Boston HUG, Jan 19, 2010)Apache AVRO (Boston HUG, Jan 19, 2010)
Apache AVRO (Boston HUG, Jan 19, 2010)
 
Dirty - How simple is your database?
Dirty - How simple is your database?Dirty - How simple is your database?
Dirty - How simple is your database?
 
PhpstudyTokyo MongoDB PHP CakePHP
PhpstudyTokyo MongoDB PHP CakePHPPhpstudyTokyo MongoDB PHP CakePHP
PhpstudyTokyo MongoDB PHP CakePHP
 
Shell Tips & Tricks
Shell Tips & TricksShell Tips & Tricks
Shell Tips & Tricks
 
Trimming The Cruft
Trimming The CruftTrimming The Cruft
Trimming The Cruft
 
Mongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg SolutionsMongo Presentation by Metatagg Solutions
Mongo Presentation by Metatagg Solutions
 
Getting Started with MongoDB
Getting Started with MongoDBGetting Started with MongoDB
Getting Started with MongoDB
 
javaScript.ppt
javaScript.pptjavaScript.ppt
javaScript.ppt
 
Apache CouchDB Presentation @ Sept. 2104 GTALUG Meeting
Apache CouchDB Presentation @ Sept. 2104 GTALUG MeetingApache CouchDB Presentation @ Sept. 2104 GTALUG Meeting
Apache CouchDB Presentation @ Sept. 2104 GTALUG Meeting
 
NoSQL - An introduction to CouchDB
NoSQL - An introduction to CouchDBNoSQL - An introduction to CouchDB
NoSQL - An introduction to CouchDB
 
Mongo DB 102
Mongo DB 102Mongo DB 102
Mongo DB 102
 
MongoDb In Action
MongoDb In ActionMongoDb In Action
MongoDb In Action
 

Similar a 2011 09-pdfjs

JRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldJRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldSATOSHI TAGOMORI
 
Tuning web performance
Tuning web performanceTuning web performance
Tuning web performanceGeorge Ang
 
Tuning Web Performance
Tuning Web PerformanceTuning Web Performance
Tuning Web PerformanceEric ShangKuan
 
TorqueBox: The beauty of Ruby with the power of JBoss. Presented at Devnexus...
TorqueBox: The beauty of Ruby with the power of JBoss.  Presented at Devnexus...TorqueBox: The beauty of Ruby with the power of JBoss.  Presented at Devnexus...
TorqueBox: The beauty of Ruby with the power of JBoss. Presented at Devnexus...bobmcwhirter
 
解密解密
解密解密解密解密
解密解密Tom Chen
 
How dojo works
How dojo worksHow dojo works
How dojo worksAmit Tyagi
 
Playing with d3.js
Playing with d3.jsPlaying with d3.js
Playing with d3.jsmangoice
 
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRick Copeland
 
node.js: Javascript's in your backend
node.js: Javascript's in your backendnode.js: Javascript's in your backend
node.js: Javascript's in your backendDavid Padbury
 
Nodejs - Should Ruby Developers Care?
Nodejs - Should Ruby Developers Care?Nodejs - Should Ruby Developers Care?
Nodejs - Should Ruby Developers Care?Felix Geisendörfer
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeRick Copeland
 
Intro to mobile web application development
Intro to mobile web application developmentIntro to mobile web application development
Intro to mobile web application developmentzonathen
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifyNeville Li
 
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5Sadaaki HIRAI
 
Cape Cod Web Technology Meetup - 2
Cape Cod Web Technology Meetup - 2Cape Cod Web Technology Meetup - 2
Cape Cod Web Technology Meetup - 2Asher Martin
 
Dojo: Getting Started Today
Dojo: Getting Started TodayDojo: Getting Started Today
Dojo: Getting Started TodayGabriel Hamilton
 

Similar a 2011 09-pdfjs (20)

JRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldJRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing World
 
Tuning web performance
Tuning web performanceTuning web performance
Tuning web performance
 
Tuning Web Performance
Tuning Web PerformanceTuning Web Performance
Tuning Web Performance
 
Pdf secrets v2
Pdf secrets v2Pdf secrets v2
Pdf secrets v2
 
Hadoop - Introduction to Hadoop
Hadoop - Introduction to HadoopHadoop - Introduction to Hadoop
Hadoop - Introduction to Hadoop
 
TorqueBox: The beauty of Ruby with the power of JBoss. Presented at Devnexus...
TorqueBox: The beauty of Ruby with the power of JBoss.  Presented at Devnexus...TorqueBox: The beauty of Ruby with the power of JBoss.  Presented at Devnexus...
TorqueBox: The beauty of Ruby with the power of JBoss. Presented at Devnexus...
 
解密解密
解密解密解密解密
解密解密
 
How dojo works
How dojo worksHow dojo works
How dojo works
 
Playing with d3.js
Playing with d3.jsPlaying with d3.js
Playing with d3.js
 
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and MingRapid and Scalable Development with MongoDB, PyMongo, and Ming
Rapid and Scalable Development with MongoDB, PyMongo, and Ming
 
Wider than rails
Wider than railsWider than rails
Wider than rails
 
node.js: Javascript's in your backend
node.js: Javascript's in your backendnode.js: Javascript's in your backend
node.js: Javascript's in your backend
 
Nodejs - Should Ruby Developers Care?
Nodejs - Should Ruby Developers Care?Nodejs - Should Ruby Developers Care?
Nodejs - Should Ruby Developers Care?
 
Allura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForgeAllura - an Open Source MongoDB Based Document Oriented SourceForge
Allura - an Open Source MongoDB Based Document Oriented SourceForge
 
Modern C++
Modern C++Modern C++
Modern C++
 
Intro to mobile web application development
Intro to mobile web application developmentIntro to mobile web application development
Intro to mobile web application development
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
 
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5
Familiar HTML5 - 事例とサンプルコードから学ぶ 身近で普通に使わているHTML5
 
Cape Cod Web Technology Meetup - 2
Cape Cod Web Technology Meetup - 2Cape Cod Web Technology Meetup - 2
Cape Cod Web Technology Meetup - 2
 
Dojo: Getting Started Today
Dojo: Getting Started TodayDojo: Getting Started Today
Dojo: Getting Started Today
 

Último

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 

Último (20)

Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 

2011 09-pdfjs

  • 2. Overview • What is pdf.js • How PDF is structured • Processing in pdf.js • Images & Fonts • Problems • Todo • Demo
  • 3. What is pdf.js • building faithful & efficient PDF renderer • HTML5 technology experiment • no native code • secure (web sandbox) • Mozilla Labs Project - Open Source
  • 4. How PDF is structured Header PDF version sequence of objets Body [Objects] fonts, drawing cmds, images, words, bookmarks, form fields xRef Table mapping objID byte offset Trailer root objID, xRef byte offset PDF file root obj = ref to pages catalog
  • 5. Processing in pdf.js • get plain Uint8Array via XHR2, build Stream • new PDFDoc(stream): read xRef, root object • page = PDFDoc.getPage(N) Internal • page.startRendering(graphics) Representation • read & convert all PDF cmds ➟ IR PartialEvaluator • load required objects (fonts, images) • graphics.executeIR(IR) CanvasGraphics
  • 6. 1. page=PDFDoc.getPage(2) 5 0 obj stream maybe ➟ obj#3 << encoded! /Length 8 0 R 2. page.startRendering(...) >> ➟ obj#4, obj#5 stream /GS1 gs /F0 12 Tf 3 0 obj BT << 100 700 Td /Type /Page (Hello World!) Tj /MediaBox 0 612 792] [0 ET /Resources 4 0 R 50 600 m /Parent 2 0 R 400 600 l /Contents 5 0 R S >> endstream endobj endobj
  • 7. xRef, catalog, IR 5 0 obj + resources PartialEvaluator Form << /Length 8 0 R >> setGState: [ LW: 10 ] stream dependency: [ font0 ] /GS1 gs setFont: font0, 12 /F0 12 Tf beginText BT moveText: 100, 700 100 700 Td showText: “Hello World!” (Hello World!) Tj endText ET moveTo: 50, 600 50 600 m lineTo: 400, 600 400 600 l stroke S endstream endobj CanvasGraphics
  • 8. Images • JPEG streams: • DOMImg.src = 'data:image/jpeg;base64,' + window.btoa(bytesToString(bytes)); • If not JPEG stream: • read bytes, convert to colorspace • imgData = canvas.getImageData() • fillWithPixelData(bytes, imgData) • canvas.putImageData(imgData)
  • 9. Fonts • There are lots of different font formats! • fonts are converted to OpenType • use CSS: @font-face { font-family:'font0'; src:url (data:font/opentype;base64, ...) • some fonts can’t be converted :( • use drawing commands?
  • 10. Problems platform = browser + OS • No way to detect font is loaded (hacks) • Font width (wrong on some platforms) • Subpixel font size depending on platform • Text selection • Printing • Speed • use workers (postMessage lose shape) • partial rendering
  • 11. Todo • more font work, printing, speed • support more rendering spec • explore using SVG • PDF forms, “advanced PDF features” • infrastructure: automated testing, requireJS • test more PDF (need your help!)
  • 12. Demo
  • 13. Contact Github: https://github.com/andreasgal/pdf.js Mailing list: https://groups.google.com/group/ mozilla.dev.pdf-js/topics IRC: irc.mozilla.org #pdfjs