By Gianluca Tarasconi – Kites Univ. Bocconi / O.S.T.
About the speaker
• Background in Management Engineering @
  Politecnico of Milan
• Database Architect @ KITeS (previously CESPRI)
  since 2002
• Project manager for data production in the EU projects
  STI-NET, TENIA, AEGIS and the EU tenders ICT
  network impact, INNOVA, Highly Cited Patents, and
  Measurement and analysis of knowledge and R&D
  exploitation flows, assessed by patent and licensing
  data
• Collaborations on database projects with: MIT, LSE,
  Danish Board of Technology, Bonn Graduate School
  of Economics, Universität Mainz, BETA …
• Editor of the blog rawpatentdata.blogspot.com
What is PATSTAT
• PATSTAT is a snapshot of the EPO database covering
  about 70 million applications from
  more than 80 application authorities,
  containing bibliographic data, citations
  and family links. The data must
  be loaded into the customer's own
  database.

+ Low cost of ownership
- Implementation costs
Data Sources for PATSTAT
• The source for EP data is DOCDB (the EPO
  master documentation database)
• The sources for other offices are files
  provided by the other patent authorities

+ Good coverage for US, EU states, JP,
  EPO, WIPO
- For other authorities, gaps and missing data
  are not easy to identify
Implementing the DB (I)
• Over 20 tables in
  a relational DB,
  with the application ID
  as the main primary
  key
• EPO adds to /
  improves the data
  with each edition
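
As a rough illustration of this layout, the sketch below builds a tiny in-memory database with three tables linked through the application ID. The table and column names follow PATSTAT naming (tls201_appln, tls206_person, tls207_pers_appln, appln_id, person_id), but the column set is heavily simplified and all rows are invented, so it is a toy under stated assumptions rather than the real schema.

```python
import sqlite3

def build_toy_patstat():
    """Create a simplified, in-memory stand-in for three PATSTAT tables."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tls201_appln (
            appln_id INTEGER PRIMARY KEY,   -- application ID, the main key
            appln_auth TEXT,
            appln_filing_year INTEGER
        );
        CREATE TABLE tls206_person (
            person_id INTEGER PRIMARY KEY,
            person_name TEXT
        );
        CREATE TABLE tls207_pers_appln (    -- link table: who is on which application
            person_id INTEGER,
            appln_id INTEGER,
            applt_seq_nr INTEGER,           -- > 0 means the person is an applicant
            PRIMARY KEY (person_id, appln_id)
        );
    """)
    # Invented rows, for illustration only.
    con.executemany("INSERT INTO tls201_appln VALUES (?, ?, ?)",
                    [(1, "EP", 2001), (2, "US", 2003)])
    con.executemany("INSERT INTO tls206_person VALUES (?, ?)",
                    [(10, "ACME SPA"), (11, "ROSSI MARIO")])
    con.executemany("INSERT INTO tls207_pers_appln VALUES (?, ?, ?)",
                    [(10, 1, 1), (11, 1, 0), (10, 2, 1)])
    return con

def applicants_of(con, appln_id):
    """Return applicant names for one application, joined through appln_id."""
    rows = con.execute(
        """SELECT p.person_name
           FROM tls207_pers_appln pa
           JOIN tls206_person p ON p.person_id = pa.person_id
           WHERE pa.appln_id = ? AND pa.applt_seq_nr > 0
           ORDER BY pa.applt_seq_nr""",
        (appln_id,),
    ).fetchall()
    return [r[0] for r in rows]

if __name__ == "__main__":
    print(applicants_of(build_toy_patstat(), 1))
```

Every other table hangs off the application table in the same way, which is why most queries start from appln_id.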
Implementing the DB (II)
+ Standard scripts and a growing community
  to exchange procedures etc. (example)
- Needs a person who has both DB and
  patent data knowledge
Plug & play extensions
Datasets that can be added with no effort:
• Regpat: OECD dataset giving the NUTS3 region for each
  applicant / inventor (EP only)
• Han: OECD Harmonized Applicant Names
  dataset (EP only)
• eee_ppat: KUL/Eurostat standard names and
  sector allocation (all of PATSTAT)
• Tls221: EPO legal data table, which makes it possible to include
  changes of ownership, oppositions... (example)
• ape-inv: inventor disambiguation tools and
  academic inventors
Note: all tables except TLS221 are free of charge
Some papers using the Kites-Patstat DB
Lissoni, F., Llerena, P., McKelvey, M., and B. Sanditov "Academic Patenting in Europe: New Evidence from the KEINS Database," Research
       Evaluation, 17(2): 87-102.

Bacchiocchi E., Montobbio F. (2009); Knowledge Diffusion from University and Public Research. A Comparison between US Japan and
      Europe using Patent Citations. Journal of Technology Transfer, vol.34 (2), pp.169-181.

Breschi S., Lissoni F., Montobbio F. (2008). University patenting and scientific productivity. A quantitative study of Italian academic inventors.
      European Management Review. The Journal of the European Academy of Management 5(2): 91-109

Corrocher N., Malerba F., Montobbio F. (2007); Schumpeterian Patterns of Innovative Activity in the ICT Field. Research Policy. vol. 36, pp.
      418-432

Breschi S., Lissoni F., Montobbio F. (2007). The Scientific Productivity Of Academic Inventors: New Evidence From Italian Data. Economics
      of Innovation and New Technology, Vol. 16, Issue 2, pp. 101-118

Della Malva A., Breschi S., Lissoni F., Montobbio F. (2007). L'attivita' brevettuale dei docenti universitari: L'Italia in un confronto internazionale.
       Economia e Politica Industriale, v. 2, pp. 43-70.

Montobbio F. (2008); Patenting Activity in Latin American and Caribbean Countries. In World Intellectual Property Organization (WIPO) -
     Economic Commission for Latin America and the Caribbean (ECLAC) - "Study on Intellectual Property Management in Open
     Economies: A Strategic Vision for Latin America". Forthcoming

Frazzoni S., Mancusi M., Rotondi Z., Sobrero M., Vezzulli A., (2011), “Relationship with banks and access to credit for innovation and
      internationalization in SMEs”, L’EUROPA E OLTRE. Banche e imprese nella nuova globalizzazione, XVI Rapporto sul sistema
      finanziario italiano, Edibank, 2011. ISBN 978-88-449-0495-1.

V. Sterzi: Patent quality and ownership: An analysis of UK faculty patenting, Research Policy, 2012 (forthcoming)
Some advanced applications
• OST patent applicants data quality
  procedure and match with ORBIS
• OST common identifier among PATSTAT,
  WoS and Framework Programme DBs
Applicants data quality procedure and match with ORBIS (I)
• The goal of the procedure is to clean and
  standardize (C&S) patent applicant names (e.g.
  removing the type of company, common
  misspellings etc.)
• After the names C&S step, a procedure was
  developed that applies 5 different
  match algorithms to find
  the best matches with ORBIS company
  names.
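
The two steps can be sketched as follows. This is a minimal illustration, not the KITeS procedure: the legal-form list is a short hypothetical sample, and the single similarity measure used here (difflib's SequenceMatcher from the Python standard library) stands in for the 5 match algorithms, which the slides do not specify.

```python
import re
from difflib import SequenceMatcher

# Hypothetical sample of legal-form tokens; a real list would be much longer.
LEGAL_FORMS = {"SPA", "SRL", "GMBH", "AG", "LTD", "INC", "CORP", "SA", "LLC", "CO"}

def clean_name(raw):
    """Uppercase, drop dots and punctuation, and remove legal-form tokens."""
    text = raw.upper().replace(".", "")       # "S.p.A." -> "SPA"
    text = re.sub(r"[^A-Z0-9 ]", " ", text)   # other punctuation becomes spaces
    tokens = [t for t in text.split() if t not in LEGAL_FORMS]
    return " ".join(tokens)

def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar cleaned name.

    The candidate is None when no score reaches the threshold.
    """
    cleaned = clean_name(name)
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, cleaned, clean_name(cand)).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score

if __name__ == "__main__":
    orbis = ["SIEMENS AG", "PHILIPS NV"]      # invented company list
    print(clean_name("Fiat S.p.A."))
    print(best_match("Siemens A.G.", orbis))
```

Cleaning both sides before comparing is what makes the match robust: "Siemens A.G." and "SIEMENS AG" become identical strings before any similarity score is computed.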
Applicants data quality procedure and match with ORBIS (II)
• Data quality procedure developed using
  portable queries and tables (see Tarasconi,
  "Sharing names/address cleaning patterns for Patstat",
  PATSTAT users day 2011)

• Match procedure developed to
  be multipurpose (e.g. it has already been used to match
  TM vs. patent applicants @ KITeS)

• Code and tables available for MySQL and
  Oracle:
  http://documents.epo.org/projects/babylon/eponet.nsf/0/92ab5eb34ff406d1c125795d0050bbcc/$FILE/PATSTAT_user_day_2011_presentations.zip
Applicants data quality procedure and match with ORBIS (III)
• C&S step results: from 12,280,000 patent
  applicants to about 3,800,000 companies
• Matched against: 353,294 ORBIS companies
  in NACE 2540, 2630, 2651, 2910, 3030,
  3011, 8422 (defense)
• Results: 94,726 patent applicants matched to
  66,256 ORBIS companies
• Benchmark: validation against a 1% sample
  returned a precision of 91%
  and a recall of 95%
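
For reference, precision and recall on a validation sample can be computed as in the sketch below. The applicant/company pairs are invented for illustration and are not the validation sample behind the figures above.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for a set of predicted matches.

    predicted: pairs returned by the match procedure
    actual:    pairs confirmed as correct by manual validation
    """
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall

if __name__ == "__main__":
    # Invented (applicant_id, company_id) pairs.
    predicted = [(1, "A"), (2, "B"), (3, "C"), (4, "D")]
    actual = [(1, "A"), (2, "B"), (3, "C"), (5, "E")]
    print(precision_recall(predicted, actual))
```

Precision penalizes wrong matches (the pair (4, "D") here), while recall penalizes missed ones (the pair (5, "E")).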
OST Common identifier (I)
Data categories existing across patent,
scientific publication and Framework
Programme data:

                       PATSTAT                          FPS                      WOS
Geographic data        inventors/applicants addresses   participants addresses   affiliations addresses
Individuals            inventors, applicants            contacts                 authors
Companies              applicants                       participants             affiliations
Sci/tech taxonomies    IPC                              TPs                      subject categories
OST Common identifier (II)
1) DEFINE ATOMIC ENTITIES AND NON-AMBIGUOUS JOINS
• Even when they cover similar entities, datasets
  differ in the
  granularity of their data
  (e.g. in WOS affiliations may be by lab / dept, while
  patents may be by IP office: different size).
• The bridge dataset should use an entity size
  that allows a unique data match across the different
  sets. This might also require some changes in
  the existing databases.
• The bridge dataset should also support a
  hierarchical structure of entities, allowing
  joins to the main datasets at different levels.
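
One possible reading of the hierarchical bridge is a parent-pointer table, sketched below. The entity IDs and the lab entry are hypothetical, and the slides do not prescribe this representation.

```python
# entity_id -> (name, parent_id); parent_id None marks a top-level entity.
ENTITIES = {
    1: ("CNRS", None),
    2: ("CNRS Bordeaux", 1),
    3: ("CNRS Bordeaux Lab A", 2),   # hypothetical lab-level entity
}

def ancestors(entity_id, entities=ENTITIES):
    """Return the chain from the entity up to its top-level parent."""
    chain = []
    current = entity_id
    while current is not None:
        chain.append(current)
        current = entities[current][1]
    return chain

def roll_up(entity_id, level, entities=ENTITIES):
    """Map an entity to its ancestor at `level` (0 = top of the hierarchy).

    If the entity sits above the requested level, it is returned unchanged.
    """
    chain = ancestors(entity_id, entities)[::-1]   # top-down order
    return chain[min(level, len(chain) - 1)]

if __name__ == "__main__":
    # A lab-level WOS affiliation joined to an office-level patent applicant.
    print(roll_up(3, 0), roll_up(3, 1))
```

Rolling both sides of a join up to the same level is what lets a lab-level affiliation and an organisation-level applicant meet on one key.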
OST Common identifier (III)
   Example
OST Common identifier (IV)
2) TIME SERIES
2a) DATASET ASYNCHRONIES
• Data may enter the database on a different time frame
  depending on the dataset
  (e.g. PATSTAT is a full update, so a snapshot at the moment of
  data creation, while WOS is an incremental update; name
  changes and M&A could therefore make the same entity differ between the 2 datasets.
  Note that geographic entities also change over time: counties,
  countries…).
• Bridge tables must have a time-related dimension.
2b) DATA TRANSFORMATIONS
• Data change over time
  (e.g. companies may merge, split [most critical case], change
  name, change owner…).
• Bridge tables must have a continuation dimension
  that makes it possible to follow the transformations of entities.
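
A continuation dimension could be stored as predecessor/successor links, as in this sketch. The entities A to E, the years, and the link layout are all illustrative assumptions; the code also assumes the links contain no cycles.

```python
# (predecessor, successor, year, reason); invented rows for illustration.
CONTINUATIONS = [
    ("A", "C", 1990, "MERGE"),
    ("B", "C", 1990, "MERGE"),
    ("C", "D", 2005, "SPLIT"),
    ("C", "E", 2005, "SPLIT"),
]

def successors_at(entity, year, links=CONTINUATIONS):
    """Follow continuation links dated up to `year`.

    Returns a set, because a split maps one entity to several successors.
    """
    current = {entity}
    changed = True
    while changed:
        changed = False
        following = set()
        for ent in current:
            outs = {s for p, s, y, _ in links if p == ent and y <= year}
            if outs:
                following |= outs      # entity was transformed: move on
                changed = True
            else:
                following.add(ent)     # entity still exists as-is
        current = following
    return current

if __name__ == "__main__":
    print(successors_at("A", 1980), successors_at("A", 1995), successors_at("A", 2010))
```

The split is the critical case because the result is no longer a single entity, so any count attached to "A" has to be apportioned among its successors.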
OST Common identifier (V)
• Time series example: Sarajevo changes from YU to BA in 1992

         CITY      COUNTRY  DATEFROM  DATETO
BEFORE   Sarajevo  YU, BA
AFTER    Sarajevo  YU       1800      1991
         Sarajevo  BA       1992      9999
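
With validity intervals in place, the country of a city at a given year becomes a simple range lookup. The sketch below uses the years from the example and the ISO 3166-1 codes YU (Yugoslavia) and BA (Bosnia and Herzegovina); the function and variable names are hypothetical.

```python
# (city, country_code, date_from, date_to); 9999 marks an open-ended interval.
CITY_COUNTRY = [
    ("Sarajevo", "YU", 1800, 1991),
    ("Sarajevo", "BA", 1992, 9999),
]

def country_of(city, year, rows=CITY_COUNTRY):
    """Return the country code valid for `city` in `year`, or None if unknown."""
    for name, code, date_from, date_to in rows:
        if name == city and date_from <= year <= date_to:
            return code
    return None

if __name__ == "__main__":
    print(country_of("Sarajevo", 1985), country_of("Sarajevo", 1992))
```

Using 9999 as the open end keeps the lookup a plain range test, with no special case for records that are still valid.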
OST Common identifier (VI)
• OBJECT / PROPERTIES DATA STRUCTURE

• The proposed data structure should be a TEMPORAL DATABASE (1), able to store
  PROPERTIES/STATUS/EVENTS, for instance with the following fields:

  PROPERTYNAME    (e.g. ownership, affiliation…)
  PROPERTYVALUE   (e.g. new owner, new affiliation)
  DATEFROM
  DATETO
  CHGREASON       (if blank, the record is still valid)
  VALUE1…N        (e.g. type of acquisition, % ownership…)

• Along with the properties, it must also be defined how properties are inherited among entities
  (e.g. CNRS Bordeaux inherits ownership, and probably sector of activity, from CNRS…)

(1) See Richard T. Snodgrass, "TSQL2 Temporal Query Language", www.cs.arizona.edu, Computer
    Science Department of the University of Arizona
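
A minimal sketch of such a property record, with inheritance resolved by walking up to the parent entity, might look like this. The field names mirror the list above; the sector value, the start year, and the parent table are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PropertyRecord:
    entity: str
    name: str                 # PROPERTYNAME, e.g. "sector"
    value: str                # PROPERTYVALUE
    date_from: int            # DATEFROM
    date_to: int = 9999       # DATETO, 9999 = still valid
    chg_reason: str = ""      # CHGREASON, blank while the record is valid
    extra: dict = field(default_factory=dict)   # VALUE1…N

# child -> parent; assumed hierarchy for the inheritance example.
PARENT = {"CNRS Bordeaux": "CNRS"}

# Invented record: only the parent carries the property.
RECORDS = [PropertyRecord("CNRS", "sector", "public research", 1950)]

def lookup(entity, name, year, records=RECORDS, parent=PARENT):
    """Return the property value valid in `year`, inherited from parents if needed."""
    current = entity
    while current is not None:
        for rec in records:
            if (rec.entity == current and rec.name == name
                    and rec.date_from <= year <= rec.date_to):
                return rec.value
        current = parent.get(current)   # not found here: try the parent
    return None

if __name__ == "__main__":
    print(lookup("CNRS Bordeaux", "sector", 2000))
```

A record stored directly on the child would be found first, so inheritance only fills in what the child does not define itself.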
APPENDIX: Temporal database Example (I)
NOVARTIS
Novartis Pharma originated from the merger of CIBA
(1884), GEIGY (1758) and SANDOZ (1876).
Until 1970 they are 3 separate entities.

LEGPCODE  LEGPNAME
1         CIBA
2         GEIGY
3         SANDOZ
4         CIBA SUB 1…N
5         GEIGY SUB 1…N
6         SANDOZ SUB 1…N

LEGPCODE  PROPNAME   PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
1         OWNERSHIP  FULLOWN    1                        100         1884      9999
2         OWNERSHIP  FULLOWN    2                        100         1758      9999
3         OWNERSHIP  FULLOWN    3                        100         1876      9999
4         OWNERSHIP  FULLOWN    1                        100         1884      9999
5         OWNERSHIP  FULLOWN    2                        100         1758      9999
6         OWNERSHIP  FULLOWN    3                        100         1876      9999
Temporal database: Example (II)
NOVARTIS

1970, first merger: CIBA + GEIGY = CIBA GEIGY LTD.

LEGPCODE  LEGPNAME
1         CIBA
2         GEIGY
3         SANDOZ
4         CIBA SUB 1…N
5         GEIGY SUB 1…N
6         SANDOZ SUB 1…N
7         CIBA GEIGY LTD.

LEGPCODE  PROPNAME        PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
1         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
2         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
3         OWNERSHIP       FULLOWN    3                        100         1876      9999
4         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
5         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
6         OWNERSHIP       FULLOWN    3                        100         1876      9999
1         TRANSFORMATION  MERGE      7                        50          1970      1970
2         TRANSFORMATION  MERGE      7                        50          1970      1970
7         OWNERSHIP       FULLOWN    7                        100         1970      9999
4         OWNERSHIP       FULLOWN    7                        100         1970      9999
5         OWNERSHIP       FULLOWN    7                        100         1970      9999
Temporal database: Example (III)
NOVARTIS

1996, second merger: CIBA GEIGY + SANDOZ = NOVARTIS

LEGPCODE  LEGPNAME
3         SANDOZ
4         CIBA SUB 1…N
5         GEIGY SUB 1…N
6         SANDOZ SUB 1…N
7         CIBA GEIGY LTD.
8         NOVARTIS

LEGPCODE  PROPNAME        PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
3         OWNERSHIP       FULLOWN    3                        100         1876      1995    MERGE
4         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
5         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
6         OWNERSHIP       FULLOWN    3                        100         1876      1995    MERGE
7         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
4         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
5         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
3         TRANSFORMATION  MERGE      8                        50          1996      9999
7         TRANSFORMATION  MERGE      8                        50          1996      9999
8         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
4         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
5         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
6         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
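
Once the ownership rows carry DATEFROM and DATETO, the owner of an entity in any year is a single range query. The sketch below loads the OWNERSHIP rows for entities 4 and 6 from the 1996 table into an in-memory database; it reads STATUSCODE2 as the LEGPCODE of the owning entity, which is an interpretation of the table, and the table name legp_prop is hypothetical.

```python
import sqlite3

# LEGPCODE -> LEGPNAME for the owning entities.
NAMES = {1: "CIBA", 2: "GEIGY", 3: "SANDOZ", 7: "CIBA GEIGY LTD.", 8: "NOVARTIS"}

# (legpcode, owner_code, datefrom, dateto) for the two subsidiary groups.
OWNERSHIP = [
    (4, 1, 1884, 1969), (4, 7, 1970, 1995), (4, 8, 1996, 9999),
    (6, 3, 1876, 1995), (6, 8, 1996, 9999),
]

def owner_in(legpcode, year):
    """Return the name of the entity owning `legpcode` in `year`, or None."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE legp_prop (legpcode INT, owner INT, datefrom INT, dateto INT)"
    )
    con.executemany("INSERT INTO legp_prop VALUES (?, ?, ?, ?)", OWNERSHIP)
    row = con.execute(
        "SELECT owner FROM legp_prop "
        "WHERE legpcode = ? AND ? BETWEEN datefrom AND dateto",
        (legpcode, year),
    ).fetchone()
    con.close()
    return NAMES[row[0]] if row else None

if __name__ == "__main__":
    for year in (1960, 1980, 2000):
        print(year, owner_in(4, year))
```

The same query shape works for any property stored this way, because the validity interval is part of every row.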

PATSTAT and PATSTAT-related resources for patent data analysis

  • 1. By Gianluca Tarasconi – Kites Univ. Bocconi / O.S.T.
  • 2. About the speaker
      ◦ Background in Management Engineering @ Politecnico of Milan
      ◦ Database Architect @ KITeS (previously CESPRI) since 2002
      ◦ Project manager for data production in the EU projects STI-NET, TENIA and AEGIS, and in the EU tenders ICT network impact, INNOVA, Highly Cited Patents, and Measurement and analysis of knowledge and R&D exploitation flows, assessed by patent and licensing data
      ◦ Collaborations on database projects with: MIT, LSE, Danish Board of Technology, Bonn Graduate School of Economics, Universität Mainz, BETA …
      ◦ Editor of the blog rawpatentdata.blogspot.com
  • 3. What is PATSTAT
      ◦ PATSTAT is a snapshot of the EPO database covering about 70 million applications from more than 80 application authorities, containing bibliographic data, citations and family links. The data must be loaded into the customer's own database.
      + low cost of ownership
      - costs of implementation
  • 4. Data Sources for PATSTAT
      ◦ The source for EP data is DOCDB (the EPO master documentation database)
      ◦ The sources for other offices are files provided by the other patent authorities
      + Good coverage for US, EU states, JP, EPO, WIPO
      - For other authorities, gaps and missing records are not easy to identify
  • 5. Implementing the DB (I)
      ◦ Over 20 tables in a relational DB, with the application ID as the main primary key
      ◦ EPO adds / improves data with each edition
  • 6. Implementing the DB (II)
      + standard scripts, a growing community to exchange procedures etc. (example)
      - needs a person who has both DB and patent data knowledge
  • 7. Plug & play extensions
      Datasets that can be added with no effort:
      ◦ REGPAT: OECD dataset giving the NUTS3 region for each applicant / inventor (EP only)
      ◦ HAN: OECD Harmonised Applicant Names dataset (EP only)
      ◦ eee_ppat: KUL/Eurostat standard names and sector allocation (all PATSTAT)
      ◦ TLS221: EPO legal data table, allowing changes of ownership, oppositions... to be included (example)
      ◦ ape-inv: inventor disambiguation tools and academic inventors
      Note: all tables except TLS221 are free of charge
  • 8. Some papers using Kites-Patstat DB
      ◦ Lissoni F., Llerena P., McKelvey M., Sanditov B. "Academic Patenting in Europe: New Evidence from the KEINS Database". Research Evaluation, 17(2): 87-102.
      ◦ Bacchiocchi E., Montobbio F. (2009). Knowledge Diffusion from University and Public Research. A Comparison between US Japan and Europe using Patent Citations. Journal of Technology Transfer, vol. 34 (2), pp. 169-181.
      ◦ Breschi S., Lissoni F., Montobbio F. (2008). University patenting and scientific productivity. A quantitative study of Italian academic inventors. European Management Review. The Journal of the European Academy of Management, 5(2): 91-109.
      ◦ Corrocher N., Malerba F., Montobbio F. (2007). Schumpeterian Patterns of Innovative Activity in the ICT Field. Research Policy, vol. 36, pp. 418-432.
      ◦ Breschi S., Lissoni F., Montobbio F. (2007). The Scientific Productivity Of Academic Inventors: New Evidence From Italian Data. Economics of Innovation and New Technology, vol. 16, issue 2, pp. 101-118.
      ◦ Della Malva A., Breschi S., Lissoni F., Montobbio F. (2007). L'attivita' brevettuale dei docenti universitari: L'Italia in un confronto internazionale. Economia e Politica Industriale, v. 2, pp. 43-70.
      ◦ Montobbio F. (2008). Patenting Activity in Latin American and Caribbean Countries. In World Intellectual Property Organization (WIPO) - Economic Commission for Latin America and the Caribbean (ECLAC), "Study on Intellectual Property Management in Open Economies: A Strategic Vision for Latin America". Forthcoming.
      ◦ Frazzoni S., Mancusi M., Rotondi Z., Sobrero M., Vezzulli A. (2011). "Relationship with banks and access to credit for innovation and internationalization in SMEs". L'EUROPA E OLTRE. Banche e imprese nella nuova globalizzazione, XVI Rapporto sul sistema finanziario italiano, Edibank, 2011. ISBN 978-88-449-0495-1.
      ◦ Sterzi V. Patent quality and ownership: An analysis of UK faculty patenting. Research Policy, 2012 (forthcoming).
  • 9. Some advanced applications
      ◦ OST patent applicants data quality procedure and match with ORBIS
      ◦ OST common identifier across the PATSTAT, WoS and Framework Programme DBs
  • 10. Applicants data quality procedure and Match with ORBIS (I)
      ◦ The goal of the procedure is to clean and standardize (C&S) patent applicant names (e.g. removing the type of company, common misspellings etc.)
      ◦ After the C&S of names, a procedure was developed that applies 5 different match algorithms in order to obtain the best matches with ORBIS company names.
  • 11. Applicants data quality procedure and Match with ORBIS (II)
      ◦ Data quality procedure developed using portable queries and tables (see Tarasconi, "Sharing names/address cleaning patterns for Patstat", from the PATSTAT user day 2011)
      ◦ Match procedure developed with the aim of being multipurpose (e.g. it has already been used to match TM vs patent applicants @ KITeS)
      ◦ Code and tables available for MySQL and Oracle. http://documents.epo.org/projects/babylon/eponet.nsf/0/92ab5eb34ff406d1c125795d0050bbcc/$FILE/PATSTAT_user_day_2011_presentations.zip
  • 12. Applicants data quality procedure and Match with ORBIS (III)
      ◦ C&S step results: from 12,280,000 patent applicants to about 3,800,000 companies
      ◦ Match against: 353,294 ORBIS companies in NACE 2540, 2630, 2651, 2910, 3030, 3011, 8422 (defense)
      ◦ Results: 94,726 patent applicants matched against 66,256 ORBIS companies
      ◦ Benchmark: validation against a 1% sample returned a precision of 91% and a recall of 95%
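The clean-and-standardize, match and benchmark steps of slides 10 to 12 can be illustrated with a minimal Python sketch. It is not the OST/KITeS procedure: the slides do not list the cleaning patterns or the 5 match algorithms, so the legal-form list `LEGAL_FORMS`, the function names and the single exact-key match used here are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical list of company-type tokens to drop; the real patterns are not in the slides.
LEGAL_FORMS = {"INC", "LTD", "LIMITED", "CORP", "CORPORATION", "GMBH",
               "AG", "SA", "SPA", "SRL", "BV", "NV", "LLC", "PLC", "CO"}


def clean_name(raw: str) -> str:
    """Return a standardized matching key for an applicant or company name."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Upper-case and remove dots so that "S.p.A." becomes "SPA".
    text = text.upper().replace(".", "")
    # Any other punctuation becomes a token separator.
    text = re.sub(r"[^A-Z0-9 ]+", " ", text)
    # Drop company-type tokens.
    return " ".join(t for t in text.split() if t not in LEGAL_FORMS)


def match(applicants, companies):
    """Pair applicants and companies whose cleaned names are identical."""
    index = {}
    for company in companies:
        key = clean_name(company)
        if key:  # skip names that clean down to nothing
            index.setdefault(key, []).append(company)
    return {(a, c) for a in applicants for c in index.get(clean_name(a), [])}


def precision_recall(found, truth):
    """Precision and recall of found pairs against a validated sample."""
    true_positives = len(found & truth)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


applicants = ["Ciba-Geigy Ltd.", "SANDOZ A.G.", "Acme Labs"]
companies = ["CIBA GEIGY LIMITED", "Sandoz AG", "Other Corp"]
pairs = match(applicants, companies)
```

A real pipeline would add fuzzy comparisons on top of the exact key, which is presumably where the additional match algorithms mentioned in the slides come in.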
  • 13. OST Common identifier (I)
      Data categories existing across patent, scientific publication and Framework Programme data:
      Category             PATSTAT                         FPs                     WoS
      Geographic data      inventors/applicants addresses  participants addresses  affiliations addresses
      Individuals          inventors, applicants           contacts                authors
      Companies            applicants                      participants            affiliations
      Sci/tech taxonomies  IPC                             TPs                     subject categories
  • 14. OST Common identifier (II)
      1) DEFINE ATOMIC ENTITIES AND NON-AMBIGUOUS JOINS
      ◦ Even when they describe similar entities, the datasets differ in the granularity of their data.
      ◦ (E.g. in WoS affiliations may be recorded by lab / dept, while in patents they may be by IP office: different size)
      ◦ The bridge dataset should use an entity size that allows a unique data match across the different sets. This might also require some changes in existing databases.
      ◦ The bridge dataset should also allow a hierarchical structure of entities, so that joins to the main datasets can be made at different levels.
  • 15. OST Common identifier (III)
      ◦ Example
  • 16. OST Common identifier (IV)
      2) TIME SERIES
      2a) DATASET ASYNCHRONIES
      ◦ Data may enter the database on a different time frame depending on the dataset.
      ◦ (E.g. PATSTAT is a full update, so a snapshot at the moment of data creation, while WoS is an incremental update; name changes or M&A could therefore make the same entity look different in the 2 datasets. Note also that geographic entities change with time: counties, countries…)
      ◦ Bridge tables must have a time-related dimension.
      2b) DATA TRANSFORMATIONS
      ◦ Data change over time.
      ◦ (E.g. companies may merge, split [the most critical case], change name, change owner…)
      ◦ Bridge tables must have a continuation dimension that allows the transformations of entities to be followed.
  • 17. OST Common identifier (V)
      ◦ Time series example: Sarajevo changes from YU to BS in 1992
      BEFORE
      Sarajevo  YU  BS
      AFTER
      Sarajevo  YU  1800  1991
      Sarajevo  BS  1992  9999
  • 18. OST Common identifier (VI)
      OBJECT / PROPERTIES DATA STRUCTURE
      ◦ The proposed data structure should be a TEMPORAL DATABASE (1) that can store PROPERTIES / STATUS / EVENTS, for instance with the following fields:
      ◦ PROPERTY NAME (e.g. ownership, affiliation…)
      ◦ PROPERTYVALUE (e.g. new owner, new affiliation)
      ◦ DATEFROM
      ◦ DATETO
      ◦ CHGREASON (if blank, the record is still valid)
      ◦ VALUE1…N (e.g. type of acquisition, % ownership…)
      ◦ Along with the properties, it must also be defined how properties are inherited among entities (e.g. CNRS Bordeaux inherits ownership, and probably sector of activity, from CNRS…)
      (1) See Richard T. Snodgrass. "TSQL2 Temporal Query Language". www.cs.arizona.edu. Computer Science Department of the University of Arizona
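The object / properties structure, the validity interval and the inheritance rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names, the `COUNTRY` and `SECTOR` property names, the `COUNTRY CHANGE` reason and the `PUBLIC RESEARCH` value are hypothetical, while the Sarajevo intervals and the 9999 "still valid" end date follow the slides.

```python
from dataclasses import dataclass
from typing import Optional

OPEN_END = 9999  # end date used in the slides for records that are still valid


@dataclass(frozen=True)
class PropertyRecord:
    entity: str
    property_name: str
    property_value: str
    date_from: int
    date_to: int = OPEN_END
    chg_reason: str = ""  # blank means the record is still valid


def value_at(records, entity, property_name, year) -> Optional[str]:
    """Return the property value valid for the entity in the given year."""
    for r in records:
        if (r.entity == entity and r.property_name == property_name
                and r.date_from <= year <= r.date_to):
            return r.property_value
    return None


def inherited_value_at(records, parents, entity, property_name, year):
    """Like value_at, but fall back to parent entities when no own record exists."""
    current, seen = entity, set()
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the hierarchy
        value = value_at(records, current, property_name, year)
        if value is not None:
            return value
        current = parents.get(current)
    return None


records = [
    # Sarajevo example from the slides; the reason text is an assumption.
    PropertyRecord("Sarajevo", "COUNTRY", "YU", 1800, 1991, "COUNTRY CHANGE"),
    PropertyRecord("Sarajevo", "COUNTRY", "BS", 1992),
    # Hypothetical property stored only on the parent entity.
    PropertyRecord("CNRS", "SECTOR", "PUBLIC RESEARCH", 1939),
]
parents = {"CNRS Bordeaux": "CNRS"}  # hypothetical hierarchy table
```

Storing the interval on every record means a join to PATSTAT or WoS can be made "as of" the year of the patent or publication instead of against the latest state only.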
  • 19. APPENDIX: Temporal database: Example (I) NOVARTIS
      Novartis Pharma originated from the merger of CIBA (1884), GEIGY (1758) and Sandoz (1876). Until 1970 they are 3 separate entities.
      LEGPCODE  LEGPNAME
      1         CIBA
      2         GEIGY
      3         SANDOZ
      4         CIBA SUB 1…N
      5         GEIGY SUB 1…N
      6         SANDOZ SUB 1…N
      LEGPCODE  PROPNAME   PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
      1         OWNERSHIP  FULLOWN    1                        100         1884      9999
      2         OWNERSHIP  FULLOWN    2                        100         1758      9999
      3         OWNERSHIP  FULLOWN    3                        100         1876      9999
      4         OWNERSHIP  FULLOWN    1                        100         1884      9999
      5         OWNERSHIP  FULLOWN    2                        100         1758      9999
      6         OWNERSHIP  FULLOWN    3                        100         1876      9999
  • 20. Temporal database: Example (II) NOVARTIS
      1970, first merge: CIBA + GEIGY = CIBA GEIGY LTD.
      LEGPCODE  LEGPNAME
      1         CIBA
      2         GEIGY
      3         SANDOZ
      4         CIBA SUB 1…N
      5         GEIGY SUB 1…N
      6         SANDOZ SUB 1…N
      7         CIBA GEIGY LTD.
      LEGPCODE  PROPNAME        PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
      1         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
      2         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
      3         OWNERSHIP       FULLOWN    3                        100         1876      9999
      4         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
      5         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
      6         OWNERSHIP       FULLOWN    3                        100         1876      9999
      1         TRANSFORMATION  MERGE      7                        50          1970      1970
      2         TRANSFORMATION  MERGE      7                        50          1970      1970
      7         OWNERSHIP       FULLOWN    7                        100         1970      9999
      4         OWNERSHIP       FULLOWN    7                        100         1970      9999
      5         OWNERSHIP       FULLOWN    7                        100         1970      9999
  • 21. Temporal database: Example (III) NOVARTIS
      1996, second merge: CIBA GEIGY + Sandoz = Novartis
      LEGPCODE  LEGPNAME
      3         SANDOZ
      4         CIBA SUB 1…N
      5         GEIGY SUB 1…N
      6         SANDOZ SUB 1…N
      7         CIBA GEIGY LTD.
      8         NOVARTIS
      LEGPCODE  PROPNAME        PROPVALUE  STATUSCODE2  STATUSTEXT  STATUSPERC  DATEFROM  DATETO  CHGREASON
      3         OWNERSHIP       FULLOWN    3                        100         1876      1995    MERGE
      4         OWNERSHIP       FULLOWN    1                        100         1884      1969    MERGE
      5         OWNERSHIP       FULLOWN    2                        100         1758      1969    MERGE
      6         OWNERSHIP       FULLOWN    3                        100         1876      1995    MERGE
      7         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
      4         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
      5         OWNERSHIP       FULLOWN    7                        100         1970      1995    MERGE
      3         TRANSFORMATION  MERGE      8                        50          1996      9999
      7         TRANSFORMATION  MERGE      8                        50          1996      9999
      8         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
      4         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
      5         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
      6         OWNERSHIP       FULLOWN    8                        100         1996      9999    MERGE
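The TRANSFORMATION rows of the Novartis example are what make it possible to follow an entity through its mergers. The Python sketch below shows one way to read them, assuming each row is reduced to a (source LEGPCODE, target LEGPCODE, year) triple; the function name `entity_at` and this reduced layout are illustrative assumptions, not part of the slides.

```python
# (source LEGPCODE, target LEGPCODE, year of the merge), taken from the
# TRANSFORMATION rows of Examples (II) and (III).
TRANSFORMATIONS = [
    (1, 7, 1970),  # CIBA            -> CIBA GEIGY LTD.
    (2, 7, 1970),  # GEIGY           -> CIBA GEIGY LTD.
    (3, 8, 1996),  # SANDOZ          -> NOVARTIS
    (7, 8, 1996),  # CIBA GEIGY LTD. -> NOVARTIS
]

NAMES = {1: "CIBA", 2: "GEIGY", 3: "SANDOZ", 7: "CIBA GEIGY LTD.", 8: "NOVARTIS"}


def entity_at(code: int, year: int) -> int:
    """Follow merge transformations up to `year` and return the surviving LEGPCODE."""
    current = code
    visited = {current}
    while True:
        # Look for a transformation of the current entity that has already happened.
        step = next((dst for src, dst, when in TRANSFORMATIONS
                     if src == current and when <= year), None)
        if step is None or step in visited:
            return current
        current = step
        visited.add(current)


# A patent filed by CIBA, seen from 1980 and from 2000.
owner_1980 = NAMES[entity_at(1, 1980)]
owner_2000 = NAMES[entity_at(1, 2000)]
```

Splits, which the slides flag as the most critical case, would need one source mapped to several targets and are deliberately left out of this sketch.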