Understanding Change in Versioned KOS on the Web
Albert Meroño-Peñuela, Christophe Guéret, Stefan Schlobach
@albertmeronyo
Evolution and variation of classification systems – KnoweScape workshop
04-03-2015
  
CEDAR: Harmonizing Historical Census Data in the Semantic Web
  
CEDAR: Source Historical Data
Dutch Historical Censuses (1795-1971)
[Public Historical Statistical Data]
  
From scans to spreadsheets
Uniform queries on the Web
1795, 1830, 1840, 1849, 1859, 1869, 1879, 1889, 1899, 1909, 1919, 1920, 1930, 1947, 1956, 1960, 1971
(through ~3K heterogeneous tables)
  
RDF Data Cube
“There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that they can be linked to related data sets and concepts.”
  
RDF Data Cube vocabulary (QB)
•  SDMX compatible
•  Defines cubes as a set of observations that consist of dimensions, measures and attributes
•  Dimensions: time period, region, sex (qb:DimensionProperty)
•  Measure: population life expectancy (qb:MeasureProperty)
•  Attribute: unit of measure = years, metadata status = measured (qb:AttributeProperty)
Observation: “the measured life expectancy of males in Newport in the period 2004-2006 is 76.7 years”
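As a rough illustration of the QB terms above, the Newport observation can be written out as a handful of triples. This is a minimal sketch using only the Python standard library. The `ex:` prefix, the property names (`ex:refPeriod`, `ex:refArea`, `ex:sex`, `ex:lifeExpectancy`, `ex:unitMeasure`, `ex:status`) and the `observation_to_turtle` helper are placeholders invented here, not identifiers taken from CEDAR or from the QB specification. Only `qb:Observation` comes from the vocabulary itself.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One cell of a cube: dimension values, a single measure, attributes."""
    uri: str
    dimensions: dict   # property -> value (property names are placeholders)
    measure: tuple     # (property, numeric value)
    attributes: dict   # property -> value


def observation_to_turtle(obs: Observation) -> str:
    """Write the observation as a Turtle-style snippet (prefix declarations omitted)."""
    pairs = [("a", "qb:Observation")]
    pairs += [(prop, f'"{value}"') for prop, value in obs.dimensions.items()]
    pairs.append((obs.measure[0], str(obs.measure[1])))
    pairs += [(prop, f'"{value}"') for prop, value in obs.attributes.items()]
    body = " ;\n    ".join(f"{prop} {obj}" for prop, obj in pairs)
    return f"{obs.uri}\n    {body} ."


# Hypothetical encoding of the example observation from the slide.
newport = Observation(
    uri="ex:obs1",
    dimensions={"ex:refPeriod": "2004-2006", "ex:refArea": "Newport", "ex:sex": "male"},
    measure=("ex:lifeExpectancy", 76.7),
    attributes={"ex:unitMeasure": "years", "ex:status": "measured"},
)

print(observation_to_turtle(newport))
```

In a published cube the dimension values would normally be URIs from a code list rather than string literals. Plain strings are used here only to keep the sketch short.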
  
Dynamic Classifications
•  Gemeentegeschiedenis.nl
http://lod.cedar-project.nl/maps/ (kudos to Richard Zijdeman)
•  HISCO
http://historyofwork.iisg.nl/

LSD Dimensions
http://lsd-dimensions.org/
https://github.com/albertmeronyo/LSD-Dimensions
Daily JSON-LD dumps
  
Concept Drift
Census classification of occupations as of 1859, 1889 and 1899
•  Root node is void
•  Depth 1: occupation groups
•  Leaves: actual occupations
  
Concept Drift
[Figure: upper ontologies (HISCO, AC) above year-dependent ontologies for 1859, 1869 and 1879; later versions marked “?”]
Predicting Change
•  KOS version chains: subsequent unique version identifiers to unique states of KOS
•  Problematic for
   – Data publishers (KOS maintainability)
   – Data users/linkers (link validity)
A. Meroño-Peñuela, C. Guéret, S. Schlobach. Predicting Change in Versioned Knowledge Organisation Systems on the Web. IJCAI 2015 (under review)
  
Predicting Change
•  Proposal: generic approach to predict when and where a Web KOS of any domain will change
   –  Using supervised learning on past versions of KOS
•  SotA¹: prediction of class extension in
   –  1 OBO/OWL version chain (Gene Ontology)
   –  using few classifiers
•  Contribution²: prediction of concept drift in
   –  150 Web KOS version chains
   –  using all (21) SotA classifiers (WEKA API)
¹ C. Pesquita, F.M. Couto. “Predicting the extension of biomedical ontologies”. PLoS Computational Biology 8 (9), e1002630
² A. Meroño-Peñuela, C. Guéret, S. Schlobach. “Predicting Change in Versioned Knowledge Organisation Systems on the Web”. IJCAI 2015 (under review)
  	
  	
  
Concept Drift
•  Proxy for change of meaning over time¹
   – Intension drift: occurs when there is a difference in the properties or attributes of two variants of the same concept
   – Extension drift: occurs when there is a difference in the individuals that belong to two variants of the same concept
   – Label drift: occurs when there is a difference in the labels of two variants of the same concept
¹ S. Wang, S. Schlobach, K. Klein. “What Is Concept Drift and How to Measure It?”. EKAW 2010.
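The three kinds of drift above can each be read as a comparison between two sets attached to the variants of a concept. The sketch below is one possible reading, not the measure defined in the cited paper: it scores each kind of drift as the Jaccard distance between the corresponding sets. The `ConceptVariant` class, the function names and all example values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptVariant:
    """A concept as it appears in one version of a KOS (hypothetical model)."""
    labels: set = field(default_factory=set)
    properties: set = field(default_factory=set)    # intension
    individuals: set = field(default_factory=set)   # extension


def jaccard_distance(a: set, b: set) -> float:
    """0.0 for identical sets, 1.0 for disjoint sets; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def drift(old: ConceptVariant, new: ConceptVariant) -> dict:
    """Score label, intension and extension drift between two variants."""
    return {
        "label": jaccard_distance(old.labels, new.labels),
        "intension": jaccard_distance(old.properties, new.properties),
        "extension": jaccard_distance(old.individuals, new.individuals),
    }


# Toy variants of one concept in two versions: same label and properties,
# but one individual left the concept and another joined it.
older = ConceptVariant({"blacksmith"}, {"group:metalwork"}, {"p1", "p2", "p3"})
newer = ConceptVariant({"blacksmith"}, {"group:metalwork"}, {"p2", "p3", "p4"})
scores = drift(older, newer)
print(scores)
```

A score above some threshold could then serve as the "changed" label for a learning task. Where to put that threshold is a separate design decision that this sketch does not address.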
  
Input Datasets
KOS version chains from
•  HISCO/CEDAR (1 version chain)
•  DBpedia (2 version chains)
•  Linked Open Vocabularies¹ (134 version chains)
•  *Ontology chains from 637 SPARQL endpoints² (6 version chains)
¹ http://lov.okfn.org/
² https://github.com/albertmeronyo/ConceptDrift-data/tree/master/src
  	
  
Features
•  From which data characteristics (related to change) should we learn?
•  SotA in Ontology Change [Stojanovic 2004]
   – Structure-driven (rdfs:subClassOf, skos:broader)
      •  maxDepth, children, parents, siblings
   – Data-driven (rdf:type)
      •  members, childMembers, parentMembers, siblingMembers
   – Usage-driven
      •  incExtLinks (on the Web!)
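To make the structure-driven and data-driven features more concrete, here is a small sketch that derives most of them for a single concept in a toy hierarchy. It assumes the KOS has already been loaded into two plain dictionaries: `broader` (concept to its direct parents, standing in for rdfs:subClassOf / skos:broader) and `members` (concept to its instances, standing in for rdf:type). The slide does not define the features precisely, so treating each one as a simple count is an assumption. `maxDepth` and `incExtLinks` are left out because their measurement is not described here.

```python
from collections import defaultdict


def concept_features(concept, broader, members):
    """Count structural neighbours of a concept and the instances attached to them.

    broader: dict mapping a concept to the set of its direct parents
    members: dict mapping a concept to the set of instances typed with it
    """
    # Invert the parent relation to get direct children.
    narrower = defaultdict(set)
    for child, parents in broader.items():
        for parent in parents:
            narrower[parent].add(child)

    parents = broader.get(concept, set())
    children = narrower.get(concept, set())
    siblings = set()
    for parent in parents:
        siblings |= narrower[parent]
    siblings.discard(concept)

    def count_members(concepts):
        return sum(len(members.get(c, set())) for c in concepts)

    return {
        "children": len(children),
        "parents": len(parents),
        "siblings": len(siblings),
        "members": count_members([concept]),
        "childMembers": count_members(children),
        "parentMembers": count_members(parents),
        "siblingMembers": count_members(siblings),
    }


# Toy occupation hierarchy (invented): root -> groups -> occupations.
broader = {
    "smith": {"metal"}, "welder": {"metal"}, "metal": {"root"},
    "farmer": {"agri"}, "agri": {"root"},
}
members = {"smith": {"a", "b"}, "welder": {"c"}, "farmer": {"d", "e", "f"}}

smith = concept_features("smith", broader, members)
metal = concept_features("metal", broader, members)
print(smith)
print(metal)
```

Each concept in each version would yield one such feature vector, which is the kind of row a supervised classifier can be trained on.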
  
Pipeline
https://github.com/albertmeronyo/ConceptDrift
  	
  
Evaluation
•  Use a subset of past versions for learning (Vt)
•  Check whether change happened by observing Vr, Ve
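One way to read this setup is as a time-ordered split of a version chain. The sketch below assumes, since the slide does not spell it out, that Vr is the version immediately after the training subset (used to decide whether a training example changed) and that Ve is the version after that (held out to check the predictions). The function name `split_chain` and the `n_train` parameter are illustrative only.

```python
def split_chain(versions, n_train):
    """Split a chronologically ordered version chain into (Vt, Vr, Ve).

    Assumed roles: Vt = training versions, Vr = next version used for
    labelling, Ve = the version after Vr used for evaluation.
    """
    if n_train < 1 or n_train + 2 > len(versions):
        raise ValueError("need at least one training version plus Vr and Ve")
    vt = versions[:n_train]
    vr = versions[n_train]
    ve = versions[n_train + 1]
    return vt, vr, ve


# Toy chain identified by census year.
chain = [1859, 1869, 1879, 1889, 1899]
vt, vr, ve = split_chain(chain, n_train=3)
print(vt, vr, ve)
```

Keeping the split strictly chronological avoids training on versions that come after the ones being predicted.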
  
Results – classifier performance
[Figure: CEDAR/HISCO classification performance over time]
[Figure: DBpedia ontology classification performance over time]
  
Results – understanding performance
Relationship between characteristics of input version chains and selected classifiers / performance?
•  totalSize
•  nSnapshots
•  avgGap
•  avgTreeDepth
•  ratioInstances
•  ratioStructural
•  ratioInserts
•  ratioDeletes
•  ratioComm
f(xi)?
•  roc
•  classifier
Table 1:
Dependent variable:
                   functions  rules      trees      functions  rules      trees      functions  rules      trees
                   (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
log(nSnapshots)    0.291      0.257      1.975      0.180      0.239      1.745      0.193      0.212      1.838
                   (0.656)    (0.765)    (1.503)    (0.680)    (0.790)    (1.512)    (0.667)    (0.777)    (1.497)
log(avgGap)        0.238      0.145      1.385*     0.266      0.173      1.269*     0.248      0.161      1.351*
                   (0.242)    (0.271)    (0.734)    (0.240)    (0.269)    (0.703)    (0.240)    (0.270)    (0.729)
log(totalSize)     0.669***   0.539*     0.052      0.636**    0.531*     0.010      0.641***   0.524*     0.025
                   (0.249)    (0.278)    (0.563)    (0.251)    (0.282)    (0.555)    (0.249)    (0.279)    (0.557)
avgTreeDepth       0.399      0.334      0.534      0.393      0.336      0.564      0.385      0.323      0.553
                   (0.302)    (0.330)    (0.719)    (0.304)    (0.334)    (0.728)    (0.303)    (0.332)    (0.728)
ratioInstances     1.378      2.463      3.090      1.071      2.246      3.394      1.269      2.330      3.221
                   (3.485)    (4.021)    (6.654)    (3.455)    (3.981)    (6.629)    (3.476)    (4.005)    (6.649)
ratioStructural    9.054      1.357      9.539      9.039      1.674      10.799     9.594      1.116      10.030
                   (6.040)    (6.135)    (13.505)   (6.142)    (6.353)    (13.945)   (6.136)    (6.267)    (13.827)
ratioInserts       3.006      2.376      3.540
                   (1.906)    (2.210)    (4.401)
ratioDeletes                                        1.918      0.929      2.341
                                                    (1.907)    (2.154)    (4.058)
ratioComm                                                                            1.440      0.945      1.615
                                                                                     (1.028)    (1.170)    (2.219)
Constant           5.610**    5.580**    12.702**   5.288**    5.259**    12.402**   4.059*     4.494*     14.266**
                   (2.248)    (2.511)    (5.954)    (2.210)    (2.494)    (5.759)    (2.265)    (2.585)    (6.511)
Akaike Inf. Crit.  313.543    313.543    313.543    316.179    316.179    316.179    314.605    314.605    314.605
Note: *p<0.1; **p<0.05; ***p<0.01
Classifier Selection
Simulation of avgGap vs. classifier family selection
  
Conclusions
•  Semantic technology for Social History
   –  It saved work!
•  Historical datasets as an observatory of dynamic KOS
   –  Logging usage of KOS in Linked Statistical Data
•  Modeling change in Web KOS
   –  Version chains are scarce (beware of bias)
   –  Chain recipe: nSnapshots, avgTreeDepth, ratioStructural, ratioInserts, ratioComm
   –  Classifier dependence: avgGap, totalSize
  
Thank you
Questions, suggestions, comments
most welcome
@albertmeronyo
https://github.com/albertmeronyo/ConceptDrift
http://www.cedar-project.nl
http://krr.cs.vu.nl/
http://easy.dans.knaw.nl/
http://lsd-dimensions.org/
Me in 6 tweets
http://www.albertmeronyo.org
•  Background: Computer Science, Web hacker, AI & Law
•  PhD candidate at the VU University Amsterdam, DANS, and eHumanities group (KNAW)
•  Topic: Semantic Web for the Humanities
•  CEDAR project (2012-2015): harmonized historical Dutch censuses in the Semantic Web
•  Problem: statistical data publishing, concept drift and dynamics of meaning
•  Last paper: What is Linked Historical Data? (EKAW 2014)
  	
  

 
Department of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfDepartment of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdf
 

Albert Merono-Penuela: Understanding Change in Versioned Web-Knowledge Organisation Systems (KOS)

  • 1. Understanding Change in Versioned KOS on the Web. Albert Meroño-Peñuela, Christophe Guéret, Stefan Schlobach. @albertmeronyo. Evolution and variation of classification systems - KnoweScape workshop, 04-03-2015
  • 2. CEDAR: Harmonizing Historical Census Data in the Semantic Web
  • 3. CEDAR: Harmonizing Historical Census Data in the Semantic Web
  • 4. CEDAR: Source Historical Data. Dutch Historical Censuses (1795-1971) [Public Historical Statistical Data]
  • 5. From scans to spreadsheets
  • 6. Uniform queries on the Web: 1795, 1830, 1840, 1849, 1859, 1869, 1879, 1889, 1899, 1909, 1919, 1920, 1930, 1947, 1956, 1960, 1971 (through ~3K heterogeneous tables)
  • 7. RDF Data Cube: “There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that they can be linked to related data sets and concepts.”
  • 10. RDF Data Cube vocabulary (QB) • SDMX compatible • Defines cubes as a set of observations that consist of dimensions, measures and attributes • Dimensions: time period, region, sex (qb:DimensionProperty) • Measure: population life expectancy (qb:MeasureProperty) • Attribute: unit of measure = years, metadata status = measured (qb:AttributeProperty) • Observation: “the measured life expectancy of males in Newport in the period 2004-2006 is 76.7 years”
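The observation quoted on slide 10 can be written down as a small RDF Data Cube fragment. The following is a minimal sketch in plain Python that emits Turtle text. Only `qb:Observation` and the `qb:` namespace come from the vocabulary itself; the `ex:` namespace, its property names, and the `Observation` helper class are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

# qb: is the RDF Data Cube namespace. ex: is a hypothetical namespace
# whose properties stand in for the cube's dimensions, measure and attributes.
PREFIXES = (
    "@prefix qb: <http://purl.org/linked-data/cube#> .\n"
    "@prefix ex: <http://example.org/ns#> .\n"
)


@dataclass(frozen=True)
class Observation:
    """One cell of the cube: three dimensions, one measure, two attributes."""
    obs_id: str
    period: str             # dimension: time period
    region: str             # dimension: region
    sex: str                # dimension: sex
    life_expectancy: float  # measure
    unit: str = "years"     # attribute: unit of measure
    status: str = "measured"  # attribute: metadata status


def to_turtle(obs: Observation) -> str:
    """Serialise a single observation as Turtle text."""
    lines = [
        f"ex:{obs.obs_id} a qb:Observation ;",
        f'    ex:refPeriod "{obs.period}" ;',
        f'    ex:refArea "{obs.region}" ;',
        f'    ex:sex "{obs.sex}" ;',
        f"    ex:lifeExpectancy {obs.life_expectancy} ;",
        f'    ex:unitMeasure "{obs.unit}" ;',
        f'    ex:status "{obs.status}" .',
    ]
    return PREFIXES + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_turtle(Observation("o1", "2004-2006", "Newport", "male", 76.7)))
```

In a real cube the dimension, measure and attribute properties would be declared as `qb:DimensionProperty`, `qb:MeasureProperty` and `qb:AttributeProperty` in a data structure definition; the sketch leaves that part out.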
  • 11. Dynamic Classifications • Gemeentegeschiedenis.nl
  • 13. Dynamic Classifications • HISCO http://historyofwork.iisg.nl/
  • 14. LSD Dimensions • http://lsd-dimensions.org/ • https://github.com/albertmeronyo/LSD-Dimensions • Daily JSON-LD dumps
  • 16. Concept Drift. Census classification of occupations as of 1859 • Root node is void • Depth 1: occupation groups • Leaves: actual occupations
  • 17. Concept Drift. Census classification of occupations as of 1889 • Root node is void • Depth 1: occupation groups • Leaves: actual occupations
  • 18. Concept Drift. Census classification of occupations as of 1899 • Root node is void • Depth 1: occupation groups • Leaves: actual occupations
  • 19. Concept Drift. Upper ontologies (HISCO, AC); year-dependent ontologies: 1859, 1869, 1879
  • 20. Concept Drift. Upper ontologies (HISCO, AC); year-dependent ontologies
  • 21. Concept Drift. Upper ontologies (HISCO, AC); year-dependent ontologies: ? ?
  • 22. Predicting Change • KOS version chains: subsequent unique version identifiers to unique states of KOS • Problematic for – Data publishers (KOS maintainability) – Data users/linkers (link validity). A. Meroño-Peñuela, C. Guéret, S. Schlobach. Predicting Change in Versioned Knowledge Organisation Systems on the Web. IJCAI 2015 (under review)
  • 23. Predicting Change • Proposal: generic approach to predict when and where a Web KOS of any domain will change – Using supervised learning on past versions of KOS • SotA [1]: prediction of class extension in – 1 OBO/OWL version chain (Gene Ontology) – using few classifiers • Contribution [2]: prediction of concept drift in – 150 Web KOS version chains – using all (21) SotA classifiers (WEKA API). [1] C. Pesquita, F.M. Couto. “Predicting the extension of biomedical ontologies”. PLoS Computational Biology 8 (9), e1002630. [2] A. Meroño-Peñuela, C. Guéret, S. Schlobach. “Predicting Change in Versioned Knowledge Organisation Systems on the Web”. IJCAI 2015 (under review)
  • 24. Concept Drift • Proxy for change of meaning over time [1] – Intension drift: occurs when there is a difference in the properties or attributes of two variants of the same concept – Extension drift: occurs when there is a difference in the individuals that belong to two variants of the same concept – Label drift: occurs when there is a difference in the labels of two variants of the same concept. [1] S. Wang, S. Schlobach, K. Klein. “What Is Concept Drift and How to Measure It?”. EKAW 2010.
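The three kinds of drift on slide 24 can be illustrated with plain set comparisons. This is a minimal sketch, assuming each variant of a concept is described by a set of labels, a set of properties (its intension) and a set of member individuals (its extension). The `ConceptVariant` class and `detect_drift` function are hypothetical names, and the strict inequality test is a simplification for illustration, not the measure defined in the cited paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptVariant:
    """One version of a concept, reduced to three comparable sets."""
    labels: frozenset      # human-readable names
    properties: frozenset  # intension: properties or attributes
    members: frozenset     # extension: individuals that belong to it


def detect_drift(old: ConceptVariant, new: ConceptVariant) -> dict:
    """Report which kinds of drift separate two variants of a concept."""
    return {
        "label": old.labels != new.labels,
        "intension": old.properties != new.properties,
        "extension": old.members != new.members,
    }


if __name__ == "__main__":
    v1859 = ConceptVariant(frozenset({"baker"}),
                           frozenset({"ex:inGroup"}),
                           frozenset({"p1", "p2"}))
    v1869 = ConceptVariant(frozenset({"baker"}),
                           frozenset({"ex:inGroup"}),
                           frozenset({"p1", "p3"}))
    print(detect_drift(v1859, v1869))
```

A graded variant could replace each inequality with a set-similarity score and a threshold; the boolean form is used here only because it maps directly onto the three definitions.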
  • 25. Input Datasets. KOS version chains from • HISCO/CEDAR (1 version chain) • DBpedia (2 version chains) • Linked Open Vocabularies [1] (134 version chains) • *Ontology chains from 637 SPARQL endpoints [2] (6 version chains). [1] http://lov.okfn.org/ [2] https://github.com/albertmeronyo/ConceptDrift-data/tree/master/src
  • 26. Features • From which data characteristics (related to change) should we learn? • SotA in Ontology Change [Stojanovic 2004] – Structure-driven (rdfs:subClassOf, skos:broader): maxDepth, children, parents, siblings – Data-driven (rdf:type): members, childMembers, parentMembers, siblingMembers – Usage-driven: incExtLinks (on the Web!)
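The structure-driven features on slide 26 can be computed from the hierarchy edges of a single KOS version. The sketch below assumes the hierarchy is given as `(child, parent)` pairs, as would be extracted from `rdfs:subClassOf` or `skos:broader` statements. The function name is hypothetical, and reading `maxDepth` as the longest path from the concept down to a leaf is an assumption, since the slide does not define it.

```python
from collections import defaultdict


def structural_features(edges, concept):
    """Count children, parents, siblings and depth below `concept`.

    `edges` is an iterable of (child, parent) pairs for one KOS version.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
        children[parent].add(child)

    # Siblings share at least one parent with the concept.
    siblings = set()
    for parent in parents.get(concept, ()):
        siblings |= children[parent]
    siblings.discard(concept)

    def depth_below(node, seen):
        # Longest downward path; `seen` guards against cycles in the data.
        below = [c for c in children.get(node, ()) if c not in seen]
        if not below:
            return 0
        return 1 + max(depth_below(c, seen | {c}) for c in below)

    return {
        "children": len(children.get(concept, ())),
        "parents": len(parents.get(concept, ())),
        "siblings": len(siblings),
        "maxDepth": depth_below(concept, {concept}),
    }


if __name__ == "__main__":
    hierarchy = [("baker", "food"), ("butcher", "food"),
                 ("food", "root"), ("pastry", "baker")]
    print(structural_features(hierarchy, "baker"))
```

The data-driven counterparts (members, childMembers and so on) could be obtained the same way by replacing the node counts with counts of `rdf:type` instances per node.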
  • 28. Evaluation • Use a subset of past versions for learning (Vt) • Check whether change happened by observing Vr, Ve
  • 29. Results – classifier performance. CEDAR/HISCO classification performance over time; DBpedia ontology classification performance over time
  • 30. Results – understanding performance. Relationship between characteristics of input version chains and selected classifiers / performance? • totalSize • nSnapshots • avgGap • avgTreeDepth • ratioInstances • ratioStructural • ratioInserts • ratioDeletes • ratioComm. f(xi)? • roc • classifier
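Some of the version-chain characteristics listed on slide 30 are simple aggregates over the snapshots of a chain. The sketch below assumes each snapshot is a `(date, size)` pair, that `avgGap` is the mean number of days between consecutive snapshots, and that `totalSize` is the sum of snapshot sizes. These definitions are assumptions made for illustration; the slides do not spell them out, and the function name is hypothetical.

```python
from datetime import date


def chain_characteristics(snapshots):
    """Summarise a version chain given as (date, size) pairs."""
    ordered = sorted(snapshots)
    dates = [d for d, _ in ordered]
    # Gap in days between each snapshot and the next one.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "nSnapshots": len(ordered),
        "avgGap": sum(gaps) / len(gaps) if gaps else 0.0,
        "totalSize": sum(size for _, size in ordered),
    }


if __name__ == "__main__":
    chain = [
        (date(2014, 1, 1), 100),
        (date(2014, 1, 11), 150),
        (date(2014, 1, 31), 200),
    ]
    print(chain_characteristics(chain))
```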
  • 33. Classifier Selection. Table 1. Dependent variable:
                        functions rules     trees     functions rules     trees     functions rules     trees
                        (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
    log(nSnapshots)     0.291     0.257     1.975     0.180     0.239     1.745     0.193     0.212     1.838
                        (0.656)   (0.765)   (1.503)   (0.680)   (0.790)   (1.512)   (0.667)   (0.777)   (1.497)
    log(avgGap)         0.238     0.145     1.385*    0.266     0.173     1.269*    0.248     0.161     1.351*
                        (0.242)   (0.271)   (0.734)   (0.240)   (0.269)   (0.703)   (0.240)   (0.270)   (0.729)
    log(totalSize)      0.669***  0.539*    0.052     0.636**   0.531*    0.010     0.641***  0.524*    0.025
                        (0.249)   (0.278)   (0.563)   (0.251)   (0.282)   (0.555)   (0.249)   (0.279)   (0.557)
    avgTreeDepth        0.399     0.334     0.534     0.393     0.336     0.564     0.385     0.323     0.553
                        (0.302)   (0.330)   (0.719)   (0.304)   (0.334)   (0.728)   (0.303)   (0.332)   (0.728)
    ratioInstances      1.378     2.463     3.090     1.071     2.246     3.394     1.269     2.330     3.221
                        (3.485)   (4.021)   (6.654)   (3.455)   (3.981)   (6.629)   (3.476)   (4.005)   (6.649)
    ratioStructural     9.054     1.357     9.539     9.039     1.674     10.799    9.594     1.116     10.030
                        (6.040)   (6.135)   (13.505)  (6.142)   (6.353)   (13.945)  (6.136)   (6.267)   (13.827)
    ratioInserts        3.006     2.376     3.540
                        (1.906)   (2.210)   (4.401)
    ratioDeletes                                      1.918     0.929     2.341
                                                      (1.907)   (2.154)   (4.058)
    ratioComm                                                                       1.440     0.945     1.615
                                                                                    (1.028)   (1.170)   (2.219)
    Constant            5.610**   5.580**   12.702**  5.288**   5.259**   12.402**  4.059*    4.494*    14.266**
                        (2.248)   (2.511)   (5.954)   (2.210)   (2.494)   (5.759)   (2.265)   (2.585)   (6.511)
    Akaike Inf. Crit.   313.543   313.543   313.543   316.179   316.179   316.179   314.605   314.605   314.605
    Note: * p<0.1; ** p<0.05; *** p<0.01
  • 34. Simulation of avgGap vs. Classifier Family Selection
  • 35. Conclusions • Semantic technology for Social History – It saved work! • Historical datasets as an observatory of dynamic KOS – Logging usage of KOS in Linked Statistical Data • Modeling change in Web KOS – Version chains are scarce (beware of bias) – Chain recipe: nSnapshots, avgTreeDepth, ratioStructural, ratioInserts, ratioComm – Classifier dependence: avgGap, totalSize
  • 36. Thank you Questions, suggestions, comments most welcome @albertmeronyo https://github.com/albertmeronyo/ConceptDrift http://www.cedar-project.nl http://krr.cs.vu.nl/ http://easy.dans.knaw.nl/ http://lsd-dimensions.org/
  • 37. Me in 6 tweets. http://www.albertmeronyo.org • Background: Computer Science, Web hacker, AI & Law • PhD candidate at the VU University Amsterdam, DANS, and eHumanities group (KNAW) • Topic: Semantic Web for the Humanities • CEDAR project (2012-2015): harmonized historical Dutch censuses in the Semantic Web • Problem: statistical data publishing, concept drift and dynamics of meaning • Last paper: What is Linked Historical Data? (EKAW 2014)