WebART in 10 minutes

Jaap Kamps

Introduction to the WebART project -- Web Archive Retrieval Tools -- http://staff.science.uva.nl/~kamps/webart/

Web Archive Retrieval Tools
Paul Doorenbosch, Jaap Kamps, Richard Rogers, Arjen de Vries, René Voorburg

CATCH Meeting HiTime e-History, November 1, 2011

1. Web Archive Retrieval Tools. Paul Doorenbosch, Jaap Kamps, Richard Rogers, Arjen de Vries, René Voorburg. CATCH Meeting HiTime e-History, November 1, 2011

2. Information Access Paul Doorenbosch Arjen de Vries René Voorburg Web Archive Jaap Kamps New Media Richard Rogers

3. Unlimited ways to publish/access/share information

4. Our daily lives take place “on the Web”

5. Ease of publishing on the Web comes at a price. Web content is ephemeral. Web archives preserve the heritage of the future.

6. Focus on use: Web research(ers)

7. Search support has massively improved

8. Complex tasks are still painstaking! Many queries, tabs, notes, cut-and-paste, ...

9. Exploratory and faceted search

10. Interactively construct a (hidden) query

11. Search strategy from building blocks

12. Each block = data or manipulations. Strategy Builder. Build a dedicated search engine “on the fly”

13. Research methods become search strategies. Store, refine, reuse, share strategies. Researchers enrich the archive.

14. Archival selection determines future use

15. Digital humanities is a paradigm switch

16. Supporting Complex Search Tasks. Nick Belkin, Charlie Clarke, Ning Gao, Jaap Kamps, Jussi Karlgren. Thanks! SIGIR 2011 Workshop, July 28, 2011

Editor's notes

Good afternoon. My name is Jaap Kamps and it is my pleasure to introduce the WebART (Web Archive Retrieval Tools) project.
The project is a collaboration of three groups of researchers:
1. Specialists working on Information Access (Computer Science, Arjen de Vries);
2. New media scholars working on the Web and the Web Archive (Humanities, Richard Rogers); and
3. Web archivists from the Dutch Web Archive (Heritage Sector, René Voorburg and Paul Doorenbosch).
What is special is that all three groups are actively building technical tools -- the Koninklijke Bibliotheek does large-scale crawling; the new media scholars build dedicated crawlers/screen-scrapers and analysis tools; and the computer scientists think they know the next generation of search tools.
The Web is a unique object with an unprecedented size and growth curve, and by far the largest source of information on -- basically -- everything. The Web has had a revolutionary impact on how we publish, access, and share information.
In fact, it has a fundamental impact on our daily lives, which increasingly take place “on the Web.”
However, this increasing dependence on the Web comes at a price: the ease of publishing on the Web also results in the easy loss of information, because Web content tends to be ephemeral. This project addresses the problem of our future cultural heritage. Globally this has been addressed head-on by the Internet Archive, now supplemented by many national initiatives.
We want to focus not on preservation, but on use. That is, we critically assess the value of Web archives for realistic research scenarios, and develop information access tools and methods that maximize the archive’s utility for research. Web research tends to require complex selections and manipulations of the data.
Search technology has advanced at an insane rate over the last decade. Who is still old enough to remember the early days of the Web, when people spent considerable parts of their time collecting and organizing bookmarks?
Despite the progress, complex tasks are still poorly supported by modern search engines! The best strategy is to slice-and-dice the complex information need into many small sub-requests, and then combine all the information post hoc, outside the search engine, into a coherent answer.
Some systems allow for more complex interaction -- for example, systems catering for exploratory or faceted search.
Such systems create complex search queries in the back end -- and in restricted domains much of the complexity can be hidden from the searcher.
What if we had a way to open up this box, and allow searchers to manipulate complex requests or search strategies directly by combining several building blocks in unconstrained ways? Modern structured DB/IR technology allows for powerful, declarative queries or search strategies, turning a collection of Web pages into a high-dimensional data space.
Each building block corresponds to a particular data source or manipulation of the data. The search strategy effectively builds a dedicated search engine “on the fly”.
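To make the building-block idea concrete, here is a minimal Python sketch. It is an illustration only, not the project's Strategy Builder: the `Capture` record, the block names (`source`, `contains`, `crawled_between`), and the example URLs are all hypothetical. A data block emits archived pages, manipulation blocks filter them, and chaining the blocks yields a small dedicated search engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


# Hypothetical record type: one archived capture of a Web page.
@dataclass(frozen=True)
class Capture:
    url: str
    year: int
    text: str


# A block maps a stream of captures to a stream of captures.
Block = Callable[[Iterable[Capture]], Iterator[Capture]]


def source(captures: list[Capture]) -> Block:
    """Data block: ignores its input and emits a fixed collection."""
    def run(_: Iterable[Capture]) -> Iterator[Capture]:
        return iter(captures)
    return run


def contains(term: str) -> Block:
    """Manipulation block: keep captures whose text mentions the term."""
    def run(stream: Iterable[Capture]) -> Iterator[Capture]:
        return (c for c in stream if term.lower() in c.text.lower())
    return run


def crawled_between(first: int, last: int) -> Block:
    """Manipulation block: keep captures from an inclusive range of years."""
    def run(stream: Iterable[Capture]) -> Iterator[Capture]:
        return (c for c in stream if first <= c.year <= last)
    return run


def strategy(*blocks: Block) -> Block:
    """Chain blocks in order; the result is itself a block."""
    def run(stream: Iterable[Capture]) -> Iterator[Capture]:
        for block in blocks:
            stream = block(stream)
        return iter(stream)
    return run


# Made-up example data, for illustration only.
archive = [
    Capture("http://example.nl/a", 2009, "Elections and the Web"),
    Capture("http://example.nl/b", 2011, "Web archive of election blogs"),
    Capture("http://example.org/c", 2011, "Cooking recipes"),
]

engine = strategy(source(archive), contains("election"), crawled_between(2010, 2011))
results = [c.url for c in engine([])]
print(results)
```

Because a strategy is itself a block, strategies can be nested inside larger strategies, which is one plausible reading of combining blocks "in unconstrained ways".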
What will happen if we put these tools in the hands of the Web researchers? We will develop the appropriate building blocks and incrementally let them construct complex search strategies. Effectively, this means they can do their research on the fly, rather than face a turnaround time of weeks or months for developing the right kind of crawler and the right kind of analysis tool, and then executing them. Moreover, researchers can store the search strategies, reuse and refine them, and share them with colleagues. In essence, the research methods will evolve in parallel with the search strategies, at a much faster pace and scale than ever before...
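One way to picture storing, refining, and sharing a strategy is to describe it as plain data. The sketch below is an assumption for illustration, not the project's actual format: the block registry, the block names `contains` and `limit`, and the JSON layout are all hypothetical.

```python
import json


# Hypothetical block constructors; each returns a function over a list of documents.
def contains(term):
    return lambda docs: [d for d in docs if term.lower() in d.lower()]


def limit(n):
    return lambda docs: docs[:n]


# Hypothetical registry mapping block names in a stored strategy to constructors.
BLOCKS = {"contains": contains, "limit": limit}


def build(spec):
    """Turn a declarative strategy (a list of block descriptions) into a function."""
    steps = [BLOCKS[step["block"]](**step["args"]) for step in spec]

    def run(docs):
        for step in steps:
            docs = step(docs)
        return docs

    return run


spec = [
    {"block": "contains", "args": {"term": "archive"}},
    {"block": "limit", "args": {"n": 2}},
]

stored = json.dumps(spec)        # store or share the strategy as text
refined = json.loads(stored)     # a colleague reloads it ...
refined[1]["args"]["n"] = 1      # ... and refines one parameter

# Made-up documents, for illustration only.
docs = ["Web archive policy", "Dutch Web Archive crawl", "News", "Archive of blogs"]
original_hits = build(json.loads(stored))(docs)
refined_hits = build(refined)(docs)
print(original_hits, refined_hits)
```

Keeping the strategy as data rather than code is a design choice made for this sketch: it lets the same description be saved, compared, and edited without touching the blocks themselves.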
However, the chosen selection and archiving strategies of Web material will have a crucial impact on their future value as cultural heritage. What choices are made or enforced upon us? What is the missing Web? The broken Web? The banned Web? We will critically evaluate the state of the Web Archive; the resulting recommendations may prevent the loss of digital heritage.
Progress is particularly thorny since we combine radically different research paradigms -- the truth-finding paradigm of the exact sciences and the interpretative paradigm of the humanities. We are in the unique situation of three disciplines (Computer Science, Media Studies, Heritage Field) looking at the same object of study, although each sees it in a different way.
