This document discusses the Vogue Archive digital project undertaken by Condé Nast from 2009-2011 to digitize and provide access to the complete run of Vogue magazine. The project involved scanning over 2,700 issues containing over 400,000 pages and 120,000 articles and images. Sophisticated metadata and taxonomies were developed to organize and search the content. Both internal Condé Nast employees and external researchers and brands can access the high-resolution digital archive. The document explores how this corporate digital archive project relates to and advances the field of digital humanities and alternative library and information science.
6. What is Vogue?
• Monthly document of record, “The Bible”
• Fashion history
• High society
• Fashion in context
• Joan Didion, Richard Avedon, Irving Penn, Susan Sontag, Arthur Schlesinger, Jr., William Faulkner, Graham Greene…
9. Vogue Archive Overview
• Robust digital archive
• 1892-present: every article, image, and advertisement
• Discover content by:
• Browsing by issue date
• Searching and faceting on specific terms
• Stunning hi-res visuals
• Powerful metadata
• Broad communities of users
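The two discovery paths on this slide, browsing by issue date and searching with facets, can be sketched in a few lines. This is a minimal illustration only: the `Record` fields, the sample data, and the helper names are all hypothetical, and nothing here reflects how the Vogue Archive's search server is actually implemented.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Record:
    """One archive item. All field names are hypothetical."""
    title: str
    issue_date: str  # ISO date of the issue
    kind: str        # "article", "image", or "advertisement"
    keywords: set[str] = field(default_factory=set)


def search(records, term=None, **facets):
    """Return records that carry the keyword term and match every facet value."""
    hits = []
    for rec in records:
        if term is not None:
            if term.lower() not in {k.lower() for k in rec.keywords}:
                continue
        if all(getattr(rec, name) == value for name, value in facets.items()):
            hits.append(rec)
    return hits


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(getattr(rec, facet) for rec in records)


def browse_by_issue(records, issue_date):
    """Browse: every record from a single issue, in input order."""
    return [rec for rec in records if rec.issue_date == issue_date]


# Invented sample data, for illustration only.
RECORDS = [
    Record("Spring coats", "1965-03-01", "article", {"Coat", "Trench"}),
    Record("Coat advertisement", "1965-03-01", "advertisement", {"Coat"}),
    Record("Evening portrait", "1972-10-01", "image", {"Gown"}),
]

if __name__ == "__main__":
    coat_hits = search(RECORDS, term="coat")
    print([r.title for r in coat_hits])
    print(dict(facet_counts(coat_hits, "kind")))
    print([r.title for r in search(RECORDS, term="coat", kind="article")])
```

Facet counts are computed over the current result set, so a user sees how many hits each refinement would leave before clicking it.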
16. Workflow Overview
CN Digital Archive
CN Physical Archive
External vendor:
- Image scans
- OCR
- XML conversion
Digital Archive, CN technologists
Vogue Archive
17. Conversion and QA Workflow
Issues out to vendor
Hi-Res Tiff, JPG, XML, OCR
25 TB Tiffs, 5 TB JPGs
Delivery to Condé Nast for storage
Ingest XML, staging
Create thumbnails, deep zoom, printable images
Image storage on Web Servers
Master XML on SQL databases
Keyworder + text-mining
Ingest XML, production
Fast Feed Update, data merge
Vogue Archive Data Store (SQL)
FAST server
Viewer
18. Taxonomies and Ontologies
ALL ABOUT METADATA…
• OCR
• Digital Archive + Fashion Historians
…AND STANDARDIZATION
• Hierarchies + flat lists
• Industry vocabulary, Vogue usage
• Proper terms
19. Hierarchies
KEYWORD | CATEGORY | DESCRIPTOR | SYNONYMS | SUGGESTED SEARCHES
Clothing | Coat | Fit-and-flare | Fit-and-flare = Fit and flare = Hourglass |
Clothing | Coat | Fitted | |
Clothing | Coat | Frock | Frock = Cutaway | Prince Albert, Victorian
Clothing | Coat | Gabardine | | Trench, Raincoat, Mac, Mackintosh, Military
Clothing | Coat | Greatcoat | Greatcoat = Watchcoat | Military
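One way to read the coat hierarchy is as a set of preferred terms, each with a synonym ring and a list of suggested searches. The sketch below shows how a spelling variant could resolve to a preferred term and its place in the hierarchy. The dictionary layout, the function names, and the treatment of "Hourglass" as a synonym are assumptions made for illustration; the slide does not describe the Keyworder's actual data model.

```python
import re

# Rows adapted from the slide; the field names are hypothetical.
TAXONOMY = [
    {"descriptor": "Fit-and-flare", "path": ("Clothing", "Coat"),
     # "Hourglass" is read here as a synonym; the slide layout is ambiguous.
     "synonyms": ["Fit and flare", "Hourglass"], "suggested": []},
    {"descriptor": "Frock", "path": ("Clothing", "Coat"),
     "synonyms": ["Cutaway"], "suggested": ["Prince Albert", "Victorian"]},
    {"descriptor": "Greatcoat", "path": ("Clothing", "Coat"),
     "synonyms": ["Watchcoat"], "suggested": ["Military"]},
]


def normalize(term):
    """Lowercase and collapse hyphens and whitespace into single spaces."""
    return re.sub(r"[\s\-]+", " ", term.strip().lower())


def build_index(taxonomy):
    """Map every normalized descriptor and synonym to its taxonomy entry."""
    index = {}
    for entry in taxonomy:
        for variant in [entry["descriptor"], *entry["synonyms"]]:
            index[normalize(variant)] = entry
    return index


def resolve(term, index):
    """Return the preferred term, hierarchy path, and suggested searches."""
    entry = index.get(normalize(term))
    if entry is None:
        return None
    return {
        "preferred": entry["descriptor"],
        "path": " > ".join((*entry["path"], entry["descriptor"])),
        "suggested": list(entry["suggested"]),
    }


if __name__ == "__main__":
    index = build_index(TAXONOMY)
    for query in ["fit and flare", "Watchcoat", "parka"]:
        print(query, "->", resolve(query, index))
```

Normalizing before lookup is what lets "Fit-and-flare" and "Fit and flare" land on the same entry without listing every punctuation variant by hand.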
29. DH in the Private Sector
(James Smithies)
“…narrowing the gap between the commercial and scholarly worlds.”
“…out of the ivory tower and back to an engagement with the ‘real world.’”
30. DH in the Private Sector
Branding
Marketing
“Competing for Attention”
(Tom Scheinfeldt)
31. Funding—Corporate Lessons?
“The Best Revenue Models and Funding Sources for Your Digital Resources” (Ithaka S+R and JISC)
Advertising
Selling/Licensing Content
Selling/Licensing Platforms + Expertise
32. Is the Vogue Archive a DH project?
• It’s complicated…
• YES:
• Digital processing of archival collections
• Networked information
• Enhanced study and research
• Inherently interdisciplinary
• Fashion, History, Journalism, Computer Sciences
• Has an essential public-facing application
• NO:
• Commissioned and executed in a corporate environment
• Subscription-based
• Not overtly concerned with humanistic inquiry
Can talk about other archive projects… WWD, AD, Montrose
While DH and alt-LIS are mostly thought of as existing within academia, here I want to make a case for a parallel world of similar activity happening in the private sector:
• Using technical solutions to render flat collections searchable, machine readable, and eminently networked
• Inherently public-facing (although with an obvious corporate caveat, which we’ll get to)
• Capitalizing on relatively new roles and relationships among librarians, archivists, technologists, and so on
I hope to show that my digital archive group is something of a corporate analog to groups like yours.
Made innovations in advertising (full-page colors)
Editorial Assets and Rights
• Digital Archive
• Physical Archive
• Library
• Technology
• Permissions + Contracts
Diverse group of skills and backgrounds
• Govern all content preservation and reuse workflows
• Bridge Editorial and Business departments across company
• Perfectly composed to fulfill mandate of developing digital archive projects and managing company’s digital assets
• Pure corporate market: trend forecasting, design inspiration, ad research – fashion houses, dept stores, stylists
• Cultural nonprofit: Met Costume Institute (like Punk exhibit…this is all about trends field), MoMA…research for exhibitions, cultural research, catalog writing
• Academic/Public libraries: FIT, Parsons, NYPL…scholarly research
• Digital and physical archive staff members use it for research and exhibit prep
• Other CN employees, like Vogue magazine and business departments, use it for editorial and business research needs
• External researchers: individual users and institutions such as ad companies, libraries, and fashion houses
In 2009, Vogue editorial and business departments met to plan an archive project, motivated by two primary interests:
• Preserving the magazine’s history
• Monetizing its archival collection
Vogue’s attention to its brand and image was a decisive factor in pursuing this and in how design elements played out
• Move from strictly a product into a functional digital archive
• Serve as a brand-compatible delivery mechanism for Vogue content
Timing: in 2009, print media was eager to develop digital products for generating new revenue streams
• Emulate digital archives of Rolling Stone and Playboy
• A consumer-facing product to boost subscriptions
• We ended up outdoing and improving upon both, generating a truly enhanced, integrated archival product
Editorial Assets and Rights
• Parent department of Library, Digital, and Physical Archives
• Home also to copyright lawyers and legal analysts
Vogue Magazine
• Editorial including print and Vogue.com
• Business including sales, marketing, advertising, and licensing divisions
WGSN
• Fashion trend forecasting and advertising company
ProQuest
• E-publisher specializing in archival databases of academic content and a wide range of periodicals
Work was spread out among various groups:
A core in-house team of archivists and librarians, project managers, fashion historians, technologists, and lawyers
• Generated taxonomies and authority files
• Handled all tagging duties and research questions
• Managed databases and production environments
External partners in site development and design
• Built custom front- and back-end tools
• Maintained security, authentication, and analytics pieces
Scanning and XML conversion vendors
• Facilities in Virginia and India
• Preserve Vogue’s 120+ year publishing catalog
• Create a multi-faceted research tool, flexible enough to fit the research and commercial needs of different communities of users
• Build a product that people would want to use, and to pay for
• Design had to be elegant, intuitive, user-friendly
We sought to incorporate functional and aesthetic aspects of e-commerce sites + “next generation catalogs” and OPACs:
• Breadcrumbs, faceted searching, relevancy rankings
• Strong visuals + core metadata display
• Establishing relationships among assets
*Workflow schematic courtesy of Demetri Vasiadis
Metadata: pulled from OCR and generated by taggers through extensive development
Primary objective: standardize data
• Hierarchical lists of common terms
• Based off industry standards
• Vogue usage: vocabulary, syntax, in-house style
• Universal and contemporary terminology
Proper terms
• People
• Trends
• Brands
We divided everything up into Edit and Ad.
Our concern for Advertisements was providing baseline information, not going too far or into detail.
Edit pages, however, needed to be built out more. We wanted to “deep tag” everything (which we’ve done back through the 1940s).
Montrose
• Dynamic, customizable internal DAM system: 3.95 million assets
• One-stop research + licensing functionality across all user groups
Crowdsourcing Fashion Images
• Designing tagging projects using Amazon Mechanical Turk
• Geared towards making royalty-free images discoverable
Private DAM/Photo Portal
• Enriched, upstream metadata
• Reconceiving life-cycle of an asset
• Several million additional assets
WWD Archive
• Digital archive of 100 years of daily WWD issues: over 1 million pages
Architectural Digest Archive
• Digital archive of 90 years of roughly monthly issues
In fact, there’s an entire website called “what is digital humanities.com”, created by a practitioner in the field, where every time you refresh the page, another definition pops up.
So right away, we have the opportunity to step into this nominal and interpretive space and assert new definitions, or at the very least, new offshoots of ongoing definitions and conceptions.
• Open Syllabus (Columbia, et al.): “An effort to create the first large-scale online database of university course syllabi as a platform for the development of new research, teaching, and administrative tools.”
• Humanities Studios (metaLAB): “…They combine in-depth research, design thinking, and hands-on training with digital tools and media in an environment that involves sustained cross-disciplinary teamwork.”
• Women Writers Project (Brown, Northeastern): “…Our goal is to bring texts by pre-Victorian women writers out of the archive and make them accessible to a wide audience of teachers, students, scholars, and the general reader. We support research on women's writing, text encoding, and the role of electronic texts in teaching and scholarship.”
• Omeka and Neatline (George Mason): “…a free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions”
• Building Inspector and Map Warper (NYPL Labs): Crowdsourcing and visualization tools, respectively, for using historical maps
• Scalar (USC): “Born-digital, open source, media-rich scholarly publishing that’s as easy as blogging”
Image courtesy of NYPL Labs
“…because the library already functions as a [sic] interdisciplinary agent in the university, it is the central place where DH work can, should be and is being done. DH projects involve archival collections, copyright/fair use questions, information organization, emerging technologies and progressive ideas about the role of text(s) in society, all potential areas of expertise within the field of librarianship.” (Micah Vandegrift)
“…I see libraries playing a role in the collection, re-purposing and organising of data that may lead to further analysis by individual researchers or (sub)departments.” (Ben Showers)
“Two points bear comment here. Firstly, digital humanists might not only be the advance guard of a movement of the humanities into the digital world. They might also be, in effect if not intent, narrowing the gap between the commercial and scholarly worlds. For the first time since World War Two, a significant humanist movement has appeared that engages directly with methods and attitudes commonly held in the commercial and governmental worlds. Far from being a flight into cyberspace, the digital humanities might represent an emergent movement out of the ivory tower and back to an engagement with the ‘real world.’ If this is the case, humanists might want to be concerned. Even if the intent isn’t there, the need to control increasingly complex technologies demands a degree of rationalization and control that runs counter to a great deal of our tradition.”
“That said, advertising is not at all a new idea in the scholarly community. For many years, publishers of scholarly journals have sold ad space in the back pages of their publications, albeit at fairly modest rates. Today, some content holders have been willing to experiment with Google Adsense, though most examples of substantial support we have observed have come from sponsorship arrangements, not basic advertising.”
“For example, an obvious source of value for many projects resides in the content it creates. It may be valuable to those who view it but it may also be valuable to other third party publishers who may find creative ways to reuse it in other works. In this case, selling the content to users is one option; licensing it to third parties is another; and developing a premium version for specific uses – such as broadcast quality video for commercial programmes (fee) versus online viewing of the same clips (free) – is yet another.
The technology created in the process of developing a digital resource can also yield real value. Some project leaders may choose to license the use of the platform they have created to others, or to charge a fee for specialised tools, while the underlying content is still freely available. In a similar vein, the expertise that a team develops when undertaking such work can sometimes be leveraged and turned into a fee-for-service activity, as team members consult with other projects, whether in an advisory role, or in helping to build other technology activities in their field.”