We conducted a series of informational interviews with several people at the Frick, not only to learn more about their career paths but also to discuss the current state of librarianship, museums and digital culture, and the close working relationship they share at the Frick Art and Reference Library.
Mark Bresnan | Head, Bibliographic Records
Sumitra Duncan | Web Archiving Coordinator, NYARC
Deborah Kempe | Chief, Collections Management and Access
Louisa Wood Ruby | Head, Photoarchive Research
Public Services Department:
Ralph Baylor | Assistant Librarian for Public Services
Archiving is traditionally thought to be done at an aggregate level, but the kind of detail required in quality assurance (QA) procedures within web archiving proves to be the antithesis of this concept. Instead of a broad-stroke approach, every page and every link must be inspected and reviewed to ensure relevant captures have been made. If necessary, patch crawls must be performed to capture any missed materials, and the crawl's scope may be adjusted for future captures.
55 Gansevoort – Megan
55 Gansevoort, aptly named for its building address, was a small gallery that opened in the fall of 2013. After more than 20 exhibitions at this location, 55 Gansevoort had to close its doors. Ellie Rines opened the tiny pop-up gallery to showcase contemporary, experimental works. The gallery was touted as being “open” 24/7, since anyone could see inside the small space through the doors. Rines also has a project space in Marfa, TX, as well as a new pop-up space that opened in December 2015 at 56 Henry Street in New York. The Wayback URL can be found here.
This was the first site I QA’d. It is part of the NYARC New York City Gallery Collection. I had only one issue in the QA process, with capturing a video; otherwise this gallery was pretty straightforward.
Brooklyn Museum – Megan
The Brooklyn Museum site was begun by a previous NYARC intern and Pratt alum, Chantal Sulkow. Before she handed it off to me, we spent a few hours looking at the live and archived site together and reviewing some of the QA issues that had arisen. The biggest issue was the common use of carousels on the Dinner Party Exhibition pages. Examples of this can be seen here and here. The carousel images were not loading correctly: they seemed to be captured, but were not visible in the carousel. As a user clicked through the carousel, only metadata text about the image was visible.
The Dawson Dawson-Watson Catalogue Raisonné is an ongoing family project. It is part of the NYARC Catalogue Raisonné collection. The site features not only a catalogue raisonné for Dawson-Watson, but also an extensive family tree, articles, resources and archival photographs of the Dawson-Watson family.
During the QA of the archived website, I realized that updates had been made. Another crawl date was set and I began reviewing again. I had to set up multiple patch crawls, but everything could be captured this way. The site is continually being updated, so I’m sure that in another 3-5 months it will need to be crawled and QA’d again.
The site maintains a highly dynamic form, and a vast majority of the site was not captured in the initial crawl, including all the site’s images, videos, and interactive capabilities, which in turn affected the site’s overall visual structure. Luckily, a large portion of these lost materials rendered during QA and I was able to patch crawl them.
The Frick Collection – Sydney
The Frick’s site was passed down from a former intern. During her time with the site she was able to identify and address all of the major issues with web archiving. The main problems with the site were, of course, with dynamic content: virtual tours added to the website beginning in June 2015, which Archive-It’s crawler cannot capture, and most embedded video, which may be captured but always fails to play back. I primarily focused on making sure all new content was captured successfully each month as the institution continued to update its webpage and make more information available. The archived version of this site can be found here.
Gladstone Gallery – Sydney
The contemporary gallery maintains three gallery spaces in New York and one in Brussels. Across these spaces the gallery supports a large list of artists, each of whom receives a page on the overall site highlighting the exhibition(s) they have held or participated in at the Gladstone Gallery.
There were no issues while performing QA, though there are several issues with playback on the live site, specifically with dynamic images and image carousels. A more complete capture should be possible after the site is restructured. The archived version of this website can be found here.
The Arshile Gorky Foundation – Megan
The Arshile Gorky Foundation website is part of the NYARC Catalogue Raisonné Collection. The website includes an image gallery, a catalogue raisonné, a chronology of Gorky’s life, research resources, and archival photographs, as well as information about registering works with the foundation (for potential owners of Gorky’s work).
This was the easiest website for me to QA; the archived website is here. The site is built simply and clearly, and there were no issues with any images or forms loading.
The Library of William Morris – Sydney
This site maintains a detailed account of all the physical texts once owned by the artist, bookmaker and collector, William Morris. The website is divided by date according to the production of the text. Within each record the provenance history subsequent to Morris’ ownership is mapped out, occasionally including images and links to view a full-text version of the holding.
The site itself was fairly complete upon the first crawl, with no obvious image or formatting issues. However, there were two main issues with QA’ing the site: the mysterious case of the increasing URLs and the more than 64,000 missing URLs triggered. Both issues have been discussed with Karl at Archive-It (ticket #3974) and currently have no solutions, though once he has discussed them with the engineers he will let us know if there is a way to rectify the problems. The archived version of this website can be found here.
The Victorian Web – Sydney
This extremely extensive site provides an all-encompassing history of the Victorian era, including prominent figures, artists, movements, living conditions, cultural advancements, architecture, etc. Though the information is undoubtedly thorough, the structure of the site itself is clunky and overwhelming, each page containing a plethora of links which in turn lead to others, ultimately resulting in a tangled spider web of sites.
The majority of the missing URLs pertained to images which could not be captured within the time limit of the crawl. I was able to patch crawl a couple hundred but was unable to get thousands of them to render as missing. I discussed this issue with Sylvie (ticket #3557), who stated she would look into it (it’s been 4 months since our last interaction). Until then I have recorded each instance where a missing URL should be patch crawled, so that when the issue has been resolved I will simply have to pull up those pages and trigger the missing URLs. The archived version of this website can be found here.
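For anyone keeping a similar record, here is a minimal sketch of how such a list of pages could be tracked, assuming a simple CSV log. The field names, the `PendingPatch` class, and the example.org addresses are hypothetical illustrations; this is not an Archive-It feature or the exact record I kept.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class PendingPatch:
    """One archived page whose missing URLs still need to be triggered (hypothetical structure)."""
    page_url: str           # page to pull up again once the issue is resolved
    note: str               # what appeared to be missing, e.g. "images"
    ticket: str = ""        # related support ticket number, if any
    resolved: bool = False  # set to True after a successful patch crawl


def write_log(entries, handle):
    """Write all entries to an open text handle as CSV, with a header row."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(PendingPatch)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


def pages_to_revisit(entries):
    """Return the page URLs that still need a patch crawl."""
    return [entry.page_url for entry in entries if not entry.resolved]


# Example with placeholder URLs (assumptions, not real pages from the site).
log = [
    PendingPatch("https://example.org/page-a", "images", ticket="3557"),
    PendingPatch("https://example.org/page-b", "images", ticket="3557", resolved=True),
]
buffer = io.StringIO()
write_log(log, buffer)
print(pages_to_revisit(log))
```

A plain spreadsheet would serve the same purpose; the point is only that each page is recorded once, with enough context to revisit it later.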
After twelve years of operation, Wallspace closed its Chelsea doors in August 2015. Co-founders Jane Hait and Janine Foeller opened the gallery in 2003, exhibiting artists and sometimes helping to launch their careers.
Wallspace is part of the NYARC New York City Galleries Collection. The design of the website was a little tricky at first. The banner that appears at the top when opening the archived website in Archive-It blocked all text, so upon first viewing it appeared the page was blank (once the Archive-It header was minimized, one could easily see the site). There were some issues during the QA with carousel images not loading, but after working with Internet Archive staff and Archive-It engineers, the issues were resolved.
William Blake Archive – Sydney
This rather dated site holds a compendium of information on the Romantic poet, painter and printmaker William Blake. The home page is divided into eleven categories, including: an artistic archive of illuminated books, drawings, paintings, etc., the Blake/An Illustrated Quarterly journal, recent additions to the archive, a faceted search, a biography of Blake by Denise Vultee, resources for further research, related sites, and several categories relating to the actual function and usability of the site.
The original crawl of the site was too small and didn’t capture all the content. We ran a few patch crawls to try to capture what was missed initially. The majority of what was still left for me to patch crawl during QA was very standard, mostly images and hidden links. The archived version of this website can be found here.
The vast majority of this site has been crawled and captured successfully. I have found one final host page that should be crawled in order to capture the entire journal. That page can be found here. Overall, the site should not need to be re-crawled often; I would say annually would be best in order to make sure that all updates are captured. However, I would suggest that the aforementioned page be captured biannually, as it is regularly updated with new journal entries.
Future of Web Archiving:
After experiencing the manual, labor-intensive and slow process of QA’ing websites, we are both firm believers that the future of web archiving will involve a more sophisticated automation process. We also believe that, now that Rhizome has developed their web recorder tool, capturing dynamic content will evolve and become easier, with a higher success rate.
We also believe that more and more researchers will find value in using web archives as resources. Web archiving is still a fairly new process and is primarily practiced by academic institutions. We did find some mentions of web archive materials being used here and here. Sumitra also shared with us a program that is still in beta, but serves as a place to track and share online resources, including web archives.