Analysis and Evaluation



Web Archiving and the Transition to Digital Art History

The NYARC web archiving Pratt fellowship based at The Frick has broadened my understanding of the complexity and importance of born-digital archiving in a museum and art historical context. The amount of information and data on a website can be vast, daunting, and exciting. It is clear that born-digital content requires a new framework and a new approach. The future of digital art history, the role of the library in a museum context, and the role of data are being shaped and explored at a vast number of institutions with a vast array of tools.

Web archiving is at the forefront of digital art history and archives. As more and more resources transition from print to web formats, WARC files will become increasingly important for research. As Drucker says, “…digitized art history, [is] one built on the use of online resources” (2013). With Rhizome’s web recorder and uploads to GitHub, I believe the subscription-based software, Archive-It, will vastly improve. I also believe Rhizome’s web recorder will help smaller institutions that currently have little to no funding for web archiving to maintain some form of a collection. Laura Childs touched on this when she spoke about her internship at Fordham this spring during our practicum presentations.

When Sydney and I interviewed Ralph Baylor, Assistant Librarian at the Frick, he shared with us his vision of a Digital Humanities Lab that would be part of the Frick Art Reference Library (FARL). His vision is on point: I believe that as more and more data becomes freely available through web resources and digital formats, researchers are looking for ways to access, use, and foster discussion with these datasets. This does not come without challenges. There is an array of digital humanities tools available. Perhaps the future of art librarianship will include digital humanities labs as Baylor envisions. Zorich mentions these challenges: “Art history’s research centers and academic departments cannot provide the technological infrastructure needed to support digital art history they…lack the human resources…to develop and sustain digital projects” (2013). I hope this will change and improve as we see the importance of preserving and providing access to important art historical resources.

Our supervisor, Sumitra Duncan, also touches on the importance of collaboration between institutions in her DLF blog post on web archiving at NYARC. She states that NYARC’s interest “…is not in building hidden collections, but rather to work within the consortium, and with the greater community, to enable wide access…” (2016). I strongly believe that collaboration is likely the only way technological infrastructure can be built and maintained. Sharing best methodologies, workflows, and practices creates complete and usable resources.

My coursework, fellowship, and professional development have all pointed to transformations in the world of information, particularly the slow shift of art historical resources to digital formats.

Ken Soehner’s Art Librarianship class highlighted numerous art resources that are shifting to the web. Here is a modified list of some of these resources we touched on:

I can imagine that web archiving many of the above-named resources could be of value to a researcher. The WARC files created using Archive-It, combined with NYARC’s discovery tool, would allow full-text searching, aiding research on a given topic.

Matt Miller’s Programming for Cultural Heritage was an eye-opening class that exposed me to big-picture ideas of what can be done with information on the web and how to access and use data in a cultural heritage context. Researchers can work with data programmatically, which significantly changes how data is handled and how quickly it can be processed and parsed. In one of our many weekly projects, we worked with NYC Open Data. We used Python scripts to convert the data into a dictionary that allowed us to make sense of what we were looking at and answer inquiries. Data in a CSV file is what we called flat, meaning it is difficult to manipulate. By converting it to a Python dictionary, we were able to see it in lists and quickly manipulate it using a single line in a script. This is only one data resource and one way of manipulating it, but I mention it to illustrate some of the interesting things that could be done. In addition to Matt’s class, attending the NDSR symposium also revealed ways institutions are using Python and simple command-line requests in archives, particularly in dealing with out-of-date formats in audio/visual archives.
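As a rough illustration of the CSV-to-dictionary step described above, here is a minimal sketch. It is not the script we wrote in class: the sample rows, the column names, and the helper name rows_by_column are all made up for this example, and a real NYC Open Data export would have different fields.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a CSV export from NYC Open Data.
# The column names and rows are invented for illustration only.
SAMPLE_CSV = """borough,complaint_type,year
Brooklyn,Noise,2015
Manhattan,Noise,2015
Brooklyn,Graffiti,2015
Queens,Noise,2015
"""


def rows_by_column(csv_text, key):
    """Group the rows of a CSV into a dictionary keyed by one column's values."""
    grouped = defaultdict(list)
    # DictReader turns each flat CSV line into a dict of column name -> value.
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[key]].append(row)
    return dict(grouped)


by_borough = rows_by_column(SAMPLE_CSV, "borough")

# With the data in a dictionary, a question can be answered in a single line:
# how many "Noise" rows does each borough have?
noise_counts = {b: sum(r["complaint_type"] == "Noise" for r in rows) for b, rows in by_borough.items()}
print(noise_counts)
```

Grouping the rows under a chosen key is what makes the flat file easier to work with: each value in the dictionary is a list of rows that can be counted, filtered, or sorted without rereading the file.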

Drucker supports the importance and role of programming in art historical research when she states: “While judgment cannot be automated, the analysis of specific features or properties in large corpora of digital files of texts or images on which art historical research proceeds can be significantly enhanced and augmented by the use of computational techniques” (2013).

This spring I took Professor MacDonald’s Information Architecture. I quickly saw the importance of content inventories, the first step we take when beginning to QA a site. For QA, the content inventory ensures we are covering the scope of the website. In Information Architecture, we used the content inventory to show the true scale of a site and to get an idea of the objective and direction of the site. Content inventories can highlight inconsistent display, inconsistencies in the categorization of content, out-of-date and irrelevant content, and the range of file types on a site. Critiquing content and site organization was not part of our fellowship, but it became clear that if a website was well organized, web archiving and QA’ing the site was straightforward and took considerably less time. As the spring semester progressed, my awareness of site structure increased, revealing inconsistencies and poor design choices, which often paralleled difficulties with using Heritrix.

Prof. Cocciolo’s Management of Archives and Special Collections course was primarily focused on paper-based archival theory and practice, but many of our discussions pointed to digitization and copyright as dilemmas in making materials accessible online. This also brought to mind the work I was doing at my fellowship. If an archive puts materials up for access online or creates an online exhibition (similar to those on the DPLA site), they could be web archived for future research or reference. Having an online presence seems essential for staying relevant and being accessible. If an archive doesn’t have an online presence, how will its collections be discovered?

The NYARC Web Archiving Fellowship through Pratt has reaffirmed my interest in, and goal of, becoming a librarian and information technologist who can offer support for the future of digital art history.


Baylor, R. (2016, April 4). Interview. Notes can be viewed here.

Drucker, J. (2013). Is there a “digital” art history? Visual Resources: An International Journal of Documentation, 29:1-2, 5-13. doi: 10.1080/01973762.2013.761106

Duncan, S. (2016, January 25). Web archiving at the New York Art Resources Consortium (NYARC). Digital Library Federation blog. Retrieved from

Gaehtgens, T.W. (2013). Thoughts on the digital future of the humanities and art history. Visual Resources: An International Journal of Documentation, 29:1-2, 22-25. doi: 10.1080/01973762.2013.761110

Miller, M. (2015, September 17). From Programming for Cultural Heritage class. Reference to the class can be found:

Soehner, K. (2015, Fall). Information and discussion from Fall 2015, LIS: 667, Art Librarianship.

Zorich, D. (2013). Digital art history: A community assessment. Visual Resources: An International Journal of Documentation, 29:1-2, 14-21. doi: 10.1080/01973762.2013.761108

Image: Rafael Rozendaal, Blank Windows, 2016