That DAM Project
The Denver Art Museum (DAM) has an encyclopedic collection of over 70,000 artworks, spanning centuries and the entire globe, over 400 of which fall into the fairly broad category of “electronic media” (video and software-based installations, CD-ROMs, websites, videotapes, and more). Because the exhibition and acquisition of electronic media and time-based media art often outpace the development of best practices for preserving such artworks, the Denver Art Museum, like many museums, has developed a “backlog” of untreated and minimally cataloged objects necessary for the realization of the artworks in its media collection. To combat this backlog, the museum has adopted an iterative, project-based approach to identifying and treating such objects. I had the pleasure of taking a leading role in the most recent project at the DAM, an IMLS grant-funded initiative aimed at migrating all of the media artworks in the collection to a digital repository and updating the catalog records for every object, physical or digital, associated with those works. My position at the DAM began in the spring of 2017, but the groundwork was laid two years earlier.
Internship / Intro
My summer internship at the DAM in 2015 was, in my humble opinion, a great example of what an internship ought to be, and of how both the intern (me) and the institution (the DAM) can benefit. There’s a lot of justifiable frustration with cultural heritage’s reliance on internships: cheap, temporary, often non-local labor. I’m a bit torn, in that I agree that far too much cultural heritage labor is performed by workers forced into unsustainable and stressful employment models (contractors, grant-funded positions, internships, fellowships, etc.), ironically while being charged with focusing on the sustainability of collections. That being said, I think that professional projects within an institution that emphasize mentorship, self-directed research, and opportunities to fail are ideal models for students to learn, and have the added value of increasing productivity for an institution (ideally in an area that wouldn’t be pursued otherwise). There’s probably some sort of “everything in moderation” ideal that could be espoused here: over-reliance on temporary labor is unsustainable, but mutually beneficial internships and fellowships are achievable. There’s more to be said on this matter, and I would be curious to hear from others. My internship at the DAM in 2015 allowed me to pursue experience following my own professional interests, attend a symposium, and learn new skills. As an intern, I was paid $21.50 an hour and provided with health benefits. The work I performed at the DAM during my internship ended up being the foundation for our digital preservation workflow and ingest procedures for our digital repository (and as a result made me an ideal candidate for the grant-funded position that began in 2017).
I was only able to move up from Conservation Intern to Assistant Conservator at the DAM because of the conservation department’s strong advocacy, and the willingness of the media collection’s stakeholders to engage and take action. Following a media-heavy exhibition, Blink! Light, Sound and the Moving Image, awareness of the complexities of electronic media at the museum led to the formation of the Variable Media Working Group. The VMWG is a committee drawn from the various departments that work with media art in one way or another (technology, collections, curatorial, registrar, conservation, etc.). This group was responsible for several policy changes, including actively pursuing grant funding and other projects to develop the museum’s media conservation program. The caretakers of the DAM’s collection are able to effectively articulate these challenges and advocate for institutional attention, which is one of the biggest hurdles to achieving preservation projects in general, and perhaps especially preservation projects that administrators aren’t used to hearing about. Then, once you have administrators’ attention, being able to point to concrete steps toward demystifying these challenges is key. The director of the DAM’s conservation department, Sarah Melching, and the Modern and Contemporary Art conservator, Kate Moomaw, first identified my internship, and then identified the next step: applying for, and ultimately being awarded, an IMLS grant to fund this most recent project.
When I arrived at the DAM in March 2017, the grant project had already begun. Part of the grant proposal allocated funding for purchasing equipment, for contracting vendors to help build the infrastructure for our digital repository, and for contracts with two additional vendors to migrate analog video to digital formats for preservation. Much of this vendor selection was based on the institution’s prior experience with those vendors. For example, given the positive experience we had with Archivematica during the 2015 internship, the DAM opted to invest in a maintenance contract with Artefactual. By the time I arrived, two servers had already been set up to the Archivematica support team’s specs, with the open source software installed and configured. The DAM had previously had analog video converted to digital formats for preservation by two vendors: one local, Post-Modern Media Services, and one based in New York, the newly formed team of long-relied-upon media art preservation and reformatting experts Bill Seery and Maurice Schechter. Post-Modern was tapped to manage the migration of the museum’s Architecture, Design, and Graphics (ADG) collection’s large holding of VHS tapes (~150 tapes), while Seery and Schechter handled the videotapes from the Modern and Contemporary collection, which spanned a wider array of formats. We opted to use cloud storage as our offsite backup (the DAM maintains copies of all of the digital preservation masters from the collection on two onsite storage systems). We narrowed our options using the AVP Cloud Storage Vendor Profiles, ultimately selecting DuraCloud in part due to their commitment to, and experience with, the cultural heritage sector.
Research / Development
Even with some of these big questions out of the way, I wasn’t able to just hit the ground running and start processing right away. The project began with a good deal of R&D: reviewing the workflow I had built as an intern, making tweaks, evaluating new tools, and asking others for advice. I’m sort of making public declarations of ignorance my brand on Twitter, and this research phase of the project was no exception. Here’s me getting pixel aspect ratio ffmpeg scripts from Dave Rice, trying to figure out the pros and cons of big endian and little endian, brainstorming how to instruct the tech-averse to create checksums, getting in touch with the developers of obscure audio codecs, and having a particularly helpful conversation on optical media preservation.
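On the checksum question in particular, what I was after was something a non-specialist could run without much ceremony. As a small illustration of the idea (and not the instructions we actually handed out at the DAM), here’s a minimal Python sketch that writes an md5 sidecar file, in the familiar md5sum text format, for everything in a folder:

```python
# Minimal sketch: write an .md5 sidecar next to every file in a folder.
# Not the DAM's actual instructions -- just an illustration of the step.
import hashlib, pathlib, sys

def md5_of(path, chunk_size=8 * 1024 * 1024):
    # Read in chunks so large video files don't have to fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_md5_sidecars(folder):
    for f in sorted(pathlib.Path(folder).iterdir()):
        if f.is_file() and f.suffix != ".md5":
            checksum = md5_of(f)
            # md5sum-style line: "<checksum>  <filename>"
            f.with_name(f.name + ".md5").write_text(f"{checksum}  {f.name}\n")
            print(f"{f.name}: {checksum}")

if __name__ == "__main__":
    write_md5_sidecars(sys.argv[1])
```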
During this research process, I began building our disk imaging workflow (which ended up being revisited and revised throughout the project), I reviewed our cataloging instructions for variable media, and I wired up the new video equipment (list below) that we purchased from SD Video Engineering:
Sony MSW-M2000 IMX deck
Sony SVO-5800 S-VHS deck
Tektronix 1740 Waveform Monitor/Vectorscope
Tektronix 601M Waveform Monitor/Vectorscope
Tektronix TSG-170D Signal Generator
Wiring up the video rack led to the peak of my imposter-syndrome-based insecurity during the project. I had to learn a lot of new skills on this project, so the fear that I was not qualified for my job often lurked in the back of my head. For whatever reason, figuring out how to work the myriad buttons and dials on the front of the IMX deck, and my confusion about getting external reference to flow through the system so that sync pulses originated at the signal generator (never did figure that out, tbh), was the greatest feels-inducing task of the project. Video playback decks’ menu options are full-on enigmas without a minimum hour-long deep dive into the user manual. It all worked out in the end, the system I set up worked swimmingly, and I got a lot of joy out of playing with all of this video tech, but I certainly didn’t want to present this list of equipment and diagrams with an “it was easy, no problems at all” nonchalance.
Despite some of my anxieties, this period of research was certainly a lot of fun, and definitely an important part of the project, but it’s also a little nerve-wracking to spend so much time on. There were 425 objects mentioned in the grant proposal, and we had laid out an ambitious plan to migrate, document, and describe all of them. Theoretically, if the research phase is particularly successful, one might prevent costly surprises by identifying risks ahead of time, or expedite processing through an especially efficient workflow; on the other hand, every day spent ironing out those details is a day not spent moving media from the backlog to the repository. Despite spending a good deal of time researching and testing disk imaging tools, I spent a lot of time mid-processing troubleshooting unexpected errors, and the same goes for video QC. Of course, barreling ahead blind would have been foolish, but I think limiting R&D in a project timeline, and planning to troubleshoot along the way, may benefit those with a fixed project end date. That being said, I think I have identified testing tools and evaluating workflows as my favorite part of this weird profession I find myself in.
Video / QC
Although we were outsourcing our video migration, video preservation was still a huge part of my job, and yet another opportunity to familiarize myself with a whole suite of tools. Every two weeks, I would drop off ten video works from the design collection to our local vendor, Post-Modern, and while I was there, I would pick up the uncompressed video files from the last round (note my use of the term “works” here, as some of the VHS tapes in the collection contained several individual works, complicating the workflow). The biweekly trips to Post-Modern became a fun ritual for me, getting to chat with and learn from “jack of all trades” media expert and business owner David Emrich. David has a lot of restoration and digitization experience, and brought a great perspective to the project. Coincidentally, his brother’s pioneering video artwork is in the DAM’s collection.
I would evaluate the preservation masters that Post-Modern had created using QCTools, a MediaConch policy, and playback of the media on our video rack. QCTools is open source software developed specifically for the video preservation community. It makes analysis of a video signal much more efficient by creating visual representations of luminance and chrominance levels, temporal outliers, and vertical repetitions (particularly helpful for spotting TBC errors). Once I had created QCTools reports of our preservation master video files, I would use the graph layout to flag potentially problematic areas, especially spots where the luminance was particularly high or low. In that situation, I would double-click on a frame from the video (displayed along the bottom of the graph layout in the current version of the software’s GUI) and scrub through the potentially dicey area using the “Broadcast Range Pixels” filter, which highlights pixels outside of broadcast range.
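Under the hood, QCTools builds its graphs from ffmpeg’s signalstats filter, so the same numbers can be pulled from the command line. The following is a rough sketch (not part of our actual DAM workflow) of using ffprobe to flag frames whose luma falls outside of 8-bit broadcast range (16–235); the thresholds and file path are placeholders:

```python
# Rough sketch: use ffprobe's signalstats filter (the same filter QCTools
# relies on) to flag frames with luma outside 8-bit broadcast range.
import json, subprocess, sys

def luma_outliers(path, ymin_limit=16, ymax_limit=235):
    cmd = [
        "ffprobe", "-v", "error", "-f", "lavfi",
        "-i", f"movie={path},signalstats",
        "-show_entries",
        "frame_tags=lavfi.signalstats.YMIN,lavfi.signalstats.YMAX",
        "-of", "json",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    report = json.loads(result.stdout)
    flagged = []
    for index, frame in enumerate(report.get("frames", [])):
        tags = frame.get("tags", {})
        ymin = float(tags.get("lavfi.signalstats.YMIN", ymin_limit))
        ymax = float(tags.get("lavfi.signalstats.YMAX", ymax_limit))
        if ymin < ymin_limit or ymax > ymax_limit:
            flagged.append((index, ymin, ymax))
    return flagged

if __name__ == "__main__":
    for frame, ymin, ymax in luma_outliers(sys.argv[1]):
        print(f"frame {frame}: YMIN={ymin} YMAX={ymax}")
```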
Like QCTools, MediaConch is open source software created for the video preservation community (both are made by the MediaArea team, best known for the MediaInfo tool). I have used MediaConch as a “conformance checker,” to ensure that files I receive are encoded to spec. During my residency at Louisiana Public Broadcasting, I used MediaConch to automate a quick quality assurance procedure, ensuring that web video files and archive video files were encoded in accordance with the station’s preservation and access policies. I used the tool in a very similar way at the DAM. Once Post-Modern and I had settled on our workflow, I created a MediaConch policy using one of the preservation masters as a template (using the “Policy from File” option on the policies tab of the software’s GUI). The “Policy from File” option, though, makes a rule for every single field in the file’s MediaInfo report, including very specific things like “Encoded_Date.” My goal was to ensure that every file had the appropriate specifications, like resolution and frame rate, not that they were all exactly the same file size or made on the same date, so I used a few other files, also created by Post-Modern, to iron out the policy’s important rules and delete the rest. Said another way, I took several files that I knew were encoded the way I wanted them to be, and used their shared properties to create a policy against which I could compare unknown files, to see if they also had those shared properties. This became my first line of defense when QC-ing video files, and helped me catch a few stray files that had the wrong resolution before I wasted valuable time evaluating their other properties.
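MediaConch also has a command line interface, which makes it easy to run a whole delivery of files against a policy in one pass. Here’s a hedged sketch of that kind of batch check; the policy file name and directory are hypothetical, and the parsing of MediaConch’s pass/fail text output reflects my recollection of the tool rather than our actual DAM scripts:

```python
# Hedged sketch of a first-pass QC check: run every file in a directory
# against a MediaConch policy via the `mediaconch` CLI and print the verdict.
import pathlib, subprocess, sys

POLICY = "pres_master_policy.xml"  # hypothetical policy file exported from the GUI

def check_directory(directory, pattern="*.mov"):
    for f in sorted(pathlib.Path(directory).glob(pattern)):
        result = subprocess.run(
            ["mediaconch", "--policy=" + POLICY, str(f)],
            capture_output=True, text=True,
        )
        # MediaConch's default text report starts with a pass!/fail! line
        # (per my recollection of the tool's output).
        first_line = result.stdout.strip().splitlines()[0] if result.stdout else "no output"
        print(f"{f.name}: {first_line}")

if __name__ == "__main__":
    check_directory(sys.argv[1])
```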
After running the preservation master files through MediaConch and QCTools, I would play each file back over SDI (using our Blackmagic PCIe card) to our waveform monitor and vectorscope, and then out to our CRT and LCD monitors. This manual, and admittedly time-consuming, process of playing back each of the preservation masters in real time still proved to be an essential step in the QC process.
Of course, having actually watched the works was really helpful when cataloging them, but more significantly, I caught errors that I wouldn’t have caught otherwise. I specifically remember discovering that one video had significant audio drift that wasn’t present in the original; I had already gone through all of the other steps and was about to ingest the file before I noticed the problem while viewing it. When I encountered an error like this, I would play back the original tape to see if it was simply an artifact of a less than ideal production process (this was often the case), or if the error had been introduced by the digitization process.
I would note my QC procedure and observations in our Transfer Notes document. That document, along with the exported QCTools reports, md5 checksum sidecar files, and a transcoded FFV1/PCM/Matroska version of the uncompressed file (see the FIAF FFV1 and Matroska reading list for more information on this encoding), would be packaged together and sent to Archivematica as a SIP. This QC and pre-ingest procedure was used for all of the video preservation masters created from analog video in the ADG collection. The process was very similar for the Modern and Contemporary art collection, but due to the fragility of the media and the obscurity of some of the formats, we worked with a different vendor, the team of Maurice Schechter and Bill Seery.
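For the curious, the FFV1/Matroska normalization step looked something like the sketch below, which follows common community recipes (the kind collected in the FIAF reading list) rather than reproducing the DAM’s exact ffmpeg settings:

```python
# Sketch of an FFV1/Matroska normalization following common community
# recipes; the exact flags used at the DAM may have differed.
import subprocess, sys

def make_ffv1_mkv(src, dst):
    subprocess.run([
        "ffmpeg", "-i", src,
        "-map", "0",                    # keep all streams from the source
        "-c:v", "ffv1", "-level", "3",  # FFV1 version 3
        "-g", "1",                      # every frame is an intra frame
        "-slicecrc", "1",               # per-slice CRCs for frame-level fixity
        "-slices", "16",
        "-c:a", "copy",                 # carry the PCM audio over untouched
        dst,
    ], check=True)

if __name__ == "__main__":
    make_ffv1_mkv(sys.argv[1], sys.argv[2])
```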
Working with Maurice and Bill was another great learning opportunity for me. I traveled out to visit the lab along with our Modern and Contemporary Art curator, Rebecca Hart, and Maurice walked us through his process and reviewed each of the tapes with us. One of the more challenging tasks with our Modern and Contemporary (M&C) collection was selecting preservation masters (the high quality digital files from which derivative exhibition copies would be created in the future) when we had received an artwork on multiple formats. We would transfer all of the tapes we had in the collection to file-based formats, but which one would we describe in our collection management system as the “preferred copy” for making future copies? In one instance, the video component of an installation in our M&C collection, a miniDV tape, was clearly made after the VHS tape we had received with the acquisition, but it was unclear whether the miniDV had been created from the VHS or from another (potentially higher quality) source. We noticed that both copies had the same head switching artifact in the lower horizontal lines of the picture area. However, this could mean either that the miniDV was made from the VHS (carrying the artifact with it during the transfer) or that they were both made from the same source, which already had the artifact “baked in.” Maurice opened up the VHS deck and manipulated one of the rollers in the deck’s tape path as the tape was playing back, showing us that the head switching artifact moved with the rest of the picture area, rather than independently of it. This demonstrated that the artifact was baked into that copy, as opposed to introduced when that copy was created. Ultimately, we documented the subtle differences between the miniDV and VHS copies, and the DAM will follow up with the artist’s studio to better determine which copy is preferred.
Digital / Preservation
Once the video preservation masters were processed, they would go through the same ingest procedures as the disk images, or any other digital media in the collection. This process usually began in BitCurator, where I would assess the object. For disk images, this meant running commands like disktype and SleuthKit’s mmls to identify the file system. For individual files, it often just meant answering the question “what the hell am I looking at?” If it was a file format I wasn’t familiar with, I would run my file format identification tool of choice, siegfried, and use the PRONOM ID to look up the file on the PRONOM registry’s website. If I still didn’t understand what the file was, or if siegfried couldn’t identify it, I would turn to the Library of Congress Sustainability of Digital Formats website. I flippin’ love this website. It’s so nerdy! You can learn about how a format was created, how LoC views the sustainability of that format, subsequent versions of the same format and how they differ from the original, on and on. Once I felt sufficiently familiar with the file(s), I would begin the transfer process in Archivematica.
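As a small example of that identification step, here’s a sketch of running siegfried’s sf command in JSON mode and printing each file’s PRONOM ID so it can be looked up on the PRONOM registry; the JSON field names reflect my reading of sf’s output and may differ between versions:

```python
# Sketch: identify a file with siegfried (`sf -json`) and print its PRONOM ID.
# Field names are based on my reading of sf's JSON output.
import json, subprocess, sys

def identify(path):
    result = subprocess.run(["sf", "-json", path],
                            capture_output=True, text=True, check=True)
    report = json.loads(result.stdout)
    for f in report.get("files", []):
        for match in f.get("matches", []):
            if match.get("ns") == "pronom":
                print(f'{f["filename"]}: {match.get("id")} ({match.get("format")})')

if __name__ == "__main__":
    identify(sys.argv[1])
```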
As I mentioned earlier, through a support contract with Artefactual, the Denver Art Museum’s Technology department built servers to spec for running Archivematica. Archivematica is open source, standards-based software designed to help automate the processing of a digital collection. The system is based on the OAIS reference model, using a series of microservices to create information packages for storage and retrieval. I would submit files I had prepped for transfer from my local storage, over the network, to the Archivematica server in a building near the museum. “Out of the box,” Archivematica is designed to prompt the user to make a variety of decisions about what should happen to a file (or set of files) after it has been transferred to the software. Which file format identification tool should Archivematica use? Should files from a disk image be extracted? Should these files be normalized? My answers were the same for most transfers, so I automated much of this through Archivematica’s Processing Configuration interface. I never fully automated the process, though; I preferred to shepherd the files through so I could be sure nothing had gone wrong along the way. Sadly, things did go wrong more often than I expected.
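Archivematica also exposes a REST API alongside the dashboard, which is how tools like Artefactual’s automation-tools shepherd packages through without a human clicking buttons. The sketch below, which polls the status of a transfer with the requests library, is based on my reading of the public API documentation rather than on the DAM’s configuration; the host, credentials, and UUID are placeholders:

```python
# Hypothetical sketch of polling an Archivematica transfer's status over the
# REST API. Endpoint, header format, and response fields reflect my reading
# of the public API docs; host, credentials, and UUID are placeholders.
import requests

DASHBOARD = "http://archivematica.example.org"                  # placeholder host
AUTH = {"Authorization": "ApiKey demo_user:1234567890abcdef"}   # placeholder credentials

def transfer_status(transfer_uuid):
    url = f"{DASHBOARD}/api/transfer/status/{transfer_uuid}/"
    response = requests.get(url, headers=AUTH, timeout=30)
    response.raise_for_status()
    info = response.json()
    # e.g. ("PROCESSING", "Scan for viruses") or ("COMPLETE", ...)
    return info.get("status"), info.get("microservice")

if __name__ == "__main__":
    status, microservice = transfer_status("00000000-0000-0000-0000-000000000000")
    print(status, microservice)
```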
When we first started using Archivematica, we had a few features we wanted to request as part of our support contract. One was to include a “README”-style file in every AIP that would explain the directory structure of the package, the metadata files included in such a package, and the thinking behind this structure. Basically, the goal was that if someone were to somehow stumble upon one of these AIPs without much context, they could get a general idea of what it was and why it held the files it held. The Archivematica team liked this idea and felt it could be achieved fairly easily. We agreed on the language that would be in the README file, and the Archivematica team then added it to an unreleased “QA” version of their next software release, 1.7. This is where the trouble started.
Long story short, being on an unreleased version of the software exposed us to bugs we wouldn’t have been exposed to otherwise, made it hard to identify bugs that weren’t related to the software, and, in a perfect storm of digital preservation software disappointment, seemed to go on forever due to a delay in the development of a stable version of 1.7. I was regularly in contact with Kelly Stewart and Justin Simpson, who were incredibly thorough and transparent about the issue. Mentioning their names together reminded me of the 2003 masterwork From Justin to Kelly, which was frequently mentioned in our troubleshooting emails.
There were a few other issues with Archivematica that would likely have been easy to spot, but due to our problems with version 1.7, they took us a while to figure out. Thankfully, after a few months of very frequent contact with Artefactual support, digital preservation superstar Ashley Blewer joined the Archivematica team and was assigned to the Denver Art Museum. Ashley created this issue tracking document, which summarizes some of the challenges we faced (and which she helped us overcome). One issue Ashley helped us address turned out not to be an issue with Archivematica at all; it was an issue with our server. When we set up our DuraCloud “sync tool” (the Java application that DuraCloud uses to synchronize our primary storage with our cloud backup), I opted to have the backups run whenever new data was added. The additional toll on our server’s CPU from backing up new data while I was processing media through Archivematica would cause everything to crash, hang, or otherwise fuck up.
The graph above, created by the monitoring tool Metricbeat, shows a spike when Java applications demanded more processing power than our servers could provide. Once we had finally identified this problem, it was relatively easy to address: we changed the schedule of the DuraCloud backups to nightly, leaving the servers available for Archivematica processing during business hours. This solved one of the major hangups of the workflow and allowed us to keep the project moving fairly smoothly.
There were a few persistent problems. For instance, while Archivematica has a “sanitize file names” microservice, which is supposed to head off any issues caused by special characters, for whatever reason on our instance any file name containing diacritics would fail and the transfer would be halted. As always, the Artefactual team was diligent in attempting to address the issue, even creating a new test environment to try to replicate our situation, but the problem could not be solved during my time at the museum. In Invisible Defaults and Perceived Limitations: Processing the Juan Gelman Files, Elvia Arroyo-Ramirez points out the racism reflected in technology’s inability to appropriately process non-English text containing diacritics. Like other cultural heritage organizations mentioned in Arroyo-Ramirez’s piece, at the DAM we removed the diacritic from the file name in question and noted the change in the file’s catalog record. Since then, Ashley Blewer has encountered further diacritics madness through her work with Archivematica, and has written a great summary of the issue on her blog.
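At the DAM we made that change by hand, but for anyone facing the same workaround at scale, a scripted version might look something like the sketch below, which strips combining diacritics from a file name and prints the before and after so the change can be recorded; this is purely illustrative, not a recommendation to flatten file names as a matter of course:

```python
# Illustrative sketch: strip combining diacritics from a file name, rename
# the file, and report the change so it can be noted in the catalog record.
import pathlib, sys, unicodedata

def strip_diacritics(name):
    # Decompose characters (e.g. "é" -> "e" + combining accent), then drop
    # the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def rename_without_diacritics(path):
    path = pathlib.Path(path)
    new_name = strip_diacritics(path.name)
    if new_name != path.name:
        path.rename(path.with_name(new_name))
        print(f"renamed: {path.name} -> {new_name}  (note change in catalog record)")
    return new_name

if __name__ == "__main__":
    rename_without_diacritics(sys.argv[1])
```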
Clearly, our work with Archivematica was not without its ups and downs. The contract with Artefactual is not inexpensive ($25k/year), and even with that cost, a significant amount of staff time was invested in troubleshooting, reporting, and re-troubleshooting problems, some of which persisted. That being said, Archivematica does a lot, and it is open source. I went into the project thinking of our service contract as merely transactional, and as the project went on, I began to understand that the contract is more like an investment. The entire field is bolstered by an open source solution to digital repository ingest, management, and retrieval, and that is Artefactual’s goal.
In my view, there are a few key benefits to using the software. The greatest among them may be one of the simplest: it creates a barrier between you and your data. What? That sounds bad. But it isn’t. The less I’m navigating the digital repository myself, the less likely I am to accidentally modify or delete something. This sounds small, but having an interface between you and your data makes all the difference on a caffeine-deprived morning, or when your computer is being cranky. A system that mandates a separate storage system for the repository, and forces the creation of DIPs instead of allowing lazy scrolling through AIPs, does a lot to protect your files.
Those familiar with Archivematica might expect me to extol the wonders of the automated metadata generation in standardized, interoperable formats codified through acronyms whose meaning I have long since forgotten. Having a record of what has been done, to what, using which software, is certainly a huge step in digital preservation. It provides accountability, allows for “reversibility” of “treatment” (a cornerstone of contemporary conservation ethics), and enables a future user to understand how a digital object came to be in the repository. But, if an organization doesn’t have a way of accessing that metadata, or more importantly, aggregating that metadata in order to interpret and learn from it, that metadata is not serving a purpose. At the DAM there is a hope that one day the museum will be able to incorporate the millions of lines of XML created by Archivematica into a collection management system, but such an incorporation would require a significant investment (like the MoMA’s efforts with Binder), or adopting a new collection management system (a massive and expensive undertaking), both of which are unlikely any time soon.
Of course, the DAM is not alone in its need to parse the complex and dense metadata describing digital collections. During my time at the DAM, I would view individual records in a user-friendly HTML format thanks to Tim Walsh’s open source project METSFlask (an application I strongly recommend for any Archivematica users). Walsh originated another project in 2017, called SCOPE, for the Canadian Centre for Architecture, which scales many of the benefits of METSFlask to an entire collection. Additionally, more and more museums are using Archivematica to process digital files, and more cultural heritage institutions are processing born-digital media than ever before. It is my hope that the broad need for managing this influx of metadata will help to drive aggregation and access methods, which in turn will only help to further motivate standards-based, interoperable characterization of digital objects (like the XML files created by Archivematica).
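As a small illustration of what that kind of aggregation can look like, here’s a hedged sketch that counts the PREMIS event types recorded in a single Archivematica METS file using only the Python standard library; the PREMIS namespace varies between Archivematica versions, so the sketch checks the two URIs I’m aware of:

```python
# Sketch: count PREMIS event types (ingestion, fixity check, format
# identification, etc.) recorded in an Archivematica METS file.
# The PREMIS namespace differs between versions, so both URIs are checked.
import collections, sys
import xml.etree.ElementTree as ET

PREMIS_NAMESPACES = [
    "info:lc/xmlns/premis-v2",
    "http://www.loc.gov/premis/v3",
]

def count_events(mets_path):
    tree = ET.parse(mets_path)
    counts = collections.Counter()
    for ns in PREMIS_NAMESPACES:
        for event_type in tree.iter(f"{{{ns}}}eventType"):
            counts[event_type.text] += 1
    return counts

if __name__ == "__main__":
    for event, count in sorted(count_events(sys.argv[1]).items()):
        print(f"{count:5d}  {event}")
```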
Conclusion / Documentation
Our need for almost constant contact with Archivematica support, and the need for frequent communication with our other vendors, underlines the labor involved in partnerships. When this project started, I think we viewed our “infrastructure” of contracts with vendors as an investment in efficiency, something that we could “contract out” and therefore not have to worry about. This was not the case. I was highly involved with the work performed by all of our contractors, which took time and energy away from the pressing task of moving media through the workflow.
Pace was hard to measure in that context. How costly is a delay, if solving it will improve your workflow? The grant proposal stated that the museum intended to process ten objects a week, a goal that I sometimes exceeded but more often fell short of. In the early days of our Archivematica woes, delays were seen as “growing pains” that could be overcome and eventually benefited from; essentially, we thought we would be able to make up for early delays through gained knowledge. Unfortunately, gained knowledge often led to further complexities, and the growing pains of our early stages, once overcome, did not yield exponentially faster progress.
In theory, realistic quotas for a project of this nature could be a very helpful tool, but from my viewpoint they would be very difficult to set. The early stages of a project need to account for unexpected delays, including the configuration, and likely re-configuration, of systems essential to the process. Moreover, not all objects demand the same amount of attention, so “weighting” certain materials, works, or processes, to ensure that an appropriate amount of resources has been allocated to specific tasks, is especially important.
Hand-wringing aside, we accomplished the goals of the grant and made a huge dent in the media conservation backlog at the museum. We migrated hundreds of videotapes to digital formats, created a digital repository, disk imaged computer hard drives, and presented the results of our project at the American Institute for Conservation Annual Meeting and NYU’s It’s About Time symposium. I also had the pleasure of participating in the MoMA Disk Imaging Peer Forum, a discussion-based meeting of experts in the field of media conservation and digital preservation, focusing on disk imaging policies and procedures.
I’m really proud of the work that I was a part of at the DAM. It was challenging, interesting, fun (at times), and an appropriate ratio of ambitious to achievable. I felt part of a team, but felt like I had my own responsibilities as well. If I didn’t know how to move forward, I could ask for help, but if I had strong feelings about how best to proceed, I was allowed to go with my gut. I was able to rely on skills I already had, but needed to learn new ones too. I’m very grateful to my supervisors, Kate Moomaw and Sarah Melching, for all of their support, as well as to the rest of the DAM conservation department and the Variable Media Working Group.
If I could do one thing differently about my work at the DAM, I would have shared more of what I was doing, more often. Of course, taking time to share documentation would have taken time away from working on the project, and given the project’s timeline, this wasn’t really an option, so I’m hoping this post will make up for that a bit. Please find a “round up” of many of the documents that were created as a part of the grant project below, and let me know if you have any questions! You can always find me on Twitter, or feel free to comment below.
-Eddy
Document Round Up:
American Institute for Conservation 2018 Annual Meeting Paper, Rewind, Pause, Playback: Addressing a Media Conservation Backlog at the Denver Art Museum by Eddy Colloton and Kate Moomaw
Denver Art Museum “All Staff” Presentation on IMLS Electronic Media Conservation Project by Eddy Colloton and Kate Moomaw
NYU Moving Image Archiving and Preservation Program, Handling Complex Media Guest Lecture: Media Conservation at the Denver Art Museum (and stuff) by Eddy Colloton
Denver Art Museum Disk Imaging Workflow by Eddy Colloton
Archivematica DAM Case Study Summary by Ashley Blewer
Disk Imaging Investigation by Ashley Blewer
Denver Art Museum Variable Media Cataloging Instructions by Eddy Colloton and Kate Moomaw
DAM Cataloging Procedures for Variable Media by Eddy Colloton
Generic Cataloging Text For Argus for Variable Media by Eddy Colloton
Denver Art Museum Video Migration Report by Eddy Colloton and Kate Moomaw
MediaConch In Action: Issue 1 by Ashley Blewer and Eddy Colloton
Built To Last: Floppy Disk And VHS Art Need Creative Conservation by Stacy Nick