Search Functionality | Clusters | Environmental Impact | Funding
This is an experimental project that explores ways to use AI techniques to build a public interface to the digitized drawings in the Index of American Design. The digital humanities project was created by Taylor Arnold and Lauren Tilton of the Distant Viewing Lab at the University of Richmond, in collaboration with the National Gallery of Art. We provide several search features: exploration by topic clusters and open search. The authoritative collection for the Index of American Design is the National Gallery of Art (NGA), which holds the physical watercolors, extensive archival material, and copyright information.
The search bar allows for querying both structured collections metadata and tags automatically generated with AI models. The search covers the collection's title, location, and artist fields, as well as the automatically generated image captions. By default, we only return results that include all of the terms in the query. The words do not need to appear in the same order, and the search ignores capitalization. For example, typing the word barn will return images that have "barn" in their caption as well as any images created by Ruth M. Barnes. By default, we return results for words that start with your query term; for example, the query pot will match both pot and pottery.
To force the search to only return words in the given order or in a specific form, the search term(s) can be enclosed in quotes. For example, "red jar" will only return results where red and jar appear together in that order, and "jar " (with a space before the closing quote) will only find examples where jar is followed by a space.
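As a rough illustration of these matching rules, the sketch below applies them to a single catalog record. It is not the project's actual implementation: the field names, tokenization, and record format are assumptions made only for the example.

```python
import re

def matches(query: str, record: dict) -> bool:
    """Return True if a record satisfies every term in the query.

    Unquoted terms are case-insensitive prefix matches against any word in
    the record; quoted phrases must appear verbatim, including any spaces.
    The field names here are illustrative assumptions.
    """
    text = " ".join(
        str(record.get(field, ""))
        for field in ("title", "location", "artist", "caption")
    ).lower()
    words = re.findall(r"\w+", text)

    # Separate quoted phrases from plain terms.
    phrases = re.findall(r'"([^"]*)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    terms = remainder.lower().split()

    # Every plain term must be a prefix of some word in the record.
    if not all(any(w.startswith(t) for w in words) for t in terms):
        return False
    # Every quoted phrase must appear exactly as written (ignoring case).
    return all(p.lower() in text for p in phrases)

record = {"title": "Pottery Jar", "artist": "Ruth M. Barnes",
          "location": "California", "caption": "A red jar with handles."}
print(matches("barn pot", record))    # True: prefix matches on Barnes and Pottery
print(matches('"red jar"', record))   # True: the exact phrase appears in the caption
```

In this sketch, unquoted terms behave as case-insensitive prefix matches, while quoted phrases must appear verbatim, mirroring the behavior described above.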
In addition to filtering, the query also applies an ordering to the search results. We use a multimodal model called SigLIP to estimate how likely the search query would be as a caption for each result. The returned results are ordered from most to least likely to be a caption for the given search. If no results are found, the entire collection is returned, ordered by the same logic: the likelihood that the query is the caption for each image. This allows for finding possible matches even when the specific words in the query do not appear in the collection cataloging information or the AI-generated captions. It is possible to only sort (and not filter) the results by adding the special tag :sort to the query string. This can be useful in cases where there are a small number of exact matches but many more close matches.
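The ordering step can be pictured as a similarity ranking between a text embedding of the query and precomputed image embeddings. The sketch below is a minimal illustration under that assumption; it uses cosine similarity as a stand-in for SigLIP's image-text score and random arrays in place of real embeddings.

```python
import numpy as np

def rank_by_similarity(query_embedding: np.ndarray,
                       image_embeddings: np.ndarray) -> np.ndarray:
    """Order image indices from most to least similar to the query.

    query_embedding: (d,) text embedding of the search string.
    image_embeddings: (n, d) image embeddings for the collection.
    Both are assumed to be precomputed with a SigLIP model; cosine
    similarity stands in for the model's image-text score.
    """
    q = query_embedding / np.linalg.norm(query_embedding)
    imgs = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    scores = imgs @ q                 # one similarity score per image
    return np.argsort(-scores)        # indices in descending order of likelihood

# Illustrative stand-in data: 1,000 images with 768-dimensional embeddings.
rng = np.random.default_rng(0)
order = rank_by_similarity(rng.normal(size=768), rng.normal(size=(1000, 768)))
```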
The automatically generated captions were created with OpenAI's GPT-4.1 model (gpt-4.1-2025-04-14). We generated the captions by asking the model to "Provide a detailed plain-text description of the objects in this drawing as a single paragraph. Include information about the color and composition of the drawing. Do not give any subjective summary." The captions can be found at the bottom of each of the individual image pages. While these captions are far from perfect, with frequent minor errors and occasional major ones, we find that they allow for finding images related to topics featured in the images but not mentioned directly in the artwork titles. To learn more about this approach and to download the entire set of captions, see our recent article on using multimodal AI models for search and discovery linked below.
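For readers interested in how such a caption request looks in practice, the sketch below uses the OpenAI Python SDK with the prompt quoted above. The file handling and base64 encoding are illustrative assumptions rather than a record of our exact pipeline.

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Provide a detailed plain-text description of the objects in this "
    "drawing as a single paragraph. Include information about the color "
    "and composition of the drawing. Do not give any subjective summary."
)

def caption_image(path: str) -> str:
    """Request a plain-text caption for one digitized drawing."""
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4.1-2025-04-14",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content
```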
To search for a specific artist, start the search string with the marker creator:. These search terms must match exactly, so we recommend using the links on the individual pages to query by specific fields. When location or cluster is specified with these special tags, the tags are ignored when sorting the results with SigLIP as described above.
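As an illustration of how such special tags might be separated from the free-text portion of a query, consider the sketch below. The exact syntax accepted by the site (for example, how multi-word creator names are handled) may differ; this parser is only a simplified example.

```python
def parse_query(query: str):
    """Split a raw query into special field tags and free-text terms.

    Recognizes the creator:, location:, and cluster: prefixes and the
    standalone :sort flag described above. This is an illustrative
    parser, not the site's actual query handler.
    """
    fields, terms, sort_only = {}, [], False
    for token in query.split():
        if token == ":sort":
            sort_only = True
        elif ":" in token and not token.startswith(":"):
            key, _, value = token.partition(":")
            fields.setdefault(key, []).append(value)
        else:
            terms.append(token)
    return fields, terms, sort_only

# Example: the creator tag filters exactly; remaining words are searched as usual.
print(parse_query("creator:Barnes barn :sort"))
# ({'creator': ['Barnes']}, ['barn'], True)
```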
Please see the following papers for more details about the design and algorithms underlying the project. All publications are freely available.
T. Arnold and L. Tilton (2024) "Explainable Search and Discovery of Visual Cultural Heritage Collections with Multimodal Large Language Models." Proceedings of the Computational Humanities Research Conference. [pdf] [data]
T. Arnold and L. Tilton (2024) "Automated Image Color Mapping for a Historic Photographic Collection." Proceedings of the Computational Humanities Research Conference. (Best Short Paper, CHR 2024) [pdf] [data]
The links above include reproducible code and a downloadable version of the dataset.
We have associated each of the digitized images in the collection with one of 50 clusters. The clusters attempt to group images by their primary subject matter and composition. These clusters were automatically generated using the AI-created captions described in the previous section. Specifically, we used a generative model to summarize 50 themes from the collection of captions. Then, we assigned each image to the cluster name to which it has the closest SigLIP embedding; in other words, the cluster name that would most likely be the caption for that image. These clusters should not be treated as authoritative labels for the images. Rather, they are a potentially useful tool for understanding the scope of the collection that must be augmented with a close analysis of individual images.
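The assignment step can be sketched as follows, assuming that SigLIP embeddings of the images and of the 50 cluster names have already been computed. Cosine similarity again stands in for the model's image-text score, and the array shapes are assumptions made for the example.

```python
import numpy as np

def assign_clusters(image_embeddings: np.ndarray,
                    cluster_name_embeddings: np.ndarray) -> np.ndarray:
    """Assign each image to the cluster name it is most similar to.

    image_embeddings: (n, d) embeddings of the drawings.
    cluster_name_embeddings: (50, d) text embeddings of the cluster names.
    Both are assumed precomputed with a SigLIP model.
    """
    imgs = image_embeddings / np.linalg.norm(image_embeddings, axis=1, keepdims=True)
    names = cluster_name_embeddings / np.linalg.norm(cluster_name_embeddings, axis=1, keepdims=True)
    similarity = imgs @ names.T        # (n, 50) image-to-name similarity scores
    return similarity.argmax(axis=1)   # index of the best-matching cluster per image
```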
There are important ongoing discussions and concerns regarding the environmental impacts of generative artificial intelligence models. We have aimed to minimize the environmental impact of the AI models used in the project. We ran the AI models a single time over the entire collection, and the results are now stored locally on our server. Running all of the models used one GPU for approximately 15 hours. This required roughly 3.75 kilowatt-hours of electricity, producing around 1.62 kilograms of carbon emissions, or approximately the amount emitted by driving a standard-sized car 6.55 kilometers (about 4 miles) [source].
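The short calculation below makes the assumptions implied by these figures explicit: an average GPU draw of about 250 watts, a grid emission factor of roughly 0.43 kilograms of CO2 per kilowatt-hour, and a passenger-car rate of about 0.25 kilograms of CO2 per kilometer. These factors are inferred from the numbers above rather than reported separately.

```python
# Back-of-the-envelope check of the figures above. The implied factors are
# derived from the stated numbers, not values reported by the project.
gpu_hours = 15        # one GPU running for roughly 15 hours
energy_kwh = 3.75     # total electricity used

avg_power_watts = 1000 * energy_kwh / gpu_hours   # = 250 W average draw
kg_co2_per_kwh = 1.62 / energy_kwh                # about 0.43 kg CO2 per kWh
kg_co2_per_km = 1.62 / 6.55                       # about 0.25 kg CO2 per km driven

print(avg_power_watts, round(kg_co2_per_kwh, 2), round(kg_co2_per_km, 2))
```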
The project is funded in part by grants from the Mellon Foundation and the National Endowment for the Humanities. Funding has also been provided by the University of Richmond for continued work and development.
Names of the artists for all images in your current search query. The numbers show the total number of images by each artist that are part of your results. Click on an artist to further filter the search, or clear the search bar above to see all artists.
A map with a circle showing each of the locations corresponding to images in your current search query. Larger circles correspond to more images. Zoom in and out to see further detail. Hovering over a circle shows the number of images in a given location from your current search. Click on a circle to further filter the search, or clear the search bar above to see all locations.