Slides from my presentation at the 2014 Digital Humanities conference in Lausanne, Switzerland, can be found here.
The talk sprints through the basic workings of, and the technology behind, Arch-V, a computer-vision-based search platform for archives of historical printed materials.
Arch-V is a two-cluster tool set. The first cluster, the Arch-V BOVW Toolset (the BowWow), is a collection of C++ tools for extracting, creating, manipulating, and analyzing Bag of Visual Words (BOVW) representations of images. It relies on the OpenCV library. Source code and license information can be found at https://bitbucket.com/cstahmer/archv.
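For readers new to the idea, here is a minimal sketch of what a Bag of Visual Words representation is. It is written in plain Python for brevity and is not taken from the BowWow source, which is C++ on top of OpenCV. The function names, the tiny two-dimensional "descriptors", and the three-word vocabulary are all made up for illustration; real feature descriptors have many more dimensions and real vocabularies have many more words.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centroid closest to the descriptor."""
    return min(range(len(vocabulary)), key=lambda i: dist(descriptor, vocabulary[i]))


def bag_of_visual_words(descriptors, vocabulary):
    """Count how many of an image's descriptors map to each visual word."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, vocabulary)] += 1
    return histogram


# Hypothetical toy data: a 3-word vocabulary and 4 descriptors from one image.
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 9.5), (0.2, 0.1), (1.0, 9.0)]

histogram = bag_of_visual_words(descriptors, vocabulary)
print(histogram)  # [2, 1, 1]
```

The resulting histogram is the "bag": it records which visual words occur in the image and how often, and discards where in the image they occurred.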
The second cluster, the Arch-V Index and Search Toolset (Arch-VIST), is a collection of Java tools for optimizing bags of visual words for searching, indexing them with Lucene, performing searches of the index, and returning the results to a web application as JSON. The Arch-VIST source code can be found at https://bitbucket.com/cstahmer/archv_java.
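The general shape of that pipeline can be sketched as follows. This is an assumption-laden illustration in plain Python, not the Arch-VIST code: a small in-memory inverted index stands in for Lucene, the "vw3"-style token format is invented, the image identifiers are hypothetical, and the scoring is a simple histogram intersection rather than Lucene's own scoring.

```python
import json
from collections import defaultdict


def histogram_to_tokens(histogram):
    """Turn a visual-word histogram into text-like tokens, one per occurrence."""
    return [f"vw{i}" for i, count in enumerate(histogram) for _ in range(count)]


class ToyIndex:
    """In-memory inverted index; a stand-in for a real Lucene index."""

    def __init__(self):
        # token -> {image_id: number of times the token occurs in that image}
        self.postings = defaultdict(dict)

    def add(self, image_id, histogram):
        for token in histogram_to_tokens(histogram):
            counts = self.postings[token]
            counts[image_id] = counts.get(image_id, 0) + 1

    def search(self, histogram, limit=10):
        """Rank indexed images by histogram intersection and return JSON."""
        scores = defaultdict(int)
        for i, query_count in enumerate(histogram):
            if query_count == 0:
                continue
            for image_id, tf in self.postings.get(f"vw{i}", {}).items():
                scores[image_id] += min(query_count, tf)
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
        return json.dumps([{"id": image_id, "score": s} for image_id, s in ranked])


# Hypothetical images and histograms over a 3-word vocabulary.
index = ToyIndex()
index.add("ballad_001", [2, 1, 1])
index.add("ballad_002", [0, 3, 0])

results = index.search([1, 1, 0])
print(results)
# [{"id": "ballad_001", "score": 2}, {"id": "ballad_002", "score": 1}]
```

Treating each visual word as if it were a word of text is what lets a text search engine be reused for image search: the query image is reduced to the same kind of tokens as the indexed images, and matching proceeds as it would for a text query.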
Please note that this software is mid-stream in its development cycle. As such, the code is not yet optimized, and there is no installation or usage documentation. We have applied for more funding to support the effort to package the software for wide distribution and implementation. In the meantime, if you are interested in implementing Arch-V, please contact me directly and I will happily walk you through the process and get you up and running.
I will be providing detail on the Arch-V functionality and architecture in narrative form in an upcoming post.