How To Scan 50 Miles of Historical Documents Into an Online Archive

Tracking the lightning-quick development of modern cities is easy with Google Street View, but a big new project aims to provide context for the past 1,000 years of urban evolution in Venice, Italy. The Venice Time Machine will digitize and catalog a staggering number of historical documents (a combined 50 miles' worth of shelves!), then turn the data into an internet archive and adaptable 3D model.

Timing-wise, it certainly lines up with the zeitgeisty rush of cultural institutions—from the Met to the New York Public Library—putting their collections online and free for the public to browse. In this case, pulling together all the documents into a cohesive whole will be a pretty epic challenge, and one that's been centuries in the making.

The archives were located in different spots around the city until 1815, when they were moved to the Frari's convent in the middle of town. Now, the stacks and stacks (and stacks and stacks) contain a millennium's worth of all kinds of handwritten texts: maps, correspondence, tax statements, architectural plans, travel guides, peace treaties, and even birth and death registries and wills.

The multi-step process, helmed by the École Polytechnique Fédérale de Lausanne, the University Ca' Foscari, and the Lombard Odier Foundation, will involve significant input from both computers and people, using technology that is still being tested and perfected. First, everything will be scanned into high-resolution images by a semi-automatic robotic scanning unit. Alternatively, and this sounds incredible, a technique using "X-ray synchrotron radiation produced by a particle accelerator" will allow scanning without even turning a page.

Those images will be transcribed with the help of a text processor powered by the kind of algorithms used for protein structure analysis; these will detect strings of words that could be sentences.

Finally, a massive taxonomy of key phrases, people, and places will build out this interconnected, entirely searchable database. Eventually, the team believes they'll be able to create a morphing 3D model of Venice through the years, based on the gathered and synthesized info.
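For a rough feel of how a taxonomy of people, places, and key phrases can make a pile of documents searchable, here is a minimal Python sketch of an inverted index. It is not the project's actual system: the record IDs, the tags, and the `build_index` and `search` helpers are all made up for illustration.

```python
from collections import defaultdict


def build_index(records):
    """Map each tagged entity (lowercased) to the set of document IDs that carry it."""
    index = defaultdict(set)
    for doc_id, entities in records.items():
        for entity in entities:
            index[entity.lower()].add(doc_id)
    return index


def search(index, *terms):
    """Return the sorted IDs of documents tagged with every one of the given terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical records: document ID -> tags a transcription step might produce.
records = {
    "doc-001": ["Rialto", "tax statement"],
    "doc-002": ["Rialto", "will"],
    "doc-003": ["Frari", "will"],
}

index = build_index(records)
print(search(index, "rialto", "will"))
```

Looking up a place and a document type together returns only the records tagged with both, which is the basic idea behind cross-linking people, places, and phrases across an archive.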

It will be fascinating to watch this process unfold, and its success could have major implications for putting more rare, print-only artifacts into the global mix. [Gizmag]