Evaluating Google’s Machine Learning Cloud Vision API for Partially Automated Placename Extraction from Historical Maps

Date: Thursday, March 5, 2020, 12:00pm to 1:30pm

Location: CGIS South Room S030

Presentation by Miranda Lupion, Harvard University

View the slides, listen to the audio, or read a text transcription of the audio of this presentation.

Abstract: My research explores the potential to partially or fully automate settlement name extraction from historical maps using Google’s out-of-the-box optical character recognition (OCR) algorithm. A prerequisite to meaningful cartographic analysis, manual placename extraction is time-consuming, expensive, and mundane. Implemented through an API, Google’s Cloud Vision algorithm may offer a partially automated alternative to this process. To test the pre-packaged algorithm's accuracy, I use the API to perform feature detection on four sheets from the 1827 Geographical Atlas of the Russian Empire. I then assess the accuracy of each of the roughly 4,000 returned text units and record other details as well. Through exploratory data analysis of the results, I consider how language, font choice, and settlement placement affect accuracy. Based on my results, I discuss areas where a trained rather than pre-packaged algorithm might boost OCR accuracy. I also consider potential pre- and post-processing procedures that, when applied to data, may improve outcomes. My findings have implications for geospatial digital humanities projects.
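As a rough illustration of the kind of exploratory analysis the abstract describes, the sketch below computes the share of correctly recognized text units per category (for example, per language or per font). It is a minimal sketch under stated assumptions, not the speaker's actual workflow: the record layout, the field names (`language`, `font`, `correct`), the helper `accuracy_by`, and the sample rows are all hypothetical, and the sample is not data from the study.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Return {category: share of correct units} grouped by the given field.

    Hypothetical helper: `records` is a list of dicts, each describing one
    returned text unit that a human reviewer has marked correct or not.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        totals[rec[key]] += 1
        if rec["correct"]:
            correct[rec[key]] += 1
    return {cat: correct[cat] / totals[cat] for cat in totals}


# Hypothetical, hand-made sample rows for illustration only.
sample = [
    {"text": "Москва", "language": "Russian", "font": "serif", "correct": True},
    {"text": "Moscou", "language": "French", "font": "italic", "correct": True},
    {"text": "Тула", "language": "Russian", "font": "italic", "correct": False},
    {"text": "Toula", "language": "French", "font": "italic", "correct": True},
]

print(accuracy_by(sample, "language"))
print(accuracy_by(sample, "font"))
```

Grouping the same reviewed records by different fields is one simple way to see which factor is associated with the most recognition errors before deciding where pre-processing, post-processing, or a trained model might help.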

Speaker Bio: Miranda Lupion is a graduate student at Harvard University in the Regional Studies: Russia, Eastern Europe, and Central Asia M.A. program and an Innovation Fellow with the Davis Center's Imperiia Project. Her research interests include Internet technology regulation and use in Russia, Russian foreign policy, GIS, digital humanities, OCR, and automation.

Lunch will be served.


Center for Geographic Analysis
