Research Portfolio

Below is a list of select projects supported by CGA, sorted from newest to oldest.
See also: Award Winner

Network Analysis on Geospatial Big Data in Brazil

Network analysis is a commonly encountered problem in GIS. Researchers are increasingly working with big geospatial datasets that contain millions of records. At this scale, traditional GIS methods of network analysis fall short, and new approaches are needed to analyze the data. In this blog post, we describe the procedure we used to calculate the shortest driving distances between 3.5 million patients and their nearest hospital in Brazil. There are several tools for calculating shortest distances; the most common among them are...

Read more about Network Analysis on Geospatial Big Data in Brazil
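
As a rough illustration of the nearest-facility problem described above, the sketch below runs a multi-source Dijkstra search: every hospital starts at distance zero, so a single pass labels each network node with its closest hospital. This is not the procedure used in the project (the teaser does not name the tools). The toy graph, the node names, and the function name nearest_facility_distances are all hypothetical, and the road network is assumed to be undirected.

```python
import heapq

def nearest_facility_distances(graph, facilities):
    """Return {node: (distance, facility)} for every node reachable from a facility.

    graph: dict mapping node -> list of (neighbor, edge_length) pairs.
           Assumed undirected: each edge is listed in both directions.
    facilities: iterable of nodes that host a facility (e.g. a hospital).
    """
    best = {}
    # Seed the queue with every facility at distance 0 (multi-source search).
    heap = [(0.0, f, f) for f in facilities]
    heapq.heapify(heap)
    while heap:
        dist, node, source = heapq.heappop(heap)
        if node in best:
            continue  # already settled with a shorter or equal distance
        best[node] = (dist, source)
        for neighbor, length in graph.get(node, []):
            if neighbor not in best:
                heapq.heappush(heap, (dist + length, neighbor, source))
    return best

# Hypothetical toy network: A, B, C are junctions; H1 and H2 are hospitals.
roads = {
    "A": [("B", 2.0), ("C", 5.0)],
    "B": [("A", 2.0), ("C", 1.0), ("H1", 4.0)],
    "C": [("A", 5.0), ("B", 1.0), ("H2", 1.5)],
    "H1": [("B", 4.0)],
    "H2": [("C", 1.5)],
}
result = nearest_facility_distances(roads, ["H1", "H2"])
print(result["A"])
```

Searching outward from the facilities means the cost depends on the size of the network rather than on the number of patients, which matters when there are millions of origins. On a directed road network, the same search would need to run on the reversed graph to give patient-to-hospital distances.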

Mangrove Forests Mapping

Coastal mangrove forests provide important ecosystem goods and services, including carbon sequestration, biodiversity conservation, and hazard mitigation. However, they are being destroyed at an alarming rate by human activities. To characterize mangrove forest changes, evaluate their impacts, and support relevant protection and restoration decision making, accurate and up-to-date mangrove forests mapping at large spatial scales is essential. Sponsored by NASA Carbon Monitoring Systems (CMS) Program, we developed a machine learning ensemble to map...

Read more about Mangrove Forests Mapping

Infogroup US Historical Business Dataset Analysis

This project involved creating geospatial measures for ~2,000 public firms from the Infogroup US Historical Business Dataset. One of the tasks involved calculating the following variables at the census block group level from the dataset for 23 years of data (1997 – 2019).

1. Businesses per office size type (Office_Size_Code)
2. Businesses per sales volume (Location_Sales_Volume_Code)
3. Businesses per employee size (Location_Employee_Size_Code)
4. Businesses per Business_Status_Code
5. Number of establishments. Will be calculated from Year_Established...

Read more about Infogroup US Historical Business Dataset Analysis
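
The per-block-group counts listed above amount to a grouped tally. The sketch below shows one way to do this with the Python standard library. The record layout is an assumption: the keys "year" and "block_group", the sample identifiers, and the code values "A" and "B" are made up for illustration. Only the field name Office_Size_Code comes from the list above.

```python
from collections import Counter

def count_by_block_group(records, code_field):
    """Count businesses per (year, block group, code value).

    records: iterable of dicts, one per business location per year.
    code_field: name of the categorical field to tally,
                e.g. "Office_Size_Code".
    """
    counts = Counter()
    for rec in records:
        key = (rec["year"], rec["block_group"], rec[code_field])
        counts[key] += 1
    return counts

# Hypothetical sample rows; identifiers and code values are invented.
sample = [
    {"year": 1997, "block_group": "250250001001", "Office_Size_Code": "A"},
    {"year": 1997, "block_group": "250250001001", "Office_Size_Code": "A"},
    {"year": 1997, "block_group": "250250001001", "Office_Size_Code": "B"},
    {"year": 1998, "block_group": "250250001001", "Office_Size_Code": "A"},
]
office_counts = count_by_block_group(sample, "Office_Size_Code")
print(office_counts[(1997, "250250001001", "A")])
```

The same function could be called once per variable by changing code_field.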

Detroit Zoning Analysis

The list of work order tickets for Detroit provided by the researcher was converted into a GIS polygon dataset containing 355,500 polygons using a Python script in ArcGIS Pro. Polygons were mapped using the string of coordinates found in the “Polygon” field. For the ticket polygons / parcel zoning analysis, PostGIS software was used on...

Read more about Detroit Zoning Analysis

Use of Social Media data to study Climate Change

Harvard CGA joined forces with MIT SUL in 2021 to use social media data to study the effects of climate change on people’s well-being. To achieve this objective, we developed the Twitter Sentiment Global Index (TSGI) dataset, an open dataset for monitoring Subjective Well-Being (SWB) globally. By applying Natural Language Processing techniques to our archive of 10 billion geotagged tweets...

Read more about Use of Social Media data to study Climate Change

High Performance Computing for Address Level Climate Data Extraction

A key objective of many public health researchers the CGA works with is to improve the health of cohort members by calculating various social and environmental exposures at cohort member address locations. To support this objective, the CGA processed daily precipitation, temperature, and humidity estimates for 4,796 cohort address locations for the years 1999 – 2017, resulting in over 73 million patient/days of calculations. Input climate data was the 800-meter resolution...

Read more about High Performance Computing for Address Level Climate Data Extraction

Mapping Ancient Landscapes

Racing against the clock as development encroaches on important Kurdish heritage sites, a team of landscape archaeologists deploys drones and comparative image analysis to capture previously undetected ancient settlements. This project features research by Dr. Jason Ur and maps by Jeff Blossom, and was published as a chapter in a large-format book ...

Read more about Mapping Ancient Landscapes

Spatiotemporal pattern of COVID-19 spread in Brazil

In collaboration with Dr. Marcia Castro of the Harvard T.H. Chan School of Public Health, the CGA analyzed the pattern of spread of COVID-19 cases and deaths in Brazil from February to October 2020. Weekly geographic centroid locations, weighted by municipal COVID-19 case and death counts, were created at the national and state levels. Visualizations of the centroid progression over time were then created. Read more in this ...

Read more about Spatiotemporal pattern of COVID-19 spread in Brazil
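
A weighted centroid of the kind described above is the weighted mean of the point coordinates, computed once per week. The following is a minimal sketch, not the project's code. It assumes each municipality is represented by a single point in a projected (planar) coordinate system, and that input rows have the hypothetical layout (week, x, y, count). Averaging raw longitude and latitude over an area the size of Brazil would only be an approximation.

```python
from collections import defaultdict

def weighted_centroid(points):
    """Weighted mean center of a list of (x, y, weight) tuples.

    Returns None when the total weight is zero (e.g. a week with no cases).
    """
    total = sum(w for _, _, w in points)
    if total == 0:
        return None
    x = sum(px * w for px, _, w in points) / total
    y = sum(py * w for _, py, w in points) / total
    return (x, y)

def weekly_centroids(rows):
    """Group (week, x, y, count) rows by week and compute one centroid per week."""
    by_week = defaultdict(list)
    for week, x, y, count in rows:
        by_week[week].append((x, y, count))
    return {week: weighted_centroid(pts) for week, pts in sorted(by_week.items())}

# Hypothetical rows: two municipalities observed over two weeks.
rows = [
    (1, 0.0, 0.0, 1),
    (1, 10.0, 0.0, 3),
    (2, 0.0, 0.0, 2),
    (2, 10.0, 10.0, 2),
]
centroids = weekly_centroids(rows)
print(centroids)
```

Plotting the resulting points in week order gives the centroid progression over time.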

COVID-19 Impact on Mortality of Various Causes in the United States

This project analyzed CDC-published mortality data for a dozen major causes since 1999, and applied the Exponential Smoothing (ETS) algorithm to simulate the 2020 mortality rates per cause, per month, and per state, assuming there was no COVID-19 pandemic. The difference between the simulated rates and the actual rates revealed COVID-19 impacts on mortality of various causes in the United States. Results are published in...

Read more about COVID-19 Impact on Mortality of Various Causes in the United States
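
The counterfactual comparison described above can be illustrated with simple exponential smoothing, the most basic member of the exponential smoothing family. This is a sketch only: the project's actual ETS configuration (trend and seasonal components, parameter fitting) is not given in the teaser, and the smoothing parameter alpha, the function name ses_forecast, and the numbers below are made up.

```python
def ses_forecast(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing.

    history: past observations in time order (e.g. mortality rates).
    alpha: smoothing parameter in (0, 1]; higher values weight recent data more.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # initialize the level with the first observation
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical pre-pandemic rates for one cause, month, and state.
history = [100.0, 110.0, 105.0]
expected = ses_forecast(history, alpha=0.5)  # simulated "no pandemic" rate
observed = 120.0                             # hypothetical actual 2020 rate
difference = observed - expected
print(expected, difference)
```

A positive difference means the observed rate exceeded the simulated no-pandemic rate for that cause, month, and state.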

RAPID: Building a Spatiotemporal Platform for Rapid Response to COVID 19

Sponsored by NSF, this project aims to build a comprehensive data repository of virus cases and associated social and natural information from different sources for sustainable archiving; share data with the research communities through smart data discovery capabilities with easy access; utilize the spatiotemporal computing infrastructure built in IUCRC STC for the computational needs of COVID-19 research with online collaboration; develop workflows in collaboration with public health...

Read more about RAPID: Building a Spatiotemporal Platform for Rapid Response to COVID 19

COVID-19 Metrics for United States Congressional Districts

Nearly a year into the global pandemic, data on COVID-19 metrics for the United States Congressional Districts (CD) had not been readily available. Yet, having access to such data can substantially enhance the ability of elected officials and the constituents they represent to monitor and develop testing strategies and other measures to allow their districts to open safely.

Researchers at the ...

Read more about COVID-19 Metrics for United States Congressional Districts

Geotweet Archive v2.0

The Harvard Center for Geographic Analysis (CGA) maintains the Geotweet Archive, a global record of tweets spanning time, geography, and language. The primary purpose of the Archive is to make a comprehensive collection of geo-located tweets available to the academic research community.

The Archive extends from 2010 to the present and is updated daily. The number of tweets in the collection totals approximately 10 billion, and it is stored on ...

Read more about Geotweet Archive v2.0

Measurement of partisan segregation for 180 million U.S. voters using advanced geospatial data science

Partisan segregation among people has important political and social implications. Historically, such measurements have been limited to the county level, but this innovative work enabled Harvard researchers to analyze partisanship down to the level of individuals for the first time. In this work, CGA, along with Department of Government Professor Ryan Enos and graduate student Jacob Brown, leveraged advances in geospatial data science to measure partisan segregation down to the...

Read more about Measurement of partisan segregation for 180 million U.S. voters using advanced geospatial data science

Geographic Apportioning of Health Indicators for Policy Relevant Decision Making

For elected officials to best represent the constituencies they serve, information must be presented and analyzed for the entire geographic extent of their constituency. Health indicators are often collected and reported at census geographies, or other units that do not conform to constituency boundaries. To address this lack of health data reported at policy-relevant geographies, the Harvard Center for Population and Development Studies Geographic Insights team (...

Read more about Geographic Apportioning of Health Indicators for Policy Relevant Decision Making

U.S. Air Pollution Modeling

CGA created maps that display PM2.5 and ozone for the US by county from 2000 to 2012, which involved cross-tabulation of monitoring sites within county boundaries.

Investigators: Francesca Dominici and Christine Choirat, Department of Biostatistics, HSPH

Research Staff: Fei Carnes, Jeff Blossom

...

Read more about U.S. Air Pollution Modeling

Out of Eden Walk

The Out of Eden Walk is a 24,000-mile journalistic endeavor to create a global record of human life at the start of a new millennium as told by villagers, nomads, traders, farmers, soldiers, and artists who rarely make the news. Sponsored and hosted by National Geographic Society...

Read more about Out of Eden Walk

WorldMap / Dataverse Integration

Dataverse and WorldMap have been integrated by adding APIs to both systems (funded by the Boston Research Initiative).

Three types of Dataverse file:

  • Tables with columns containing latitude and longitude
  • Tables with a column of known codes that represent geometries, e.g., ZIP, Census, FIPS, etc.
  • Files of type “shapefile”, zip compressed.

For each of the three types, one can:

  • Create data-driven map symbolization options
  • Obtain downloads in a variety of spatial formats
  • ...
Read more about WorldMap / Dataverse Integration

OmniSci (formerly MapD) - explore the power of GPUs in spatiotemporal analytics

Funded by OmniSci Technologies, LLC, as a member of the I/UCRC Spatiotemporal Innovation Center.
Can be hosted on the Harvard Cannon Cluster, Amazon AWS, Mass Open Cloud, or MapD Cloud.

Use cases:

  • Improving access to hydrological models used in water management and public safety
  • Analyzing how political partisanship affects the geographic sorting of voters

See...

Read more about OmniSci (formerly MapD) - explore the power of GPUs in spatiotemporal analytics