Network Analysis on Geospatial Big Data in Brazil

Network analysis is a common problem in GIS. Researchers are increasingly working with big geospatial datasets that contain millions of records. At this scale, traditional GIS methods of network analysis fall short and new approaches are needed to analyze the data. In this blog, we describe the procedure we used to calculate the shortest drive distances between 3.5 million patients and their nearest hospital in Brazil. There are several tools for calculating shortest distances; the most common among them are...
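
As a rough illustration of the nearest-facility idea described above, the sketch below runs a multi-source Dijkstra search over a toy road graph. It is not the procedure used in the project: the function name, the graph layout, and the node labels are all assumptions made for this example. Starting the search from every hospital at once gives each node its distance to the nearest hospital in a single pass, which avoids one search per patient.

```python
import heapq


def nearest_facility_distances(graph, facilities):
    """Return {node: distance to the closest facility} via multi-source Dijkstra.

    graph: {node: [(neighbor, length), ...]} (hypothetical layout for this sketch)
    facilities: iterable of nodes where hospitals sit
    """
    # Every facility starts at distance 0, so one search serves all of them.
    dist = {f: 0.0 for f in facilities}
    heap = [(0.0, f) for f in dist]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, length in graph.get(node, []):
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Toy undirected network: A - B - C - D - E, with lengths in km (made-up values).
edges = [("A", "B", 2.0), ("B", "C", 3.0), ("C", "D", 1.0), ("D", "E", 4.0)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))

# Hospitals at A and E; patients can be looked up at any other node.
distances = nearest_facility_distances(graph, ["A", "E"])
print(distances["C"])
```

On a real road network with one-way streets, the search would need to run on the reversed graph so that the result is the patient-to-hospital drive distance rather than the hospital-to-patient one.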

Read more about Network Analysis on Geospatial Big Data in Brazil

Mangrove Forests Mapping

Coastal mangrove forests provide important ecosystem goods and services, including carbon sequestration, biodiversity conservation, and hazard mitigation. However, they are being destroyed at an alarming rate by human activities. To characterize mangrove forest changes, evaluate their impacts, and support relevant protection and restoration decision-making, accurate and up-to-date mangrove forest mapping at large spatial scales is essential. With sponsorship from the NASA Carbon Monitoring System (CMS) Program, we developed a machine learning ensemble to map...

Read more about Mangrove Forests Mapping

Infogroup US Historical Business Dataset Analysis

This project involved creating geospatial measures for ~2,000 public firms from the Infogroup US Historical Business Dataset. One of the tasks involved calculating the following variables at the census block group level from the dataset for 23 years of data (1997 – 2019):

1. Businesses per office size type (Office_Size_Code)
2. Businesses per sales volume (Location_Sales_Volume_Code)
3. Businesses per employee size (Location_Employee_Size_Code)
4. Businesses per Business_Status_Code
5. Number of establishments. Will be calculated from Year_Established...
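
A count of this kind can be sketched with a simple tally keyed by year, block group, and code value. This is only an illustration: the record layout, the "year" and "block_group" keys, the block group IDs, and the code values "A" and "B" are hypothetical, and only the Office_Size_Code field name comes from the list above.

```python
from collections import Counter

# Hypothetical sample records; real data would come from the yearly business files.
records = [
    {"year": 1997, "block_group": "bg1", "Office_Size_Code": "A"},
    {"year": 1997, "block_group": "bg1", "Office_Size_Code": "A"},
    {"year": 1997, "block_group": "bg1", "Office_Size_Code": "B"},
    {"year": 1998, "block_group": "bg2", "Office_Size_Code": "A"},
]


def count_by_code(rows, code_field):
    """Count businesses per (year, block group, code value)."""
    counts = Counter()
    for row in rows:
        counts[(row["year"], row["block_group"], row[code_field])] += 1
    return counts


office_counts = count_by_code(records, "Office_Size_Code")
for key, n in sorted(office_counts.items()):
    print(key, n)
```

The same helper could be reused for the other code fields by passing a different field name.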

Read more about Infogroup US Historical Business Dataset Analysis

Detroit Zoning Analysis

The list of work order tickets for Detroit provided by the researcher was converted into a GIS polygon data set containing 355,500 polygons using a Python script in ArcGIS Pro. Polygons were mapped using the string of coordinates found in the “Polygon” field. For the ticket polygons / parcel zoning analysis, PostGIS software was used on...

Read more about Detroit Zoning Analysis

High Performance Computing for Address Level Climate Data Extraction

A key objective of many public health researchers the CGA works with is to find ways to improve the health of cohort members by calculating various social and environmental exposures at cohort member address locations. To support this objective, the CGA processed daily precipitation, temperature, and humidity estimates for 4,796 cohort address locations for the years 1999 – 2017, resulting in over 73 million patient-days of calculations. Input climate data was the 800-meter resolution...

Read more about High Performance Computing for Address Level Climate Data Extraction

Mapping Ancient Landscapes

Racing against the clock as development encroaches on important Kurdish heritage sites, a team of landscape archaeologists deploys drones and comparative image analysis to capture previously undetected ancient settlements. This project features research by Dr. Jason Ur and maps by Jeff Blossom, and was published as a chapter in a large-format book ...

Read more about Mapping Ancient Landscapes

Spatiotemporal pattern of COVID-19 spread in Brazil

In collaboration with Dr. Marcia Castro of the Harvard T.H. Chan School of Public Health, the CGA analyzed the spread of COVID-19 cases and deaths in Brazil from February to October 2020. Weekly geographic centroid locations, weighted by municipal COVID-19 case and death counts, were created at the national and state levels. Visualizations of the centroid progression over time were also created. Read more in this ...
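
A weighted centroid of the kind described above can be sketched as a weighted mean of coordinates. The snippet below is a minimal illustration, not the CGA's workflow: the function name, the week labels, the coordinates, and the case counts are all made up, and averaging longitude and latitude directly is a planar approximation that a real analysis might replace with projected coordinates.

```python
def weighted_centroid(points):
    """Weighted mean of (lon, lat, weight) tuples; None if all weights are zero."""
    total = sum(weight for _, _, weight in points)
    if total == 0:
        return None
    lon = sum(x * weight for x, _, weight in points) / total
    lat = sum(y * weight for _, y, weight in points) / total
    return lon, lat


# Hypothetical weekly inputs: (longitude, latitude, case count) per municipality.
weekly_points = {
    "week 1": [(-46.0, -23.0, 300), (-43.0, -22.0, 100)],
    "week 2": [(-46.0, -23.0, 100), (-43.0, -22.0, 300)],
}

# One centroid per week; plotting these in order shows the progression over time.
weekly_centroids = {week: weighted_centroid(pts) for week, pts in weekly_points.items()}
for week, centroid in weekly_centroids.items():
    print(week, centroid)
```

In this toy input the centroid moves toward the second municipality as its share of cases grows from week 1 to week 2.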

Read more about Spatiotemporal pattern of COVID-19 spread in Brazil

COVID-19 Impact on Mortality of Various Causes in the United States

This project analyzed CDC-published mortality data for a dozen major causes of death since 1999 and applied the Exponential Smoothing (ETS) algorithm to simulate the 2020 mortality rates per cause, per month, and per state, assuming there had been no COVID-19 pandemic. The difference between the simulated and actual rates revealed the impact of COVID-19 on mortality from various causes in the United States. Results are published in...

Read more about COVID-19 Impact on Mortality of Various Causes in the United States

RAPID: Building a Spatiotemporal Platform for Rapid Response to COVID 19

Sponsored by NSF, this project aims to build a comprehensive data repository of virus cases and associated social and natural information from different sources for sustainable archiving; share data with the research communities through smart data discovery capabilities with easy access; utilize the spatiotemporal computing infrastructure built in IUCRC STC for the computational needs of COVID-19 research with online collaboration; develop workflows in collaboration with public health...

Read more about RAPID: Building a Spatiotemporal Platform for Rapid Response to COVID 19

COVID-19 Metrics for United States Congressional Districts

Nearly a year into the global pandemic, data on COVID-19 metrics for the United States Congressional Districts (CD) had not been readily available. Yet, having access to such data can substantially enhance the ability of elected officials and the constituents they represent to monitor and develop testing strategies and other measures to allow their districts to open safely.

Researchers at the ...

Read more about COVID-19 Metrics for United States Congressional Districts

Geotweet Archive v2.0

The Harvard Center for Geographic Analysis (CGA) maintains the Geotweet Archive, a global record of tweets spanning time, geography, and language. The primary purpose of the Archive is to make a comprehensive collection of geo-located tweets available to the academic research community.

The Archive extends from 2010 to the present and is updated daily. The number of tweets in the collection totals approximately 10 billion, and it is stored on ...

Read more about Geotweet Archive v2.0