Devika Kakkar, Ben Lewis, and Wendy Guan. 5/18/2022. “Interactive analysis of big geospatial data with high-performance computing: A case study of partisan segregation in the United States.” Transactions in GIS.
Researchers are increasingly working with large geospatial datasets that contain hundreds of millions of records. At this scale, desktop GIS systems typically fall short, so new approaches and methods are needed. The objective of this work is to develop new approaches to interactively analyze large datasets and then to demonstrate the usefulness of those approaches using a case study looking at voter, or partisan, segregation. Historically, the measurement of partisan segregation has been limited to comparing large geographic areas such as counties or states because researchers only had access to aggregated data. In this case study, however, we measure partisan segregation down to the individual for 180 million U.S. voters using advanced geospatial data science and high-performance computing. This article discusses interactive method development for big geospatial data analysis including techniques used, solutions developed, and processing time statistics.
Xue Liu, Wendy Guan, and Rinki Deo. 1/2022. “Large-Scale High-Resolution Coastal Mangrove Forests Mapping Across West Africa With Machine Learning Ensemble and Satellite Big Data.” Frontiers in Earth Science.
Coastal mangrove forests provide important ecosystem goods and services, including carbon sequestration, biodiversity conservation, and hazard mitigation. However, they are being destroyed at an alarming rate by human activities. To characterize mangrove forest changes, evaluate their impacts, and support relevant protection and restoration decision making, accurate and up-to-date mangrove extent mapping at large spatial scales is essential. Available large-scale mangrove extent data products use a single machine learning method, commonly with 30 m Landsat imagery, and significant inconsistencies remain among these data products. With huge amounts of satellite data involved and the heterogeneity of land surface characteristics across large geographic areas, finding the most suitable method for large-scale high-resolution mangrove mapping is a challenge. The objective of this study is to evaluate the performance of a machine learning ensemble for mangrove forest mapping at 20 m spatial resolution across West Africa using Sentinel-2 (optical) and Sentinel-1 (radar) imagery. The machine learning ensemble integrates three machine learning methods commonly used in land cover and land use mapping: Random Forest (RF), Gradient Boosting Machine (GBM), and Neural Network (NN). The cloud-based big geospatial data processing platform Google Earth Engine (GEE) was used for pre-processing Sentinel-2 and Sentinel-1 data. Extensive validation has demonstrated that the machine learning ensemble can generate mangrove extent maps at high accuracies for all study regions in West Africa (92%–99% Producer’s Accuracy, 98%–100% User’s Accuracy, 95%–99% Overall Accuracy). This is the first time that mangrove extent has been mapped at a 20 m spatial resolution across West Africa. The machine learning ensemble has the potential to be applied to other regions of the world and is therefore capable of producing high-resolution mangrove extent maps at global scales periodically.
Lingbo Liu, Ru Wang, Weihe Wendy Guan, Shuming Bao, Hanchen Yu, Xiaokang Fu, and Hongqiang Liu. 2/18/2022. “Assessing Reliability of Chinese Geotagged Social Media Data for Spatiotemporal Representation of Human Mobility.” ISPRS International Journal of Geo-Information, 11, 2.
Understanding the space-time dynamics of human activities is essential in studying human security issues such as climate change impacts, pandemic spreading, or urban sustainability. Geotagged social media posts provide an open and space-time continuous data source with user locations, which is convenient for studying human movement. However, the reliability of Chinese geotagged social media data for representing human mobility remains unclear. This study compares human movement data derived from the posts of Sina Weibo, one of the largest social media platforms in China, with that of Baidu Qianxi, a high-resolution human movement dataset from ‘Baidu Map’, a popular location-based service in China with 1.3 billion users. Correlation analysis was conducted across multiple dimensions of time periods (weekly and monthly), geographic scales (cities and provinces), and flow directions (inflow and outflow), and a case study on COVID-19 transmission was further explored with such data. The results show that Sina Weibo data can reveal patterns similar to those of Baidu Qianxi, and that the correlation is higher at the provincial level than at the city level and higher at the monthly scale than at the weekly scale. The study also revealed spatial variations in the degree of similarity between the two sources. Findings from this study reveal the value, properties, and spatiotemporal heterogeneity of human mobility data extracted from Weibo posts, providing a reference for the proper use of social media posts as data sources for human mobility studies.
Akhil Kumar, Yogya Kalra, Weihe Wendy Guan, Vansh Tibrewal, Rupali Batta, and Andrew Chen. 9/29/2021. “COVID-19 impact on excess deaths of various causes in the United States.” Annals of GIS.
Media regarding COVID-19 fatality counts is crucial, affecting policy and health measures nationwide. However, misinformation regarding other causes of death has led to dubious claims about the seriousness of the coronavirus. This research aims to identify the changes in a dozen causes of death during the pandemic using CDC data from 1999 to 2020. Using the Exponential Triple Smoothing (ETS) algorithm, this project estimated the mortality of eleven causes of death for 2020 under the assumption of no COVID-19 pandemic. Using Power BI and Tableau, this data was visualized together with 2020 actual death counts to determine which causes of death were significantly impacted by the coronavirus. The dashboard revealed an increase in several causes of death including Alzheimer’s Disease and Diabetes, a decrease in Chronic Lower Respiratory Disease deaths, and a slight increase in Influenza deaths. These findings, while at odds with much of the media surrounding COVID-19 mortality, are corroborated by adjacent scientific research.
Wendy Guan and Liz Hess. 7/6/2020. “Understanding the Ecosystem of Geospatial Research and Service in Universities.” Journal of Map & Geography Libraries.
The study of location and location-based phenomena is a flourishing field. Many universities have grown their research and/or services in this field (often called GIS) and established centers that are primarily engaged in research on GIS or in applying GIS technologies to support research in other fields. Some straddle “research of” and “research with” GIS in the same center, engaging in both GIScience research, often by researchers in a department or school, and geospatial technology services, often for users across the university. We conducted an online survey to scour the landscape of such centers in universities worldwide, to understand how they are structured, managed, financed, and sustained. The survey also included units that are part of a library, department, or lab. Eighty-one valid responses were analyzed, revealing these organizations’ administrative, financial, staffing, and operational status, as well as their history, visions, responsibilities, resources, constraints, challenges, and opportunities. The results showed differences between universities with and without a geography department.
Yongming Xu, Benjamin Lewis, and Weihe Wendy Guan. 2019. “Developing the Chinese Academic Map Publishing Platform.” ISPRS Int. J. Geo-Inf., 2019, 8, Pp. 567-.
The discipline of the humanities has long been inseparable from the exploration of space and time. With the rapid advancement of digitization, databases, and data science, humanities research is making greater use of quantitative spatiotemporal analysis and visualization. In response to this trend, our team developed the Chinese academic map publishing platform (AMAP) with the aim of supporting the digital humanities from a Chinese perspective. In compiling materials mined from China’s historical records, AMAP attempts to reconstruct the geographical distribution of entities including people, activities, and events, using places to connect these historical objects through time. This project marks the beginning of the development of a comprehensive database and visualization system to support humanities scholarship in China, and aims to facilitate the accumulation of spatiotemporal datasets, support multi-faceted queries, and provide integrated visualization tools. The software itself is built on Harvard’s WorldMap codebase, with enhancements that include improved support for Asian projections, support for Chinese encodings, the ability to handle long text attributes, feature-level search, and mobile application support. The goal of AMAP is to make Chinese historical data more accessible, while cultivating collaborative open-source software development.