
Investigating the enhancement of air pollutant predictions and understanding air quality disparities across racial, ethnic, and economic lines at US public schools




Cheeseman, Michael J., author
Pierce, Jeffrey R., advisor
Barnes, Elizabeth, committee member
Fischer, Emily, committee member
Ford, Bonne, committee member
Volckens, John, committee member



Ambient air pollution has significant health and economic impacts worldwide. Even in the most developed countries, monitoring networks often lack the spatiotemporal density to resolve air pollution gradients. Though air pollution affects the entire population, it can disproportionately affect disadvantaged and vulnerable communities. Pollutants such as fine particulate matter (PM2.5), nitrogen oxides (NO and NO2), and ozone, which have a variety of anthropogenic and natural sources, have garnered substantial research attention over the last few decades. Over half of the world's population and over 80% of Americans live in urban areas, and yet many cities have only one or a few air quality monitors, which limits our ability to capture differences in exposure within cities and estimate the resulting health impacts. Improving sub-city air pollution estimates could improve epidemiological and health-impact studies in cities with heterogeneous distributions of PM2.5, providing a better understanding of communities at risk from urban air pollution. Biomass burning is a source of PM2.5 air pollution that can impact both urban and rural areas, but quantifying the health impacts of PM2.5 from biomass burning can be even more difficult than from urban sources. Monitoring networks generally lack the spatial density needed to capture the heterogeneity of biomass burning smoke, especially near the source fires. Due to limitations of both urban and rural monitoring networks, several techniques have been developed to supplement and enhance air pollution estimates. For example, satellite aerosol optical depth (AOD) can be used to fill spatial gaps in PM monitoring networks, but AOD can be disconnected from time-resolved surface-level PM in a multitude of ways, including the limited overpass times of most satellites, daytime-only measurements, cloud cover, surface reflectivity, and lack of vertical-profile information.
Observations of smoke plume height (PH) may provide constraints on the vertical distribution of smoke and its impact on surface concentrations. Low-cost sensor networks have been rapidly expanding to provide higher density air pollution monitoring. Finally, geophysical modeling, statistical techniques such as machine learning and data mining, and combinations of all of the aforementioned datasets have been increasingly used to enhance surface observations. In this dissertation, we explore several of these different data sources and techniques for estimating air pollution and determining community exposure concentrations. In the first chapter of this dissertation, we assess PH characteristics from the Multi-Angle Implementation of Atmospheric Correction (MAIAC) and evaluate its correlation with co-located PM2.5 and AOD measurements. PH is generally highest over the western US. The ratio PM2.5:AOD generally decreases with increasing PH:PBLH (planetary boundary layer height), showing that PH has the potential to refine surface PM2.5 estimates for collections of smoke events. Next, to estimate spatiotemporal variability in PM2.5, we use machine learning (Random Forests; RFs) and concurrent PM2.5 and AOD measurements from the Citizen Enabled Aerosol Measurements for Satellites (CEAMS) low-cost sensor network as well as PM2.5 measurements from the Environmental Protection Agency's (EPA) reference monitors during wintertime in Denver, CO, USA. The RFs predicted PM2.5 in a 5-fold cross validation (CV) with relatively high skill (95% confidence interval R2=0.74-0.84 for CEAMS; R2=0.68-0.75 for EPA), though the models were aided by the spatiotemporal autocorrelation of the PM2.5 measurements. We find that the most important predictors of PM2.5 are factors associated with pooling of pollution in wintertime, such as low PBLH, stagnant wind conditions, and, to a lesser degree, elevation.
In general, spatial predictors are less important than spatiotemporal predictors because temporal variability exceeds spatial variability in our dataset. Although concurrent AOD is an important predictor in our RF model for hourly PM2.5, it does not improve model performance during our campaign period in Denver. Regardless, we find that low-cost PM2.5 measurements incorporated into an RF model were useful in interpreting meteorological and geographic drivers of PM2.5 over wintertime Denver. We also explore how the RF model performance and interpretation change based on different model configurations and data processing. Finally, we use high-resolution PM2.5 and nitrogen dioxide (NO2) estimates to investigate socioeconomic disparities in air quality at public schools in the contiguous US. We find that Black and African American, Hispanic, and Asian or Pacific Islander students are more likely to attend schools in locations where the ambient concentrations of NO2 and PM2.5 are above the World Health Organization's (WHO) guidelines for annual-average air quality. Specifically, we find that ~95% of students who identified as Asian or Pacific Islander, 94% of students who identified as Hispanic, and 89% of students who identified as Black or African American attended schools in locations where the 2019 ambient concentrations were above the WHO guideline for NO2 (10 μg m-3 or ~5.2 ppbv). Conversely, only 83% of students who identified as white and 82% of those who identified as Native American attended schools where the 2019 ambient NO2 concentrations were above the WHO guideline. Similar disparities are found in annually averaged ambient PM2.5 across racial and ethnic groups, where students who identified as white (95%) and Native American (83%) had the smallest percentages of students above the WHO guideline (5 μg m-3), compared to students who identified with minoritized groups (98-99%).
Furthermore, the disparity between white students and students from minoritized groups other than Native Americans is larger at higher PM2.5 concentrations. Students who attend schools where a higher percentage of students are eligible for free or reduced meals, which we use as a proxy for poverty, are also more likely to attend schools where the ambient air pollutant concentrations exceed WHO guidelines. These disparities also tend to increase in magnitude at higher concentrations of NO2 and PM2.5. We investigate the intersectionality of disparities across racial/ethnic and poverty lines by quantifying the mean difference between the lowest and highest poverty schools, and the most and least white schools in each state, finding that most states have disparities above 1 ppbv of NO2 and 0.5 μg m-3 of PM2.5 across both poverty and racial/ethnic lines. We also identify distinct regional patterns of disparities, highlighting differences between California, New York, and Florida. Finally, we highlight that disparities exist not only across an urban and non-urban divide but also within urban areas.
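
The abstract quotes the WHO NO2 guideline both as a mass concentration (10 μg m-3) and as a mixing ratio (~5.2 ppbv). The following is a minimal sketch of how the two can be related through the ideal gas law. The reference conditions (20 °C and 1 atm) are an assumption, since the abstract does not state them; they are chosen here because they reproduce the quoted ~5.2 ppbv. The function name and parameters are illustrative only.

```python
# Sketch: convert a gas mass concentration (ug m-3) to a mixing ratio (ppbv)
# using the ideal gas law. Reference conditions are ASSUMED (20 C, 1 atm);
# the abstract does not specify which conditions it used.

GAS_CONSTANT = 8.314462618   # J mol-1 K-1
NO2_MOLAR_MASS = 46.0055     # g mol-1


def ugm3_to_ppbv(conc_ugm3, molar_mass, temp_k=293.15, pressure_pa=101325.0):
    """Return the mixing ratio in ppbv for a mass concentration in ug m-3."""
    # Molar volume of an ideal gas at the chosen conditions, in L mol-1.
    molar_volume_l = GAS_CONSTANT * temp_k / pressure_pa * 1000.0
    # ug m-3 divided by g mol-1 gives umol m-3; multiplying by L mol-1
    # (i.e. 1e-3 m3 mol-1) gives parts per billion by volume.
    return conc_ugm3 * molar_volume_l / molar_mass


if __name__ == "__main__":
    print(f"{ugm3_to_ppbv(10.0, NO2_MOLAR_MASS):.2f} ppbv at 20 C, 1 atm")
    print(f"{ugm3_to_ppbv(10.0, NO2_MOLAR_MASS, temp_k=298.15):.2f} ppbv at 25 C, 1 atm")
```

The result depends on the assumed temperature and pressure, so a different reference state (for example 25 °C) gives a slightly different ppbv value for the same mass concentration.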




biomass burning
machine learning
environmental justice
air quality

