Automating the interpretation of PM2.5 time‐resolved measurements using a data‐driven approach

Date Published
12/2020
Publication Type
Journal Article
Authors
DOI
10.1111/ina.12780
Abstract

The rapid development of automated measurement equipment enables researchers to collect greater quantities of time-resolved data from indoor and outdoor environments. Although these data are valuable, interpreting them can be time-consuming. This paper introduces an automated process for interpreting time-resolved PM2.5 data and differentiating PM2.5 emissions from indoor and outdoor sources. We use Random Forest (RF), a machine learning approach, to study a dataset of 836 indoor emission events that occurred over a 2-week period in 18 apartments in California. We describe the model's development and evaluate its performance as the sample size and source vary. We discuss which characteristics of the dataset tended to help source identification, and why. For example, we show that data from many events and from different apartments are essential for the model to be suitable for analyzing a new, separate dataset. We also show that longitudinal data appear to be more helpful than the time frequency of measurements within a given apartment. We use the resulting RF model to analyze PM2.5 data from an entirely separate dataset collected from 65 new homes in California. The RF model identifies 442 indoor emission events, with only a few misidentifications.

Journal
Indoor Air
Volume
31
Year of Publication
2021
Issue
3
Pagination
860 - 871
ISSN Number
0905-6947
URL
Short Title
Indoor Air