Dataset associated with "Design and Testing of a Low-Cost Sensor and Sampling Platform for Indoor Air Quality"

Jessica Tryner (1,2), Mollie Phillips (2), Casey W. Quinn (3), Gabe Neymark (2), Ander Wilson (4), Shantanu H. Jathar (1), Ellison Carter (5), John Volckens (1)*

1. Department of Mechanical Engineering, Colorado State University, 1374 Campus Delivery, Fort Collins, Colorado 80523
2. Access Sensor Technologies, 2401 Research Blvd, Suite 107, Fort Collins, Colorado 80526
3. NSG Engineering Solutions, 227 Central St NE, Olympia, Washington 98506
4. Department of Statistics, Colorado State University, 1801 Campus Delivery, Fort Collins, Colorado 80523
5. Department of Civil and Environmental Engineering, Colorado State University, 1372 Campus Delivery, Fort Collins, Colorado 80523

Associated article citation:
Tryner, J., Phillips, M., Quinn, C. W., Neymark, G., Wilson, A., Jathar, S. H., Carter, E., & Volckens, J. (2021). Design and Testing of a Low-Cost Sensor and Sampling Platform for Indoor Air Quality. Building and Environment, 206, 108398. https://doi.org/10.1016/j.buildenv.2021.108398

*Please contact John Volckens (john.volckens@colostate.edu) regarding this dataset.

See the manuscript for details on data collection, quality control, and analysis.

The first two files in this dataset, "01a_LICOR_LI820_20201007.txt" and "01b_LICOR_LI820_20201012.txt", include raw data from an LI-COR Biosciences LI-820 CO2 Gas Analyzer. These data were collected in the kitchen of a house in Fort Collins, Colorado, USA in October 2020. Line 1 in each file contains the local date and time in Fort Collins, Colorado, USA when the data log started. This time is shown in the format "YYYY-mm-dd at HH:MM", where YYYY is the year, mm is the month (01 to 12), dd is the day of the month (01 to 31), HH is the hour (00 to 23), and MM is the minute (00 to 59). Line 2 in each log file contains descriptive headers, with units, for the four columns of data listed below. Log data begin on line 3.

The "01a_LICOR_LI820_20201007.txt" and "01b_LICOR_LI820_20201012.txt" data files include the following columns:
1. Time(H:M:S) - local time (HH:MM:SS),
2. CO2(ppm) - carbon dioxide concentration (parts per million),
3. CellTemp(c) - temperature in the optical cell (Celsius),
4. CellPres(kPa) - pressure in the optical cell (kilopascals).

The second two files in this dataset, "02a_TSI_QTrak_7575X1818010.txt" and "02b_TSI_QTrak_7575X1818011.txt", include raw data logged by two TSI Incorporated QTrak 7575-X Indoor Air Quality Monitors with model 982 probes. These data were collected in the kitchen of a house in Fort Collins, Colorado, USA in October 2020. Line 31 in each log file contains descriptive headers for the data columns. The units for each column are listed on line 32. Log data begin on line 33.

The "02a_TSI_QTrak_7575X1818010.txt" data file, which contains measurements taken using the QTrak serial number 7575X1818010 and probe serial number P18150032, includes the following seven columns:
1. Date - local date (mm/dd/YYYY),
2. Time - local time (HH:MM:SS),
3. CO2 - carbon dioxide concentration (ppm),
4. T - ambient temperature (Celsius),
5. H - ambient relative humidity (%),
6. CO - carbon monoxide concentration (ppm),
7. BP - ambient barometric pressure (kilopascals).

The "02b_TSI_QTrak_7575X1818011.txt" data file, which contains measurements taken using the QTrak serial number 7575X1818011 and probe serial number P18150031, includes the following nine columns:
1. Date - local date (mm/dd/YYYY),
2. Time - local time (HH:MM:SS),
3. CO2 - carbon dioxide concentration (ppm),
4. T - ambient temperature (Celsius),
5. H - ambient relative humidity (%),
6. Dewpoint - dew point temperature (Celsius),
7. Wet bulb - wet-bulb temperature (Celsius),
8. CO - carbon monoxide concentration (ppm),
9. BP - ambient barometric pressure (kilopascals).
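
Because the LI-820 files described above store the date only once (on line 1) and a time of day on each data row, full timestamps have to be reconstructed when the files are read. The following is a minimal Python sketch of one way to do that, not the authors' processing code. The function names are hypothetical, and it assumes that line 1 contains the "YYYY-mm-dd at HH:MM" text somewhere in the line and that a time of day earlier than the previous row means midnight was crossed (so gaps longer than 24 hours would not be detected).

```python
import re
from datetime import datetime, timedelta


def parse_li820_start(first_line):
    """Extract the log start time from line 1 of an LI-820 file.

    Assumes the line contains text in the documented
    "YYYY-mm-dd at HH:MM" format (possibly surrounded by other text).
    """
    match = re.search(r"\d{4}-\d{2}-\d{2} at \d{2}:\d{2}", first_line)
    if match is None:
        raise ValueError("no 'YYYY-mm-dd at HH:MM' timestamp found")
    return datetime.strptime(match.group(0), "%Y-%m-%d at %H:%M")


def attach_dates(start, times):
    """Combine HH:MM:SS strings with the start date.

    The date is advanced by one day whenever the time of day decreases
    relative to the previous row (assumed to indicate midnight).
    """
    stamps = []
    day = start.date()
    previous = None
    for text in times:
        time_of_day = datetime.strptime(text, "%H:%M:%S").time()
        if previous is not None and time_of_day < previous:
            day += timedelta(days=1)
        stamps.append(datetime.combine(day, time_of_day))
        previous = time_of_day
    return stamps
```

For example, `attach_dates(parse_li820_start("2020-10-07 at 16:05"), ["23:59:30", "00:00:00"])` places the second reading on October 8. The resulting timestamps are local (not UTC).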
Third, the "03_TEOM.csv" data file includes raw data logged by a Tapered Element Oscillating Microbalance (1405 TEOM, ThermoFisher Scientific, Waltham, MA, USA). These data were collected in the kitchen of a house in Fort Collins, Colorado, USA in October 2020. Line 4 contains descriptive headers for the data columns. Log data begin on line 5.

The "03_TEOM.csv" data file includes the following columns:
1. Date - local date (mm/dd/YYYY),
2. Time - local time (HH:MM:SS),
3. tmoStatusCondition_0 - system status (0 if no warnings are present),
4. tmoTEOMAMC_0 - mass concentration (micrograms per cubic meter),
5. tmoTEOMAFilterLoad_0 - filter loading as a fraction of maximum (%),
6. tmoTEOMAFrequency_0 - oscillating frequency of the tapered element in the mass transducer (Hz),
7. tmoOperatingMode_0 - TEOM operating mode at time of data capture (1 = stabilizing, 2 = collecting data, 3 = computing data, 4 = fully operational),
8. tmoTEOMAFlowVolumetric_0 - the main flow rate (liters per minute),
9. tmoAmbientTemp_0 - the ambient temperature (Celsius),
10. tmoAmbientRH_0 - the ambient relative humidity (%).

Fourth, the "04_filter.csv" data file includes the masses of the 37-mm diameter polytetrafluoroethylene (PT37P-PF03, Measurement Technology Laboratories, Minneapolis, MN, USA) filters used to sample PM2.5 and PM10 with the Home Health Boxes and the ASPEN boxes installed inside the kitchen and outside the home in Fort Collins, Colorado, USA. All filter masses were measured in the Automated Air Analysis Facility (AIRLIFT) at Colorado State University in Fort Collins, CO, USA. For additional information on the AIRLIFT, see https://doi.org/10.4209/aaqr.210037.

The "04_filter.csv" data file includes the following columns:
1. ID - filter identification code,
2. Pre-1 (ug) - the first filter mass measurement taken prior to sampling (micrograms),
3. Pre-2 (ug) - the second filter mass measurement taken prior to sampling (micrograms),
4. Pre-3 (ug) - the third filter mass measurement taken prior to sampling (micrograms),
5. Post-1 (ug) - the first filter mass measurement taken after sampling (micrograms),
6. Post-2 (ug) - the second filter mass measurement taken after sampling (micrograms),
7. Post-3 (ug) - the third filter mass measurement taken after sampling (micrograms).

Fifth, the "05_activity_log.txt" data file describes the timing of various activities that took place in a home in Fort Collins, Colorado, USA between October 8, 2020 and October 15, 2020 UTC. The local time in Fort Collins was UTC-06:00 on these dates.

The "05_activity_log.txt" data file includes the following columns:
1. Start_Time (local) - the local date/time when the activity started (mm/dd/YYYY HH:MM),
2. Stop_Time (local) - the local date/time when the activity stopped (mm/dd/YYYY HH:MM),
3. Event - the category to which the activity was assigned ("Door/windows open", "AC", "HEPA", "HEPA+Ionizer", "Burner", or "Cooking"),
4. Description - a text description of the activity,
5. Confirmed - denotes whether the activity was included in home occupants' original activity log (1) or identified when time-series reference monitor data were reviewed with occupants after the experiment (0),
6. Stove_Oven - denotes whether the natural gas cooking burners present in the stovetop or oven were on (1) or off (0) during the given time period,
7. Central_AC - denotes whether the central air conditioning system was on (1) or off (0) during the given time period,
8. Windows_Door - denotes whether windows and/or the sliding glass door in the kitchen were open (1) or closed (0) during the given time period,
9. HEPA - denotes whether the high-efficiency particulate air (HEPA) filter portion of the portable air cleaner was on (1) or off (0) during the given time period,
10. Ionizer - denotes whether the bipolar ionizer in the portable air cleaner was on (1) or off (0) during the given time period.
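
The activity log uses local time, whereas "09_pollutant_data_processed.csv" uses UTC, so the two cannot be compared until one is converted. The sketch below shows one way to convert an activity-log timestamp to the UTC format used in the processed data file. It relies only on the UTC-06:00 offset and the "mm/dd/YYYY HH:MM" format stated above; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset stated in this README for the dates of the experiment.
LOCAL_TZ = timezone(timedelta(hours=-6))


def local_to_utc_string(local_text):
    """Convert 'mm/dd/YYYY HH:MM' local time to 'YYYY-mm-ddTHH:MM:SSZ' UTC."""
    local = datetime.strptime(local_text, "%m/%d/%Y %H:%M")
    local = local.replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

For example, `local_to_utc_string("10/08/2020 18:30")` returns `"2020-10-09T00:30:00Z"`. A fixed offset is used here instead of a named time zone because the README gives the offset explicitly; it should not be reused for dates outside the experiment.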
Sixth, the "06_Alphasense_calibration_data.csv" data file includes calibration coefficients provided by Alphasense for their electrochemical gas sensors.

The "06_Alphasense_calibration_data.csv" data file includes the following columns:
1. Sensor Type - the model number of the sensor (CO B4, NO2 B43F, or OX B431),
2. Board Type - the model number of the Alphasense Individual Sensor Board (ISB) on which the sensor was mounted (0 for CO B4 sensors; 2 for NO2 B43F and OX B431 sensors),
3. Gain (mV/nA) - the gain associated with the ISB on which the sensor was mounted (0.8 mV/nA for the type 0 ISBs on which the CO B4 sensors were mounted; -0.73 mV/nA for the type 2 ISBs on which the NO2 B43F and OX B431 sensors were mounted),
4. SENSOR NUMBER - the nine-digit serial number associated with the sensor,
5. WE Zero (mV) - the total working electrode zero offset, which is equal to the sum of the working electrode electronic offset on the individual sensor board and the sensor working electrode output in zero air (i.e., WE_T in Alphasense Application Note 803-05, where WE_T = WE_e + WE_0),
6. Aux Zero (mV) - the total auxiliary electrode zero offset, which is equal to the sum of the auxiliary electrode electronic offset on the individual sensor board and the sensor auxiliary electrode output in zero air (i.e., AE_T in Alphasense Application Note 803-05, where AE_T = AE_e + AE_0),
7. WE Sensor (nA/ppm) - the sensitivity of the sensor to the pollutant of interest (carbon monoxide for CO B4 sensors, nitrogen dioxide for NO2 B43F sensors, and ozone for OX B431 sensors),
8. Sensitivity (mV/ppb) - the sensitivity of the sensor to the pollutant of interest (equal to [Gain (mV/nA)] * [WE Sensor (nA/ppm)] / [1000 (ppb/ppm)]),
9. ELECTRONIC ZERO (WE) (mV) - the working electrode electronic offset on the individual sensor board (WE_e in Alphasense Application Note 803-05),
10. ELECTRONIC ZERO (AUX) (mV) - the auxiliary electrode electronic offset on the individual sensor board (AE_e in Alphasense Application Note 803-05),
11. NO2 Sensitivity (nA/ppm) - the sensitivity of each OX B431 sensor to nitrogen dioxide ([NO2 Sensitivity (mV/ppb)] = [Gain (mV/nA)] * [NO2 Sensitivity (nA/ppm)] / [1000 (ppb/ppm)]).

Seventh, the "07_HHB_CO2_model_coefficients.csv" data file includes the coefficients of linear mixed empirical calibration models fit to the low-cost nondispersive infrared (NDIR) Sensirion SCD30 carbon dioxide (CO2) sensor data using seven-fold cross validation. Each model had the form shown in Equation 10 of the manuscript:

c_{CO2,SCD30,ij} = alpha + a_i + (gamma * c_{CO2,LI-820,j}) + epsilon_{ij},

where c_{CO2,SCD30,ij} was a 1-hour average CO2 concentration reported by SCD30 sensor i at time j (ppm), alpha was a fixed intercept (ppm), a_i was a random intercept for SCD30 sensor serial number i (ppm), gamma was a fixed slope, c_{CO2,LI-820,j} was a 1-hour average CO2 concentration reported by the LI-COR Biosciences LI-820 CO2 Gas Analyzer at time j (ppm), and epsilon_{ij} was the random error (ppm).

The "07_HHB_CO2_model_coefficients.csv" data file includes the following columns:
1. CO2serial - the serial identification number associated with the Sensirion SCD30 sensor (a 20-digit number formatted as xxxxxxx-xxxxxxx-xxxxxx),
2. HHBserial - the serial identification number of the Home Health Box in which the given Sensirion SCD30 sensor was installed (formatted as HBXXXXX),
3. k - the fold for which the model coefficients were used to predict carbon dioxide concentrations (k = 1..7),
4. alpha - the fixed intercept for fold k (ppm),
5. a_i - the random intercept associated with CO2serial for fold k (ppm),
6. gamma - the fixed slope for fold k.
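
In the CO2 model above, the SCD30 reading is the response and the LI-820 reading is the predictor, so estimating a reference-equivalent concentration from an SCD30 reading requires solving the equation for c_{CO2,LI-820,j}. The sketch below shows that algebra with the error term dropped. This inversion is an assumption about how the coefficients might be applied; see the manuscript for the procedure actually used. The function and argument names are hypothetical.

```python
def corrected_co2_ppm(c_scd30_ppm, alpha, a_i, gamma):
    """Invert c_SCD30 = alpha + a_i + gamma * c_LI820 for c_LI820 (ppm).

    alpha, a_i, and gamma are taken from one row (one sensor, one fold k)
    of the coefficient file. The random error term is ignored.
    """
    if gamma == 0:
        raise ValueError("gamma must be nonzero to invert the model")
    return (c_scd30_ppm - alpha - a_i) / gamma
```

For instance, with the made-up coefficients alpha = 50 ppm, a_i = -10 ppm, and gamma = 1.25, a 1-hour average SCD30 reading of 650 ppm maps to (650 - 50 + 10) / 1.25 = 488 ppm.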
Eighth, the "08_HHB_CO2_model_coefficients.csv" data file includes the coefficients of linear mixed empirical calibration models fit to the low-cost electrochemical Alphasense B-series sensor data using seven-fold cross validation. Each model had the following form (as shown in Equations 7-9 in the manuscript):

c_j = alpha + a_i + (beta + b_i)(WE_{u,ij} - WE_{e,i}) + (gamma_T * T_{ij}) + (gamma_{RH} * RH_{ij}) + epsilon_{ij},

where c_j was a 1-hour average gas concentration measured at time j by the reference monitor(s) for the pollutant(s) detected by the sensor (ppb), alpha was a fixed intercept (ppb), a_i was a random intercept associated with sensor serial number i, beta was a fixed slope, b_i was a random slope associated with sensor serial number i, WE_{u,ij} was the 1-hour average uncorrected working electrode voltage logged by HHB i at time j, WE_{e,i} was the working electrode electronic offset on the individual sensor board for sensor i, gamma_T was a fixed slope (ppb/Celsius), T_{ij} was the temperature measured inside gas sensor housing i at time j (Celsius), gamma_{RH} was a fixed slope (ppb/%), RH_{ij} was the relative humidity measured inside gas sensor housing i at time j (%), and epsilon_{ij} was the random error (ppb).
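
As a rough illustration of how one row of coefficients could be applied, the following Python sketch evaluates the right-hand side of the model above with the error term omitted. The function and argument names are hypothetical. It assumes both electrode voltages are supplied in volts, since this file's column descriptions give beta and b_i in ppb/V; the electronic offsets in "06_Alphasense_calibration_data.csv" are listed in mV and would need to be converted first.

```python
def predict_gas_ppb(we_u_v, we_e_v, temp_c, rh_pct,
                    alpha, a_i, beta, b_i, gamma_t, gamma_rh):
    """Evaluate alpha + a_i + (beta + b_i)(WE_u - WE_e) + gamma_T*T + gamma_RH*RH.

    we_u_v  - 1-hour average uncorrected working electrode voltage (V, assumed)
    we_e_v  - working electrode electronic offset (V, assumed)
    temp_c  - temperature inside the gas sensor housing (Celsius)
    rh_pct  - relative humidity inside the gas sensor housing (%)
    Returns the predicted gas concentration in ppb.
    """
    return (alpha + a_i
            + (beta + b_i) * (we_u_v - we_e_v)
            + gamma_t * temp_c
            + gamma_rh * rh_pct)
```

With made-up inputs (alpha = 2, a_i = 1, beta = 100, b_i = 20, WE_u = 0.5 V, WE_e = 0.25 V, gamma_T = 0.5, T = 20, gamma_RH = 0.25, RH = 40), the terms are 3 + 30 + 10 + 10 = 53 ppb.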
The "08_HHB_CO2_model_coefficients.csv" data file includes the following columns:
1. sensor_type - the model number of the sensor (CO B4, NO2 B43F, or OX B431),
2. algorithm - the equation number that describes the model (see the associated manuscript for additional details),
3. HHBserial - the serial identification number of the Home Health Box (HHB) in which the given sensor was installed (formatted as HBXXXXX),
4. k - the fold for which the model coefficients were used to predict gas concentrations (k = 1..7),
5. alpha - the fixed intercept for fold k (ppb),
6. a_i - the random intercept associated with the HHB/sensor serial number i for fold k (ppb),
7. beta - the fixed slope for fold k (ppb/V),
8. b_i - the random slope associated with the HHB/sensor serial number i for fold k (ppb/V),
9. gamma_T - the fixed slope for fold k (ppb/Celsius),
10. gamma_RH - the fixed slope for fold k (ppb/%).

Ninth, the "09_pollutant_data_processed.csv" data file includes processed pollutant concentration data from reference monitors (one LI-COR Biosciences LI-820 CO2 Gas Analyzer, two TSI Incorporated QTrak 7575-X Indoor Air Quality Monitors with model 982 probes, one Thermo Environmental Instruments Model 42C Trace Level Chemiluminescence NO-NO2-NOx Analyzer, one Thermo Environmental Instruments Model 49C UV Photometric O3 Analyzer, and one ThermoFisher Scientific 1405 TEOM Tapered Element Oscillating Microbalance) and nine Home Health Boxes installed in the kitchen of a house in Fort Collins, Colorado, USA in October 2020. The reference NO2 and O3 concentrations listed in this data file differ from the raw data contained in the zip files associated with the Model 42C Trace Level Chemiluminescence NO-NO2-NOx Analyzer and Model 49C UV Photometric O3 Analyzer in that all concentrations listed in "09_pollutant_data_processed.csv" were adjusted using the linear calibration models for these instruments that were developed as described in Section S1.2.1 of the Supporting Information associated with the manuscript.
The "09_pollutant_data_processed.csv" data file includes the following columns:
1. DateTimeUTC - the UTC date and time (YYYY-mm-ddTHH:MM:SSZ),
2. monitor - the type of monitor from which the data were obtained ("Reference" or "HHB"),
3. pollutant - the pollutant ("CO2", "CO", "NO2", "O3", or "PM2.5"),
4. unit - the unit associated with the "concentration" value ("ppm" for CO2 and CO, "ppb" for NO2 and O3, and "mu*g~m^-3" for PM2.5),
5. interval - the interval at which the concentration values were logged or calculated (30 or 60 seconds for data logged by reference monitors, uncorrected CO2 concentrations obtained from the Sensirion SCD30 sensors in the HHBs, gas concentrations calculated from Alphasense electrochemical sensor data using Equations 1-4 from the manuscript, and PM2.5 concentrations obtained by correcting low-cost sensor data to gravimetric filter samples; 3600 seconds for CO2, CO, NO2, and O3 concentrations calculated using empirical linear mixed calibration models),
6. concentration - the concentration of the pollutant in the specified unit,
7. HHBserial - the serial identification number of the Home Health Box (HHB) from which the data were obtained (NA if monitor type is "Reference"),
8. algorithm - the equation number used to calculate the concentration (see the associated manuscript for additional details; NA if monitor type is "Reference"; -1 for uncorrected CO2 concentrations logged directly from the Sensirion SCD30 sensors in the Home Health Boxes).
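
Because the processed data file is in long format (one row per timestamp, monitor, pollutant, and algorithm), most analyses start by selecting a subset of rows. The following is a small Python sketch of that step using only the column names listed above. The sample rows, including the serial number "HB00001" and all concentration values, are made up for illustration and are not taken from the dataset; the function names are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Made-up rows that follow the documented column layout (illustration only).
SAMPLE = """DateTimeUTC,monitor,pollutant,unit,interval,concentration,HHBserial,algorithm
2020-10-09T00:00:00Z,Reference,CO2,ppm,60,612.4,NA,NA
2020-10-09T00:00:00Z,HHB,CO2,ppm,60,640.1,HB00001,-1
2020-10-09T00:00:00Z,HHB,CO2,ppm,3600,598.7,HB00001,10
"""


def parse_utc(text):
    """Parse a 'YYYY-mm-ddTHH:MM:SSZ' string into a timezone-aware datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def select_rows(rows, monitor, pollutant, interval_s):
    """Keep rows matching a monitor type, pollutant, and interval (seconds)."""
    return [
        row for row in rows
        if row["monitor"] == monitor
        and row["pollutant"] == pollutant
        and float(row["interval"]) == interval_s
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
hourly_hhb_co2 = select_rows(rows, "HHB", "CO2", 3600)
```

To read the real file, `io.StringIO(SAMPLE)` would be replaced with an open file handle for "09_pollutant_data_processed.csv". Note that "NA" values arrive as plain strings with the `csv` module and must be handled before converting the algorithm or HHBserial columns.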
This dataset also includes four .zip files:
42C.zip - Raw data recorded by the Thermo Environmental Instruments Model 42C Trace Level Chemiluminescence NO-NO2-NOx Analyzer (eight .txt data files and one README.txt file)
49C.zip - Raw data recorded by the Thermo Environmental Instruments Model 49C UV Photometric O3 Analyzer (eight .txt data files and one README.txt file)
HHB.zip - Raw data recorded by nine Home Health Boxes (25 .txt data files, a README.txt file, and a "Home Health Box Log File Legend.pdf" file)
ASPEN.zip - Raw data recorded by the ASPEN box (two .txt data files, a README.txt file, and an "ASPEN Box Log File Legend.pdf" file)

Recommended citation:
Tryner, J., Phillips, M., Quinn, C. W., Neymark, G., Wilson, A., Jathar, S. H., Carter, E., & Volckens, J. 2021. Dataset associated with "Design and Testing of a Low-Cost Sensor and Sampling Platform for Indoor Air Quality". Colorado State University. Libraries. http://dx.doi.org/10.25675/10217/233921

Data license:
The material is open access and distributed under the terms and conditions of the Creative Commons Public Domain "No rights reserved" (https://creativecommons.org/share-your-work/public-domain/cc0/).