January 5th, 2017
README to accompany python code used to produce figures and results included in the following publication:

Lassman, W., Ford, B., Gan, R.W., Pfister, G., Magzamen, S., Fischer, E.V., Pierce, J.R.; "Spatial and temporal estimates of population exposure to wildfire smoke during the Washington State 2012 wildfire season using blended model, satellite, and in-situ data", GeoHealth, In Review

*****************************************************************************************************
README prepared by William Lassman.
For questions, contact wlassman@atmos.colostate.edu
*****************************************************************************************************
Purpose of the README
-This document will accompany the processed* data and python code used to do the analysis and prepare figures
in the cited publication
-This document will explain data, include metadata for saved data, and will also give a brief overview of
the code
-The code is fairly well-commented, so only a brief overview is given here


*All raw data used in this analysis are available freely on the internet

-Measurements: EPA AQS, Washington State Department of Ecology
-Satellites: NASA MODIS Level 2 Deep Blue AOD product
-Model: WRF-Chem, available from National Center for Atmospheric Research (NCAR)

-Processing code was written in python using only open-source, freely available modules and packages.
*****************************************************************************************************
Description of the data

Data are stored as python save files (numpy '.npz' archives, which use 'pickle' for python objects) to make reading/manipulating data easier.
They are contained in the 'Data' subdirectory, and labelled as follows:

MODIS_regrid.npz   -MODIS AOD results after post-processing (compositing, gap filling, and regridding).
		   -Two numpy arrays contain the data:
		   	-mAOD - Lat x Lon x Time - Gridded AOD observations for each day
			-mAODsites - Sites x Time - AOD observations co-located with each surface monitor
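
A minimal sketch of reading this file with numpy, assuming the arrays are stored under the keys 'mAOD' and 'mAODsites' named above. The helper names load_modis and write_demo_archive are made up for illustration and are not part of the repository; the demo writes a small synthetic archive so the example runs without the real data.

```python
import os
import tempfile

import numpy as np


def load_modis(path):
    """Return (mAOD, mAODsites) from an .npz archive laid out as described above."""
    with np.load(path) as archive:
        m_aod = archive["mAOD"]             # lat x lon x time
        m_aod_sites = archive["mAODsites"]  # sites x time
    # Both arrays should share the same time axis (their last dimension).
    if m_aod.shape[-1] != m_aod_sites.shape[-1]:
        raise ValueError("gridded and site arrays have different time lengths")
    return m_aod, m_aod_sites


def write_demo_archive(path, nlat=3, nlon=4, nsites=2, ntime=5):
    """Write a small synthetic archive with the same array names (not real data)."""
    rng = np.random.default_rng(0)
    np.savez(
        path,
        mAOD=rng.random((nlat, nlon, ntime)),
        mAODsites=rng.random((nsites, ntime)),
    )


# Demo on synthetic data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "MODIS_regrid_demo.npz")
    write_demo_archive(demo_path)
    grid, sites = load_modis(demo_path)
    print(grid.shape, sites.shape)
```

With the real data, the call would presumably be load_modis('Data/MODIS_regrid.npz'). PyKrige.npz and WRF-Chem.npz can likely be inspected the same way through their own array names.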

			
PyKrige.npz	   -Dataset that is produced by ordinary kriging of observations.
		   -Two numpy arrays contain the results:
		   	-kPM - Lat x Lon x Time - Gridded estimates of PM from kriging, including all surface monitors
			-kPMsites - Sites x Time - LOOCV estimates of PM co-located with measurements \
			(i.e. closest surface monitor removed from the kriging estimate)


sites.npz	   -In-situ measurement sites, stored using a homemade python class 'SurfSite.py' format. (see code)
		   -site lons/lats and other metadata are stored as part of this class structure.
		   -Any script that loads these data includes code demonstrating how to unpack them.
		   -This file also contains the lat/lon arrays for the grid (2-D numpy arrays),..\
		    as well as a 1-D array containing Julian day


WRF-Chem.npz		-Dataset containing surface PM2.5 concentrations output from WRF-Chem sims..
			.. and consolidated to 24-hour averages
			-Two numpy arrays:
				-wPM - gridded estimates of PM2.5 (lats x lons x times)
				-wPMsites - estimates co-located at surface sites (sites x times)
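
The 24-hour averaging step can be sketched as follows, assuming hourly model output with time on the last axis and a record that starts at the beginning of a day. Neither assumption is stated in this README, and daily_mean is a hypothetical helper, not a function from the repository.

```python
import numpy as np


def daily_mean(hourly, hours_per_day=24):
    """Average the last (time) axis of `hourly` in consecutive 24-hour blocks."""
    hourly = np.asarray(hourly, dtype=float)
    ntime = hourly.shape[-1]
    if ntime % hours_per_day != 0:
        raise ValueError("time axis length must be a multiple of hours_per_day")
    ndays = ntime // hours_per_day
    # Split the time axis into (day, hour) and average over the hour axis.
    blocks = hourly.reshape(hourly.shape[:-1] + (ndays, hours_per_day))
    return blocks.mean(axis=-1)


# Synthetic example: a 2 x 2 grid with 48 hourly values per cell (two days).
hourly_pm = np.tile(np.arange(48.0), (2, 2, 1))
daily_pm = daily_mean(hourly_pm)
print(daily_pm.shape)  # (2, 2, 2)
```

The same function works for site-based arrays (sites x times), since only the last axis is reshaped.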


*****************************************************************************************************
Description of the code

LIST OF PYTHON SCRIPTS THAT PREPARE DATA

npz2nc.py			# Used to prepare the repository, shouldn't be relevant for you.
				# All this does is write the 'AllData.nc' NetCDF File

nc2npz.py			# Reads 'AllData.nc' NetCDF File




LIST OF PYTHON SCRIPTS TO DO ANALYSIS


KrigeSurfData.py		# Read surface data from sites.npz file, use PyKrige to make kPM. ..\
				# ..\ Then does LOOCV to produce kPMsites. Compares results to ..\
				# ..\surface observations and saves output to PyKrige.npz
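
The leave-one-out idea can be sketched without PyKrige. The example below is not the kriging code in KrigeSurfData.py: it swaps in a simple inverse-distance-weighted interpolator (idw_estimate, a hypothetical stand-in) so that it runs with the standard library only, and it uses made-up site values. Each site is withheld in turn and predicted from the remaining sites.

```python
import math


def idw_estimate(x, y, known, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (xi, yi, value) tuples."""
    num = 0.0
    den = 0.0
    for xi, yi, value in known:
        dist = math.hypot(x - xi, y - yi)
        if dist == 0.0:
            return value  # target coincides with a known point
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def loocv(sites, interpolate=idw_estimate):
    """Leave-one-out estimates: predict each site from all the other sites."""
    estimates = []
    for i, (x, y, _) in enumerate(sites):
        others = sites[:i] + sites[i + 1:]
        estimates.append(interpolate(x, y, others))
    return estimates


# Hypothetical sites as (x, y, PM value); these are not real measurements.
sites = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0), (1.0, 1.0, 40.0)]
predicted = loocv(sites)
observed = [value for _, _, value in sites]
rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(sites))
print(predicted, rmse)
```

An ordinary-kriging interpolator with the same call signature could be passed through the interpolate argument in place of the stand-in.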


EvalBlends.py			# Calculates GRR and GWR blends (as well as obsolete PBP blend) at ..\
				# measurement sites, and ..\
				# ..\ uses LOOCV to evaluate blends then makes 1-to-1 plots and map ..\
				# ..\ plots using Basemapplots.py module


PrintBlendsforHealth.py		# Calculates GRR and GWR blends on regular grid and prints to ..\
				# ..\ NetCDF, with option to make map plots of output.


eval_WRF-Chem.py		# Reads WRF-Chem output and surface observations, does comparisons ..\
				# and produces figures

eval_MODIS.py			# Reads MODIS and surface observations, does comparisons ..\
				# and produces figures


MakeBootstrapEsts.py		# Redoes kriging, GRR, and GWR with variable number of sites, ..\
				# ..\ randomly selecting which monitor locations to include. ..\
				# ..\ Saves output to .npz files in Code/BootstrapData subdirectory
				#
				# NOTE: must input number of sites removed and number of trials
				# from the command line. Please refer to comments in this script.
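
A minimal sketch of the command-line handling and random site selection described above. The argument names and order (n_removed, n_trials) and the helper draw_trials are assumptions for illustration; the comments in MakeBootstrapEsts.py describe the actual interface.

```python
import argparse
import random


def parse_args(argv=None):
    """Parse the two required inputs (hypothetical names and order)."""
    parser = argparse.ArgumentParser(description="Withhold random monitors (sketch).")
    parser.add_argument("n_removed", type=int, help="number of sites to withhold")
    parser.add_argument("n_trials", type=int, help="number of random trials")
    return parser.parse_args(argv)


def draw_trials(n_sites, n_removed, n_trials, seed=None):
    """Return, for each trial, the sorted indices of the sites that are kept."""
    if not 0 <= n_removed < n_sites:
        raise ValueError("n_removed must be between 0 and n_sites - 1")
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Sample without replacement the sites to withhold in this trial.
        removed = set(rng.sample(range(n_sites), n_removed))
        trials.append([i for i in range(n_sites) if i not in removed])
    return trials


# Equivalent to passing "3 5" on the command line (values chosen for illustration).
args = parse_args(["3", "5"])
for kept in draw_trials(n_sites=10, n_removed=args.n_removed, n_trials=args.n_trials, seed=0):
    print(kept)
```

Each list of kept indices would then select the monitors passed to the kriging and blending steps for that trial.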


BootstrapAnalysis.py		# Reads output from MakeBootstrapEsts.py and makes a plot showing ..\
				# ..\ how the methods perform as surface monitors are withheld.



LIST OF PYTHON MODULES USED TO SUPPORT THE SCRIPTS

SurfSite.py		# Introduces the Surface Site class to organize data. sites.npz stores the data ..\
			# ..\ in this format, so this module is needed to run the scripts that use this file

Basemapplots.py		# Code used to produce map plots. Leans on the python Basemap module ..\
			# ..\ (freely available) and recycles code that implements basemap to ..\
			# ..\ produce the actual map plots

udf.py			# 'User Defined Functions' (udf). Contains other helper functions that are ..\
			# ..\ used a lot. This version only contains the haversine function ..\
			# ..\ for calculating distances on the surface of a sphere
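
For reference, a minimal version of a haversine distance function is sketched below. It is not copied from udf.py: the argument order, the units (degrees in, kilometers out), and the Earth radius of 6371 km are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed value


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


# Distance between two points on the equator, half the globe apart.
print(haversine(0.0, 0.0, 0.0, 180.0))
```

The result is in the same units as radius, so passing a radius in meters returns meters.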







