Statistical Analysis of Norman Temperature
in the Reanalysis Data

A deep understanding of statistics is not required here. The purpose of this task is to introduce some of Python's simple statistical analysis capabilities and to show how Python can assist you in data preparation. The Python plotting package matplotlib is also introduced.

In regress.zip you will find a completed project to "predict" the 2-meter temperature in Norman from a few other coincident variables. The prediction is a nowcast, as if we have "lost" our thermometer and we want to know the current temperature.

Your task is to modify the nowcast project into a forecast project. You will need to write a simple Python script that appends the temperature 24 hours hence to the line (record) containing the current data.
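One way to approach the appending step is sketched below. It is only an illustration under stated assumptions: records are whitespace-separated, one per line, evenly spaced in time with no gaps, and the number of records that spans 24 hours (here 4, as if the data were 6-hourly) is something you must check against the actual file. The function name, the column index and the sample values are all hypothetical.

```python
def append_future_temperature(lines, t_col, steps_ahead):
    """Append to each record the temperature found `steps_ahead` records later.

    Assumes the records are evenly spaced in time with no missing times,
    so that a fixed record offset corresponds to a fixed time offset.
    """
    records = [line.split() for line in lines if line.strip()]
    out = []
    # Pair each record with the one `steps_ahead` later. The final
    # `steps_ahead` records have no future value and are dropped by zip.
    for now, later in zip(records, records[steps_ahead:]):
        out.append(" ".join(now + [later[t_col]]))
    return out


# Made-up sample: columns are (hour, t), with 6 hours between records.
sample = ["0 280.0", "6 285.0", "12 290.0", "18 286.0", "24 281.0", "30 284.0"]
for record in append_future_temperature(sample, t_col=1, steps_ahead=4):
    print(record)
```

If the time series has gaps, a fixed record offset no longer means 24 hours, and the pairing would have to be done by matching time stamps instead.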

The data preparation for the nowcast is contained within usewgribh.tar.gz. The data was put together from the reanalysis data, using a modified form of wgrib, dubbed wgrib_hacked.c. The modified code allows for efficient extraction of a time series of data from a single point in the reanalysis grid. There is also a short utility script, merge_col.py, that builds up a time series of multiple variables from files containing time series of a single variable. That script could easily be modified to help with the data preparation portion of the task, but doing so certainly isn't necessary.
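For readers who have not yet opened the archive, the kind of column merge described above might look roughly like the following. This is not the code in merge_col.py; it is a hypothetical sketch that assumes each single-variable series consists of lines of the form "timestamp value", and the time stamp format shown is invented for the example.

```python
def merge_columns(series_list):
    """Merge single-variable time series into multi-variable records.

    Each series is a list of "timestamp value" lines (an assumed layout).
    Only time stamps present in every series are kept, in the order in
    which they appear in the first series.
    """
    # One lookup table per variable: time stamp -> value (kept as text).
    tables = [
        dict(line.split()[:2] for line in series if line.strip())
        for series in series_list
    ]
    merged = []
    for line in series_list[0]:
        if not line.strip():
            continue
        stamp = line.split()[0]
        if all(stamp in table for table in tables):
            merged.append(" ".join([stamp] + [table[stamp] for table in tables]))
    return merged


# Made-up values for two variables at two times.
t_series = ["1990010100 280.1", "1990010106 284.3"]
u_series = ["1990010100 3.2", "1990010106 -1.0"]
print(merge_columns([t_series, u_series]))
```

Matching on the time stamp rather than on line position means that a missing record in one file cannot silently shift the columns against each other.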

Here is what is required for the task:

  1. Scatter plots (like those below) for the prediction of the +24 hr temperature, using
    • all the predictors: t,u,v,cy,sy,morn
    • only the current temperature: t
    • pure persistence (meaning manually setting the coefficient for t equal to 1, and all others 0). (You will find that in this rather dopey exercise, your forecast models don't beat persistence by very much.)
  2. A time series plot, similar to those below, comparing the 12 Z observed temperature with the 12 Z predicted temperature. (The predicted temperature should be the most skillful prediction, the one made using all the predictors.)
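
The three predictions in item 1 can be sketched with NumPy least squares as follows. This is not the method used in regress.zip, which may differ; it assumes the predictors are held in an array X with one column each for t, u, v, cy, sy and morn (t first, an assumed ordering) and that y holds the temperature 24 hours later. All function names are hypothetical. matplotlib is imported only inside the plotting function so that the fitting can be run without it.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares coefficients for y ~ X @ w + b; the intercept b is last."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(X, coef):
    """Apply coefficients from fit_linear (intercept last) to predictors X."""
    return np.column_stack([X, np.ones(X.shape[0])]) @ coef


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def compare_models(X, y, t_index=0):
    """Return predictions of y from all predictors, from t only, and persistence."""
    t = X[:, [t_index]]  # keep 2-D so it can be used as a one-column predictor
    return {
        "all predictors": predict(X, fit_linear(X, y)),
        "t only": predict(t, fit_linear(t, y)),
        # Persistence: coefficient 1 on t, 0 on everything else (intercept too).
        "persistence": X[:, t_index].copy(),
    }


def scatter_plots(preds, y, filename="scatter.png"):
    """Save one predicted-versus-observed scatter panel per model."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(preds), figsize=(4 * len(preds), 4),
                             squeeze=False)
    for ax, (name, p) in zip(axes[0], preds.items()):
        ax.scatter(y, p, s=4)
        ax.plot([y.min(), y.max()], [y.min(), y.max()], "k-")  # perfect-forecast line
        ax.set_xlabel("observed +24 hr temperature")
        ax.set_ylabel("predicted +24 hr temperature")
        ax.set_title(f"{name} (RMSE {rmse(p, y):.2f})")
    fig.tight_layout()
    fig.savefig(filename)
```

Note that when the error is measured on the same data used for fitting, the fitted models can never do worse than persistence, because persistence is itself one possible choice of coefficients. A fairer comparison would fit on one part of the record and score on another.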


Here are some time series plots of observations, plotted using matplotlib:


Here are the results of the nowcast:


Here is what my plot looks like for the second part of the task: