13.7. Temperature Data Project

This project will demonstrate:

  1. Importing data into MATLAB
  2. Fixing out-of-range data
  3. Working with data containing dates
  4. Using statistical moving window functions

These instructions are partly a tutorial and partly my notes from working through the project for the first time, so you might need to make some adjustments.

The output of the assignment will be a plot. Save the plot to a ‘png’ picture file and upload the picture to Canvas.
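
Once the final plot is displayed, one way to write it to a PNG file is the saveas function. The following is a minimal sketch: the plot is only a placeholder so that the example runs on its own, and the file name temperature_plot.png is a hypothetical choice.

```matlab
% Placeholder plot so the example runs on its own. In the project,
% the figure to save is the final smoothed temperature plot.
fig = figure;
plot(1:10, (1:10).^2);
title('Placeholder plot');

% Hypothetical file name; saveas chooses the PNG format from the extension.
saveas(fig, 'temperature_plot.png');
```

If the plot is already the current figure, saveas(gcf, 'temperature_plot.png') does the same job.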

13.7.1. Part 1: Getting the Data

Search Google for hourly temperature data:

  • Quality Controlled Datasets - National Climatic Data Center - NOAA https://www.ncdc.noaa.gov/crn/qcdatasets.html
  • Hourly02 directory has what we want. The folders and files are organized by year and reporting station. Two stations are in KS (Manhattan, Oakley). Download the Manhattan data for the year of your choice.
  • From the documentation: field 4 is the Local Standard Time (LST) date, field 5 is the LST time of the observation, and field 10 is the average air temperature, in degrees C, for the entire hour.
  • Note that each file starts at the beginning of the year in UTC time, so in local time the first rows are the last hours of the previous year.
  • Use the Import Data tool. Rename and import fields 4, 5, and 10 as Date, Hour, and Temp. The commands below assume these three columns are in a table named tempData.
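
The import can also be scripted instead of done through the Import Data tool. The following is a sketch, not the method used in these notes. It assumes that the downloaded file has no header row and that its columns are separated by runs of spaces. The function name import_crn_hourly is hypothetical.

```matlab
function tempData = import_crn_hourly( fileName )
%IMPORT_CRN_HOURLY Hypothetical helper: read fields 4, 5, and 10.
%   Assumes a text file with no header row and space-separated columns.

    raw = readtable(fileName, 'FileType', 'text', ...
        'ReadVariableNames', false, ...
        'Delimiter', ' ', 'MultipleDelimsAsOne', true);

    % Keep the LST date, LST time, and hourly average temperature,
    % then give them the names used in the rest of this project.
    tempData = raw(:, [4 5 10]);
    tempData.Properties.VariableNames = {'Date', 'Hour', 'Temp'};
end
```

Call it with the name of the file you downloaded, for example tempData = import_crn_hourly('myfile.txt'), where myfile.txt stands in for the real file name.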

View a sample of the data

>> tempData(1:6,:)
ans =
6x3 table
       Date       Hour    Temp
    __________    ____    _____
    2.0151e+07    1900    -1.8
    2.0151e+07    2000    -3.3
    2.0151e+07    2100    -4.4
    2.0151e+07    2200    -5.7
    2.0151e+07    2300    -3.7
    2.016e+07       0    -3.7

Remove the big table, since tempData now holds the fields we need

>> clear CRNH02032016KSManhattan6SSW

Take a quick look at the data

>> plot(tempData.Temp)

The data has some very large negative values.

Check for missing data – all there, good

>> any(ismissing(tempData.Temp))
ans =
logical
    0

For convenience, copy the temperature column to its own variable

>> temp = tempData.Temp;

>> min(temp)
ans =
    -9999
>> max(temp)
ans =
    38.8000  -- about 101.8 F (probably okay)
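
To see how many readings carry the -9999 value and where the runs of them begin, a short helper can count them. This is a sketch only, and the function name find_bad_runs is hypothetical.

```matlab
function [nBad, runStarts] = find_bad_runs( temp )
%FIND_BAD_RUNS Hypothetical helper: count -9999 readings and find runs.
%   nBad      - number of readings equal to -9999
%   runStarts - index where each run of consecutive -9999 values begins

    isBad = (temp(:) == -9999);          % column mask of marker values
    nBad = nnz(isBad);
    % A run starts wherever the mask steps from 0 to 1.
    runStarts = find(diff([0; double(isBad)]) == 1);
end
```

For example, [nBad, runStarts] = find_bad_runs(temp) reports the count and the starting row of each cluster.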

It looks like there are a few clusters of -9999 temps, probably when no reading was taken. Replacing each one with the last good reading is probably the best that we can do. The following function cleans the data.

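function goodTemp = fix_badTemps( temp )
%FIX_BADTEMPS - change out-of-range temp data to previous good value.
%   Readings below -30 (such as the -9999 marker) count as bad. Bad
%   readings ahead of the first good one take the first good value.

    goodTemp = temp;
    firstGood = find(goodTemp >= -30, 1);
    if isempty(firstGood)
        return                      % no good reading to copy from
    end
    goodTemp(1:firstGood-1) = goodTemp(firstGood);
    bad = find(goodTemp < -30);
    for b = bad(:)'                 % ascending, so b-1 is already good
        goodTemp(b) = goodTemp(b-1);
    end
end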
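
The same idea can be written without a loop by marking the out-of-range readings as NaN and letting fillmissing carry the previous value forward. This is an alternative sketch rather than the function used in these notes, and the name fill_badTemps is hypothetical. It assumes a MATLAB release that has fillmissing (R2016b or later).

```matlab
function goodTemp = fill_badTemps( temp )
%FILL_BADTEMPS Hypothetical loop-free variant of the cleaning step.
%   Readings below -30 are replaced by the previous good reading.

    goodTemp = temp;
    goodTemp(goodTemp < -30) = NaN;                 % mark bad readings
    goodTemp = fillmissing(goodTemp, 'previous');   % carry last good value forward
end
```

Note that bad readings before the first good one stay NaN with this approach, because there is no previous value to copy.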

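Apply the function to the table, refresh temp, and test the data:

>> tempData.Temp = fix_badTemps(tempData.Temp);
>> temp = tempData.Temp;
>> min(tempData.Temp)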
ans =
    -27

Save the clean data

>> writetable(tempData, 'EastKS_temperatures16.csv')

13.7.2. Part 2: The Plotted X-axis

Now for the date and time. The following function returns a datetime value from the numeric dates and times given:

function dt = get_date_time( dateNum, timeNum )
%GET_DATE_TIME Convert numeric date and time to datetime
%   dateNum such as: 20151231
%   timeNum such as: 1900

    % Short names avoid shadowing the built-in year, month, day,
    % hour, and date functions.
    yr = floor(dateNum/10000);
    mo = floor((dateNum - yr*10000)/100);
    dy = dateNum - yr*10000 - mo*100;
    hr = timeNum/100;
    dt = datetime(yr,mo,dy,hr,0,0);
end

The following makes an okay initial plot:

>> dates = get_date_time(tempData.Date, tempData.Hour);
>> plot(dates, tempData.Temp)
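
As a cross-check on get_date_time, datetime can also read the numeric date directly through its 'ConvertFrom' option, with the hour added as a duration. This is a sketch with made-up sample values in the same numeric form as the Date and Hour columns.

```matlab
% Sample values for illustration only, in the same form as the table columns.
dateNum = [20151231; 20151231; 20160101];
timeNum = [1900; 2300; 0];

% 'yyyymmdd' tells datetime to read each number as year, month, and day.
% The time is stored as hour*100, so dividing by 100 gives the hour.
dates = datetime(dateNum, 'ConvertFrom', 'yyyymmdd') + hours(timeNum/100);
disp(dates)
```

Both approaches should give the same datetime values, so either can be used for the x-axis.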

13.7.3. Part 3: Statistical Analysis

In the following code, we find the mean (average), maximum, and minimum temperatures over a 24-hour moving window, which stand in for daily statistics. Then, to smooth out random fluctuations, each series is averaged over a six-week moving window (6 × 7 × 24 = 1008 hourly samples). This is probably a larger window than is desired, so experiment with shortening it.

% convert to Fahrenheit
temps = tempData.Temp*9/5 + 32;
% plot(dates,temps);
%% Daily stats
dAve = movmean(temps, 24);
dMax = movmax(temps, 24);
dMin = movmin(temps, 24);
% 6 week filters
ave6w = movmean(dAve, 1008);
max6w = movmean(dMax, 1008);
min6w = movmean(dMin, 1008);
plot(dates, ave6w, 'LineWidth', 3);
hold on
plot(dates, max6w, 'r', 'LineWidth', 2);
plot(dates, min6w, 'g', 'LineWidth', 2);
hold off
legend('Daily Average','Daily Max','Daily Min','Location','northwest');
ylabel('Temperature, {}^{\circ}F');
title('Smoothed Temperature Data');
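
To try shorter windows, it helps to compute the window length from the number of weeks. The following sketch compares a six-week and a two-week window. It uses a synthetic hourly series, not the NOAA data, so that it runs on its own. The variable names and the two-week choice are only for illustration.

```matlab
% Synthetic hourly series for illustration only (not the NOAA data):
% a yearly cycle plus a daily cycle, in degrees F.
t = (0:366*24-1)';
temps = 55 + 25*sin(2*pi*(t - 2500)/(365*24)) + 10*sin(2*pi*t/24);

hoursPerWeek = 7*24;                     % 168 hourly samples per week
dAve = movmean(temps, 24);               % 24-hour moving average

% Compare the six-week window with a shorter two-week window.
ave6w = movmean(dAve, 6*hoursPerWeek);   % 1008 samples
ave2w = movmean(dAve, 2*hoursPerWeek);   % 336 samples

plot(t/24, ave6w, t/24, ave2w);
legend('6-week window', '2-week window', 'Location', 'northwest');
xlabel('Day of record');
ylabel('Temperature, {}^{\circ}F');
```

With the real data, replace the synthetic temps with the Fahrenheit temperatures and plot against dates.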