CHAPTER 4
Analysis of Results
4.1 Prediction Tool Development
The objective of this analysis is to develop a simplified tool to predict peak occupancy rates. The relationships between number of rooms, average occupancy and peak occupancy that were evident in the recorded data provide the basis for this analysis. The predictions from this simplified tool are validated by comparison with data from the NCAR facility and from a separate research project of the Energy Center of Wisconsin.
4.2 Occupancy Data Curve-fits: Average Decile Statistics
The analysis of the occupancy data begins with the Average Decile procedure results of 120 data triads of daily "8to5" data. This data is summarized in Table 2.4, showing the peak and counts (number of sets in each cell) by average decile and set size. The tendencies are for the peak to decrease as the set size increases, and to increase as the average decile increases.
Both of these tendencies are expected. As set size increases, the likelihood of extreme occupancy conditions, i.e. "full occupancy", is reduced since such an event would require extreme temporal coincidence. An analogy would be the likelihood that "when everyone flips a coin, every one comes up heads". As the number of those involved increases, there is a reduced chance that all those coincidences will occur at the same time.
The increase in peak as average decile increases is also anticipated, in part because the peak will always be greater than the average. In addition, greater average occupancy means more total "occupant-hours" to distribute over the available hours, and also a greater likelihood of extreme occupancy conditions. To continue with the analogy, if, instead of 50%, the probability of coming up heads were 75%, then "every one comes up heads" would be much more likely.
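The coin analogy can be made concrete with a short calculation. This is only an illustrative sketch: it assumes each "coin" is independent, which real occupants are not, and the function name `prob_all_heads` is introduced here for illustration.

```python
def prob_all_heads(n_coins: int, p_heads: float) -> float:
    """Probability that n independent coins all come up heads."""
    if n_coins < 1:
        raise ValueError("n_coins must be at least 1")
    if not 0.0 <= p_heads <= 1.0:
        raise ValueError("p_heads must be between 0 and 1")
    # Independence assumption: the joint probability is the product.
    return p_heads ** n_coins


# More coins make "all heads" less likely; a higher chance of heads
# makes it more likely for any fixed number of coins.
for n in (2, 10, 50):
    print(n, prob_all_heads(n, 0.50), prob_all_heads(n, 0.75))
```

Both tendencies described above appear in the output: the probability falls as the number of coins grows, and for any fixed number of coins it is larger at 75% than at 50%.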
It is the combination of these two tendencies which indicates that a starting place for predicting peak occupancy is to have both set size and average occupancy rate as the independent variables.
The scatter of the data over a range of set sizes, and the low number of sets in each cell, make this data questionable for developing the proposed prediction tool. However, grouping the data by average decile and generating a linear regression fit for each group provides a slope (m) and intercept (b) value for each decile. The resulting equation has the form:
Eq. 4.1: Peak = Min (1.0, m(AvgDec)*SetSize + b(AvgDec))
The minimum function is necessary to restrict the result to be less than or equal to 1.0. This linear regression was done with the monthly daily peak occupancy data from the Average Decile procedure as the dependent variable, and the results are provided in Table 4.1, along with a sample calculation. The regression coefficients are quite high, and the trends across them are displayed in Table 2.4.
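A minimal sketch of how Eq. 4.1 might be evaluated is given here. The slope and intercept values are hypothetical placeholders chosen for illustration; they are not the coefficients of Table 4.1, and the function name is an assumption.

```python
def peak_from_linear_fit(slope: float, intercept: float, x: float) -> float:
    """Evaluate Peak = Min(1.0, slope * x + intercept)."""
    return min(1.0, slope * x + intercept)


# Hypothetical coefficients for a single average decile (NOT from Table 4.1).
m_example = -0.004
b_example = 0.80

for set_size in (10, 30, 50):
    print(set_size, peak_from_linear_fit(m_example, b_example, set_size))

# The minimum function caps any fitted value above 1.0 at full occupancy.
print(peak_from_linear_fit(0.0, 1.03, 10))
```

The same function serves for any equation of this form, because only the meaning of the independent variable changes.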
Due to the low number of values in each cell, as shown in Table 2.4, using these coefficients can produce significant errors. One example of these errors is that the peak obtained from using any coefficients for AvgDec greater than or equal to 6 will be 1.00 to 1.03, regardless of SetSize. Even though this approach is not pursued further, the results presented in Table 4.1 indicate a correlation does exist between the independent variables of set size and average occupancy and the dependent variable of peak occupancy.
A review of the counts in Table 2.4 shows that the set size tends to be larger for groups with average deciles between 0.3 and 0.7. This observation supports the development of a model using the data from the Set Size Statistics.
4.3 Occupancy Data Curve-fits: Set Size Statistics
The monthly daily peak occupancy data generated with the set size procedure is grouped by set size, making 5 groups of 600 data triads. The scatter plots of the peak value for these groups are shown in Figures 4.1 through 4.5. Linear regression is applied to each group and produces an equation of the form:
Eq. 4.2: Peak = Min (1.0, m(SetSize)*AvgDec + b(SetSize))
Again the minimum function is used to limit the result to be less than or equal to 1.0.
The corresponding slope (m), intercept (b) and regression coefficient values are listed in Table 4.2, along with a sample calculation.
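The grouping and fitting step behind Eq. 4.2 could be sketched as follows. This is an assumption-laden illustration: a data triad is taken to be (set size, average, peak), the sample triads are invented, and ordinary least squares is written out by hand rather than reproducing the software actually used for the regressions.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = m*x + b."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_set_size(triads):
    """Group (set_size, average, peak) triads and fit one line per set size."""
    groups = defaultdict(list)
    for set_size, average, peak in triads:
        groups[set_size].append((average, peak))
    return {
        size: fit_line([a for a, _ in pts], [p for _, p in pts])
        for size, pts in groups.items()
    }


def predict_peak(fits, set_size, average):
    """Eq. 4.2 form: Peak = Min(1.0, m(SetSize)*AvgDec + b(SetSize))."""
    slope, intercept = fits[set_size]
    return min(1.0, slope * average + intercept)


# Invented triads for illustration only (not the recorded data).
sample = [
    (10, 0.2, 0.50), (10, 0.4, 0.70), (10, 0.6, 0.90),
    (50, 0.2, 0.35), (50, 0.4, 0.55), (50, 0.6, 0.75),
]
fits = fit_by_set_size(sample)
print(fits)
print(predict_peak(fits, 10, 0.9))  # capped by the minimum function
```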
The trends in the peak over the changing set size and average are consistent with the data from the Average Decile procedure. The results provided in Tables 4.1 and 4.2 show that the regressions are accurate, with high correlation coefficients.
The standard deviations for the average and peak values for each hour are shown in Table 4.3. Standard deviation is reduced as the set size increases, for all hours and both average and peak values. In addition, the relative changes between adjacent set sizes with the same hour have a clear pattern, with the differences between standard deviations decreasing as the set size increases. Thus the difference between the standard deviations for set size 10 and 20 is greater than the difference between the standard deviations for set size 20 and 30. This is also true for all hours and both average and peak values.
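One idealized way to see why such a pattern is plausible is sketched here. It is not the analysis behind Table 4.3: it assumes every room is occupied independently with the same probability, and it treats the set sizes as 10, 20, 30, 40 and 50. Under those assumptions the standard deviation of the average occupancy fraction is sqrt(p(1-p)/n).

```python
import math


def std_of_average_occupancy(set_size: int, p_occupied: float) -> float:
    """Std. dev. of the occupied fraction for independent, identical rooms."""
    return math.sqrt(p_occupied * (1.0 - p_occupied) / set_size)


# Assumed set sizes and occupancy probability, for illustration only.
sizes = (10, 20, 30, 40, 50)
stds = [std_of_average_occupancy(n, 0.5) for n in sizes]

# Difference between adjacent set sizes (10 vs 20, 20 vs 30, ...).
drops = [a - b for a, b in zip(stds, stds[1:])]

for n, s in zip(sizes, stds):
    print(n, round(s, 4))
print([round(d, 4) for d in drops])
```

In this simplified model the standard deviation falls as set size grows, and each successive drop is smaller than the one before it, which matches the direction of the pattern described above.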
This analysis shows that set size is a valid independent variable, to be used in conjunction with average occupancy as the basis of the prediction tool.
4.4 Occupancy Data Curve-fits: Combined Data
The monthly daily data from the Average Decile and Set Size procedures are combined into one data set of 3120 triads, and a multiple linear regression was calculated using SPSS. Numerous combinations of linear terms were tried, starting with just average and count (which is equal to set size), and increasing the number and variety of terms using those two independent variables to develop the best fit. The best fit obtained uses eight terms that are a combination of average and count, and a constant:
Again the minimum function is used to limit the result to be less than or equal to 1.0. The coefficients and constant calculated by SPSS are presented in Table 4.4 and the details of the regression are shown in Appendix C. The example calculation provided in Table 4.4 corresponds quite well with the example calculations of Tables 4.1 and 4.2.
The hourly data from the Average Decile and Set Size procedures are combined into nine data sets of 3120 triads each, and similar curve fits were made for each of the hourly period data groups. The detailed results from SPSS are shown in Appendix C and the coefficients are listed in Table 4.4. The forms of the equations were made consistent with the curve fit of the monthly daily data discussed above and shown with an example in Table 4.4.
A software tool has been developed to perform the calculations for the curve fits. The routine accepts a value for the count (set size) between 2 and 50, and an average occupancy rate greater than 0 and less than 1, and returns one daily ("8to5") and nine hourly (e.g. "Hr09") peak occupancy rates calculated according to the curve fits described in Table 4.4.
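The interface of such a routine might look like the following sketch. Several details are assumptions made only for illustration: the count limits are treated as inclusive, the nine hourly labels are guessed to run from "Hr09" to "Hr17", and the curve fits are passed in as callables because the Table 4.4 coefficients are not reproduced here. The `placeholder_fit` function is a hypothetical stand-in, not the fitted regression.

```python
from typing import Callable, Dict

# Assumed labels: the text names "8to5" and hourly labels such as "Hr09";
# the full Hr09..Hr17 range is a guess at the nine hourly periods.
HOUR_LABELS = [f"Hr{hour:02d}" for hour in range(9, 18)]
PERIODS = ["8to5"] + HOUR_LABELS

CurveFit = Callable[[int, float], float]


def predict_peaks(count: int, average: float,
                  fits: Dict[str, CurveFit]) -> Dict[str, float]:
    """Return one daily and nine hourly peak occupancy rates."""
    if not 2 <= count <= 50:
        raise ValueError("count (set size) must be between 2 and 50")
    if not 0.0 < average < 1.0:
        raise ValueError("average must be greater than 0 and less than 1")
    # The minimum function limits every result to at most 1.0.
    return {p: min(1.0, fits[p](count, average)) for p in PERIODS}


def placeholder_fit(count: int, average: float) -> float:
    """Hypothetical stand-in curve, NOT the Table 4.4 regression."""
    return average + 0.5 * (1.0 - average) / count ** 0.5


fits = {period: placeholder_fit for period in PERIODS}
profile = predict_peaks(10, 0.54, fits)
for period, peak in profile.items():
    print(period, round(peak, 3))
```

Supplying the fits as a mapping keeps the input checks and the capping step separate from the regression coefficients, so the real coefficients could be substituted without changing the calling code.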
The performance of the tool was validated against data from the SetSize procedure. The comparison of monthly daily values is shown in Table 4.5 and the comparison with monthly hourly values in Table 4.6.
The monthly daily ("8to5") values shown in Table 4.5 are calculated by the tool from inputs of count equal to the specified set size and each month's occupancy as the average occupancy. While there are some months which do not agree very closely, the overall agreement is quite good, within 5%. In particular the averages over the year are very close matches.
The monthly hourly predictions shown in Table 4.6 are in two versions. The first is headed "Hourly" and is calculated by the tool from inputs of that hour's average as the average occupancy and the count as the specified set size. The result from the tool which corresponds to the specified hour is listed. For example, for set size of 10 and for the hour 11, the inputs would be count = 10 and average = 0.54, and the results from the tool for Hr11 would be entered into the "Hourly" column (0.84).
The second version uses the average of the hourly averages as the input, and the result for all the hours are entered into the table. For example, for set size of 10, the average of the hourly average is 0.48, which becomes the input to the tool. The results are the profile listed under the heading "Daily".
The difference between these two versions is that "Hourly" uses the hourly average to calculate just that one hourly peak, while "Daily" uses the average over the day and returns the entire peak occupancy profile for all nine hours as well as the daily "8to5" value.
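The two versions can be contrasted in a short sketch. The `toy_tool` function is a hypothetical stand-in for the prediction tool with an invented formula, only three hours are included for brevity, and apart from the Hr11 average of 0.54 quoted above, the hourly averages are made up.

```python
def toy_tool(count, average):
    """Hypothetical stand-in for the prediction tool (invented formula)."""
    peak = min(1.0, average + 1.0 / count)
    return {"8to5": peak, "Hr09": peak, "Hr11": peak, "Hr13": peak}


def hourly_version(tool, count, hourly_averages):
    """One tool call per hour, keeping only that hour's result."""
    return {
        hour: tool(count, avg)[hour]
        for hour, avg in hourly_averages.items()
    }


def daily_version(tool, count, hourly_averages):
    """One tool call with the mean of the hourly averages; keep everything."""
    daily_average = sum(hourly_averages.values()) / len(hourly_averages)
    return tool(count, daily_average)


# Hr11 = 0.54 is the value quoted in the text; the others are invented.
averages = {"Hr09": 0.30, "Hr11": 0.54, "Hr13": 0.40}

hourly = hourly_version(toy_tool, 10, averages)
daily = daily_version(toy_tool, 10, averages)
print(hourly)
print(daily)
```

The "Hourly" result contains only the hours supplied, each computed from its own average, while the "Daily" result is a single profile that also carries the "8to5" value.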
For the "Hourly" version the values are extremely close, while for the "Daily" version there are some significant differences for the Hr09 and Hr13 values, but the others match quite well. The differences in the Hr09 and Hr13 values can be understood by considering the standard deviation information presented in Table 3.1. The greater differences are related to the larger standard deviations of the peak values, especially as a percentage of the average values of the peak, that are found in the Set Size statistics for these hours. Since Hr09 and Hr13 are not likely to be the maximum values over the day, nor the times when the building peak occurs, these differences are not critical. The averages are in agreement, while the "8to5" prediction values are slightly greater than the data but reasonably close.
The same procedure was applied to monthly hourly predictions adjusted up or down by one or two standard deviations. The results are shown in Table 4.7 and Figures 4.6 through 4.10 for a set size of 10 rooms, and in Table 4.8 and Figures 4.11 through 4.15 for a set size of 50 rooms. The "#StDev +/- = 0" portion shown at the top of these tables has the same values as shown in Table 4.6. The average and peak values for each hour were then increased or decreased by the standard deviation of that value and hour combination for the specified set size. Both hourly and daily predictions were made and the comparisons are shown. The trends are consistent within and between these two tables. As the number of standard deviations increases, the errors in the predictions increase. When the averages and peak values are increased by one or two standard deviations, the "%diff" for both Hourly and Daily predictions becomes increasingly negative. Decreasing by one or two standard deviations has the reverse effect, by the same magnitude. This is true for both set sizes, but the changes are smaller for the set size of 50 rooms, which is expected: as set size increases, the variability of the peak should decrease.
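The adjustment procedure can be sketched as follows. The definition of "%diff" used here, (predicted - observed) / observed, is an assumption, as are the standard deviations and the stand-in `toy_predict` function. Only the average of 0.54 and the peak of 0.84 echo the set size 10, Hr11 example quoted earlier.

```python
def percent_diff(predicted: float, observed: float) -> float:
    """Assumed definition of %diff: prediction relative to the data."""
    return 100.0 * (predicted - observed) / observed


def toy_predict(average: float) -> float:
    """Hypothetical stand-in for the tool (invented formula)."""
    return min(1.0, 0.3 + average)


def compare(predict, average, avg_std, peak, peak_std, n_std):
    """Shift average and peak by n_std standard deviations, then compare."""
    adjusted_average = average + n_std * avg_std
    adjusted_peak = peak + n_std * peak_std
    return percent_diff(predict(adjusted_average), adjusted_peak)


# Standard deviations below are invented for illustration.
for n_std in (-2, -1, 0, 1, 2):
    diff = compare(toy_predict, 0.54, 0.05, 0.84, 0.08, n_std)
    print(n_std, round(diff, 2))
```

With these invented numbers the prediction falls below the upward-adjusted peak and above the downward-adjusted peak, the same direction of error described above.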
The changes in the errors of the predictions indicate that the prediction tool values tend toward the average conditions ("#StDev +/- = 0") as the number of standard deviations rises. Inspecting the errors for the "#StDev +/- = +2" or "#StDev +/- = -2" versions shows that the predictions are typically closer to the "#StDev +/- = 0" value of the same set size than the adjusted peak values. The predicted values are under the high peak values and over the low peak values. The hourly predicted values all show this trend, while the daily predicted values do except for Hr09 and Hr13. As discussed above, these hours are probably not critical to the successful application of the peak occupancy data.
4.5 Occupancy Data Comparison with ECW Data
For further validation of the simplified tool developed in this work, data was requested from S. Pigg of the Energy Center of Wisconsin (ECW). Published work [Pigg, Eilers and Reed, 1996] reported but did not include the desired information, so prepublished data that can be compared with the results of this work were provided.
This data was taken from 61 private offices in an academic institution which were being monitored by the ECW. The facility is a college office and classroom building, so while the rooms which were monitored can be considered very similar to the NCAR rooms, overall the facilities have a different function. The data was provided as individual values for each hour of each month. Therefore, it could be collapsed into either annual hourly or monthly daily values. Both of these sets of values were calculated and the comparison of the ECW data and the corresponding predictions are summarized in Table 4.9 and shown in Figures 4.16 and 4.17. Similar to the comparisons with the Set Size data described previously, the hourly comparisons have both the "Hourly" and the "Daily" versions. These are developed the same way as discussed above. While some of the "Daily" values are significantly different, the overall agreement is reasonable, particularly the average over all the hours. The "8to5" value slightly underestimates the maximum of the ECW data, but is within an acceptable range considering the differences in the facilities involved.
For the "Hourly" version, the agreement is remarkably close, with the exception of the Hr09 values. Otherwise the correspondence is as close as could be desired.
The "Monthly" data comparison shows that the NCAR predictions are consistently low by several percent. The averages and the maximum values have equivalent disparities between the predictions and the ECW data. Since the ECW data is from a facility which probably has a more homogeneous schedule than NCAR, this result is expected.
Considering the differences in the two facilities, the agreement between the ECW data and the predictions based on NCAR data is very good.