### Discussion and Conclusion Master Thesis

#### by ustroetz

In this post I want to discuss a couple of things regarding my research before I draw my final conclusion.

I want to talk a little bit about the Results from the Statistical Analysis and about the actual Harvest Costs for the State Forest.

**Outliers**

During the Statistical Analysis two outlier patterns kept recurring, and I think they have a noticeable influence on the result:

The first pattern is that high *Harvest Costs* occur in extremely densely stocked stands with a low volume per tree. This is a realistic real world condition and an expected silvicultural behavior: if stands are extremely densely stocked, the *Volume per Tree* value is low. Therefore these outliers were not removed from the dataset. From a practical point of view, though, these stands are not likely to be harvested.

The second pattern is that high *Harvest Costs* arise in stands with a high *Slope* value.

This is also an expected behavior: the steeper the slope, the more expensive it is to harvest. But in real world conditions slopes steeper than 40% are not harvested with ground based machinery. Yet the given stands of the Colorado State Forest (CSF) are located in this terrain and are designated by the CSF as harvestable areas. Therefore these outliers were also not removed from the dataset.

Even though both patterns can occur in real world situations, it is likely that they caused the unexplained variance in *Cost*. So I think future studies should definitely take that into account and should consider removing those outliers.
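
As a rough illustration of what removing the steep-slope outliers could look like, here is a minimal Python sketch. The 40% limit is the one mentioned above. The `Stand` record, its field names and the sample values are hypothetical and are not taken from the thesis data.

```python
from dataclasses import dataclass

# Limit for ground based machinery mentioned in the text above.
MAX_GROUND_BASED_SLOPE_PCT = 40.0


@dataclass
class Stand:
    """Hypothetical stand record; field names are assumptions for illustration."""
    stand_id: str
    slope_pct: float


def split_by_slope(stands, max_slope_pct=MAX_GROUND_BASED_SLOPE_PCT):
    """Return (kept, flagged): stands at or below the limit, and stands above it."""
    kept = [s for s in stands if s.slope_pct <= max_slope_pct]
    flagged = [s for s in stands if s.slope_pct > max_slope_pct]
    return kept, flagged


# Made-up example values, only to show the call.
example_stands = [
    Stand("A", 12.0),
    Stand("B", 47.5),
    Stand("C", 40.0),
]
kept, flagged = split_by_slope(example_stands)
```

The flagged stands are returned rather than dropped, so a regression could be run with and without them and the two fits compared.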

**Interpretation Regression Model**

I could write a whole lot about possible interpretations of the regression model. You can read all that in the full paper. But I want to give you one sample calculation that highlights the results fairly well:

The intercept of the spatially explicit regression is 22.80 $/ton. This is the *Cost* if the stand is on absolutely flat ground (*Slope* of 0 %) and located right at the road (*Skidding Distance* of 0 ft). The coefficient value for *Skidding Distance* is 0.0076 $/ft and the *Slope* coefficient value is 0.33 $/%. Since both coefficients are positive, *Harvest Cost* will increase if *Slope* increases or the distance to the road increases. For each percent increase in *Slope*, the *Harvest Cost* will increase by 0.33 $/ton. The *Harvest Cost* will also increase by 0.0076 $/ton for each foot increase of the *Skidding Distance*. Or expressed in other units, the *Harvest Cost* will increase by about 7.6 $/ton for each additional 1,000 feet to skid. So let's say the stand is 1000 ft away from the road and the *Slope* is 10%. The basic harvest cost of 22.80 $/ton would increase to 33.70 $/ton (= 22.80 $/ton + 0.0076 $/ft × 1000 ft + 0.33 $/% × 10%).
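
The sample calculation above can also be written as a few lines of Python. This is only a sketch using the rounded coefficients quoted in this post. The function and constant names are my own and are not part of the thesis code.

```python
# Coefficients of the spatially explicit regression, as quoted in this post.
INTERCEPT = 22.80        # $/ton at 0 % slope and 0 ft skidding distance
COEF_SKID_DIST = 0.0076  # $/ton per foot of skidding distance
COEF_SLOPE = 0.33        # $/ton per percent of slope


def spatial_harvest_cost(skid_dist_ft: float, slope_pct: float) -> float:
    """Predicted harvest cost in $/ton from the two spatial predictors."""
    return INTERCEPT + COEF_SKID_DIST * skid_dist_ft + COEF_SLOPE * slope_pct


if __name__ == "__main__":
    # Stand 1000 ft from the road on a 10 % slope.
    print(f"{spatial_harvest_cost(1000, 10):.2f} $/ton")  # prints "33.70 $/ton"
```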

**Harvest Costs Colorado State Forest**

The calculated *Harvest Costs* for the specific stands of the Colorado State Forest kind of serve as a validation of this research. The regression with all four predictors is useful since it produced almost identical results to the full model. The mean of all stands' *Harvest Costs* differed only by 0.75 $/ton and the standard deviation differed by 0.11 $/ton. This again confirms the high R-squared of 0.98, but also reflects the 1.72% of variance that the regression leaves unexplained. The spatially explicit regression differs more from these results. Its mean differs from the mean of the regression with all predictors by 3.69 $/ton. Its standard deviation, at 8.6 $/ton, is noticeably lower than the full regression model's standard deviation of 10.18 $/ton. This is because the spatially explicit regression is missing the two *Non-Spatial Predictors* and therefore assumes a fixed value for these variables. As a result the variance in *Cost* is smaller.

The *Cost Surface* produced for the southern part of the Colorado State Forest has a mean *Harvest Cost* of 40.83 $/ton and a standard deviation of 15.75 $/ton, both higher than the means and standard deviations of the other calculations. The high mean and the high standard deviation result from the fact that many stands (or, in the case of the *Cost Surface*, pixels) are not connected to roads, which leads to extremely high *Skidding Distance* values. The map shown in a previous post clearly shows that stands close to the road are in the lower price range (green color). The *Cost Surface* is therefore only useful in areas where roads already exist, yet roads are usually not built until a stand is to be harvested. This makes planning and cost calculation very difficult. A method that estimates where potential logging roads will be located is therefore needed to calculate meaningful *Harvest Costs* for areas without road access.

**Conclusion**

In my conclusion I want to come back to my original research questions:

The research showed that *Spatial Predictors* explain about 40% of the variance in *Timber Harvest Costs*. The remaining roughly 60% is explained by the variables *Trees per Acre* and *Volume per Tree*. Therefore the **first research question**, which asks what the significance of *Spatial Predictors* on *Timber Harvest Costs* is, can be answered as follows: *Spatial Predictors* have a significance of 40% on *Timber Harvest Costs*.

The answer to the **second research question**, which asks if it is possible to calculate *Timber Harvest Costs* solely based on *Spatial Predictors*, depends on the use case:

It is not possible to calculate an absolute *Harvest Cost* with this method, because the R-squared of 0.4045 of the spatially explicit regression model is too low for that.

But this study was conducted in order to answer if it is possible to calculate *Timber Harvest Costs* for use in optimization models. Optimization models require iterating through millions of potential solutions and comparing results in terms of an objective function. For this kind of optimization an R-squared of 0.4045 is sufficient because it gives relative *Harvest Costs*. This allows optimization models to compare the *Costs* of different stands and scenarios. These models do not require an absolute *Harvest Cost*.

Therefore the results of this research make it possible to include *Harvest Costs* in optimization models for ecological forestry approaches. With their inclusion, optimization models are significantly improved.
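
To make the idea of relative costs a bit more concrete, here is a small Python sketch that ranks candidate stands by their predicted cost from the spatially explicit regression. The coefficients are the ones quoted earlier in this post. The stand names and their values are invented for illustration, and a real optimization model would of course do far more than sort a list.

```python
# Coefficients of the spatially explicit regression quoted earlier in this post.
INTERCEPT, COEF_SKID_DIST, COEF_SLOPE = 22.80, 0.0076, 0.33


def relative_cost(skid_dist_ft, slope_pct):
    """Predicted cost in $/ton, used here only to compare stands with each other."""
    return INTERCEPT + COEF_SKID_DIST * skid_dist_ft + COEF_SLOPE * slope_pct


# Hypothetical candidate stands: name -> (skidding distance in ft, slope in %).
candidates = {
    "stand_a": (500.0, 5.0),
    "stand_b": (2000.0, 15.0),
    "stand_c": (1000.0, 30.0),
}

# Cheapest first; only the order matters for the comparison, not the exact values.
ranking = sorted(candidates, key=lambda name: relative_cost(*candidates[name]))
```

With these made-up values the order is `stand_a`, `stand_c`, `stand_b`. The comparison between stands is what an objective function would consume, even if the absolute $/ton figures are not reliable.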