Thursday, February 26, 2015

Z-Scores, Mean Center, and Standard Distance


 

Introduction

     As the climate continues to change, the severity and spatial pattern of tornadoes are likely to change as well. It is widely believed that tornadoes will become more frequent and more intense across the Midwest. It is therefore important to analyze tornado location and intensity over recent years to determine spatial patterns. Once those patterns are known, government officials can use them to craft laws that increase the safety of people at risk. In this exercise we calculate the mean center, weighted mean center, standard distance, and weighted standard distance of reported tornadoes with a known width for two periods, 1995-2006 and 2007-2012. This information is then used to determine whether storm shelters should be required across all of Oklahoma and Kansas or only in areas that experience most of the largest tornadoes.
 

Study Area

      For this exercise, we examine the two states that experience the largest number of severe tornadoes, Oklahoma and Kansas (Figure 1). At the heart of "Tornado Alley," these two states were hit by 2,221 tornadoes from 1995 to 2012, 66 of which were over 700 feet wide. The combined population of Oklahoma and Kansas is 6,782,072 according to 2014 estimates, and the combined land area is 152,175 square miles, giving a combined population density of about 45 people per square mile.
 
 
 

Methodology

     Several statistical tools can be used to determine the spatial distribution of tornadoes across Oklahoma and Kansas. Starting from the geographic tornado data, we used ArcGIS to calculate the mean center, weighted mean center, standard distance, and weighted standard distance. The mean center is the geographic center of all of the points. To calculate it, the latitude values of all points are averaged and the longitude values of all points are averaged (Equation 1).
 
 
 
Equation 1
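As a rough illustration of Equation 1, the mean center can be sketched in a few lines of Python. This is not the ArcGIS tool itself; the function name and the sample coordinates are made up for the example, and the coordinates are treated as plain planar x/y values.

```python
from statistics import fmean

def mean_center(points):
    """Return the (x, y) mean center of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Average the x values and the y values independently.
    return fmean(xs), fmean(ys)

# Hypothetical (longitude, latitude) pairs, for illustration only.
tornadoes = [(-98.0, 36.0), (-97.0, 37.0), (-99.0, 38.0)]
center = mean_center(tornadoes)
print(center)
```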
 
 
     Like the mean center, the weighted mean center finds the geographic center of a set of points. However, it applies a weight to the latitude and longitude values of each point (Equation 2). Here the weight is the width of each tornado, which shifts the center toward areas with a high frequency of large tornadoes.
 
 
 
Equation 2
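A minimal sketch of the weighted version, assuming each point carries one non-negative weight (here a tornado width in feet). The function name and the sample data are hypothetical, not values from the study.

```python
def weighted_mean_center(points, weights):
    """Return the (x, y) center of points, each scaled by its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y

# Hypothetical tornado locations and widths (feet), for illustration only.
tornadoes = [(-98.0, 36.0), (-97.0, 37.0), (-99.0, 38.0)]
widths_ft = [100.0, 100.0, 800.0]
weighted_center = weighted_mean_center(tornadoes, widths_ft)
print(weighted_center)
```

In this made-up sample the unweighted center would be (-98.0, 37.0); the single wide tornado pulls the weighted center toward its own location.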
 
 
     Next, a standard distance was calculated around the tornado locations for 1995-2006 and for 2007-2012. The standard distance is the radius of a circle around the mean center that contains approximately 68 percent of the points (Equation 3). It indicates how concentrated a data set is around its geographic mean center.
 
 
Equation 3
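The standard distance can be sketched as the square root of the average squared distance from each point to the mean center. The code below is an illustrative sketch under that common definition, with a hypothetical function name and sample points; it is not the ArcGIS implementation.

```python
from math import sqrt
from statistics import fmean

def standard_distance(points):
    """Return the standard distance of (x, y) points around their mean center."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = fmean(xs), fmean(ys)
    # Mean squared deviation in x plus mean squared deviation in y.
    var_x = fmean([(x - cx) ** 2 for x in xs])
    var_y = fmean([(y - cy) ** 2 for y in ys])
    return sqrt(var_x + var_y)

# Four hypothetical points on the corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
sd = standard_distance(square)
print(sd)
```

A smaller value means the points sit closer to the mean center; a larger value means they are more dispersed.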
 
 
      A weighted standard distance was also applied to the tornado data. Like the standard distance, the weighted standard distance gives the radius of a circle that contains roughly 68% of the data points, but each point is scaled by a defined weight (Equation 4). The weighted standard distance is useful for determining how concentrated the points are with respect to whichever weight is applied, in this case the width of the tornadoes.
 
 
Equation 4
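A sketch of the weighted case, assuming the usual formulation in which both the center and the squared deviations are weighted. The names and sample values are hypothetical.

```python
from math import sqrt

def weighted_standard_distance(points, weights):
    """Return the standard distance around the weighted mean center."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Weighted mean of the squared distances to the weighted center.
    var = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2)
        for (x, y), w in zip(points, weights)
    ) / total
    return sqrt(var)

# Two hypothetical points; the first carries three times the weight.
pts = [(0.0, 0.0), (4.0, 0.0)]
wsd = weighted_standard_distance(pts, [3.0, 1.0])
print(wsd)
```

With equal weights the same two points give a standard distance of 2; the heavier first point pulls the center toward itself and shrinks the result.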
 
    
     Next, Z-scores were calculated for three counties in the area. A Z-score gives the position of a value relative to the mean: it tells us how many standard deviations above or below the mean a specific value falls (Equation 5).
 
 
Equation 5
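As a small sketch of the Z-score calculation, the example below uses a made-up list of per-county tornado counts (not the study data) and the population standard deviation. Whether the original analysis used the population or sample form is not stated, so that choice is an assumption here.

```python
from statistics import fmean, pstdev

def z_score(value, mean, std_dev):
    """Return how many standard deviations value lies from mean."""
    return (value - mean) / std_dev

# Hypothetical tornado counts for eight counties, for illustration only.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
mu = fmean(counts)      # mean of the counts
sigma = pstdev(counts)  # population standard deviation (assumed form)
z_busiest = z_score(9, mu, sigma)
print(mu, sigma, z_busiest)
```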
 
     
     Finally, we calculated the number of tornadoes that a given county can be expected to reach or exceed 70% of the time and 20% of the time. To do this, we find the Z-score that corresponds to each percentage and solve Equation 5 for Xi.

Results

 
     Based on the calculations above, the geographic mean center of tornadoes in the study area between 1995 and 2006 lies just north of the Kansas-Oklahoma border, close to the center of the study area. This shows that tornadoes are distributed fairly evenly throughout the area. However, when weighted by tornado width, the mean center shifts slightly south. This indicates that while the number of tornadoes is spread fairly evenly across the study area, the larger tornadoes tend to occur in its southern portion.
 
 
Figure 2  Locations of tornadoes from 1995-2006 and the
location of the geographic mean center and the weighted mean
center based on tornado width.
 
      Between 2007 and 2012 the location of tornadoes shifts slightly compared to 1995-2006. The geographic mean center shifts slightly north, showing that a larger share of tornadoes occurred in Kansas. The weighted mean center shifts slightly north and slightly east. This means that fewer large tornadoes occurred in the southern portion of the study area and more large tornadoes occurred in the eastern portion.
 
Figure 3  Locations of tornadoes from 2007-2012 and the location
of the geographic mean center and the weighted mean
center based on tornado width.
 
      Figure 4 combines Figure 2 and Figure 3. Fewer large tornadoes occurred in the southern portion of the study area in 2007-2012 than in 1995-2006. Fewer large tornadoes also occurred in the western portion of the study area in 2007-2012 than in 1995-2006.

Figure 4  The mean center and weighted mean center of tornadoes
from 1995-2006 compared to tornadoes from 2007-2012.
 
 
     Figure 5 shows the standard distance of tornado locations weighted by width. The weighted standard distance circle encloses the area in which roughly 68% of the width-weighted tornado occurrences fall. The standard distance in this case is fairly large, which indicates that tornadoes are not strongly concentrated in one area but are spread fairly evenly across the study area.


Figure 5  Standard distance of tornadoes weighted by tornado width.
 
 
     From 2007-2012 the weighted standard distance (Figure 6) is much smaller than the weighted standard distance from 1995-2006. This means that the larger tornadoes became more concentrated toward the center of the study area.
 
Figure 6  Weighted standard distance of tornadoes from 2007-2012
based on width.
 
 
      Figure 7 compares the weighted standard distance from 1995-2006 with that from 2007-2012. There is a slight shift of larger tornadoes to the northeast and an overall increase in the concentration of larger tornadoes around the weighted mean center.


Figure 7  Comparing the weighted standard distance between 1995-2006 and 2007-2012
to show how the concentration of tornadoes has changed over the two time periods.
 
 
     We also mapped the standard deviation of tornadoes per county from 2007-2012 (Figure 8). This map shows how the number of tornadoes in each county compares to the mean. Eight counties in the study area had tornado counts more than 1.5 standard deviations above the mean, while around 50 counties had counts more than 0.5 standard deviations below the mean.


Figure 8  Map showing the number of tornadoes per county based on the
standard deviation.
 
 
 
 
      According to the data, Russell County, KS experienced 25 tornadoes between 2007 and 2012. This number is significantly higher than the study-area mean of about 4 tornadoes per county (standard deviation of about 4.3). Using Equation 5 above, the Z-score of Russell County, KS is 4.88. This means that the number of tornadoes in Russell County, KS is 4.88 standard deviations above the mean of the study area. Caddo County, OK experienced fewer tornadoes than Russell County, with 13 from 2007-2012. However, 13 tornadoes is still significantly higher than the mean. Using Equation 5, the Z-score of Caddo County, OK was calculated to be 2.09, meaning that the number of tornadoes in Caddo County is 2.09 standard deviations above the mean. Finally, the Z-score was calculated for Alfalfa County, OK. This county experienced only 5 tornadoes from 2007-2012. Using Equation 5, the Z-score was calculated to be 0.23, meaning the number of tornadoes in Alfalfa County was only 0.23 standard deviations above the mean.
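As a consistency check, the mean and standard deviation behind these Z-scores can be recovered from any two of the reported county values, since each Z-score is a linear function of the count. The sketch below uses the Russell and Caddo figures quoted above and then predicts the Alfalfa Z-score; the function name is hypothetical and the recovered values are back-calculated, not taken from the original data.

```python
def implied_mean_and_sd(x1, z1, x2, z2):
    """Solve z = (x - mean) / sd for mean and sd from two (x, z) pairs."""
    sd = (x1 - x2) / (z1 - z2)
    mean = x1 - z1 * sd
    return mean, sd

# Russell County: 25 tornadoes, Z = 4.88; Caddo County: 13 tornadoes, Z = 2.09.
mean, sd = implied_mean_and_sd(25, 4.88, 13, 2.09)

# Predict the Z-score for Alfalfa County (5 tornadoes) from those values.
alfalfa_z = (5 - mean) / sd
print(round(mean, 2), round(sd, 2), round(alfalfa_z, 2))
```

The three reported Z-scores are mutually consistent with a mean of roughly 4 tornadoes per county and a standard deviation of roughly 4.3.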
 
     Finally, based on Equation 5, we determined that a county in the study area can be expected to experience at least 1.764 tornadoes 70% of the time over the given years, and at least 7.612 tornadoes only 20% of the time.
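The same calculation can be sketched with Python's standard library instead of a printed Z table. The sketch assumes county counts are approximately normally distributed and uses a mean of 4 and a standard deviation of 4.3, values implied by the reported Z-scores rather than stated directly; the function name is hypothetical.

```python
from statistics import NormalDist

def count_exceeded_with_probability(p, mean, sd):
    """Return the count x such that P(X >= x) = p under a normal model."""
    # P(X >= x) = p corresponds to the (1 - p) quantile of the distribution.
    z = NormalDist().inv_cdf(1.0 - p)
    return mean + z * sd

MEAN, SD = 4.0, 4.3  # assumed values, back-calculated from the Z-scores
x70 = count_exceeded_with_probability(0.70, MEAN, SD)
x20 = count_exceeded_with_probability(0.20, MEAN, SD)
print(round(x70, 3), round(x20, 3))
```

The exact quantiles differ slightly from the figures above, which match table values rounded to two decimals (Z = -0.52 and Z = 0.84, giving 1.764 and 7.612).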
 

Conclusion

     In conclusion, the mean center and weighted mean center do not necessarily tell us whether an area should be forced to build tornado shelters. They mainly tell us that tornadoes occur fairly evenly throughout the study area. The standard distance also does not tell us where tornado shelters should be required, because its radius is so large. If more tornadoes were concentrated closer to the mean center, a case could be made for requiring tornado shelters there. The map of standard deviations, however, does give a good indication of where tornado shelters should be required. Counties with tornado counts more than 0.5 standard deviations below the mean should not be required to build tornado shelters, while counties with counts more than 1.5 standard deviations above the mean should be. However, since so many tornadoes occur in a fairly random pattern, it would not be a bad idea to strongly encourage everybody living in this area to have a tornado shelter. One can never be quite sure whether a specific county will experience an EF5 tornado with catastrophic effects.
