Tuesday, March 7, 2017

Assignment 3

Jeffrey Hessburg
GEOG 370
9 March 2017

For this assignment, I have been hired by an independent research consortium to study the geography of foreclosures in Dane County, Wisconsin. County officials are worried about the increase in foreclosures from 2011 to 2012. As an independent researcher, I have been given the addresses of all foreclosures in Dane County for 2011 and 2012. My goal is to explain the patterns and provide some understanding of the trends in the foreclosures.

The following are calculations of the Z-scores of three separate census tracts. Each tract has a calculation based on the Count2011 data and a calculation based on the Count2012 data. To make the calculation, the mean and standard deviation of the Count2011 and Count2012 data are needed. To find these, I went to Symbology, then to Quantities, and clicked on Classify. The Classification Statistics box in the Classify window displays the mean and standard deviation. The last thing needed is the Xi value of each of the 3 census tracts. These can be found in the attribute table. Once all of the values are found, the following equation is used to calculate the Z-scores: Zi=(Xi-μ)/S, where Zi = Z-score of observation i, Xi = observation i, μ = mean of the data, and S = standard deviation of the data.

For Count2011: mean = 11.39, standard deviation = 8.776
Calculations for the three tracts:
Census Tract 122.01: Xi=6      (6-11.39)/8.776      Zi= -0.6142
Census Tract 31:        Xi=24    (24-11.39)/8.776    Zi= 1.437
Census Tract 114.01: Xi=32    (32-11.39)/8.776    Zi= 2.348

For Count2012: mean = 12.30, standard deviation = 9.906
Calculations for the three tracts:
Census Tract 122.01: Xi=6       (6-12.30)/9.906    Zi= -0.6360
Census Tract 31:        Xi=18     (18-12.30)/9.906  Zi= 0.5754
Census Tract 114.01: Xi=39     (39-12.30)/9.906  Zi= 2.695

The Z-score calculations show how many standard deviations away from the mean each tract is. In both years tract 114.01 is the furthest from the mean. The closest tract is 122.01 in 2011 and 31 in 2012.
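
The Z-score calculations above can be reproduced with a short script. This is only a sketch: the means, standard deviations, and counts are the values reported in this post, while the names STATS, COUNTS, and z_score are made up for illustration and are not part of ArcMap.

```python
# Mean and standard deviation as read from the Classification Statistics box.
STATS = {
    2011: {"mean": 11.39, "sd": 8.776},
    2012: {"mean": 12.30, "sd": 9.906},
}

# Foreclosure counts (Xi) for the three tracts, taken from the attribute table.
COUNTS = {
    "122.01": {2011: 6, 2012: 6},
    "31": {2011: 24, 2012: 18},
    "114.01": {2011: 32, 2012: 39},
}


def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd


for tract, by_year in COUNTS.items():
    for year, count in by_year.items():
        z = z_score(count, STATS[year]["mean"], STATS[year]["sd"])
        print(f"Tract {tract}, {year}: Xi={count}, Zi={z:.4f}")
```

Keeping the statistics in one table per year makes it easy to add more tracts without repeating the formula.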

The map below shows changes in foreclosures from 2011 to 2012. Positive numbers indicate more foreclosures in 2012 than in 2011. Negative numbers indicate more foreclosures in 2011 than in 2012.
The question asked is: if the 2012 patterns hold next year in Dane County, what number of foreclosures for all of Dane County will be exceeded 70% of the time, and where on the map will this most likely happen?
To solve this, I look at the Z-score chart to find the Z-score that is exceeded 70% of the time. That Z-score is -0.52.
I can then enter this into the Count2012 equation to solve for Xi, and then look for census tracts that have this value.
(Xi-12.30)/9.906=-0.52     Xi=7.149
This result means that, based on the 2012 data, a census tract will have more than 7.149 foreclosures 70% of the time.
The map above is good for illustrating the areas in Dane County that have the fewest foreclosures. These are the blue areas.

What number of foreclosures for all of Dane County will be exceeded only 20% of the time?
The Z-score is 0.84.
(Xi-12.30)/9.906=0.84   Xi=20.62
This result means that, based on the 2012 data, a census tract will have more than 20.62 foreclosures 20% of the time.
The map above is good for illustrating the areas in Dane County that have the most foreclosures. These are the pink areas.
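
The two threshold calculations above rearrange the Z-score equation to Xi = μ + Zi·S. The sketch below repeats them with the table values from this post, and also computes the Z-scores directly instead of reading them from a chart. It assumes the counts are roughly normally distributed, which the Z-score chart also assumes. The function names are hypothetical.

```python
from statistics import NormalDist

# 2012 statistics reported in this post.
MEAN_2012 = 12.30
SD_2012 = 9.906


def count_from_z(z, mean=MEAN_2012, sd=SD_2012):
    """Invert Zi = (Xi - mean) / sd to recover Xi."""
    return mean + z * sd


def z_exceeded_with_probability(p):
    """Z-score that a standard normal variable exceeds with probability p."""
    return NormalDist().inv_cdf(1.0 - p)


for p, table_z in [(0.70, -0.52), (0.20, 0.84)]:
    exact_z = z_exceeded_with_probability(p)
    print(f"Exceeded {p:.0%} of the time: "
          f"table z={table_z}, Xi={count_from_z(table_z):.3f}; "
          f"computed z={exact_z:.4f}, Xi={count_from_z(exact_z):.3f}")
```

The computed Z-scores differ slightly from the chart values because the chart is rounded to two decimal places.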


Conclusion:
It is clear that there are strong spatial patterns in foreclosures in Dane County. In 2012, the highest numbers of foreclosures are in the larger tracts that are not in the center or in the southern parts of the county. The lowest numbers of foreclosures are in the smaller tracts in the center of the county, and in just a couple of tracts outside the center.
There is also a clear change in the number of foreclosures from 2011 to 2012, which can be observed in the first map. The lightest and darkest colors indicate the largest changes. Based on the first map and the increase in the mean from 11.39 in 2011 to 12.30 in 2012, I believe that, if the same trend continues, there will be even more foreclosures in 2013 than in 2012.

Monday, February 20, 2017

Assignment 2

Jeff Hessburg
2/21/2017
GEOG 370

PART I
For this assignment, I will be comparing two bike race teams that competed in the TOUR de GEOGRAPHIA: Team ASTANA and Team TOBLER. The comparisons will include the range, mean, median, mode, kurtosis, skewness, and standard deviation of the race times for each team. After analyzing the calculations, I will determine which team I would rather invest in to make the most money.

The Results:

Range- The range is the largest value minus the smallest value. For this example, the range is how much faster the first finisher on each team was than the slowest finisher. The difference between the fastest and slowest rider is much greater on Team Astana than on Team Tobler.
Team ASTANA=1 hr 10 min
Team TOBLER=31 min

Mean- The mean is the average of all of the numbers. To calculate it, all of the times are added up and then divided by the total number of times. For this example, the mean is the average finishing time of the bikers on each team.
Team ASTANA =37 hr 56.67 min
Team TOBLER=38 hr 5.47 min

Median- The median is the middle number when all values are put in order. For this example, there are 15 bikers, so the median is the time of the 8th place finisher.
Team ASTANA=38 hr
Team TOBLER=38 hr 9 min

Mode- The mode is the value that occurs most frequently. For this example, the mode is the finishing time shared by the most racers on each team. Team Astana has two modes.
Team ASTANA=37 hr 52 min and 38 hr
Team TOBLER=38 hr 9 min

Kurtosis- Kurtosis describes how sharp the peak of a distribution curve is. A number above 1 means the distribution is peaked, and a number less than -1 means the distribution is relatively flat. A peak means that most of the values are close together, while a flat distribution means the values are more spread out. For this example, both values are greater than 1, which means that both distribution curves are peaked and the values are relatively close together.
Team ASTANA=1.168
Team TOBLER=2.927

Skewness (Population)- The skewness is a measurement of how symmetrical the distribution curve is. A positive number means most of the values sit to the left with a tail to the right. A negative number means most of the values sit to the right with a tail to the left. Numbers above 1 or below -1 mean there is a skew, and the closer the number is to 0, the smaller the skew. For this example, Team Astana has almost no skew whatsoever and Team Tobler has a negative skew.
Team ASTANA=-0.00231
Team TOBLER=-1.0259

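Standard Deviation- The standard deviation is a statistic that describes how tightly a group of values is clustered around the mean. Team Tobler's smaller value means its times are more tightly clustered than Team Astana's. The picture below is the best way to explain it.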
Team ASTANA=16.63
Team TOBLER=7.62
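
The statistics above can all be computed with a few lines of code. The sketch below uses hypothetical finishing times in minutes for a 15-rider team, not the actual race data, and the helper names are made up for illustration. Skewness and kurtosis use the population moment formulas (kurtosis is reported as excess kurtosis, so a normal curve gives 0). Spreadsheet functions may apply a sample-size correction and give somewhat different numbers.

```python
from statistics import mean, median, multimode, pstdev

# Hypothetical finish times in minutes for a 15-rider team (not the race data).
times = [2250, 2262, 2268, 2272, 2272, 2275, 2278, 2280,
         2280, 2283, 2288, 2291, 2296, 2304, 2320]


def population_skewness(values):
    """Third standardized moment: 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def excess_kurtosis(values):
    """Fourth standardized moment minus 3: 0 for a normal distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values) - 3.0


def hours_minutes(total_minutes):
    """Format a number of minutes as 'H hr M min'."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{int(hours)} hr {minutes:.2f} min"


print("Range:", hours_minutes(max(times) - min(times)))
print("Mean:", hours_minutes(mean(times)))
print("Median:", hours_minutes(median(times)))
print("Mode(s):", [hours_minutes(t) for t in multimode(times)])
print("Standard deviation (min):", round(pstdev(times), 2))
print("Skewness:", round(population_skewness(times), 4))
print("Kurtosis (excess):", round(excess_kurtosis(times), 4))
```

With 15 values, median returns the 8th value in sorted order, and multimode returns every value tied for most frequent, which is how a team can have two modes.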

My job is to determine which team is better to invest in.
The individual race winner gets 75% of a $300,000 pool  ($225,000) and the owner gets 25% ($75,000)
The team that wins receives 65% of a $400,000 pool ($260,000) and the owner gets 35% ($140,000)

Based on the last race in the TOUR de GEOGRAPHIA, there is a good chance that the first place finisher will come from Astana, which would earn the owner $75,000.
The prompt does not explain how the team wins. I am guessing that it is decided by the average of all of the racers' times, to show how well the team did as a whole. The statistic that would determine this is the mean. Based on the mean, Team Astana would also be the best team to invest in, because its mean time is lower.

PART II
Below is a map of Wisconsin with three points on it. The yellow star shows the geographic mean center of the state, which is the very middle of the state. The green dot shows where the mean center of population of the state was in 2010, and the red dot shows where it was in 2015. These points show where, geographically, the average person in Wisconsin lives, and they are influenced from every direction. If there are more people in the south, the dot will be further south. If there are more people living in the west, the dot will be further west. The dot moved slightly west from 2010 to 2015. This suggests that the population in the western part of the state grew relative to the rest of the state. Perhaps a big city like Eau Claire had an increase in population that moved the population mean center.
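
The way the mean center of population gets pulled toward growing areas can be shown with a small sketch. The coordinates and populations below are hypothetical, not Wisconsin data, and the function name is made up for illustration. Each place's coordinates are weighted by its population, so the center is the sum of weight times coordinate divided by the sum of weights.

```python
def weighted_mean_center(points, weights):
    """Return the (x, y) mean center of points, weighted by population."""
    total = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y


# Hypothetical places at the corners of a square; x increases to the east.
places = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

# Equal populations put the center in the middle of the square.
pop_2010 = [100, 100, 100, 100]
# Growth only in the two western places (x = 0) pulls the center west.
pop_2015 = [140, 100, 100, 140]

print("2010 center:", weighted_mean_center(places, pop_2010))
print("2015 center:", weighted_mean_center(places, pop_2015))
```

With equal weights, the same function gives the plain geographic mean center of the points.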




Wednesday, February 1, 2017

Assignment 1

Jeffrey Hessburg
GEOG 370
Spring 2017

For part I, I will explain the difference between Nominal, Ordinal, Interval, and Ratio Data.

Nominal data is data where each unit is assigned to a category. The categories are used for labeling variables and have no numerical significance. One type of nominal data is hair color. A given person can be labeled with one hair color, for example red, blonde, or brunette. Another good example of nominal data is an electoral college map. Each state fits into a category based on which presidential candidate received the most votes.

http://www.270towin.com/presidential_map_new/maps/gv32O.png
It is clear to see that each state fits into either Trump or Clinton. Two labels. 

Ordinal data is data where the values are ranked. The relationship between one value and another is based on whether it is more than or less than the other. An example of ordinal data is a company asking how satisfied people are with its products. They could answer:
1. Very Unsatisfied
2. Unsatisfied
3. Neutral
4. Satisfied
5. Very satisfied
Each of these answers tells the company whether one person is more or less satisfied than another.
Ordinal data can be quantitative. Values can be placed into categories and then ordered. The map below is an example of this.
https://blog.zingchart.com/assets/zing-content/uploads/2015/11/Screen-Shot-2015-11-18-at-11.53.17-AM.png
In the map above, the immunization percentages in California child care facilities are grouped together for each county. The map viewer can then determine whether one county has a higher or lower immunization percentage than another county.

Interval data is data on a numeric scale, where the order and the difference between values are known. There is no true zero with interval data. For example, on an elevation map, 0 elevation does not mean there is no elevation; it is just a reference point. Below is an example of an elevation map.
https://bgommartin.files.wordpress.com/2015/11/the-white-space-representing-the-elevation-change-between-two-contours-is-called-the-contour-interval.png?w=580
It can be noted that the differences are measurable with each contour line. 

Ratio data is similar to interval data, but there is a true zero. This means that it is possible to measure differences as well as ratios: how many times larger or smaller one piece of data is compared to another. An example of ratio data is weight. There is a known 0, and nothing has negative mass. Another example, one that can be mapped, is the number of vehicles per person in New York. There is no such thing as negative cars.
http://la.streetsblog.org/wp-content/uploads/sites/2/2010/12/NY-Vehicles-Per-Person.jpg
It should be noted that there is a distinct 0 on this map. 

Part II
For part two, my goal is to help my agriculture consulting/marketing company increase the number of women who are principal operators of a farm. To do this, I have been instructed to create 3 maps showing the number of women principal operators for every county in Wisconsin. The first step is going to the U.S. Census website and downloading a shapefile that has every Wisconsin county. Next, that shapefile must be added to ArcMap and joined with an Excel document that has the number of women farm operators in each county. Once the data is joined, each county is grouped into one of four classes based on how many women farm operators it has. The class a county is placed in is determined by the classification method. Three different methods were used: equal interval, quantile, and natural breaks. All are shown in the three maps below. Each map is made from the same data; the only difference is how the data is classified. After the classification was determined, an appropriate color scheme was chosen. Finally, basic map elements were added, such as a title, legend, north arrow, scale, and reference box.

Equal Interval
The equal interval method classifies the number of women principal farm operators into groups that each cover an equal range of values.


Quantile
The quantile method classifies the number of women principal farm operators into groups that each contain an equal number of counties.


Natural Breaks
The natural breaks method classifies the number of women principal farm operators into groups based on natural groupings in the data, placing class breaks where there are relatively large gaps between values.
Only one of these maps is to be used to persuade women to become principal farm operators. I believe that the best map to use for this purpose is the equal interval map. Compared to the other two maps, the equal interval map makes it look like there are hardly any women who are principal farm operators in Wisconsin. Because of this, I think it can inspire women to believe that they can change that and become principal farm operators themselves.
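
The difference between the equal interval and quantile methods can be sketched in a few lines. The county counts below are hypothetical, not the Wisconsin data, and the function names are made up for illustration. ArcMap's own implementation may choose break values slightly differently. Natural breaks is left out because it needs an optimization step.

```python
def equal_interval_breaks(values, n_classes):
    """Upper class limits that split the data range into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper class limits that put about the same number of values in each class."""
    ordered = sorted(values)
    n = len(ordered)
    # ceil(n * i / n_classes) values fall at or below the i-th break.
    return [ordered[(n * i + n_classes - 1) // n_classes - 1]
            for i in range(1, n_classes + 1)]


def class_sizes(values, breaks):
    """Count how many values fall into each class."""
    sizes = [0] * len(breaks)
    for value in values:
        for index, upper in enumerate(breaks):
            if value <= upper:
                sizes[index] += 1
                break
    return sizes


# Hypothetical counts of women principal operators for 12 counties.
counts = [10, 12, 15, 18, 20, 22, 25, 30, 35, 40, 90, 210]

for name, method in [("Equal interval", equal_interval_breaks),
                     ("Quantile", quantile_breaks)]:
    breaks = method(counts, 4)
    print(name, "breaks:", breaks, "class sizes:", class_sizes(counts, breaks))
```

With skewed data like this, equal interval puts almost every county in the lowest class, which is why that map makes it look like there are hardly any women principal operators.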

References:
Part I:
http://www.mymarketresearchmethods.com/types-of-data-nominal-ordinal-interval-ratio/
Notes from class were also used
*the link where each picture came from is beneath each picture 
Part II:
for definitions: http://support.esri.com/other-resources/gis-dictionary/term/natural%20breaks%20classification
*the data used for the maps are on the bottom right of each map