> The police department's traffic unit decided early last year to step up enforcement around three intersections that consistently rank among the top accident locations in the city... That meant placing more officers in marked cars at the intersection... Compared with 2008, last year saw 14 fewer accidents at those intersections, an 11 percent decrease.
The box to the left of the article presents the data, which I've reproduced below:
```
Accident data

Intersection             2008    2009
28th and Arapahoe          52      46
30th and Arapahoe          39      33
Foothills and Arapahoe     35      33
Citywide                3,242   3,405

Source: Boulder Police Department
```
As you may have guessed, I was quite skeptical of this article. Just because a random variable varies from experiment to experiment doesn't mean the variation is anything but coincidence. I just flipped a coin four times: the first two flips came up heads, and the second two came up heads and then tails. That, of course, doesn't mean the coin suddenly became less lucky. So, the question I want to answer in this post is: is the Boulder PD justified in thinking that they've decreased the number of traffic accidents at these three intersections?
The way I'm going to answer this question is to first model these accidents with a probability distribution, and then use that model to find the probability that the decrease in accidents could have been due to random variation. The distribution I'm going to use is a really interesting one called the Poisson distribution. It's great for modeling events that happen very rarely, such as shark attacks and traffic accidents: many people drive every day, but very few (relatively speaking) are involved in an accident. Conveniently, this distribution has a single parameter, λ, which is equal to both its expected value and its variance.
The issue, of course, is that we don't know the expected value or variance of traffic accidents at these intersections. We can, though, make a charitable assumption. In the interests of some fun mathematics, let's assume that 2008 was an incredibly average year, and that the number of accidents that occurred that year is exactly the expected value. Since the variance is equal to the expected value, we can, for example, say that the variance is 52 for 28th and Arapahoe.
Now that we know the variance and expected value, we can calculate the z-score of the decrease in accidents, and from that, the probability that the random variable (in this case, the number of accidents at 28th and Arapahoe) could, by mere chance, decrease by 6 to 46. But wait a minute. z-scores only pertain to normal distributions, and we're using the Poisson distribution. That's ok, though: since λ is large (λ > 25), the Poisson distribution can be approximated by a normal distribution with mean λ and standard deviation √λ.1 So, let's calculate the z-score:

z = (46 − 52) / √52 ≈ −0.832
We now know that a decrease of 6 accidents from 52 falls at about −0.832 standard deviations from the mean. From this, we can tell that the chance that the number of accidents in a given year would fall between 46 and 58 (±6 from the mean of 52) is about 60%, and the chance that the number of accidents would be less than or equal to 46 is about 20%. So, it seems quite likely that this variation was just random coincidence. The Boulder PD is probably being a bit too optimistic in thinking that it's their additional enforcement that decreased the number of traffic accidents at 28th and Arapahoe. Now, what about the other intersections? I've repeated this calculation for all four statistics below:
```
Accident data

Intersection             2008    2009   z-value   p-value
28th and Arapahoe          52      46    -0.832    0.202   (left-tailed)
30th and Arapahoe          39      33    -0.961    0.168   (left-tailed)
Foothills and Arapahoe     35      33    -0.338    0.368   (left-tailed)
Citywide                3,242   3,405     2.863    0.002   (right-tailed)
```
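These z-scores and p-values are easy to reproduce with a few lines of Python. This is a sketch of the calculation described above, not code from the original analysis; the normal CDF comes from `math.erf`.

```python
import math

def poisson_z_and_p(lam, observed):
    """Normal approximation to Poisson(lam): z-score of `observed`
    and the corresponding left-tailed p-value."""
    z = (observed - lam) / math.sqrt(lam)
    p_left = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF
    return z, p_left

# The 2008 counts are assumed to be the true expected values (lambda)
data = {
    "28th and Arapahoe": (52, 46),
    "30th and Arapahoe": (39, 33),
    "Foothills and Arapahoe": (35, 33),
    "Citywide": (3242, 3405),
}

for name, (lam, obs) in data.items():
    z, p_left = poisson_z_and_p(lam, obs)
    # For the citywide increase, the right tail is the interesting one
    p = p_left if obs < lam else 1 - p_left
    print(f"{name}: z = {z:+.3f}, p = {p:.4f}")
```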
The decrease at 30th and Arapahoe is slightly more significant, but still has a 17% chance of being due to randomness. Foothills and Arapahoe fares the worst of the three, with a whopping 37% chance of being random. Ironically, it is the citywide increase in accidents that is the most significant result: there's only a 0.2% chance that the number of accidents would randomly increase by at least 163! That is statistically significant, and it's plausible that this variation isn't due to random chance. Perhaps the Boulder PD spent too much time at these intersections at a cost to all of the other intersections in the city!
In any case, this analysis is, of course, far from perfect. The assumption I made concerning the expected value may very well have been wrong. Perhaps both 2008 and 2009 were anomalous years, and the expected value is actually much different. I will also grant that it is interesting that the citywide total went up, while these three intersections all went down, but I'm also not sure how much cherry picking they did. Really, the point of this post was to showcase some interesting mathematics, and show how far a little algebra can go. With only a little more data, I think this sort of analysis would have some interesting results. Hopefully someone at Boulder PD is listening. I wonder how much more money they threw at this enforcement campaign without really having any good data on its efficacy.
The first algorithm was the simplest, and consisted of weighing pairs of coins. We begin analyzing this algorithm by noticing two things: (1) it could take anywhere between 1 and 4 trials to search through the 8 coins, and (2) each of these outcomes is equally likely. There's a 1 in 4 chance that the counterfeit coin will be found on the first trial, and likewise for the second, third, and fourth trials. With these two pieces of information, we can calculate the average number of steps it takes to search through 8 coins: multiply each outcome (1, 2, 3, or 4 trials) by its probability (1 in 4) and sum the values. The expected value, or average number of steps for 8 coins, is therefore (1 + 2 + 3 + 4)/4 = 2.5.
Another way of thinking about this goes as follows: Half the time you'll find the coin on or before 2 trials, and half the time you'll find the coin on or after 3 trials. Therefore, the average number of trials is 2.5.
That's simple enough, but can this be generalized? Yes. For n coins, the following sum represents the average number of trials:

(2/n) × (1 + 2 + ⋯ + n/2) = (n + 2)/4

If you're skeptical, try out the sum for 8 coins and verify that it produces the same value, 2.5, as above. The sum simply multiplies each possible outcome (1, 2, 3, or 4 trials for 8 coins) by one over the total number of possible outcomes (1/4), then adds them up. Note that n must be an even number; if it's not, round up to the nearest even number. So, what are the takeaway points?
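Before the takeaways, a quick sanity check: the sum above can be evaluated directly in a few lines of Python (this sketch is mine, not from the original post; `fractions` keeps the arithmetic exact).

```python
from fractions import Fraction

def average_trials(n):
    """Average number of weighings for the simple pairwise algorithm.
    n must be even: there are n/2 equally likely stopping points,
    each with probability 2/n."""
    outcomes = n // 2
    return sum(Fraction(k, outcomes) for k in range(1, outcomes + 1))

print(average_trials(8))    # the 8-coin case worked out above: 5/2, i.e. 2.5
print(average_trials(128))  # matches (n + 2) / 4
```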
The upshot is that the algorithm isn't horrible for a small number of coins, but once n starts to get big, so does the number of steps required. In fact, for 8 coins, this algorithm is slightly better, on average, than the binary search, which always takes exactly 3 steps, but it's also slightly worse than the ternary search, which always takes 2 steps. As I'll show, this doesn't hold when the number of coins gets larger.
Calculating the average number of steps for the binary and ternary search is much easier than for the simple algorithm, because the binary and ternary search always take a fixed number of steps, as shown by the decision tree (this isn't true if the tree is unbalanced, or if n isn't a power of 2 or 3). We can also see that for a decision tree to search through 8 coins, it must have 8 termination points at the bottom of the tree, called "leaves." The number of steps to complete the search is related to this by 2^d ≥ 8 for the binary tree and 3^d ≥ 8 for the ternary tree, where d is the number of steps or "depth" of the tree (Hein, p. 288). Solving for d, we get an expression for the minimum number of steps, where n is the number of coins and b is 2 for a binary tree and 3 for a ternary tree:

d = ⌈log_b(n)⌉
The brackets are the ceiling operator, which rounds the number up, because a tree's depth must be an integer.
Hein takes this a step further and proves that the ternary decision tree is the optimal decision tree (p. 288). If there are 8 coins, there must be 8 leaves on the tree. The minimum depth for a ternary tree with 8 leaves is ⌈log_3(8)⌉ = 2. The tree in the last post has a depth of 2, and is therefore an optimal tree (among others).
The takeaway points here are:

| n | Simple Algorithm (avg. steps) | Binary | Ternary |
| --- | --- | --- | --- |
| 8 | 2.5 | 3 | 2 |
| 128 | 32.5 | 7 | 5 |
| 1,024 | 256.5 | 10 | 7 |
| 1,048,576 | 262,144.5 | 20 | 13 |
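The table above can be regenerated with a short script (again, my sketch rather than anything from the post). Computing the depth with an integer loop instead of floating-point logarithms avoids rounding surprises near exact powers.

```python
def simple_avg(n):
    """Average steps for the simple pairwise algorithm: (n + 2) / 4."""
    return (n + 2) / 4

def tree_depth(n, b):
    """Smallest d with b**d >= n: the depth of a balanced b-ary
    decision tree with n leaves."""
    d = 0
    while b ** d < n:
        d += 1
    return d

for n in (8, 128, 1024, 1_048_576):
    print(f"{n:>9}  {simple_avg(n):>11}  {tree_depth(n, 2):>6}  {tree_depth(n, 3):>7}")
```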
As Jon Bentley points out in Programming Pearls, because the ternary search scales so much better, there's some sufficiently large value of n at which a pocket calculator running the ternary search will outpace a supercomputer running the simple algorithm.
That was my first thought, at least, and, as it turns out, it's wrong. The area only depends on the distance between the planes and the diameter of the sphere. It doesn't matter where these planes are in relation to the sphere. As long as they both intersect the sphere, the surface area will equal πdh, where d is the diameter and h is the distance between the planes. Don't believe me? Here's the math to prove it:
First we begin with the formula for the surface area of a function f(x) rotated around the x axis from a to b. If you want an explanation of this equation, the section of my textbook that covers it is actually posted online [PDF] by the publishers (also note that this is problem 32 from that chapter):

S = ∫ₐᵇ 2π f(x) √(1 + f′(x)²) dx

Since we are finding the surface area of a sphere, we need the function for a circle, which we will rotate around the x axis, thus producing our sphere. If we can remember back to high school geometry, we'll know that the equation for a circle is x² + y² = r². We now solve for y and plug it in for f(x), and also find the derivative of f(x) and plug it in for f′(x):

f(x) = √(r² − x²),  f′(x) = −x / √(r² − x²)

We now simplify the integral:

S = ∫ₐᵇ 2π √(r² − x²) √(1 + x²/(r² − x²)) dx

f(x) is moved into the square root:

S = ∫ₐᵇ 2π √((r² − x²)(1 + x²/(r² − x²))) dx

We distribute:

S = ∫ₐᵇ 2π √(r² − x² + x²) dx

More simplification:

S = ∫ₐᵇ 2π r dx

We now integrate:

S = 2πr (b − a)

b − a is simply the distance between the two planes, which we call h:

S = 2πr h

2r is the diameter of the sphere, which we call d:

S = π d h
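If you'd rather trust a computer than the algebra, here's a numerical check of the result. The radius and plane positions below are arbitrary choices of mine, not values from the post; the code integrates the surface-of-revolution formula directly and compares it to πdh.

```python
import math

def zone_area(r, a, b, steps=10_000):
    """Midpoint-rule approximation of the area swept out by rotating
    f(x) = sqrt(r^2 - x^2) around the x axis, from x = a to x = b."""
    f  = lambda x: math.sqrt(r * r - x * x)
    df = lambda x: -x / math.sqrt(r * r - x * x)
    dx = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        total += 2 * math.pi * f(x) * math.sqrt(1 + df(x) ** 2) * dx
    return total

# Two arbitrary planes cutting a unit sphere
r, a, b = 1.0, -0.3, 0.5
d, h = 2 * r, b - a
print(zone_area(r, a, b))   # numerical integral
print(math.pi * d * h)      # pi * d * h -- should agree
```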
There you have it. The surface area of a sphere between two parallel planes is equal to πdh. It doesn't matter where these planes are in relation to the sphere. All that matters is that both planes intersect the sphere. Cool, huh? No? Ok, fine. Well, I thought it was interesting.
Fₖ is, of course, the kth Fibonacci number. Now, before I go on, I want to point out that I stumbled upon these two equations and their proofs in Gilbert Strang's excellent Linear Algebra and Its Applications. All credit goes to him for coming up with this, and I recommend his book for an even more in-depth explanation.
Anyway, back to the equation, which I found interesting for a few reasons:
To see how we get that equation from the equation in the last post, let's begin with the way Mr. Strang has it written in his book (it's the same as mine, but flipped upside down):

(Fₖ₊₂, Fₖ₊₁) = [1 1; 1 0] (Fₖ₊₁, Fₖ)

where F₀ = 0, F₁ = 1, F₂ = 1, etc. Let's now make some substitutions:

A = [1 1; 1 0],  uₖ = (Fₖ₊₁, Fₖ)

We now have:

uₖ₊₁ = A uₖ,  and therefore  uₖ = Aᵏ u₀
This hasn't changed the content of the equation at all. We've just substituted in new symbols to represent the different terms.
The next step is to diagonalize the matrix A. Remember that diagonalizing A produces A = SΛS⁻¹, where S contains A's eigenvectors and Λ contains A's eigenvalues. Substituting A = SΛS⁻¹ into the previous equation yields:

uₖ = (SΛS⁻¹)(SΛS⁻¹)⋯(SΛS⁻¹) u₀

Notice how everything but the first S and the last S⁻¹ cancels, giving the final form: uₖ = SΛᵏS⁻¹u₀. This is important, because Λ is a diagonal matrix, making Λᵏ very simple:

Λᵏ = [λ₁ᵏ 0; 0 λ₂ᵏ]
λ₁ and λ₂ are, of course, the eigenvalues of A. Let's now make one last substitution and put all this diagonalization stuff together. First we define c as

c = S⁻¹u₀

Then we substitute it into the equation:

uₖ = SΛᵏc = [x₁ x₂] [λ₁ᵏ 0; 0 λ₂ᵏ] (c₁, c₂) = c₁λ₁ᵏx₁ + c₂λ₂ᵏx₂

That might need some explaining. The first bit is simply the equation with c substituted in. Following that, the first matrix is S, written to show that it contains x₁ and x₂, the eigenvectors of A, placed vertically in its two columns. The middle matrix is Λᵏ and the last is simply the vector c. Finally, the matrices are multiplied out, yielding the final c₁λ₁ᵏx₁ + c₂λ₂ᵏx₂.
The final steps are simply to compute c and A's eigenvalues and eigenvectors. I'll spare you the tedious algebra and simply tell you that:

λ₁ = (1 + √5)/2,  λ₂ = (1 − √5)/2,  x₁ = (λ₁, 1),  x₂ = (λ₂, 1),  c = (1/√5) (1, −1)

The eigenvalues can be computed the standard way, by solving det(A − λI) = 0.

All the variables are then plugged into the equation for uₖ:

uₖ = (1/√5) λ₁ᵏ (λ₁, 1) − (1/√5) λ₂ᵏ (λ₂, 1)
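To sanity-check the diagonalization numerically, here's a short NumPy sketch (mine, not from Strang's book; it just redoes the steps above in code and recovers Fibonacci numbers):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
u0 = np.array([1.0, 0.0])        # u0 = (F1, F0)

eigvals, S = np.linalg.eig(A)    # columns of S are the eigenvectors x1, x2
c = np.linalg.solve(S, u0)       # c = S^-1 u0

k = 10
Lam_k = np.diag(eigvals ** k)    # Lambda^k: just raise the diagonal to k
uk = S @ Lam_k @ c               # uk = S Lambda^k S^-1 u0 = (F_{k+1}, F_k)

print(np.round(uk).astype(int))  # → [89 55], i.e. F11 = 89 and F10 = 55
```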
We want Fₖ, so we multiply out, take the bottom row, and get the equation we want:

Fₖ = (1/√5) ((1 + √5)/2)ᵏ − (1/√5) ((1 − √5)/2)ᵏ

That's the full equation, but now notice that the magnitude of the second term is always less than 1/2. This means we can simply drop it, yielding:

Fₖ ≈ (1/√5) ((1 + √5)/2)ᵏ
In fact, since the magnitude of the second term is always less than 1/2 and the full equation always gives an integer, we can take this a step further: rounding the approximation to the nearest integer will always give you the exact value of Fₖ. Cool.
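Here's that rounding trick in a few lines of Python (my sketch; it stays exact until around k = 70, where double-precision floats run out of bits):

```python
import math

def fib_binet(k):
    """Binet's formula with the small term dropped, then rounded."""
    phi = (1 + math.sqrt(5)) / 2
    return round(phi ** k / math.sqrt(5))

# Compare against the plain iterative definition
a, b = 0, 1
for k in range(30):
    assert fib_binet(k) == a
    a, b = b, a + b

print([fib_binet(k) for k in range(10)])  # → [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```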
That generates the aₙ and aₙ₊₁ terms of the Fibonacci sequence.1 To see why, let's look at a recursive definition of the Fibonacci sequence:

a₁ = 1,  a₂ = 1,  aₙ₊₁ = aₙ₋₁ + aₙ
That's easy enough to understand. The first and second terms are both 1, and each next term is the sum of the previous term and the current term. To see how this relates to the matrix above, we must turn this into a matrix equation. The trick here is to have a good understanding of matrix multiplication. If you need a refresher, here's a site with a tutorial. Let's look at an example of a 2x2 matrix multiplied by a 2x1 matrix (even if you have a good understanding of matrix multiplication, hang tight, this is going somewhere):

[a₁,₁ a₁,₂; a₂,₁ a₂,₂] (b₁,₁, b₂,₁) = (a₁,₁b₁,₁ + a₁,₂b₂,₁, a₂,₁b₁,₁ + a₂,₂b₂,₁)

Now let's make some substitutions and multiply it out:

[0 1; 1 1] (b₁,₁, b₂,₁) = (b₂,₁, b₁,₁ + b₂,₁)

As you can see, the effect of multiplying the b matrix by the Fibonacci matrix is that the entry in the b₂,₁ position moves to the b₁,₁ position, and the b₂,₁ position gets filled with the sum of b₁,₁ and b₂,₁. Let's now substitute aₙ₋₁ and aₙ for the b's:

[0 1; 1 1] (aₙ₋₁, aₙ) = (aₙ, aₙ₋₁ + aₙ)
Does the resultant matrix look familiar? If you look above, you'll see that it's the same as our recursive definition of the Fibonacci sequence! The top row is equal to the current number in the sequence (aₙ) and the bottom row is equal to the next number in the sequence (aₙ₋₁ + aₙ = aₙ₊₁). Continuing to multiply the resultant matrix by the Fibonacci matrix will produce consecutive entries of the sequence. Because matrix multiplication is associative, we can move the repeated multiplication into an exponent and apply that result to the first two terms in the sequence (0, 1), leading to our initial matrix:

[0 1; 1 1]ⁿ (0, 1) = (aₙ, aₙ₊₁)
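A small Python sketch of the matrix approach (the helper names are mine; `mat_pow` uses repeated squaring, which the associativity remark above makes possible):

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def mat_pow(M, n):
    """M**n by repeated squaring."""
    result = [[1, 0], [0, 1]]          # identity matrix
    while n:
        if n & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        n >>= 1
    return result

Q = [[0, 1], [1, 1]]                   # the Fibonacci matrix
n = 10
Qn = mat_pow(Q, n)
# Applying Q^n to the first two terms (0, 1) yields (a_n, a_{n+1})
an  = Qn[0][0] * 0 + Qn[0][1] * 1
an1 = Qn[1][0] * 0 + Qn[1][1] * 1
print(an, an1)  # → 55 89
```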
1: Strang, Gilbert. "Linear Algebra and Its Applications."