Remember Stephen Colbert’s thrilling 434-part series? This is just like that, only with Seattle area bicycle counters. Once upon a time I wrote about the Fremont Bridge bike counter at the Seattle Bike Blog. Now SDOT publishes data from nine bike counters, and the plan is to run through them one at a time, breaking down the trends and complaining about the poor quality of Seattle Open Data.
I thought that I would start with the Burke-Gilman Trail near Magnuson Park in honor of the recent opening of the Montlake bike/walk bridge. The completion of Burke-Gilman construction through the UW will certainly be a relief for a lot of people. As the Seattle Bike Blog says, “With the addition of the bridge, biking and walking in the area will be essentially unrecognizable from just a few years ago.” After checking it out this week, I completely agree! Now let’s head a few miles northeast and check out the measured bicycle traffic along the Burke-Gilman trail near Magnuson Park.
Overview
Below is a Google Maps image of northeast Seattle with bicycle routes highlighted in green. Allegedly the counter is somewhere around NE 70th Street near Sand Point, which is a few miles from the UW and the Montlake Triangle mentioned above. I don’t know enough about how the counters work, but at the very least there’s no obvious visible sign such as the totems at the Fremont and Spokane St bridges, which is why I’m not sure about the exact location. The general location is marked on the map.
This bicycle counter started reporting data on January 1, 2014. It wouldn’t be a Seattle Open Data project, though, without data quality problems. The bicycle trip records are missing for June 2015 and dubious for May 2015. For example, on May 17, 2015, the counter recorded 2078 northbound bike trips and 4 southbound bike trips. This seems highly unlikely. Throughout basically the rest of the data the northbound and southbound trips roughly balance by day, which suggests that the extreme imbalances in May 2015 are data errors. Because the total counts look plausible, though, I’ve left in the daily totals from May 2015 for this analysis and chalked the imbalance up to an apportionment error between the two directions.
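For anyone who wants to hunt for lopsided days like May 17, here is a minimal sketch of one way to flag them. The function names and the 25% threshold are my own illustrative assumptions, not part of the original analysis; the only real numbers are the May 17, 2015 counts quoted above.

```python
def direction_share(northbound, southbound):
    """Return the smaller direction's share of the daily total (0.0 to 0.5)."""
    total = northbound + southbound
    if total == 0:
        return None  # no trips recorded, so there is nothing to compare
    return min(northbound, southbound) / total


def is_suspicious(northbound, southbound, min_share=0.25):
    """Flag a day whose minority direction falls below min_share of the total.

    min_share=0.25 is an arbitrary illustrative threshold (an assumption).
    """
    share = direction_share(northbound, southbound)
    return share is not None and share < min_share


# May 17, 2015 as reported by the counter: 2078 northbound, 4 southbound.
may_17_share = direction_share(2078, 4)
may_17_flag = is_suspicious(2078, 4)
print(f"minority share: {may_17_share:.4f}, suspicious: {may_17_flag}")
```

A day where the two directions roughly balance would pass this check, while the May 17 record, with well under 1% of trips in one direction, would not.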
Let’s start by looking at the broad trends in daily bicycle crossings on the Burke-Gilman near Sand Point. The table below shows average crossings per day throughout the year 2014 for each month.
Month | Average Crossings per Day in 2014 |
January | 577 |
February | 442 |
March | 736 |
April | 1211 |
May | 1686 |
June | 1781 |
July | 1902 |
August | 1772 |
September | 1325 |
October | 848 |
November | 542 |
December | 433 |
Average | 1109 |
This is a very seasonal route! Average wintertime trips were around 400-600 per day, while the mid-summer months averaged roughly 1,800 to 1,900 per day. July was the peak month, and it was nearly a dead heat between December (433) and February (442) for the quietest month of 2014 on the BG near Magnuson Park. Across the whole year an average of 1,109 bicycle crossings per day were recorded at the counter.
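As a quick sanity check on that yearly figure, the sketch below weights each monthly average from the table by the number of days in that month. It assumes the yearly average is a per-day average over all of 2014; because the monthly values are already rounded, the result is only approximate.

```python
import calendar

# Average crossings per day for January through December 2014, from the table above.
monthly_avg = [577, 442, 736, 1211, 1686, 1781, 1902, 1772, 1325, 848, 542, 433]

# Number of days in each month of 2014 (not a leap year).
days = [calendar.monthrange(2014, month)[1] for month in range(1, 13)]

# Approximate total crossings for the year, then divide by the days in the year.
total_crossings = sum(avg * n for avg, n in zip(monthly_avg, days))
weighted = total_crossings / sum(days)

# For comparison: the plain mean of the twelve monthly averages.
simple = sum(monthly_avg) / len(monthly_avg)

print(round(weighted), round(simple))  # prints: 1109 1105
```

The day-weighted version lands on the 1,109 reported in the table, while the plain mean of the twelve months comes out slightly lower because it gives short February the same weight as the long summer months.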
We can also look by day of week. The table below shows average crossings per day throughout the year 2014 by day of week. This is a weekend warrior route! Saturday and Sunday saw the highest average bicycle crossings. During the work week, crossings started highest on Monday and drifted down through Thursday before ticking up slightly on Friday, but those differences were not very pronounced.
Day of Week | Average Crossings per Day in 2014 |
Sunday | 1406 |
Monday | 1085 |
Tuesday | 1010 |
Wednesday | 993 |
Thursday | 924 |
Friday | 961 |
Saturday | 1383 |
Those are probably the two most obvious characteristics of this bicycle counter: highly seasonal and beloved by recreational riders. The plot below of daily trips since January 2014 also shows these features pretty well.
During warmer months the weekend riders substantially exceeded the commuters, with the busiest summer days seeing over 3000 bicycle crossings. During the depth of winter the weekday/weekend split was roughly equal and much, much lower than in the summer.
Hour of Day
This is the fun part. Let’s look at the hourly trips by weekday or weekend. The plot below shows quantiles by hour of the day overlaid on all the data. Each thin black line is a day, and the colored lines represent summaries for that hour. A 2.5% quantile means that 2.5% of the time the crossings during that hour of the day were lower and 97.5% of the time they were higher. The advantage of this type of all-data plot is that you get to see the overall trends, along with whatever oddities poke out of the jumble of ordinary days.
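For the curious, here is a rough sketch of how hour-by-hour quantiles like these can be computed. The toy counts are made up purely for illustration, the choice of 2.5%, 50%, and 97.5% is an assumption (only the 2.5% quantile is named above), and the linear-interpolation rule is just one of several common quantile definitions.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a list, with q between 0 and 1."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def hourly_quantiles(days, probs=(0.025, 0.5, 0.975)):
    """Summarize each hour of the day across many days.

    days: list of 24-element lists of hourly counts, one list per day.
    Returns {hour: {prob: quantile of that hour's counts}}.
    """
    result = {}
    for hour in range(24):
        counts = [day[hour] for day in days]
        result[hour] = {p: quantile(counts, p) for p in probs}
    return result


# Made-up toy data: five days with crossings only during the 8 AM hour.
toy_days = []
for count_at_8am in (10, 20, 30, 40, 50):
    day = [0] * 24
    day[8] = count_at_8am
    toy_days.append(day)

summary = hourly_quantiles(toy_days)
print(summary[8])
```

Each colored line in a plot like this is then just one of those quantiles traced across the 24 hours.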
We definitely got some oddities poking out! The early morning ones around 6 or 7 AM occurred during the mornings of Friday August 15, 2014 and Saturday August 16, 2014. These times appear to match the Cascade Bicycle Club’s Ride from Seattle to Vancouver and Party (RSVP)! At that time of the morning we would typically expect to see at most 150 crossings per hour, but instead it was measured at almost 600. However, the most prolific hour occurred in mid-May with over 700 crossings, at a time that matches the RedHook Brewery Haul Ash Tour de Brew. Finally, the third fun event jutting above the jumble of normal days appears to be the Cascade Bicycle Club’s Woodinville Wine Ride, taking riders from the Woodinville Commons to Cascade’s headquarters at Magnuson Park and back, presumably with ample wine.
Let’s look more at the ordinary day-to-day behavior, though, of the bicycle crossings. The plot below is similar but with large and unusual events removed.
Once again we see that weekend recreation appears to trump weekday commuting out by Sand Point. This isn’t too surprising I suppose, as working downtown would imply at least a ten mile trip each way. Obviously there are other employment centers like the University of Washington and Seattle Children’s much closer, but for sheer volume of commuters this doesn’t appear to be enough of a link between residential population and job centers.
On a good day (read: sunny summer day) it’s remarkable that the weekend warriors start hitting the pavement as early as 6 AM and keep it up all the way to 8 PM. That’s a full day of people on the trail! The weekend peak occurs roughly between 10 AM and 3 PM. As we already saw, though, everything about this route is seasonal. If the zest and vigor of a sunny Saturday is remarkable, then the torpor of a cloudy winter day is stark by comparison.
The commuters also probably bear mention. They get going early! People are riding bicycles on the Burke-Gilman past Magnuson Park at 5 AM on a good day. It’s funny that the peak hour on high-traffic days appears to be 7 AM, while the peak hour on lower-traffic days appears to be 8 AM. This could be a feature of wintertime vs summertime daylight, a data collection error, a data analysis error, or maybe something else that doesn’t occur to me. The evening trips are much more concentrated than the morning trips, peaking at 5 PM throughout the year.
Inference
To really get our money’s worth out of this data, we can learn even more by fitting a regression model. The basic idea is to adjust for common influences of bicycle trips such as daylight, temperature, and precipitation, and then examine the parameters of the adjustment and look at what patterns remain.
If anybody actually cares, the model was specified with total daily bicycle crossings as a function of 1) average daily temperature, 2) that day’s precipitation, 3) the previous day’s precipitation (the “echo” effect), and 4) hours of sunlight. For temperature and precipitation, I downloaded data from the Sand Point NOAA GHCN weather station. For daylight I downloaded hours of sunlight for Seattle here.
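If you would like to see the mechanics, below is a minimal sketch of fitting a linear model with those four predictors by ordinary least squares. Everything in it is an assumption for illustration: the post does not say which software or exact fitting method was used, the seven rows of weather data are invented, and the intercept of -900 is hypothetical. The toy response is generated without noise from the weekday estimates in the table, so the fit should simply hand those numbers back.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coefs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coefs[j] for j in range(i + 1, n))
        coefs[i] = (aug[i][n] - tail) / aug[i][i]
    return coefs


def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y.

    rows: predictor lists without an intercept; a column of ones is prepended.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented toy days: (avg temp in F, precip in 0.1", yesterday's precip in 0.1", daylight hours)
toy_rows = [
    (40, 3, 0, 9), (45, 0, 3, 10), (55, 1, 0, 12), (60, 0, 1, 14),
    (70, 0, 0, 16), (65, 2, 2, 15), (50, 5, 1, 11),
]

# Hypothetical intercept followed by the weekday estimates from the table.
true_beta = [-900.0, 25.0, -69.0, -19.0, 66.0]
toy_y = [sum(b * v for b, v in zip(true_beta, [1.0] + list(row))) for row in toy_rows]

beta = fit_ols(toy_rows, toy_y)
print([round(b, 3) for b in beta])
```

With real, noisy counts the estimates would of course not match any preset values, and a statistics package would also report the standard errors and p-values shown in the table.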
Here are estimated coefficients:
Variable | Weekday Estimate | Weekday p-value | Weekend Estimate | Weekend p-value |
Temperature (per 1F) | 25 | <1e-5 | 26 | <1e-5 |
Precipitation (per 0.1”) | -69 | <1e-5 | -90 | <1e-5 |
Yesterday’s Precipitation (per 0.1”) | -19 | 2.5e-4 | -41 | 0.01 |
Daylight (per hour) | 66 | <1e-5 | 176 | <1e-5 |
Bike Month | 182 | 4.2e-5 | 96 | 0.45 |
To read the table, the “Estimate” shows the change in daily bicycle crossings associated with the given variable. For example, during weekdays, adjusting for precipitation, daylight, and bike month, one higher degree of average temperature was associated with 25 additional crossings. The p-value assesses the extent to which that effect could have been explained by random chance alone, and most would achieve “statistical significance” as it is typically doled out.
There are a lot of interesting things going on here! Once again we see the “echo” effect, where yesterday’s precipitation also has an effect on today’s bicycle trips, although not as large as today’s precipitation. On weekdays, an additional tenth of an inch of precipitation corresponded to 69 fewer crossings, on average. An additional tenth of an inch of precipitation falling the previous day corresponded to 19 fewer crossings on average. The absolute magnitude of the precipitation depression on bicycle trips was larger on the weekends, although that also could have been due to the larger overall volume of traffic on weekends. It does make sense, however, that people riding bicycles for leisure or exercise could be more susceptible to adverse conditions than people riding bicycles as transportation.
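To make the table a bit more concrete, here is a small sketch that turns the estimates into an expected change in crossings for a given change in conditions, holding everything else fixed. The coefficients are copied from the table; the rainfall scenario and the function name are hypothetical.

```python
# Estimated coefficients copied from the table above (crossings per unit).
COEFS = {
    "weekday": {"temp_f": 25, "precip_tenths": -69,
                "prev_precip_tenths": -19, "daylight_hours": 66},
    "weekend": {"temp_f": 26, "precip_tenths": -90,
                "prev_precip_tenths": -41, "daylight_hours": 176},
}


def expected_change(day_type, **deltas):
    """Sum coefficient * change over the named predictors; others are held fixed."""
    coefs = COEFS[day_type]
    return sum(coefs[name] * delta for name, delta in deltas.items())


# Hypothetical scenario: 0.3" of rain today following 0.2" yesterday.
weekday_drop = expected_change("weekday", precip_tenths=3, prev_precip_tenths=2)
weekend_drop = expected_change("weekend", precip_tenths=3, prev_precip_tenths=2)
print(weekday_drop, weekend_drop)  # prints: -245 -352
```

So under this linear model the same soggy couple of days would be expected to cost a weekday about 245 crossings and a weekend day about 352, subject to all the caveats about the estimates themselves.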
The daylight/seasonal effect was quite large, especially for weekend crossings (we saw that in the daily plot in the first section). The interpretation is that one additional hour of daylight corresponded with 66 more crossings on weekdays and 176 more crossings on weekends. It probably bears mention that “daylight” is really a stand-in for a variety of seasonal effects, some practical like not riding in the dark and perhaps some psychological like not wanting to bicycle in February. There is also a heavy caveat on the daylight vs temperature inference, namely that dark and cold happen at the same time. The darkest days are often the coldest days, and vice versa. As such you shouldn’t really believe the exact numbers in the table above, but at a high level they are informative of the broad seasonal patterns of bicycle trips.
I also added an indicator for whether the day fell within bike month or not. For weekdays, the fanfare of bike month was estimated to add 182 crossings per day on the Burke-Gilman near Sand Point. Given the non-bike-month relationship between bicycle trips and temperature, daylight, and precipitation, we would have expected something like 1300 average daily commute crossings, but during bike month we observed more like 1500 average daily crossings. For weekends, this value was estimated at 96 and was not statistically significant. It seems as though bike month brought out additional commuters around northeast Seattle but perhaps not more recreational riders.
Finally, the graphic below shows “residual” crossings — the difference between the observed count on that day and the number expected given the daylight, temperature, and precipitation.
Most strikingly, the weekend residual variation is quite a bit higher than the weekday residual variation. Some of this likely reflects the higher total crossings on the weekends, but probably not all of it. Even on the daily graphic in the first section it appeared as though weekend crossings were less predictable. I guess the interpretation is that weekday bicycle trips tend to be commute trips, and commuters can be fairly regular in their habits. The weekend counts may be more a function of which people and which groups somewhat randomly decided to hit the Burke-Gilman for their ride that day.
We also see a bit of the dreaded non-constant error variance. On the weekday side notice how the scatter pinches down in the wintertime. This is because the total crossings are so much smaller in the winter than in the summer. Smaller counts, smaller residuals. So, uh, we’re running into the problem that this is count data. Life is easier if you can plausibly claim the data came from a Gaussian (bell-curve) looking thing, which is probably okay if the counts are really high or fairly consistent. In this case, the large seasonal swing somewhat invalidates the linear model assumption of constant residual variance. The main consequence is that the p-values in the table above deserve some skepticism. If this were a more rigorous, scientific study there would be ways to address this problem, but for an exploratory analysis with the prime directive of amusement it’s probably unimportant.
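A back-of-the-envelope sketch shows why smaller counts come with smaller scatter. It assumes, purely for illustration, that daily counts behave like a Poisson variable, whose standard deviation is the square root of its mean. Real bicycle counts are almost certainly more variable than that, and the means used here are the all-days monthly averages from the first table rather than weekday-only figures.

```python
import math

# Average daily crossings in 2014 from the monthly table (all days, not just weekdays).
december_mean = 433
july_mean = 1902

# Under a Poisson assumption, the standard deviation is the square root of the mean.
december_sd = math.sqrt(december_mean)
july_sd = math.sqrt(july_mean)

# How much wider the summer scatter would be than the winter scatter.
ratio = july_sd / december_sd
print(f"December sd ~ {december_sd:.1f}, July sd ~ {july_sd:.1f}, ratio ~ {ratio:.2f}")
```

Even under this idealized assumption the July scatter would be roughly twice the December scatter, which is the opposite of the constant variance a plain linear model takes for granted.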
Recap
Whew, that was a lot! To recap: highly seasonal and beloved by weekend warriors; the location of several group rides per year, often involving alcohol; and a successful conduit for bike month bonus commuters. Thanks for playing!