D.C. Actual Pedestrian-Related Crashes and Pedestrian Crash Prediction Dashboard
Visualizing actual pedestrian-related crash counts and predicted pedestrian-related crash counts, grouped by road segment, for Washington, D.C.
© CARTO, © OpenStreetMap contributors
What does the residual tell us?
Locations with a high number of actual crashes are easy to flag as locations of interest, since a high crash count implies that a road segment or intersection is dangerous. The residual adds a second view: it is the difference between the actual crash count and the predicted crash count (actual minus predicted). It can be positive or negative, and it identifies locations where the actual crash count differs substantially from the predicted crash count.
Example: A road segment with 50 crashes per year is readily labeled as dangerous. However, a road segment that is predicted to have only 10 crashes per year but actually has 20 per year should also be considered a place of interest, because the location is experiencing more crashes than expected. This most likely indicates that something in or around this location, such as reduced visibility, narrow lanes, poorly timed signal cycles, or other external factors, is leading to the increase in crashes. The gap between actual and expected crashes is an additional statistic that can be used to identify locations that are experiencing more crashes than expected.
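As a rough illustration of the example above, the sketch below computes the residual (actual minus predicted) for three road segments and ranks them. The segment names are hypothetical, and only the "20 actual vs. 10 predicted" and "50 actual" figures come from the example; the other predicted and actual values are assumptions made up for illustration.

```python
def residual(actual, predicted):
    """Residual = actual crash count minus predicted crash count."""
    return actual - predicted


# Hypothetical segments. Segment B mirrors the example in the text;
# Segment A's prediction and all of Segment C are illustrative assumptions.
segments = [
    {"name": "Segment A", "actual": 50, "predicted": 48},
    {"name": "Segment B", "actual": 20, "predicted": 10},
    {"name": "Segment C", "actual": 5, "predicted": 9},
]

for seg in segments:
    seg["residual"] = residual(seg["actual"], seg["predicted"])

# Largest positive residual first: more crashes than expected.
ranked = sorted(segments, key=lambda seg: seg["residual"], reverse=True)

for seg in ranked:
    print(f'{seg["name"]}: actual={seg["actual"]}, '
          f'predicted={seg["predicted"]}, residual={seg["residual"]:+d}')
```

Ranked by raw count, Segment A comes first; ranked by residual, Segment B does, because it has twice as many crashes as expected. A negative residual, as for Segment C, marks a location with fewer crashes than predicted.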
How do we predict crashes?
Using a regression model, we can predict the number of crashes that will occur at a given location. The model uses features of each road segment, such as speed limit, number of lanes, lane width, and AADT (annual average daily traffic). These features were paired with the actual number of crashes at each location, and the resulting data were divided into training and test sets that were fed into the model. The model was then trained and used to produce the predicted counts presented in this dashboard.
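The workflow described above can be sketched roughly as follows. This is not the dashboard's actual model: the text does not say which kind of regression was used, so ordinary least squares stands in here, and the data are synthetic. The helper names (`fit_ols`, `predict`), the 75/25 split, and the coefficients used to generate the synthetic crash counts are all assumptions for illustration.

```python
import random


def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Predicted crash count for one feature row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic road segments (assumed data, not the D.C. dataset).
rng = random.Random(0)
segments = []
for _ in range(40):
    speed = rng.choice([25, 30, 35, 40, 45, 50])  # speed limit, mph
    lanes = rng.randint(1, 4)                     # number of lanes
    width = rng.uniform(9.0, 12.0)                # lane width, feet
    aadt_k = rng.uniform(1.0, 40.0)               # AADT, thousands of vehicles
    crashes = (2.0 + 0.05 * speed + 0.8 * lanes - 0.2 * width
               + 0.15 * aadt_k + rng.gauss(0, 0.5))
    segments.append(([speed, lanes, width, aadt_k], crashes))

# Divide into train and test sets (75/25 split is an assumption).
rng.shuffle(segments)
train, test = segments[:30], segments[30:]

coefs = fit_ols([f for f, _ in train], [c for _, c in train])
test_mae = sum(abs(c - predict(coefs, f)) for f, c in test) / len(test)
print("coefficients:", [round(c, 3) for c in coefs])
print("mean absolute error on test set:", round(test_mae, 3))
```

Because crash counts are non-negative integers, a count model such as Poisson regression is a common alternative to least squares; the sketch keeps to a linear fit only to stay short and dependency-free.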