Spatial analysis of fatal and injury crashes in Pennsylvania

https://doi.org/10.1016/j.aap.2005.12.006

Abstract

Using injury and fatal crash data for Pennsylvania for 1996–2000, full Bayes (FB) hierarchical models (with spatial and temporal effects and space–time interactions) are compared to traditional negative binomial (NB) estimates of annual county-level crash frequency. Covariates include socio-demographics, weather conditions, transportation infrastructure and amount of travel. FB hierarchical models are generally consistent with the NB estimates.

Counties with a higher percentage of the population under poverty level, higher percentage of their population in age groups 0–14, 15–24, and over 64 and those with increased road mileage and road density have significantly increased crash risk. Total precipitation is significant and positive in the NB models, but not significant with FB. Spatial correlation, time trend, and space–time interactions are significant in the FB injury crash models.

County-level FB models reveal the existence of spatial correlation in crash data and provide a mechanism to quantify, and reduce the effect of, this correlation. Addressing spatial correlation is likely to be even more important in road segment and intersection-level crash models, where spatial correlation is likely to be even more pronounced.

Introduction

Many factors affecting crashes operate at a spatial scale (e.g. land-use policy, demographic characteristics and highway infrastructure functional class). It is therefore reasonable to explore the use of spatial models of crash occurrence to better understand the implications of these policies.

In most roadway accident studies, crashes are grouped in spatial units that range from intersection or road section level to zip code or county level (e.g. Amoros et al., 2003, Miaou et al., 2003, Noland and Oh, 2004, Noland and Quddus, 2004, MacNab, 2004). One concern with these studies is the effect of spatial correlation (i.e. the spatial dependence among observations), which increases the variance of the estimates and, when it is ignored, leads to underestimated standard errors.
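As an illustration of what "spatial dependence among observations" means in practice, the sketch below computes global Moran's I, a common exploratory diagnostic for spatial autocorrelation. It is not the method used in this paper, which relies on hierarchical Bayesian models; the function name `morans_i`, the binary adjacency weights, and the four-area toy layout are all assumptions made for the example.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary (0/1) adjacency weights.

    values:    list of numbers, one per area (e.g. a crash rate per county).
    neighbors: dict mapping an area index to the set of indices of adjacent
               areas; adjacency is assumed symmetric.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Total weight: every ordered adjacent pair (i, j) contributes 1.
    w_sum = sum(len(adj) for adj in neighbors.values())
    # Cross-products of deviations over adjacent pairs.
    num = sum(dev[i] * dev[j] for i, adj in neighbors.items() for j in adj)
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical layout: four areas in a row, 0 - 1 - 2 - 3.
line_neighbors = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

smooth = morans_i([1, 2, 3, 4], line_neighbors)       # similar values adjoin
alternating = morans_i([1, 4, 1, 4], line_neighbors)  # dissimilar values adjoin
```

Under these assumptions, values that change gradually across neighbors give a positive statistic, while values that alternate between neighbors give a negative one. A value near zero suggests little spatial structure.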

Recent developments in spatial modeling techniques have enabled researchers to investigate important issues related to risk estimation, unmeasured confounding variables, and spatial dependence (Richardson, 1992). An important advantage of spatial models is that spatial effects may reflect unmeasured confounding variables. This is particularly useful for unmeasured confounders that vary in space, such as weather and population. More important yet, “the methods also facilitate spatial smoothing and data pooling when regions under investigation involve small-population areas” (MacNab, 2004). Here the term ‘small-population areas’ refers to areas that present very few events of a rare-event phenomenon such as roadway crashes.

Previous research has dealt with the spatial component of road crashes in different ways. Some studies have modeled crashes as point events (e.g. Levine et al., 1995a, Jones et al., 1996), while others have modeled road crashes at different area levels, ranging from road sections to local census tracts or counties (e.g. Shankar et al., 1995, Amoros et al., 2003, Miaou et al., 2003, Noland and Oh, 2004, MacNab, 2004).

Honolulu census tract data have been used (Levine et al., 1995b) in a continuous model for predicting crashes. Analysis at the “ward” (census tract) level has been conducted (Noland and Quddus, 2004) for fatalities, serious injuries, and slight injuries using four different categories of predictor variables: land-use indicator variables (employment and population density), road characteristics, demographic characteristics (age cohorts), and traffic flow proxies (proximate and total employment). County-level data for Illinois (Noland and Oh, 2004) were used to estimate the expected number of crashes using infrastructure characteristics and demographic indicators as independent variables in a negative binomial (NB) model. Limitations of these studies are the use of proxy variables for traffic flow estimation and the lack of spatial correlation analysis. An additional paper (Amoros et al., 2003) developed NB models at county level in France that included interactions between road type and county.

Poisson-based full Bayes (FB) hierarchical models of county-level fatal (K), incapacitating (A), and non-incapacitating (B) injuries were estimated using both frequency and rate for the state of Texas (Miaou et al., 2003). A conditional autoregressive (CAR) model was used to model spatial correlation, and Markov chain Monte Carlo (MCMC) was used to sample the posterior probability distribution. The main limitation of this paper is its use of surrogate variables: percent of time that the road is wet, sharp horizontal curves, and roadside hazards. These predictor variables were estimated by proportions of crashes. For example, the percent of time that the road is wet was estimated by dividing the number of crashes that occurred on wet pavement by the total number of crashes. These estimators are clearly biased in the direction of the effect. Given the poor definition of contributing factors in the model, it is likely that the spatial correlation is overestimated. In a recent paper, Miaou and Song (2005) used the same approach and data in the ranking of sites for engineering safety improvements.
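For readers unfamiliar with CAR priors, a Poisson hierarchical model with a spatially structured random effect is often written in the form below, following the intrinsic CAR formulation of Besag et al. (1991). This is a generic sketch and not necessarily the exact specification of Miaou et al. (2003) or of this paper. The symbols are assumptions for illustration: $y_i$ is the crash count in county $i$, $x_{ik}$ are covariates, $v_i$ is unstructured heterogeneity, $u_i$ is the spatially correlated effect, $\partial_i$ is the set of counties adjacent to county $i$, and $n_i$ is the number of such neighbors.

```latex
\begin{align}
y_i \mid \mu_i &\sim \mathrm{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \sum_{k} \beta_k x_{ik} + v_i + u_i, \\
v_i &\sim \mathrm{N}\!\left(0, \sigma_v^2\right), \\
u_i \mid u_{j \neq i} &\sim \mathrm{N}\!\left( \frac{1}{n_i} \sum_{j \in \partial_i} u_j,\; \frac{\sigma_u^2}{n_i} \right).
\end{align}
```

In this form each county's spatial effect is centered on the average of its neighbors' effects, with a conditional variance that shrinks as the number of neighbors grows. The posterior of such a model has no closed form, which is why sampling methods such as MCMC are used.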

The adoption of the FB hierarchical approach by Miaou is an important advance in model estimation and is a departure point for this paper. The purpose of this research is to develop spatial models of road crash frequency for the State of Pennsylvania at the county level while controlling for socioeconomic, transportation-related, and environmental factors. The results from FB hierarchical spatial models are compared with the more traditional approach using an NB distribution to model crash frequency. Particular attention is paid to the inclusion of weather as a predictor and the search for spatial correlation among neighboring counties.

Section snippets

The Poisson and negative binomial distributions

When data arise as counts, the Poisson distribution is typically used to model them. Traffic crashes are a clear example of count data; therefore, a Poisson distribution is a useful starting point (see for example Jovanis and Chang, 1986, Shankar et al., 1995). An important characteristic of the Poisson distribution is that its variance is equal to its mean. Several authors (e.g. Shankar et al., 1995, Noland and Quddus, 2004) have argued that vehicle crashes are better represented by an NB
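The property that the Poisson variance equals its mean suggests a simple check on count data: compare the sample variance with the sample mean. The sketch below does this and, assuming the common NB2 parameterization in which the variance is mean + alpha * mean^2, backs out a rough method-of-moments value for alpha. The function name `dispersion_summary` and the toy counts are hypothetical and are not data from this study; a fitted regression model would estimate alpha by maximum likelihood instead.

```python
def dispersion_summary(counts):
    """Compare the sample variance of count data with its sample mean.

    Returns the mean, the sample variance (n - 1 denominator), their ratio,
    and a moment-based NB2 dispersion estimate alpha, where the NB2 variance
    is assumed to be mean + alpha * mean**2. Alpha is floored at zero, the
    value at which the NB2 variance reduces to the Poisson variance.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    ratio = var / mean
    alpha = max(0.0, (var - mean) / (mean * mean))
    return {"mean": mean, "variance": var, "ratio": ratio, "alpha": alpha}


# Hypothetical annual crash counts for eight areas (illustration only).
toy_counts = [2, 5, 1, 12, 3, 20, 4, 9]
summary = dispersion_summary(toy_counts)
```

A ratio well above 1 indicates overdispersion relative to the Poisson assumption, which is the situation in which an NB model is usually preferred.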

Data description

Fatal and injury crash data are obtained from PennDOT (Bureau of Highway Safety and Traffic Engineering, 1997, 1998, 1999, 2000, 2001). Two different sets of crash models were estimated: those using fatal crashes only as the dependent variable and those using just injury crashes. The spatial

Results

A series of NB regression models is used in initial data analysis and to provide a comparison set for FB models to follow. In each model, all variables listed in Table 1 are the starting point; variables are removed if they have significance levels above 0.10. Table 2 presents the NB model of fatal crashes. For this model three transportation-related variables are significant: DVMT, infrastructure mileage, and percentage of travel on federal aid roads. The coefficient for DVMT is negative which

Conclusions

There is no evidence of spatial correlation in fatal crashes; however, spatial correlation was found to be significant in injury crashes. The variance of the spatially correlated term ($\sigma_u^2$) is significant in the FB injury crash model, which implies that some Poisson extra-variation in the data can be explained by spatial correlation.

Results concerning the effects of the covariates on fatal and injury crash risk are mostly consistent in the direction and magnitude for NB and FB models. In

Recommendations for future research

Crash models at the county level have several advantages over other types of crash models; one of the most important is the availability of transportation and socioeconomic data at county level. With the increase in quality and quantity of geographic information systems (GIS) data available on the Internet, public organizations may be able to incorporate additional land-use covariates in crash models, exploring interactions between land use and transportation and their effects on crash risk.

Now that

References (49)

  • V. Shankar et al. Effect of roadway geometrics and environmental factors on rural freeway accident frequencies. Accid. Anal. Prev. (1995)
  • Aguero-Valverde, J., 2005. Spatial models of county-level roadway crashes for Pennsylvania. MS thesis. The Pennsylvania...
  • L. Bernardinelli et al. Bayesian analysis of space–time variation in disease risk. Stat. Med. (1995)
  • J. Besag. Spatial interaction and the statistical analysis of lattice systems. J. R. Stat. Soc. Ser. B (1974)
  • J. Besag et al. Bayesian image restoration with two applications in spatial statistics. Ann. Inst. Stat. Math. (1991)
  • Bureau of Highway Safety and Traffic Engineering. 1996 Pennsylvania Crash Facts and Statistics (1997)
  • Bureau of Highway Safety and Traffic Engineering. 1997 Pennsylvania Crash Facts and Statistics (1998)
  • Bureau of Highway Safety and Traffic Engineering. 1998 Pennsylvania Crash Facts and Statistics (1999)
  • Bureau of Highway Safety and Traffic Engineering. 1999 Pennsylvania Crash Facts and Statistics (2000)
  • Bureau of Highway Safety and Traffic Engineering. 2000 Pennsylvania Crash Facts and Statistics (2001)
  • B.M. Chichester et al. Associations between road traffic accidents and socio-economic deprivation on Scotland's west coast. Scot. Med. J. (1998)
  • P. Congdon. Applied Bayesian Modelling (2003)
  • J.B. Edwards. Weather-related road accidents in England and Wales: a spatial analysis. J. Transport Geogr. (1996)
  • L. Evans. Traffic safety and the driver (1991)