First glimpse of TxDPS ticket data

Aren Cambre's picture

I now have my first glimpse of citation data from the Texas Department of Public Safety (TxDPS), probably the most prolific ticket-writing force in the state. The data covers about 10.5 years and about 11.5 million tickets.

Here's TxDPS ticketwriting activity in the DFW area:
[Image: TxDPS ticketwriting data for DFW metro area]

It took a while to get here.

For the past few years, TxDPS has included latitude/longitude data with its tickets, which is helpful. But for the first several years, TxDPS generally used highway reference markers to say where each ticket was written.

Reference markers for Interstate highways are (relatively) easy: they are mileposts that start counting from 0 at the westernmost or southernmost extent. In this case, "(relatively) easy" still meant a couple of weeks of work for me to figure out line traversal and other spiffy PostGIS features.
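To make "line traversal" concrete, here is a minimal Python sketch of finding the point a given distance along a highway centerline. It is not my PostGIS work, just the idea in plain code. It assumes the line is a list of (x, y) vertices in a planar coordinate system measured in miles, which is a simplification, and the name point_at_distance is made up for illustration.

```python
import math

def point_at_distance(line, distance):
    """Walk along a polyline of (x, y) vertices and return the point
    that lies `distance` units from the first vertex.

    Assumption: coordinates are planar and in the same unit as `distance`.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    remaining = distance
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        if seg_len > 0 and remaining <= seg_len:
            # The target falls on this segment: interpolate linearly.
            t = remaining / seg_len
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg_len
    raise ValueError("distance is beyond the end of the line")

# Hypothetical centerline: 3 miles east, then 4 miles north.
centerline = [(0, 0), (3, 0), (3, 4)]
milepost_5 = point_at_distance(centerline, 5)
```

PostGIS offers the same idea through ST_LineInterpolatePoint, which takes a fraction of the line's length rather than a distance.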

All other highways--US routes all the way down to FM/RM/RR routes--use an obscure, grid-based reference marker system. It's hard to explain, but in a nutshell, a grid is superimposed on the state, and I think the grid has lines 1 mile apart. A highway's reference marker doesn't increment until the highway crosses a grid line. If a highway runs diagonally, then it has longer intervals between successive reference markers.
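Here is a toy Python model of the grid idea as I understand it. It assumes a perfectly straight road, grid lines exactly 1 mile apart, and that only the north-south grid lines count for a mostly east-west road. All of those are my assumptions, not TxDOT's definition, and marker_increments is a made-up name.

```python
import math

def marker_increments(length_miles, degrees_off_east, grid_spacing_miles=1.0):
    """Count how many north-south grid lines a straight road crosses.

    Assumptions (mine, not TxDOT's): the road is straight, starts on a
    grid line, and only eastward progress advances the marker.
    """
    eastward_miles = length_miles * math.cos(math.radians(degrees_off_east))
    # The tiny epsilon guards against floating-point results like 49.9999999.
    return math.floor(abs(eastward_miles) / grid_spacing_miles + 1e-9)

due_east = marker_increments(100, 0)    # 100 miles of road, 100 increments
diagonal = marker_increments(100, 60)   # 100 miles of road, 50 increments
```

Under this model a road can never gain more marker increments than it has miles, and the more diagonal it runs, the fewer increments it gets per mile.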

But it's not that simple. I couldn't figure out a rational way to translate reference markers to latitude and longitude. Even manual calculations on some easy highways didn't work.

For example, TxDOT uses US 82 as an example highway in its explanation of the reference marker system. US 82 generally runs ENE across the state. Because it's not a strictly E/W road, it should have more miles than reference markers. But it didn't work out that way. The reference marker where US 82 enters the state from NM is 222, and where it leaves the state at AR is 798. 798 - 222 = 576 reference marker increments. But a huge problem: US 82 only runs 505 miles through the state! US 82 has more reference markers than miles, which ought to be impossible!
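The mismatch is easy to check with the numbers above. This is a tiny Python sketch, and markers_per_mile is just a name I made up for the ratio.

```python
def markers_per_mile(first_marker, last_marker, route_miles):
    """Reference marker increments per mile of road."""
    return (last_marker - first_marker) / route_miles

# US 82: marker 222 at the NM line, 798 at the AR line, 505 road miles.
ratio = markers_per_mile(222, 798, 505)
print(round(ratio, 2))  # 1.14
```

If the grid lines really were 1 mile apart, this ratio could never exceed 1, so a value of about 1.14 means my understanding of the system is off somewhere.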

I gave up and made an informal open records request to TxDOT. It turns out they already had a dataset that correlates reference markers to latitude and longitude. They warned me that the data may be a bit off. However, at least on Interstates, their data pretty closely matches the mile markers I calculated using line traversals. Their points are usually close to mine but occasionally wander off by less than 1/3 of a mile. This, by the way, is based on informal sampling, not population analysis.
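A spot check like this boils down to measuring the distance between two latitude/longitude points. Here is a minimal Python sketch using the haversine formula. The function names are made up, the coordinates in the example are placeholders rather than real markers, and the 1/3 mile tolerance simply mirrors the figure above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def is_close(mine, theirs, tolerance_miles=1 / 3):
    """True if two (lat, lon) points are within the tolerance."""
    return miles_between(*mine, *theirs) <= tolerance_miles

# Placeholder points, not real marker locations.
my_milepost = (32.000, -97.000)
txdot_marker = (32.001, -97.000)
matches = is_close(my_milepost, txdot_marker)
```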

Now that I have geocoded reference markers, I can use that to geocode TxDPS tickets that don't have lat/long, which is what you see above.
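In rough terms, the fallback works like a table lookup. This Python sketch is only an illustration and not my C# program: the field names (lat, lon, highway, marker) and the sample values are hypothetical, and it ignores any offset from the marker.

```python
def geocode_ticket(ticket, marker_locations):
    """Return (lat, lon) for a ticket, or None if it can't be located.

    Prefers the ticket's own coordinates; otherwise looks up the
    (highway, marker) pair in the geocoded reference marker table.
    """
    if ticket.get("lat") is not None and ticket.get("lon") is not None:
        return (ticket["lat"], ticket["lon"])
    key = (ticket.get("highway"), ticket.get("marker"))
    return marker_locations.get(key)

# Made-up lookup table and tickets, for illustration only.
marker_locations = {("IH 35E", 450): (32.90, -96.90)}
newer_ticket = {"lat": 32.70, "lon": -96.80}
older_ticket = {"highway": "IH 35E", "marker": 450}

located = [geocode_ticket(t, marker_locations)
           for t in (newer_ticket, older_ticket)]
```

Tickets that return None are the ones worth a second look, since they point to either a missing highway in the lookup table or a data entry problem.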

But there's a major problem with the above. Each blue dot represents a location of TxDPS activity, where at least one ticket was written, and each blue dot may represent MANY instances of TxDPS activity. Further, the closer you get to the DFW city centers, the fewer tickets each dot is likely to represent.

My next step is to convert this into some kind of heat map. The heat map will show intensity of activity. Rural Interstates should show far greater activity than urban cores, which should show minimal or no activity.
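The core of a heat map is counting tickets per area instead of plotting one dot per location. Here is a minimal Python sketch of that counting step. The 0.01 degree cell size and the sample points are arbitrary choices for illustration, and bin_tickets is a made-up name.

```python
import math
from collections import Counter

def bin_tickets(points, cell_deg=0.01):
    """Count tickets per grid cell.

    `points` is an iterable of (lat, lon) pairs, one per ticket.
    Cells are squares of `cell_deg` degrees, keyed by integer indices.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts

# Made-up ticket locations: two close together, one farther away.
tickets = [(32.781, -96.797), (32.782, -96.798), (32.955, -96.805)]
intensity = bin_tickets(tickets)
busiest_cell, busiest_count = intensity.most_common(1)[0]
```

Each cell's count can then drive the color intensity, so a stretch with thousands of tickets no longer looks the same as a spot with one.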

The gaps in the data are already quite interesting. You may notice that roughly at the Dallas County line, TxDPS activity goes to zero on I-35E. Is this because TxDPS doesn't patrol I-35E north of the line? Is there a dataset error? Do TxDPS officers mis-enter data? Why does TxDPS appear to be active on I-45 all the way to downtown?

Also, why is there no enforcement on NTTA roads? I already know this is a problem because NTTA contracts out traffic enforcement to TxDPS. I think it's because my geocoding excluded all but state and federal roads, and NTTA roads are locally owned. This is another example where I need to fine-tune my C# geocoding program, possibly having to run the entire 15+ hour process all over again.

These are the next questions I need to assess, and the heat map will help make this data human-understandable and ease the analysis and data correction.
