By Shannon Pieper
Published Wednesday, April 24, 2013
Big data. What is it, and why on earth would a firefighter need to know about it? Because, 40 years ago, big data was responsible for bringing about the “War Years” in the South Bronx—when fires raged daily throughout the community, and a decimated FDNY struggled unsuccessfully to keep up. And although we’ve learned some lessons from this time, we are still making many of the same mistakes when it comes to using data to inform decisions about fire service resources and staffing.
That is, at least, according to Joe Flood.
The Bronx Is Burning
In 1970, the area known as the South Bronx was relatively stable—a few vacant lots and a dense, lower-income population, but also a lot of well-kept buildings and desirable tracts of land. Two years later, it was a minefield of burned out buildings—“vacants”—an area riddled with arson, where in some places as much as 90 percent of the population had fled. In the middle of it all was Engine Co. 82, the busiest fire company in the world. At one point, it was estimated that the company was on a call every 45 minutes, night and day. The fire station once known as “The Big House”—the anchor of a bustling community—became known as “The Little House on the Prairie”—struggling to hold on in a largely deserted wasteland.
How did this transformation occur so rapidly? That’s been the focus of Joe Flood’s reporting and writing for many years. Flood, a journalist for New York magazine and the author of The Fires: How a Computer Formula, Big Ideas, and the Best of Intentions Burned Down New York City—and Determined the Future of Cities, shared some insights about those “war years,” and what we might learn from them, at his FDIC session today.
Big Data
In retrospect, what brought about the transformation of the South Bronx wasn't all that hard to understand: it was the direct result of budget cuts that dramatically affected deployment of fire service personnel and apparatus. What is strange and difficult to understand is that the fire chief at the time, John O'Hagan, supported these cuts. "The least fight came from the fire department," Flood notes.
Why would a chief who was widely respected, who had written ground-breaking works on high-rise fires and advocated for stricter building codes that, if followed, could have prevented the collapse of the World Trade Center, support policies that crippled the FDNY? It was, Flood believes, a result of O'Hagan's faith in a statistical model. O'Hagan believed he had the best budget-cutting tool around, one backed by numbers and therefore seemingly free of political motives.
Faced with steep budget cuts, O’Hagan had to cut something. To do so, he relied on a model that had been developed by the RAND Corporation. Flood notes that the company that had started out with one client—the U.S. military—was looking to branch out, and it began to work with NYC commissioners, studying housing and hospitals, in addition to police and fire.
RAND started out small, tackling the problem of false alarms. Working from the data, it showed where the highest numbers of false alarms occurred and identified specific times and circumstances, such as the hour just after school let out, in which they were more likely to occur. The solution: send fewer engines on those calls. The strategy was largely successful, and O'Hagan, already a self-educated student of statistics, was sold.
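The kind of analysis described above can be sketched in a few lines: tally false alarms by hour of day and see where they cluster. The alarm records here are invented for illustration and are not RAND's actual data.

```python
from collections import Counter

# Hypothetical alarm log: (hour_of_day, was_false_alarm) pairs.
# Purely illustrative numbers, not the FDNY's historical data.
alarms = [
    (15, True), (15, True), (15, True),   # cluster just after school lets out
    (16, True), (16, True),
    (9, False), (21, False), (3, False),
]

# Count false alarms by hour to find when they cluster.
false_by_hour = Counter(hour for hour, is_false in alarms if is_false)
peak_hour, peak_count = false_by_hour.most_common(1)[0]
print(peak_hour, peak_count)  # prints "15 3": 3 p.m. has the most false alarms
```

With a real dispatch log, the same tally by hour and by alarm-box location is enough to justify sending a reduced response to the worst offenders at the worst times, which is essentially what RAND proposed.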
On a Grander Scheme
Then RAND took a giant leap forward. It began creating a model for predicting fires, and a model for predicting response times. In theory, Flood notes, you could combine the two to give you the best possible resource and personnel deployment. This is the essence of big data—building models that synthesize enormous data sets, making sense of information that's too complex for our usual tools.
The problem: The RAND models were flawed from the start. Fifteen of the 20 companies studied were in Manhattan, although the busiest companies by far were in the Bronx, Brooklyn, etc. The companies were almost exclusively ladder companies, not engines. And because this was pre-CAD, pre-GIS, pre-GPS, the researchers relied on the company officers to start a stopwatch to indicate they’d begun their response, then stop it when they arrived. But some officers deliberately shaved time off their watches, while others added to it—depending on whether they wanted to be perceived as the fastest company or were desperately trying to indicate that they needed more resources or fewer calls.
Flood points out some other flawed assumptions in the response time model: RAND assumed that the trucks would always be responding from their stations, despite the fact that they often responded from another call in their response area. The model also gave almost no weight to population density or building types. These assumptions, of course, de-emphasize poor, densely populated neighborhoods—like that of the South Bronx.
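The bias in the from-the-station assumption is easy to see with a little arithmetic. The minutes and percentages below are hypothetical, chosen only to illustrate the mechanism, not taken from the RAND study.

```python
# Compare the model's assumption (every run starts at the firehouse)
# with a busy company that often starts from a previous call
# elsewhere in its district. All figures are hypothetical.
from_station_minutes = 3.0            # travel time the model assumes
from_field_minutes = 7.0              # travel time when already out on a run
share_of_runs_started_in_field = 0.6  # plausible for a "war years" company

actual_average = (
    (1 - share_of_runs_started_in_field) * from_station_minutes
    + share_of_runs_started_in_field * from_field_minutes
)
print(actual_average)  # prints 5.4 -- well above the 3.0 the model credits
```

The error grows with call volume: the busier the company, the more often it responds from the field, so the model looked most accurate exactly where it was most wrong, in quiet neighborhoods, and least accurate in the overworked ones it recommended cutting.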
The Result
RAND’s model pushed the fire chief and the mayor to make cuts that crippled the ability of the FDNY to respond in certain neighborhoods. In fact, one change alone was critical. For some time, stations had operated with “second sections”—Engine 82 was the main engine, but if it went out on a call, Engine 85 backed it up. Second sections had been under scrutiny because implementing them didn’t appear to reduce the call load. And under RAND’s new model, you would never need a second section, because, in theory, every call is answered from the station. So just like that, there was no need for Engine 85.
“In six or seven years, the FDNY moved or closed 50 companies,” Flood says, “the vast majority of them in the Bronx, Harlem and central Brooklyn. Because the model said you can close your busiest companies and not impact your ability to respond.”
In addition to cutting second sections, the RAND model led O’Hagan to slash:
- All non-response duties (inspections, hydrant tests, etc.)
- Fire marshals
- Logistical units
- Garage/maintenance
Reading that list, it’s no longer a surprise that in a few short years, the Bronx was brought to its knees.
What Can We Learn?
The War Years of the ’70s were unique, but many communities today are fighting their own version. Fire service budgets have been slashed, and often those decisions are justified with big data. Flood points out that both New York and Los Angeles have been exposed for “reducing response times”—by simply changing the way response times are measured, rather than actually improving response. These may be extreme cases—Flood says most misuses of data are well-intentioned mistakes—but they are examples of how we continue to put too much faith in statistical modeling.
To be clear, Flood is not against big data. He argues that we need models to interpret the data we have and produce predictions. However, he warns against allowing the models to think for us. “It can make problems seem like they can be solved quickly, that there’s no need to think hard, that the problems aren’t complex,” he says. “But models that contain algorithms often mask mistakes.”
If we can’t get away from data, but we also need to be cautious of misusing it, what can we do? Flood provides a few pointers:
- Measure it yourself. If you want to “fight city hall,” you’re going to have to have numbers to back you up. Learn to speak the language; bring in academics to measure things for you. Ensure the studies are being done right.
- Don’t look at it as good-vs.-evil. In today’s economy, cuts are a necessity. Big data can help us identify where we can make those cuts with the least amount of pain. “We have to be willing to look at what kinds of efficiencies we can realize,” Flood says.
- Don’t get overconfident in models. It can be tempting to look at data as black or white, light or dark. But reality is usually somewhere in between. Statisticians, however, rarely see the grey areas; as a result, their models can fail to make accurate predictions. Embrace big data, but don’t forget to think for yourself, as well.
As I listened to Flood, I was reminded of a book I’d read recently, Nate Silver’s The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t. A well-known political forecaster, Silver is a student of big data, and his book is a study of its power and its limitations, and a strong reminder that our models are only as good as the people behind them, not just the data.
Or as Silver puts it: “Data-driven predictions can succeed—and they can fail. It’s when we deny our role in the process that the odds of failure rise.”