Readers may recall that back in October I started work on a research project to use Twitter as an early warning system for disasters, accidents, terrorist events and the like. The notion is that in today’s hyper-connected world the first thing some of us do upon encountering an unexpected event is to Tweet about it and that in all but the most local of events, those Tweets will outrun any physical manifestation of the event.
The first key decision is to determine what set of keywords to use as trip wires. As the first post indicated, I started out by looking for instances of “WTF” and “what the fuck” under the assumption that those are the terms most likely to be used to express unpleasant, unknown surprise in our modern lexicon. Similarly, I tested other broad terms such as “whoa,” “holy shit” and “what the hell.”
My conclusion is that the global background levels of “WTF” and its ilk are way too high to pick out the truly anomalous incidents from the overwhelming flood of expressions like “WTF? No Coke Zero again?” or “WTF are you doing wearing that dress?”
That has led me to build a more complex but ultimately more useful search set around specific terms. Using the close-to-real-time filtering of TweetDeck, I’m now tracking:
- “building shaking”
- “flash of light”
- “ground shaking”
- “hear gunfire”
- “hear sirens”
- “huge explosion”
- “huge noise”
- “incredibly loud”
- “lights in the sky”
I’m still getting false positives, of course, but the number is greatly reduced and some meaningful data is beginning to emerge (today’s very small earthquake in eastern Ohio/western Pennsylvania, for example, popped up on this search stream about an hour ago). This is very encouraging, but if anybody has ideas for additional terms, please throw them over the transom.
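For anyone who wants to try the same phrase list outside TweetDeck, here is a minimal Python sketch of the trip-wire idea. It is not how TweetDeck does its filtering; it simply checks a piece of text against the phrases listed above. The function name `matched_tripwires` and the sample Tweets in the usage lines are made up for illustration, and the sketch assumes you already have Tweet text in hand from some source.

```python
import re

# Trip-wire phrases copied from the list above.
TRIPWIRES = [
    "building shaking",
    "flash of light",
    "ground shaking",
    "hear gunfire",
    "hear sirens",
    "huge explosion",
    "huge noise",
    "incredibly loud",
    "lights in the sky",
]

# One case-insensitive pattern for all phrases. \b keeps matches on word
# boundaries, and \s+ between words tolerates extra spaces or line breaks.
PATTERN = re.compile(
    r"\b("
    + "|".join(r"\s+".join(re.escape(word) for word in phrase.split())
               for phrase in TRIPWIRES)
    + r")\b",
    re.IGNORECASE,
)


def matched_tripwires(text):
    """Return the trip-wire phrases found in text, lowercased and with
    whitespace normalized, in order of appearance. (Hypothetical helper.)"""
    return [" ".join(m.group(1).lower().split()) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    # Made-up sample Tweets, for illustration only.
    samples = [
        "WTF? No Coke Zero again?",
        "Whoa, Huge  explosion downtown and I can hear sirens",
    ]
    for tweet in samples:
        print(tweet, "->", matched_tripwires(tweet))
```

Exact phrase matching like this is deliberately narrow, which is the point: it trades coverage for fewer false positives, and new terms can be tested by adding them to `TRIPWIRES`.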
Next step is to find a way to map Tweets in real time. If anybody knows of an application that does such a thing, please let me know.
Happy new year to all.