
Some quick thoughts on Google News and media metrics

My schedule is tight this morning, so I'll make this quick. While reading Chris Bowers's South Carolina results thread on Open Left, I noticed the following tidbit:

 

Update 12: I'm looking at Google News headlines on the primary to try and see what sort of narrative comes out of South Carolina. There appear to be three types of headlines right now. First, the most common is the bland, "Obama wins South Carolina," that won't help him much. Second, there is the "Obama wins racially charged primary," that probably won't help him at all (and may hurt him). Third, there is the "Obama wins huge" headline, which he really needs and will help him. Since he needs a bounce, he also needs a lot of "Obama wins big" type headlines.

 

Now, here's the question: is there any way to automate this type of analysis? It seems to me that most major events will follow a similar pattern: headlines for the relevant story will fall into one of a small number of basic types. Is it possible to write a program that will automatically track those headlines and reveal how the newspapers "voted" on the story, based on the headlines? That would be one sweet media analysis tool.

I haven't done theoretical CS in a long time, but it seems to me the answer would probably have one foot in clustering and one foot in crowdsourcing. On the clustering side, assuming you could isolate all the stories about a single event, you could combine some kind of thesaurus-based similarity metric with clustering to identify the major headline groups, as Chris did manually. On the crowdsourcing side, you'd probably need humans to help you decide which headline groups are most advantageous for the candidate. You could overlay all that on a database of newspaper circulation and a neat little automated graphing program to get some really great charts.
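To make the clustering half a bit more concrete, here is a rough Python sketch of what I have in mind. Everything in it is an assumption on my part: the tiny SYNONYMS table stands in for a real thesaurus, the paper names and circulation figures in SAMPLE are made up, and the greedy Jaccard grouping is just one simple way to cluster, not a recommendation. It drops the words that most headlines share (they describe the event, not the angle), groups headlines by what's left, and then adds up circulation per group as the "vote."

```python
import re
from collections import Counter

# Hypothetical mini-thesaurus mapping near-synonyms to one canonical token.
SYNONYMS = {"big": "huge", "landslide": "huge", "racially": "race",
            "racial": "race", "win": "wins", "victory": "wins"}
STOPWORDS = {"in", "the", "a", "of", "to"}


def tokens(headline):
    """Lowercase, split into words, drop stopwords, canonicalize synonyms."""
    words = re.findall(r"[a-z]+", headline.lower())
    return {SYNONYMS.get(w, w) for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_headlines(items, threshold=0.5):
    """Greedily group (paper, headline, circulation) tuples by headline angle."""
    token_sets = [tokens(headline) for _, headline, _ in items]
    # Words shared by more than half the headlines describe the event itself
    # rather than the angle, so they are dropped before comparing.
    doc_freq = Counter(word for ts in token_sets for word in ts)
    common = {w for w, n in doc_freq.items() if n > len(items) / 2}
    clusters = []  # each entry: (representative token set, member list)
    for item, ts in zip(items, token_sets):
        angle = ts - common
        for rep, members in clusters:
            if jaccard(angle, rep) >= threshold:
                members.append(item)
                break
        else:
            # No existing group is close enough, so start a new one.
            clusters.append((angle, [item]))
    return clusters


def circulation_votes(clusters):
    """Total circulation per cluster, largest first."""
    totals = [(sum(c for _, _, c in members), sorted(rep))
              for rep, members in clusters]
    return sorted(totals, reverse=True)


# Made-up papers and circulation numbers, for illustration only.
SAMPLE = [
    ("Paper A", "Obama wins South Carolina", 500_000),
    ("Paper B", "Obama wins in South Carolina", 200_000),
    ("Paper C", "Obama wins racially charged primary", 300_000),
    ("Paper D", "Obama wins big in South Carolina", 400_000),
    ("Paper E", "Obama wins huge in South Carolina", 100_000),
]

if __name__ == "__main__":
    for total, angle in circulation_votes(cluster_headlines(SAMPLE)):
        print(total, angle or ["(bland)"])
```

On the sample data this recovers Chris's three groups: the bland headlines, the "racially charged" one, and the "wins big" ones. Deciding which group actually helps the candidate is still the part I'd hand to humans.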

Full disclosure: My company did a small technical/design project for Chris and OpenLeft last year.
