Techmeme Leaderboard 2007 – More!

I’m an avid reader of Techmeme. I love the idea, the UI, the freshness, the coverage, and most of all the quality of the articles.

When the Techmeme Leaderboard debuted earlier this month, lots of buzz circulated around the blogosphere. Being a huge fan of partying on data, I loved the concept and wanted to take the analysis even further (Yuvi style, but with a search twist).

So yesterday I wrote some code to crawl and analyze Techmeme articles over the whole year (the Leaderboard shows the Top 50 sources for this month). I took a snapshot of Techmeme at 1:00 PM every day from the beginning of January through the end of September 2007.
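For the curious, here is a rough sketch of how that daily snapshot schedule could be generated. This is not my actual crawler, just an illustration: the function name `snapshot_times` is made up, and it assumes "beginning of January" means January 1 and "end of September" means September 30, with no time zone handling.

```python
from datetime import date, datetime, time, timedelta


def snapshot_times(start, end, at=time(13, 0)):
    """Return one datetime per day from start to end (inclusive) at the given time."""
    days = (end - start).days + 1
    return [datetime.combine(start + timedelta(days=i), at) for i in range(days)]


# Assumed date range: January 1 through September 30, 2007, at 1:00 PM.
snapshots = snapshot_times(date(2007, 1, 1), date(2007, 9, 30))
print(len(snapshots), "snapshots from", snapshots[0], "to", snapshots[-1])
```

Each timestamp would then be handed to whatever fetches the archived page for that moment.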

I computed basic statistics, like the number of stories by author and source, as well as more involved measurements like the top word mentions of the year, both in total and by category (I used simple NLP to clean up the text and remove stopwords).
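To give a feel for the kind of counting involved, here is a minimal sketch. It is not the code I ran: the `stories` records, their field names, and the tiny `STOPWORDS` set are all made-up placeholders, and stemming is left out to keep it short.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not the list actually used).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "is", "with"}

# Hypothetical records; a real crawl would produce one per Techmeme story.
stories = [
    {"author": "Author A", "source": "Blog X", "category": "mobile",
     "headline": "The launch of a new phone"},
    {"author": "Author A", "source": "Blog Y", "category": "search",
     "headline": "Search engine adds new features"},
    {"author": "Author B", "source": "Blog X", "category": "mobile",
     "headline": "Phone maker sues search engine"},
]


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS]


# Basic statistics: number of stories per author and per source.
stories_by_author = Counter(s["author"] for s in stories)
stories_by_source = Counter(s["source"] for s in stories)

# Word mentions in total and broken down by category.
word_counts = Counter()
words_by_category = defaultdict(Counter)
for s in stories:
    tokens = tokenize(s["headline"])
    word_counts.update(tokens)
    words_by_category[s["category"]].update(tokens)

print(stories_by_author.most_common())
print(word_counts.most_common(5))
```

`Counter.most_common()` gives the ranked lists directly, which is all the "Ranked" tables really need.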

So, without further ado, here are the results:

Number of Stories by Author in 2007, Ranked
Number of Stories by Source in 2007, Ranked
Most Mentioned Words in 2007, Ranked
* Words are stemmed.
Most Mentioned Words, by Category, Trends in 2007, Ranked

Hope you guys find these results super interesting and useful.


Filed under Blog Stuff, Data Mining, Information Retrieval, NLP, Statistics, Techmeme, Trends

One response to “Techmeme Leaderboard 2007 – More!”

  1. Awesome analysis :) What NLP framework didja use? And language?

    (I too have an analysis of Techmeme coming up (It was supposed to be up a month ago (but is not since, well, I’ve become lazy (and also because I had a vacation in between)))).
