In the last episode of Wikicounting I investigated how Wikipedia treats history by calculating the size (in characters) of the Wikipedia entry for each individual year. The results were very interesting. In the comments on that post, PEB mentioned that it would be fun to see how the size of the Wikipedia article for a specific year correlates with the number of links that Google turns up.
Here is what you find if you plot the number of links turned up by Google against the number of characters on the Wikipedia page:

I used Google search queries of the form "year XXXX" (with the quotes), which should ensure that most of the results returned are relevant to the given year. There is a really good correlation (the odd points at the low end of each axis are mainly years around the 1500s).
It's interesting to note that only two points have over ten million Google links: 2000 and 2005 (the graph stops at 2005). In the year 2000 every single blog, news site and poorly designed GeoCities webpage was ranting about the Y2K bug, so that point got boosted massively above the others around it. Just look at the numbers:
(year, number of google links)
1998 1,520,000
1999 1,550,000
2000 15,600,000 <-- !!!!
2001 2,760,000
2002 4,480,000
2003 5,890,000
2004 8,100,000
2005 15,400,000
2006 37,500,000
2007 23,200,000 (and we're still in January!)
It looks like the number of links has grown by a large factor every year since 2001, nearly doubling from 2004 to 2005 and more than doubling from 2005 to 2006. Some may say that Google is growing at a geometric rate!
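To put a number on that, here is a rough sketch that works out the year-over-year growth factor from the table above. The counts are copied straight from the table; the function names are just ones I made up for this post. I left out 2000 (the Y2K spike) and 2007 (only one month in), which is my own choice and will change the answer if you do it differently.

```python
# Google link counts copied from the table above (year -> number of links).
# 2000 (Y2K spike) and 2007 (partial year) are deliberately excluded;
# that is an assumption of this sketch, not anything Google told me.
links = {
    2001: 2_760_000,
    2002: 4_480_000,
    2003: 5_890_000,
    2004: 8_100_000,
    2005: 15_400_000,
    2006: 37_500_000,
}


def yearly_ratios(counts):
    """Return {year: counts[year] / counts[year - 1]} for consecutive years."""
    years = sorted(counts)
    return {y: counts[y] / counts[y - 1] for y in years[1:] if y - 1 in counts}


def average_ratio(counts):
    """Geometric-mean growth factor per year between the first and last year."""
    years = sorted(counts)
    span = years[-1] - years[0]
    return (counts[years[-1]] / counts[years[0]]) ** (1 / span)


ratios = yearly_ratios(links)
for year, ratio in ratios.items():
    print(f"{year}: x{ratio:.2f} compared to {year - 1}")
print(f"average: x{average_ratio(links):.2f} per year")
```

The individual factors come out between roughly 1.3 and 2.4, with an average of about 1.7 per year, so "geometric" is fair even if "doubling every year" is a bit generous.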
Next time: we learn which deity is preferred by the people of the internet!
Also, give me more ideas please!
Labels: google, wikicounting, wikipedia