Monday, October 29, 2007

Linux Distro Popularity According to Google

Over the years I've used a variety of Linux distros: Mandrake, Red Hat, FreeBSD, Fedora, Gentoo, and Ubuntu. Distrowatch keeps track of everything we need to know about the distros, and recently there has been an enormous push in desktop Linux thanks to Dell putting Ubuntu on desktops and Compiz-Fusion bringing snazzy eye candy to even low-end machines. Distrowatch gives some pretty decent stats on the main distros, but for a while I've wanted to know how Google sees their popularity, measured mainly by how many pages mention each distro.



Using some Python, a spreadsheet, and a little scraping, I was able to get my answer. To see how Google would rank different distros, I'm using the number of results Google returns when searching for each distro's name. I'm going to write a HOWTO on the technical aspects of what I did sometime this week, but here are the basic steps:

  1. In a Google Spreadsheet I made a sheet that holds the names of the top distros on Distrowatch.
  2. Another sheet holds the full list from Distrowatch (366 on record at the time of this writing).
  3. I set up a dapp to take these names and return the number of results Google would give if you searched for them.
  4. A Python script pulls the distros out of the spreadsheet, queries the dapp, and puts the results back into another sheet.
I have 2 sets of results. One is a query using the vanilla list out of the spreadsheet. The second appends the word Linux to the distro's name if the name does not already include it; I was curious how this would affect the results. Below are the results for the most popular distros on Distrowatch. Look, Ubuntu! The spreadsheet that has all of the findings (and all 366 distros) is shared here
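Until the HOWTO is written, here is a minimal sketch of how the two query sets and the lookup loop could fit together. It is not the actual script: the function names are made up for illustration, the "already has Linux in the name" check is assumed to be a case-insensitive substring test, and `count_results` is a hypothetical stand-in for whatever call returns the result count (the dapp in the steps above).

```python
from typing import Callable, Dict, Iterable, List


def with_linux_suffix(name: str) -> str:
    """Append 'Linux' unless the name already contains it (case-insensitive).

    Assumption: a substring check is close enough to "does not already
    have it"; the original script may have used a different rule.
    """
    if "linux" in name.lower():
        return name
    return f"{name} Linux"


def build_queries(names: Iterable[str]) -> Dict[str, List[str]]:
    """Return both query sets: the vanilla names and the 'Linux'-suffixed ones."""
    cleaned = [n.strip() for n in names if n.strip()]
    return {
        "vanilla": cleaned,
        "with_linux": [with_linux_suffix(n) for n in cleaned],
    }


def collect_counts(
    queries: Iterable[str], count_results: Callable[[str], int]
) -> Dict[str, int]:
    """Look up a result count for every query.

    count_results is hypothetical: it is passed in so this sketch does not
    depend on any particular scraping service or spreadsheet API.
    """
    return {query: count_results(query) for query in queries}


if __name__ == "__main__":
    sample = ["Ubuntu", "PCLinuxOS", "Gentoo Linux", "Fedora"]
    query_sets = build_queries(sample)
    # Placeholder lookup for demonstration only: it returns the length of
    # the query string, not a real result count.
    for label, queries in query_sets.items():
        print(label, collect_counts(queries, len))
```

Passing the lookup in as a function keeps the name handling testable on its own, and the real count source can be swapped in without touching the rest.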

Distro             Page Hits
Ubuntu             96,800,000
FreeBSD            36,700,000
Fedora             35,800,000
openSuse           29,500,000
Debian Linux       28,100,000
KNOPPIX            12,500,000
Mandriva Linux      5,720,000
Gentoo Linux        4,430,000
PCLinuxOS           3,170,000
Slackware Linux     3,010,000
MEPIS Linux         1,640,000


[Chart: Distro Chart]

Stay tuned for the code behind it! Subscribe to the feed to get more updates.

10 comments:

Paddy3118 said...

What if a distro peaked some time ago and then trailed off in popularity?
There is no time axis.

- Paddy.

Nate Wheeler said...

Right, it's just a snapshot for right now. I'm actually quite surprised FreeBSD is second in returned results. I would have thought Fedora, SUSE, or Debian would at least beat it.

Anonymous said...

FreeBSD isn't Linux.

Anonymous said...

Fine, Ubuntu is more popular. Are you happy now? Just remember, Windows is a lot more popular than any Linux distro. Does that mean it is a better OS? No, of course not. So, does the fact that Ubuntu is a more popular distro equate to it being the best Linux distro? If not, then why is this so important to you?

-S- said...

Here is a Google Trends graph for the top 5.

http://www.google.com/trends?q=Ubuntu%2CFreeBSD%2CFedora%2CopenSuse%2CDebian+Linux

Maarten Kooiker said...

Wow, that's quite an advantage Ubuntu has (due to the community, I guess)... But FreeBSD also has quite a few votes...
Nice research!

Anonymous said...

I have to say it: FreeBSD is not a Linux distribution.

Anonymous said...

FreeBSD isn't a linux distro.

Christian said...

I'm so glad FreeBSD has been downgraded to a "Linux distro," considering it uses a Berkeley kernel and BSD-licensed userspace... But then again it wouldn't be the first time the GNU community ripped off BSD code and claimed to have written it themselves.

Anonymous said...

First, I have to reiterate what others are saying, "FreeBSD is not Linux".

Second, for Christian: you may be a troll, but I have to state this so others -- maybe noobies -- don't take your foolish words for real. Nobody wants to steal anything. It seems Linux users simply tend to view *BSD as brothers or cousins, so they're not left out. If this were ever unacceptable to BSD folks, I suppose they could act and get the name references taken off the site.


Last, but not least, FreeBSD is not GNU -- and this is important. There's a whole lot of philosophy in these 6 letters: GNU and BSD. Two ways of seeing freedom. For this reason they should never be mixed.

A Linux user, who never writes GNU/Linux, but happens to recognize what GNU ideas may represent to our world...