Many Eyes on Language

The “Language Speakers” bubble chart image below was created as part of IBM’s Many Eyes project:


It’s a really cool project which enables the creation of various types of visualizations given certain data sets. Language lovers will also be interested in the Phrase Net on the Many Eyes blog.


John Pasden

John is a Shanghai-based linguist and entrepreneur, founder of AllSet Learning.


  1. Wow! I didn’t realize how popular the language was. I think if North Korea discovered this, they would bump up their statistics so they could be #1.
    How is everything at C-Pod?

  2. bentinho Says: May 13, 2009 at 12:16 pm

    that’s an apollonian gasket! I think the number of speakers needs an update.

  3. Glad to see Wu represented so prominently. Jus’ sayin’.

  4. Data source: Wikipedia, according to the original site. But French has 175 million speakers on Wikipedia and only 78 according to the graph…

    Anyway, thanks a lot for sharing!

  5. Funny how tag clouds condensed to this. Looks groovy!

  6. How ’bout a project to see if we can get more than three people to agree on the number of speakers for any given language?

  7. I agree with what SWK wrote. This separation between first language and second language is artificial.
    What’s the first language of a Mexican American living in California? English … really?

  8. yay! – Mandarin comes out tops! 🙂

  9. I see “Cant…”, but are speakers of 閔南話 and other mutually unintelligible dialects spoken by large segments of people in the PRC being lumped in under the Mandarin heading? Or is everybody in the PRC falling into either the Mandarin or Cantonese categories?

  10. @Scott: I think that would be listed under “Min.” I did a text cloud version, and Min was listed, so I’m assuming 闽南话 would be included in Min.

  11. Scott, I’m pretty sure 闽南话 is the light-green circle labeled Min, on the bottom right of the graph. If you follow the “Many Eyes” link in the original post you can see more detail, including a list of languages and language families represented. The chart includes at least Mandarin, Cantonese, Wu, Min, and Hakka, which covers quite a bit of the population.

  12. The Hindi balloon seems to be a bit larger than the English balloon, but the Hindi number is lower.

    I’m also surprised that German seems to be spoken more than French.

  13. This graph is cool. It shows how certain “important” languages aren’t that big and makes you realize that other languages are quite neglected. It’s nice to get that kind of perspective.

    Scientifically counting the number of speakers that a language has is tough, since there is no good scientific definition of a language vs. a dialect. Some say that pairs like Farsi and Dari, Swedish and Norwegian, and Czech and Slovak should each be counted as one language; others say that counting languages like Arabic and Hindi as one language is misleading, since they have mutually unintelligible dialects. Someone also already brought up the valid question of whether or not second-language speakers could/should be added to the count. That’s what happens when we mix science with politics.
