9 November 2011 – Mapping the Twitterverse


I thought I'd direct your attention to a fascinating study whose results are, in my humble opinion, as beautiful as they are useful.  Check this out:

That's a map of the worldwide prevalence of languages used in Tweets.  Mike McCandless extracted the Compact Language Detector (the language identification software built into Google Chrome), and Eric Fischer applied it to Twitter, generating a geolinguistic map of the global 'Twitterverse'.
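
For the curious, here's roughly how a map like this could be put together.  This is only a minimal sketch in Python, not Eric Fischer's actual method: it assumes each Tweet has already been tagged with a language code by a detector, and it simply bins the Tweets into a latitude/longitude grid and picks the most common language in each cell.  The function name `dominant_language_grid`, the 0.5-degree cell size and the sample coordinates are all my own illustrative assumptions.

```python
from collections import Counter, defaultdict


def dominant_language_grid(tweets, cell_deg=0.5):
    """Bin (lat, lon, lang) records into grid cells and return the most
    common language code in each cell.

    `tweets` is an iterable of (latitude, longitude, language_code) tuples;
    the language code is assumed to come from a separate detector.
    """
    counts = defaultdict(Counter)
    for lat, lon, lang in tweets:
        # Floor division maps each coordinate to an integer cell index.
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[cell][lang] += 1
    # Keep only the single most frequent language per cell.
    return {cell: c.most_common(1)[0][0] for cell, c in counts.items()}


# Hypothetical sample: three Tweets near Montreal, one near Toronto.
sample = [
    (45.50, -73.57, "fr"),
    (45.51, -73.55, "fr"),
    (45.52, -73.60, "en"),
    (43.65, -79.38, "en"),
]
grid = dominant_language_grid(sample)
```

Colour each cell by its winning language and you have a crude cousin of the map above; the real thing plots individual Tweets rather than grid cells, which is why you can pick out single ships and aircraft.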

Apart from the sheer beauty of the result (which in my opinion is a function of the simplicity of the approach and the fact that it is 100% based on empirical data), the cartographic output gives folks like us oodles of food for thought.  Take a look, for example, at the North American map:

It's pretty easy to understand the concern about the future of the French language in the midst of an otherwise almost entirely Anglophone continent, non?  At the same time, look at the US, and try to see how many different languages (colours) you can pick out, particularly in urban areas.  Nowhere else on Earth are so many languages represented in such intermingled proximity.  Incidentally, this map also demonstrates just how little of Canada's population - or at least the Twittering part of it - lies more than 100 km or so from the Canada-US border.  At the extreme left edge of the map, you can see how El Paso, Texas, is virtually all English-speaking, but also how, just across the Rio Grande, Juarez, Mexico, is entirely Spanish-speaking.  And why is everybody in Bermuda Tweeting in what appears to be either German or Swedish?

Now take a look at Northeast Asia:

Japan and the ROK are both pretty wired, aren't they?  No surprises there.  China's not far behind, although the Twittering seems to be largely confined to urban areas along the coast.  It's also a non-surprise that North Korea is an electronic wasteland.  In fact, it's interesting how much this map resembles the map of electrification I sent around some time last spring.  It's also interesting to see all the Russian Tweeters in Vladivostok and the Kuriles.  You can just see the Bonin Islands at the bottom centre, and at the bottom left, Taipei at the northern tip of Taiwan; I wish we could see more of it.

Now take a look to the north of Japan; there are a bunch of isolated, unilingual bright spots.  You know that's ocean, so what can they be?  There are some islands there – is that what we're seeing?  Or is it something else?  Clusters of light spots where there isn't any land are probably groups of ships with multiple Tweeters aboard.  Are we seeing Korean and Japanese fishing fleets in the Sea of Japan?  But then there are also long, straight lines of dots (you can see many of these south of Japan).  What makes those lines?  My guess is we're seeing the electronic ghosts of individual Tweeters travelling aboard international airliners.  How cool is that?

Now for the gold – Europe:

That has got to be one of the neatest things I've ever seen.  Look, you can see the Dutch!  And the Danes!  And the three Baltic countries!  And the Catalans, for crying out loud!  You can see how thinly the former Soviet satellite states are "informationized" on an individual basis as compared to Western Europe (and Western Turkey!).  You can see how population patterns in Russia and Ukraine follow the coasts, the rivers, the main cities, and the highways connecting them.  You can see how Corsica compares to Sardinia!  You can just barely see northern (Turkish) Cyprus, and the first hints of the Greek-speaking southern half of the island.  You can see what a horrid linguistic muddle the Balkans are.  And you almost can't see North Africa at all, although Tunis and Algiers stand out; that sort of makes you wonder to what extent the “Arab Spring” really was driven by modern communications technology, as opposed to telephones, newspapers, and word of mouth.

You can see the mountains between Portugal and Spain that Wellington and the lads had to struggle over in the Peninsular War, because they're still sparsely populated - but you can see how densely populated the Alps are!  You can see how England is almost totally blanketed by Tweeters!  And how the North Sea, the Med and the Bay of Biscay are full of Tweeter-bearing ships, but the Black Sea and the eastern Baltic, not so much.  And even cooler, you can see how it's almost impossible to make Switzerland out at all, because when you look at it in terms of language, it might as well be split between France, Italy and Germany.  Same deal for Belgium; Wallonia blends into France, and Flanders blends into the Netherlands.  The notional, national borders are entirely invisible when you look at countries on a linguistic basis.

There are some weaknesses in this sort of approach.  Returning to the world map for a moment, one wonders about the subcontinent; why is densely-populated India so short of Tweeters?  At this point you realize that you’re only seeing the shadows of English tweets; the software apparently doesn’t detect Hindi.  Pakistan, Afghanistan and Iran are likewise nearly blank, because Persian, Pashto, Dari and so forth are similarly not being picked up.  Just because you don’t see it, doesn’t mean it isn’t there.  “Absence of evidence is not evidence of absence.”

What else could we do with this sort of capability?  Look back at the world map up top for a second.  You can see the Canary Islands, but Africa is almost nonexistent.  What can we learn from that?  And could you use this capability for military purposes?  If you can make out a single Tweeter aboard an aircraft over water, I wonder if you could follow the movement of groups of soldiers or sailors by tracking their Twitter signature?  Would a 10,000-man US Army division full of iPhones, iPads, BlackBerrys and the like show up on this sort of graphic?  Would you be able to see a US CVN with 6000 madly-Twittering crewmen and women aboard?  How about a Marine Amphibious Group afloat?  What would someone expecting an attack make of a big blotch of English Tweeters (with, say, 10-15% Spanish Tweeters mixed in with them) in the middle of the South China Sea?

