Jun 2015

An Interview with Outlier Ash

I’m very happy to report that the Outlier Dictionary of Chinese Characters I wrote about before has met its $75k funding goal. That means that this dictionary will soon be available through Pleco, so if you were holding out, doubtful it would actually happen, doubt no longer. Congratulations to the Outlier Linguistic Solutions team!

Ash Henson

This is an interview with Ash Henson, Outlier Linguistic Solutions’ main academic guy. Like some other people I’ve spoken with, I was a bit apprehensive about the project at first, feeling it was all way too academic and probably not a good resource for beginners. The more I talked with Ash, though, the more I was convinced this was not the case. I do believe this is going to be a great resource for learners at all levels, and I look forward to using it myself, both for my own purposes, and for my beginner-level clients.

Anyway, here are some additional questions I had about the dictionary, answered by Ash.

1. You have an article on the problem with the concept of “radicals.” Would it be fair to say that radicals are just an outdated concept which we don’t need anymore because we can look almost everything up by computer now? Is your dictionary going to include the concept of radicals at all?

Well, I’d say that radicals are only reliable as a tool to look up characters in traditional dictionaries. If you only use electronic or software dictionaries, then it’s safe to say that you can ignore them. We will actually point out the radical for each character though, so that you can find the character in a paper dictionary if you need to. The main issue with “radicals” is that there are really several distinct concepts that are called “radicals”. For instance, you often hear people say “Characters are made of radicals.” While that is a reasonable conclusion to draw from the name “radical”, it misrepresents how characters actually work. There are around 500 semantic components that appear in characters and a lot of them cannot be broken down into “radicals”.

2. You’ve mentioned before that the Outlier Character Dictionary will include the most up-to-date research, including even corrections of mistakes in the legendary 说文解字 (Shuowen Jiezi). Could you give a simple example or two of that?

This type of data can be found in the Expert Edition. I’ll share two examples from the demo. For 監 (jiān) “to inspect”, the 說文 says that it is composed of the semantic component 臥 (wò) “to rest” which is used to express the idea “to look down from above” and the sound component 䘓 (kàn) “thick animal blood” abridged to 血. The problem is, 監 is a character from the early Shang dynasty (roughly 1600 bce to 1046 bce), while 臥 and 䘓 don’t appear until the Warring States period (roughly 475 bce to 221 bce).


Image taken from the Outlier Dictionary of Chinese Characters

Obviously, either this interpretation is anachronistic or maybe 臥 and 䘓 did exist earlier and we just haven’t found any proof. However, if you look at the earliest extant forms of 監, it’s very obvious that it’s a picture of a person looking into a container that has liquid in it. This “picture” is used to represent the idea “to inspect, examine” as this was how the ancients inspected their own faces, i.e., they used water in a container as a mirror.

Another example is 黑 hēi “black”. The 說文 says that the top part is a window and the bottom part is flame (炎 yán) and gives the meaning of 黑 as “the color of something burnt”. Note that the 說文 is explaining the Small Seal script form. The earliest forms show a person with a tattooed face. This is one of the ancient Five Punishments, where the name of the crime a person committed was tattooed onto their face.

3. After all this time, how can researchers be certain about what are mistakes in the 说文解字 (Shuowen Jiezi)?

Basically by way of tracing characters back to their earliest extant forms and seeing how characters are used in earlier scripts. Like in the 監 (jiān) example above, the 說文 says that it’s composed of 卧 and an abbreviated 䘓, but 卧 and 䘓 show up around a thousand years after 監. It’s like explaining the 1066 war in terms of the soldiers’ cell phones. Keep in mind, the author of the 說文 was a very erudite scholar, with a very broad range of knowledge, but he was limited by the information he had access to and by pre-scientific thinking. The 說文 is best understood as an insight into how Han dynasty Confucian scholars looked at the Small Seal script. Even with its problems, it still plays a very important role in this type of research.

4. You’ve told me before that a proper understanding of characters can help a learner guess the correct pronunciation of a character. This is hard to imagine, since a lot of components have a wide range of possible functions and even multiple possible pronunciations. (Examples: 干、赶、汗、旱 or 今、含、零、领、邻) How can you solve this mess?

Sound components can be really frustrating, because they generally don’t give an exact sound. In the same way semantic components give a hint as to the range of meaning a character might have, sound components generally also just give a range of sounds. English speakers might not realize this, but English spelling is very similar. That’s why the exact same spelling “minute” can be pronounced MIN-it for “60 seconds” or mahy-NOOT for “extremely small”. Actually, this second one can also be pronounced mahy-NYOOT, mi-NOOT or mi-NYOOT. As you can see, the spelling “minute” does not give an exact pronunciation, but a range of possible pronunciations.

For native English speakers, this isn’t a huge problem, because for the most part, we go from words we already know how to say correctly, to learning how to write them. During college we learn a lot of new, specialized words for the field of work we are training for. Most of these are learned either from reading or from hearing professors or other students use them. When I was in college, I often heard people say words incorrectly because they had only seen them in writing. This is a reflection of the fact that English spelling only gives a range of possible pronunciations rather than an exact, IPA-like pronunciation.

Making sense of sound patterns in Chinese characters is very useful, because they can be used to remember how to write characters. For instance, before I learned how sound works, whenever I had to write a character containing 艮 or 良, I would always ask myself, “Oh, man. Do I put that dot here or not?” It was very frustrating. Once I learned how sound components work, I looked up the pronunciation for 艮 (gèn) and 良 (liáng). Then I noticed that for characters pronounced “gen”, “hen”, or “ken”, it was 艮. If it was pronounced “lang”, “liang”, “nang” or “niang”, then it was 良. So, by learning about sound relations, I went from a meaningless dot-or-no-dot question, to a meaningful “What is the pronunciation of the character I want to write?” question. Though sound isn’t represented exactly in Chinese writing, there are a lot of clues we can use, especially if we know to look for them.
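The 艮/良 rule of thumb above can be written down as a tiny lookup. This is only a sketch in Python: the syllable sets are taken from the answer above, while the function name `guess_component` and the overall structure are made up for illustration.

```python
# Illustrative sketch only. The syllable lists come from the interview
# answer above; the function name is hypothetical.
GEN_SYLLABLES = {"gen", "hen", "ken"}                  # written with 艮 (gèn)
LIANG_SYLLABLES = {"lang", "liang", "nang", "niang"}   # written with 良 (liáng)


def guess_component(pinyin_no_tone):
    """Return the likely sound component for a toneless pinyin syllable."""
    syllable = pinyin_no_tone.lower()
    if syllable in GEN_SYLLABLES:
        return "艮"
    if syllable in LIANG_SYLLABLES:
        return "良"
    return None  # the rule of thumb says nothing about other syllables


print(guess_component("hen"))    # e.g. when writing 很 (hěn)
print(guess_component("niang"))  # e.g. when writing 娘 (niáng)
```

The lookup only runs from sound to component, which matches how the rule is used here: you already know how the character is pronounced and want to know whether to write the dot.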

Now to the examples you brought up: 干、赶、汗、旱 or 今、含、零、领、邻

Let’s look at 干 (gān), 赶 (gǎn), 汗 (hàn), and 旱 (hàn) first. Notice that they all have the ending “-an” and that they all share the component 干. This is a strong clue that there is a sound relation. Also note that there is no discernible pattern with the tones. That’s because tones generally are not taken into account. Native speakers would generally use “-an” as the sound clue. However, it’s very useful to remember that “g-”, “k-” and “h-” are very closely related sounds.

As for 零 (líng), 领 (lǐng), and 邻 (lín), notice that 令 is pronounced “lìng.” Once again, tones don’t count (not to say they aren’t important! They just aren’t represented by the sound component). Lastly, notice that the sound for 邻 ends in “-n” and not in “-ng.” In this particular case, that’s due to the simplification of 鄰 to 邻, and 粦 is pronounced “lín.”

Finally, looking at 今 (jīn) and 含 (hán), we notice that 今 and 令 above are graphically very similar, but like the 艮 (gèn) and 良 (liáng) example, we can use sound to keep 今 (jīn) and 令 (lìng) separate. Using sound patterns to understand the relation between 今 (jīn) and 含 (hán) is a little more complex. You have to understand both that “g-”, “k-” and “h-” are closely related as previously mentioned and that many “j-”, “q-”, and “x-” come from an earlier “g-”, “k-” and “h-”. In other words, two groups of closely related sounds are also somewhat related.
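The relations between initials described above can be sketched as a small Python check. This is a simplified illustration, not a model of historical phonology: the two groups of initials come from the answer above, while the helper names, the pinyin-splitting logic, and the labels returned are assumptions made for this sketch.

```python
# Illustrative sketch only. The two groups of initials are taken from the
# answer above; function names and return labels are hypothetical.
VELARS = {"g", "k", "h"}     # described as very closely related
PALATALS = {"j", "q", "x"}   # many come from an earlier g-, k-, h-


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in ("zh", "ch", "sh"):
        if syllable.startswith(initial):
            return initial, syllable[2:]
    if syllable and syllable[0] in "bpmfdtnlgkhjqxrzcsyw":
        return syllable[0], syllable[1:]
    return "", syllable  # no initial consonant


def initial_relation(a, b):
    """Describe how the initials of two toneless syllables relate."""
    ia, _ = split_syllable(a)
    ib, _ = split_syllable(b)
    if ia == ib:
        return "same initial"
    pair = {ia, ib}
    if pair <= VELARS or pair <= PALATALS:
        return "closely related"
    if pair <= VELARS | PALATALS:
        return "somewhat related"
    return "no relation noted"


print(initial_relation("gan", "han"))   # 干 and 汗
print(initial_relation("jin", "han"))   # 今 and 含
print(initial_relation("ling", "lin"))  # 令 and 邻
```

Tones are deliberately left out of the input, since the answer above notes that sound components do not represent them.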

Why do sound series have this kind of variation? The answer to this question is fascinating, but complex. Most characters in use today find their origins thousands of years ago during the Zhou dynasty. Back then, the language was very different and very possibly had prefixes and suffixes, and it was these prefixes and suffixes which caused this variation. Another reason is regular sound change over the last several thousand years.

5. Your dictionary is designed to provide a wealth of modern character research into characters through a modern interface. How would this be used by a beginner who sees characters as an annoying hurdle?

The key to optimal learning is obtaining the ability to use the system of Chinese characters as a tool for being able to recall character forms after long periods of time and as a tool for making intelligent guesses about characters you haven’t learned yet. Native speakers have these abilities, but they are far from perfect and they are the results of years of input. Non-native speakers learning Chinese can also get them after learning a few thousand characters.

However, as you can imagine, their instincts about characters are probably not as good as a native speaker’s. The main advantage of using our methods is that you can gain these abilities after a few hundred characters, because all of the sound and meaning connections are being pointed out explicitly for each character. And, as I showed above, if you learn our sound patterns, your feel for sound representation will be better than a native speaker’s. We also explain meaning connections in a more precise way, so your feeling for meaning representation will also be more accurate.

To those who think of characters as a nuisance, if you learn them our way, you’ll learn in a way that is both more meaningful (and therefore you’ll likely find it more interesting) and more effective, so you’ll spend less time re-learning characters. We can’t remove the pain entirely, but we can minimize it!

As of today, the Outlier Dictionary of Chinese Characters Kickstarter is still going.


May 2015

4 Reasons I Want the Outlier Dictionary of Chinese Characters

There’s a new Kickstarter project related to learning Chinese definitely worthy of more attention: the Outlier Dictionary of Chinese Characters. I’ve had the pleasure of multiple Skype calls with John and Ash of Outlier Linguistic Solutions, and this project is no joke. They’re out to build something I’ve wished has existed for quite a while, and they’ve got the skills and dedication to make it happen.

The Kickstarter page is packed with explanation, so I won’t rehash the same information you can check out on your own. But I will tell you what’s interesting about this project to me.

  1. It integrates with Pleco. Pleco is already my favorite dictionary, largely because it contains so many different dictionaries. It would be annoying if the Outlier Dictionary were a separate app, and building an app from scratch is a huge drain on resources. So I think this was a smart way to launch the dictionary.
  2. The Outlier founders are learners turned experts (check out this profile). Sure, no one knows Chinese better than the Chinese, but the perspective of a foreigner that has the passion to devote years and years of his life to it is hugely valuable. They have put a lot of thought into the difference between how native speakers learn Chinese and how foreigners learn Chinese, they’ve deconstructed the process, and they’ve come up with a better way for foreigners to learn characters. We learners need this!
  3. The dictionary is academically rigorous. Unlike most dictionaries, it doesn’t hold the legendary 说文解字 (Shuowen Jiezi) as the ultimate infallible reference. In fact, research into mistakes made by the Shuowen is part of the dictionary. This is amazing!
  4. The approach taken to Chinese character structure is new and necessary. I’ve complained about certain products claiming that radicals are a revolutionary way to learn characters. They’re not. In fact, the term “radical” itself is outmoded and confusing, because it’s tied to outdated dead-tree character dictionaries. So the Outlier Dictionary rightly ditches the term “radical” in favor of “functional component,” and it doesn’t stop there. Check out this breakdown:

Outlier Functional Components

OK, but is it too geeky?

One of the concerns I expressed to the Outlier team was that they were building a dictionary for academics that didn’t really serve the practical needs of the average learner. They fervently assured me this was not the case; they are building a dictionary that enables a strong understanding of the system of functional components behind characters, while also enabling curious learners to go as deep as they want in their character studies. This is exactly how it should be done, so I can’t wait to get my hands on this dictionary. I also plan to keep working with the Outlier team and deepen my involvement in their project. I know that clients of AllSet Learning could really use what Outlier is developing.

I’m embedding a demo video at the bottom, but there is a ton of information on the Kickstarter page, so check it out!

Outlier Linguistic Solutions — Demo Walkthrough from Outlier Linguistic Solutions on Vimeo.


Mar 2012

Dict.cn does Shanghainese

Shanghainese dialogs on Dict.cn

I was recently informed (thanks, Mark!) that Dict.cn, one of the popular, free online Chinese-English dictionaries, now offers Shanghainese content. I was pleasantly surprised to see a big list of mini-dialogs in Shanghainese! The bad news is that the dialog text is in characters ( for , etc.), and there’s no IPA or other phonetic transcription. They only have one speaker doing the audio, but there’s audio for every sentence (tip: mouse over the little speaker rather than clicking on it), so that’s not bad.

I asked my wife what she thought about the speaker’s accent. She said it was 新派上海话 (the form of the dialect spoken by modern young Shanghainese), and she felt that the female speaker was too (cutesy-sounding). But, hey… it’s Shanghainese.

I also recently did a little research on Shanghainese lessons in Shanghai. Interestingly, some of the schools that I know used to offer Shanghainese classes no longer do. Is the demand dropping? Have any readers out there taken Shanghainese lessons at a local university?


Jan 2012

Personal Experience with the Other Particle “ma”

I remember quite distinctly the way I learned the sentence-final particle 嘛. I had only been studying Chinese for a little over a year, and thus was quite familiar with the yes/no question particle 吗, but not this new 嘛, which seemed a bit more complex. I might have studied it before and just ignored it, but once I was on the streets of Hangzhou and hearing it all the time, I knew it was time to start figuring out what this was all about.

So I broke out my trusty old Oxford dictionary (we still learned Chinese from actual books in those days), and looked up 嘛. Here’s what I found:

Oxford Concise English-Chinese Chinese-English Dictionary (2nd Ed.)

> 嘛: ma (助) 1 [used at the end of a sentence to show what precedes it is obvious]: 这样做是不对~! Of course it was acting improperly! 孩子总是孩子~! Children are children! 2 [used within a sentence to mark a pause]: 你~,就不用亲自去了。 As for you, I don’t think you have to go in person.

I know some people hate learning from dictionaries, and grammatical concepts especially can be difficult to learn that way, but for me this explanation was a revelation: used at the end of a sentence to show what precedes it is obvious.

I think a lot of us have personal experiences in which we acquire a new word, and the memories of those specific vocabulary acquisition experiences stay with us long after we internalize the words themselves (one of my own personal examples is my attempt to buy a bug zapper light). This is quite natural, and it’s also one of my key misgivings about SRS. The way we naturally acquire language stays with us and reinforces the entire process, tightly binding words, meaning, and real-world experience. SRS (or simple word lists in general) can’t really offer this deep of a connection.

But back to my dictionary example… How is this any different from an SRS learning method, divorced from a real-world connection? Logically, I feel like looking up a word in a dictionary isn’t much different from being presented a word electronically. Sure, there’s the tactile interaction with the book, and the effort involved in getting out the book in the first place, and the act of physically flipping to the appropriate page, then locating the appropriate headword with my finger. How much “momentum” do these behaviors actually amount to, in a learning context?

Although I can’t think of many compelling instances besides my example, I definitely feel that there are words which I learned (and not just “learned,” but developed a strong connection to) largely due to a dictionary. This leads me to two important questions:

How many of you out there have clear memories of really learning a word or expression through a dictionary? What was it that made it so memorable?
How many of you out there have clear memories of really learning a word or expression through SRS? What was it that made it so memorable?

For me, I think the dictionary’s explanation struck me so poignantly because I had actually already expended a significant amount of mental energy on the use of 嘛, but I had not yet been able to express the ideas concisely, and the entry did just that, right when I needed it.

Please share some of your own personal learning experiences in the comments. I’m very interested to hear what you have to say.

Related Grammar Links:

Yes/No Questions with 吗 (Chinese Grammar Wiki link)
Expressing the Self-Evident with 嘛 (Chinese Grammar Wiki link)


Sep 2011

Ode to a Paper Dictionary

The Oxford Concise English-Chinese Chinese-English Dictionary is a solid dictionary. It’s a great compromise between “comprehensive” and “portable,” and it’s the one I had with me in my early days in Hangzhou, when I had to look up every other word that I heard. I started with the “handy pocket-sized” version, but I quickly realized that even though it was half the size, it was still a little brick of paper I had to lug around, and the characters were just way too small at that size. So I used the medium-size brick of paper comfortably for years.

I still have that dictionary, although it’s showing its age. Over the years, I have had to use packing tape to reinforce its edges and spine, but at least I managed to do that before it started totally falling apart. Here’s what it looks like now:



Slightly worn, you might say.

When I used this dictionary regularly, I used to highlight words, phrases, and sentences that popped out at me as being useful or somehow study-worthy. What’s great is that I can still browse the dictionary now and see what I once highlighted. Sure, nothing is dated; there is no metadata. But it’s enlightening and amusing to flip through this paper record of my progress.

A little sample:

避免 to avoid
关心 be concerned about
关于 about; on; with regard to; concerning
上当 be taken in
下流 obscene; dirty
大海捞针 look for a needle in a haystack [I’m pretty sure I never ever had a chance to use this, even if I managed to memorize it briefly]

You get the idea.

But the point of this post is not to recommend a great dictionary. I used that dictionary every day for a very important period in my life, and it facilitated all sorts of conversations on a regular basis (yes, I was one of those annoying students that would occasionally put a conversation on hold if there was a word I felt I just really needed to know right away). And yet, I don’t recommend that dictionary much at all. Nowadays I regularly recommend Pleco (and sometimes Wenlin) to AllSet Learning clients, but not paper dictionaries.

Why? Well, there are a number of reasons…

– Most people don’t want to carry around a heavy book, but they take their cell phones everywhere
– Most people find looking up words in a paper dictionary quite a hassle
– Electronic dictionaries are so fast, and with one more touch you’ve saved the word as well for later reference

I remember when I first came to China, lots of people were using mini hand-held electronic dictionaries. They were great, except that (1) they rarely provided pinyin for English-Chinese lookups, and (2) they had short dictionary entries with very few sample sentences. Well, those days are over. The day has finally arrived, and entries are now bursting with information, while internet connectivity offers potentially limitless sample sentences.

So why am I still a little sad? Well, there’s definitely something to be said for idly flipping through those pages made of paper. I’m not sure why dictionary serendipity of the eyes-to-paper variety feels more special than dictionary serendipity of the search-and-related-data variety, but it does. And looking at that old battered paper dictionary, its mere existence does feel meaningful. I beat the crap out of that thing with my learning, and then did just enough work to keep it on life support. And now I neglect it, relegating it to a bathroom book, while computer-based dictionaries serve my daily needs.

We had some good times, paper dictionary. It’s not you, it’s me. But relationships change.


Jun 2011

Pleco for Android + More Dictionaries!


Pleco has announced its long-awaited Android version (screenshots here)! This is interesting to me, because one of the major reasons I switched from an Android phone back to an iPhone was Pleco. I haven’t seen the Android version in action, but looking at the screenshots, it would seem that the iPhone is getting more Love.

From the Pleco Android beta announcement:

> This is an experimental release of our Android software; we’re making it available now for the sake of people who don’t want to wait any longer for the finished version, but there are quite a few bugs / ugly interfaces, the documentation is almost nonexistent (though you can get a pretty good idea of how it works from the iPhone version documentation), and there are also a few major features missing, so if you’re not very computer-savvy we’d recommend waiting for the finished version to be ready before downloading it, or at least waiting a few weeks to see what the feedback from other testers looks like in our discussion forums.

> In general, though, we’re very pleased with how our Android software turned out and with how much functionality we have been able to get into this first release. OCR (see below) is working beautifully on Android (both live and still, though currently only in “Lookup Words” mode), as are full-screen handwriting recognition, audio pronunciation, stroke order, and all of our add-on dictionaries. We’ve even gotten a significant portion of our document reader module working; there are no bookmarks or web browser yet, and it’ll choke if you try to load the complete text of 红楼梦, but for short-story-sized text files and snippets of text copied in from the clipboard it works quite well.

Meanwhile, the iPhone and iPad versions forge boldly ahead as well. I’m looking forward to the upcoming UI redesign. This part of the announcement was interesting:

> Central to this is a new feature we’re calling “merged multi-dictionary search”; basically, instead of typing in a word and having to flip between different dictionaries to see which matches they come up with, you’ll get all of the results from every dictionary in a single, sorted, duplicates-merged list, providing better information and doing it in a simpler way. That particular feature is actually likely to show up in an experimental form (off-by-default option) in a minor update we’ll be putting out in a few weeks; we want to make sure it’s working really well before we put it at the center of our product.

When I heard that Michael Love was looking for more dictionaries to license for Pleco, my initial reaction was, “why do you need more dictionaries? Add more dictionaries and it’s just too much hassle to navigate through them all.” And that’s a problem that this new “merged multi-dictionary search” would solve. I’m very interested to see what that ends up looking like, and how it affects the user experience.
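To make the idea concrete, here is a rough Python sketch of what a single, sorted, duplicates-merged result list could look like. It is not Pleco’s implementation, which the announcement doesn’t describe; the dictionary names, the toy entries, the prefix matching, and the sort order are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy data: dictionary names and entries are invented for illustration.
DICTIONARIES = {
    "dict_a": [("你好", "hello"), ("你们", "you (plural)")],
    "dict_b": [("你好", "hi; how do you do"), ("你", "you")],
}


def merged_search(query, dictionaries):
    """Search every dictionary and return one row per matching headword.

    Each row is (headword, [(dictionary_name, definition), ...]).
    """
    merged = defaultdict(list)
    for name, entries in dictionaries.items():
        for headword, definition in entries:
            if headword.startswith(query):  # assumed matching rule: prefix
                merged[headword].append((name, definition))
    # Assumed ordering: shorter headwords first, then by code point.
    return sorted(merged.items(), key=lambda item: (len(item[0]), item[0]))


for headword, hits in merged_search("你", DICTIONARIES):
    print(headword, hits)
```

The point of the sketch is the shape of the result: a headword found in several dictionaries shows up once, with all of its definitions grouped under it, instead of once per dictionary.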

So what are the new dictionaries being added to Pleco?

1. “the Oxford Concise English & Chinese Dictionary (now known as the Pocket Oxford Chinese Dictionary)”
2. “the Classical-Chinese-to-Modern-Chinese dictionary”
3. “the Traditional Chinese Medicine dictionary”
4. “the expanded edition of the Tuttle Chinese-English dictionary, and its companion English-Chinese title”
5. “a really nice multifunction Chengyu dictionary (detailed explanations, usage notes, antonyms/synonyms, etc)”
6. “a lovely little Chinese-Chinese student dictionary”
7. “another Chinese-Chinese student dictionary that would be our first title ever to be oriented around non-mainland users (i.e., the original print version is in traditional characters)”

Wow. And Pleco is still searching for a decent Cantonese dictionary and a character etymology dictionary to license.


Feb 2011

Wenlin 4.0 Review


I’ve been given a copy of Wenlin 4.0 for Mac by the Wenlin Institute for an honest review. It’s no secret that I’ve been a fan of Wenlin for a long time, so I’m really happy to see an update to this wonderful piece of software which most of us almost dared not hope would ever issue another update. But the day has finally come! The new version offers some very welcome updates, but one major disappointment as well.



Sep 2010

In Defense of Hanping (and Android)

Commenter Mark feels I was a bit unfair to Android phones as a Chinese study tool in my recent post, Back to the iPhone (it’s all about Chinese!).

Mark says:

> Have you tried Hanping Pro? It has far more features than the free version. Also, Hanping is super-fast on Android 2.2. [Note: that link doesn’t work in the PRC]

Mark goes on:

> I think the biggest problem here for John is that he’s comparing free Android apps with paid iPhone apps. Also, the iOS app market is about 1 year more mature than the Android market. Android is catching up fast and I would expect the quality and breadth of apps to catch up over the next year.

> Living in China, you don’t see paid apps in the Android Market. Those are generally much better quality than the free apps – especially in niche areas like Chinese learning.

> If you have an Android device and are living in China then all you need to do is put a US/UK/DE etc sim card in your phone (doesn’t have to be active and can connect to Market over wifi) and then you can see/buy whatever paid apps you want. Once you are done, swap back in your Chinese sim card (i.e. you only need to change the sim card when purchasing paid apps, not using them). This is of course a PITA, but it’s useful to know until Google comes up with a proper long-term solution.

Mark’s right. It’s not that I’m willing to buy iPhone apps and not Android apps, it’s that I can buy iPhone apps in China, but not Android apps (and I’ve tried). I’m not willing to somehow acquire an overseas SIM card just to buy apps. Sorry.

So it’s true… I might not have come to the same conclusion if I weren’t living in China.

> OCR? Google are rumoured to be bringing out an update to Google Goggles soon which will include multi-lingual OCR support (including Chinese). Use it from within any app (SMS, email, dictionary, flashcard etc) so no need for cumbersome copy/paste like you would need to on the iPhone.

> The vastly superior support on Android for inter-app communication is a big advantage over iOS’s “pasteboard” approach and this is very useful in language-learning where you are often juggling multiple apps. Currently, not too many apps take full advantage of this inter-app functionality but this will improve as the Android Market apps mature.

Like I said, I’m fickle. When Android phones become better than the iPhone, I’ll switch back. In the meantime, I’m just waiting for the competition (including over OCR) to heat up more. This is a very good thing.


Aug 2010

Back to the iPhone (it’s all about Chinese!)


I got a first generation (2G) iPhone in 2008. Then I switched to an Android in 2009. As of this past weekend, I’m back on an iPhone (3GS). Why? I’ll spare you most of the geekery… it’s largely related to Chinese.

The HTC Hero was a pretty solid early Android device. The new smartphones running Android 2.2 are way better now, though. I’m aware of this. It wasn’t just about upgrading hardware and getting the latest OS.

I don’t really care that the iPhone has more apps, snazzier apps, and more games. Unfortunately, with the app advantage the iPhone pulled off another important victory: better apps for learning Chinese. As a learning consultancy, AllSet Learning also recommends various tools for learning Chinese. Well, I’ve got to admit: the iPhone is now the best tool out there for learning Chinese. For myself and for my clients, it’s the phone I need to be using.

Here are the most important factors in my decision to switch back to the iPhone from Android:

iPhone Pros

– The iPhone has quite a few dictionaries available for the student of Chinese. The free ones are decent, but if you’re willing to shell out a little money, you can buy some very good dictionaries. Popular choices include Pleco, Cambridge English-Chinese (not free), iCED, Qingwen, and DianHua.

– Switching between input methods in the iPhone is instant and easy (especially if you only enable English and one Chinese input method). This is something I do so often that even a slight advantage starts to really matter.

– If you’re interested in handwriting recognition for Chinese (and this is a great learning tool in itself), Apple’s solid version of that is built into the OS.

– The ChinesePod app for the iPhone is better than the one for the Android. (This is a trend that’s not particular to ChinesePod.)

– Ummm, have you seen Pleco OCR?

Android Cons

– No good dictionaries. I don’t even know what everyone uses. Hanping? Honestly, until I heard about Hanping (which, although serviceable, is a very basic CC-CEDICT dictionary), I was just using the mobile version of nciku.

– Switching input methods is a bit slow and annoying. It’s tolerable… for a while. But if you do a lot of switching, it gets to you. (Or you might stay in pinyin mode all the time, which also slows you down, since it has no predictive text functionality.)

– It’s getting Pleco someday, but who knows when?

OK, but nothing is totally one-sided… There are a few other points I should mention.

iPhone Cons

– Google Maps is still messed up in Shanghai on the iPhone. What’s up with this? It always places you some 300-500 meters northwest of where you really are. Apple blames Google. (Google Maps works just fine on Android devices in Shanghai.) This is seriously annoying.

Android Pros

– Google Maps just works.

– Recharging with a regular USB cord is so, so nice. (When you forget your cord, you can even borrow a friend’s digital camera USB cable.)

An iPhone 4 that’s usable in Shanghai is still super expensive, which is a major reason why I got a 3GS. The iPhone 3GS and the high-end Android devices are comparably priced. I was tempted to check out one of the Android phones, but I can’t ignore those iPhone advantages. I’m fickle, though… we’ll see how things develop over the next year.


Aug 2010

The New Pleco OCR Is Amazing

There has been a bit of a buzz lately among the techy students of Chinese in Shanghai, and it’s all about the new functionality coming to the Pleco iPhone app. From the site:

> We’ve just announced an incredibly cool new feature for the next version of Pleco, 2.2; an OCR (Optical Character Recognition) that lets you point your iPhone’s camera at Chinese characters to look them up “live” (similar to an “augmented reality” system): demo video is here (or here if you can’t access YouTube).

Watch the video. Seriously. This is big.

Basically what the new app allows you to do is to add “popup definitions” to any Chinese you’re reading–even a book. It’s instantaneous. It uses the iPhone camera, but it’s not like taking a photo at all. (It’s more like using 3D goggles… Magical 3D goggles that provide pinyin readings and definitions for Chinese words.)

The technology behind this app is not terribly new… optical character recognition for Chinese characters has been getting steadily better over the years. But no smartphone app has done this well yet, and it’s a bit stunning to see Pleco performing so admirably right out of the gate.

Oh, and more good news from Pleco:

> Also, we’re finally working on an Android version of Pleco, and have just signed a license for our first Classical Chinese dictionary….

Awesome. Congratulations to Michael Love and the rest of the Pleco team.


Apr 2010

New Online Chinese Resources Links

I figured it was about time I set up a page with links to the Chinese learning resources I personally find most valuable and regularly use. So it’s up: Online Chinese Resources.

A few notes:

– I work for ChinesePod and think it’s great, so yeah, I’m going to recommend it. This should not be a big surprise. I’m aware of quite a few podcast alternatives, and I’ve listened to a few, but I have very limited actual experience with them.

– The list is not exhaustive; there are plenty of monstrous ones out there, and the problem is that they’re all way too long. This one is pretty short, and based on my own experience, which is what makes it useful.

– I am open to suggestions, but I won’t add anything until I’ve had a chance to check it out and spend enough time with it to decide it’s a must-have resource.

I’ll be updating the list pretty regularly, but I intend to keep it brief.


Dec 2009

Zhou Libo's New Book: Hui Cidian


Taking advantage of his current popularity, Shanghainese stand-up comedian Zhou Libo (周立波) has swiftly published a book on Shanghainese expressions called 诙词典 (something like “Comedic Dictionary”).

The book isn’t exactly a dictionary, but it groups a whole bunch of Shanghainese expressions by common themes or elements, then explains them entry by entry in Mandarin, followed by a usage example from Zhou Libo’s stand-up acts for each entry.

“Shanghainese” Characters

What’s interesting (and a bit annoying) is that Shanghainese sentences are written out in Chinese characters, and then followed by a Mandarin translation in parentheses. Here’s an example of such a sentence:

> “伊迪句闲话结棍,讲得来我闷脱了。(他这句话厉害,说得我一下子说不出话来了)”

> [Translation: “That remark of his was scathing. I had no comeback for that.”]

The book is peppered with sentences like this, and as a learner, I have some issues with them:

1. If you read the Shanghainese sentences according to their Mandarin readings, they sound ridiculous and make no sense (a lot of the time) in either Mandarin or Shanghainese.

2. Unless you’re Shanghainese, you will have no clue as to how to pronounce the Shanghainese words in the sentences properly (so what’s the point?).

3. I find myself really wondering how the editors chose the characters they used to represent the Shanghainese words.

To point #3 above, I know there are cases where the “correct character” can be “deduced” due to Shanghainese’s similarities to Mandarin. To use the example above, the Shanghainese “闷脱” can be rendered in Mandarin as “闷掉.” Then why 脱 instead of 掉? Well, 掉 has a different pronunciation in Shanghainese, and it’s not used in the same way as it is in Mandarin. The 脱 in “闷脱,” however, in Shanghainese is the same 脱 as in “脱衣服” in Mandarin (which is “脱衣裳” in Shanghainese). It seems like this game of “chasing the characters” from Mandarin to Shanghainese might be ultimately circular in some cases, but I can’t really judge.

The other point is that some of Shanghainese’s basic function words, pronouns, and other common words don’t correspond to Mandarin’s at all, and the characters used certainly seem like standard transliterations. An example from the sentence above would be the Shanghainese “迪” standing in for Mandarin’s “这,” or (not from above), the Shanghainese “格” for Mandarin’s “的.”

So how do you know which characters are “deductions” (these are kind of cool and can point to interesting historical changes in language), and which ones are mere transliterations? Well, research would help. I don’t have much time these days for such an endeavor, but I do know some Shanghainese professors of Chinese at East China Normal University who could point me to the right resources.

Shanghainese Romanization

Lack of a standard romanization system is a problem that has plagued students of Shanghainese forever. Some favor IPA, but most find it a bit too cryptic. The problem is there is still no clearly superior solution that has become standard.

Zhou Libo’s book doesn’t make any headway in the romanization department. Headwords are given a “Shanghainese pronunciation” using a sort of “modified pinyin” with no tones. This is definitely more helpful than nothing, but it’s another reason why this book doesn’t make much of a learner’s resource for Shanghainese. Where the romanization diverges from pinyin, you’re not sure how to pronounce it (“sö” anyone?), and where it matches pinyin, it’s often not really the same as pinyin.



Dec 2009

Pleco for iPhone is out!


After reviewing the beta version, interviewing Michael Love on the app, and commenting on beta testing progress, I’d be remiss not to note that the Pleco Chinese Dictionary iPhone app is out. And the really great news is that the basic app is free!

A quick intro from the Pleco product information page:

> Go to itunes.com/apps/PlecoChineseDictionary to instantly download the free basic version of Pleco for iPhone / iPod Touch; you can add on more advanced features / dictionaries from right inside of the app, but the basic version is an excellent little dictionary in its own right (and includes the same wonderful search engine as our more advanced software).

If you own an iPhone and you’re studying Chinese, get this app!


Oct 2009

Michael Love on the Pleco iPhone App

The following is an interview with Pleco founder Michael Love, regarding the Pleco iPhone app, which is now in beta testing.

John: The long wait for the iPhone app has caused much distress amongst all the Pleco fans out there. Any comments on the development process of your first Pleco iPhone app?

Michael: Well, much of the delay stems from the fact that we really only started working on the iPhone version in earnest in January ’09 – before that we were mainly working on finishing / debugging Pleco 2.0 on Windows Mobile and Palm OS. We laid out the feature map for that back in early 2006, when the iPhone was nothing but a glimmer in Steve Jobs’ eye, so by the time Apple released the first iPhone SDK in Spring ’08 we were already well past the point where we could seriously scale back 2.0 in order to get started on the iPhone version sooner.



But as far as how the actual development has gone, the biggest time drain has been working around the things that iPhone OS doesn’t do very well. We’ve gone through the same process on Palm/WM too – we start off implementing everything in the manufacturer-recommended way only to find that there are certain areas of the OS that are too buggy / slow / inflexible and need to be replaced by our own, custom-designed alternatives.

On iPhone the two big problems were file management and text rendering. There’s no built-in mechanism on iPhone for users to load their own data files onto their devices; all they can do is install and uninstall software. So we had to add both our own web browser (for downloading data files from the web) and our own web server (for uploading data files from a computer) in order to allow people to install their own documents / flashcard lists / etc. We also had to implement a very elaborate system for downloading and installing add-on dictionaries and other data files; for a number of reasons it wasn’t feasible to bundle all of those into the main software package, and again there was no way for users to install those directly from a desktop as they can on other mobile platforms.

And the iPhone’s text rendering system is actually quite slow and inflexible, which is rather disappointing coming from a company with as long and rich a history in the world of computer typography as Apple. The only official mechanism for drawing rich text (multiple fonts, bold, italic, etc) is to render it as a web page, which took way too long and used way too much memory to be practical for us; there also seem to be some bugs in the way Apple’s WebKit page rendering engine handles pages with a mix of Chinese and non-Chinese text. And even simple, non-rich-text input fields and the like are a big performance hog – it took the handwriting recognizer panel about 8x as long to insert a new character into Apple’s text input box as it did to actually recognize a character. So we basically ended up having to write our own versions of three different iPhone user interface controls in order to get the text rendering to work the way we wanted it to.

So a quick-and-dirty port of Pleco on iPhone could probably have been ready last spring, but getting everything working really smoothly took a lot longer.



Oct 2009

The Pleco iPhone App (beta)

I just recently had the pleasure of trying out the beta version of the new Pleco iPhone app. In case you’re not aware, Pleco is the software company behind what is regarded as the best electronic learner’s Chinese dictionary for any mobile device (and possibly the desktop as well). Given the dearth of really good Chinese dictionaries for the iPhone, Chinese learners have been eagerly awaiting the release of this iPhone app for quite some time. The wait has not been in vain; Pleco for iPhone is an outstanding app.

The Video Demo

Michael Love, Pleco founder, has made a two-part video of the new Pleco iPhone app:

For those of you in China, visit Pleco’s mirror site for the videos.

An All-New UI

I’ve never owned a device running Windows Mobile or Palm OS, so I’ve never been able to own Pleco before, but I’m familiar enough with previous versions to make basic comparisons.

The Pleco user interface received a much-needed makeover for the iPhone. While older versions of Pleco squeezed a plethora of buttons and options onto the screen (you have your stylus, after all), this iPhone Pleco had to find ways to increase buttons to tappable sizes and limit button clutter by hiding options on screens where you don’t need them all. Compare (Windows Mobile on the left, iPhone on the right):

[Screenshots: maindict.gif and aisearchdict.gif (Windows Mobile), each shown beside Pleco for iPhone (beta)]



Oct 2009

Slumming it with nciku

I recently looked up the word 贫民窟 (meaning “slum”) in nciku. The definition included this example of usage:

> She decided to slum it for a couple of months.

> 她决定去贫民窟待几个月。

The Chinese sentence, translated back into English, would be:

> She decided to stay in a slum for a couple of months.

I think the translator missed something in this particular case, and the content of the sentences (as well as the order) strongly suggests that the Chinese is a (not so great) translation of the English.

So how is nciku getting its sample sentences for Chinese words? The OED is the champion of the dictionary quotation for the English language, containing tons of examples of its words’ usage “in the wild.” Dictionary sample sentences are best when taken from other sources, but those sentences should at the very least be composed in the language the dictionary serves. It seems this is not what’s happening with nciku, but maybe Collins (one of nciku’s data sources) is to blame.


Aug 2009

Tone and Color in Chinese

In his book Chinese through Tone and Color, author Nathan Dummitt presents his system of color-coded tones. In his own words:

> I hope that my system gives a context, even for non-visual learners, for distinguishing between the four tones in Mandarin and providing a mnemonic system to help them remember which tone goes with a particular word.

From the moment I first heard of this idea, I was intrigued by it. Associating tones with colors does open up a lot of possibilities. Once the system is internalized, you can drop tone marks and tone numbers altogether, and you can tone-code the Chinese characters themselves using color. (The best non-color approximation to this would be writing the tone marks above the characters, which you will find in some textbooks and programs.) So I was very receptive to this idea.

Despite being very open to the concept, when I saw the actual colors chosen to represent each tone, they just felt wrong to me. The pairings Dummitt chose were:


Why would these colors feel wrong to me? How could the tone-color associations be anything but arbitrary?

The reason that the colors felt wrong to me was that I had already thought about the relationships between the tones and my own perceptions of those tones. I had even (briefly) considered color when I sketched my “Perceptual Tone Contours” idea:

Perceptual Tone Contours in Mandarin Chinese

Specifically, I felt that first and fourth tone feel similar, and that second and third tone feel similar. I believe that perceived similarity is strong enough that it affects both listening comprehension and production. This is why I purposely colored first and fourth tone red in my diagram, and second and third tone blue.

An Alternate Color Scheme

OK, so now we’re getting down to the point of my post. As a thought exercise I asked myself: If I had to assign colors to the four tones, which colors would I use?

In answering this question, one has to believe that there are underlying principles which, when followed, might produce better results. Otherwise, arbitrary assignment is fine. So what are the principles? I have two:

1. The colors need to have a high degree of contrast so that they will stand out on a white background and not be confused with each other.

2. The colors chosen need to reflect the appropriate perceptual similarities.

There are other considerations you might take into account if you want to be super-thorough, of course. From an Amazon reviewer of Dummitt’s book:

> If a person was going to design a color code tone system they would probably want to avoid using red and green in the same color scheme. Red – green color blindness causes an inability to discriminate differences in red and green. Hence the testing when you get your driver’s license. 5 to 8 percent of males have this color blindness.

> Using red and orange in the same scheme is also not very bright. Much language learning is done on buses, trains, planes and their attendant stations. Lighting is sub-optimal in all these situations and much worse in China. Low light intensity impairs the ability to discriminate red from orange.

These points have some merit, I suppose, but I’m not sure what colors they leave. I’m sticking to the two principles I listed above. I don’t see how you’re going to avoid either red or orange altogether if you need easily distinguishable, high-contrast colors.

Regarding the principle of high contrast, I can’t disagree with Dummitt’s choices. You can’t choose yellow, and the ones he chose are easy to distinguish quickly.

As for perceptual similarities, I would reflect these similarities by grouping the four tones into two warm and two cool colors. In my Chinese studies over the years, I have often associated fourth tone with aggression or anger, both concepts which I would associate with the color red. Red = fourth tone is the strongest association I have, but from there, all the others fall into place. You can’t use yellow (poor contrast), so orange is your other warm color, going to first tone. My diagram has fourth tone and second tone diametrically opposed (falling versus rising), and green is directly opposite red on the color wheel, so I would go with green for second tone. That makes third tone blue.

The results:
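
In text form, the assignments argued for above are: first tone orange, second tone green, third tone blue, fourth tone red. As a rough sketch of how such a scheme could be used to tone-code characters, the snippet below wraps each character in a colored HTML span. The hex values and the names `TONE_COLORS` and `colorize` are placeholders chosen for illustration, not part of Dummitt’s system or of any existing tool.

```python
# Tone-to-color mapping described above: warm colors for first and fourth
# tone, cool colors for second and third. The hex values are illustrative
# assumptions; any high-contrast orange/green/blue/red would do.
TONE_COLORS = {
    1: "#ff8c00",  # first tone: orange
    2: "#008000",  # second tone: green
    3: "#0000ff",  # third tone: blue
    4: "#ff0000",  # fourth tone: red
}


def colorize(syllables):
    """Wrap each (character, tone) pair in an HTML span with its tone color.

    Any tone outside 1-4 (for example, a neutral tone) is left uncolored.
    """
    parts = []
    for char, tone in syllables:
        color = TONE_COLORS.get(tone)
        if color is None:
            parts.append(char)
        else:
            parts.append(f'<span style="color:{color}">{char}</span>')
    return "".join(parts)


# 词库 (cíkù): second tone followed by fourth tone
html = colorize([("词", 2), ("库", 4)])
print(html)
```

Because the tone is carried entirely by color, the characters need no tone marks or tone numbers once the associations are internalized.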




Jun 2009

How to Pronounce nciku

The online Chinese dictionary everyone is using these days is nciku. Newbies and veterans alike all seem to dig it. The quality of the dictionary entries is a refreshing change from the deluge of unimpressive CEDICT clones. One common difficulty among nciku users of all levels, however, is that they can’t figure out how the hell to pronounce the name! Is it N-C-I-K-U, each letter pronounced like its name, or maybe N-C-I-koo, or something like In-see-koo? Just how do you really pronounce nciku, anyway??

By clicking on 简体 (or 繁體) in the footer to switch to the Chinese version of the site, you can see nciku’s Chinese name: n词酷. So this should answer the original question: the “n” is pronounced like the name of the letter N, and the “ciku” part is pinyin cíkù.


But why?? What’s up with the name? Well, I have to say, it’s a pretty horrible name if your target market is foreigners. No one knows how to pronounce it when they see it. The name does make sense from a Chinese perspective, though.

First, the n. That’s the mathematical n, as in an unspecified number that could be really high. It might seem strange to bring mathematical variables into everyday conversation, but in modern Chinese it happens on a regular basis. In Mandarin when you do something n (n times), you did it so many times you don’t even know how many. Like we say “a million” in English, or, perhaps more appropriate in its ambiguity, “a zillion.” Rather than n, you can also say n, which also means a zillion times, but sounds quite similar to the beginning of the name n词酷.

词酷 is a concocted homophone for 词库, a somewhat technical word meaning “lexicon” or “word bank.” You can talk about a lexicon in terms of all the words of an entire language, or in terms of an individual’s own vocabulary.

So why 酷 for 库? Well, 酷 is the popular transliteration for “cool,” and the character 库, appearing in such words as 数据库 (database), 语料库 (linguistic corpus), 车库 (garage), 仓库 (warehouse), quite frankly, isn’t very cool.

So there you have it: n词酷, a zillion word banks (but cool).


Oct 2008

The Death of Handheld Electronic Dictionaries?

Steven J wrote me with this question:

> I have been in china for two years and always used paperback dictionaries or the one on my computer. However, now that i will start studying it seems more handy to have one of these pocket size electronic dictionaries. However it seems that all of these machines have a pinyin function for INPUT only. When looking up a word in english, it only gives you characters. This is quite a pain in the ass for someone like me who can speak some Chinese, but is almost illiterate. Do you have any advice on where to find one of these gadgets that would suit my needs better or can you redirect me to a good place to find information on this topic?

I went through this exact same dilemma when I first arrived in China. I had my handy Oxford Concise English-Chinese Chinese-English Dictionary, which I took everywhere. I noticed the Chinese students all had these little handheld electronic dictionaries, and I wanted one to help me with Chinese. But they really don’t help you a whole lot when you have no way to look up the pinyin for the characters that appear.

I had a Canon Wordtank to help me get through my Japanese studies, and it was great. Designed for the student of Japanese, it provided a “jump” feature which made it easy enough to look up the readings of any word even if the readings weren’t directly displayed everywhere. It got me through my last two years of formal Japanese study, which involved a lot of reading and translation.

But for Chinese? I’ve seen some really cool dictionaries that essentially do what the Wordtank does, but for English, Mandarin, Cantonese, and Japanese. With audio. They’re not cheap, though.

I never found a reasonably priced handheld Chinese electronic dictionary that did what I wanted. I ended up jotting down words and looking them up at home on Wenlin or online.

The heyday of these little handheld dictionaries is coming to an end. I know several people that use their Nokia cell phones for all their English-Chinese dictionary needs. New dictionary apps for the iPhone abound, and the iPhone already has great handwriting recognition support for Chinese built in. Google’s Android is sure to have no shortage of dictionary apps; maybe even official Google Translate dictionary functions.

If you’ve made it this far without a handheld electronic dictionary, then you should just hold on a little longer. The days of single-function handheld electronic devices are numbered. I, for one, wish this new generation of handheld devices would move in for the kill a little faster.