Last month my friend Zach Franklin and I spent a half-hour in a recording studio talking about reading Marvel graphic novels as a way to practice Chinese. Not sure how often I’ll do this kind of recording, but hopefully you Chinese learners will find it interesting!
My last interview with Zach was all text, for the 2010 interview post The Value of a Master’s in Chinese Economics. Now you get to hear his voice and learn a bit more about how he uses his Chinese for less serious endeavors.
I was pleased to be contacted recently by Katie, the author of a new blog related to learning Chinese called Panda Toes. She’s based in Beijing, and has already gotten through the hardest parts of learning Mandarin, so she’s interested in sharing tips to help build reading fluency.
Your first thought might be, “to get better at reading Chinese, don’t I just need to read more Chinese?” Well, yes. No one’s going to argue with that logic. But even putting aside the crucial question of what a learner should read on her own, there are some techniques that can make the whole process less painful and more productive.
I really like how in her first article, The Art of Reading Chinese (as a non-native speaker), Katie gives a lot of emphasis to recognizing names (both Chinese and foreign). This point absolutely deserves a lot of attention, and it’s something I remember being tripped up by repeatedly, back in the day. (My time in the news translation trenches did me a lot of good in that regard, but it was most definitely not fun work.)
To add to Katie’s point, I’d like to emphasize that it is most definitely worth spending a bit more time learning Chinese names and their structure. While you shouldn’t make a big flashcard deck and memorize ALL THE NAMES, you should be gradually gaining familiarity with common names and common name structures. But how does one do this?
Learn your Chinese friends’ names. Really learn them. Every character, every tone. Ask why they were named that, and ask if it’s an unusual name, or a “typical Chinese name.” If their name contains certain characters that almost exclusively appear in people’s names, identify those as such. This learning is reinforced by your personal relationship with the person; this same knowledge-gathering would not be as effective on a group of random Chinese people.
Learn some famous Chinese names. Again, be selective. These should be names that have some meaning for you. If you like talking about politics, then learn politicians’ names. Same goes for Chinese movie stars, singers, directors, etc. You can gradually expand this list over time. As you do, make note of patterns. For example, take the name 章子怡 (Zhang Ziyi). Did you know that there’s an “A子B” naming pattern? Almost all Chinese people will be familiar with it, and sooner or later you need to be too.
You need to know the super common Chinese surnames (again, learn them over time), but you also need to know that some super common words or characters can also be surnames. It really threw me for a loop the first time I ran into surnames like 文, 水, 米, 左, or even 门.
Ask your Chinese friends what they think of other people’s names. This is especially easy to do when you’re trying to come up with your own Chinese name, or naming a baby, but it’s also something you can do anytime. Getting Chinese friends’ takes on other Chinese names will really enrich your understanding of names, and you’ll probably be surprised by how widely opinions will vary. Just remember that no matter how much you respect a person, no single person’s opinion is “correct.”
Good luck in building your reading fluency. I’m glad to see Panda Toes is live, and I’ll be contributing to this discussion more in the future. Most of my work these days in reading is with lower levels, editing Mandarin Companion graded readers, but my more advanced clients at AllSet Learning are always looking for interesting new reading content, so I’m always looking at new material for that too.
According to Wikipedia, subvocalization refers to “the internal speech typically made when reading.” It’s that “voice in your head” (you) pronouncing every word mentally. Subvocalization is normal, and is not generally considered a problem, unless you’re trying to learn to speed read. In that case, subvocalization is generally regarded as something that slows a reader down.
I found this section of Wikipedia quite interesting:
Advocates of speed reading generally claim that subvocalization places extra burden on the cognitive resources, thus, slowing the reading down. Speed reading courses often prescribe lengthy practices to eliminate subvocalizing when reading… [but] for competent readers, subvocalizing to some extent even at scanning rates is normal.
Typically, subvocalizing is an inherent part of reading and understanding a word. Micro-muscle tests suggest that full and permanent elimination of subvocalizing is impossible. This may originate in the way people learn to read by associating the sight of words with their spoken sounds…. At the slower reading rates (100-300 words per minute), subvocalizing may improve comprehension.
The Case of Chinese
OK, but now what about for Chinese? Chinese characters are not as directly tied to a phonetic system (like an alphabet), right? Plus Chinese kids learn characters by writing them over and over rather than by reading them aloud, right?
Well, not really. Here’s what research has to say (I added bold to certain parts):
…Reading English and reading Chinese have more in common than has been appreciated when it comes to phonological processes. The text experiments suggest that readers in both systems rely on phonological processes during the comprehension of written text. The lexical experiments show differences just where it is expected: Evidence for early (“prelexical”) phonology in English but not in Chinese, but evidence for still-early (“lexical”) phonology in Chinese. The time course of activation appears to be slightly different in the two cases. Thus, the similarity between Chinese and English readers is shown not in their dependence on a visual route, but in their use of phonology as quickly as allowed by the writing system.
So it’s not that Chinese readers don’t subvocalize; it just kicks in later, because it takes time for readers to amass the knowledge of written Chinese needed. Interesting!
Obviously, you can dive a lot deeper into the research on subvocalization, reading comprehension, and cognitive differences between writing systems. (Please feel free to share links to relevant studies in the comments.) For my purposes, though, one important point is clear: there’s no need to exoticize reading Chinese any more than necessary. Yes, learning a bunch of characters is a hurdle, but you don’t really need to worry too much beyond that.
Subvocalizing in Chinese
First of all, we should remember that subvocalization is not “bad,” and it’s not something that native Chinese readers don’t do (some kind of “laowai problem”). But that doesn’t mean that there’s no danger of over-reliance on subvocalization when learning to read Chinese.
I personally have experienced what I consider a serious impediment to my reading fluency. I found that when I would read a Chinese text, I was reading it aloud very deliberately in my head (subvocalizing). The problem was that I had obsessed over correct tones for so long that I just couldn’t stop. This slowed me down even more than normal subvocalization would be expected to do. So even when I was just reading for purely informational purposes, my brain was insisting that I had to pronounce every tone of every word (in my head) exactly right. I knew this was slowing me down a lot, but I couldn’t stop! The “tone police” in my head were out of control.
I did eventually get over this bad habit, and the result was much more rapid reading speed, as well as the ability to truly scan a text for meaning quickly. How did I do it?
Two Cures for Subvocalization
My solution was “the firehose.” I forced myself to read a lot. I read long Chinese texts for which I knew the words, but wasn’t sure of the tones for all the words. In some cases, I may not have even been sure of all the exact readings of all the characters in those words. But I could still comprehend the general meaning of the texts, which was all I needed.
So the steps were:
Find a relatively long text which had information I needed (make the reading meaningful)
Force myself to read at a high speed, disallowing my brain from obsessing over uncertain readings
This worked, but I had to do it a lot, and to be honest, it was a little painful. Unlearning a habit is not easy, and if I’m not careful, I still find my brain dutifully reading aloud every single tone in my mind. But with just a little willpower, I can keep subvocalization in check when I need to, and greatly increase my reading speed.
The second solution is extensive reading. It’s a gentler version of the method described above. The idea is that if you know that you already know all the words (with correct tones) in a text, then forcing yourself to read it without focusing on the correct tones should be easier. No anxiety. You can let go and just read.
But here’s the key: you can’t just read a text first to identify all the words you don’t know, add the pinyin, and consider them “learned.” That’s not going to allow you to let go of subvocalization for unfamiliar texts. So you need to find reading material which is unfamiliar, and yet entirely composed of familiar words. This is what graded readers can help with.
Share Your Subvocalization Battle Tales
I’d be very interested to hear about any readers’ struggles with subvocalization when learning to read Chinese. Actually, any foreign language… it’s all relevant.
A while back I wrote about What 80% Comprehension Feels Like, and I quoted the English examples used in Marcos Benevides’ excellent presentation which simulate 80% comprehension in English by including made-up English-like vocabulary words.
I’ve been thinking about that presentation a lot, both about the impact of such a demonstration, as well as about how it could be accomplished in Chinese. I ended up creating my own examples in Chinese. I’ll go ahead and share that first, and follow up with some discussion of the considerations involved.
(Before you attempt to read the following, please note that if your Chinese is not at least at an intermediate level, the following exercise is not going to work. Like its English-language counterpart, these examples are most effective with native speakers.)
Here is 98% comprehension:
Here is 95% comprehension:
Here is 80% comprehension:
The tricky thing about reading Chinese is that it’s not just a matter of vocabulary and grammar; there’s an issue not present in English: the issue of Chinese characters. When a learner reads a difficult Chinese text, all three of these components tend to play a part in the difficulty: vocabulary, grammar, and characters.
But for the example to work for both learners and native speakers alike, there needs to be a way to guarantee that parts of the text are incomprehensible, as accomplished with made-up words in English. How can one do this in Chinese?
How I did it
First of all, to maximize the chances that the “intelligible” parts of the Chinese sample text are also readable by learners, I used as simple a text as I could: a Level 1 Mandarin Companion graded reader. For these examples, it was The Secret Garden.
Then, I had to be sure I chose the more difficult content words to swap out, and that I got all instances of them in each sample. Obviously, I had to count the words to make sure I got the desired percentage right. But equally important, to make my samples representative of real-life 98%, 95%, and 80% comprehension experiences, the words chosen should “cloud” reading comprehension to the appropriate degree, no more, no less.
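For anyone who wants to try the counting step themselves, here is a minimal sketch in Python. It assumes the sample has already been split into words (with punctuation dropped), and the 20-word sample below is made up for illustration; it is not taken from the actual graded reader. The point it shows is that every instance of a swapped-out word counts against the percentage, so a word that appears twice costs twice as much.

```python
def comprehension_rate(words, unknown):
    """Return the share of word tokens in `words` that are not in `unknown`."""
    if not words:
        raise ValueError("the sample must contain at least one word")
    known = sum(1 for w in words if w not in unknown)
    return known / len(words)


# Hypothetical, pre-segmented 20-word sample (punctuation already removed).
sample = [
    "她", "每天", "都", "去", "花园", "看", "花", "他", "也", "想",
    "去", "花园", "可是", "他", "没有", "钥匙", "她", "很", "不", "高兴",
]

# Swapping out a word that appears once hides 1 token in 20.
print(f"{comprehension_rate(sample, {'钥匙'}):.0%}")

# Swapping out a word that appears twice hides 2 tokens in 20.
print(f"{comprehension_rate(sample, {'花园'}):.0%}")
```

A count like this only gets the percentage right. Whether the chosen words "cloud" comprehension to the appropriate degree is still a judgment call.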
But here’s the tricky part: how to represent characters the reader doesn’t know. The obvious way would be to create my own characters that don’t really exist. I enjoy doing this, but it’s time consuming, and to make it look truly credible it would have to not stand out at all when mixed in with the other characters. Too much work.
So I turned to the Unihan database of Chinese characters. Over the years, more and more obscure characters have been added to this set of characters, and I found a list of the most recent additions. (Most recently added should mean most obscure, but I chose Extension D from this page because it was both recent and a small download.)
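If you would rather not download anything, the candidate characters can also be listed directly from their code points. This is only a sketch, and the range is my assumption from the Unicode standard rather than anything in the download itself: the CJK Unified Ideographs Extension D block runs from U+2B740 to U+2B81F, with characters assigned from U+2B740 through U+2B81D.

```python
# Assumed from the Unicode standard: Extension D occupies U+2B740..U+2B81F,
# and the assigned characters run from U+2B740 through U+2B81D.
EXT_D_FIRST = 0x2B740
EXT_D_LAST_ASSIGNED = 0x2B81D


def extension_d_characters():
    """Return the assigned Extension D characters as one-character strings."""
    return [chr(cp) for cp in range(EXT_D_FIRST, EXT_D_LAST_ASSIGNED + 1)]


chars = extension_d_characters()
print(len(chars), "candidate characters")

# Print the first few code points; whether the glyphs themselves display
# depends entirely on the fonts installed on your machine.
print(" ".join(f"U+{ord(c):X}" for c in chars[:5]))
```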
A quick check confirmed that these characters were indeed obscure, but many of them didn’t look like simplified Chinese characters, or were just too weird, so I had to choose carefully. After making my choices, I also had to check to make sure that educated Chinese adults didn’t recognize the characters (guessing doesn’t count).
After that, I selectively swapped out characters in the samples. (My 80% comprehension text sample is the shortest, because I was running out of “good” obscure characters, and I didn’t want to have to find more!)
One interesting side effect of using such obscure characters in my texts was that most software couldn’t render them. Whatever fonts they used just didn’t include those bizarre characters. Only Wenlin, with its custom font designed to render all kinds of obscure characters, could display them all. So I had to do screenshots of Wenlin’s interface.
How to use this
I used these passages as part of a presentation on extensive reading at LanguageCon in September. I got the effect I wanted: Chinese members of the audience giggled (embarrassedly?) at the characters they didn’t know, especially when they got to the 80% comprehension example.
Chinese learners smiled wryly: there wasn’t much amusing about a fake recreation of the challenge they face on a daily basis, trying to read Chinese.
More than anything, I hoped that the Chinese audience could empathize with the learners of Chinese. Most Chinese people never know what it feels like to have to learn so many foreign characters as a part of a foreign language learning experience. Through these examples, though, they can get an inkling.
Actually, maybe they were chuckling in relief… at least they’ve got that challenge behind them.
The AllSet Learning blog also has a similar Chinese language article on this topic: 80%没有你想的那么多.
If you’re learning a foreign language and you don’t know what extensive reading is, it’s time to learn. This presentation deck by Marcos Benevides is a great place to start: Extensive Reading – How easy is easy? (Excerpts below from: Extensive Reading: Benefits and Implementation. Benevides, Marcos. J. F. Oberlin University, Tokyo. Presented at IATEFL 2015 in Manchester.)
One of the major principles of extensive reading is that if a learner can comprehend material at 98% comprehension, she will acquire new words in context, in a painless, enjoyable way. But what is 98% comprehension? Humans are actually really bad at gauging this, partly because schools rarely teach this way. 98% comprehension means that only 1 in 50 words is unknown. But still, it’s hard to have a feeling for exactly what that’s like.
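The arithmetic behind "1 in 50" is simple, and it helps to see the other common figures converted the same way. A minimal sketch (the function name is just something I made up for illustration):

```python
from fractions import Fraction


def unknown_one_in(comprehension_percent):
    """Return N such that 1 word in every N is unknown at this comprehension level."""
    if not 0 <= comprehension_percent < 100:
        raise ValueError("comprehension must be at least 0 and below 100 percent")
    unknown_share = Fraction(100 - comprehension_percent, 100)
    return 1 / unknown_share


for pct in (98, 95, 80):
    print(f"{pct}% comprehension: 1 unknown word in every {unknown_one_in(pct)}")
```

So 95% means one unknown word in every 20, and 80% means one in every 5.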
This is where Marcos Benevides’s presentation is so genius. Here is 98%:
You live and work in Tokyo. Tokyo is a big city. More than 13 million people live around you. You are never borgle, but you are always lonely. Every morning, you get up and take the train to work. Every night, you take the train again to go home. The train is always crowded. When people ask about your work, you tell them, “I move papers around.” It’s a joke, but it’s also true. You don’t like your work. Tonight you are returning home. It’s late at night. No one is shnooling. Sometimes you don’t see a shnool all day. You are tired. You are so tired…
(And in case you’re not a native speaker of English or don’t quite get it, yes, there are nonsense words in there. Those represent the uncomprehended 2%.)
Here’s 95%, which represents a departure from extensive reading, because it requires more effort, and tends to be slower and less enjoyable:
In the morning, you start again. You shower, get dressed, and walk pocklent. You move slowly, half- awake. Then, suddenly, you stop. Something is different. The streets are fossit. Really fossit. There are no people. No cars. Nothing. “Where is dowargle?” you ask yourself. Suddenly, there is a loud quapen—a police car. It speeds by and almost hits you. It crashes into a store across the street! Then, another police car farfoofles. The police officer sees you. “Off the street!” he shouts. “Go home, lock your door!” “What? Why?” you shout back. But it’s too late. He is gone.
Finally, let’s skip to the oh-so-frustrating 80% comprehension level:
“Bingle for help!” you shout. “This loopity is dying!” You put your fingers on her neck. Nothing. Her flid is not weafling. You take out your joople and bingle 119, the emergency number in Japan. There’s no answer! Then you muchy that you have a new befourn assengle. It’s from your gutring, Evie. She hunwres at Tokyo University. You play the assengle. “…if you get this…” Evie says. “…I can’t vickarn now… the important passit is…” Suddenly, she looks around, dingle. “Oh no, they’re here! Cripett… the frib! Wasple them ON THE FRIB!…” BEEP! the assengle parantles. Then you gratoon something behind you…
I run into this number “80%” quite a lot in my work. Maybe it’s because of the 80/20 rule; I don’t know. But what I do know is that many learners think 80% comprehension in a conversation or in a business meeting is enough to follow. In reality, 80% is extremely frustrating because you can get so much of the conversation, but you’re still fairly clueless about a lot of the meat of the discussion. Generally speaking, you’ll know the topic, but fully understand virtually none of the details discussed. Pretty maddening.
This isn’t actually bad news… It doesn’t change the number of hours of focused practice needed to become fluent in a language. In fact, it goes a long way toward explaining that intermediate plateau, as you slog from an average of 60% comprehension or so to closer to 90%. That’s why you’re learning so much but don’t feel the breakthrough. It’s also why it’s so important to have a good teacher, and materials at your level.
I remember the first time I had the great idea to use Chinese children’s books as study material. I had been in China for about a year, and having exhausted my old textbook, I was starved for more interesting material. I came upon a book store, and, realizing how cheap books in China were, had the revelation that I should start learning from Chinese children’s books. It was so perfect, and so obvious… why hadn’t I done this earlier?!
Then reality came crashing in. There was a very good reason why everyone wasn’t already doing it: Chinese children’s books are meant for native speaker Chinese kids, and as such, they generally don’t make good material for foreign language learners. But why??
Before I talk about my conclusions as to why, let me just share a few examples from my local book store. This is no scientific survey, but I did my best to select from a number of different publishers and different types of children’s books. The pages I photographed are more or less random. I’m adding a few comments about the suitability of these stories for a high A2 (elementary) or low B1 (intermediate) learner.
– Note the failure to break the characters into words, and the pinyin over every character… both annoying for a learner of Chinese.
– The tone is a more written, formal style than most elementary learners are going to be ready for.
– Notable difficult words: 果然、蹲、急忙、吩咐、目露凶光、黄灿灿、铜钱、打火匣、看守、披
– Again, the failure to break the characters into words, and the pinyin over every character…
– The tone is a more written, formal style than most elementary learners are going to be ready for.
– Notable difficult words: 南辕北辙、中原、楚国、却、驾车、满不在乎、盘缠、摇摇头、糊涂、方向
– Again, the failure to break the characters into words, and the pinyin over every character…
– The tone is a more written, formal style than most elementary learners are going to be ready for.
– Notable difficult words: 恰巧、沼泽、女妖、魔鬼、祖母、参观、酒厂、老妖婆、地狱、一尊、石像、整天、烂泥、妖怪、谈论
– The density of hard words in this book is really high, based on this page
– Again, the failure to break the characters into words, and the pinyin over every character…
– The tone is less formal here, and the words used feel more oriented to kids, but a lot of the words are the type that native speaker kids could understand in the context of a story but would not use themselves; these are the words that would really trip up a lot of foreign language learners.
– You can see that on this page the character 天 is being taught, and yet there are much, much more difficult characters on this page. This highlights the fact that the book is meant to be read to the child; the child is not meant to read it.
– Notable difficult words: 懒、踢、脚、穿、接住、并、蹦、跳、突然、轰隆、一道、裂痕、瞬间、掉
I Go to Kindergarten
– This is my favorite of the bunch; I actually bought this book for my daughter as psychological prep before she started kindergarten.
– The characters are not too hard, but no pinyin! Finally…
– The tone is informal, and this is the kind of language that Chinese parents would expect their children to fully comprehend, in context.
– Somewhat difficult words: 嗨、全班、春游、别提、运动鞋、背着、排好队伍
– No pinyin here, and this one is definitely at a higher difficulty level.
– Difficulty-wise, a high B1 (approaching upper intermediate) learner could probably tackle this, if sufficiently motivated.
– Notable difficult words: 技术、拯救、反派、威胁、社会、消灭、责任、邪恶、存在、身影、而、则、视……为……、心腹之患、试图、保护、善良、顺利、或者、完成
Most Chinese children’s books are too hard for Chinese learners. It’ll be a frustrating slog to read many books (especially those chosen at random), and all the pinyin is likely to be less helpful than you think. There are some good ones suitable for foreign learners out there, but those are the exception rather than the rule. Randomly choosing children’s books for reading practice is not recommended.
I’ve thought about this issue for quite some time already, and my conclusion is that when the average Chinese parent reads a book to her child, the goal is more education-oriented than pleasure-oriented. I know a lot of American parents that work very hard to instill a love of reading in their children, so enjoyment is extremely important. Chinese parents, however, are under a mountain of pressure to get their kids into the best schools in an environment of intense competition. Of course they hope their children like to read, but it’s kind of beside the point. The real goal is to help their children pick up characters and vocabulary as quickly as possible.
If the goal is acquiring characters and vocabulary, it makes sense that the language introduced in these Chinese children’s books is going to be more advanced than one would expect. The children are native speakers, already fluent in Mandarin, and the story provides a clear context. Therefore, why not drop a few extra difficult words and characters on every page? It’s for the kids’ own good!
But wait… there’s HOPE!
There is hope for learners that really want something to read. (Little disclaimer: the following is going to be partly self-promotional, because this is one of the major problems in the Chinese learning industry that I’ve devoted my career to solving.) If there is enough interest among my readership, I’ll consider compiling a list of Chinese books by Chinese publishers suitable for learners (kind of like the kindergarten book above). For now, I’ll focus on several resources that are available to those outside of China.
Oscar & Newton Go to the Park is a print bilingual picture book by AllSet Learning, adapted from its original app form. The language is practical and informal, perfect for A2 adult learners as well as children. It’s now available on Amazon.
The Chairman’s Bao is a website that takes news stories and simplifies them into simpler, shorter articles. See my longer review here. This is great for intermediate learners that want to start working toward reading actual news. Includes audio.
Mandarin Companion creates graded readers (short novels without pinyin or translation) meant for learners of a high elementary or low intermediate level. We’ve got five Level 1 books out, and feedback is great. Our next two Level 2 books are coming out any day now. Books are currently available on the international Amazon website, but not the Chinese one.
Chinese Breeze is the original Chinese graded reader brand. It has cheaper books and more titles out, at levels ranging from high elementary to intermediate. If you’re going for quantity, look here. Books are currently available on both the international Amazon website, and the Chinese one.
If you have any other reading material to add, please leave a comment and share!
Hypothetically speaking, in a rewritten Chinese version of Great Expectations which takes place in modern China, Estella’s name should definitely be 冰冰. But what about Pip? Suggestions welcome! (His name in the typical Chinese translation is 匹普, which is horrible, and which we’re certainly not using.)
(I will neither confirm nor deny that this question is related to Mandarin Companion‘s next release, which may or may not be the first Level 2 book.)
Yesterday Project Naptha hit Hacker News. It offers a way to extract electronic text from image files through a simple Chrome browser extension. Excited to see that simplified and traditional Chinese are both supported by the extension, I immediately installed the extension and tried it out.
The results? Unfortunately, not so great.
When it doesn’t work at all
First of all, the script needs to recognize the text in the image. This first step doesn’t always go too well, even if the text seems relatively clear to the human eye. Let’s look at some cases where the extension found nothing, despite the Chinese text being pretty legible.
In this first case, the font is non-standard. OK, fair enough. That’s to be expected.
In this next case, the text is pretty clear, but the contrast is poor.
In this final example, the text is fairly clear to the human eye, but also low-res and slanted. That probably makes it difficult for the algorithm.
When it sort of works
In many other cases, some text was identified, but not enough for the extension to be really useful for anything. Here are some images where Project Naptha could identify some text, and the “select all text” function was applied. (The blue boxes show what Project Naptha identified in the images as “text.” Sometimes they are bizarrely incorrect.)
I found the last two quite surprising, considering how clear and straightforward the text is, and also high-res.
When it actually works
Sometimes it was relatively successful in identifying the text. In these cases you must first set the language to Chinese (either simplified or traditional, depending on the text). There’s a cool effect showing you that some processing is going on. When that’s done, you can copy and paste the text.
But… it might not be exactly what you were hoping for.
This selected Chinese text yielded the following copy-paste results:
> 总统亲 ã热ﬂ地接
If it had correctly captured all the text, it would have been:
This one is better:
> 化武器况妹俩也不示弱 麝芦神功连
> 连使出 胭宙电二怪打入深深的山沟
It should have been:
Also, my sample size is too small to draw any definite conclusions, but it seems like the extension works better for simplified characters than for traditional.
I don’t mean to sound overly critical. This is amazing technology here, and the fact that it launched with any support for Chinese characters at all is pretty awesome (and brave)! I’m sure the technology will improve with time, and that is going to be tremendously helpful to Chinese learners.
To put this in perspective, the development of OCR (optical character recognition) for mobile devices meant that you could point your cell phone’s camera at any characters you see, and get feedback on what the characters say (sometimes). Project Naptha means the same thing, but for your home browsing experience. For me, that’s when I do a lot more Chinese reading, so it’s even more important. Once this technology is perfected, as long as you have a tool to help you read electronic Chinese text, you’re all set!
Personally, I think this is especially great news for comics. It’s no coincidence that I tested this extension out on comic book text. I’m really looking forward to seeing how this extension develops.
The following is a guest article written by a Sinosplice reader, Julian Suddaby. I have followed it with some commentary of my own.
Warning: if you’re a member of the “Chinese is super easy” faction, this article might annoy you a little, but be sure to read through to the end!
How Many Characters?
by Julian Suddaby, 2014-02-13
I asked Google “how many chinese characters do I need to learn” and the best sites I found pointed to linguist Jun Da’s website and used his data to argue that 3,500 characters should be enough for most people, being that you’ll know around 99.5% of the characters in general circulation.  Is that really enough?
Well, if you’ve got to that point, congratulations. It’s an achievement. But you may not want to stop accumulating characters just yet. Indeed, sad to say, at 3,500 you won’t even be able to read Jun Da’s name, being that 笪 is way down at frequency #5,231.  So how many, then, do you need to learn? Well, that depends on one question that you should ask yourself: what exactly do you want to read?
Students often want to read Chinese newspapers. The Southern Weekly 南方周末 being a popular choice, I took the ten most popular articles over the previous thirty days and ran them through a computer program that checked them against Jun Da’s most frequent 3,500 characters. The results are fairly encouraging for the Chinese student, I think: if you knew the 3,500 you’d only encounter forty-four new characters over the course of those ten articles, and twenty-nine of those you’d only see once and so would probably just take a guess at from context and move on. But you’d possibly want to look up 甄, a pseudonymous surname given to the subject of one of the articles (and thus appearing thirty-five times); 闰, used in the name of a Zhejiang corporation which appears to have buried five hundred tons of poisonous chemicals in their backyard (seven appearances); and 驿, used in the name of a company involved in an online security breach (also seven appearances).
So, while you probably shouldn’t throw out your dictionary just yet, it does seem that trying to read a newspaper won’t be a disheartening experience.
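For readers who want to run the same kind of check on their own reading material, the idea can be sketched in a few lines of Python. This is not the program used for the numbers above, just a minimal illustration. In practice the `known` set would be loaded from a frequency list such as Jun Da’s top 3,500; the tiny set and the sentence below are stand-ins made up for the example. The `is_cjk` helper is also a simplification that only covers the basic CJK block (U+4E00 to U+9FFF), so rarer characters outside that block would be skipped.

```python
from collections import Counter


def is_cjk(ch):
    """Rough check: is `ch` in the basic CJK Unified Ideographs block?"""
    return "\u4e00" <= ch <= "\u9fff"


def unseen_characters(text, known):
    """Return (character, count) pairs for Chinese characters not in `known`,
    most frequent first."""
    counts = Counter(ch for ch in text if is_cjk(ch) and ch not in known)
    return counts.most_common()


# Stand-in for a real frequency list (hypothetical; a real run would load 3,500 characters).
known = set("先生在公司工作的站")

# Made-up sample sentence, not taken from any of the articles.
text = "甄先生在公司工作，甄先生的公司在驿站。"

unseen = unseen_characters(text, known)
print(unseen)

# Characters seen only once are the ones you might just guess from context.
seen_once = [ch for ch, n in unseen if n == 1]
print(seen_once)
```

Sorting by frequency is what surfaces characters like 甄 above: a character that keeps reappearing is worth a trip to the dictionary, while a one-off usually is not.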
A Children’s Book
Children’s novels are another popular choice of reading material for language students. Shen Shixi is a well-regarded children’s novelist, whose Jackal and Wolf has recently been translated into English by Helen Wang. I ran an analysis on another of Shen’s novels, 《鸟奴》(lit. “Bird Slave”). This is, character-wise, much more difficult than the newspaper articles, with two hundred and one characters not in the top 3,500. Ninety of those are used more than once. As you’d expect from Shen, the “king of animal fiction”, animal-related vocabulary is one particular problem here, and you’ll probably end up very confused if you don’t look up 鹩, used two hundred and eighty-four times; 喙, used thirty-six times; and 獾, used twenty-two times. 
The novel is about two hundred and forty pages long, and so you should expect to find a character you don’t recognize on most pages.
A wuxia novel
Jin Yong’s novels remain firm favorites. Rather than starting with the four volumes and 1,300 pages of The Legend of the Condor Heroes 《射雕英雄传》, students might perhaps try A Deadly Secret《连城诀》, which is just four hundred pages or so. In those four hundred pages you’ll encounter two hundred and ninety-six characters not in the top 3,500. The most frequently used are from the protagonists’ names (水笙, 水岱, and 万圭), but there are plenty of new common nouns and verbs used multiple times as well.
On a page-by-page basis, you should recognize more characters than in the Shen Shixi novel above. In terms of total characters, however, A Deadly Secret is more of a challenge.
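The page-by-page comparison is simple arithmetic on the figures quoted above, sketched here in Python. The page counts are the rough ones given in the text, and the ratio counts distinct new characters, not how often each one recurs, so treat it as a back-of-envelope estimate only. The function name `new_chars_per_page` is just for illustration.

```python
# Figures quoted in the text:
# (distinct characters outside the top 3,500, approximate page count)
books = {
    "鸟奴": (201, 240),
    "连城诀": (296, 400),
}


def new_chars_per_page(unseen, pages):
    """Average number of distinct unfamiliar characters per page."""
    return unseen / pages


for title, (unseen, pages) in books.items():
    print(f"{title}: {new_chars_per_page(unseen, pages):.2f} new characters per page")
```

That works out to roughly 0.84 new characters per page for the Shen Shixi novel against 0.74 for A Deadly Secret, even though the latter has more new characters overall.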
A modern classic
Lu Xun’s A Call to Arms 《呐喊》, despite collecting stories he wrote at a very early stage of modern Chinese literary vernacularization, should not be much more difficult than the two novels above—at least in terms of basic character recognition. Two hundred and thirty unseen characters in total, with 闰 (remember that one from above?), 珂 (used in a name) and 锵 (a sound) taking the top three spots. 
Even from this very cursory analysis, it appears that if your goal is to read Chinese fiction comfortably without a dictionary, you’re going to need to recognize more than 3,500 characters. Chinese writers use characters well into the four or five thousand frequency range very regularly.
So although reaching 3,500 is worth celebrating, I wouldn’t stop trying to acquire characters just yet. Keep reading and dictionary-checking, and don’t abandon memorizing/spaced repetition if that’s something you find helpful. You’ll still be coming across new characters for a long, long time….
笪 Dà (a surname here, but means “a coarse mat of rushes or bamboo”, with 旦 dàn providing the phonetic). Here and later I’m using Wenlin as my main reference for character glosses.
甄 Zhēn (a surname here, but originally meaning “to make pottery” and thus composed of 垔 and 瓦, but with no phonetic clue), 闰 rùn (used in a name here, but means “intercalary”; the much more common 润 shares the same pronunciation), 驿 yì (used together with 站 to mean “post/courier station”; right-hand side is the phonetic, as in 译).
鹩 liáo (“wren”, with the left-hand side providing the phonetic), 喙 huì (“snout; mouth; beak”, with both 口 and 彖 radicals semantic; no phonetic clue), 獾 huān (“badger”, with the right-hand side phonetic).
笙 shēng (“reed-pipe instrument”, bottom is the phonetic), 岱 Dài (“Taishan mountain”, top is the phonetic), 圭 guī (“jade tablet”, cf. 挂 or 桂 for the pronunciation).
珂 kē (“a jade-like stone”, right-hand side is the phonetic), 锵 qiāng (“clang”, right-hand side is the phonetic).
For the more technologically-oriented student, another option may be available: thanks to the increasing availability of texts in machine-readable formats, students could run their own frequency analysis on a text they wanted to read and pre-learn characters they don’t already know. It’s a pity there don’t seem to be any easy-to-use programs or websites that offer this functionality.
It should also be noted that single character recognition is only part of reading Chinese, and is not on its own a good measure of reading proficiency. That said, the relative ease of measuring character recognition and frequency may justify its limited use as a self-diagnostic and motivational tool for learners of Chinese.
The following is my response:
Interesting! This sort of helps make a case for the importance of graded readers. (Have you seen Mandarin Companion?)
While I know your intent is to SEEK THE TRUTH, the overall tone of the article is, unfortunately, a little discouraging for struggling learners. For me, this totally highlights the need for materials that give the learner a sense of accomplishment for having reached 300, 500, 1000 characters, rather than an incessant message saying, “STILL NOT GOOD ENOUGH.”
You’re quite right, I suppose I am a little too rigidly 实事求是 in the piece! I completely agree with you about the need to avoid the demotivating “still not good enough” feeling and message that permeates most Chinese teaching materials (how I remember my exasperation when the 高级 textbook still required fifty plus new vocabulary items per short text!). There’s really a huge need for more good reading materials with limited character/vocabulary ranges, and your graded readers look fantastic.
Now before you go too crazy trying to read it, know this:
> Some time ago, Instagram user jumppingjack posted the above image of a note she left to her mum. She said that her brother secretly added extra strokes to the characters in the note. The result is interesting though: even though extra strokes were added, the note is still readable to most competent Chinese speakers.
That brother is kind of awesome. That is the kind of mischief I would have been all over as a kid, if only I had had Chinese characters at my disposal. The character substitutions are pictured at the right. (Note: not all of them are real characters.)
(Also, I totally sympathize with jumpingjack for writing the character 期 wrong, with the two sides swapped. I have done that way too many times myself.)
Try having a Chinese friend read the note and take note of what gives them the most trouble. Read the original post for the note’s original content (in electronic text) and the author’s analysis and conclusions.
If you follow me on Twitter you may have heard of Mandarin Companion already, but this is the first time I’m directly mentioning it on Sinosplice. I was waiting until all five of our Level 1 digital editions were released for both Amazon Kindle and iBooks, and now they are.
Mandarin Companion graded readers are for learners with 1-2 years of formal study under their belts (or the equivalent), looking for something longer and more interesting to read for pleasure, without having to constantly reference a dictionary.
Mandarin Companion’s Level 1 books assume a foundation of only 300 Chinese characters, and they are 300 characters you will already know if you’ve studied virtually any standard course.
To create this graded reader series, I’ve teamed up with a partner, Jared Turner, while also leveraging the tools and talent at AllSet Learning.
What are the titles?
We released five Level 1 stories in 2013, all based on western classics and adapted into Chinese stories (more on that in a future post). Here are the first five titles:
The Secret Garden:《秘密花园》 This was our first book, and it was an awesome choice. It’s an excellent story, free of complicated settings or plot twists. There are more characters in this story than in most of our other ones, but they all have easy (and very Chinese) names, and the story ends up feeling very Chinese itself, despite the British roots. (Just look at the cover!)
The Sixty-Year Dream:《六十年的梦》 You can’t tell from the name, but this graded reader is an adaptation of Rip Van Winkle. In adapting this and making it totally Chinese, we had a lot of issues to consider. The original work is about going to sleep as a colonist before the American revolution, and waking up afterward in a newly formed country. It’s a story about change. Well, what country knows change better than China? For maximum dramatic effect, we chose a 60-year time span, going from pre-Communist China to post-Mao China. The relevant Chinese history of the periods adds a lot of color to the story.
The Monkey’s Paw:《猴爪》 I remember reading this classic story as a kid, and it totally creeped me out. The first time you’re introduced to the idea of pre-determinism it kind of blows your mind, right? I initially had my doubts as to how well this story could be adapted into simple Chinese while preserving the feel, but we pulled it off pretty well, if I do say so myself.
The Country of the Blind:《盲人国》 This graded reader is based on a classic H.G. Wells story, and I actually blogged about it not long ago, in conjunction with China. (Now you know why I was thinking about the story so hard!) The text of the story doesn’t get into any of those details, really, though… I just wanted as close to an “adventure” story as we could do at the 300-character level (it really is a challenge), and this one fit the bill. The sci-fi connection was icing on the cake! This one is also notable because we altered the original ending just a little bit.
Sherlock Holmes and the Red-Headed League:《卷发公司的案子》 What if you adapted Sherlock Holmes to 1920s Shanghai? Well, this is what happens! This one was fun, because we had to research styles of the time to get the illustrations right, but actually none of that affected the text of the story itself. (But hey, details matter, right? Sherlock.. errr, 高明 would approve!) It was definitely a pleasure to create our own take on the world’s most famous sleuth.
I’m really proud of these books we’ve created, and I wish I had had material like this when I was just starting out on my journey of learning Chinese. You don’t have to wait until you can read a Chinese newspaper to enjoy reading Chinese, really.