Slumming it with nciku

20 Oct 2009

I recently looked up the word 贫民窟 (meaning “slum”) in nciku. The definition included this example of usage:

> She decided to slum it for a couple of months.

> 她决定去贫民窟待几个月。

The Chinese sentence, translated back into English, would be:

> She decided to stay in a slum for a couple of months.

I think the translator missed something in this particular case, and the content of the sentences (as well as the order) strongly suggests that the Chinese is a (not so great) translation of the English.

So how is nciku getting its sample sentences for Chinese words? The OED is the champion of the dictionary quotation for the English language, containing tons of examples of its words’ usage “in the wild.” Dictionary sample sentences are best when taken from other sources, but those sentences should at the very least be composed in the language the dictionary serves. It seems this is not what’s happening with nciku, but maybe Collins (one of nciku’s data sources) is to blame.

John Pasden

John is a Shanghai-based linguist and entrepreneur, founder of AllSet Learning.

Comments

  1. Oh dear. Actually, especially with Pleco accidentally left overseas, I’ve often been relying on those useful example sentences to make sure I was choosing the right word; the problem with a dictionary, of course, is that it’s a many-to-many mapping and only an example sentence can properly disambiguate. The presence of the sentences is a good feature of Nciku. Shame they are not reliable.

    Compare to dict.leo.org, which I use constantly when racking my brain for a German word. Great when I just need a reminder, but if it’s a new word it’s often useless. A search for “common” yields 16 “common (adj.)” results with no way to distinguish them. This is a common weakness of all open-sourced dictionaries, and one reason why I hate CEDict more and more.

  2. Interesting. Before I clicked the slang dictionary link, I’d have guessed the English sentence had something to do with her dating someone from “the wrong side of the tracks.”

  3. nciku entries have 2 types of examples: the ones that appear in the definition area, and the ones in the “examples” area below. Examples directly attached to an entry (the ones that appear in the definition area below each meaning) are originally in the language of the entry headword, so ‘贫民窟’ will have examples that were originally Chinese, while ‘slum’ will have ones that were originally in English.

    The examples area at the bottom of the page is similar to the examples search mode, i.e. it will bring up all the examples that contain that word attached to any entry, whether they were originally in English or Chinese. But it’s sorted so that examples that contain the search term in their original form always appear above ones that only contain it in their translation, so for common words all of the examples were originally in Chinese (assuming the word you’re looking up is a Chinese one). For less common words, it’s better to see a translated example than nothing at all.

    You can tell what language an example was in by which language appears first – originally-in-Chinese examples have the Chinese sentence first with the English version in a lighter colour below it; originally-in-English examples are the other way round.

  4. Kevin,

    Thanks for the clarification! Good to know that the data is all nicely categorized.

    Looking at the dictionary page, though, this is definitely not clear. I see three boxes with Chinese-English dictionary entries, a link to English-Chinese entries, and then the examples. I just assumed the examples go with the definitions (I didn’t click the “English-Chinese” link, after all). It would be great if example sentence sources were subtly noted somewhere.

    Are you an nciku employee?

  5. That’s interesting! That’s why I prefer MDBG, although it does have mistakes occasionally as well.

  6. Yes, I work for nciku – thanks for letting us know about the problem.

    We did actually use to have links after the examples saying which entry they were attached to, but some users thought that the word in the link was a synonym/translation of the entry headword – e.g. we would have a link to “blight” after the example “Slums are a blight on a city”, but some users thought this meant that the meaning of 贫民窟 in that sentence was “blight”. But looking back, that was possibly also a result of not realising that these examples are search results. We’ll try to think of a better way to make this clear.

    • Hey Kevin, are you still working for NCIKU (or LINE)? I was a really devoted user, and thought the website offered a breadth of service and ease of use pretty much unparalleled on the web. I had been relying on it heavily to prepare for the HSK, and was really surprised to see that the site has been totally dismantled and replaced by some sort of stripped-down, app-style sentence generator. I am assuming this is a Beta stage and that the site will be updated with other content later on, but I am really shocked that LINE would make this move literally overnight without notifying their customer base. Isn’t the policy generally to have a Beta service run concurrently with the original until the content is locked and loaded? This is a shot in the dark (I see the last message was posted 5 years ago), but any clarification you have to offer would be really helpful.

  7. Kevin,

    Yes, it’s a bit tricky, but I’m sure there’s a better way. Thanks for listening!

  8. It would be great to be able to rank examples according to usefulness, and then show the most useful examples first. This could be done in a way similar to comment ranking on larger sites. I have found that when translating some texts, the examples are really helpful, but need to be better organized. Including more examples from original Chinese texts would be great as well.

  9. Tim OShaughnessy Says: November 8, 2009 at 6:00 am

    I may be mistaken, but I don’t think a Chinese person would ever call a slum a 贫民窟. In my experience, the phrase used most often, especially when talking about places outside China BUT also, surprisingly, IN China…
    the most common word for slum and ghetto is, sadly, a racist one, though then again the Chinese people I have asked don’t see it this way; it’s just a word to them. (And feel free to contradict me!) But the correct word is: 黑人区。

  10. Tim OShaughnessy Says: November 8, 2009 at 6:03 am

    Well, they will say 贫民区/贫民窟, but I honestly think that, sadly, 黑人区 is far more 口语.

  11. 我来串门的 Says: April 21, 2010 at 4:34 pm

    slum —— 贫民区 —— 城中村儿 (Young people will say)

  12. […] sentences are not as well-researched as they should be. John Pasden covers this at Sinosplice: incorrect nciku example sentences. The automatic audio-generation is quite useful, but it might be better to hear real recordings of […]
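
For readers who want to picture the ordering Kevin describes in comment 3, here is a minimal sketch in Python. It is not nciku's actual code: the `Example` class, the `rank_examples` function, and two of the three sample sentences are hypothetical, made up only to illustrate the rule that examples containing the search term in their original text sort above examples that contain it only in translation.

```python
from dataclasses import dataclass


@dataclass
class Example:
    original: str     # the sentence in the language it was written in
    translation: str  # its translation into the other language


def rank_examples(term, examples):
    """Return the examples that mention `term`, originals-first.

    Hypothetical helper: examples whose original text contains the term
    come before examples that contain it only in their translation.
    """
    matches = [
        ex for ex in examples
        if term in ex.original or term in ex.translation
    ]
    # False sorts before True, so "term is in the original" comes first.
    # sorted() is stable, so ties keep their input order.
    return sorted(matches, key=lambda ex: term not in ex.original)


examples = [
    # From the post: originally English, translated into Chinese.
    Example("She decided to slum it for a couple of months.",
            "她决定去贫民窟待几个月。"),
    # Made-up sentence, assumed to be originally Chinese.
    Example("这个城市有很多贫民窟。", "This city has many slums."),
    # Made-up sentence that does not mention the term at all.
    Example("我喜欢喝茶。", "I like drinking tea."),
]

ranked = rank_examples("贫民窟", examples)
for ex in ranked:
    print(ex.original, "/", ex.translation)
```

With this toy data, the originally-Chinese sentence is listed first and the translated “slum it” sentence second, which matches the behaviour described for less common words: a translated example is still shown, just lower down.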
