Seeing the Tones of Mandarin Chinese with Praat

by John Pasden


21 Jan 2008

When you first start studying Chinese, you are introduced to Mandarin’s four main tones. You are invariably shown some variation of the chart on the right. You may have wondered where these lines came from. Are they just some artist’s conception of how the tones sound that everyone ended up agreeing on? No, actually, they’re tone contours, the result of linguistic research into the pitch contour of the various tones of Mandarin Chinese.

At this point, your average language student is going, “oh, right, pitch contour. Linguisticky mumbo jumbo. Whatever.” He then decides to accept the chart, no matter how helpful or useless he happens to find it, and move on. The reality, however, is that pitch contour is incredibly easy to see, thanks to a piece of free linguistic software called Praat. I’m going to show you how to do this yourself in a few easy steps so that you can stop accepting this “tone contour” stuff on faith alone.

Using Praat to See the Tones of Mandarin Chinese

Download Praat. It works on Windows, Mac, Linux, and all kinds of other platforms. Very geek friendly.
Open Praat. The current version is 5.0.04, so that’s the one I’ll be using in all my screenshots. When you open Praat, you see two windows like this:

OK, perhaps not the most user-friendly interface in the world, but don’t worry. It’s not hard.
You see two Praat windows open. We’re just going to ignore that one on the right, because we’re not going to use it. In the left window, click on “Read” in the top menu, then select “Read from file…”.

From here you can open various sound files, such as .WAV files or even .MP3 files. To keep things simple, though, I want to open a file which contains only one spoken Mandarin syllable. For this, I can turn to my Mandarin Chinese Tone Pair Drills. In the downloadable file, there’s a directory called “1-Char Adj” which has several monosyllabic word examples for each of Mandarin’s four main tones. (Because the tone drills are freely available for download, you can reproduce this exact example, if you wish.)
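If you’d rather poke at the audio outside of Praat, a plain WAV file is easy to read with a few lines of Python. This is only a sketch and not part of the Praat workflow: it assumes an uncompressed 16-bit PCM WAV (MP3s won’t work here), and the function name is made up for illustration.

```python
import struct
import wave


def read_mono_wav(path):
    """Return (sample_rate, samples), with samples as floats in [-1, 1).

    Assumes 16-bit PCM audio. For stereo files only the first channel is kept.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    # Unpack little-endian signed 16-bit integers, then scale to floats.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [v / 32768.0 for v in values[::channels]]
```

Calling something like read_mono_wav("da4.wav") (a hypothetical file name) would give you the sample rate plus a list of numbers, which is all the waveform in the top box really is.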

I choose the fourth-tone word “dà” (meaning “big”) and open it. It appears in the window:
Now “Sound da4” should be highlighted in blue. If it isn’t, click on it. Select “Edit” from the menu at the right:

(OK, now I’m taking this really slow for those of you who might be intimidated by a piece of “linguistic software,” but I should point out that all we’ve done so far, really, is (1) open Praat, (2) open an audio file, (3) click on edit.)
This will bring up a new window that looks like this:

OK, now we see two boxes. The one on the top is a waveform. The one on the bottom is a spectrogram, which is also where the pitch contour will be displayed. The pitch contour is the one we’re interested in.

Why is it all scrunched up on the left, though? That’s because the entire file is displayed in the window, and with the exception of the very beginning, most of the file is silence. So let’s zoom in on what we’re interested in. Click and drag in the window to select the blackish parts on the left. They should be highlighted in pink. Now turn your attention to the four little buttons in the bottom left corner of the window labeled “all,” “in,” “out,” and “sel.” These are actually zoom options, which stand for “show all,” “zoom in,” “zoom out,” and “zoom in on selection.” So click on “sel” now. You may want to resize the window at this point to make it more square. You should see something like this:

Now the pitch contour should be quite obvious, a blue line. (The ghostly grayish background is the spectrogram. We won’t be paying much attention to it, but we’ll appreciate it making the image look cooler.)

Are you surprised? There’s a funny break in it, but you can clearly see the falling pitch contour that we would expect for the fourth tone word “dà.”
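If you’re curious what that blue line actually is: it’s an estimate of the voice’s fundamental frequency, measured over and over in short slices of the recording. Praat’s own pitch tracker is considerably more refined than this, so treat the following as a bare-bones sketch of the general idea (plain autocorrelation), not as Praat’s algorithm. The function names, the 75 to 500 Hz search range, the 0.3 voicing threshold, and the synthetic falling glide standing in for a fourth tone are all assumptions made for the example.

```python
import math


def estimate_pitch(frame, rate, fmin=75.0, fmax=500.0):
    """Estimate one frame's fundamental frequency (Hz) by autocorrelation.

    Returns None when the frame shows no clear periodicity (e.g. silence).
    """
    energy = sum(x * x for x in frame)
    if energy == 0:
        return None
    min_lag = int(rate / fmax)
    max_lag = min(int(rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # How well does the frame match itself when shifted by `lag` samples?
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # Require a reasonably strong match relative to the frame's energy.
    if best_lag is None or best_score / energy < 0.3:
        return None
    return rate / best_lag


def pitch_contour(samples, rate, frame_s=0.04, step_s=0.01):
    """Slide a window over the signal; return a list of (time_s, hz_or_None)."""
    size, step = round(rate * frame_s), round(rate * step_s)
    contour = []
    for start in range(0, len(samples) - size + 1, step):
        hz = estimate_pitch(samples[start:start + size], rate)
        contour.append(((start + size / 2) / rate, hz))
    return contour


def falling_tone(rate=8000, dur=0.3, f_start=220.0, f_end=120.0):
    """Synthesize a sine wave gliding downward, a crude stand-in for 4th tone."""
    samples, phase = [], 0.0
    n = round(rate * dur)
    for i in range(n):
        freq = f_start + (f_end - f_start) * i / n
        phase += 2 * math.pi * freq / rate
        samples.append(math.sin(phase))
    return samples


if __name__ == "__main__":
    for t, hz in pitch_contour(falling_tone(), 8000):
        print("%.2f s  %s" % (t, "no pitch" if hz is None else "%.0f Hz" % hz))
```

Run on the synthetic glide, the printed frequencies step downward over time, which is the falling contour in numeric form. Wherever a frame has no clear periodicity, the sketch reports no pitch at all, and a tracker that does this leaves a gap in the line for that stretch.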

That’s it! You can repeat this method for as many words as you want to. You can examine the pitch contours of native speakers’ speech, and you can even record yourself and look at the pitch contour of your own speech.
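If you do repeat this for lots of files, the one fiddly manual step is selecting the part of the recording that isn’t silence before zooming in. Here is a rough sketch of doing that automatically. It is not something Praat does for you in the steps above: it assumes the audio is already a list of floats between -1 and 1, and the function name, the 10 ms frame size, and the 10% loudness threshold are arbitrary choices for the example.

```python
import math


def find_sound_span(samples, rate, frame_ms=10, threshold=0.1):
    """Return (start_s, end_s) of the stretch that is not silence.

    A frame counts as sound when its loudness (RMS) is at least `threshold`
    times that of the loudest frame. Returns None if everything is silent.
    """
    frame = max(1, int(rate * frame_ms / 1000))
    rms = []
    for i in range(0, len(samples) - frame + 1, frame):
        chunk = samples[i:i + frame]
        rms.append(math.sqrt(sum(x * x for x in chunk) / frame))
    if not rms or max(rms) == 0:
        return None
    cutoff = threshold * max(rms)
    loud = [k for k, level in enumerate(rms) if level >= cutoff]
    # Convert the first and last loud frame indexes back into seconds.
    return loud[0] * frame / rate, (loud[-1] + 1) * frame / rate
```

The two numbers it returns play the same role as the pink selection: everything outside them can be ignored when looking at the contour.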

On that note, though, I had better point a few other things out.

Some complications

First, let’s look at the pitch contours of all four Mandarin tones. From now on we’ll be ignoring the waveform in the top box in both my explanations and screenshots, and I’ll add pinyin to the graphics to make the sounds easier to identify. Here’s a sampling (again, taken from my tone drills):

Many of you are thinking, wow, they really do look like the chart! But then the critics speak up: why isn’t first tone totally level? It kind of has an arc to it. Shouldn’t third tone rise more at the end? And what is with that break in fourth tone?

Well, the truth is that the chart I opened with is an idealized version of the tone contours. The real thing is actually quite a bit messier. To illustrate my point further, I’ll give you the pitch contours of some disyllabic Mandarin chunks:

So… why the dip in second tone “bú” of “bú cuò”? Why don’t the second tones “liú” and “xíng” rise to the same height, if they’re both second tones? In “jiǎohuá” why doesn’t the third tone rise more and why does the second tone seem to dip? Is there a problem with the source data?

No, there isn’t a problem with the source data. Theoretically, you should be able to re-record the words again and again until the pitch contours look how you want them to. But if you listen to the audio data we used, it sounds fine. So what gives?

My point is not to confuse you. There are answers to all these questions. But when you get down to the pitch contour of individual words spoken by individual people, the situation is, in reality, incredibly complex, and a nice little tone diagram doesn’t even begin to explain it all.
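For comparison, the idealized diagram really is simple enough to write down in a few lines. A minimal sketch, assuming the conventional five-level description of the four tones (usually written 55, 35, 214, and 51, where 1 is the lowest pitch level and 5 the highest). The function name and the straight-line interpolation between levels are choices made for the example, not something taken from the research behind the chart.

```python
# Conventional five-level description of the four Mandarin tones:
# 1 = lowest pitch level, 5 = highest.
IDEAL_TONES = {1: [5, 5], 2: [3, 5], 3: [2, 1, 4], 4: [5, 1]}


def ideal_contour(tone, points=9):
    """Linearly interpolate a tone's pitch levels into `points` samples."""
    levels = IDEAL_TONES[tone]
    segments = len(levels) - 1
    contour = []
    for k in range(points):
        pos = k / (points - 1) * segments   # position along the level list
        i = min(int(pos), segments - 1)     # which segment we are in
        frac = pos - i                      # how far along that segment
        contour.append(levels[i] + (levels[i + 1] - levels[i]) * frac)
    return contour


if __name__ == "__main__":
    for tone in (1, 2, 3, 4):
        print(tone, [round(v, 1) for v in ideal_contour(tone, points=5)])
```

Four short lists of levels are the entire chart. Everything in the screenshots above that doesn’t match them is the messiness of real speech.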

Conclusion

So what are you supposed to take away from all this? Well, first, I hope you did see that in general, the tones of Mandarin do follow the trends depicted in the basic tone diagram. I’m a visual learner, and I really struggled with the tones, so I feel like it helped me to be able to connect the audio data with a visual representation somehow. And it’s a whole lot easier to do than I first suspected.

Second, I hope you understand that if you’re struggling with tones, you really shouldn’t beat yourself up over it. The reality of tones in action is incredibly complex, and the basic tone chart is a gross oversimplification. The good news is that your brain is already fully equipped to figure out the real deal, and the basic tone chart is the only starting point you really need.

Lastly, if you thought this was going to be some kind of method which allows you to mimic the tones of native speakers through visual pitch contour comparisons, I’m sorry to tell you that I think that’s a very bad idea. Pitch contours “in the wild” aren’t consistent enough for that. It’s the kind of idea that might appeal to a programmer or a perfectionist, but in reality, that kind of practice isn’t likely to help you communicate better in Chinese.


John Pasden

John is a Shanghai-based linguist and entrepreneur, founder of AllSet Learning.

Comments

coljac Says: January 21, 2008 at 9:15 am

Very interesting. My own tones were much less clear – could be a great way to practice. I’m going to record my g/f and myself saying the same stuff, and compare. It might identify some weaknesses in my own Chinese tones.

syz Says: January 21, 2008 at 9:26 am

Nice post. This is a great illustration of what you can do with Praat for a tonal language. When I downloaded it a few months ago, I was kind of disappointed until I stumbled my way into the pitch contour function. Holy cow, you can actually show people their tones?! Very cool.

I’ve found that it can be a little disappointing outside the controlled environment of a quiet room and a real microphone. So now I’m still looking for the right opportunity to include the charts along with some real-noisy-world recordings.

John B Says: January 21, 2008 at 9:53 am

Lastly, If you thought this was going to be some kind of method which allows you to mimic the tones of native speakers through visual pitch contour comparisons, I’m sorry to tell you that I think that’s a very bad idea.

I think, though, that it could be useful in seeing what your own tones are doing. I remember trying to get second tone down pat, and swearing that I was making the same noise my wife was but having her tell me that I wasn’t. I’m a visual learner too, so if I could have seen exactly what I was doing, it may have helped me correct it. Your pitch contours won’t be exactly the same as any other person’s, but they should be in the same ballpark.

tim Says: January 21, 2008 at 11:36 am

Interesting analysis, especially for someone who spent way too much time doing Fourier transforms in school 🙂
My downfall is definitely the 3-2 pair, like 美国 mei3 guo2. I had one teacher explain that for the third tone, you should “only pronounce the first half”, which kinda makes sense from your graph above of jiao3 hua2. Was that pitch contour representative of other 3-2 pairs?

kastner Says: January 21, 2008 at 4:17 pm

Will try it.
In everyday speech, 3rd tone is much shorter and sometimes doesn’t rise, so I’ll intentionally use a BASS voice to see what happens.

Abstract Says: January 21, 2008 at 4:33 pm

If you speak several tonal languages (such as several Chinese dialects), you can train yourself to produce correct tone contours in languages you have never heard before, just by looking at diagrams.

Eventually, you can produce all five tone levels, and any combination thereof, on cue. (And of course you can distinguish them by ear.)

Feds Says: January 21, 2008 at 4:58 pm

No! A new method for Chinese language teachers to show me just how off my tones are 🙂 My self-delusion can’t stick it out in the face of this kind of software. I’m sure any language teachers reading this post must be chomping at the bit to use Praat – if only they had multimedia classrooms. Something for the tutors and do-it-yourselfers for now.

I wonder if I should use this on my in-laws to see how they confuse me with their non-tonal Shanghainese accents when they’re speaking Putonghua with me. Course I’ve got bigger problems when ‘hu’ is pronounced ‘fu’, etc.

Great post John

John Says: January 21, 2008 at 9:28 pm

John B,

Yeah, it’s definitely a useful tool for things like that. It’s especially useful if you’re confusing second tone and fourth tone and you need to see some proof of what’s going on!

I just meant that I think it’s not a good basis for study, or for “a method.” It’s a nice visual supplement though.

- MakMak Says: October 3, 2010 at 5:24 pm
  
  Any chance this might work for the Cantonese language and its tones (all 6 of them), so that I can find some visual way to represent them just like you did for Mandarin? 🙂
  
  - John Pasden Says: October 4, 2010 at 1:11 pm
    
    Of course it will work. You just need the audio files to analyze.
John Says: January 21, 2008 at 9:29 pm

tim,

Yeah, 3-2 is going to be something like that. The initial consonant will affect the overall starting point of the pitch contour, though.

Mark Says: January 22, 2008 at 2:21 am

This is very similar to what I was talking about on your post about whispering. It takes a very quiet room and a decent mic.

Lavarock Says: January 22, 2008 at 2:32 am

Very interesting (BTW, this is a wonderful blog!)
Now, there is also a fifth tone in Mandarin, the silent tone. How is it gonna show up in this program?

Chiao Says: January 22, 2008 at 11:48 am

Lava Rock:

The actual pitch of the neutral tone depends on the tone of the preceding syllable.

http://en.wikipedia.org/wiki/Standard_Mandarin#Neutral_tone

John Says: January 22, 2008 at 2:40 pm

Chiao,

Good link. Wikipedia keeps getting better and better… I’m pretty sure that info wasn’t there last time I read that page.

Mike W Says: January 22, 2008 at 4:13 pm

I don’t know if any of you know about this already, but since no-one has mentioned it I thought I would. There is a piece of free software called SpeakGoodChinese, available from http://www.speakgoodchinese.org, that is based on Praat. It lets you record yourself speaking within the program and then gives you a visual representation of your tone, just like John has demonstrated above. SpeakGoodChinese also gives you the tone reference on the screen before you speak as a guide.

Hope someone finds it useful!

disclaimer: I am in no way affiliated with the makers of this software, just passing on the information 🙂

- Philip Harding Says: January 10, 2016 at 11:12 pm
  
  This program is very useful for two reasons. It saves recorded files individually and keeps them grouped with the target practice files, and it displays just the pitch contour without the spectrogram, which I don’t need (I can’t seem to isolate just the pitch contour with Praat).
  However, I have a USB mic on my main computer (a 2011 Mac Mini) which always works great but crashes every time on this program when I attempt to record. I tried adding a script to fix the problem but it wouldn’t work. Any feedback would be great.
  
John Says: January 22, 2008 at 11:10 pm

Mike W,

Thanks for providing that link. I actually saw it a while ago, but I forgot about it.

jeebus Says: January 23, 2008 at 1:11 pm

@tim: 3-2 doesn’t have to be difficult. Think about how you say “uh-huh” when someone is telling you a story and you want them to tell you more. The uh goes low and the huh rises.

Jian Li Says: April 27, 2008 at 5:00 am

Hi, John,
It is great to know that you are so enthusiastic about learning Chinese and helping others to do so. I am working in NZ as a secondary school teacher.
It is a bit hard at the moment, but it will gradually get easier. I think the software could also be used for learning English. Zhonglish and English can be mutually beneficial. Language is also my favorite. I hope to improve myself in applied linguistics as well someday.

I will try this and follow your website to see what’s new. It can also be used by Kiwis to learn Chinese.

Easing back into Chinese « All Mandarin, All The Time Says: July 20, 2008 at 5:42 am

[…] explanation of Chinese pronunciation that I have ever found so far. That and John Pasden’s frequency analysis of Mandarin tones. […]

Dave Johnson Says: August 17, 2008 at 1:33 am

I looked at the Wikipedia entry on the neutral tone, and there does seem to be a pattern to the changes caused by the preceding tone. It looks like the point is to emphasize the notion of returning to a middle level tone (relatively speaking), but the emphasis requires overshooting a bit. Plus, Mandarin does not really have a level middle tone, so what is used is one of the regular four tones that moves the pitch in the right direction. I would also guess that in linguistic terms this might also be a case of “co-articulation”, which is to say that speakers of all languages tend to modify any given sound to make it easier to say the succeeding sound smoothly and quickly. For example, in English we say the ‘K’ sounds in ‘key’ and ‘call’ with our tongues in different places, and Chinese speakers learning English have just as much trouble with this as we do with tones in Chinese. The net result is that in learning Chinese we really have to learn the tone contour of an entire phrase, rather than simply trying to string together the tones of isolated words like lexical Lego blocks.

Toward Better Tones in Natural Speech | Sinosplice: Life Says: December 10, 2008 at 11:48 pm

[…] “idealized” properties. That is to say, if you look at their tone contours (remember how to do that with Praat?) in the sentence, they don’t all resemble the perfect angles in the classic chart we all […]

Olle Linge - Life, literature and the pursuit of dreams · Online Highlights 7 Says: November 2, 2009 at 12:23 am

[…] Mandarin tones with Praat – Praat is a program for visually analysing sound, in this case taking a look at what the tones in mandarin Chinese actually look like. How close are they to the tone curves we’ve all been taught? Very, very interesting. […]

Chinese Language Learning Tool: Praat « fun asian games Says: January 26, 2013 at 10:26 am

[…] https://www.sinosplice.com/life/archives/2008/01/21/seeing-the-tones-of-mandarin-chinese-with-praat […]

朱立安 Says: December 8, 2013 at 12:37 am

Thank you for this post!

My biggest hope for future Chinese students is that teachers will explain the tones in context, not as single characters. Every Chinese class starts with 媽麻馬罵 plus the sandhi rule for 3-3, but it would only take a little more time to practice every combination from 1-1 to 4-4.

It took me forever to get 3-2 and 4-3 compounds right, simply because I’m an extremely gullible student and kept looking at the chart 🙂

Alfred Reinold Baudisch Says: April 1, 2015 at 7:59 pm

After analysing some sound files and my own voice I’m surprised that none of my tones compare to the ones of a native speaker. You are correct when you said “It’s the kind of idea that might appeal to a programmer or a perfectionist”. As a programmer I was already obsessed and going crazy analysing my voice there. I was even considering an app for that, but it wouldn’t be of use for better fluency, maybe just for fun.

Tony Narloch Says: September 23, 2016 at 3:20 am

I am a little confused.

It is stated that this will not aid in fluency. However, I thought that it would help with fluency, because it would rectify/correct some of the bad habits beginners (such as myself) pick up? Allow us to produce a sound of tonal combinations that is far more accurate or at least, closer to the ballpark of a native speaker?

Perfect emulation is inappropriate and unrealistic as a goal. But, surely me producing a sound that is more accurate (and by accurate I would gauge that according to the standard of “being understood by a native Mandarin speaker”) IS aiding in fluency…..?

Or am I missing something…? Basically, my worry is that whilst I have vocab under my belt, I am not going to be able to use it by virtue of the fact that I am not able to pronounce it properly.
