Unicode with Blogger
10 Sep 2003
Unicode is great, but so far underused. It’s basically a newer, larger character set designed to make multilingual computing easier, indirectly bringing peace and harmony to all. Maybe one day we’ll be free of the mojibake and luanma (that’s Japanese and Chinese for “garbage characters”) that thwart our otherwise well-intended communications. Unicode is a step in the right direction.
What does implementing Unicode mean? It means you’ll no longer load up a page to find “garbage characters” and have to change the encoding used for the page. It means you can have characters from completely different character sets (say, Chinese and Korean and French) on the same page. Check out Glome for a good example of that. Unicode is great.
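To make the "different character sets on the same page" point concrete, here is a small Python sketch. It is only an illustration: the sample text and the helper name `fits_in` are my own, and they have nothing to do with Blogger itself. It shows one string holding Chinese, Korean, and French surviving a round trip through UTF-8, while the older single-language encodings cannot hold all of it.

```python
# One string mixing Chinese, Korean, and French (sample text is made up).
text = "阿福 한국어 français"

# UTF-8 can store every character, so encoding and decoding loses nothing.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")


def fits_in(encoding: str, s: str) -> bool:
    """Return True if every character of s can be stored in `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(round_trip == text)          # UTF-8 round trip
print(fits_in("utf-8", text))      # the whole string fits
print(fits_in("gb2312", text))     # a Chinese encoding has no Korean Hangul
print(fits_in("latin-1", text))    # a Western encoding has no Chinese
```

The last two checks are the situation a single legacy encoding puts a multilingual page in: whichever one you pick, some of the text has nowhere to go.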
I bring this up largely because I think other China bloggers really ought to adopt Unicode in their blogs. Alf’s latest post reminded me of that. Even though he entered his Chinese name, “阿福,” correctly in Blogger, I can’t read it even when I change the encoding, and he made that post on my computer!
So I’d like to provide some instructions for those who use Blogger.
1. In Blogger, go to Settings, then Formatting.
2. Change the Encoding to “Universal (Unicode UTF-8)”.
3. Save Changes.
4. Go into the Blogger template.
5. In the <head> section of the document (that’s the part between the <head> and </head> tags), insert this line:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
6. Save Changes.
Now when people visit your blog, it will automatically load with Unicode encoding and characters should display fine.
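If you want to double-check that the template change took, a rough Python sketch like the one below can read the declared encoding out of a page's HTML. The names `MetaCharsetFinder` and `declared_charset` are hypothetical helpers of mine, not anything Blogger provides, and the sample HTML is just the meta line from the steps above wrapped in a minimal page.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Remember the charset named in the first Content-Type meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here.
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        # content looks like "text/html; charset=UTF-8"
        for part in (attributes.get("content") or "").split(";"):
            key, _, value = part.strip().partition("=")
            if key.strip().lower() == "charset":
                self.charset = value.strip().lower()


def declared_charset(html: str):
    """Return the lowercased charset declared in the HTML, or None."""
    finder = MetaCharsetFinder()
    finder.feed(html)
    return finder.charset


sample = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
    "</head><body></body></html>"
)
print(declared_charset(sample))
```

A page without the meta line gives `None`, which is the case where the browser is left to guess the encoding.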
IMPORTANT NOTE: When you input Chinese into your blog entry through Blogger, you must be sure your browser is already in Unicode encoding. Otherwise it’ll all turn out as garbage. If you only remember to switch over halfway through your entry, post first and then change the encoding, because changing the encoding will make you lose everything you’ve written in Blogger’s “Edit Post” window. If some of what you’ve written is already in Chinese, copy and paste it into a text file, switch over to Unicode encoding, then copy and paste it back in. Nothing lost.
IMPORTANT NOTE 2: If you’ve written in Chinese in the past and it can be viewed successfully in your archives simply by switching to Chinese encoding, it will nevertheless become garbage after you switch over to Unicode. You’ll have to decide if you think it’s worth it to switch. I do.
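Here is a minimal Python sketch of why that happens. I am assuming the old posts were saved as GB2312 bytes, which is a guess about a typical Chinese-encoded blog and not something Blogger documents. The same bytes read correctly as GB2312 and turn into garbage when the page says they are UTF-8.

```python
name = "阿福"

# Assumption: the old post was saved in a legacy Chinese encoding (GB2312).
legacy_bytes = name.encode("gb2312")

# Reading those bytes with the matching encoding gives the name back.
read_as_gb2312 = legacy_bytes.decode("gb2312")

# Reading the very same bytes as UTF-8 does not. Invalid sequences are
# shown as U+FFFD, the replacement character browsers also display.
read_as_utf8 = legacy_bytes.decode("utf-8", errors="replace")

print(read_as_gb2312 == name)
print(read_as_utf8 == name)
print(read_as_utf8)

# Converting is possible in principle if you can get at the old text:
# decode with the old encoding, then encode again as UTF-8.
converted = legacy_bytes.decode("gb2312").encode("utf-8")
print(converted.decode("utf-8") == name)
```

The conversion at the end only helps if you can get the old entries out, fix them, and put them back, so whether it is practical for a given archive is a separate question.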