Below is a very informal guide to the pronunciation of the Japanese language. There are always exceptions to every rule, but this guide is simply intended to help non-native speakers get at least close to the words and phrases commonly used in the practice of these arts, so please forgive any glaring errors.
The Five Vowels
The Japanese language has only five vowels:
a = “ah” as in father, not like the a in “fat” or “late”
e = “eh” as in pet, not like the e in “feet” or “athlete”
i = “ee” as in machine, not like the i in “fit” or “kite”
o = “oh” as in so, not like the o in “boss” or “would”
u = “oo” as in flu, not like the u in “bull” or “should”
The most important thing to note is that each letter almost always represents one single vowel sound. In English, the “i” in “sit” and the “i” in “site” represent quite different vowels. This type of wild variation never occurs in Japanese.
Each syllable has the same emphasis
This rule is the simplest but by far the most important. Of course, even in Japanese, important syllables are pronounced more strongly and can be slightly longer than unimportant ones. But the distinction between stressed and unstressed syllables is much smaller than in English, so much so that it’s better to think of Japanese syllables as having equal length and strength.
For example, the word hakama (the black, pleated pants-skirt) is often pronounced ha-KA-ma in English, but technically it should flow more smoothly: ha-ka-ma.
There are no diphthongs
A diphthong is a slide from one vowel to another as in the English word “rain”. The “a-i” combination is considered a single vowel, so “rain” has only one syllable although it has two vowel letters.
The Japanese language doesn’t have any diphthongs. Two consecutive vowel letters simply indicate two separate vowels and hence two separate syllables. For example, the word “Inoue” (a common family name) is pronounced as four syllables: i-no-u-e. Consequently, long sequences of vowels aren’t uncommon in Japanese words, such as “Aioi” (a-i-o-i, a placename) and “aoi ie” (a-o-i-i-e, a blue house).
All Japanese syllables are open
An “open” syllable is one which ends with a vowel, whereas a syllable ending with a consonant is a “closed” syllable. For example, “get”, “an”, “cast”, and “sports” are closed, and “sky”, “knee”, “we”, and “a” are open.
The Japanese language has only open syllables. Since all syllables are open, you can unambiguously divide Japanese words written in the Roman alphabet into syllables in most cases. For example, “yokozuna” should be yo-ko-zu-na; other divisions would contain one or more closed syllables: yok-o-zu-na, yo-koz-un-a, etc.
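The division rule above can be sketched in a few lines of Python. This is only an illustration, not a full romanization parser: the function name split_open_syllables is made up for this example, and it assumes the word consists entirely of open syllables, so it does not handle any exceptions to the rule.

```python
VOWELS = set("aeiou")

def split_open_syllables(word):
    """Split a romanized word into open syllables.

    Each syllable is taken to be zero or more consonant letters
    followed by exactly one vowel letter.
    """
    syllables = []
    current = ""
    for ch in word.lower():
        current += ch
        if ch in VOWELS:
            # A vowel always closes the current syllable.
            syllables.append(current)
            current = ""
    if current:
        # Trailing consonants would form a closed syllable.
        raise ValueError(f"leftover consonants {current!r}: not an open syllable")
    return syllables

print("-".join(split_open_syllables("yokozuna")))  # yo-ko-zu-na
print("-".join(split_open_syllables("Inoue")))     # i-no-u-e
```

Because every vowel letter ends a syllable, consecutive vowels such as those in “Aioi” fall out as separate syllables without any special handling.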
The special syllable “N”
One of the two exceptions to the preceding rule is the N-syllable. In Japanese, the sound of “n” in certain circumstances forms a syllable by itself. For example, “kanji” (a Chinese character) has three syllables: ka-N-ji.
The word “kanji” is unambiguously divided into syllables because of the open syllable rule (“kan” cannot be a single syllable because it is closed). On the other hand, there are words which cannot be uniquely divided into syllables from their Roman transliterations alone. For example, “Inoue” could be i-N-o-u-e instead of i-no-u-e. This is one of the shortcomings of Roman transliteration. If the word is written in Japanese characters, there is no such ambiguity.
There are two methods to avoid this ambiguity. One is to use an apostrophe as in “Shin’ichi” (a common given name), which is unambiguously divided into syllables as shi-N-i-chi: the apostrophe is there to prevent the division shi-ni-chi. The other method is to use a hyphen: “Shin-ichi”. Unfortunately, these conventions aren’t always followed.
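As a rough sketch of how the apostrophe and hyphen conventions remove the ambiguity, the Python function below splits a romanized word into syllables and writes the N-syllable as “N”. The name split_syllables and the tie-breaking choice are assumptions made for this example: when an unmarked “n” is followed by a vowel or “y”, the code simply attaches it to the next syllable, which is a guess and not something the spelling guarantees.

```python
VOWELS = set("aeiou")
# Straight apostrophe, curly apostrophe, and hyphen all mark a syllable break.
SEPARATORS = {"'", "\u2019", "-"}

def split_syllables(word):
    """Split a romanized word into syllables, writing the N-syllable as 'N'."""
    chars = word.lower()
    syllables = []
    current = ""
    for i, ch in enumerate(chars):
        nxt = chars[i + 1] if i + 1 < len(chars) else ""
        if ch in SEPARATORS:
            continue  # the separator itself is not pronounced
        if ch == "n" and not current:
            # "n" stands alone at the end of the word, before a separator,
            # or before a consonant other than "y".
            stands_alone = (
                nxt == ""
                or nxt in SEPARATORS
                or (nxt not in VOWELS and nxt != "y")
            )
            if stands_alone:
                syllables.append("N")
                continue
        current += ch
        if ch in VOWELS:
            syllables.append(current)
            current = ""
    if current:
        syllables.append(current)  # keep any unexpected leftover letters
    return syllables

print("-".join(split_syllables("kanji")))      # ka-N-ji
print("-".join(split_syllables("Shin'ichi")))  # shi-N-i-chi
print("-".join(split_syllables("Shinichi")))   # shi-ni-chi
```

The last two lines show the point of the convention: without the apostrophe, the same letters come out as shi-ni-chi.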
A long vowel comprises two syllables
This rule is quite simple. Each of the five Japanese vowels has a long and a short version. The long version comprises two syllables, and so, in accordance with the equal emphasis rule, the long version is precisely twice as long as the short one.
The long vowel can be viewed as a sequence of two vowels of the same kind. For example, “Iida” (a common family name) consists of three syllables: i-i-da. The “ii” of “Iida” can be viewed as the long version of “i” or as two consecutive short i’s.
SPELLING LONG VOWELS
Now, how do you spell long vowels? The biggest shortcoming of Roman transliteration of Japanese is that there’s no universally accepted method of indicating long vowels (except for “ii”). The Ministry of Education of Japan once endorsed the use of an overbar (a short horizontal bar placed over the vowel letter), but its use is far from consistent.
For the vowels [a] and [o], there are three common methods:
One of the most frequently used is simply to give up indicating the length! In “Tokyo”, both o’s are in fact long, so the word consists of four syllables: to-o-kyo-o. “Kyoto”, on the other hand, is kyo-o-to.
Another method is to add an “h” after “a” and “o” as in “rahmen” for ra-a-me-N (a Japanized Chinese noodle) and “Endoh” for e-N-do-o (a common family name). This scheme works in many cases, but can result in ambiguity as in “Ohita” (a place name): Is it o-o-i-ta (with a long “o”) or o-hi-ta (with a short “o” plus a syllable “hi”)?
Another way uses two vowel letters, as in “raamen” and “Endoo”. “Raamen” is OK, but English speakers would mispronounce “Endoo”, since they tend to read “oo” as in “fool”.
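The ambiguity of the “h” method can be made concrete with a small Python sketch. The function name h_readings is hypothetical, and the rule it applies is an assumption for illustration: an “h” after “a” or “o” must be a length marker when a consonant (other than “y”) or the end of the word follows, but it could be either a length marker or a real “h” when a vowel or “y” follows.

```python
from itertools import product

VOWELS = set("aeiou")

def h_readings(word):
    """List the possible readings of a word spelled with the 'h' method."""
    word = word.lower()
    options = []  # one list of alternatives per letter
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == "h" and prev in ("a", "o"):
            if nxt in VOWELS or nxt == "y":
                # Ambiguous: long vowel, or a syllable starting with "h".
                options.append([prev, "h"])
            else:
                # Only a length marker fits: repeat the vowel.
                options.append([prev])
        else:
            options.append([ch])
    return ["".join(choice) for choice in product(*options)]

print(h_readings("rahmen"))  # ['raamen']
print(h_readings("Endoh"))   # ['endoo']
print(h_readings("Ohita"))   # ['ooita', 'ohita']
```

Only “Ohita” produces two candidates, which is exactly the case the spelling alone cannot settle.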
For the vowel [e], the long version is most often written as “ei”. For example, “sensei” is pronounced se-N-se-e with a long “e”, not se-N-se-i. But this exception is a minor one, because even if you really say se-N-se-i instead of se-N-se-e, you’ll sound OK. Your Japanese interlocutor may not even notice that your pronunciation isn’t quite right.
For the long [u], we usually give up indicating that it’s long and simply write “u”. So, “Kyushu” (a place name) is kyu-u-shu-u.
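When a word is written with the overbar, the long vowels can be spelled out mechanically. The Python sketch below assumes the overbar is encoded as the Unicode macron (as in “ō” or “ū”) and rewrites each marked vowel as two vowel letters; the function name expand_macrons is made up for this example.

```python
import unicodedata

MACRON = "\u0304"  # Unicode combining macron

def expand_macrons(word):
    """Rewrite each vowel carrying a macron as a doubled vowel letter."""
    # NFD separates a letter such as "ō" into "o" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", word)
    out = []
    for ch in decomposed:
        if ch == MACRON and out:
            # Replace the macron with a lowercase copy of the preceding letter.
            out.append(out[-1].lower())
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))

print(expand_macrons("Tōkyō"))   # Tookyoo
print(expand_macrons("Kyūshū"))  # Kyuushuu
```

The doubled spelling makes the syllable count visible: “Tookyoo” divides as to-o-kyo-o and “Kyuushuu” as kyu-u-shu-u.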
Consonants
The good news is most Japanese consonants are pronounced the same as English ones. The bad news is that spellings are sometimes somewhat confusing (consonants can switch depending on how a word is used, such as “harai” vs “barai”).
The letter “y” following another consonant letter
English doesn’t use the letter “y” in the way Japanese does, although English does have the same phenomenon. Say “you” aloud and then “oo” (as in “fool”). What’s the difference? Obviously, it’s the presence or absence of the initial y-sound. Next, say “few” and “foo” aloud. We hear the same difference: “fyoo” and “foo”.
Japanese has similar pairs. For example, “Tokyo” is pronounced “to-o-kyo-o” with four syllables and “toko” (to submit a manuscript) is pronounced “to-o-ko-o” also with four syllables.
This kind of y-sound following another consonant can be difficult for English speakers. For example, there’s no such combination as “ry” in English. To practice this combination, first say “yo” (as in “yoke”) and “o” (as in “oh”) alternately. And then, say “ryo” and “ro” alternately.