The majority of words in Japanese are made by combining consonants and vowels. What has not yet been discussed is how the consonants of Japanese sound and how they differ from those found in English.
Just as there are consonants in English that don't exist in Japanese, there are also consonants in Japanese that don't exist in English. Furthermore, the consonants that Japanese shares with English need not be pronounced exactly the same. In fact, it's safe to assume that all consonants are slightly different in their own unique ways.
To fully grasp how consonants are pronounced, you will need to learn some technical terminology. However, just as in the previous lesson, each term will be accompanied by an explanation on its first use.
Transcription Note: Japanese words introduced in this lesson will be written in Romaji (English letters). Additionally, morae with high pitch will be put in bold, and subsequent pitch drops will be marked with ↓. To understand why these conventions are used, refer back to Lesson 1.
In Lesson 1, we briefly learned that a consonant is a speech sound that involves obstructing airflow from the lungs in some way. Japanese has fewer consonants than English, but this doesn't mean the ones it has are all easy to acquire and pronounce for English speakers. First, we need to learn what constitutes a consonant. How do we know something is a distinct consonant and not just a variation of the same thing? Because these distinctions aren't the same across languages, we'll need to understand the answers to these questions to know how to pronounce Japanese properly.
What is a Phoneme?
Two sounds are considered contrastive if interchanging the two can cause a change in meaning. If two sounds can contrast with each other, they are treated as separate phonemes of the language.
To demonstrate what this looks like in English, consider the consonants /b/ and /p/. These consonants are distinct phonemes of English. This can be proven by numerous minimal pairs in which the difference between /b/ and /p/ is the only factor that tells the words apart.
As demonstrated by the words above, a minimal pair is simply two (or more) words with different meanings that differ by only a single sound in the same position.
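The definition of a minimal pair can also be expressed as a small check. The sketch below is only an illustration: the function name `is_minimal_pair` and the example words are assumptions made for this example, and the code compares sound sequences only. It cannot confirm that the two words actually have different meanings, which is the other half of the definition.

```python
def is_minimal_pair(word_a, word_b):
    """Return True if two phoneme sequences differ in exactly one position.

    Each word is a list of phoneme symbols. This checks form only; whether
    the two words have different meanings must be judged separately.
    """
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# Hypothetical example words, loosely transcribed as lists of phonemes.
bat = ["b", "a", "t"]
pat = ["p", "a", "t"]
bad = ["b", "a", "d"]

print(is_minimal_pair(bat, pat))  # True: only /b/ vs. /p/ differs
print(is_minimal_pair(pat, bad))  # False: two sounds differ
```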
What is an Allophone?
In both English and Japanese, there are phonetically (acoustically) different sounds that are treated as the same phoneme. For instance, consider the pronunciation of the consonant /t/ in English. Have you ever noticed that the "t" in "top" and the "t" in "stop" are not pronounced the same? If you're a native speaker of English and pronounce "top" with a hand in front of your mouth, you'll feel a puff of air hit your hand. This is called aspiration. If you pronounce "stop" likewise, you shouldn't feel that puff of air hit your hand. This means that English has both an aspirated and non-aspirated version of /t/, and both are treated as the same sound. These different versions of a single phoneme are called allophones.
|Aspirated /t/||Non-Aspirated /t/|
This distinction between aspirated and non-aspirated /t/ does not change the meaning of a word in English. The two are in complementary distribution, meaning they are allophones that don't occur in the same location in a word. A speaker could theoretically substitute one for the other and still be understood. Because they cannot appear in the same location in a word, it is impossible for them to distinguish one word from another.
There are also instances in both English and Japanese where the same consonant could be pronounced differently in the same location of a word without changing the meaning. This is called free variation. Think of the word "data." Some people pronounce the first "a" like the "a" in "cat" whereas others pronounce it like the "a" in "date." Neither group is wrong. It's just that the pronunciation of the vowel is in free variation between those two pronunciations.
Phoneme vs. Allophone is Language Specific
What counts as a phoneme or as allophones of the same phoneme differs from language to language. It is also possible for a sound to be treated as an allophone of another in one environment but as a separate phoneme in other environments, all within the same language. Although this may seem trivial, these terms matter because languages do not all treat sounds the same way. If you are a native speaker of a language other than English, you will have different but equally troublesome difficulties distinguishing certain Japanese sounds that aren't differentiated in your own language(s).
Transcription Note: Just as in Lesson 1, phonemes will be placed in // whereas allophones will be placed in [].
The Phonemes of Japanese
At the phonemic level, Japanese can be said to have at least 16 distinct consonant phonemes depending on how one likes to divvy things up.
The chart above shows all the distinct phonemes of Japanese. However, it is important to understand now that these are not all the possible sounds of Japanese because some of them have allophones and some of them happen to be allophones of other phonemes listed in the chart.
For the rest of this lesson, we'll learn exactly how to pronounce each of these consonants based on features intrinsic to their pronunciation. Because Japanese pronunciation isn't complicated aside from having features that may not be shared by the languages you speak, there is no need to worry about memorizing tons of variant pronunciations.
Most consonants come in pairs. For instance, /k/ and /g/ are both made in the exact same place in the mouth. The only difference is that pronouncing /g/ causes the vocal folds in the throat to vibrate whereas pronouncing /k/ does not. Consonants that do not cause the vocal folds to vibrate are called unvoiced consonants, and consonants that do cause the vocal folds to vibrate are called voiced consonants. Voicing is used across many languages to distinguish consonant pairs like /k/ and /g/.
Examples of unvoiced consonants in English include /k/, /s/, /t/, /h/, and /p/. These consonants alone are seen in thousands of words. Although English has more unvoiced consonants, these are the basic ones it shares with Japanese.
In English, unvoiced consonants are typically pronounced with aspiration. In Japanese, unvoiced consonants tend to be slightly more aspirated than they are in languages like Spanish but not nearly as much as in English or Korean.
The Unvoiced Consonants of Japanese
The basic unvoiced consonants of Japanese, as mentioned above, are /k/, /s/, /t/, /h/, and /p/. Overall, these consonants are less aspirated than their English counterparts. Other differences exist, of course, which is why each consonant will be introduced individually.
Just as is the case in English, /k/ is made by placing the back of the tongue against the soft palate in the back of the mouth.
/t/ is made by placing the blade of the tongue behind the upper teeth. When the vowel /u/ follows, /t/ becomes [ts]. This is an example of an allophone, a variant pronunciation of a single phoneme within a particular language. This [ts] is the same as the consonant cluster heard at the end of English words like "its." Unlike in English, you must never drop the "t" in [ts] in Japanese. This means that "tsunami" is pronounced as [tsu.na.mi].
Additionally, /t/ becomes [ch] when followed by the vowel /i/. However, the [ch] in Japanese is not like the "ch" in "chair." The Japanese [ch] is produced by first stopping air flow and then placing the blade of the tongue right behind the gum line while the middle of the tongue touches the hard palate of the mouth.
The consonant /s/ is pronounced just like it is in English, but it becomes [sh] when followed by /i/. When pronouncing [sh], the middle of the tongue is bowed and raised towards the hard palate of the mouth. Note that the Japanese [sh] is not made as far back in the mouth as it is in English.
Pronouncing /h/ & /p/
/p/ is known as a plosive sound. It is made by releasing air upon opening one's lips. In Japanese, it isn't all that common because most words with /p/ come from other languages.
Both /p/ and /h/ are pronounced the same as in English, but /h/ has two allophones. When followed by /i/, /h/ sounds most like the "h" in "hue." When followed by /u/, it becomes [f]. The Japanese [f], though, is created by bringing the lips together and blowing air through them without using the teeth.
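The allophone rules introduced so far for /t/, /s/, and /h/ can be summarized as a lookup table. The following is a minimal sketch rather than a complete model of Japanese: the names `ALLOPHONE_RULES`, `surface_consonant`, and `realize` are made up for this illustration, and /h/ before /i/ is left written as "h" because this lesson does not give that allophone its own symbol.

```python
# Allophone rules stated in this lesson:
# (phoneme, following vowel) -> allophone
ALLOPHONE_RULES = {
    ("t", "u"): "ts",
    ("t", "i"): "ch",
    ("s", "i"): "sh",
    ("h", "u"): "f",
}


def surface_consonant(phoneme, vowel):
    """Return the allophone used before `vowel`, or the phoneme unchanged."""
    return ALLOPHONE_RULES.get((phoneme, vowel), phoneme)


def realize(morae):
    """Join (consonant, vowel) pairs into a dotted transcription."""
    return ".".join(surface_consonant(c, v) + v for c, v in morae)


print(realize([("t", "u"), ("n", "a"), ("m", "i")]))  # tsu.na.mi
```

Any consonant and vowel combination that is not listed in the table simply passes through unchanged, which mirrors the idea that an allophone only appears in its particular environment.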
More Example Words
When the vowels /i/ and /u/ are in between and/or after unvoiced consonants--/k/, /t/, /s/, /h/, and /p/ along with their respective allophones--they often become devoiced (whispered or nearly silent). Devoicing is a very distinctive feature of Standard Japanese pronunciation.
As an example, the phrase for "good morning" sounds like "o-ha-yo-o go-za-i-ma-s". However, it is important to note that many speakers, especially those who don't come from East Japan, do not devoice vowels.
Devoicing is never required in a word. In fact, even when a vowel could be devoiced, it doesn't mean it will. There's significant speaker variation. One thing that is certain, however, is that devoicing should not under normal circumstances occur between and/or after voiced consonants.
Practice: Pronounce the words below with the underlined vowels devoiced.
Kushami (Sneeze) Tafu (Tough) Hito(↓) (Person)
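The environment for devoicing can also be sketched as a small function. This is a simplified illustration under stated assumptions: "after" an unvoiced consonant is interpreted here as the end of a word, long vowels and other special morae are ignored, and the names `UNVOICED` and `devoiceable_positions` are invented for this example. Keep in mind that the function only finds vowels that could be devoiced. As noted above, whether a speaker actually devoices them varies.

```python
# Unvoiced consonants and their allophones, as listed in this lesson.
UNVOICED = {"k", "t", "s", "h", "p", "ts", "ch", "sh", "f"}
DEVOICEABLE_VOWELS = {"i", "u"}


def devoiceable_positions(morae):
    """Return the indexes of morae whose vowel may be devoiced.

    `morae` is a list of (consonant, vowel) pairs. A vowel qualifies if it
    is /i/ or /u/, follows an unvoiced consonant, and is either at the end
    of the word or followed by another unvoiced consonant.
    """
    positions = []
    for i, (consonant, vowel) in enumerate(morae):
        if vowel not in DEVOICEABLE_VOWELS or consonant not in UNVOICED:
            continue
        is_last = i == len(morae) - 1
        next_unvoiced = not is_last and morae[i + 1][0] in UNVOICED
        if is_last or next_unvoiced:
            positions.append(i)
    return positions


kushami = [("k", "u"), ("sh", "a"), ("m", "i")]
tafu = [("t", "a"), ("f", "u")]
hito = [("h", "i"), ("t", "o")]

print(devoiceable_positions(kushami))  # [0]: the "u" in "ku"
print(devoiceable_positions(tafu))     # [1]: the "u" in "fu"
print(devoiceable_positions(hito))     # [0]: the "i" in "hi"
```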
The unvoiced consonants and their allophones mentioned above all have a voiced counterpart. Each voiced consonant is pronounced the same as its unvoiced counterpart apart from the addition of voicing.
|Unvoiced Counterpart||Voiced Counterpart|
|/h/ (and allophones)||/b/|
There are a few peculiarities that need to be discussed. Note, however, that /j/ and /dj/ will be covered later in this lesson.
1. /z/ typically becomes [dz] at the start of words and tends to be [z] inside words, but this isn't always so. [z] sounds like the "z" in "zoo," whereas [dz] sounds like the "ds" in "kids." However, it is important to note that many speakers cannot tell the difference between the two sounds.
2. /h/, its allophones, and /p/ all correspond to /b/. /b/ is made by bringing the lips together and then releasing them. This means its articulation is the same as that of /p/ but not that of /h/.
3. /g/ can be pronounced as [ng] inside words. This pronunciation is particularly common in the north and east of Japan.
Try pronouncing the following example words.
More Voiced Consonants
There are also voiced consonants that do not have unvoiced counterparts. These sounds are listed in the chart below.
|[n]||Made with the blade of the tongue on the back of the upper teeth with /a/, /e/, and /o/, behind the ridge of the mouth with /i/ (like in news), and behind the teeth with /u/ (like in noon).|
|[m]||Pronounced by bringing the two lips together just as in English.|
|[r]||Its pronunciation varies drastically. It is typically pronounced as a flap, which in English is mostly heard in American English as the "t" in words such as "water." At the beginning of a word, it sounds almost like /d/. Sometimes it's pronounced as a trill or like /l/.|
|[y]||Pronounced the same as in English by bringing the tongue up to the hard palate. This means it is a palatal consonant.|
|[w]||Its pronunciation is very similar to the Japanese /u/. Rather than protruding your lips, you compress them. It is only used with the vowels /a/ and /o/, but its use with /o/ won't even become important until later on in your studies.|
The differences in pronunciation detailed above make Japanese sound significantly different from English. Many sounds tend to be closer to the teeth, which is the case for [n] and [r], and the movement of the tongue and other parts of the mouth is more limited in range. To practice pronouncing these consonants, try saying the following words out loud.
Palatal consonants are made by the body of the tongue touching against the hard palate of the mouth. In Japanese, these consonants are usually limited to the vowels /a/, /u/, and /o/, and they're all created with the help of the consonant /y/. First, we'll look at those palatal consonants shown below in the chart.
|Consonant||C + /a/||C + /u/||C + /o/|
Terminology Note: Palatal consonants are all semi-voiced due to the /y/ that follows the initial consonant. The voicing of the initial consonant doesn't change. Thus, /gy/ would be fully voiced whereas /ky/ would start out unvoiced but become voiced by the end of the consonant. Here, /y/ acts more like a semi-vowel than another consonant, which is why none of these palatal consonants are treated as consonant clusters. Instead, they can be viewed as additional phonemes of the language.
Usage Note: In loan-words, these consonants may be used with other vowels.
Most of these combinations are very common in Japanese. They are most frequently found in words that come from Chinese. Below are some examples.
Other Palatal Consonants
The remaining palatal sounds that have yet to be looked at are /sh/, /ch/ and /(d)j/.
As we learned earlier, [sh] and [ch] are allophones of /s/ and /t/ respectively. They can also be treated as separate phonemes. This is because all five vowels can follow them, allowing them to become contrastive.
The voiced counterpart for both /sh/ and /ch/ is /(d)j/. This phoneme /(d)j/ has two allophones: [dj] and [j]. The former sounds like the "j" in "judge," and the latter sounds like the "z" in "seizure." Many speakers pronounce this phoneme as [dj] whenever it appears at the start of a word or after another consonant but as [j] anywhere else. Others only use the [dj] pronunciation.
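The choice between [dj] and [j] described above depends only on where the phoneme appears and on the speaker. As a rough sketch, with the function name `realize_dj` and its parameters being assumptions made for this illustration, the pattern looks like this:

```python
def realize_dj(word_initial, after_consonant, always_dj=False):
    """Pick the allophone of /(d)j/ for a given position.

    `always_dj` models speakers who only ever use the [dj] pronunciation.
    """
    if always_dj or word_initial or after_consonant:
        return "dj"
    return "j"


print(realize_dj(word_initial=True, after_consonant=False))   # dj
print(realize_dj(word_initial=False, after_consonant=False))  # j
print(realize_dj(False, False, always_dj=True))               # dj
```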
Consonants may be lengthened in Japanese just like vowels. When you make a long consonant, the sound is perceived as sounding harder. The length of time you use to pronounce it increases from one mora to somewhere in between one and two morae. However, speakers conceptualize long consonants as being two morae.
The consonants that are typically doubled in Japanese are unvoiced consonants. These include /p/, /k/, /t/, /s/, /sh/, /ch/, and /ts/. In transcription, they will be written as /pp/, /kk/, /tt/, /ss/, /ssh/, /tch/, and /tts/ respectively.
|Shippai||Failure||Matchi||A match||Yokka||Four days||Zasshi||Magazine|
Usage Note: Voiced consonants are only doubled in a handful of loanwords from other languages, but even then they're usually pronounced as their long unvoiced counterparts.
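The transcription convention for long consonants is a fixed mapping, so it can be written out as a table. The sketch below only covers the unvoiced consonants listed in this section, and the names `LONG_FORMS` and `lengthen` are made up for this illustration.

```python
# Transcriptions of long consonants used in this lesson.
LONG_FORMS = {
    "p": "pp",
    "k": "kk",
    "t": "tt",
    "s": "ss",
    "sh": "ssh",
    "ch": "tch",
    "ts": "tts",
}


def lengthen(consonant):
    """Return the long (doubled) transcription of an unvoiced consonant."""
    if consonant not in LONG_FORMS:
        raise ValueError(f"no long form listed for {consonant!r}")
    return LONG_FORMS[consonant]


print("shi" + lengthen("p") + "ai")  # shippai
print("ma" + lengthen("ch") + "i")   # matchi
print("za" + lengthen("sh") + "i")   # zasshi
```

Notice that /sh/, /ch/, and /ts/ are not lengthened by repeating all of their letters. Only the first part of the spelling changes.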
There is a special voiced consonant in Japanese called the "moraic nasal." It counts as a mora on its own. Although usually transcribed as an "n," its pronunciation varies depending on the environment.
In its most basic form, it is what's called a uvular "n," which is best transcribed as /N/. The uvula is at the back of the mouth, but when you pronounce this sound, the mouth constricts as if you were producing a regular /n/, which makes it sound close to, but not quite like, the /n/ you're used to hearing.
This sound has a lot of allophones because it assimilates (becomes more similar) with the sound that follows. Because things can get quite complicated, we'll go over each situation separately with plenty of examples along the way. In Standard Japanese, this sound can't start words, but it is still quite complicated.
When /N/ is before /p/, /b/, or /m/, it becomes [m]. This means that /m/ can in fact be doubled with the aid of /N/.
|Sontoku||Loss and gain||Sentaku||Choice/laundry||Kantoo||The Kanto Region|
|Kingyo||Gold fish||Kango||Sino-Japanese word||Kangae||Idea|
Transcription Note: Typically, /dj/ is spelled as "j" since /j/ is largely pronounced as [dj].
When before vowels, /y/, /w/, /s/, /sh/, /z/, /h/, and /f/, /N/ sounds like a nasal vowel made in the back of the mouth. The vowel before /N/ is always nasalized, but when /N/ is followed by a vowel, all you may hear is a very nasal vowel and then the following vowel. In these environments, /N/ is usually just a very nasal ũ. Although this is usually spelled as "n" for simplicity, it'll be spelled as "ũ" below.
|Kaũzei||Tariff||Kiũyuu||Finance||Kaũsai||The Kansai Region|
1. When before /z/, some speakers pronounce /N/ as [n].
2. /Deũsha/ may also be pronounced as [deũsha].
At the end of words, /N/'s default pronunciation is [N]. However, there are plenty of speakers who pronounce it like a nasal vowel, as seen above, in this position. In singing, it will even be pronounced as [m]. This is actually true for any instance of /N/ in singing. For the purpose of this section, [N] will be written below as "N."
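Because the pronunciation of /N/ depends on the sound that follows, its behavior can be sketched as a function of the next sound. This is a rough sketch and not a complete model: it only encodes the environments spelled out in the prose of this section (before /p/, /b/, and /m/, before vowels and the listed consonants, and at the end of a word), it ignores speaker variation and singing, and it falls back to the plain "n" spelling everywhere else. The name `realize_moraic_nasal` and the set names are assumptions made for this illustration.

```python
VOWELS = {"a", "i", "u", "e", "o"}
BILABIALS = {"p", "b", "m"}
NASAL_VOWEL_TRIGGERS = VOWELS | {"y", "w", "s", "sh", "z", "h", "f"}


def realize_moraic_nasal(next_sound=None):
    """Return a transcription of /N/ based on the sound that follows it.

    Pass None when /N/ is at the end of a word.
    """
    if next_sound is None:
        return "N"  # default word-final pronunciation
    if next_sound in BILABIALS:
        return "m"
    if next_sound in NASAL_VOWEL_TRIGGERS:
        return "ũ"  # a very nasal vowel
    # Placeholder: other environments are not modeled in this sketch,
    # so the usual "n" spelling is returned.
    return "n"


print("ka" + realize_moraic_nasal("s") + "sai")  # kaũsai
print("ka" + realize_moraic_nasal("z") + "zei")  # kaũzei
print(realize_moraic_nasal("p"))                 # m
print(realize_moraic_nasal())                    # N
```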