Research
Welsh has some sounds and pronunciation patterns that are uncommon in English and some other languages, and these can cause difficulties for those learning the language. In this project, we provide insights into the pronunciation of the Welsh language by employing state-of-the-art magnetic resonance imaging (MRI) at Cardiff University’s world-leading Brain Imaging Centre (CUBRIC).
More concretely, we use MRI to generate videos of the anatomical vocal tract movements of Welsh speakers of different dialects while they read a specially designed script. The images are combined with recordings of the participants’ voices to produce videos that show, as never before, how characteristic Welsh sounds are created.
Challenges of learning Welsh
Learners acquiring Welsh as a second language can encounter several difficulties with its pronunciation. In some cases, these can affect learners’ ability to communicate effectively outside the classroom, or even to integrate successfully into Welsh-speaking circles or communities.
It is possible to group potentially difficult Welsh sounds as follows:
- Phonemes (or sounds) that are not common in most varieties of English, e.g. “ch” [χ] and “ll” [ɬ]
- Simple vowels which are commonly pronounced as diphthongs in English, e.g. “ô” (as opposed to the vowel in the English ‘no’)
- Diphthongs that do not appear in English, e.g. “yw”
- Phonological rules which differ from those of most English varieties, e.g. the consistent use of “r” after vowels in Welsh
- Difficulties associated with some ambiguities in the Welsh spelling system, e.g. the inconsistent use of /g/ in words containing written “ng”
How this project might help learning
The use of MRI for visualising the movement of the tongue and other parts of the vocal tract during language production has been extensively demonstrated for other languages; however, this project will be the first of its kind for the Welsh language.
The resources created as part of this project consist of high-quality animations based on MRI data and should give learners of Welsh a clearer and more schematic indication of how to pronounce Welsh sounds effectively. We will therefore explore how the new videos can help learners who are looking for new ways to master intricate aspects of Welsh pronunciation, and how they can help Welsh-language tutors develop novel ways of teaching phonetic concepts.
Acquiring the data
The challenges of real-time MRI
MRI provides excellent images of the head, the tongue, and the larynx. However, MRI is quite slow, particularly when aiming to achieve a high-resolution image with good contrast. For this study, images must be acquired at a speed that allows us to capture the relevant changes in the position of the mouth, tongue, etc. So, the first challenge is the trade-off between good spatial and good temporal resolution. In the examples shown here, we were able to collect approximately 6 frames per second.
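To make the temporal side of this trade-off concrete, the short Python sketch below converts a frame rate into the time between frames and counts how many whole frames fit within a sound of a given length. Only the figure of roughly 6 frames per second comes from the text above; the function names and any sound durations passed in are hypothetical values chosen for illustration.

```python
def frame_interval_ms(frames_per_second):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / frames_per_second


def frames_spanning(duration_ms, frames_per_second):
    """Number of whole frame intervals that fit within a sound of the given length."""
    return int(duration_ms * frames_per_second // 1000)


# Frame rate reported in the text (approximate).
FPS = 6

# Hypothetical sound durations in milliseconds, for illustration only.
for duration_ms in (100, 500, 1000):
    n = frames_spanning(duration_ms, FPS)
    print(f"{duration_ms} ms sound: {n} whole frame(s) "
          f"at one frame every {frame_interval_ms(FPS):.0f} ms")
```

At about 6 frames per second, consecutive frames are roughly 167 ms apart, so a faster acquisition shortens this interval at the cost of spatial resolution or contrast.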
In addition, MRI is very sensitive to motion, which means that moving parts of the head (for example, the tongue and lips during speech) would typically result in blurry images. A second challenge is thus choosing the right MRI technique to ensure our images are free of artefacts. This is achieved by choosing data-collection approaches that are fast enough to freeze motion while still preserving image quality.
The challenges of recording sound in the MRI scanner
The MRI scanner uses a very strong magnet (more than 100,000 times the strength of the Earth's magnetic field) to produce the images. This means that objects and wires that contain metal cannot be taken into the scanner room, and this includes standard microphones. Therefore, we used a special microphone (OptoAcoustics FOMRI) which uses fibre optics instead of regular wires. Another challenge with recording speech during an MRI scan is that the machine produces very loud noises, which tend to mask the sound of the human voice. This issue was tackled by recording approximately 10 seconds of pure noise for each participant before proceeding with the speech recordings. These measurements were then used to characterise and remove the noise from the audio signals, producing clean audio files.
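The description above does not say which noise-removal algorithm was used. As one illustration of how a noise-only recording can serve to characterise and remove scanner noise, the following is a minimal sketch of spectral subtraction in Python with NumPy. The function names, frame length, overlap and spectral floor are all assumptions made for this example and are not taken from the project.

```python
import numpy as np

FRAME = 512        # samples per analysis frame (assumed value)
HOP = FRAME // 2   # 50% overlap between frames (assumed value)
# Square-root periodic Hann window. Applied once at analysis and once at
# synthesis, the overlapping windows sum to one, so no normalisation is needed.
WINDOW = np.sqrt(np.hanning(FRAME + 1)[:-1])


def stft(x):
    """Short-time Fourier transform; returns an array of shape (frames, bins)."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack(
        [x[i * HOP : i * HOP + FRAME] * WINDOW for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec):
    """Inverse STFT by windowed overlap-add."""
    frames = np.fft.irfft(spec, n=FRAME, axis=1) * WINDOW
    out = np.zeros(HOP * (len(frames) - 1) + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + FRAME] += frame
    return out


def spectral_subtract(noisy, noise_only, floor=0.05):
    """Subtract the average noise magnitude spectrum from a noisy recording.

    noisy      : 1-D array, speech recorded during scanning
    noise_only : 1-D array, the noise-only recording made beforehand
    floor      : fraction of the original magnitude kept as a lower bound,
                 which avoids negative magnitudes (assumed value)
    """
    # Characterise the noise: mean magnitude per frequency bin.
    noise_mag = np.abs(stft(noise_only)).mean(axis=0)

    spec = stft(noisy)
    mag = np.abs(spec)
    # Remove the noise estimate from every frame, keeping a small floor.
    cleaned = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase and transform back to a waveform.
    return istft(cleaned * np.exp(1j * np.angle(spec)))
```

Here `noise_only` would correspond to the roughly 10 seconds of pure scanner noise and `noisy` to the speech recording, both as one-dimensional arrays at the same sampling rate and each at least `FRAME` samples long. Simple subtraction of this kind assumes the noise is roughly stationary and can leave audible residue, so it should be read as a conceptual sketch, not as the processing chain used for the recordings.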