Open access

Development of an Articulation Simulation System Using a Vocal Tract Model

Written By

Yuka Sumita, Mariko Hattori, Ken Inohara, Rie Sakurai and Hisashi Taniguchi

Published: 15 May 2013

DOI: 10.5772/55906

From the Edited Volume

Selected Topics on Computed Tomography

Edited by Dongqing Wang


1. Introduction

During prosthetic treatment, dentists need to consider various factors, such as esthetics, speech, mastication, and swallowing. Among these, the effect of prosthetic treatment on speech is an important concern for patients; thus, dentists must take care to avoid causing speech impairments during prosthodontic treatment, whether with removable prostheses (partial or complete dentures) or fixed prostheses (bridges and crowns). Since the teeth, alveolar bone, and maxillary bone are among the main speech articulators, speech can be impaired by dental disorders such as missing teeth or the presence of insufficient prostheses.

Speech, masticatory, and swallowing disorders may occur in patients after the surgical resection of tumors in the head-and-neck region. In particular, when a maxillary defect remains because of the extent of tumor resection, the nasal cavity constantly communicates with the vocal tract (composed of the larynx, pharynx, and oral cavity), and the maxillectomy defect changes the vocal tract shape, inducing marked speech disorder. Concretely, communication between the nasal and oral cavities cannot be blocked, so breath passes through the nose and causes hypernasality, resulting in unclear phonation. Acoustically, the relationship between the first and second formants of vowels is altered [1].

Figure 1.

Dento-maxillary prosthesis for maxillectomy. The dento-maxillary prosthesis incorporates an obturator that separates the nasal and oral cavities. It prevents air leakage and helps to improve the functional impairments.

However, the details have not yet been clarified. The production of vowels, an important element of voice, depends on the vocal tract shape [2]. According to the source-filter theory [3], the cause of a speech disorder can be investigated through the transfer characteristics of the vocal tract, which acts as a resonant tube, by measuring changes in its shape.
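The idea that the vocal tract acts as a resonant tube can be illustrated with the textbook uniform-tube approximation, in which formants fall at odd quarter-wavelength resonances of the tube. The sketch below is purely illustrative and is not part of the study's method; the tract length and speed of sound are standard textbook assumptions, not measured values.

```python
# Illustrative sketch of the source-filter idea: for a uniform tube closed at
# the glottis and open at the lips, resonances (formants) fall at odd
# quarter-wavelength frequencies, so changing the tube (vocal tract) shape or
# length shifts the formants.

SPEED_OF_SOUND = 35000.0  # cm/s in warm, moist air (approximate)

def uniform_tube_formants(length_cm: float, n_formants: int = 3) -> list:
    """Resonance frequencies f_n = (2n - 1) * c / (4 * L) of a uniform tube."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * length_cm)
            for n in range(1, n_formants + 1)]

if __name__ == "__main__":
    # A 17.5 cm tract (typical adult male) gives roughly 500, 1500, 2500 Hz.
    print(uniform_tube_formants(17.5))
```

Real vowels deviate from these values because the tract is not uniform; that deviation is exactly what the transfer characteristics of a shaped vocal tract capture.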

Maxillofacial prosthetic treatment, such as the use of obturators, is useful for rehabilitating the speech of maxillectomy patients, as prostheses can reconstruct maxillofacial defects, including missing maxillary and alveolar bone and teeth. However, maxillofacial prosthetic treatment often takes a long time, as the prosthesis has to be adjusted until the optimal form has been achieved. This is particularly difficult because maxillofacial defects often have complicated structures, and patients sometimes suffer pain in the defect area and/or trismus. Thus, a treatment protocol for maxillectomy patients that is easy to apply and does not place an unnecessary burden on patients or doctors is required. Surgeons sometimes take CT and MRI images of their patients to check for the presence of recurrent lesions, and we believe that these images could be useful for establishing an appropriate prosthetic treatment strategy for maxillectomy patients.

For example, if CT and MRI images could be used to produce a speech simulation system, it would be easier to design prostheses with optimal forms and to reduce the time spent adjusting them.

The goal of our study is to construct a vocal tract model from digital data and then use it to establish a speech simulation system for maxillofacial prosthetics. As a preliminary survey, in this study we used an MRI image acquired while the subject phonated the sound /a/. The image was produced by the Advanced Telecommunications Research Institute International (ATR) Innovative Technology for Human Communication and published by the National Institute of Information and Communications Technology (NICT). Using this image, we prepared a phonating vocal tract model and confirmed its acoustic features with acoustic analysis.


2. Materials and methods

2.1. Preparation of the vocal tract model

The “MRI database of Japanese vowel production” was used according to a licensing agreement with ATR-Promotions. The database contains MR images (in the DICOM format) that were acquired while a subject uttered five vowels. We subjected the MRI image acquired while the subject uttered the sound /a/ to binarization in order to add tissue to our model, using the software Mimics 11.11 (Materialise) according to the method reported by Inohara et al. [4,5]. Dentition image data from the “MRI database of Japanese vowel production” were subsequently added to the image, and the vocal tract was extracted. The data were converted to the STL format and used to produce a solid mold with a Z406 3D printer system (Z Corp.).
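In principle, the binarization step amounts to voxel thresholding: every voxel at or above a chosen intensity is labeled as tissue. The sketch below is only a minimal illustration on a toy array; the actual thresholds and tooling are those of Mimics 11.11 following Inohara et al. [4,5], and the function name and toy values here are hypothetical.

```python
import numpy as np

def binarize_slices(volume: np.ndarray, threshold: float) -> np.ndarray:
    """Label voxels at/above the intensity threshold as tissue (1), else 0."""
    return (volume >= threshold).astype(np.uint8)

# Toy 2x2x2 "scan": intensities at or above 100 are treated as tissue.
toy = np.array([[[50, 150], [120, 30]],
                [[200, 90], [110, 10]]])
mask = binarize_slices(toy, 100)
```

In the real pipeline, the resulting binary mask is the surface that gets converted to STL for printing; choosing the threshold consistently is exactly the standardization problem addressed in [4].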

First, the region to be molded was selected and simulated. Since teeth are not visualized by MRI, the binarization method initially yielded a toothless model, as shown in the right panel of Fig. 2, and the dentition was subsequently added. Of the images published by ATR-Promotions, an image acquired while the subject held blueberry jam in their mouth was adopted, since blueberry jam exhibits a contrast medium-like effect and the teeth were visualized as transparent regions. Slices corresponding to the teeth were individually extracted, and image data for the dentition alone were prepared (Fig. 3).

Figure 2.

Binarization was applied to the image acquired while phonating /a/ using Mimics 11.11, following the method reported by Inohara et al.

Figure 3.

Tooth-extracted mold. Of the images published by ATR-Promotions, the image acquired while the subject held blueberry jam in the mouth was adopted. Slices corresponding to the teeth were individually extracted, and image data for the dentition were obtained.

The prepared dentition image data were arranged at anatomically appropriate sites on the original image. Fig.4 shows the maxillary region after the dentition data had been added. The mandibular region also had dentition data added to it.

Figure 4.

Maxillary region combined with dentition data. The prepared dentition image data were arranged at anatomically appropriate sites on the original data to combine the images. Arrows: combined dentition data.

These images were converted to the STL format and used to produce solid molds with the Z406 3D printer system at the Tsukuba Center of the National Institute of Advanced Industrial Science and Technology.

Figure 5.

A three-dimensional model was fabricated using the ZPrinter. The model was separated into four parts: the head and nasal cavity, the maxilla, the mandible, and the pharynx.

2.2. Acoustic analysis

The completed vocal tract plaster models were combined, and an artificial larynx (Yourtone, Densei Inc.) was attached to the region corresponding to the vocal folds. In this preliminary study, the maxillary region was built without defects to represent the preoperative condition. Phonations from the model were recorded in a soundproof chamber. A microphone (Shure SM58) was placed 10 cm from the lips, and the sounds were recorded using a CSL4400 Computerized Speech Lab (Kay Pentax) and analyzed using the WaveSurfer software (KTH, version 1.8.5) and the fast Fourier transform.
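Reading spectral peaks from an FFT magnitude spectrum can be sketched as follows. This is not the CSL/WaveSurfer pipeline used in the study; it is a toy example that embeds two sinusoids at the formant frequencies later reported in the Results (613 and 1,109 Hz) in a synthetic signal and recovers them as the two strongest spectral bins.

```python
import numpy as np

FS = 16000              # sampling rate (Hz); one second of signal gives 1 Hz FFT bins
t = np.arange(FS) / FS

# Synthetic stand-in for the recorded model output: two sinusoids placed at
# the formant frequencies reported in the Results section.
signal = np.sin(2 * np.pi * 613 * t) + 0.8 * np.sin(2 * np.pi * 1109 * t)

spectrum = np.abs(np.fft.rfft(signal))
# With 1 Hz resolution, bin index equals frequency in Hz, so the two largest
# bins recover the embedded peaks.
peaks = sorted(int(i) for i in np.argsort(spectrum)[-2:])
```

Real formant measurement is harder because formants are broad resonances of a noisy, glottal-excited signal rather than pure tones, which is why dedicated tools such as WaveSurfer are used in practice.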

Figure 6.

Acoustic analysis. An artificial larynx (Yourtone, Densei Inc.) was attached to the region corresponding to the vocal folds. A microphone was placed 10 cm from the lips.


3. Results

In the model's output, the first (F1) and second (F2) formants were observed at 613 and 1,109 Hz, respectively, and no antiformant was present. The formants produced by the model deviated from those in the original recording by 8.6% for F1 and 6.0% for F2.
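The percent deviations quoted above come from comparing each model formant with its counterpart in the original recording. The helper and the reference value below are hypothetical, included purely to show the arithmetic; the study itself does not report the reference formant values.

```python
def percent_deviation(model_hz: float, reference_hz: float) -> float:
    """Absolute deviation of a model formant from a reference value, in percent."""
    return abs(model_hz - reference_hz) / reference_hz * 100.0

# Model F1 of 613 Hz against a purely hypothetical reference of 660 Hz:
dev = percent_deviation(613.0, 660.0)  # about 7.1
```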

Figure 7.

Acoustic data obtained from the 3D model.


4. Discussion

We prepared a three-dimensional model based on MRI data obtained from the “MRI database of Japanese vowel production” published by ATR-Promotions. We used binarization to add tissue data to an MRI image acquired during the phonation of the sound /a/ using the software Mimics 11.11 (Materialise), according to the method reported by Inohara et al. After adding dentition data to the image, we used the resultant image to produce a solid mold that simulated a vocal tract in constant communication with the nasal cavity. For this preliminary survey, we produced a model without maxillary defects, simulating the preoperative condition, and attached an artificial larynx to the region corresponding to the vocal folds.

However, we adopted an artificial larynx as the sound source, and its acoustic characteristics obviously influenced those of the model. Thus, it is necessary to investigate the acoustic characteristics of the model more accurately by improving the measurement conditions, for example by utilizing white noise as the source.

Moreover, the formants produced by the model deviated from those of the original recording, though by less than 10%; such deviations are relatively small according to a study by Takemoto et al. [6]. Our use of a rigid model and the loss of sound through the boundaries between the constituent parts of the model were considered to have caused these differences.

The model was divided into four separately molded parts in order to allow us to produce defects simulating various surgical conditions in further studies, such as the presence of a maxillary defect; however, it might be necessary to improve the position of the joins in the model and the way they are held together.

In future work, we intend to fabricate models with maxillary defects simulating maxillectomy and to confirm their acoustic characteristics in order to establish a simulation system for maxillectomy patients. By making larger defects and/or changing the defect site, our model could be used to investigate the relationship between changes in the vocal tract and its transfer characteristics. It might also be possible to identify defect sites that cause antiformants to be produced.

Such investigations might allow potential speech disorders to be predicted before surgery, the modification of resection and/or reconstruction methods to avoid such problems, and the rapid production of optimally shaped prosthetics for filling surgical defects.


5. Conclusion

From an MRI image, it is possible to fabricate a vocal tract model whose acoustic characteristics can be confirmed by acoustic analysis.

All study-related procedures and tests were approved by the Ethics Committee of Tokyo Medical and Dental University (Approval No. 166).

The MRI images were obtained from the “MRI database of Japanese vowel production”, which was constructed during research commissioned by the National Institute of Information and Communications Technology (‘Research and development of human information communication’) and performed and published by the ATR Innovative Technology for Human Communication. The data were used and published based on a licensing agreement with ATR-Promotions.

The solid model was prepared with technical cooperation from Juri Yamashita and Kazumi Fukawa at the Tsukuba Center of the National Institute of Advanced Industrial Science and Technology.

This study was partially supported by a Grant-in-Aid for Young Scientists (B) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

This study was also supported by research grants from Support for Women Researchers from Tokyo Medical and Dental University.

References

  1. Chiba T, Kajiyama M. The Vowel, Its Nature and Structure. Tokyo-Kaiseikan; 1942.
  2. Sumita YI, Ozawa S, Mukohyama H, Ueno T, Ohyama T, Taniguchi H. Digital acoustic analysis of five vowels in maxillectomy patients. J Oral Rehabil. 2002.
  3. Stevens KN. Acoustic Phonetics. MIT Press; 1998.
  4. Inohara K, Sumita YI, Ohbayashi N, Ino S, Kurabayashi T, Ifukube T, Taniguchi H. Standardization of thresholding for binary conversion of vocal tract modeling in computed tomography. J Voice. 2010.
  5. Inohara K, Sumita YI, Ino S. Extraction of Airway in Computed Tomography. In: Saba L, editor. Computed Tomography – Clinical Applications. InTech; 2012. ISBN 978-953-307-378-1. Available from: http://www.intechopen.com/books/computed-tomography-clinical-applications/extraction-of-airway-in-computed-tomography
  6. Takemoto H, Mokhtari P, Kitamura T. Acoustic analysis of the vocal tract during vowel production by finite-difference time-domain method. Journal of the Acoustical Society of America. 2010;128(6):3724–3738.
