
International Journal on Smart Sensing and Intelligent Systems

Professor Subhas Chandra Mukhopadhyay

Exeley Inc. (New York)

Subject: Computational Science & Engineering, Engineering, Electrical & Electronic


eISSN: 1178-5608



VOLUME 8, ISSUE 3 (September 2015)


Atef Zaguia * / Chakib Tadj * / Amar Ramdane-Cherif *

Keywords: Multimodality, Ontology, Bayesian Network, Pattern, User Interface, Multimodal Fission.

Citation Information : International Journal on Smart Sensing and Intelligent Systems. Volume 8, Issue 3, Pages 1,667-1,686, DOI:

License : (CC BY-NC-ND 4.0)

Received: 03-May-2015 / Accepted: 21-July-2015 / Published Online: 01-September-2015



Today, technology allows us to build extensive multimodal systems that remain fully under human control. These systems are equipped with multimodal interfaces, which enable more natural and more efficient interaction between humans and machines. End users can take advantage of natural modalities (e.g., audio, eye gaze, speech, and gestures) to communicate or exchange information with applications. In this work, we assume that several such modalities are available to the user. We present a prototype of a multimodal architecture and show how modality selection and fission algorithms are implemented in such a system. We use a pattern technique to divide a complex command into elementary subtasks and to select suitable modalities for each of them, and we integrate a context-based method using a Bayesian network to resolve ambiguous or uncertain situations.
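The abstract's two ideas — pattern-based fission of a complex command into elementary subtasks, and Bayesian resolution of ambiguous contexts when selecting a modality — can be illustrated with a minimal sketch. All names, patterns, and probability values below are hypothetical placeholders, not the paper's actual model; the Bayesian network is reduced here to a single prior-times-likelihood update per subtask.

```python
# Hypothetical pattern table: each complex command maps to an ordered
# list of elementary subtasks (the "fission" step).
PATTERNS = {
    "navigate": ["confirm_destination", "show_route", "announce_turns"],
}

# Prior suitability of each candidate output modality per subtask.
PRIORS = {
    "confirm_destination": {"speech": 0.6, "display": 0.4},
    "show_route":          {"speech": 0.2, "display": 0.8},
    "announce_turns":      {"speech": 0.7, "display": 0.3},
}

# Likelihood of the observed context given each modality is appropriate,
# e.g. in a noisy environment speech output is less likely to succeed.
LIKELIHOOD = {
    "noisy": {"speech": 0.2, "display": 0.9},
    "quiet": {"speech": 0.9, "display": 0.7},
}

def select_modality(subtask: str, context: str) -> str:
    """Posterior ∝ prior × likelihood; return the most probable modality."""
    scores = {m: PRIORS[subtask][m] * LIKELIHOOD[context][m]
              for m in PRIORS[subtask]}
    total = sum(scores.values())
    posterior = {m: s / total for m, s in scores.items()}
    return max(posterior, key=posterior.get)

def fission(command: str, context: str) -> list[tuple[str, str]]:
    """Divide a complex command into subtasks and assign each a modality."""
    return [(task, select_modality(task, context)) for task in PATTERNS[command]]

print(fission("navigate", "noisy"))
```

In a noisy context the posterior shifts every subtask toward the display, while in a quiet context speech-leaning subtasks keep their spoken rendering — the same context-sensitivity the architecture achieves with its Bayesian network.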



