**Our apologies if you receive multiple copies of this CFP**

MULTIMED2022: Multimedia and Multimodal Analytics in the Medical Domain and Pervasive Environments, a Special Session at the 28th International Conference on Multimedia Modeling (MMM 2022), Quy Nhon, Vietnam, April 5-8, 2022.


Important Dates:
- Full Paper Submissions: August 27, 2021
- Notifications: October 25, 2021
- Camera-ready version: November 15, 2021

This special session aims to present recent work and applications in multimedia analysis and digital health solutions for medical domains and pervasive environments.

More specifically, multimedia research is becoming increasingly important for the medical domain, where a growing number of videos and images are integrated into the daily routine of surgical and diagnostic work. This includes management and inspection of the data, visual analytics, learning relevant semantics, and using recognition results to optimize surgical and diagnostic processes. In the field of medical endoscopy in particular, more and more surgeons record and store videos of their endoscopic procedures, such as surgeries and examinations, in long-term video archives. The recorded endoscopic videos are later used (i) as a valuable source of information for follow-up procedures, (ii) to inform patients about the procedure, and (iii) to train young surgeons and teach new operation techniques. These videos are sometimes also used for manual inspection and assessment of surgeons' technical skills, with the ultimate goal of improving surgery quality over time. However, although some surgeons record the entire procedure as video (for example in the Netherlands, where this is required by law), many record only the most important video segments. One way to support surgeons in accessing endoscopic video archives in a content-based way, i.e. in searching for a specific frame in an endoscopic video, is to automatically segment the video, remove irrelevant content, extract diverse keyframes, and provide an interactive browsing tool, e.g. with hierarchical refinement.

At the same time, average lifespan is increasing, and the care of diseases related to lifestyle and age is becoming costlier and less accessible. Pervasive eHealth systems offer a promising solution for accessible and affordable self-management of health problems. To fulfil this vision, two important dimensions are the intelligent aggregation, fusion and interpretation of input from diverse IoT devices, and personalised feedback delivered to users via intuitive interfaces and modalities. Pervasive and mobile technologies are among the leading computing paradigms of the future. Transitioning from the world of personal computing, devices are distributed across the user's environment, enriching business processes with the ability to sense, collect, integrate and combine multimodal data and services. A key requirement in multimodal domains is the ability to integrate the different pieces of information so as to derive high-level interpretations. In this context, information is typically collected from multiple sources and complementary modalities, such as multimedia streams (e.g. via video analysis and speech recognition) and lifestyle and environmental sensors. Though each modality is informative on specific aspects of interest, the individual pieces of information are not by themselves capable of delineating complex situations; combined, they can plausibly describe the semantics of situations, facilitating intelligent situation awareness. However, while the integration of devices and services to deliver novel solutions in the so-called Internet of Things (IoT) has been partially addressed with open platforms, it still imposes further challenges, related not only to heterogeneity but also to diverse context-aware information exchange and processing capabilities.
On the one hand, knowledge-driven approaches, such as rule- and ontology-based approaches, capitalise on knowledge representation formalisms that let domain experts model activities explicitly, combining multimodal information using predefined patterns rather than learning them from data. On the other hand, data-driven approaches rely on probabilistic and statistical models to represent activities and learn patterns from multimodal datasets. Hybrid solutions have been shown to increase context understanding: data-driven pre-processing (e.g. the learning of activity models) can improve the performance of ontology-based activity recognition, and vice versa. Furthermore, beyond the challenges of sensing, reasoning, interpreting, learning, predicting and adapting, natural human-computer interaction via device agents, robots and avatars can deliver intuitive, personalised and context-aware spoken feedback. For example, wearable devices, smart home equipment and multimedia information can be enriched with face-to-face interactions, motivating people to actively participate in self-care activities and prescribed changes, as well as promoting the management of chronic conditions and supporting older adults' autonomy. Recently, human-computer interaction and conversational agents have also been applied in the migration domain, acting as personalised assistants that support migrants and refugees in accessing health facilities, such as public health services, and providing relevant information for emergency services.

Research topics of interest for this special session include, but are not limited to:
- Multimedia indexing and retrieval for multimodal interaction in the health domain
- Multimedia indexing and retrieval with video recordings from medical endoscopy
- Multimedia recommendation
- Health and medical image quality assessment and enhancement
- Speech and audio analysis and retrieval for health applications
- Video content exploration in endoscopic video
- Mobile media retrieval
- Crowdsourcing in multimedia retrieval
- Query models, paradigms, and languages for multimedia retrieval
- Fusion of multimodal information for health and care-giving applications
- Semantic web approaches for multimedia health applications
- Multilingual and multimodal communication in interactive basic care and health assistive systems
- Semantic reasoning of health multimedia data in interactive systems
- Web and social media retrieval for knowledge-based social companion applications
- Knowledge modelling, interpretation, context-awareness in pervasive environments
- Semantically-enriched situation awareness in smart homes
- Semantic interoperability in IoT platforms, sensors, mobile and wearable devices in pervasive environments
- Hybrid reasoning frameworks (data- and knowledge-driven solutions)
- Semantic Complex Event Processing
- Activity recognition under uncertainty, noise and incomplete data
- Event modelling and detection
- Semantic Web technologies in eHealth, mHealth, uHealth, Ambient Assisted Living and medical applications
- Multimodal data fusion for health applications 
- Multimodal conversation, dialogue systems and personal coaches
- Multimedia interaction for agent applications
- Intelligent user interfaces for multimedia retrieval
- Avatar development for health applications
- Multimodal analytics for human-machine interaction in the health and migration domains
- Multimodal conversation and dialogue systems for social companion and migration-focused agents
- Interactive multimedia retrieval algorithms and applications for health 
- Interfaces and visualization tools for interaction with multimedia in the health domain
- Facial analysis and gesture recognition in social agents

Special session papers will supplement the regular research papers and be included in the proceedings of MMM 2022. For more information on submitting a paper to the special session, please visit the Author guidelines page of the main MMM conference (http://mmm2022.org/authors.html). 

Organizers:
- Thanassis Mavropoulos, Information Technologies Institute, Centre for Research and Technology Hellas, Greece
- Georgios Meditskos, School of Informatics, Aristotle University of Thessaloniki, Greece
- Klaus Schoeffmann, Klagenfurt University, Austria
- Leo Wanner, ICREA - Universitat Pompeu Fabra, Spain
- Stefanos Vrochidis, Information Technologies Institute, Centre for Research and Technology Hellas, Greece
- Athanasios Tzioufas, Medical School of the National and Kapodistrian University of Athens, Greece