| 2023 | A Comparison of Video Browsing Performance between Desktop and Virtual Reality Interfaces. Florian Spiess, Ralph Gasser, Silvan Heller, Heiko Schuldt, Luca Rossetto |
| 2023 | A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis. Wenxiu Geng, Xiangxian Li, Yulong Bian |
| 2023 | A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication. Tiancong Cheng, Ying Zhang, Yifang Yin, Roger Zimmermann, Zhiwen Yu, Bin Guo |
| 2023 | A Recurrent Neural Network based Generative Adversarial Network for Long Multivariate Time Series Forecasting. Peiwang Tang, Qinghua Zhang, Xianchao Zhang |
| 2023 | A Robust Deep Learning Enhanced Monocular SLAM System for Dynamic Environments. Yaoqing Li, Sheng-hua Zhong, Shuai Li, Yan Liu |
| 2023 | A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset. Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Tingting Gao, Ruiji Fu, Zhongyuan Wang |
| 2023 | ASCS-Reinforcement Learning: A Cascaded Framework for Accurate 3D Hand Pose Estimation. Mingqi Chen, Feng Shuang, Shaodong Li, Xi Liu |
| 2023 | AVForensics: Audio-driven Deepfake Video Detection with Masking Strategy in Self-supervision. Yizhe Zhu, Jialin Gao, Xi Zhou |
| 2023 | Algorithms for Generating and Evaluating Visually Sorted Grid Layouts. Kai Uwe Barthel |
| 2023 | Attention-based Video Virtual Try-On. Wen-Jiin Tsai, Yi-Cheng Tien |
| 2023 | CLAP: Contrastive Language-Audio Pre-training Model for Multi-modal Sentiment Analysis. Tianqi Zhao, Ming Kong, Tian Liang, Qiang Zhu, Kun Kuang, Fei Wu |
| 2023 | CMMT: Cross-Modal Meta-Transformer for Video-Text Retrieval. Yizhao Gao, Zhiwu Lu |
| 2023 | CNNs with Multi-Level Attention for Domain Generalization. Aristotelis Ballas, Christos Diou |
| 2023 | CalorieCam360: Simultaneous Eating Action Recognition of Multiple People Using an Omnidirectional Camera. Kento Terauchi, Keiji Yanai |
| 2023 | Cross-Language Music Recommendation Exploration. Stefanos Stoikos, David Kauchak, Douglas Turnbull, Alexandra Papoutsaki |
| 2023 | Cross-View Sample-Enriched Graph Contrastive Learning Network for Personalized Micro-video Recommendation. Ying He, Gongqing Wu, Desheng Cai, Xuegang Hu |
| 2023 | CurveSDF: Binary Image Vectorization Using Signed Distance Fields. Zeqing Xia, Zhouhui Lian |
| 2023 | Deep Enhanced-Similarity Attention Cross-modal Hashing Learning. Mingyuan Ge, Yewen Li, Longfei Ma, Mingyong Li |
| 2023 | Dual-Modality Co-Learning for Unveiling Deepfake in Spatio-Temporal Space. Jiazhi Guan, Hang Zhou, Zhizhi Guo, Tianshu Hu, Lirui Deng, Chengbin Quan, Meng Fang, Youjian Zhao |
| 2023 | Dual-Path Semantic Construction Network for Composed Query-Based Image Retrieval. Shenshen Li |
| 2023 | Dual-Stream Multimodal Learning for Topic-Adaptive Video Highlight Detection. Ziwei Xiong, Han Wang |
| 2023 | EMP: Emotion-guided Multi-modal Fusion and Contrastive Learning for Personality Traits Recognition. Yusong Wang, Dongyuan Li, Kotaro Funakoshi, Manabu Okumura |
| 2023 | Edge Enhanced Image Style Transfer via Transformers. Chiyu Zhang, Zaiyan Dai, Peng Cao, Jun Yang |
| 2023 | Efficient CNNs and Transformers for Video Understanding and Image Synthesis. Jürgen Gall |
| 2023 | Escaping local minima in deep reinforcement learning for video summarization. Panagiota Alexoudi, Ioannis Mademlis, Ioannis Pitas |
| 2023 | Explaining Image Aesthetics Assessment: An Interactive Approach. Sven Schultze, Ani Withöft, Larbi Abdenebaoui, Susanne Boll |
| 2023 | Explicit Knowledge Integration for Knowledge-Aware Visual Question Answering about Named Entities. Omar Adjali, Paul Grimal, Olivier Ferret, Sahar Ghannay, Hervé Le Borgne |
| 2023 | Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training. Haram Choi, Cheolwoong Na, Jinseop Kim, Jihoon Yang |
| 2023 | FaceLivePlus: A Unified System for Face Liveness Detection and Face Verification. Ying Zhang, Lilei Zheng, Vrizlynn L. L. Thing, Roger Zimmermann, Bin Guo, Zhiwen Yu |
| 2023 | FedPcf: An Integrated Federated Learning Framework with Multi-Level Prospective Correction Factor. Yu Zang, Zhe Xue, Shilong Ou, Yunfei Long, Hai Zhou, Junping Du |
| 2023 | Framing the News: From Human Perception to Large Language Model Inferences. David Alonso del Barrio, Daniel Gatica-Perez |
| 2023 | Graph Contrastive Learning on Complementary Embedding for Recommendation. Meishan Liu, Meng Jian, Ge Shi, Ye Xiang, Lifang Wu |
| 2023 | Graph Interactive Network with Adaptive Gradient for Multi-Modal Rumor Detection. Tiening Sun, Zhong Qian, Peifeng Li, Qiaoming Zhu |
| 2023 | How Responsible LLMs are beneficial to search and exploration in Retail industry. Nozha Boujemaa, Abdelrahman Hassan, Giorgi Kokaia, Pratyush Kumar Sinha |
| 2023 | Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining. Giacomo Nebbia, Adriana Kovashka |
| 2023 | ICDAR'23: Intelligent Cross-Data Analysis and Retrieval. Guillaume Habault, Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Yuta Nakashima, Cathal Gurrin |
| 2023 | Improving Generalization for Multimodal Fake News Detection. Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack |
| 2023 | Improving Image Encoders for General-Purpose Nearest Neighbor Search and Classification. Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus Jung |
| 2023 | Improving Query and Assessment Quality in Text-Based Interactive Video Retrieval Evaluation. Werner Bailer, Rahel Arnold, Vera Benz, Davide Coccomini, Anastasios Gkagkas, Gylfi Þór Guðmundsson, Silvan Heller, Björn Þór Jónsson, Jakub Lokoc, Nicola Messina, Nick Pantelidis, Jiaxin Wu |
| 2023 | Integrative Multi-Modal Computing for Personal Health Navigation. Nitish Nag, Hyungik Oh, Mengfan Tang, Mingshu Shi, Ramesh C. Jain |
| 2023 | Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation. Soyun Choi, Youjia Zhang, Sungeun Hong |
| 2023 | Introduction to the Sixth Annual Lifelog Search Challenge, LSC'23. Cathal Gurrin, Björn Þór Jónsson, Duc-Tien Dang-Nguyen, Graham Healy, Jakub Lokoc, Liting Zhou, Luca Rossetto, Minh-Triet Tran, Wolfgang Hürst, Werner Bailer, Klaus Schoeffmann |
| 2023 | Joint Geometric-Semantic Driven Character Line Drawing Generation. Cheng-Yu Fang, Xian-Feng Han |
| 2023 | Knowledge-Aware Causal Inference Network for Visual Dialog. Zefan Zhang, Yi Ji, Chunping Liu |
| 2023 | Label-wise Deep Semantic-Alignment Hashing for Cross-Modal Retrieval. Liang Li, Weiwei Sun |
| 2023 | Learning From Expert: Vision-Language Knowledge Distillation for Unsupervised Cross-Modal Hashing Retrieval. Lina Sun, Yewen Li, Yumin Dong |
| 2023 | Learning and Fusing Multi-Scale Representations for Accurate Arbitrary-Shaped Scene Text Recognition. Mingjun Li, Shuo Xu, Feng Su |
| 2023 | Learning with Adaptive Knowledge for Continual Image-Text Modeling. Yutian Luo, Yizhao Gao, Zhiwu Lu |
| 2023 | Less is More: Decoupled High-Semantic Encoding for Action Recognition. Chun Zhang, Keyan Ren, Qingyun Bian, Yu Shi |
| 2023 | MAAM: Media Asset Annotation and Management. Manos Schinas, Panagiotis Galopoulos, Symeon Papadopoulos |
| 2023 | MAD '23 Workshop: Multimedia AI against Disinformation. Luca Cuccovillo, Bogdan Ionescu, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Adrian Popescu |
| 2023 | MMSF: A Multimodal Sentiment-Fused Method to Recognize Video Speaking Style. Beibei Zhang, Yaqun Fang, Fan Yu, Jia Bei, Tongwei Ren |
| 2023 | MemeFier: Dual-stage Modality Fusion for Image Meme Classification. Christos Koutlis, Manos Schinas, Symeon Papadopoulos |
| 2023 | Modeling Functional Brain Networks with Multi-Head Attention-based Region-Enhancement for ADHD Classification. Chunhong Cao, Huawei Fu, Gai Li, Mengyang Wang, Xieping Gao |
| 2023 | More Than Simply Masking: Exploring Pre-training Strategies for Symbolic Music Understanding. Zhexu Shen, Liang Yang, Zhihan Yang, Hongfei Lin |
| 2023 | Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation. Shuo Chen, Ying-Jun Du, Pascal Mettes, Cees G. M. Snoek |
| 2023 | Multi-channel Convolutional Neural Network for Precise Meme Classification. Victoria Sherratt, Kevin Pimbblet, Nina Dethlefs |
| 2023 | Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization. Shenshen Li, Xing Xu, Fumin Shen, Yang Yang |
| 2023 | Multi-modal Fake News Detection on Social Media via Multi-grained Information Fusion. Yangming Zhou, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang |
| 2023 | Multi-view Contrastive Learning with Additive Margin for Adaptive Nasopharyngeal Carcinoma Radiotherapy Prediction. Jiabao Sheng, Saikit Lam, Zhe Li, Jiang Zhang, Xinzhi Teng, Yuanpeng Zhang, Jing Cai |
| 2023 | Multimodal Topic Segmentation of Podcast Shows with Pre-trained Neural Encoders. Iacopo Ghinassi, Lin Wang, Chris Newell, Matthew Purver |
| 2023 | MuseHash: Supervised Bayesian Hashing for Multimodal Image Representation. Maria Pegia, Björn Þór Jónsson, Anastasia Moumtzidou, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris |
| 2023 | Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis. Yankun Wu, Yuta Nakashima, Noa Garcia |
| 2023 | Offensive Tactics Recognition in Broadcast Basketball Videos Based on 2D Camera View Player Heatmaps. Nico, Tse-Yu Pan, Herman Prawiro, Jian-Wei Peng, Wen-Cheng Chen, Hung-Kuo Chu, Min-Chun Hu |
| 2023 | Predicting Tweet Engagement with Graph Neural Networks. Marco Arazzi, Marco Cotogni, Antonino Nocera, Luca Virgili |
| 2023 | Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023, Thessaloniki, Greece, June 12-15, 2023. Ioannis Kompatsiaris, Jiebo Luo, Nicu Sebe, Angela Yao, Vasileios Mezaris, Symeon Papadopoulos, Adrian Popescu, Zi Helen Huang |
| 2023 | RIP-NeRF: Learning Rotation-Invariant Point-based Neural Radiance Field for Fine-grained Editing and Compositing. Yuze Wang, Junyi Wang, Yansong Qu, Yue Qi |
| 2023 | Raising User Awareness about the Consequences of Online Photo Sharing. Hugo Schindler, Adrian Popescu, Van-Khoa Nguyen, Jerome Deshayes-Chossart |
| 2023 | Recognizing Actions in Videos under Domain Shift. Elisa Ricci |
| 2023 | Recommendation of Mix-and-Match Clothing by Modeling Indirect Personal Compatibility. Shuiying Liao, Yujuan Ding, P. Y. Mok |
| 2023 | Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval. Jiancheng Pan, Qing Ma, Cong Bai |
| 2023 | Reference-Limited Compositional Zero-Shot Learning. Siteng Huang, Qiyao Wei, Donglin Wang |
| 2023 | Reproducibility Companion Paper: MeTILDA - Platform for Melodic Transcription in Language Documentation and Application. Mitchell Lee, Chris Lee, Sanjay Penmetsa, Min Chen, Mizuki Miyashita, Naatosi Fish, Bo Wu, Omar Shahbaz Khan |
| 2023 | SIGMA-DF: Single-Side Guided Meta-Learning for Deepfake Detection. Bing Han, Jianshu Li, Wenqi Ren, Man Luo, Jian Liu, Xiaochun Cao |
| 2023 | SOFA: Style-based One-shot 3D Facial Animation Driven by 2D landmarks. Pu Ching, Hung-Kuo Chu, Min-Chun Hu |
| 2023 | SPAE: Spatial Preservation-based Autoencoder for ADHD functional brain networks modelling. Chunhong Cao, Gai Li, Huawei Fu, Xingxing Li, Xieping Gao |
| 2023 | Shot Retrieval and Assembly with Text Script for Video Montage Generation. Guoxing Yang, Haoyu Lu, Zelong Sun, Zhiwu Lu |
| 2023 | Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution. Kun He, Changyu Li, Jie Shao |
| 2023 | Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition. Yingnan Fu, Wenyuan Cai, Ming Gao, Aoying Zhou |
| 2023 | TAGM: Task-Aware Graph Model for Few-shot Node Classification. Feng Zhao, Min Zhang, Tiancheng Huang, Donglin Wang |
| 2023 | TDEC: Deep Embedded Image Clustering with Transformer and Distribution Information. Ruilin Zhang, Haiyang Zheng, Hongpeng Wang |
| 2023 | TNOD: Transformer Network with Object Detection for Tag Recommendation. Kai Feng, Tao Liu, Heng Zhang, Zihao Meng, Zemin Miao |
| 2023 | Text-to-Image Fashion Retrieval with Fabric Textures. Daichi Suzuki, Go Irie, Kiyoharu Aizawa |
| 2023 | Towards Practical Consistent Video Depth Estimation. Pengzhi Li, Yikang Ding, Linge Li, Jingwei Guan, Zhiheng Li |
| 2023 | Towards Shape-regularized Learning for Mitigating Texture Bias in CNNs. Harsh Sinha, Adriana Kovashka |
| 2023 | TsP-Tran: Two-Stage Pure Transformer for Multi-Label Image Retrieval. Ying Li, Chunming Guan, Jiaquan Gao |
| 2023 | Tweaking EfficientDet for frugal training. Georgios Orfanidis, Konstantinos Ioannidis, Anastasios Tefas, Stefanos Vrochidis, Ioannis Kompatsiaris |
| 2023 | Unlocking Potential of 3D-aware GAN for More Expressive Face Generation. Juheon Hwang, Jiwoo Kang, Kyoungoh Lee, Sanghoon Lee |
| 2023 | VISIONE: A Large-Scale Video Retrieval System with Advanced Search Functionalities. Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo |
| 2023 | Video Retrieval for Everyday Scenes With Common Objects. Arun Zachariah, Praveen Rao |
| 2023 | We Are Not So Similar: Alleviating User Representation Collapse in Social Recommendation. Bingchao Wu, Yangyuxuan Kang, Bei Guan, Yongji Wang |
| 2023 | Zero-shot Sketch-based Image Retrieval with Adaptive Balanced Discriminability and Generalizability. Jialin Tian, Xing Xu, Zuo Cao, Gong Zhang, Fumin Shen, Yang Yang |
| 2023 | navigu.net: NAvigation in Visual Image Graphs gets User-friendly. Kai Uwe Barthel, Nico Hezel, Konstantin Schall, Klaus Jung |