| 2024 | 3DMSE: An Interactive 3D Media Search Engine. Maria Eirini Pegia, Dimitris Georgalis, Nick Pantelidis, Björn Þór Jónsson, Anastasia Moumtzidou, Sotiris Diplaris, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris |
| 2024 | A Causal View for Multi-Interest User Modeling in News Recommendation. Mei Yu, Xiaoxi Zhou, Mankun Zhao, Tianyi Xu, Yue Zhao, Ruiguo Yu, Xuewei Li |
| 2024 | A GAN based Video Summarization Method with Representation Loss. Zhuo Lei, Qiang Yu, Lidan Shou, Shengquan Li, Yunqing Mao |
| 2024 | A Generative Adaptive Context Learning Framework for Large Language Models in Cheapfake Detection. Long-Khanh Pham, Hoa-Vien Vo-Hoang, Anh-Duy Tran |
| 2024 | A Graph Convolution Network with a POS-aware Filter and Context Enhancement Mechanism for Event Detection. Xintao Jiao, Jiansheng Chen, Jiale Liu |
| 2024 | A Hybrid Few-Shot Image Classification Framework Combining Gaussian Modeling and Label Propagation. Chao Ye, Qian Wang, Lanfang Dong |
| 2024 | A Knowledge-Driven Approach to Enhance Topic Modeling with Multi-Modal Representation Learning. Hongzhang Mu, Shuili Zhang, Hongbo Xu |
| 2024 | A Lightweight Surface Defect Segmentation Network with External Semantics and High-frequency Information. Tianpeng Zhang, Xuesong Jiang |
| 2024 | A Multi-Stage Deep Learning Approach Incorporating Text-Image and Image-Image Comparisons for Cheapfake Detection. Jangwon Seo, Hyo-Seok Hwang, Jiyoung Lee, Minhyeok Lee, Wonsuk Kim, Junhee Seok |
| 2024 | A Novel Auxiliary Task Framework in 3D Human Pose Estimation for Opera Videos. Xingquan Cai, Haoyu Zhang, Shanshan He, Haoyu Song, Haiyan Sun |
| 2024 | A Parallel Transformer Framework for Video Moment Retrieval. Thao-Nhu Nguyen, Zongyao Li, Satoshi Yamazaki, Jianquan Liu, Cathal Gurrin |
| 2024 | A Sentimental Prompt Framework with Visual Text Encoder for Multimodal Sentiment Analysis. Shizhou Huang, Bo Xu, Changqun Li, Jiabo Ye, Xin Lin |
| 2024 | A Unified Network for Detecting Out-Of-Context Information Using Generative Synthetic Data. Van-Loc Nguyen, Bao-Tin Nguyen, Thanh-Son Nguyen, Duc-Tien Dang-Nguyen, Minh-Triet Tran |
| 2024 | A Web Demo Interface for Super-Resolution Reconstruction with Parametric Regularization Loss. Supatta Viriyavisuthisakul, Parinya Sanguansat, Toshihiko Yamasaki |
| 2024 | ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation. Zhaoxin Fan, Zhenbo Song, Zhicheng Wang, Jian Xu, Kejian Wu, Hongyan Liu, Jun He |
| 2024 | AI Batting Buddy: A Computational and Kinematic Approach for Enhancing Batting Performance and Analysis in Baseball. Kuo-Yu Liu, Ting-Yu Guo, Ta-Shan Pan, Ping-Yi Tung, Yi-Rou Lin |
| 2024 | AI-SIPM 2024: International Workshop on Artificial Intelligence for Signal, Image Processing and Multimedia. Mahasak Ketcham, Kanyalag Phodong, Patiyuth Pramkeaw, Worawut Yimyam, Narumol Chumuang, Pokpong Songmuang, Thittaporn Ganokratanaa |
| 2024 | AdOCTeRA: Adaptive Optimization Constraints for improved Text-guided Retrieval of Apartments. Ali Abdari, Alex Falcon, Giuseppe Serra |
| 2024 | An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval. Nico Hezel, Kai Uwe Barthel, Konstantin Schall, Klaus Jung |
| 2024 | Anchor-aware Deep Metric Learning for Audio-visual Retrieval. Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu |
| 2024 | BFIDet: A YOLOv7-improved Vehicle and Pedestrian Detector via Balancing Feature Integration. Anrui Wang, Libo Weng, Fei Gao |
| 2024 | BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval. Kaixing Yang, Xukun Zhou, Xulong Tang, Ran Diao, Hongyan Liu, Jun He, Zhaoxin Fan |
| 2024 | Bringing Video Browsing to Virtual Reality: Empirical Evaluation of a Novel Multimedia Drawer. Florian Spiess, Nicolas Scharowski, Ariane Haller, Zgjim Memeti, Heiko Schuldt, Florian Brühlmann |
| 2024 | CGI-MRE: A Comprehensive Genetic-Inspired Model For Multimodal Relation Extraction. Pengfei Wei, Zhaokang Huang, Hongjun Ouyang, Qintai Hu, Bi Zeng, Guang Feng |
| 2024 | CLCP: Realtime Text-Image Retrieval for Retailing via Pre-trained Clustering and Priority Queue. Shuyang Zhang, Liangwu Wei, Qingyu Wang, Yuntao Wei, Yanzhi Song |
| 2024 | CLIP-ProbCR: CLIP-based Probability embedding Combination Retrieval. Mingyong Li, Zongwei Zhao, Xiaolong Jiang, Zheng Jiang |
| 2024 | CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection. Sohail Ahmed Khan, Duc-Tien Dang-Nguyen |
| 2024 | CLTalk: Speech-Driven 3D Facial Animation with Contrastive Learning. Xitie Zhang, Suping Wu |
| 2024 | CMFF-Face: Attention-Based Cross-Modal Feature Fusion for High-Quality Audio-Driven Talking Face Generation. Guangzhe Zhao, Yanan Liu, Xueping Wang, Feihu Yan |
| 2024 | Calibration & Reconstruction: Deeply Integrated Language for Referring Image Segmentation. Yichen Yan, Xingjian He, Sihan Chen, Jing Liu |
| 2024 | CarAI: Car Inspection with Artificial Intelligence. Panumate Chetprayoon, Sakol Tasanangam, Gayatri Tirumalasetty, Thanatwit Angsarawanee, Paveen Virameteekul, Wadeepas Lertwatanawanich, Theerat Sakdejayont |
| 2024 | Causal Inference-based Few-Shot Class-Incremental Learning. Weiwei Zhou, Guoqiang Xiao, Michael S. Lew, Song Wu |
| 2024 | CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit. Kaixing Yang, Xulong Tang, Ran Diao, Hongyan Liu, Jun He, Zhaoxin Fan |
| 2024 | CodeDetector: Revealing Forgery Traces with Codebook for Generalized Deepfake Detection. Jiaxin Li, Zhihan Yu, Guibo Luo, Yuesheng Zhu |
| 2024 | Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection. Yang Liu, Tongfei Shen, Dong Zhang, Qingying Sun, Shoushan Li, Guodong Zhou |
| 2024 | Compact Visual Data Representation for Multimedia Search and Analytics. Shiqi Wang, Xinfeng Zhang |
| 2024 | Component-Level Oracle Bone Inscription Retrieval. Zhikai Hu, Yiu-ming Cheung, Yonggang Zhang, Peiying Zhang, Puiling Tang |
| 2024 | Content-Based Exclusion Queries in Keyword-Based Image Retrieval. Eisaku Yoshikawa, Keishi Tajima |
| 2024 | Context or Clutter? Efficiently Matching Objects Across Scenes. Albatool Wazzan, Imtiaz Ahmad, Stephen MacNeil, Richard Souvenir |
| 2024 | Contrastive Pre-training with Multi-level Alignment for Grounded Multimodal Named Entity Recognition. Xigang Bao, Mengyuan Tian, Luyao Wang, Zhiyuan Zha, Biao Qin |
| 2024 | Conversational Image Search: A Sketch-based Approach. Daniel D. Braghis, Haiming Liu |
| 2024 | Creating Sorted Grid Layouts with Gradient-based Optimization. Kai Uwe Barthel, Florian Tim Barthel, Peter Eisert, Nico Hezel, Konstantin Schall |
| 2024 | Deep Image Clustering Based on Curriculum Learning and Density Information. Haiyang Zheng, Ruilin Zhang, Hongpeng Wang |
| 2024 | Deep Scaling Factor Quantization Network for Large-scale Image Retrieval. Ziqing Deng, Zhihui Lai, Yujuan Ding, Heng Kong, Xu Wu |
| 2024 | DeepEnhancer: Temporally Consistent Focal Transformer for Comprehensive Video Enhancement. Qin Jiang, Qinglin Wang, Lihua Chi, Wentao Ma, Feng Li, Jie Liu |
| 2024 | Detecting Misinformation in Photos Utilizing Reverse Image Search. Vinh Dang, Thanh-Son Nguyen, Minh-Triet Tran, Duc-Tien Dang-Nguyen |
| 2024 | Detecting Out-of-Context Media with LLaMa-Adapter V2 and RoBERTa: An Effective Method for Cheapfakes Detection. Hoa-Vien Vo-Hoang, Long-Khanh Pham, Minh-Son Dao |
| 2024 | DiffHarmony: Latent Diffusion Model Meets Image Harmonization. Pengfei Zhou, Fangxiang Feng, Xiaojie Wang |
| 2024 | Directly Locating Actions in Video with Single Frame Annotation. Haoran Tong, Xinyan Liu, Guorong Li, Laiyun Qing |
| 2024 | Discovering Multi-Relational Integration for Knowledge Tracing with Retentive Networks. Linhao Zhou, Sheng-hua Zhong, Zhijiao Xiao |
| 2024 | Diversity in Multimedia. Yi-Ping Phoebe Chen |
| 2024 | DualStyle3D: Real-time Exemplar-based Artistic Portrait View Synthesis Based on Radiance Field. Runlai Hao, Jinlong Li, Qiuju Chen, Huanhuan Chen |
| 2024 | Dynamic Segmentation for Efficient Retrieval of Podcasts: The Repping Algorithm. Stephan Repp, Ernst Georg Haffner |
| 2024 | Dynamic Soft Labeling for Visual Semantic Embedding. Jiaao Yu, Yunlai Ding, Junyu Dong, Yuezun Li |
| 2024 | ELSEIR: A Privacy-Preserving Large-Scale Image Retrieval Framework for Outsourced Data Sharing. Zixin Tang, Haihui Fan, Xiaoyan Gu, Yang Li, Bo Li, Xin Wang |
| 2024 | End-to-End Thai Text-to-Speech with Linguistic Unit. Kontawat Wisetpaitoon, Sattaya Singkul, Theerat Sakdejayont, Tawunrat Chalothorn |
| 2024 | Enhancing Cheapfake Detection: An Approach Using Prompt Engineering and Interleaved Text-Image Model. Dang Vu, Minh-Nhat Nguyen, Quoc-Trung Nguyen |
| 2024 | Enhancing Class-Incremental Learning for Image Classification via Bidirectional Transport and Selective Momentum. Feifei Fu, Yizhao Gao, Zhiwu Lu |
| 2024 | Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models. Hongyi Zhu, Jia-Hong Huang, Stevan Rudinac, Evangelos Kanoulas |
| 2024 | Enhancing Visible-Infrared Person Re-identification with Modality- and Instance-aware Visual Prompt Learning. Ruiqi Wu, Bingliang Jiao, Wenxuan Wang, Meng Liu, Peng Wang |
| 2024 | Exploiting Degradation Prior for Personalized Federated Learning in Real-World Image Super-Resolution. Yue Yang, Liangjun Ke |
| 2024 | ExpoGenius: Robust Personalized Human Image Generation using Diffusion Model for Exposure Variation and Pose Transfer. Depei Liu, Hongjie Fan, Junfei Liu |
| 2024 | Extending CLIP for Text-to-font Retrieval. Qinghua Sun, Jia Cui, Zhenyu Gu |
| 2024 | FEST: A Multi-way Framework with Enhanced Spatial-Temporal Modeling for Traffic Forecasting. Yilin Li, Tszyin Guo, Ying Qiao, Zitong Bo, Hongan Wang |
| 2024 | FaceX: Understanding Face Attribute Classifiers through Summary Model Explanations. Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos, Christos Diou |
| 2024 | FedPAM: Federated Personalized Augmentation Model for Text-to-Image Retrieval. Yueying Feng, Fan Ma, Wang Lin, Chang Yao, Jingyuan Chen, Yi Yang |
| 2024 | Federated Multi-Task Learning on Non-IID Data Silos: An Experimental Study. Yuwen Yang, Yuxiang Lu, Suizhi Huang, Shalayiding Sirejiding, Hongtao Lu, Yue Ding |
| 2024 | Fine-Tuning Large Language Models for Private Document Retrieval: A Tutorial. Frank Sommers, Alisa Kongthon, Sarawoot Kongyoung |
| 2024 | Fine-grained Semantics-aware Representation Learning for Text-based Person Retrieval. Di Wang, Feng Yan, Yifeng Wang, Lin Zhao, Xiao Liang, Haodi Zhong, Ronghua Zhang |
| 2024 | Fingerprinting in EEG Model IP Protection Using Diffusion Model. Tianyi Wang, Shenghua Zhong |
| 2024 | G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning. Ruiting Dai, Yuqiao Tan, Lisi Mo, Shuang Liang, Guohao Huo, Jiayi Luo, Yao Cheng |
| 2024 | GB Zeli Wang, Tuo Zhang, Shuyin Xia, Longlong Lin, Guoyin Wang |
| 2024 | GSD-GNN: Generalizable and Scalable Algorithms for Decoupled Graph Neural Networks. Yunfeng Yu, Longlong Lin, Qiyu Liu, Zeli Wang, Xi Ou, Tao Jia |
| 2024 | Generative Data Augmentation with Liveness Information Preserving for Face Anti-Spoofing. Changgu Chen, Yang Li, Jian Zhang, Jiali Liu, Changbo Wang |
| 2024 | HashNeck is a Boosting Tool for Deep Learning to Hashing. Hua Gao, Chenchen Hu, Guang Han, Jiafa Mao, Wei Huang, Kaiyuan Wan |
| 2024 | HybridHash: Hybrid Convolutional and Self-Attention Deep Hashing for Image Retrieval. Chao He, Hongxi Wei |
| 2024 | ICDAR 24: Intelligent Cross-Data Analysis and Retrieval. Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Hanh-Nhi Tran, Rage Uday Kiran, Takahiro Komamizu |
| 2024 | Identification of Speaker Roles and Situation Types in News Videos. Gullal S. Cheema, Judi Arafat, Chiao-I Tseng, John A. Bateman, Ralph Ewerth, Eric Müller-Budack |
| 2024 | Image-to-Point Registration via Cross-Modality Correspondence Retrieval. Lin Bie, Siqi Li, Kai Cheng |
| 2024 | Improve Deep Hashing with Language Guidance for Unsupervised Image Retrieval. Chuang Zhao, Hefei Ling, Shijie Lu, Yuxuan Shi, Jiazhong Chen, Ping Li |
| 2024 | Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning. Yuhang Zheng, Zhen Wang, Long Chen |
| 2024 | Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank. Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan |
| 2024 | Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement. Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng |
| 2024 | Intra and Inter-modality Incongruity Modeling and Adversarial Contrastive Learning for Multimodal Fake News Detection. Siqi Wei, Bin Wu |
| 2024 | Introduction to the Seventh Annual Lifelog Search Challenge, LSC'24. Cathal Gurrin, Liting Zhou, Graham Healy, Werner Bailer, Duc-Tien Dang-Nguyen, Steve Hodges, Björn Þór Jónsson, Jakub Lokoc, Luca Rossetto, Minh-Triet Tran, Klaus Schöffmann |
| 2024 | Knowledge Distillation for Single Image Super-Resolution via Contrastive Learning. Cencen Liu, Dongyang Zhang, Ke Qin |
| 2024 | Known-Item Search in Video: An Eye Tracking-Based Study. Lucas Joos, Bastian Jäckl, Daniel A. Keim, Maximilian T. Fischer, Ladislav Peska, Jakub Lokoc |
| 2024 | Learning from Reduced Labels for Long-Tailed Data. Meng Wei, Zhongnian Li, Yong Zhou, Xinzheng Xu |
| 2024 | Lifelong Visible-Infrared Person Re-Identification via a Tri-Token Transformer with a Query-Key Mechanism. Yitong Xing, Guoqiang Xiao, Michael S. Lew, Song Wu |
| 2024 | Local Deep Learning Quantization for Approximate Nearest Neighbor Search. Quan Li, Xike Xie, Chao Wang, Jiali Weng |
| 2024 | Low-Light Image Enhancement via Weighted Low-Rank Tensor Regularized Retinex Model. Weipeng Yang, Hongxia Gao, Wenbin Zou, Tongtong Liu, Shasha Huang, Jianliang Ma |
| 2024 | MAD '24 Workshop: Multimedia AI against Disinformation. Cristian Lucian Stanciu, Bogdan Ionescu, Luca Cuccovillo, Symeon Papadopoulos, Giorgos Kordopatis-Zilos, Adrian Popescu, Roberto Caldelli |
| 2024 | MFVG: A Visual Grounding Network with Multi-scale Fusion. Peijia Chen, Ke Qi, Xi Tao, Wenhao Xu, Jingdong Zhang |
| 2024 | ML Ziyu Gong, Chengcheng Mai, Yihua Huang |
| 2024 | MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities. Zhedong Zheng, Yaxiong Wang, Xuelin Qian, Zhun Zhong, Zheng Wang, Liang Zheng |
| 2024 | MSI: Multi-modal Recommendation via Superfluous Semantics Discarding and Interaction Preserving. Yi Li, Qingmeng Zhu, Changwen Zheng, Jiangmeng Li |
| 2024 | MUWS 2024: The 3rd International Workshop on Multimodal Human Understanding for the Web and Social Media. Marc A. Kastner, Gullal S. Cheema, Sherzod Hakimov, Noa Garcia |
| 2024 | MVRMLM 2024: Multimodal Video Retrieval and Multimodal Language Modelling. Hui Wang, Josef Kittler, Mark J. F. Gales, Rob Cooper, Maurice D. Mulvenna, Wing W. Y. Ng, Yang Hua, Richard Gault, Abbas Haider, Guanfeng Wu |
| 2024 | Mapping the Audio Landscape for Innovative Music Sample Generation. Christian Limberg, Zhe Zhang |
| 2024 | MarginFinger: Controlling Generated Fingerprint Distance to Classification boundary Using Conditional GANs. Weixing Liu, Shenghua Zhong |
| 2024 | MemoriLens: a Low-cost Lifelog Camera Using Raspberry Pi Zero. Quang-Linh Tran, Binh T. Nguyen, Gareth J. F. Jones, Cathal Gurrin |
| 2024 | Modality-specific and -shared Contrastive Learning for Sentiment Analysis. Dahuang Liu, Jiuxiang You, Guobo Xie, Lap-Kei Lee, Fu Lee Wang, Zhenguo Yang |
| 2024 | Modeling Multi-Task Joint Training of Aggregate Networks for Multi-Modal Sarcasm Detection. Lisong Ou, Zhixin Li |
| 2024 | Monocular Expressive 3D Human Reconstruction of Multiple People. Zhenghao Zhao, Hao Tang, Joy Wan, Yan Yan |
| 2024 | Multi-Source Augmentation and Composite Prompts for Visual Recognition with Missing Modality. Zhirui Kuai, Yulu Zhou, Qi Xie, Li Kuang |
| 2024 | Multi-modal Entity Alignment via Position-enhanced Multi-label Propagation. Wei Tang, Yuanyi Wang |
| 2024 | Multi-modal Video Summarization. Jia-Hong Huang |
| 2024 | Multi-view Counterfactual Contrastive Learning for Fact-checking Fake News Detection. Yongcheng Zhang, Lingou Kong, Sheng Tian, Hao Fei, Changpeng Xiang, Huan Wang, Xiaomei Wei |
| 2024 | Multi-view Subspace Clustering via An Adaptive Consensus Graph Filter. Lai Wei, Shanshan Song |
| 2024 | Multidimensional Semantic Disentanglement Network for Clothes-Changing Person Re-Identification. Yongkang Ding, Anqi Wang, Liyan Zhang |
| 2024 | Multimedia Retrieval in and for XR. Maria Pegia, Sotiris Diplaris, Stefanos Vrochidis, Heiko Schuldt, Florian Spiess, Rahel Arnold, Werner Bailer |
| 2024 | Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition. Xinzhe Ni, Yong Liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang |
| 2024 | Multimodality in Media Retrieval. Maria Eirini Pegia |
| 2024 | Navigating Style Variations in Scene Text Image Super-Resolution through Multi-Scale Perception. Feifei Xu, Ziheng Yu |
| 2024 | Near-Miss Accident Prediction on the Edge: A Real-Time System for Safer Driving. Minh-Son Dao, Koji Zettsu |
| 2024 | NeurNCD: Novel Class Discovery via Implicit Neural Representation. Junming Wang, Yi Shi |
| 2024 | Neural Parametric Human Hand Modeling with Point Cloud Representation. Jian Yang, Weize Quan, Zhen Shen, Dong-Ming Yan, Huaiyu Wu |
| 2024 | Octree-Retention Fusion: A High-Performance Context Model for Point Cloud Geometry Compression. Zhikang Zhang, Zhongjie Zhu, Yongqiang Bai, Ming Wang, Zhijing Yu |
| 2024 | OpenLifelogCam - A Low-Cost Open-Source Wearable Camera Platform. Luca Rossetto |
| 2024 | Overview of the Grand Challenge on Detecting Cheapfakes at ACM ICMR 2024. Duc-Tien Dang-Nguyen, Sohail Ahmed Khan, Michael Riegler, Pål Halvorsen, Anh-Duy Tran, Minh-Son Dao, Minh-Triet Tran |
| 2024 | PTAN: Principal Token-aware Adjacent Network for Compositional Temporal Grounding. Zhuoyuan Wei, Xun Jiang, Zheng Wang, Fumin Shen, Xing Xu |
| 2024 | Parametric CAD Primitive Retrieval via Multi-Modal Fusion and Deep Hashing. Minyang Xu, Yunzhong Lou, Weijian Ma, Xueyang Li, Xiangdong Zhou |
| 2024 | Pattern4Ego: Learning Egocentric Video Representation Using Cross-video Activity Patterns. Ruihai Wu, Yourong Zhang, Yu Qi, Andy Guanhong Chen, Hao Dong |
| 2024 | PiCoGen: Generate Piano Covers with a Two-stage Approach. Chih-Pin Tan, Shuen-Huei Guan, Yi-Hsuan Yang |
| 2024 | PoseRec: 3D Human Pose Driven Online Advertisement Recommendation for Micro-videos. Zhaoxin Fan, Fengxin Li, Hongyan Liu, Jun He, Xiaoyong Du |
| 2024 | Proactive Privacy and Intellectual Property Protection of Multimedia Retrieval Models in Edge Intelligence. Peihao Li, Jie Huang, Shuaishuai Zhang, Chunyang Qi |
| 2024 | Proceedings of the 2024 International Conference on Multimedia Retrieval, ICMR 2024, Phuket, Thailand, June 10-14, 2024. Cathal Gurrin, Rachada Kongkachandra, Klaus Schoeffmann, Duc-Tien Dang-Nguyen, Luca Rossetto, Shin'ichi Satoh, Liting Zhou |
| 2024 | Progressive Multi-modal Conditional Prompt Tuning. Xiaoyu Qiu, Hao Feng, Yuechen Wang, Wengang Zhou, Houqiang Li |
| 2024 | Prompt Expending for Single Positive Multi-Label Learning with Global Unannotated Categories. Zhongnian Li, Peng Ying, Meng Wei, Tongfeng Sun, Xinzheng Xu |
| 2024 | Pseudo Content Hallucination for Unpaired Image Captioning. Huixia Ben, Shuo Wang, Meng Wang, Richang Hong |
| 2024 | Pyramidal Cross-Modal Transformer with Sustained Visual Guidance for Multi-Label Image Classification. Zhuohua Li, Ruyun Wang, Fuqing Zhu, Jizhong Han, Songlin Hu |
| 2024 | QAVidCap: Enhancing Video Captioning through Question Answering Techniques. Hui Liu, Xiaojun Wan |
| 2024 | RE-IDVIS: Person Re-Identification System based on Interactive Visualization. Wang Xia, Guodao Sun, Zihao Zhu, Pan Liang, Sujia Zhu, Yiming Wu, Haoran Liang, Ronghua Liang |
| 2024 | RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory. Boyue Xu, Ruichao Hou, Tongwei Ren, Gangshan Wu |
| 2024 | Reconciling the Rift Between Recognition and Recall: Insights from a Video Memorability Drawing Experiment. Lorin Sweeney, Graham Healy, Alan F. Smeaton |
| 2024 | Refracting Once is Enough: Neural Radiance Fields for Novel-View Synthesis of Real Refractive Objects. Xiaoqian Liang, Jianji Wang, Yuanliang Lu, Xubin Duan, Xichun Liu, Nanning Zheng |
| 2024 | Reproducibility Companion Paper of "MMSF: A Multimodal Sentiment-Fused Method to Recognize Video Speaking Style". Fan Yu, Beibei Zhang, Yaqun Fang, Jia Bei, Tongwei Ren, Jiyi Li, Luca Rossetto |
| 2024 | Reproducibility Companion Paper: Recommendation of Mix-and-Match Clothing by Modeling Indirect Personal Compatibility. Shuiying Liao, Yujuan Ding, P. Y. Mok, Qiushi Huang, Jialun Cao |
| 2024 | Reproducibility Companion Paper: Stable Diffusion for Content-Style Disentanglement in Art Analysis. Yankun Wu, Yuta Nakashima, Noa Garcia, Sheng Li, Zhaoyang Zeng |
| 2024 | Research on Epilepsy Classification Model Based on Variational Mode Quadratic Decomposition. Chen Huang, Zhijun Fan, Kui Xiao, Yan Zhang, Shihui Wang, Jianhua Song, Wei Wu, Chao Liu |
| 2024 | Retrieval-Augmented Audio Deepfake Detection. Zuheng Kang, Yayun He, Botao Zhao, Xiaoyang Qu, Junqing Peng, Jing Xiao, Jianzong Wang |
| 2024 | RetrievalMMT: Retrieval-Constrained Multi-Modal Prompt Learning for Multi-Modal Machine Translation. Yan Wang, Yawen Zeng, Junjie Liang, Xiaofen Xing, Jin Xu, Xiangmin Xu |
| 2024 | Retrieving Emotional Stimuli in Artworks. Tianwei Chen, Noa Garcia, Liangzhi Li, Yuta Nakashima |
| 2024 | Robust Video Hashing with Non-negative Tensor Factorization for Copy Detection. Mengzhu Yu, Zhenjun Tang, Huijiang Zhuang, Xiaoping Liang, Zhixin Li, Xianquan Zhang |
| 2024 | S2F-Net: Shared-Specific Fusion Network for Infrared and Visible Image Fusion. Yijing Zhao, Yuchao Xia, Yi Ding, Yumeng Liu, Shuai Liu, Hongan Wang |
| 2024 | SBCR: Stochasticity Beats Content Restriction Problem in Training and Tuning Free Image Editing. Jiancheng Huang, Mingfu Yan, Yifan Liu, Shifeng Chen |
| 2024 | SFAM: Lightweight Spectrum Unreferenced Attention Network. Xuanhao Qi, Min Zhi, Yanjun Yin, Ping Ping, Yuening Zhang |
| 2024 | STDG: Semi-Teacher-Student Training Paradigm for Depth-guided One-stage Scene Graph Generation. Xukun Zhou, Zhenbo Song, Jun He, Hongyan Liu, Zhaoxin Fan |
| 2024 | SamCap: Energy-based Controllable Image Captioning by Gradient-Based Sampling. Yuchen Niu, Min Zhu, Zhihua Wei |
| 2024 | Secure Verification Encrypted Image Retrieval Scheme with Addition Homomorphic Bitmap Index. Mingyue Li, Yuting Zhu, Ruizhong Du, Chunfu Jia |
| 2024 | Self-Supervised Multi-Label Classification with Global Context and Local Attention. Chun-Yen Chen, Mei-Chen Yeh |
| 2024 | Semantic-guided RGB-Thermal Crowd Counting with Segment Anything Model. Yaqun Fang, Yi Shi, Jia Bei, Tongwei Ren |
| 2024 | Semi-Parametric Style Transfer with Multi-Perspective Feature Fusion and Information-Guided Alignment. Tianlong Zhang, Jing Lv, Ming Yang |
| 2024 | SkeletonFormer: Point Cloud Completion with Dynamic Selective Skeleton Points. Beiqi Liu, Fuqing Duan, Junli Zhao |
| 2024 | Sketch-aided Interactive Fusion Point Cloud Place Recognition. Ruonan Zhang, Xiaohang Liu, Ge Li, Thomas H. Li, Pengjun Zhao |
| 2024 | Smart Fitting Room: A One-stop Framework for Matching-aware Virtual Try-On. Mingzhe Yu, Yunshan Ma, Lei Wu, Kai Cheng, Xue Li, Lei Meng, Tat-Seng Chua |
| 2024 | Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation. Chenxiao Liu, Zheyong Xie, Sirui Zhao, Jin Zhou, Tong Xu, Minglei Li, Enhong Chen |
| 2024 | Subspace Clustering with A Hybrid Adaptive Graph Filter. Lai Wei, Mingyuan Xi |
| 2024 | TIM: Temporal Interaction Model in Notification System. Huxiao Ji, Haitao Yang, Linchuan Li, Shunyu Zhang, Cunyi Zhang, Xuanping Li, Wenwu Ou |
| 2024 | TWIST: Text-only Weakly Supervised Scene Text Spotting Using Pseudo Labels. Lilong Wen, Xiu Tang, Dongxiang Zhang |
| 2024 | Targeted Universal Adversarial Attack on Deep Hash Networks. Fanlei Meng, Xiangru Chen, Yuan Cao |
| 2024 | TeGA: A Text-Guided Generative-based Approach in Cheapfake Detection. Anh-Thu Le, Minh-Dat Nguyen, Minh-Son Dao, Anh-Duy Tran, Duc-Tien Dang-Nguyen |
| 2024 | Team HUGE: Image-Text Matching via Hierarchical and Unified Graph Enhancing. Bo Li, You Wu, Zhixin Li |
| 2024 | Text Adversarial Defense via Granular-Ball Sample Enhancement. Zeli Wang, Jian Li, Shuyin Xia, Longlong Lin, Guoyin Wang |
| 2024 | The First ACM Workshop on AI-Powered Question Answering Systems for Multimedia. Tai Tan Mai, Quang-Linh Tran, Ly-Duyen Tran, Van-Tu Ninh, Duc-Tien Dang-Nguyen, Cathal Gurrin |
| 2024 | The LLM Wrecking Ball: Are We About to Lose Decades of Work in Multimedia because of MM-LLMs? Alan F. Smeaton |
| 2024 | TriMPL: Masked Multi-Prompt Learning with Knowledge Mixing for Vision-Language Few-shot Learning. Xiangyu Liu, Yanlei Shang, Yong Chen |
| 2024 | Triadic Elastic Structure Representation for Open-Set Incremental 3D Object Retrieval. Yang Xu, Yifan Feng, Lin Bie |
| 2024 | TrustGo: Trust Mining and Multi-semantic Regularization in Social Recommendation. Shenghao Liu, Yuqin Lan, Xianjun Deng, Lingzhi Yi, Chenlu Zhu, Laurence T. Yang, Jong Hyuk Park |
| 2024 | UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos. Yuting Mei, Linli Yao, Qin Jin |
| 2024 | Unifying Pictorial and Textual Features for Screen Content Image Quality Evaluation. Yihua Chen, Xiaoping Liang, Mengzhu Yu, Zhenjun Tang |
| 2024 | Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict. Sherzod Hakimov, Gullal S. Cheema |
| 2024 | VEC-MNER: Hybrid Transformer with Visual-Enhanced Cross-Modal Multi-level Interaction for Multimodal NER. Pengfei Wei, Hongjun Ouyang, Qintai Hu, Bi Zeng, Guang Feng, Qingpeng Wen |
| 2024 | Vector-Aware Anisotropic Gauge Equivariant Mesh Convolution Network for 3D Aneurysm Detection. Xudong Ru, Haichuan Zhao, Xingce Wang, Zhongke Wu, Shaolong Liu, Yi-Cheng Zhu, Alejandro F. Frangi |
| 2024 | Visibility-guided Human Body Reconstruction from Uncalibrated Multi-view Cameras. Zhenyu Xie, Huanyu He, Gui Zou, Jie Wu, Guoliang Liu, Jun Zhao, Yingxue Wang, Hui Lin, Weiyao Lin |
| 2024 | When Handcrafted Filter Meets CNN: A Lightweight Conv-Filter Mixer Network for Efficient Image Super-Resolution. Zhijian Wu, Wenhui Liu, Dingjiang Huang |
| 2024 | Wireless Capsule Endoscope Low-light Image Enhancement with Balanced Brightness and Saturation. Wenzhuo Li, Yinghui Wang, Wei Li, Liangyi Huang, Kamoliddin Shukurov, Mingfeng Wang |
| 2024 | YawnNet: A Visual-Centric Approach for Yawning Detection. Ruoxi Sun, Xinyu Yang, Cong Qian, Chenyu Zhu, Wei Sui, Zeyd Boukhers, Cong Yang |