ICMR B

178 papers

Year Title / Authors
2024 3DMSE: An Interactive 3D Media Search Engine.
Maria Eirini Pegia, Dimitris Georgalis, Nick Pantelidis, Björn Þór Jónsson, Anastasia Moumtzidou, Sotiris Diplaris, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris
2024 A Causal View for Multi-Interest User Modeling in News Recommendation.
Mei Yu, Xiaoxi Zhou, Mankun Zhao, Tianyi Xu, Yue Zhao, Ruiguo Yu, Xuewei Li
2024 A GAN based Video Summarization Method with Representation Loss.
Zhuo Lei, Qiang Yu, Lidan Shou, Shengquan Li, Yunqing Mao
2024 A Generative Adaptive Context Learning Framework for Large Language Models in Cheapfake Detection.
Long-Khanh Pham, Hoa-Vien Vo-Hoang, Anh-Duy Tran
2024 A Graph Convolution Network with a POS-aware Filter and Context Enhancement Mechanism for Event Detection.
Xintao Jiao, Jiansheng Chen, Jiale Liu
2024 A Hybrid Few-Shot Image Classification Framework Combining Gaussian Modeling and Label Propagation.
Chao Ye, Qian Wang, Lanfang Dong
2024 A Knowledge-Driven Approach to Enhance Topic Modeling with Multi-Modal Representation Learning.
Hongzhang Mu, Shuili Zhang, Hongbo Xu
2024 A Lightweight Surface Defect Segmentation Network with External Semantics and High-frequency Information.
Tianpeng Zhang, Xuesong Jiang
2024 A Multi-Stage Deep Learning Approach Incorporating Text-Image and Image-Image Comparisons for Cheapfake Detection.
Jangwon Seo, Hyo-Seok Hwang, Jiyoung Lee, Minhyeok Lee, Wonsuk Kim, Junhee Seok
2024 A Novel Auxiliary Task Framework in 3D Human Pose Estimation for Opera Videos.
Xingquan Cai, Haoyu Zhang, Shanshan He, Haoyu Song, Haiyan Sun
2024 A Parallel Transformer Framework for Video Moment Retrieval.
Thao-Nhu Nguyen, Zongyao Li, Satoshi Yamazaki, Jianquan Liu, Cathal Gurrin
2024 A Sentimental Prompt Framework with Visual Text Encoder for Multimodal Sentiment Analysis.
Shizhou Huang, Bo Xu, Changqun Li, Jiabo Ye, Xin Lin
2024 A Unified Network for Detecting Out-Of-Context Information Using Generative Synthetic Data.
Van-Loc Nguyen, Bao-Tin Nguyen, Thanh-Son Nguyen, Duc-Tien Dang-Nguyen, Minh-Triet Tran
2024 A Web Demo Interface for Super-Resolution Reconstruction with Parametric Regularization Loss.
Supatta Viriyavisuthisakul, Parinya Sanguansat, Toshihiko Yamasaki
2024 ACR-Pose: Adversarial Canonical Representation Reconstruction Network for Category Level 6D Object Pose Estimation.
Zhaoxin Fan, Zhenbo Song, Zhicheng Wang, Jian Xu, Kejian Wu, Hongyan Liu, Jun He
2024 AI Batting Buddy: A Computational and Kinematic Approach for Enhancing Batting Performance and Analysis in Baseball.
Kuo-Yu Liu, Ting-Yu Guo, Ta-Shan Pan, Ping-Yi Tung, Yi-Rou Lin
2024 AI-SIPM 2024: International Workshop on Artificial Intelligence for Signal, Image Processing and Multimedia.
Mahasak Ketcham, Kanyalag Phodong, Patiyuth Pramkeaw, Worawut Yimyam, Narumol Chumuang, Pokpong Songmuang, Thittaporn Ganokratanaa
2024 AdOCTeRA: Adaptive Optimization Constraints for improved Text-guided Retrieval of Apartments.
Ali Abdari, Alex Falcon, Giuseppe Serra
2024 An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval.
Nico Hezel, Kai Uwe Barthel, Konstantin Schall, Klaus Jung
2024 Anchor-aware Deep Metric Learning for Audio-visual Retrieval.
Donghuo Zeng, Yanan Wang, Kazushi Ikeda, Yi Yu
2024 BFIDet: A YOLOv7-improved Vehicle and Pedestrian Detector via Balancing Feature Integration.
Anrui Wang, Libo Weng, Fei Gao
2024 BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval.
Kaixing Yang, Xukun Zhou, Xulong Tang, Ran Diao, Hongyan Liu, Jun He, Zhaoxin Fan
2024 Bringing Video Browsing to Virtual Reality: Empirical Evaluation of a Novel Multimedia Drawer.
Florian Spiess, Nicolas Scharowski, Ariane Haller, Zgjim Memeti, Heiko Schuldt, Florian Brühlmann
2024 CGI-MRE: A Comprehensive Genetic-Inspired Model For Multimodal Relation Extraction.
Pengfei Wei, Zhaokang Huang, Hongjun Ouyang, Qintai Hu, Bi Zeng, Guang Feng
2024 CLCP: Realtime Text-Image Retrieval for Retailing via Pre-trained Clustering and Priority Queue.
Shuyang Zhang, Liangwu Wei, Qingyu Wang, Yuntao Wei, Yanzhi Song
2024 CLIP-ProbCR: CLIP-based Probability embedding Combination Retrieval.
Mingyong Li, Zongwei Zhao, Xiaolong Jiang, Zheng Jiang
2024 CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection.
Sohail Ahmed Khan, Duc-Tien Dang-Nguyen
2024 CLTalk: Speech-Driven 3D Facial Animation with Contrastive Learning.
Xitie Zhang, Suping Wu
2024 CMFF-Face: Attention-Based Cross-Modal Feature Fusion for High-Quality Audio-Driven Talking Face Generation.
Guangzhe Zhao, Yanan Liu, Xueping Wang, Feihu Yan
2024 Calibration & Reconstruction: Deeply Integrated Language for Referring Image Segmentation.
Yichen Yan, Xingjian He, Sihan Chen, Jing Liu
2024 CarAI: Car Inspection with Artificial Intelligence.
Panumate Chetprayoon, Sakol Tasanangam, Gayatri Tirumalasetty, Thanatwit Angsarawanee, Paveen Virameteekul, Wadeepas Lertwatanawanich, Theerat Sakdejayont
2024 Causal Inference-based Few-Shot Class-Incremental Learning.
Weiwei Zhou, Guoqiang Xiao, Michael S. Lew, Song Wu
2024 CoDancers: Music-Driven Coherent Group Dance Generation with Choreographic Unit.
Kaixing Yang, Xulong Tang, Ran Diao, Hongyan Liu, Jun He, Zhaoxin Fan
2024 CodeDetector: Revealing Forgery Traces with Codebook for Generalized Deepfake Detection.
Jiaxin Li, Zhihan Yu, Guibo Luo, Yuesheng Zhu
2024 Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection.
Yang Liu, Tongfei Shen, Dong Zhang, Qingying Sun, Shoushan Li, Guodong Zhou
2024 Compact Visual Data Representation for Multimedia Search and Analytics.
Shiqi Wang, Xinfeng Zhang
2024 Component-Level Oracle Bone Inscription Retrieval.
Zhikai Hu, Yiu-ming Cheung, Yonggang Zhang, Peiying Zhang, Puiling Tang
2024 Content-Based Exclusion Queries in Keyword-Based Image Retrieval.
Eisaku Yoshikawa, Keishi Tajima
2024 Context or Clutter? Efficiently Matching Objects Across Scenes.
Albatool Wazzan, Imtiaz Ahmad, Stephen MacNeil, Richard Souvenir
2024 Contrastive Pre-training with Multi-level Alignment for Grounded Multimodal Named Entity Recognition.
Xigang Bao, Mengyuan Tian, Luyao Wang, Zhiyuan Zha, Biao Qin
2024 Conversational Image Search: A Sketch-based Approach.
Daniel D. Braghis, Haiming Liu
2024 Creating Sorted Grid Layouts with Gradient-based Optimization.
Kai Uwe Barthel, Florian Tim Barthel, Peter Eisert, Nico Hezel, Konstantin Schall
2024 Deep Image Clustering Based on Curriculum Learning and Density Information.
Haiyang Zheng, Ruilin Zhang, Hongpeng Wang
2024 Deep Scaling Factor Quantization Network for Large-scale Image Retrieval.
Ziqing Deng, Zhihui Lai, Yujuan Ding, Heng Kong, Xu Wu
2024 DeepEnhancer: Temporally Consistent Focal Transformer for Comprehensive Video Enhancement.
Qin Jiang, Qinglin Wang, Lihua Chi, Wentao Ma, Feng Li, Jie Liu
2024 Detecting Misinformation in Photos Utilizing Reverse Image Search.
Vinh Dang, Thanh-Son Nguyen, Minh-Triet Tran, Duc-Tien Dang-Nguyen
2024 Detecting Out-of-Context Media with LLaMa-Adapter V2 and RoBERTa: An Effective Method for Cheapfakes Detection.
Hoa-Vien Vo-Hoang, Long-Khanh Pham, Minh-Son Dao
2024 DiffHarmony: Latent Diffusion Model Meets Image Harmonization.
Pengfei Zhou, Fangxiang Feng, Xiaojie Wang
2024 Directly Locating Actions in Video with Single Frame Annotation.
Haoran Tong, Xinyan Liu, Guorong Li, Laiyun Qing
2024 Discovering Multi-Relational Integration for Knowledge Tracing with Retentive Networks.
Linhao Zhou, Sheng-hua Zhong, Zhijiao Xiao
2024 Diversity in Multimedia.
Yi-Ping Phoebe Chen
2024 DualStyle3D: Real-time Exemplar-based Artistic Portrait View Synthesis Based on Radiance Field.
Runlai Hao, Jinlong Li, Qiuju Chen, Huanhuan Chen
2024 Dynamic Segmentation for Efficient Retrieval of Podcasts: The Repping Algorithm.
Stephan Repp, Ernst Georg Haffner
2024 Dynamic Soft Labeling for Visual Semantic Embedding.
Jiaao Yu, Yunlai Ding, Junyu Dong, Yuezun Li
2024 ELSEIR: A Privacy-Preserving Large-Scale Image Retrieval Framework for Outsourced Data Sharing.
Zixin Tang, Haihui Fan, Xiaoyan Gu, Yang Li, Bo Li, Xin Wang
2024 End-to-End Thai Text-to-Speech with Linguistic Unit.
Kontawat Wisetpaitoon, Sattaya Singkul, Theerat Sakdejayont, Tawunrat Chalothorn
2024 Enhancing Cheapfake Detection: An Approach Using Prompt Engineering and Interleaved Text-Image Model.
Dang Vu, Minh-Nhat Nguyen, Quoc-Trung Nguyen
2024 Enhancing Class-Incremental Learning for Image Classification via Bidirectional Transport and Selective Momentum.
Feifei Fu, Yizhao Gao, Zhiwu Lu
2024 Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models.
Hongyi Zhu, Jia-Hong Huang, Stevan Rudinac, Evangelos Kanoulas
2024 Enhancing Visible-Infrared Person Re-identification with Modality- and Instance-aware Visual Prompt Learning.
Ruiqi Wu, Bingliang Jiao, Wenxuan Wang, Meng Liu, Peng Wang
2024 Exploiting Degradation Prior for Personalized Federated Learning in Real-World Image Super-Resolution.
Yue Yang, Liangjun Ke
2024 ExpoGenius: Robust Personalized Human Image Generation using Diffusion Model for Exposure Variation and Pose Transfer.
Depei Liu, Hongjie Fan, Junfei Liu
2024 Extending CLIP for Text-to-font Retrieval.
Qinghua Sun, Jia Cui, Zhenyu Gu
2024 FEST: A Multi-way Framework with Enhanced Spatial-Temporal Modeling for Traffic Forecasting.
Yilin Li, Tszyin Guo, Ying Qiao, Zitong Bo, Hongan Wang
2024 FaceX: Understanding Face Attribute Classifiers through Summary Model Explanations.
Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos, Christos Diou
2024 FedPAM: Federated Personalized Augmentation Model for Text-to-Image Retrieval.
Yueying Feng, Fan Ma, Wang Lin, Chang Yao, Jingyuan Chen, Yi Yang
2024 Federated Multi-Task Learning on Non-IID Data Silos: An Experimental Study.
Yuwen Yang, Yuxiang Lu, Suizhi Huang, Shalayiding Sirejiding, Hongtao Lu, Yue Ding
2024 Fine-Tuning Large Language Models for Private Document Retrieval: A Tutorial.
Frank Sommers, Alisa Kongthon, Sarawoot Kongyoung
2024 Fine-grained Semantics-aware Representation Learning for Text-based Person Retrieval.
Di Wang, Feng Yan, Yifeng Wang, Lin Zhao, Xiao Liang, Haodi Zhong, Ronghua Zhang
2024 Fingerprinting in EEG Model IP Protection Using Diffusion Model.
Tianyi Wang, Shenghua Zhong
2024 G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning.
Ruiting Dai, Yuqiao Tan, Lisi Mo, Shuang Liang, Guohao Huo, Jiayi Luo, Yao Cheng
2024 GB
Zeli Wang, Tuo Zhang, Shuyin Xia, Longlong Lin, Guoyin Wang
2024 GSD-GNN: Generalizable and Scalable Algorithms for Decoupled Graph Neural Networks.
Yunfeng Yu, Longlong Lin, Qiyu Liu, Zeli Wang, Xi Ou, Tao Jia
2024 Generative Data Augmentation with Liveness Information Preserving for Face Anti-Spoofing.
Changgu Chen, Yang Li, Jian Zhang, Jiali Liu, Changbo Wang
2024 HashNeck is a Boosting Tool for Deep Learning to Hashing.
Hua Gao, Chenchen Hu, Guang Han, Jiafa Mao, Wei Huang, Kaiyuan Wan
2024 HybridHash: Hybrid Convolutional and Self-Attention Deep Hashing for Image Retrieval.
Chao He, Hongxi Wei
2024 ICDAR 24: Intelligent Cross-Data Analysis and Retrieval.
Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Hanh-Nhi Tran, Rage Uday Kiran, Takahiro Komamizu
2024 Identification of Speaker Roles and Situation Types in News Videos.
Gullal S. Cheema, Judi Arafat, Chiao-I Tseng, John A. Bateman, Ralph Ewerth, Eric Müller-Budack
2024 Image-to-Point Registration via Cross-Modality Correspondence Retrieval.
Lin Bie, Siqi Li, Kai Cheng
2024 Improve Deep Hashing with Language Guidance for Unsupervised Image Retrieval.
Chuang Zhao, Hefei Ling, Shijie Lu, Yuxuan Shi, Jiazhong Chen, Ping Li
2024 Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning.
Yuhang Zheng, Zhen Wang, Long Chen
2024 Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank.
Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan
2024 Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement.
Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng
2024 Intra and Inter-modality Incongruity Modeling and Adversarial Contrastive Learning for Multimodal Fake News Detection.
Siqi Wei, Bin Wu
2024 Introduction to the Seventh Annual Lifelog Search Challenge, LSC'24.
Cathal Gurrin, Liting Zhou, Graham Healy, Werner Bailer, Duc-Tien Dang-Nguyen, Steve Hodges, Björn Þór Jónsson, Jakub Lokoc, Luca Rossetto, Minh-Triet Tran, Klaus Schöffmann
2024 Knowledge Distillation for Single Image Super-Resolution via Contrastive Learning.
Cencen Liu, Dongyang Zhang, Ke Qin
2024 Known-Item Search in Video: An Eye Tracking-Based Study.
Lucas Joos, Bastian Jäckl, Daniel A. Keim, Maximilian T. Fischer, Ladislav Peska, Jakub Lokoc
2024 Learning from Reduced Labels for Long-Tailed Data.
Meng Wei, Zhongnian Li, Yong Zhou, Xinzheng Xu
2024 Lifelong Visible-Infrared Person Re-Identification via a Tri-Token Transformer with a Query-Key Mechanism.
Yitong Xing, Guoqiang Xiao, Michael S. Lew, Song Wu
2024 Local Deep Learning Quantization for Approximate Nearest Neighbor Search.
Quan Li, Xike Xie, Chao Wang, Jiali Weng
2024 Low-Light Image Enhancement via Weighted Low-Rank Tensor Regularized Retinex Model.
Weipeng Yang, Hongxia Gao, Wenbin Zou, Tongtong Liu, Shasha Huang, Jianliang Ma
2024 MAD '24 Workshop: Multimedia AI against Disinformation.
Cristian Lucian Stanciu, Bogdan Ionescu, Luca Cuccovillo, Symeon Papadopoulos, Giorgos Kordopatis-Zilos, Adrian Popescu, Roberto Caldelli
2024 MFVG: A Visual Grounding Network with Multi-scale Fusion.
Peijia Chen, Ke Qi, Xi Tao, Wenhao Xu, Jingdong Zhang
2024 ML
Ziyu Gong, Chengcheng Mai, Yihua Huang
2024 MORE'24 Multimedia Object Re-ID: Advancements, Challenges, and Opportunities.
Zhedong Zheng, Yaxiong Wang, Xuelin Qian, Zhun Zhong, Zheng Wang, Liang Zheng
2024 MSI: Multi-modal Recommendation via Superfluous Semantics Discarding and Interaction Preserving.
Yi Li, Qingmeng Zhu, Changwen Zheng, Jiangmeng Li
2024 MUWS 2024: The 3rd International Workshop on Multimodal Human Understanding for the Web and Social Media.
Marc A. Kastner, Gullal S. Cheema, Sherzod Hakimov, Noa Garcia
2024 MVRMLM 2024: Multimodal Video Retrieval and Multimodal Language Modelling.
Hui Wang, Josef Kittler, Mark J. F. Gales, Rob Cooper, Maurice D. Mulvenna, Wing W. Y. Ng, Yang Hua, Richard Gault, Abbas Haider, Guanfeng Wu
2024 Mapping the Audio Landscape for Innovative Music Sample Generation.
Christian Limberg, Zhe Zhang
2024 MarginFinger: Controlling Generated Fingerprint Distance to Classification boundary Using Conditional GANs.
Weixing Liu, Shenghua Zhong
2024 MemoriLens: a Low-cost Lifelog Camera Using Raspberry Pi Zero.
Quang-Linh Tran, Binh T. Nguyen, Gareth J. F. Jones, Cathal Gurrin
2024 Modality-specific and -shared Contrastive Learning for Sentiment Analysis.
Dahuang Liu, Jiuxiang You, Guobo Xie, Lap-Kei Lee, Fu Lee Wang, Zhenguo Yang
2024 Modeling Multi-Task Joint Training of Aggregate Networks for Multi-Modal Sarcasm Detection.
Lisong Ou, Zhixin Li
2024 Monocular Expressive 3D Human Reconstruction of Multiple People.
Zhenghao Zhao, Hao Tang, Joy Wan, Yan Yan
2024 Multi-Source Augmentation and Composite Prompts for Visual Recognition with Missing Modality.
Zhirui Kuai, Yulu Zhou, Qi Xie, Li Kuang
2024 Multi-modal Entity Alignment via Position-enhanced Multi-label Propagation.
Wei Tang, Yuanyi Wang
2024 Multi-modal Video Summarization.
Jia-Hong Huang
2024 Multi-view Counterfactual Contrastive Learning for Fact-checking Fake News Detection.
Yongcheng Zhang, Lingou Kong, Sheng Tian, Hao Fei, Changpeng Xiang, Huan Wang, Xiaomei Wei
2024 Multi-view Subspace Clustering via An Adaptive Consensus Graph Filter.
Lai Wei, Shanshan Song
2024 Multidimensional Semantic Disentanglement Network for Clothes-Changing Person Re-Identification.
Yongkang Ding, Anqi Wang, Liyan Zhang
2024 Multimedia Retrieval in and for XR.
Maria Pegia, Sotiris Diplaris, Stefanos Vrochidis, Heiko Schuldt, Florian Spiess, Rahel Arnold, Werner Bailer
2024 Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition.
Xinzhe Ni, Yong Liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang
2024 Multimodality in Media Retrieval.
Maria Eirini Pegia
2024 Navigating Style Variations in Scene Text Image Super-Resolution through Multi-Scale Perception.
Feifei Xu, Ziheng Yu
2024 Near-Miss Accident Prediction on the Edge: A Real-Time System for Safer Driving.
Minh-Son Dao, Koji Zettsu
2024 NeurNCD: Novel Class Discovery via Implicit Neural Representation.
Junming Wang, Yi Shi
2024 Neural Parametric Human Hand Modeling with Point Cloud Representation.
Jian Yang, Weize Quan, Zhen Shen, Dong-Ming Yan, Huaiyu Wu
2024 Octree-Retention Fusion: A High-Performance Context Model for Point Cloud Geometry Compression.
Zhikang Zhang, Zhongjie Zhu, Yongqiang Bai, Ming Wang, Zhijing Yu
2024 OpenLifelogCam - A Low-Cost Open-Source Wearable Camera Platform.
Luca Rossetto
2024 Overview of the Grand Challenge on Detecting Cheapfakes at ACM ICMR 2024.
Duc-Tien Dang-Nguyen, Sohail Ahmed Khan, Michael Riegler, Pål Halvorsen, Anh-Duy Tran, Minh-Son Dao, Minh-Triet Tran
2024 PTAN: Principal Token-aware Adjacent Network for Compositional Temporal Grounding.
Zhuoyuan Wei, Xun Jiang, Zheng Wang, Fumin Shen, Xing Xu
2024 Parametric CAD Primitive Retrieval via Multi-Modal Fusion and Deep Hashing.
Minyang Xu, Yunzhong Lou, Weijian Ma, Xueyang Li, Xiangdong Zhou
2024 Pattern4Ego: Learning Egocentric Video Representation Using Cross-video Activity Patterns.
Ruihai Wu, Yourong Zhang, Yu Qi, Andy Guanhong Chen, Hao Dong
2024 PiCoGen: Generate Piano Covers with a Two-stage Approach.
Chih-Pin Tan, Shuen-Huei Guan, Yi-Hsuan Yang
2024 PoseRec: 3D Human Pose Driven Online Advertisement Recommendation for Micro-videos.
Zhaoxin Fan, Fengxin Li, Hongyan Liu, Jun He, Xiaoyong Du
2024 Proactive Privacy and Intellectual Property Protection of Multimedia Retrieval Models in Edge Intelligence.
Peihao Li, Jie Huang, Shuaishuai Zhang, Chunyang Qi
2024 Proceedings of the 2024 International Conference on Multimedia Retrieval, ICMR 2024, Phuket, Thailand, June 10-14, 2024
Cathal Gurrin, Rachada Kongkachandra, Klaus Schoeffmann, Duc-Tien Dang-Nguyen, Luca Rossetto, Shin'ichi Satoh, Liting Zhou
2024 Progressive Multi-modal Conditional Prompt Tuning.
Xiaoyu Qiu, Hao Feng, Yuechen Wang, Wengang Zhou, Houqiang Li
2024 Prompt Expending for Single Positive Multi-Label Learning with Global Unannotated Categories.
Zhongnian Li, Peng Ying, Meng Wei, Tongfeng Sun, Xinzheng Xu
2024 Pseudo Content Hallucination for Unpaired Image Captioning.
Huixia Ben, Shuo Wang, Meng Wang, Richang Hong
2024 Pyramidal Cross-Modal Transformer with Sustained Visual Guidance for Multi-Label Image Classification.
Zhuohua Li, Ruyun Wang, Fuqing Zhu, Jizhong Han, Songlin Hu
2024 QAVidCap: Enhancing Video Captioning through Question Answering Techniques.
Hui Liu, Xiaojun Wan
2024 RE-IDVIS: Person Re-Identification System based on Interactive Visualization.
Wang Xia, Guodao Sun, Zihao Zhu, Pan Liang, Sujia Zhu, Yiming Wu, Haoran Liang, Ronghua Liang
2024 RGB-D Video Object Segmentation via Enhanced Multi-store Feature Memory.
Boyue Xu, Ruichao Hou, Tongwei Ren, Gangshan Wu
2024 Reconciling the Rift Between Recognition and Recall: Insights from a Video Memorability Drawing Experiment.
Lorin Sweeney, Graham Healy, Alan F. Smeaton
2024 Refracting Once is Enough: Neural Radiance Fields for Novel-View Synthesis of Real Refractive Objects.
Xiaoqian Liang, Jianji Wang, Yuanliang Lu, Xubin Duan, Xichun Liu, Nanning Zheng
2024 Reproducibility Companion Paper of "MMSF: A Multimodal Sentiment-Fused Method to Recognize Video Speaking Style".
Fan Yu, Beibei Zhang, Yaqun Fang, Jia Bei, Tongwei Ren, Jiyi Li, Luca Rossetto
2024 Reproducibility Companion Paper: Recommendation of Mix-and-Match Clothing by Modeling Indirect Personal Compatibility.
Shuiying Liao, Yujuan Ding, P. Y. Mok, Qiushi Huang, Jialun Cao
2024 Reproducibility Companion Paper: Stable Diffusion for Content-Style Disentanglement in Art Analysis.
Yankun Wu, Yuta Nakashima, Noa Garcia, Sheng Li, Zhaoyang Zeng
2024 Research on Epilepsy Classification Model Based on Variational Mode Quadratic Decomposition.
Chen Huang, Zhijun Fan, Kui Xiao, Yan Zhang, Shihui Wang, Jianhua Song, Wei Wu, Chao Liu
2024 Retrieval-Augmented Audio Deepfake Detection.
Zuheng Kang, Yayun He, Botao Zhao, Xiaoyang Qu, Junqing Peng, Jing Xiao, Jianzong Wang
2024 RetrievalMMT: Retrieval-Constrained Multi-Modal Prompt Learning for Multi-Modal Machine Translation.
Yan Wang, Yawen Zeng, Junjie Liang, Xiaofen Xing, Jin Xu, Xiangmin Xu
2024 Retrieving Emotional Stimuli in Artworks.
Tianwei Chen, Noa Garcia, Liangzhi Li, Yuta Nakashima
2024 Robust Video Hashing with Non-negative Tensor Factorization for Copy Detection.
Mengzhu Yu, Zhenjun Tang, Huijiang Zhuang, Xiaoping Liang, Zhixin Li, Xianquan Zhang
2024 S2F-Net: Shared-Specific Fusion Network for Infrared and Visible Image Fusion.
Yijing Zhao, Yuchao Xia, Yi Ding, Yumeng Liu, Shuai Liu, Hongan Wang
2024 SBCR: Stochasticity Beats Content Restriction Problem in Training and Tuning Free Image Editing.
Jiancheng Huang, Mingfu Yan, Yifan Liu, Shifeng Chen
2024 SFAM: Lightweight Spectrum Unreferenced Attention Network.
Xuanhao Qi, Min Zhi, Yanjun Yin, Ping Ping, Yuening Zhang
2024 STDG: Semi-Teacher-Student Training Paradigm for Depth-guided One-stage Scene Graph Generation.
Xukun Zhou, Zhenbo Song, Jun He, Hongyan Liu, Zhaoxin Fan
2024 SamCap: Energy-based Controllable Image Captioning by Gradient-Based Sampling.
Yuchen Niu, Min Zhu, Zhihua Wei
2024 Secure Verification Encrypted Image Retrieval Scheme with Addition Homomorphic Bitmap Index.
Mingyue Li, Yuting Zhu, Ruizhong Du, Chunfu Jia
2024 Self-Supervised Multi-Label Classification with Global Context and Local Attention.
Chun-Yen Chen, Mei-Chen Yeh
2024 Semantic-guided RGB-Thermal Crowd Counting with Segment Anything Model.
Yaqun Fang, Yi Shi, Jia Bei, Tongwei Ren
2024 Semi-Parametric Style Transfer with Multi-Perspective Feature Fusion and Information-Guided Alignment.
Tianlong Zhang, Jing Lv, Ming Yang
2024 SkeletonFormer: Point Cloud Completion with Dynamic Selective Skeleton Points.
Beiqi Liu, Fuqing Duan, Junli Zhao
2024 Sketch-aided Interactive Fusion Point Cloud Place Recognition.
Ruonan Zhang, Xiaohang Liu, Ge Li, Thomas H. Li, Pengjun Zhao
2024 Smart Fitting Room: A One-stop Framework for Matching-aware Virtual Try-On.
Mingzhe Yu, Yunshan Ma, Lei Wu, Kai Cheng, Xue Li, Lei Meng, Tat-Seng Chua
2024 Speak From Heart: An Emotion-Guided LLM-Based Multimodal Method for Emotional Dialogue Generation.
Chenxiao Liu, Zheyong Xie, Sirui Zhao, Jin Zhou, Tong Xu, Minglei Li, Enhong Chen
2024 Subspace Clustering with A Hybrid Adaptive Graph Filter.
Lai Wei, Mingyuan Xi
2024 TIM: Temporal Interaction Model in Notification System.
Huxiao Ji, Haitao Yang, Linchuan Li, Shunyu Zhang, Cunyi Zhang, Xuanping Li, Wenwu Ou
2024 TWIST: Text-only Weakly Supervised Scene Text Spotting Using Pseudo Labels.
Lilong Wen, Xiu Tang, Dongxiang Zhang
2024 Targeted Universal Adversarial Attack on Deep Hash Networks.
Fanlei Meng, Xiangru Chen, Yuan Cao
2024 TeGA: A Text-Guided Generative-based Approach in Cheapfake Detection.
Anh-Thu Le, Minh-Dat Nguyen, Minh-Son Dao, Anh-Duy Tran, Duc-Tien Dang-Nguyen
2024 Team HUGE: Image-Text Matching via Hierarchical and Unified Graph Enhancing.
Bo Li, You Wu, Zhixin Li
2024 Text Adversarial Defense via Granular-Ball Sample Enhancement.
Zeli Wang, Jian Li, Shuyin Xia, Longlong Lin, Guoyin Wang
2024 The First ACM Workshop on AI-Powered Question Answering Systems for Multimedia.
Tai Tan Mai, Quang-Linh Tran, Ly-Duyen Tran, Van-Tu Ninh, Duc-Tien Dang-Nguyen, Cathal Gurrin
2024 The LLM Wrecking Ball: Are We About to Lose Decades of Work in Multimedia because of MM-LLMs?
Alan F. Smeaton
2024 TriMPL: Masked Multi-Prompt Learning with Knowledge Mixing for Vision-Language Few-shot Learning.
Xiangyu Liu, Yanlei Shang, Yong Chen
2024 Triadic Elastic Structure Representation for Open-Set Incremental 3D Object Retrieval.
Yang Xu, Yifan Feng, Lin Bie
2024 TrustGo: Trust Mining and Multi-semantic Regularization in Social Recommendation.
Shenghao Liu, Yuqin Lan, Xianjun Deng, Lingzhi Yi, Chenlu Zhu, Laurence T. Yang, Jong Hyuk Park
2024 UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos.
Yuting Mei, Linli Yao, Qin Jin
2024 Unifying Pictorial and Textual Features for Screen Content Image Quality Evaluation.
Yihua Chen, Xiaoping Liang, Mengzhu Yu, Zhenjun Tang
2024 Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict.
Sherzod Hakimov, Gullal S. Cheema
2024 VEC-MNER: Hybrid Transformer with Visual-Enhanced Cross-Modal Multi-level Interaction for Multimodal NER.
Pengfei Wei, Hongjun Ouyang, Qintai Hu, Bi Zeng, Guang Feng, Qingpeng Wen
2024 Vector-Aware Anisotropic Gauge Equivariant Mesh Convolution Network for 3D Aneurysm Detection.
Xudong Ru, Haichuan Zhao, Xingce Wang, Zhongke Wu, Shaolong Liu, Yi-Cheng Zhu, Alejandro F. Frangi
2024 Visibility-guided Human Body Reconstruction from Uncalibrated Multi-view Cameras.
Zhenyu Xie, Huanyu He, Gui Zou, Jie Wu, Guoliang Liu, Jun Zhao, Yingxue Wang, Hui Lin, Weiyao Lin
2024 When Handcrafted Filter Meets CNN: A Lightweight Conv-Filter Mixer Network for Efficient Image Super-Resolution.
Zhijian Wu, Wenhui Liu, Dingjiang Huang
2024 Wireless Capsule Endoscope Low-light Image Enhancement with Balanced Brightness and Saturation.
Wenzhuo Li, Yinghui Wang, Wei Li, Liangyi Huang, Kamoliddin Shukurov, Mingfeng Wang
2024 YawnNet: A Visual-Centric Approach for Yawning Detection.
Ruoxi Sun, Xinyu Yang, Cong Qian, Chenyu Zhu, Wei Sui, Zeyd Boukhers, Cong Yang