ACM Multimedia A*

1233 papers

Year Title / Authors
2024"Special Relativity" of Image Aesthetics Assessment: a Preliminary Empirical Perspective.
Rui Xie, Anlong Ming, Shuai He, Yi Xiao, Huadong Ma
2024 1M-Deepfakes Detection Challenge.
Zhixi Cai, Abhinav Dhall, Shreya Ghosh, Munawar Hayat, Dimitrios Kollias, Kalin Stefanov, Usman Tariq
2024 256 Metaverse Records Dataset.
Patrick Steinert, Stefan Wagenpfeil, Ingo Frommholz, Matthias L. Hemmje
2024 2M-AF: A Strong Multi-Modality Framework For Human Action Quality Assessment with Self-supervised Representation Learning.
Yuning Ding, Sifan Zhang, Shenglan Liu, Jinrong Zhang, Wenyue Chen, Haifei Duan, Bingcheng Dong, Tao Sun
2024 3D Gaussian Editing with A Single Image.
Guan Luo, Tian-Xing Xu, Ying-Tian Liu, Xiaoxiong Fan, Fang-Lue Zhang, Song-Hai Zhang
2024 3D Human Pose Estimation from Multiple Dynamic Views via Single-view Pretraining with Procrustes Alignment.
Renshu Gu, Jiajun Zhu, Yixuan Si, Fei Gao, Jiamin Xu, Gang Xu
2024 3D Priors-Guided Diffusion for Blind Face Restoration.
Xiaobin Lu, Xiaobin Hu, Jun Luo, Ben Zhu, Yaping Ruan, Wenqi Ren
2024 3D Question Answering for City Scene Understanding.
Penglei Sun, Yaoxian Song, Xiang Liu, Xiaofei Yang, Qiang Wang, Tiefeng Li, Yang Yang, Xiaowen Chu
2024 3D Question Answering with Scene Graph Reasoning.
Zizhao Wu, Haohan Li, Gongyi Chen, Zhou Yu, Xiaoling Gu, Yigang Wang
2024 3D Reconstruction and Novel View Synthesis of Indoor Environments Based on a Dual Neural Radiance Field.
Zhenyu Bao, Guibiao Liao, Zhongyuan Zhao, Kanglin Liu, Qing Li, Guoping Qiu
2024 3D Scene De-occlusion in Neural Radiance Fields: A Framework for Obstacle Removal and Realistic Inpainting.
Yi Liu, Xinyi Li, Wenjing Shuai
2024 3D-GRES: Generalized 3D Referring Expression Segmentation.
Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji
2024 3DPCP-Net: A Lightweight Progressive 3D Correspondence Pruning Network for Accurate and Efficient Point Cloud Registration.
Jingtao Wang, Zechao Li
2024 4D Gaussian Splatting with Scale-aware Residual Field and Adaptive Optimization for Real-time Rendering of Temporally Complex Dynamic Scenes.
Jinbo Yan, Rui Peng, Luyang Tang, Ronggang Wang
2024A Chinese Multimodal Social Video Dataset for Controversy Detection.
Tianjiao Xu, Aoxuan Chen, Yuxi Zhao, Jinfei Gao, Tian Gan
2024A Coarse to Fine Detection Method for Prohibited Object in X-ray Images Based on Progressive Transformer Decoder.
Chunjie Ma, Lina Du, Zan Gao, Li Zhuo, Meng Wang
2024A Descriptive Basketball Highlight Dataset for Automatic Commentary Generation.
Benhui Zhang, Junyu Gao, Yuan Yuan
2024A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness.
Lutao Jiang, Hangyu Li, Lin Wang
2024A Lightweight Anchor-Based Incremental Framework for Multi-view Clustering.
Qian Qu, Xinhang Wan, Weixuan Liang, Jiyuan Liu, Yu Feng, Huiying Xu, Xinwang Liu, En Zhu
2024A Lightweight Multi-domain Multi-attention Progressive Network for Single Image Deraining.
Junliu Zhong, Zhiyi Li, Dan Xiang, Maotang Han, Changsheng Li, Yanfen Gan
2024A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models.
Wenxuan Yang, Weimin Tan, Yuqi Sun, Bo Yan
2024A Method for Visual Spatial Description Based on Large Language Model Fine-tuning.
Jiabao Wang, Fang Gao, Jingfeng Tang, Shaodong Li, Hanbo Zheng, Shengheng Ma, Feng Shuang, Jun Yu
2024A Multi-scale Feature Learning Network with Optical Flow Correction for Micro- and Macro-expression Spotting.
Zhengye Zhang, Sirui Zhao, Xinglong Mao, Shifeng Liu, Hao Wang, Tong Xu, Enhong Chen
2024A Multilevel Guidance-Exploration Network and Behavior-Scene Matching Method for Human Behavior Anomaly Detection.
Guoqing Yang, Zhiming Luo, Jianzhe Gao, Yingxin Lai, Kun Yang, Yifan He, Shaozi Li
2024A Novel Confidence Guided Training Method for Conditional GANs with Auxiliary Classifier.
Qi Chen, Wenjie Liu, Hu Ding
2024A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion.
Zihan Cao, Xiao Wu, Liang-Jian Deng, Yu Zhong
2024A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning.
Changmeng Zheng, Dayong Liang, Wengyu Zhang, Xiaoyong Wei, Tat-Seng Chua, Qing Li
2024A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap.
Lijun Zhang, Wei Suo, Peng Wang, Yanning Zhang
2024A Principled Approach to Natural Language Watermarking.
Zhe Ji, Qiansiqi Hu, Yicheng Zheng, Liyao Xiang, Xinbing Wang
2024A Progressive Skip Reasoning Fusion Method for Multi-Modal Classification.
Qian Guo, Xinyan Liang, Yuhua Qian, Zhihua Cui, Jie Wen
2024A Sample-driven Selection Framework: Towards Graph Contrastive Networks with Reinforcement Learning.
Xiangping Zheng, Xiuxin Hao, Bo Wu, Xigang Bao, Xuan Zhang, Wei Li, Xun Liang
2024A Simple and Provable Approach for Learning on Noisy Labeled Medical Images.
Nan Wang, Zonglin Di, Houlin He, Qingchao Jiang, Xiaoxiao Li
2024A Solution to ACMMM 2024 on Artificial Intelligence Generated Image Detection.
Shihang Li, Haishan Wu, Biao Wang
2024A Synopsis of FAME 2024 Challenge: Associating Faces with Voices in Multilingual Environments.
Muhammad Saad Saeed, Shah Nawaz, Marta Moscati, Rohan Kumar Das, Muhammad Salman Tahir, Muhammad Zaigham Zaheer, Muhammad Irzam Liaqat, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf, Markus Schedl
2024A Unified Understanding of Adversarial Vulnerability Regarding Unimodal Models and Vision-Language Pre-training Models.
Haonan Zheng, Xinyang Deng, Wen Jiang, Wenrui Li
2024A Unimodal Valence-Arousal Driven Contrastive Learning Framework for Multimodal Multi-Label Emotion Recognition.
Wenjie Zheng, Jianfei Yu, Rui Xia
2024ACM Multimedia 2024 Grand Challenge Report for Artificial Intelligence Generated Image Detection.
Shien Song, Jie Yang, Jin Chen, Han Qi, Yifei Xue, Yizhen Lao, Yi Yu
2024ADDG: An Adaptive Domain Generalization Framework for Cross-Plane MRI Segmentation.
Zibo Ma, Bo Zhang, Zheng Zhang, Wu Liu, Wufan Wang, Hui Gao, Wendong Wang
2024AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models.
Yifei Gao, Jiaqi Wang, Zhiyu Lin, Jitao Sang
2024AL-GTD: Deep Active Learning for Gaze Target Detection.
Francesco Tonini, Nicola Dall'Asen, Lorenzo Vaquero, Cigdem Beyan, Elisa Ricci
2024AMG-Embedding: A Self-Supervised Embedding Approach for Audio Identification.
Yuhang Su, Wei Hu, Fan Zhang, Qiming Xu
2024ANFluid: Animate Natural Fluid Photos base on Physics-Aware Simulation and Dual-Flow Texture Learning.
Xiangcheng Zhai, Yingqi Jie, Xueguang Xie, Aimin Hao, Na Jiang, Yang Gao
2024APP: Adaptive Pose Pooling for 3D Human Pose Estimation from Videos.
Jinyan Zhang, Mengyuan Liu, Hong Liu, Guoquan Wang, Wenhao Li
2024ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos.
Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li
2024AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset.
Zhixi Cai, Shreya Ghosh, Aman Pankaj Adatia, Munawar Hayat, Abhinav Dhall, Tom Gedeon, Kalin Stefanov
2024AVHash: Joint Audio-Visual Hashing for Video Retrieval.
Yuxiang Zhou, Zhe Sun, Rui Liu, Yong Chen, Dell Zhang
2024AbsGS: Recovering Fine Details in 3D Gaussian Splatting.
Zongxin Ye, Wenyu Li, Sidun Liu, Peng Qiao, Yong Dou
2024Accurate and Lightweight Learning for Specific Domain Image-Text Retrieval.
Rui Yang, Shuang Wang, Jianwei Tao, Yingping Han, Qiaoling Lin, Yanhe Guo, Biao Hou, Licheng Jiao
2024Achieving Resolution-Agnostic DNN-based Image Watermarking: A Novel Perspective of Implicit Neural Representation.
Yuchen Wang, Xingyu Zhu, Guanhui Ye, Shiyao Zhang, Xuetao Wei
2024Ada-iD: Active Domain Adaptation for Intrusion Detection.
Fujun Han, Peng Ye, Shukai Duan, Lidan Wang
2024Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition.
Cam-Van Thi Nguyen, The-Son Le, Anh-Tuan Mai, Duc-Trong Le
2024AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering.
Mahiro Ukai, Shuhei Kurita, Atsushi Hashimoto, Yoshitaka Ushiku, Nakamasa Inoue
2024AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition.
Meiqi Cao, Rui Yan, Xiangbo Shu, Guangzhao Dai, Yazhou Yao, Guo-Sen Xie
2024AdapMTL: Adaptive Pruning Framework for Multitask Learning Model.
Mingcan Xiang, Jiaxun Tang, Qizheng Yang, Hui Guan, Tongping Liu
2024Adaptive Hierarchical Aggregation for Federated Object Detection.
Ruofan Jia, Weiying Xie, Jie Lei, Yunsong Li
2024Adaptive Instance-wise Multi-view Clustering.
Shudong Huang, Hecheng Cai, Hao Dai, Wentao Feng, Jiancheng Lv
2024Adaptive Multi-Modality Prompt Learning.
Zongqian Wu, Yujing Liu, Mengmeng Zhan, Ping Hu, Xiaofeng Zhu
2024Adaptive Pruning of Channel Spatial Dependability in Convolutional Neural Networks.
Weiying Xie, Mei Yuan, Jitao Ma, Yunsong Li
2024Adaptive Query Selection for Camouflaged Instance Segmentation.
Bo Dong, Pichao Wang, Hao Luo, Fan Wang
2024Adaptive Selection based Referring Image Segmentation.
Pengfei Yue, Jianghang Lin, Shengchuan Zhang, Jie Hu, Yilin Lu, Hongwei Niu, Haixin Ding, Yan Zhang, Guannan Jiang, Liujuan Cao, Rongrong Ji
2024Adaptive Vision Transformer for Event-Based Human Pose Estimation.
Nannan Yu, Tao Ma, Jiqing Zhang, Yuji Zhang, Qirui Bao, Xiaopeng Wei, Xin Yang
2024Adaptively Building a Video-language Model for Video Captioning and Retrieval without Massive Video Pretraining.
Zihao Liu, Xiaoyu Wu, Shengjin Wang, Jiayao Qian
2024Addressing Imbalance for Class Incremental Learning in Medical Image Classification.
Xuze Hao, Wenqian Ni, Xuhao Jiang, Weimin Tan, Bo Yan
2024AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning.
Xin Wang, Kai Chen, Xingjun Ma, Zhineng Chen, Jingjing Chen, Yu-Gang Jiang
2024Advancing 3D Object Grounding Beyond a Single 3D Scene.
Wencan Huang, Daizong Liu, Wei Hu
2024Advancing Generalized Deepfake Detector with Forgery Perception Guidance.
Ruiyang Xia, Dawei Zhou, Decheng Liu, Lin Yuan, Shuodi Wang, Jie Li, Nannan Wang, Xinbo Gao
2024Advancing Micro-Action Recognition with Multi-Auxiliary Heads and Hybrid Loss Optimization.
Qiankun Li, Xiaolong Huang, Huabao Chen, Feng He, Qiupu Chen, Zengfu Wang
2024Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training.
Yiming Li, Zhifang Guo, Xiangdong Wang, Hong Liu
2024Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation.
Jingjing Xie, Yuxin Zhang, Mingbao Lin, Liujuan Cao, Rongrong Ji
2024Advancing Prompt Learning through an External Layer.
Fangming Cui, Xun Yang, Chao Wu, Liang Xiao, Xinmei Tian
2024Advancing Quantization Steps Estimation: A Two-Stream Network Approach for Enhancing Robustness.
Xin Cheng, Hao Wang, Jinwei Wang, Xiangyang Luo, Bin Ma
2024Advancing Semantic Edge Detection through Cross-Modal Knowledge Learning.
Ruoxi Deng, Bin Yu, Jinxuan Lu, Caixia Zhou, Zhao-Min Chen, Jie Hu
2024Adversarial Example Quality Assessment: A Large-scale Dataset and Strong Baseline.
Jia-Li Yin, Menghao Chen, Jin Han, Bo-Hao Chen, Ximeng Liu
2024Adversarial Experts Model for Black-box Domain Adaptation.
Siying Xiao, Mao Ye, Qichen He, Shuaifeng Li, Song Tang, Xiatian Zhu
2024AerialGait: Bridging Aerial and Ground Views for Gait Recognition.
Aoqi Li, Saihui Hou, Chenye Wang, Qingyuan Cai, Yongzhen Huang
2024AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception.
Yipo Huang, Xiangfei Sheng, Zhichao Yang, Quan Yuan, Zhichao Duan, Pengfei Chen, Leida Li, Weisi Lin, Guangming Shi
2024AesMamba: Universal Image Aesthetic Assessment with State Space Models.
Fei Gao, Yuhao Lin, Jiaqi Shi, Maoying Qiao, Nannan Wang
2024AesStyler: Aesthetic Guided Universal Style Transfer.
Ran Yi, Haokun Zhu, Teng Hu, Yu-Kun Lai, Paul L. Rosin
2024Affinity3D: Propagating Instance-Level Semantic Affinity for Zero-Shot Point Cloud Semantic Segmentation.
Haizhuang Liu, Junbao Zhuo, Chen Liang, Jiansheng Chen, Huimin Ma
2024Agent Aggregator with Mask Denoise Mechanism for Histopathology Whole Slide Image Analysis.
Xitong Ling, Minxi Ouyang, Yizhi Wang, Xinrui Chen, Renao Yan, Hongbo Chu, Junru Cheng, Tian Guan, Sufang Tian, Xiaoping Liu, Yonghong He
2024Align-IQA: Aligning Image Quality Assessment Models with Diverse Human Preferences via Customizable Guidance.
Junfeng Yang, Jing Fu, Zhen Zhang, Limei Liu, Qin Li, Wei Zhang, Wenzhi Cao
2024Align2Concept: Language Guided Interpretable Image Recognition by Visual Prototype and Textual Concept Alignment.
Jiaqi Wang, Pichao Wang, Yi Feng, Huafeng Liu, Chang Gao, Liping Jing
2024AlignCLIP: Align Multi Domains of Texts Input for CLIP models with Object-IoU Loss.
Lu Zhang, Ke Yan, Shouhong Ding
2024All rivers run into the sea: Unified Modality Brain-Inspired Emotional Central Mechanism.
Xinji Mai, Junxiong Lin, Haoran Wang, Zeng Tao, Yan Wang, Shaoqi Yan, Xuan Tong, Jiawen Yu, Boyang Wang, Ziheng Zhou, Qing Zhao, Shuyong Gao, Wenqiang Zhang
2024Alleviating the Equilibrium Challenge with Sample Virtual Labeling for Adversarial Domain Adaptation.
Wenxu Shi, Bochuan Zheng
2024An Active Masked Attention Framework for Many-to-Many Cross-Domain Recommendations.
Feng Zhu, Xinxing Yang, Longfei Li, Jun Zhou
2024An End-to-End Real-World Camera Imaging Pipeline.
Kepeng Xu, Zijia Ma, Li Xu, Gang He, Yunsong Li, Wenxin Yu, Taichu Han, Cheng Yang
2024An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism.
Qing Zhang, Haocheng Lv, Jie Liu, Zhiyun Chen, Jianyong Duan, Hao Wang, Li He, Mingying Xu
2024An In-depth Study of Bandwidth Allocation across Media Sources in Video Conferencing.
Zejun Zhang, Xiao Zhu, Anlan Zhang, Feng Qian
2024An Innovative Industry Program in A New Era of Multimedia with Generative AI.
Jianquan Liu, Balu Adsumilli, Yukiko Yanagawa, Haiwei Dong
2024An Inverse Partial Optimal Transport Framework for Music-guided Trailer Generation.
Yutong Wang, Sidan Zhu, Hongteng Xu, Dixin Luo
2024Anatomical Prior Guided Spatial Contrastive Learning for Few-Shot Medical Image Segmentation.
Wendong Huang, Jinwu Hu, Xiuli Bi, Bin Xiao
2024AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding.
Tao Liu, Feilong Chen, Shuai Fan, Chenpeng Du, Qi Chen, Xie Chen, Kai Yu
2024Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars.
Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang
2024AraLive: Automatic Reward Adaption for Learning-based Live Video Streaming.
Huanhuan Zhang, Liu Zhuo, Haotian Li, Anfu Zhou, Chuanming Wang, Huadong Ma
2024Are handcrafted filters helpful for attributing AI-generated images?
Jialiang Li, Haoyue Wang, Sheng Li, Zhenxing Qian, Xinpeng Zhang, Athanasios V. Vasilakos
2024Arondight: Red Teaming Large Vision Language Models with Auto-generated Multi-modal Jailbreak Prompts.
Yi Liu, Chengjun Cai, Xiaoli Zhang, Xingliang Yuan, Cong Wang
2024ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations.
Zhongxu Wang, Yujia Wang, Mingzhu Li, Hua Huang
2024Aspect-Based Multimodal Mining: Unveiling Sentiments, Complaints, and Beyond in User-Generated Content.
Mamta, Gopendra Vikram Singh, Deepak Raju Kori, Asif Ekbal
2024AssistEditor: Multi-Agent Collaboration for GUI Workflow Automation in Video Creation.
Difei Gao, Siyuan Hu, Zechen Bai, Qinghong Lin, Mike Zheng Shou
2024Asymmetric Event-Guided Video Super-Resolution.
Zeyu Xiao, Dachun Kai, Yueyi Zhang, Xiaoyan Sun, Zhiwei Xiong
2024Attentive Linguistic Tracking in Diffusion Models for Training-free Text-guided Image Editing.
Bingyan Liu, Chengyu Wang, Jun Huang, Kui Jia
2024Attribute-Driven Multimodal Hierarchical Prompts for Image Aesthetic Quality Assessment.
Hancheng Zhu, Ju Shi, Zhiwen Shao, Rui Yao, Yong Zhou, Jiaqi Zhao, Leida Li
2024Attribute-driven Disentangled Representation Learning for Multimodal Recommendation.
Zhenyang Li, Fan Liu, Yinwei Wei, Zhiyong Cheng, Liqiang Nie, Mohan S. Kankanhalli
2024Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier.
Qishan Zhang, Shuangbing Wen, Tao Hu
2024Audio-Driven Identity Manipulation for Face Inpainting.
Yuqi Sun, Qing Lin, Weimin Tan, Bo Yan
2024AudioLCM: Efficient and High-Quality Text-to-Audio Generation with Minimal Inference Steps.
Huadai Liu, Rongjie Huang, Yang Liu, Hengyuan Cao, Jialei Wang, Xize Cheng, Siqi Zheng, Zhou Zhao
2024Auto DragGAN: Editing the Generative Image Manifold in an Autoregressive Manner.
Pengxiang Cai, Zhiwei Liu, Guibo Zhu, Yunfang Niu, Jinqiao Wang
2024Auto-ACD: A Large-scale Dataset for Audio-Language Representation Learning.
Luoyi Sun, Xuenan Xu, Mengyue Wu, Weidi Xie
2024AutoGraph: Enabling Visual Context via Graph Alignment in Open Domain Multi-Modal Dialogue Generation.
Deji Zhao, Donghong Han, Ye Yuan, Bo Ning, Mengxiang Li, Zhongjiang He, Shuangyong Song
2024AutoM
Daqin Luo, Chengjian Feng, Yuxuan Nong, Yiqing Shen
2024AutoSFX: Automatic Sound Effect Generation for Videos.
Yujia Wang, Zhongxu Wang, Hua Huang
2024Autogenic Language Embedding for Coherent Point Tracking.
Zikai Song, Ying Tang, Run Luo, Lintao Ma, Junqing Yu, Yi-Ping Phoebe Chen, Wei Yang
2024Automatic and Aligned Anchor Learning Strategy for Multi-View Clustering.
Huimin Ma, Siwei Wang, Shengju Yu, Suyuan Liu, Junjie Huang, Huijun Wu, Xinwang Liu, En Zhu
2024AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics.
Xiangxiang Dai, Zeyu Zhang, Peng Yang, Yuedong Xu, Xutong Liu, John C. S. Lui
2024BCSCN: Reducing Domain Gap through Bézier Curve basis-based Sparse Coding Network for Single-Image Super-Resolution.
Wenhao Guo, Peng Lu, Xujun Peng, Zhaoran Zhao, Ji Qiu, Xiangtao Dong
2024BSBP-RWKV: Background Suppression with Boundary Preservation for Efficient Medical Image Segmentation.
Xudong Zhou, Tianxiang Chen
2024Backdoor Attacks on Bimodal Salient Object Detection with RGB-Thermal Data.
Wen Yin, Bin Benjamin Zhu, Yulai Xie, Pan Zhou, Dan Feng
2024Balanced Multi-Relational Graph Clustering.
Zhixiang Shen, Haolan He, Zhao Kang
2024Balancing Generalization and Robustness in Adversarial Training via Steering through Clean and Adversarial Gradient Directions.
Haoyu Tong, Xiaoyu Zhang, Yulin Jin, Jian Lou, Kai Wu, Xiaofeng Chen
2024Benchmarking In-the-Wild Multimodal Disease Recognition and A Versatile Baseline.
Tianqi Wei, Zhi Chen, Zi Huang, Xin Yu
2024Beyond Direct Relationships: Exploring Multi-Order Label Pair Dependencies for Knowledge Distillation.
Jingchao Wang, Zhengnan Deng, Tongxu Lin, Wenyuan Li, Shaobin Ling, Junyu Lin
2024Beyond the Known: Ambiguity-Aware Multi-view Learning.
Zihan Fang, Shide Du, Yuhong Chen, Shiping Wang
2024Bi-directional Task-Guided Network for Few-Shot Fine-Grained Image Classification.
Zhen-Xiang Ma, Zhen-Duo Chen, Li-Jun Zhao, Zi-Chao Zhang, Tai Zheng, Xin Luo, Xin-Shun Xu
2024Bilateral Adaptive Cross-Modal Fusion Prompt Learning for CLIP.
Qiang Wang, Ke Yan, Shouhong Ding
2024Blind Face Video Restoration with Temporal Consistent Generative Prior and Degradation-Aware Prompt.
Jingfan Tan, Hyunhee Park, Ying Zhang, Tao Wang, Kaihao Zhang, Xiangyu Kong, Pengwen Dai, Zikun Liu, Wenhan Luo
2024Blind Video Bit-Depth Expansion.
Panjun Duan, Yang Zhao, Yuan Chen, Wei Jia, Zhao Zhang, Ronggang Wang
2024Boosting Audio Visual Question Answering via Key Semantic-Aware Cues.
Guangyao Li, Henghui Du, Di Hu
2024Boosting Non-causal Semantic Elimination: An Unconventional Harnessing of LVM for Open-World Deepfake Interpretation.
Zhaoyang Li, Zhu Teng, Baopeng Zhang, Jianping Fan
2024Boosting Semi-supervised Crowd Counting with Scale-based Active Learning.
Shiwei Zhang, Wei Ke, Shuai Liu, Xiaopeng Hong, Tong Zhang
2024Boosting Speech Recognition Robustness to Modality-Distortion with Contrast-Augmented Prompts.
Dongjie Fu, Xize Cheng, Xiaoda Yang, Hanting Wang, Zhou Zhao, Tao Jin
2024Boundary-Aware Periodicity-based Sparsification Strategy for Ultra-Long Time Series Forecasting.
Yiying Bao, Hao Zhou, Chao Peng, Chenyang Xu, Shuo Shi, Kecheng Cai
2024BrainRAM: Cross-Modality Retrieval-Augmented Image Reconstruction from Human Brain Activity.
Dian Xie, Peiang Zhao, Jiarui Zhang, Kangqi Wei, Xiaobao Ni, Jiong Xia
2024Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models.
Yubo Wang, Chaohu Liu, Yanqiu Qu, Haoyu Cao, Deqiang Jiang, Linli Xu
2024Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation.
Andong Lu, Jiacong Zhao, Chenglong Li, Yun Xiao, Bin Luo
2024Bridging Fourier and Spatial-Spectral Domains for Hyperspectral Image Denoising.
Jiahua Xiao, Yang Liu, Shizhou Zhang, Xing Wei
2024Bridging Gaps in Content and Knowledge for Multimodal Entity Linking.
Pengfei Luo, Tong Xu, Che Liu, Suojuan Zhang, Linli Xu, Minglei Li, Enhong Chen
2024Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs.
Daiqing Wu, Dongbao Yang, Yu Zhou, Can Ma
2024Bridging the Gap: Sketch-Aware Interpolation Network for High-Quality Animation Sketch Inbetweening.
Jiaming Shen, Kun Hu, Wei Bao, Chang Wen Chen, Zhiyong Wang
2024Bridging the Modality Gap: Dimension Information Alignment and Sparse Spatial Constraint for Image-Text Matching.
Xiang Ma, Xuemei Li, Lexin Fang, Caiming Zhang
2024Building Robust Video-Level Deepfake Detection via Audio-Visual Local-Global Interactions.
Yifan Wang, Xuecheng Wu, Jia Zhang, Mohan Jing, Keda Lu, Jun Yu, Wen Su, Fang Gao, Qingsong Liu, Jianqing Sun, Jiaen Liang
2024Building Trust in Decision with Conformalized Multi-view Deep Classification.
Wei Liu, Yufei Chen, Xiaodong Yue
2024CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization.
Xiang He, Xiangxi Liu, Yang Li, Dongcheng Zhao, Guobin Shen, Qingqun Kong, Xin Yang, Yi Zeng
2024CAD Translator: An Effective Drive for Text to 3D Parametric Computer-Aided Design Generative Modeling.
Xueyang Li, Yu Song, Yunzhong Lou, Xiangdong Zhou
2024CAPNet: Cartoon Animal Parsing with Spatial Learning and Structural Modeling.
Jian-Jun Qiao, Meng-Yu Duan, Xiao Wu, Wei Li
2024CBNet: Cooperation-Based Weakly Supervised Polyp Detection.
Xiuquan Du, Jiajia Chen, Xuejun Zhang
2024CDEA: Context- and Detail-Enhanced Unsupervised Learning for Domain Adaptive Semantic Segmentation.
Shuyuan Wen, Bingrui Hu, Wenchao Li
2024CFDiffusion: Controllable Foreground Relighting in Image Compositing via Diffusion Model.
Ziqi Yu, Jing Zhou, Zhongyun Bao, Gang Fu, Weilei He, Chao Liang, Chunxia Xiao
2024CIEASR: Contextual Image-Enhanced Automatic Speech Recognition for Improved Homophone Discrimination.
Ziyi Wang, Yiming Rong, Deyang Jiang, Haoran Wu, Shiyu Zhou, Bo Xu
2024CIRP: Cross-Item Relational Pre-training for Multimodal Product Bundling.
Yunshan Ma, Yingzhi He, Wenjun Zhong, Xiang Wang, Roger Zimmermann, Tat-Seng Chua
2024CLIP2UDA: Making Frozen CLIP Reward Unsupervised Domain Adaptation in 3D Semantic Segmentation.
Yao Wu, Mingwei Xing, Yachao Zhang, Yuan Xie, Yanyun Qu
2024CLIPCleaner: Cleaning Noisy Labels with CLIP.
Chen Feng, Georgios Tzimiropoulos, Ioannis Patras
2024CLaM: An Open-Source Library for Performance Evaluation of Text-driven Human Motion Generation.
Xiaodong Chen, Kunlang He, Wu Liu, Xinchen Liu, Zheng-Jun Zha, Tao Mei
2024CLiF-VQA: Enhancing Video Quality Assessment by Incorporating High-Level Semantic Information related to Human Feelings.
Yachun Mi, Yan Shu, Yu Li, Chen Hui, Puchao Zhou, Shaohui Liu
2024CMT: Co-training Mean-Teacher for Unsupervised Domain Adaptation on 3D Object Detection.
Shijie Chen, Junbao Zhuo, Xin Li, Haizhuang Liu, Rongquan Wang, Jiansheng Chen, Huimin Ma
2024COCO-LC: Colorfulness Controllable Language-based Colorization.
Yifan Li, Yuhang Bai, Shuai Yang, Jiaying Liu
2024COMD: Training-free Video Motion Transfer With Camera-Object Motion Disentanglement.
Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Jieyu Weng, Hongrui Huang, Yabiao Wang, Lizhuang Ma
2024CP-Prompt: Composition-Based Cross-modal Prompting for Domain-Incremental Continual Learning.
Yu Feng, Zhen Tian, Yifan Zhu, Zongfu Han, Haoran Luo, Guangwei Zhang, Meina Song
2024CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions.
Haicheng Liao, Haoyu Sun, Huanming Shen, Chengyue Wang, Chunlin Tian, Kahou Tam, Li Li, Chengzhong Xu, Zhenning Li
2024CREAM: Coarse-to-Fine Retrieval and Multi-modal Efficient Tuning for Document VQA.
Jinxu Zhang, Yongqi Yu, Yu Zhang
2024CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning.
Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li
2024CSO: Constraint-Guided Space Optimization for Active Scene Mapping.
Xuefeng Yin, Chenyang Zhu, Shanglai Qu, Yuqi Li, Kai Xu, Baocai Yin, Xin Yang
2024CT
Bowen Zhao, Tianhao Cheng, Yuejie Zhang, Ying Cheng, Rui Feng, Xiaobo Zhang
2024CalibRBEV: Multi-Camera Calibration via Reversed Bird's-eye-view Representations for Autonomous Driving.
Wenlong Liao, Sunyuan Qiang, Xianfei Li, Xiaolei Chen, Haoyu Wang, Yanyan Liang, Junchi Yan, Tao He, Pai Peng
2024Calibrating Prompt from History for Continual Vision-Language Retrieval and Grounding.
Tao Jin, Weicai Yan, Ye Wang, Sihang Cai, Qifan Shuai, Zhou Zhao
2024Calibration for Long-tailed Scene Graph Generation.
Xuhan Zhu, Yifei Xing, Ruiping Wang, Yaowei Wang, Xiangyuan Lan
2024Can We Debias Multimodal Large Language Models via Model Editing?
Zecheng Wang, Xinye Li, Zhanyue Qin, Chunshan Li, Zhiying Tu, Dianhui Chu, Dianbo Sui
2024Cantor: Inspiring Multimodal Chain-of-Thought of MLLM.
Timin Gao, Peixian Chen, Mengdan Zhang, Chaoyou Fu, Yunhang Shen, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Xing Sun, Liujuan Cao, Rongrong Ji
2024CapS-Adapter: Caption-based MultiModal Adapter in Zero-Shot Classification.
Qijie Wang, Guandu Liu, Bin Wang
2024Caption-Aware Multimodal Relation Extraction with Mutual Information Maximization.
Zefan Zhang, Weiqi Zhang, Yanhui Li, Tian Bai
2024CartoonNet: Cartoon Parsing with Semantic Consistency and Structure Correlation.
Jian-Jun Qiao, Meng-Yu Duan, Xiao Wu, Yu-Pei Song
2024Cascaded Adversarial Attack: Simultaneously Fooling Rain Removal and Semantic Segmentation Networks.
Zhiwen Wang, Yuhui Wu, Zheng Wang, Jiwei Wei, Tianyu Li, Guoqing Wang, Yang Yang, Hengtao Shen
2024Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification.
Jiexuan Yan, Sheng Huang, Nankun Mu, Luwen Huangfu, Bo Liu
2024Caterpillar: A Pure-MLP Architecture with Shifted-Pillars-Concatenation.
Jin Sun, Xiaoshuang Shi, Zhiyuan Wang, Kaidi Xu, Heng Tao Shen, Xiaofeng Zhu
2024Causal Visual-semantic Correlation for Zero-shot Learning.
Shuhuang Chen, Dingjie Fu, Shiming Chen, Shuo Ye, Wenjin Hou, Xinge You
2024Causal-driven Large Language Models with Faithful Reasoning for Knowledge Question Answering.
Jiawei Wang, Da Cao, Shaofei Lu, Zhanchang Ma, Junbin Xiao, Tat-Seng Chua
2024Cefdet: Cognitive Effectiveness Network Based on Fuzzy Inference for Action Detection.
Zhe Luo, Weina Fu, Shuai Liu, Saeed Anwar, Muhammad Saqib, Sambit Bakshi, Khan Muhammad
2024Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection.
Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li
2024Channel-Spatial Support-Query Cross-Attention for Fine-Grained Few-Shot Image Classification.
Shicheng Yang, Xiaoxu Li, Dongliang Chang, Zhanyu Ma, Jing-Hao Xue
2024Class Balance Matters to Active Class-Incremental Learning.
Zitong Huang, Ze Chen, Yuanze Li, Bowen Dong, Erjin Zhou, Yong Liu, Rick Siow Mong Goh, Chun-Mei Feng, Wangmeng Zuo
2024ClickDiff: Click to Induce Semantic Contact Map for Controllable Grasp Generation with Diffusion Models.
Peiming Li, Ziyi Wang, Mengyuan Liu, Hong Liu, Chen Chen
2024Cloth-aware Augmentation for Cloth-generalized Person Re-identification.
Fangyi Liu, Mang Ye, Bo Du
2024Cluster-Phys: Facial Clues Clustering Towards Efficient Remote Physiological Measurement.
Wei Qian, Kun Li, Dan Guo, Bin Hu, Meng Wang
2024Cluster-driven Personalized Federated Recommendation with Interest-aware Graph Convolution Network for Multimedia.
Xingyuan Mao, Yuwen Liu, Lianyong Qi, Li Duan, Xiaolong Xu, Xuyun Zhang, Wanchun Dou, Amin Beheshti, Xiaokang Zhou
2024CoAst: Validation-Free Contribution Assessment for Federated Learning based on Cross-Round Valuation.
Hao Wu, Likun Zhang, Shucheng Li, Fengyuan Xu, Sheng Zhong
2024CoIn: A Lightweight and Effective Framework for Story Visualization and Continuation.
Ming Tao, Bing-Kun Bao, Hao Tang, Yaowei Wang, Changsheng Xu
2024CoMO-NAS: Core-Structures-Guided Multi-Objective Neural Architecture Search for Multi-Modal Classification.
Pinhan Fu, Xinyan Liang, Yuhua Qian, Qian Guo, Zhifang Wei, Wen Li
2024CoPL: Parameter-Efficient Collaborative Prompt Learning for Audio-Visual Tasks.
Yihan Zhao, Wei Xi, Yuhang Cui, Gairui Bai, Xinhui Liu, Jizhong Zhao
2024CoTuning: A Large-Small Model Collaborating Distillation Framework for Better Model Generalization.
Zimo Liu, Kangjun Liu, Mingyue Guo, Shiliang Zhang, Yaowei Wang
2024Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization.
Junyan Wu, Wei Lu, Xiangyang Luo, Rui Yang, Qian Wang, Xiaochun Cao
2024CodeSwap: Symmetrically Face Swapping Based on Prior Codebook.
Xiangyang Luo, Xin Zhang, Yifan Xie, Xinyi Tong, Weijiang Yu, Heng Chang, Fei Ma, Fei Richard Yu
2024Cognition-Supervised Saliency Detection: Contrasting EEG Signals and Visual Stimuli.
Jun Ma, Tuukka Ruotsalo
2024ColVO: Colonoscopic Visual Odometry Considering Geometric and Photometric Consistency.
Ruyu Liu, Zhengzhe Liu, Haoyu Zhang, Guodao Zhang, Jianhua Zhang, Bo Sun, Weiguo Sheng, Xiufeng Liu, Yaochu Jin
2024Collaborative Training of Tiny-Large Vision Language Models.
Shichen Lu, Longteng Guo, Wenxuan Wang, Zijia Zhao, Tongtian Yue, Jing Liu, Si Liu
2024Color4E: Event Demosaicing for Full-color Event Guided Image Deblurring.
Yi Ma, Peiqi Duan, Yuchen Hong, Chu Zhou, Yu Zhang, Jimmy S. J. Ren, Boxin Shi
2024Combating Visual Question Answering Hallucinations via Robust Multi-Space Co-Debias Learning.
Jiawei Zhu, Yishu Liu, Huanjia Zhu, Hui Lin, Yuncheng Jiang, Zheng Zhang, Bingzhi Chen
2024CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting.
Xiangrui Liu, Xinju Wu, Pingping Zhang, Shiqi Wang, Zhu Li, Sam Kwong
2024Compacter: A Lightweight Transformer for Image Restoration.
Zhijian Wu, Jun Li, Yang Hu, Dingjiang Huang
2024Conditional Diffusion Model for Open-ended Video Question Answering.
Xinyue Liu, Jiahui Wan, Linlin Zong, Bo Xu
2024Connectivity-based Cerebrovascular Segmentation in Time-of-Flight Magnetic Resonance Angiography.
Zan Chen, Xiao Yu, Yuanjing Feng
2024Cons2Plan: Vector Floorplan Generation from Various Conditions via a Learning Framework based on Conditional Diffusion Models.
Shibo Hong, Xuhong Zhang, Tianyu Du, Sheng Cheng, Xun Wang, Jianwei Yin
2024Consistencies are All You Need for Semi-supervised Vision-Language Tracking.
Jiawei Ge, Jiuxin Cao, Xuelin Zhu, Xinyu Zhang, Chang Liu, Kun Wang, Bo Liu
2024Consistency Guided Diffusion Model with Neural Syntax for Perceptual Image Compression.
Haowei Kuang, Yiyang Ma, Wenhan Yang, Zongming Guo, Jiaying Liu
2024Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors.
Yukang Lin, Haonan Han, Chaoqun Gong, Zunnan Xu, Yachao Zhang, Xiu Li
2024ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance.
Haijie Yang, Zhenyu Zhang, Hao Tang, Jianjun Qian, Jian Yang
2024Context-Aware Indoor Point Cloud Object Generation through User Instructions.
Yiyang Luo, Ke Lin, Chao Gu
2024Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images.
Bo Yuan, Danpei Zhao, Zhuoran Liu, Wentao Li, Tian Li
2024Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis.
Yujia Xiao, Xi Wang, Xu Tan, Lei He, Xinfa Zhu, Sheng Zhao, Tan Lee
2024Contrastive Graph Distribution Alignment for Partially View-Aligned Clustering.
Xibiao Wang, Hang Gao, Xindian Wei, Liang Peng, Rui Li, Cheng Liu, Si Wu, Hau-San Wong
2024Contrastive Learning-based Chaining-Cluster for Multilingual Voice-Face Association.
Wuyang Chen, Yanjie Sun, Kele Xu, Yong Dou
2024Control-Talker: A Rapid-Customization Talking Head Generation Method for Multi-Condition Control and High-Texture Enhancement.
Yiding Li, Lingyun Yu, Li Wang, Hongtao Xie
2024Controllable Music Loops Generation with MIDI and Text via Multi-Stage Cross Attention and Instrument-Aware Reinforcement Learning.
Guan-Yuan Chen, Von-Wun Soo
2024Controllable Procedural Generation of Landscapes.
Jia-Hong Liu, Shao-Kui Zhang, Chuyue Zhang, Song-Hai Zhang
2024Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision.
Zhijun Jia, Huaying Xue, Xiulian Peng, Yan Lu
2024Correlation-Driven Multi-Modality Graph Decomposition for Cross-Subject Emotion Recognition.
Wuliang Huang, Yiqiang Chen, Xinlong Jiang, Chenlong Gao, Qian Chen, Teng Zhang, Bingjie Yan, Yifan Wang, Jianrong Yang
2024Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding.
Xun Jiang, Zhuoyuan Wei, Shenshen Li, Xing Xu, Jingkuan Song, Heng Tao Shen
2024Cover-separable Fixed Neural Network Steganography via Deep Generative Models.
Guobiao Li, Sheng Li, Zhenxing Qian, Xinpeng Zhang
2024Cross-Class Domain Adaptive Semantic Segmentation with Visual Language Models.
Wenqi Ren, Ruihao Xia, Meng Zheng, Ziyan Wu, Yang Tang, Nicu Sebe
2024Cross-Modal Coherence-Enhanced Feedback Prompting for News Captioning.
Ning Xu, Yifei Gao, Ting-Ting Zhang, Hongshuo Tian, An-An Liu
2024Cross-Modal Meta Consensus for Heterogeneous Federated Learning.
Shuai Li, Fan Qi, Zixin Zhang, Changsheng Xu
2024Cross-Task Knowledge Transfer for Semi-supervised Joint 3D Grounding and Captioning.
Yang Liu, Daizong Liu, Zongming Guo, Wei Hu
2024Cross-View Consistency Regularisation for Knowledge Distillation.
Weijia Zhang, Dongnan Liu, Weidong Cai, Chao Ma
2024Cross-View Mutual Learning for Semi-Supervised Medical Image Segmentation.
Song Wu, Xiaoyu Wei, Xinyue Chen, Yazhou Ren, Jing He, Xiaorong Pu
2024Cross-modal Observation Hypothesis Inference.
Mengze Li, Kairong Han, Jiahe Xu, Yueying Li, Tao Wu, Zhou Zhao, Jiaxu Miao, Shengyu Zhang, Jingyuan Chen
2024Cross-view Contrastive Unification Guides Generative Pretraining for Molecular Property Prediction.
Junyu Lin, Yan Zheng, Xinyue Chen, Yazhou Ren, Xiaorong Pu, Jing He
2024Crossmodal Few-shot 3D Point Cloud Semantic Segmentation via View Synthesis.
Ziyu Zhao, Pingping Cai, Canyu Zhang, Xiaoguang Li, Song Wang
2024Curriculum Learning for Multimedia in the Era of Large Language Models.
Xin Wang, Yuwei Zhou, Hong Chen, Wenwu Zhu
2024CustomNet: Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models.
Ziyang Yuan, Mingdeng Cao, Xintao Wang, Zhongang Qi, Chun Yuan, Ying Shan
2024Customizing Text-to-Image Generation with Inverted Interaction.
Mengmeng Ge, Xu Jia, Takashi Isobe, Xiaomin Li, Qinghe Wang, Jing Mu, Dong Zhou, Li Wang, Huchuan Lu, Lu Tian, Ashish Sirasao, Emad Barsoum
2024D
Kai Han, Jin Wang, Yunhui Shi, Nam Ling, Baocai Yin
2024DAC: 2D-3D Retrieval with Noisy Labels via Divide-and-Conquer Alignment and Correction.
Chaofan Gan, Yuanpeng Tu, Yuxi Li, Weiyao Lin
2024DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting.
Jihoon Lee, Yunhong Min, Hwidong Kim, Sangtae Ahn
2024DAT: Dialogue-Aware Transformer with Modality-Group Fusion for Human Engagement Estimation.
Jia Li, Yangchen Yu, Yin Chen, Yu Zhang, Peng Jia, Yunbo Xu, Ziqiang Li, Meng Wang, Richang Hong
2024DCAFuse: Dual-Branch Diffusion-CNN Complementary Feature Aggregation Network for Multi-Modality Image Fusion.
Xudong Lu, Yuqi Jiang, Haiwen Hong, Qi Sun, Cheng Zhuo
2024DEITalk: Speech-Driven 3D Facial Animation with Dynamic Emotional Intensity Modeling.
Kang Shen, Haifeng Xia, Guangxing Geng, Guangyue Geng, Siyu Xia, Zhengming Ding
2024DEMON24: ACM MM24 Demonstrative Instruction Following Challenge.
Zhiqi Ge, Juncheng Li, Qifan Yu, Wei Zhou, Siliang Tang, Yueting Zhuang
2024DERD: Data-free Adversarial Robustness Distillation through Self-adversarial Teacher Group.
Yuhang Zhou, Yushu Zhang, Leo Yu Zhang, Zhongyun Hua
2024DERO: Diffusion-Model-Erasure Robust Watermarking.
Han Fang, Kejiang Chen, Yupeng Qiu, Zehua Ma, Weiming Zhang, Ee-Chien Chang
2024DFMVC: Deep Fair Multi-view Clustering.
Bowen Zhao, Qianqian Wang, Zhiqiang Tao, Wei Feng, Quanxue Gao
2024DGMamba: Domain Generalization via Generalized State Space Model.
Shaocong Long, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Chenhao Ying, Yuan Luo, Lizhuang Ma, Shuicheng Yan
2024DIG: Complex Layout Document Image Generation with Authentic-looking Text for Enhancing Layout Analysis.
Dehao Ying, Fengchang Yu, Haihua Chen, Wei Lu
2024DINO is Also a Semantic Guider: Exploiting Class-aware Affinity for Weakly Supervised Semantic Segmentation.
Yuanchen Wu, Xiaoqiang Li, Jide Li, Kequan Yang, Pinpin Zhu, Shaohua Zhang
2024DMFourLLIE: Dual-Stage and Multi-Branch Fourier Network for Low-Light Image Enhancement.
Tongshun Zhang, Pingping Liu, Ming Zhao, Haotian Lv
2024DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training.
Qian Qiao, Yu Xie, Jun Gao, Tianxiang Wu, Shaoyao Huang, Jiaqing Fan, Ziqiang Cao, Zili Wang, Yue Zhang
2024DOPRA: Decoding Over-accumulation Penalization and Re-allocation in Specific Weighting Layer.
Jinfeng Wei, Xiaofeng Zhang
2024DP-RAE: A Dual-Phase Merging Reversible Adversarial Example for Image Privacy Protection.
Jiajie Zhu, Xia Du, Jizhe Zhou, Chi-Man Pun, Qizhen Xu, Xiaoyuan Liu
2024DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection.
Zhuoxiao Chen, Zixin Wang, Yadan Luo, Sen Wang, Zi Huang
2024DQ-Former: Querying Transformer with Dynamic Modality Priority for Cognitive-aligned Multimodal Emotion Recognition in Conversation.
Ye Jing, Xinpei Zhao
2024DQG: Database Question Generation for Exact Text-based Image Retrieval.
Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama
2024DRMF: Degradation-Robust Multi-Modal Image Fusion via Composable Diffusion Prior.
Linfeng Tang, Yuxin Deng, Xunpeng Yi, Qinglong Yan, Yixuan Yuan, Jiayi Ma
2024DVF: Advancing Robust and Accurate Fine-Grained Image Retrieval with Retrieval Guidelines.
Xin Jiang, Hao Tang, Rui Yan, Jinhui Tang, Zechao Li
2024DanceCamAnimator: Keyframe-Based Controllable 3D Dance Camera Synthesis.
Zixuan Wang, Jiayi Li, Xiaoyu Qin, Shikun Sun, Songtao Zhou, Jia Jia, Jiebo Luo
2024DanceMimic: Awaken Your Dancing Instinct through a Real-time Dance Imitation Capture System.
Seongjean Kim, Jungwoo Huh, Yeseung Park, Jungsu Kim, Sanghoon Lee
2024Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model.
Guoqing Zhu, Honghu Pan, Qiang Wang, Chao Tian, Chao Yang, Zhenyu He
2024Dataset, Challenge, and Evaluation for Tumor Segmentation Variability.
Yicheng Wu, Yutong Xie, Xiangde Luo, Qi Wu, Jianfei Cai
2024De-fine: Decomposing and Refining Visual Programs with Auto-Feedback.
Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Zheqi Lv, Wenqiao Zhang, Siliang Tang, Yueting Zhuang
2024Deblurring Neural Radiance Fields with Event-driven Bundle Adjustment.
Yunshan Qi, Lin Zhu, Yifan Zhao, Nan Bao, Jia Li
2024Deciphering Perceptual Quality in Colored Point Cloud: Prioritizing Geometry or Texture Distortion?
Xuemei Zhou, Irene Viola, Yunlu Chen, Jiahuan Pei, Pablo César
2024Decoder Pre-Training with only Text for Scene Text Recognition.
Shuai Zhao, Yongkun Du, Zhineng Chen, Yu-Gang Jiang
2024Decoder-Only LLMs are Better Controllers for Diffusion Models.
Ziyi Dong, Yao Xiao, Pengxu Wei, Liang Lin
2024Decoding Urban Industrial Complexity: Enhancing Knowledge-Driven Insights via IndustryScopeGPT.
Siqi Wang, Chao Liang, Yunfan Gao, Yang Liu, Jing Li, Haofen Wang
2024Deconfounded Emotion Guidance Sticker Selection with Causal Inference.
Jiali Chen, Yi Cai, Ruohang Xu, Jiexin Wang, Jiayuan Xie, Qing Li
2024Decoupling General and Personalized Knowledge in Federated Learning via Additive and Low-rank Decomposition.
Xinghao Wu, Xuefeng Liu, Jianwei Niu, Haolin Wang, Shaojie Tang, Guogang Zhu, Hao Su
2024Decoupling Heterogeneous Features for Robust 3D Interacting Hand Poses Estimation.
Huan Yao, Changxing Ding, Xuanda Xu, Zhifeng Lin
2024Deep Incomplete Multi-View Network Semi-Supervised Multi-Label Learning with Unbiased Loss.
Quanjiang Li, Tingjin Luo, Mingdie Jiang, Jiahui Liao, Zhangqi Jiang
2024Deep Instruction Tuning for Segment Anything Model.
Xiaorui Huang, Gen Luo, Chaoyang Zhu, Bo Tong, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji
2024Deep Video Compression with Scaled Hierarchical Bi-directional Motion Model.
Feng Ye, Li Zhang, Chuanmin Jia
2024DeepPointMap2: Accurate and Robust LiDAR-Visual SLAM with Neural Descriptors.
Xiaze Zhang, Ziheng Ding, Qi Jing, Ying Cheng, Wenchao Ding, Rui Feng
2024Deeply Fusing Semantics and Interactions for Item Representation Learning via Topology-driven Pre-training.
Shiqin Liu, Chaozhuo Li, Xi Zhang, Minjun Zhao, Yuanbo Xu, Jiajun Bu
2024Deformable NeRF using Recursively Subdivided Tetrahedra.
Zherui Qiu, Chenqu Ren, Kaiwen Song, Xiaoyi Zeng, Leyuan Yang, Juyong Zhang
2024Demonstrative Instruction Following in Multimodal LLMs via Integrating Low-Rank Adaptation with Ensemble Learning.
Jingyu Wei, Yi Su, Kele Xu, Lingbin Zeng, Bo Liu, Huaimin Wang
2024DenseTrack: Drone-Based Crowd Tracking via Density-Aware Motion-Appearance Synergy.
Yi Lei, Huilin Zhu, Jingling Yuan, Guangli Xiang, Xian Zhong, Shengfeng He
2024DepthCloak: Projecting Optical Camouflage Patches for Erroneous Monocular Depth Estimation of Vehicles.
Huixiang Wen, Shizong Yan, Shan Chang, Jie Xu, Hongzi Zhu, Yanting Zhang, Bo Li
2024Designing Spatial Visualization and Interactions of Immersive Sankey Diagram in Virtual Reality.
Yang Lu, Junxian Li, Zhitong Cui, Jiapeng Hu, Yanna Lin, Shijian Luo
2024Detached and Interactive Multimodal Learning.
Yunfeng Fan, Wenchao Xu, Haozhao Wang, Junhong Liu, Song Guo
2024Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions.
Junzhang Liu, Zhecan Wang, Hammad A. Ayyubi, Haoxuan You, Chris Thomas, Rui Sun, Shih-Fu Chang, Kai-Wei Chang
2024Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation.
Yuran Wang, Zhijing Wan, Yansheng Qiu, Zheng Wang
2024DiffGlue: Diffusion-Aided Image Feature Matching.
Shihua Zhang, Jiayi Ma
2024DiffHarmony++: Enhancing Image Harmonization with Harmony-VAE and Inverse Harmonization Model.
Pengfei Zhou, Fangxiang Feng, Guang Liu, Ruifan Li, Xiaojie Wang
2024DiffMM: Multi-Modal Diffusion Model for Recommendation.
Yangqin Jiang, Lianghao Xia, Wei Wei, Da Luo, Kangyi Lin, Chao Huang
2024DiffTV: Identity-Preserved Thermal-to-Visible Face Translation via Feature Alignment and Dual-Stage Conditions.
Jingyu Lin, Guiqin Zhao, Jing Xu, Guoli Wang, Zejin Wang, Antitza Dantcheva, Lan Du, Cunjian Chen
2024Differential-Perceptive and Retrieval-Augmented MLLM for Change Captioning.
Xian Zhang, Haokun Wen, Jianlong Wu, Pengda Qin, Hui Xue', Liqiang Nie
2024Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector.
Boyong He, Yuxiang Ji, Zhuoyue Tan, Liaoni Wu
2024Diffusion Facial Forgery Detection.
Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan S. Kankanhalli
2024Diffusion Networks with Task-Specific Noise Control for Radiology Report Generation.
Yuanhe Tian, Fei Xia, Yan Song
2024Diffusion Posterior Proximal Sampling for Image Restoration.
Hongjie Wu, Linchao He, Mingqin Zhang, Dongdong Chen, Kunming Luo, Mengting Luo, Ji-Zhe Zhou, Hu Chen, Jiancheng Lv
2024Dig a Hole and Fill in Sand: Adversary and Hiding Decoupled Steganography.
Weixuan Tang, Haoyu Yang, Yuan Rao, Zhili Zhou, Fei Peng
2024Dig into Detailed Structures: Key Context Encoding and Semantic-based Decoding for Point Cloud Completion.
Hongye Hou, Xuehao Gao, Zhan Liu, Yang Yang
2024Digging into Contrastive Learning for Robust Depth Estimation with Diffusion Models.
Jiyuan Wang, Chunyu Lin, Lang Nie, Kang Liao, Shuwei Shao, Yao Zhao
2024DisControlFace: Adding Disentangled Control to Diffusion Autoencoder for One-shot Explicit Facial Image Editing.
Haozhe Jia, Yan Li, Hengfei Cui, Di Xu, Yuwang Wang, Tao Yu
2024DisenStudio: Customized Multi-Subject Text-to-Video Generation with Disentangled Spatial Control.
Hong Chen, Xin Wang, Yipeng Zhang, Yuwei Zhou, Zeyang Zhang, Siao Tang, Wenwu Zhu
2024Disentangled-Multimodal Privileged Knowledge Distillation for Depression Recognition with Incomplete Multimodal Data.
Yuchen Pan, Junjun Jiang, Kui Jiang, Xianming Liu
2024Disentangling Identity Features from Interference Factors for Cloth-Changing Person Re-identification.
Yubo Li, De Cheng, Chaowei Fang, Changzhe Jiao, Nannan Wang, Xinbo Gao
2024Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization.
Yisu Liu, Jinyang An, Wanqian Zhang, Dayan Wu, Jingzi Gu, Zheng Lin, Weiping Wang
2024Dissecting Temporal Understanding in Text-to-Audio Retrieval.
Andreea-Maria Oncescu, João F. Henriques, A. Sophia Koepke
2024Distilled Cross-Combination Transformer for Image Captioning with Dual Refined Visual Features.
Junbo Hu, Zhixin Li
2024Distribution Consistency Guided Hashing for Cross-Modal Retrieval.
Yuan Sun, Kaiming Liu, Yongxiang Li, Zhenwen Ren, Jian Dai, Dezhong Peng
2024Diverse Consensuses Paired with Motion Estimation-Based Multi-Model Fitting.
Wenyu Yin, Shuyuan Lin, Yang Lu, Hanzi Wang
2024Diversified Semantic Distribution Matching for Dataset Distillation.
Hongcheng Li, Yucan Zhou, Xiaoyan Gu, Bo Li, Weiping Wang
2024Diversity Matters: User-Centric Multi-Interest Learning for Conversational Movie Recommendation.
Yongsen Zheng, Guohua Wang, Yang Liu, Liang Lin
2024Divide and Conquer: Isolating Normal-Abnormal Attributes in Knowledge Graph-Enhanced Radiology Report Generation.
Xiao Liang, Yanlei Zhang, Di Wang, Haodi Zhong, Ronghan Li, Quan Wang
2024Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection.
Jiaqi Zhu, Shaofeng Cai, Fang Deng, Beng Chin Ooi, Junran Wu
2024Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents.
Weiguang Zhang, Qiufeng Wang, Kaizhu Huang, Xiaowei Huang, Fengjun Guo, Xiaomeng Gu
2024Domain Generalization-Aware Uncertainty Introspective Learning for 3D Point Clouds Segmentation.
Pei He, Licheng Jiao, Lingling Li, Xu Liu, Fang Liu, Wenping Ma, Shuyuan Yang, Ronghua Shang
2024Domain Knowledge Enhanced Vision-Language Pretrained Model for Dynamic Facial Expression Recognition.
Liupeng Li, Yuhua Zheng, Shupeng Liu, Xiaoyin Xu, Taihao Li
2024Domain Shared and Specific Prompt Learning for Incremental Monocular Depth Estimation.
Zhiwen Yang, Liang Li, Jiehua Zhang, Tingyu Wang, Yaoqi Sun, Chenggang Yan
2024Domain-Agnostic Crowd Counting via Uncertainty-Guided Style Diversity Augmentation.
Guanchen Ding, Lingbo Liu, Zhenzhong Chen, Changwen Chen
2024Domain-Conditioned Transformer for Fully Test-time Adaptation.
Yushun Tang, Shuoshuo Chen, Jiyuan Jia, Yi Zhang, Zhihai He
2024Dr. CLIP: CLIP-Driven Universal Framework for Zero-Shot Sketch Image Retrieval.
Xue Li, Jiong Yu, Ziyang Li, Hongchun Lu, Ruifeng Yuan
2024DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships.
Zhang Wan, Sheng Tang, Jiawei Wei, Ruize Zhang, Juan Cao
2024DreamBooth++: Boosting Subject-Driven Generation via Region-Level References Packing.
Zhongyi Fan, Zixin Yin, Gang Li, Yibing Zhan, Heliang Zheng
2024DreamLCM: Towards High Quality Text-to-3D Generation via Latent Consistency Model.
Yiming Zhong, Xiaolin Zhang, Yao Zhao, Yunchao Wei
2024DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models.
Zhenyu Xie, Haoye Dong, Yufei Gao, Zehua Ma, Xiaodan Liang
2024Driving Scene Understanding with Traffic Scene-Assisted Topology Graph Transformer.
Fu Rong, Wenjin Peng, Meng Lan, Qian Zhang, Lefei Zhang
2024Dual Advancement of Representation Learning and Clustering for Sparse and Noisy Images.
Wenlin Li, Yucheng Xu, Xiaoqing Zheng, Suoya Han, Jun Wang, Xiaobo Sun
2024Dual-Branch Fusion with Style Modulation for Cross-Domain Few-Shot Semantic Segmentation.
Qiuyu Kong, Jiangming Chen, Jie Jiang, Zanxi Ruan, Lai Kang
2024Dual-Criterion Quality Loss for Blind Image Quality Assessment.
Desen Yuan, Lei Wang
2024Dual-Hybrid Attention Network for Specular Highlight Removal.
Xiaojiao Guo, Xuhang Chen, Shenghong Luo, Shuqiang Wang, Chi-Man Pun
2024Dual-Modeling Decouple Distillation for Unsupervised Anomaly Detection.
Xinyue Liu, Jianyuan Wang, Biao Leng, Shuo Zhang
2024Dual-Optimized Adaptive Graph Reconstruction for Multi-View Graph Clustering.
Zichen Wen, Tianyi Wu, Yazhou Ren, Yawen Ling, Chenhang Cui, Xiaorong Pu, Lifang He
2024Dual-Resolution Fusion Modeling for Unsupervised Cross-Resolution Person Re-Identification.
Zhiqi Pang, Lingling Zhao, Chunyu Wang
2024Dual-Stream Pre-Training Transformer to Enhance Multimodal Learning for Social Media Prediction.
Wenhao Hu, Weilong Chen, Weimin Yuan, Yan Wang, Shimin Cai, Yanru Zhang
2024Dual-head Genre-instance Transformer Network for Arbitrary Style Transfer.
Meichen Liu, Shuting He, Songnan Lin, Bihan Wen
2024Dual-path Collaborative Generation Network for Emotional Video Captioning.
Cheng Ye, Weidong Chen, Jingyu Li, Lei Zhang, Zhendong Mao
2024Dual-stream Feature Augmentation for Domain Generalization.
Shanshan Wang, ALuSi, Xun Yang, Ke Xu, Huibin Tan, Xingyi Zhang
2024Dual-stream Perception-driven Blind Quality Assessment for Stereoscopic Omnidirectional Images.
Zhaolin Wan, Qiushuang Yang, Zhiyang Li, Xiaopeng Fan, Wangmeng Zuo, Debin Zhao
2024Dual-view Pyramid Network for Video Frame Interpolation.
Yao Luo, Ming Yang, Jinhui Tang
2024DualFed: Enjoying both Generalization and Personalization in Federated Learning via Hierachical Representations.
Guogang Zhu, Xuefeng Liu, Jianwei Niu, Shaojie Tang, Xinghao Wu, Jiayuan Zhang
2024DySarl: Dynamic Structure-Aware Representation Learning for Multimodal Knowledge Graph Reasoning.
Kangzheng Liu, Feng Zhao, Yu Yang, Guandong Xu
2024Dynamic Evidence Decoupling for Trusted Multi-view Learning.
Ying Liu, Lihong Liu, Cai Xu, Xiangyu Song, Ziyu Guan, Wei Zhao
2024Dynamic Mixed-Prototype Model for Incremental Deepfake Detection.
Jiahe Tian, Cai Yu, Xi Wang, Peng Chen, Zihao Xiao, Jizhong Han, Yesheng Chai
2024Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding.
Hongyu Li, Tianrui Hui, Zihan Ding, Jing Zhang, Bin Ma, Xiaoming Wei, Jizhong Han, Si Liu
2024EAGLE: Egocentric AGgregated Language-video Engine.
Jing Bi, Yunlong Tang, Luchuan Song, Ali Vosoughi, Nguyen Nguyen, Chenliang Xu
2024ECAvatar: 3D Avatar Facial Animation with Controllable Identity and Emotion.
Minjing Yu, Delong Pang, Ziwen Kang, Zhiyao Sun, Tian Lv, Jenny Sheng, Ran Yi, Yu-Hui Wen, Yong-Jin Liu
2024ECFCON: Emotion Consequence Forecasting in Conversations.
Xincheng Ju, Dong Zhang, Suyang Zhu, Junhui Li, Shoushan Li, Guodong Zhou
2024EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations.
Zhenxi Song, Ruihan Qin, Huixia Ren, Zhen Liang, Yi Guo, Min Zhang, Zhiguo Zhang
2024EGGen: Image Generation with Multi-entity Prior Learning through Entity Guidance.
Zhenhong Sun, Junyan Wang, Zhiyu Tan, Daoyi Dong, Hailan Ma, Hao Li, Dong Gong
2024EGGesture: Entropy-Guided Vector Quantized Variational AutoEncoder for Co-Speech Gesture Generation.
Yiyong Xiao, Kai Shu, Haoyi Zhang, Baohua Yin, Wai Seng Cheang, Haoyang Wang, Jiechao Gao
2024EMVCC: Enhanced Multi-View Contrastive Clustering for Hyperspectral Images.
Fulin Luo, Yi Liu, Xiuwen Gong, Zhixiong Nan, Tan Guo
2024EPL-UFLSID: Efficient Pseudo Labels-Driven Underwater Forward-Looking Sonar Images Object Detection.
Cheng Shen, Liquan Shen, Mengyao Li, Meng Yu
2024ERL-MR: Harnessing the Power of Euler Feature Representations for Balanced Multi-modal Learning.
Weixiang Han, Chengjun Cai, Yu Guo, Jialiang Peng
2024Edge-assisted Real-time Dynamic 3D Point Cloud Rendering for Multi-party Mobile Virtual Reality.
Ximing Wu, Kongyange Zhao, Xu Chen, Teng Liang
2024Edit As You Wish: Video Caption Editing with Multi-grained User Control.
Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Xu Sun, Qin Jin
2024Edit3D: Elevating 3D Scene Editing with Attention-Driven Multi-Turn Interactivity.
Peng Zhou, Dunbo Cai, Yujian Du, Runqing Zhang, Bingbing Ni, Jie Qin, Ling Qian
2024Effective Optimization of Root Selection Towards Improved Explanation of Deep Classifiers.
Xin Zhang, Shenghua Zhong, Jianmin Jiang
2024Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Models.
Jiawei Chen, Dingkang Yang, Yue Jiang, Mingcheng Li, Jinjie Wei, Xiaolu Hou, Lihua Zhang
2024Efficient Dual-Confounding Eliminating for Weakly-supervised Temporal Action Localization.
Ao Li, Huijun Liu, Jinrong Sheng, Zhongming Chen, Yongxin Ge
2024Efficient Face Super-Resolution via Wavelet-based Feature Enhancement Network.
Wenjie Li, Heng Guo, Xuannan Liu, Kongming Liang, Jiani Hu, Zhanyu Ma, Jun Guo
2024Efficient Perceiving Local Details via Adaptive Spatial-Frequency Information Integration for Multi-focus Image Fusion.
Jingjia Huang, Jingyan Tu, Ge Meng, Yingying Wang, Yuhang Dong, Xiaotong Tu, Xinghao Ding, Yue Huang
2024Efficient Single Image Super-Resolution with Entropy Attention and Receptive Field Augmentation.
Xiaole Zhao, Linze Li, Chengxing Xie, Xiaoming Zhang, Ting Jiang, Wenjie Lin, Shuaicheng Liu, Tianrui Li
2024Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation.
Minsu Kim, Jeong Hun Yeo, Se Jin Park, Hyeongseop Rha, Yong Man Ro
2024Eglcr: Edge Structure Guidance and Scale Adaptive Attention for Iterative Stereo Matching.
Zhien Dai, Zhaohui Tang, Hu Zhang, Can Tian, Mingjun Pan, Yongfang Xie
2024Ego3DT: Tracking Every 3D Object in Ego-centric Videos.
Shengyu Hao, Wenhao Chai, Zhonghan Zhao, Meiqi Sun, Wendi Hu, Jieyang Zhou, Yixian Zhao, Qi Li, Yizhou Wang, Xi Li, Gaoang Wang
2024Egocentric Vehicle Dense Video Captioning.
Feiyu Chen, Cong Xu, Qi Jia, Yihua Wang, Yuhan Liu, Haotian Zhang, Endong Wang
2024Eliminate Before Align: A Remote Sensing Image-Text Retrieval Framework with Keyword Explicit Reasoning.
Zhong Ji, Changxu Meng, Yan Zhang, Haoran Wang, Yanwei Pang, Jungong Han
2024Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization.
Xingqi Wang, Xiaoyuan Yi, Xing Xie, Jia Jia
2024Embodied Contrastive Learning with Geometric Consistency and Behavioral Awareness for Object Navigation.
Bolei Chen, Jiaxu Kang, Ping Zhong, Yixiong Liang, Yu Sheng, Jianxin Wang
2024Embodied Laser Attack: Leveraging Scene Priors to Achieve Agent-based Robust Non-contact Attacks.
Yitong Sun, Yao Huang, Xingxing Wei
2024Embracing Adaptation: An Effective Dynamic Defense Strategy Against Adversarial Examples.
Shenglin Yin, Kelu Yao, Zhen Xiao, Jieyi Long
2024Embracing Domain Gradient Conflicts: Domain Generalization Using Domain Gradient Equilibrium.
Zuyu Zhang, Yan Li, Byung-Seok Shin
2024Emotion Recognition in HMDs: A Multi-task Approach Using Physiological Signals and Occluded Faces.
Yunqiang Pei, Jialei Tang, Qihang Tang, Mingfeng Zha, Dongyu Xie, Guoqing Wang, Zhitao Liu, Ning Xie, Peng Wang, Yang Yang, Hengtao Shen
2024Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation.
Fengqi Liu, Hexiang Wang, Jingyu Gong, Ran Yi, Qianyu Zhou, Xuequan Lu, Jiangbo Lu, Lizhuang Ma
2024Empowering People to Harness and Control their Multimodal Data in Scrutable User models.
Judy Kay
2024Enabling Synergistic Full-Body Control in Prompt-Based Co-Speech Motion Generation.
Bohong Chen, Yumeng Li, Yao-Xiang Ding, Tianjia Shao, Kun Zhou
2024End-to-end Spatio-Temporal Information Aggregation For Micro-Action Detection.
Jun Yu, Mohan Jing, Guopeng Zhao, Keda Lu, Yifan Wang, Feng Zhao, Jiaqing Sun, Qingsong Liu, Jiaen Liang
2024Engaging Live Video Comments Generation.
Ge Luo, Yuchen Ma, Manman Zhang, Junqiang Huang, Sheng Li, Zhenxing Qian, Xinpeng Zhang
2024Enhanced Experts with Uncertainty-Aware Routing for Multimodal Sentiment Analysis.
Zixian Gao, Disen Hu, Xun Jiang, Huimin Lu, Heng Tao Shen, Xing Xu
2024Enhanced Screen Content Image Compression: A Synergistic Approach for Structural Fidelity and Text Integrity Preservation.
Fangtao Zhou, Xiaofeng Huang, Peng Zhang, Meng Wang, Zhao Wang, Yang Zhou, Haibing Yin
2024Enhanced Tensorial Self-representation Subspace Learning for Incomplete Multi-view Clustering.
Hangjun Che, Xinyu Pu, Deqiang Ouyang, Beibei Li
2024Enhancing Adaptive Deep Networks for Image Classification via Uncertainty-aware Decision Fusion.
Xu Zhang, Zhipeng Xie, Haiyang Yu, Qitong Wang, Peng Wang, Wei Wang
2024Enhancing Images with Coupled Low-Resolution and Ultra-Dark Degradations: A Tri-level Learning Framework.
Jiaxin Gao, Yaohua Liu
2024Enhancing Micro-Expression Analysis Performance by Effectively Addressing Data Imbalance.
Yuhong He, Wenchao Liu, Guangyu Wang, Lin Ma, Haifeng Li
2024Enhancing Model Interpretability with Local Attribution over Global Exploration.
Zhiyu Zhu, Zhibo Jin, Jiayu Zhang, Huaming Chen
2024Enhancing Multi-view Graph Neural Network with Cross-view Confluent Message Passing.
Shuman Zhuang, Sujia Huang, Wei Huang, Yuhong Chen, Zhihao Wu, Ximeng Liu
2024Enhancing Multimodal Large Language Models on Demonstrative Multi-Image Instructions.
Xian Fu
2024Enhancing Pre-trained ViTs for Downstream Task Adaptation: A Locality-Aware Prompt Learning Method.
Shaokun Wang, Yifan Yu, Yuhang He, Yihong Gong
2024Enhancing Robustness in Learning with Noisy Labels: An Asymmetric Co-Training Approach.
Mengmeng Sheng, Zeren Sun, Gensheng Pei, Tao Chen, Haonan Luo, Yazhou Yao
2024Enhancing Speaking and Slide Design Skills with Deep Learning: An Online Presentation Assessment System.
Shengzhou Yi, Junichiro Matsugami, Takuya Yamamoto, Toshihiko Yamasaki
2024Enhancing Transformer-based Semantic Matching for Few-shot Learning through Weakly Contrastive Pre-training.
Wei Yang, Tengfei Huo, Zhiqiang Liu
2024Enhancing Underwater Images via Asymmetric Multi-Scale Invertible Networks.
Yuhui Quan, Xiaoheng Tan, Yan Huang, Yong Xu, Hui Ji
2024Enhancing Unsupervised Visible-Infrared Person Re-Identification with Bidirectional-Consistency Gradual Matching.
Xiao Teng, Xingyu Shen, Kele Xu, Long Lan
2024Equilibrated Diffusion: Frequency-aware Textual Embedding for Equilibrated Image Customization.
Liyuan Ma, Xueji Fang, Guo-Jun Qi
2024Estimating the Semantic Density of Visual Media.
Luca Rossetto, Cristina Sarasua, Abraham Bernstein
2024Event Traffic Forecasting with Sparse Multimodal Data.
Xiao Han, Zhenduo Zhang, Yiling Wu, Xinfeng Zhang, Zhe Wu
2024Event-Guided Rolling Shutter Correction with Time-Aware Cross-Attentions.
Hefei Huang, Xu Jia, Xinyu Zhang, Shengming Li, Huchuan Lu
2024Event-ID: Intrinsic Decomposition Using an Event Camera.
Zehao Chen, Zhan Lu, De Ma, Huajin Tang, Xudong Jiang, Qian Zheng, Gang Pan
2024EvilEdit: Backdooring Text-to-Image Diffusion Models in One Second.
Hao Wang, Shangwei Guo, Jialing He, Kangjie Chen, Shudong Zhang, Tianwei Zhang, Tao Xiang
2024Evolution-aware VAriance (EVA) Coreset Selection for Medical Image Classification.
Yuxin Hong, Xiao Zhang, Xin Zhang, Joey Tianyi Zhou
2024Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models.
Xiyu Wang, Yufei Wang, Satoshi Tsutsui, Weisi Lin, Bihan Wen, Alex C. Kot
2024Expanded Convolutional Neural Network Based Look-Up Tables for High Efficient Single-Image Super-Resolution.
Kai Yin, Jie Shen
2024Explicit Granularity and Implicit Scale Correspondence Learning for Point-Supervised Video Moment Localization.
Kun Wang, Hao Liu, Lirong Jie, Zixu Li, Yupeng Hu, Liqiang Nie
2024Explore Hybrid Modeling for Moving Infrared Small Target Detection.
Mingjin Zhang, Shilong Liu, Yuanjun Ouyang, Jie Guo, Zhihong Tang, Yunsong Li
2024Exploring Data Efficiency in Image Restoration: A Gaussian Denoising Case Study.
Zhengwei Yin, Mingze Ma, Guixu Lin, Yinqiang Zheng
2024Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection.
Zhenni Yu, Xiaoqin Zhang, Li Zhao, Yi Bin, Guobao Xiao
2024Exploring Matching Rates: From Keypoint Selection to Camera Relocalization.
Hu Lin, Chengjiang Long, Yifeng Fei, Qianchen Xia, Erwei Yin, Baocai Yin, Xin Yang
2024Exploring Robust Face-Voice Matching in Multilingual Environments.
Jiehui Tang, Xiaofei Wang, Zhen Xiao, Jiayi Liu, Xueliang Liu, Richang Hong
2024Exploring Stable Meta-Optimization Patterns via Differentiable Reinforcement Learning for Few-Shot Classification.
Zheng Han, Xiaobin Zhu, Chun Yang, Hongyang Zhou, Jingyan Qin, Xu-Cheng Yin
2024Exploring in Extremely Dark: Low-Light Video Enhancement with Real Events.
Xicong Wang, Huiyuan Fu, Jiaxuan Wang, Xin Wang, Heng Zhang, Huadong Ma
2024Exploring the Robustness of Decision-Level Through Adversarial Attacks on LLM-Based Embodied Models.
Shuyuan Liu, Jiawei Chen, Shouwei Ruan, Hang Su, Zhaoxia Yin
2024Exploring the Use of Abusive Generative AI Models on Civitai.
Yiluo Wei, Yiming Zhu, Pan Hui, Gareth Tyson
2024Exposure Completing for Temporally Consistent Neural High Dynamic Range Video Rendering.
Jiahao Cui, Wei Jiang, Zhan Peng, Zhiyu Pan, Zhiguo Cao
2024ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control.
Shuqi Dai, Ming-Yu Liu, Rafael Valle, Siddharth Gururani
2024F-3DGS: Factorized Coordinates and Representations for 3D Gaussian Splatting.
Xiangyu Sun, Joo Chan Lee, Daniel Rho, Jong Hwan Ko, Usman Ali, Eunbyung Park
2024FARFusion V2: A Geometry-based Radar-Camera Fusion Method on the Ground for Roadside Far-Range 3D Object Detection.
Yao Li, Jiajun Deng, Yuxuan Xiao, Yingjie Wang, Xiaomeng Chu, Jianmin Ji, Yanyong Zhang
2024FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation.
Xiang Gao, Jiaying Liu
2024FC-4DFS: Frequency-controlled Flexible 4D Facial Expression Synthesizing.
Xin Lu, Chuanqing Zhuang, Zhengda Lu, Yiqun Wang, Jun Xiao
2024FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model.
Ziyu Yao, Xuxin Cheng, Zhiqi Huang
2024FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models.
Changgu Chen, Libing Yang, Xiaoyan Yang, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li
2024FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs.
Xuannan Liu, Peipei Li, Huaibo Huang, Zekun Li, Xing Cui, Jiahao Liang, Lixiong Qin, Weihong Deng, Zhaofeng He
2024FLIP-80M: 80 Million Visual-Linguistic Pairs for Facial Language-Image Pre-Training.
Yudong Li, Xianxu Hou, Dezhi Zheng, Linlin Shen, Zhe Zhao
2024FM-CLIP: Flexible Modal CLIP for Face Anti-Spoofing.
Ajian Liu, Hui Ma, Junze Zheng, Haocheng Yuan, Xiaoyuan Yu, Yanyan Liang, Sergio Escalera, Jun Wan, Zhen Lei
2024FOCT: Few-shot Industrial Anomaly Detection with Foreground-aware Online Conditional Transport.
Long Tian, Hongyi Zhao, Ruiying Lu, Rongrong Wang, Yujie Wu, Liming Wang, Xiongpeng He, Xiyang Liu
2024FRADE: Forgery-aware Audio-distilled Multimodal Learning for Deepfake Detection.
Fan Nie, Jiangqun Ni, Jian Zhang, Bin Zhang, Weizhe Zhang
2024FSL-QuickBoost: Minimal-Cost Ensemble for Few-Shot Learning.
Yunwei Bai, Bill Yang Cai, Ying Kiat Tan, Zangwei Zheng, Shiming Chen, Tsuhan Chen
2024FSVFG: Towards Immersive Full-Scene Volumetric Video Streaming with Adaptive Feature Grid.
Daheng Yin, Jianxin Shi, Miao Zhang, Zhaowu Huang, Jiangchuan Liu, Fang Dong
2024FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning.
Jinhui Pang, Changqing Lin, Xiaoshuai Hao, Rong Yin, Zixuan Wang, Zhihui Zhang, Jinglin He, Huang Tai Sheng
2024FacialFlowNet: Advancing Facial Optical Flow Estimation with a Diverse Dataset and a Decomposed Model.
Jianzhi Lu, Ruian He, Shili Zhou, Weimin Tan, Bo Yan
2024FacialPulse: An Efficient RNN-based Depression Detection via Temporal Facial Landmarks.
Ruiqi Wang, Jinyang Huang, Jie Zhang, Xin Liu, Xiang Zhang, Zhi Liu, Peng Zhao, Sigui Chen, Xiao Sun
2024Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales.
Minghe Gao, Shuang Chen, Liang Pang, Yuan Yao, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua
2024FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process.
Yuyan Bu, Qiang Sheng, Juan Cao, Peng Qi, Danding Wang, Jintao Li
2024Fast Elastic-Net Multi-view Clustering: A Geometric Interpretation Perspective.
Yalan Qin, Li Qian
2024Fast and Scalable Incomplete Multi-View Clustering with Duality Optimal Graph Filtering.
Liang Du, Yukai Shi, Yan Chen, Peng Zhou, Yuhua Qian
2024FedBCGD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning.
Junkang Liu, Fanhua Shang, Yuanyuan Liu, Hongying Liu, Yuangang Li, YunXiang Gong
2024FedCAFE: Federated Cross-Modal Hashing with Adaptive Feature Enhancement.
Ting Fu, Yu-Wei Zhan, Chong-Yu Zhang, Xin Luo, Zhen-Duo Chen, Yongxin Wang, Xun Yang, Xin-Shun Xu
2024FedDEO: Description-Enhanced One-Shot Federated Learning with Diffusion Models.
Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue
2024FedEvalFair: A Privacy-Preserving and Statistically Grounded Federated Fairness Evaluation Framework.
Zhongchi Wang, Hailong Sun, Zhengyang Zhao
2024FedSLS: Exploring Federated Aggregation in Saliency Latent Space.
Hengyi Wang, Weiying Xie, Jitao Ma, Daixun Li, Yunsong Li
2024Federated Fuzzy C-means with Schatten-
Wei Feng, Zhenwei Wu, Qianqian Wang, Bo Dong, Quanxue Gao
2024Federated Morozov Regularization for Shortcut Learning in Privacy Preserving Learning with Watermarked Image Data.
Tao Ling, Siping Shi, Hao Wang, Chuang Hu, Dan Wang
2024Few-Shot Joint Multimodal Entity-Relation Extraction via Knowledge-Enhanced Cross-modal Prompt Model.
Li Yuan, Yi Cai, Junsheng Huang
2024Few-Shot Multimodal Explanation for Visual Question Answering.
Dizhan Xue, Shengsheng Qian, Changsheng Xu
2024Few-shot Semantic Segmentation via Perceptual Attention and Spatial Control.
Guangchen Shi, Wei Zhu, Yirui Wu, Danhuai Zhao, Kang Zheng, Tong Lu
2024FewVS: A Vision-Semantics Integration Framework for Few-Shot Image Classification.
Zhuoling Li, Yong Wang, Kaitong Li
2024FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization.
Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, Jinqiao Wang
2024Finding Input Data Domains of Image Classification Models with Hard-Label Black-Box Access.
Jiyi Zhang, Han Fang, Ee-Chien Chang
2024Fine-Grained Prompt Learning for Face Anti-Spoofing.
Xueli Hu, Huan Liu, Haocheng Yuan, Zhiyang Fu, Yizhi Luo, Ning Zhang, Hang Zou, Jianwen Gan, Yuan Zhang
2024Fine-Grained Side Information Guided Dual-Prompts for Zero-Shot Skeleton Action Recognition.
Yang Chen, Jingcai Guo, Tian He, Xiaocheng Lu, Ling Wang
2024Fine-grained Semantic Alignment with Transferred Person-SAM for Text-based Person Retrieval.
Yihao Wang, Meng Yang, Rui Cao
2024FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs.
Haodong Chen, Haojian Huang, Junhao Dong, Mingzhe Zheng, Dian Shao
2024FlashSpeech: Efficient Zero-Shot Speech Synthesis.
Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Wei Xue, Qifeng Liu, Yike Guo
2024FlexIR: Towards Flexible and Manipulable Image Restoration.
Zhengwei Yin, Guixu Lin, Mengshun Hu, Hao Zhang, Yinqiang Zheng
2024Focus & Gating: A Multimodal Approach for Unveiling Relations in Noisy Social Media.
Liang He, Hongke Wang, Zhen Wu, Jianbing Zhang, Xinyu Dai, Jiajun Chen
2024Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval.
Gangyan Zeng, Yuan Zhang, Jin Wei, Dongbao Yang, Peng Zhang, Yiwen Gao, Xugong Qin, Yu Zhou
2024FodFoM: Fake Outlier Data by Foundation Models Creates Stronger Visual Out-of-Distribution Detector.
Jiankang Chen, Ling Deng, Zhiyong Gan, Wei-Shi Zheng, Ruixuan Wang
2024Fooling 3D Face Recognition with One Single 2D Image.
Shizong Yan, Huixiang Wen, Shan Chang, Hongzi Zhu, Luo Zhou
2024Foreground Harmonization and Shadow Generation for Composite Image.
Jing Zhou, Ziqi Yu, Zhongyun Bao, Gang Fu, Weilei He, Chao Liang, Chunxia Xiao
2024Fractional Correspondence Framework in Detection Transformer.
Masoumeh Zareapoor, Pourya Shamsolmoali, Huiyu Zhou, Yue Lu, Salvador García
2024Frame Interpolation with Consecutive Brownian Bridge Diffusion.
Zonglin Lyu, Ming Li, Jianbo Jiao, Chen Chen
2024Free Lunch: Frame-level Contrastive Learning with Text Perceiver for Robust Scene Text Recognition in Lightweight Models.
Hongjian Zhan, Yangfu Li, Yu-Jie Xiong, Umapada Pal, Yue Lu
2024FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process.
Yang Luo, Yiheng Zhang, Zhaofan Qiu, Ting Yao, Zhineng Chen, Yu-Gang Jiang, Tao Mei
2024FreePIH: Training-Free Painterly Image Harmonization with Diffusion Model.
Ruibin Li, Jingcai Guo, Qihua Zhou, Song Guo
2024Freehand Sketch Generation from Mechanical Components.
Zhichao Liao, Fengyuan Piao, Di Huang, Xinghui Li, Yue Ma, Pingfa Feng, Heming Fang, Long Zeng
2024FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining.
Zhen Zou, Hu Yu, Jie Huang, Feng Zhao
2024Frequency Guidance Matters: Skeletal Action Recognition by Frequency-Aware Mixed Transformer.
Wenhan Wu, Ce Zheng, Zihao Yang, Chen Chen, Srijan Das, Aidong Lu
2024Frequency-Aware GAN for Imperceptible Transfer Attack on 3D Point Clouds.
Xiaowen Cai, Yunbo Tao, Daizong Liu, Pan Zhou, Xiaoye Qu, Jianfeng Dong, Keke Tang, Lichao Sun
2024From Assistants to Agents in the LLM Era.
Pascale Fung
2024From Covert Hiding To Visual Editing: Robust Generative Video Steganography.
Xueying Mao, Xiaoxiao Hu, Wanli Peng, Zhenliang Gan, Zhenxing Qian, Xinpeng Zhang, Sheng Li
2024From Multimodal LLM to Human-level AI: Modality, Instruction, Reasoning and Beyond.
Hao Fei, Xiangtai Li, Haotian Liu, Fuxiao Liu, Zhuosheng Zhang, Hanwang Zhang, Shuicheng Yan
2024From Question to Exploration: Can Classic Test-Time Adaptation Strategies Be Effectively Applied in Semantic Segmentation?
Chang'an Yi, Haotian Chen, Yifan Zhang, Yonghui Xu, Yan Zhou, Lizhen Cui
2024From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning.
Zhedong Zhang, Liang Li, Gaoxiang Cong, Haibing Yin, Yuhan Gao, Chenggang Yan, Anton van den Hengel, Yuankai Qi
2024Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models.
Tianyi Lu, Xing Zhang, Jiaxi Gu, Renjing Pei, Songcen Xu, Xingjun Ma, Hang Xu, Zuxuan Wu
2024FusionOcc: Multi-Modal Fusion for 3D Occupancy Prediction.
Shuo Zhang, Yupeng Zhai, Jilin Mei, Yu Hu
2024Future Motion Dynamic Modeling via Hybrid Supervision for Multi-Person Motion Prediction Uncertainty Reduction.
Yan Zhuang, Yanlu Cai, Weizhong Zhang, Cheng Jin
2024G-Refine: A General Quality Refiner for Text-to-Image Generation.
Chunyi Li, Haoning Wu, Hongkun Hao, Zicheng Zhang, Tengchuan Kou, Chaofeng Chen, Lei Bai, Xiaohong Liu, Weisi Lin, Guangtao Zhai
2024GAN-based Symmetric Embedding Costs Adjustment for Enhancing Image Steganographic Security.
Miaoxin Ye, Saixing Zhou, Weiqi Luo, Shunquan Tan, Jiwu Huang
2024GDR-GMA: Machine Unlearning via Direction-Rectified and Magnitude-Adjusted Gradients.
Shen Lin, Xiaoyu Zhang, Willy Susilo, Xiaofeng Chen, Jun Liu
2024GG-Editor: Locally Editing 3D Avatars with Multimodal Large Language Model Guidance.
Yunqiu Xu, Linchao Zhu, Yi Yang
2024GIST: Improving Parameter Efficient Fine-Tuning via Knowledge Interaction.
Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Suncheng Xiang, Zefang Yu, Ting Liu, Yuzhuo Fu, Xiaoye Qu
2024GLATrack: Global and Local Awareness for Open-Vocabulary Multiple Object Tracking.
Guangyao Li, Yajun Jian, Yan Yan, Hanzi Wang
2024GLGait: A Global-Local Temporal Receptive Field Network for Gait Recognition in the Wild.
Guozhen Peng, Yunhong Wang, Yuwei Zhao, Shaoxiong Zhang, Annan Li
2024GLoMo: Global-Local Modal Fusion for Multimodal Sentiment Analysis.
Yan Zhuang, Yanru Zhang, Zheng Hu, Xiaoyue Zhang, Jiawen Deng, Fuji Ren
2024GOAL: Grounded text-to-image Synthesis with Joint Layout Alignment Tuning.
Yaqi Li, Han Fang, Zerun Feng, Kaijing Ma, Chao Ban, Xianghao Zang, Lanxiang Zhou, Zhongjiang He, Jingyan Chen, Jiani Hu, Hao Sun, Huayu Zhang
2024GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane.
Yansong Qu, Shaohui Dai, Xinyang Li, Jianghang Lin, Liujuan Cao, Shengchuan Zhang, Rongrong Ji
2024GPD-VVTO: Preserving Garment Details in Video Virtual Try-On.
Yuanbin Wang, Weilun Dai, Long Chan, Huanyu Zhou, Aixi Zhang, Si Liu
2024GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation.
Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
2024GRACE: GRadient-based Active Learning with Curriculum Enhancement for Multimodal Sentiment Analysis.
Xinyu Li, Wenqing Ye, Yueyi Zhang, Xiaoyan Sun
2024GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution.
Yuzhen Li, Zehang Deng, Yuxin Cao, Lihua Liu
2024GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis.
WeiZhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li
2024GS
Linfei Li, Lin Zhang, Zhong Wang, Ying Shen
2024GS
Chengshun Wang, Na Zhao
2024GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System.
Shuo Wang, Yongcai Wang, Zhimin Xu, Yongyu Guo, Wanting Li, Zhe Huang, Xuewei Bai, Deying Li
2024Gait Recognition in Large-scale Free Environment via Single LiDAR.
Xiao Han, Yiming Ren, Peishan Cong, Yujing Sun, Jingya Wang, Lan Xu, Yuexin Ma
2024GalleryGPT: Analyzing Paintings with Large Multimodal Models.
Yi Bin, Wenhao Shi, Yujuan Ding, Zhiqiang Hu, Zheng Wang, Yang Yang, See-Kiong Ng, Heng Tao Shen
2024Gaussian Mutual Information Maximization for Efficient Graph Self-Supervised Learning: Bridging Contrastive-based to Decorrelation-based.
Jinyong Wen
2024Gaussian Splatting with Neural Basis Extension.
Zhi Zhou, Junke Zhu, Zhangjin Huang
2024GaussianTalker: Real-Time Talking Head Synthesis with 3D Gaussian Splatting.
Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn, Seungryong Kim
2024GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting.
Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu
2024GeNSeg-Net: A General Segmentation Framework for Any Nucleus in Immunohistochemistry Images.
Siyuan Xu, Guannan Li, Haofei Song, Jiansheng Wang, Yan Wang, Qingli Li
2024GenUDC: High Quality 3D Mesh Generation With Unsigned Dual Contouring Representation.
Ruowei Wang, Jiaqi Li, Dan Zeng, Xueqi Ma, Zixiang Xu, Jianwei Zhang, Qijun Zhao
2024Generalize to Fully Unseen Graphs: Learn Transferable Hyper-Relation Structures for Inductive Link Prediction.
Jing Yang, Xiaowen Jiang, Yuan Gao, Laurence T. Yang, Jieming Yang
2024Generalized News Event Discovery via Dynamic Augmentation and Entropy Optimization.
Zehang Lin, Jiayuan Xie, Zhenguo Yang, Yi Yu, Qing Li
2024Generalized Sampling of Non-Local Textural Clues Multi-View Stereo Framework.
Jingyuan Tang, Yangang Cai, Xuesong Gao, Songlin Sun
2024Generalized Source-Free Domain-adaptive Segmentation via Reliable Knowledge Propagation.
Qi Zang, Shuang Wang, Dong Zhao, Yang Hu, Dou Quan, Jinlong Li, Nicu Sebe, Zhun Zhong
2024Generalizing ISP Model by Unsupervised Raw-to-raw Mapping.
Dongyu Xie, Chaofan Qiao, Lanyue Liang, Zhiwen Wang, Tianyu Li, Qiao Liu, Chongyi Li, Guoqing Wang, Yang Yang
2024Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition.
Chengyou Jia, Minnan Luo, Xiaojun Chang, Zhuohang Dang, Mingfei Han, Mengmeng Wang, Guang Dai, Sizhe Dang, Jingdong Wang
2024Generating Multimodal Metaphorical Features for Meme Understanding.
Bo Xu, Junzhe Zheng, Jiayuan He, Yuxuan Sun, Hongfei Lin, Liang Zhao, Feng Xia
2024Generating Prompts in Latent Space for Rehearsal-free Continual Learning.
Chengyi Yang, Wentao Liu, Shisong Chen, Jiayin Qi, Aimin Zhou
2024Generative AI in Multimedia: Challenges and Opportunities for Academic and Industrial Impact.
Zi Helen Huang, Phoebe Chen, Shuicheng Yan
2024Generative Active Learning for Image Synthesis Personalization.
Xulu Zhang, Wengyu Zhang, Xiaoyong Wei, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li
2024Generative Expressive Conversational Speech Synthesis.
Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li
2024Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space.
Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu
2024Generative Multimodal Data Augmentation for Low-Resource Multimodal Named Entity Recognition.
Ziyan Li, Jianfei Yu, Jia Yang, Wenya Wang, Li Yang, Rui Xia
2024Generative Text Steganography with Large Language Model.
Jiaxuan Wu, Zhengxian Wu, Yiming Xue, Juan Wen, Wanli Peng
2024GeoFormer: Learning Point Cloud Completion with Tri-Plane Integrated Transformer.
Jinpeng Yu, Binbin Huang, Yuxuan Zhang, Huaxia Li, Xu Tang, Shenghua Gao
2024Geometry-Guided Diffusion Model with Masked Transformer for Robust Multi-View 3D Human Pose Estimation.
Xinyi Zhang, Qinpeng Cui, Qiqi Bao, Wenming Yang, Qingmin Liao
2024Global Patch-wise Attention is Masterful Facilitator for Masked Image Modeling.
Gongli Xi, Ye Tian, Mengyu Yang, Lanshan Zhang, Xirong Que, Wendong Wang
2024Graph Convolutional Semi-Supervised Cross-Modal Hashing.
Xiaobo Shen, Gaoyao Yu, Yinfan Chen, Xichen Yang, Yuhui Zheng
2024Graph based Consistency Learning for Contrastive Multi-View Clustering.
Binbin Xu, Jun Yin, Nan Zhang
2024GraphLearner: Graph Node Clustering with Fully Learnable Augmentation.
Xihong Yang, Erxue Min, Ke Liang, Yue Liu, Siwei Wang, Sihang Zhou, Huijun Wu, Xinwang Liu, En Zhu
2024Group Vision Transformer.
Yaopeng Peng, Milan Sonka, Danny Z. Chen
2024Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video Compression.
Zhenghao Chen, Luping Zhou, Zhihao Hu, Dong Xu
2024GuidedNet: Semi-Supervised Multi-Organ Segmentation via Labeled Data Guide Unlabeled Data.
Haochen Zhao, Hui Meng, Deqian Yang, Xiaozheng Xie, Xiaoze Wu, Qingfeng Li, Jianwei Niu
2024HGOE: Hybrid External and Internal Graph Outlier Exposure for Graph Out-of-Distribution Detection.
Junwei He, Qianqian Xu, Yangbangyan Jiang, Zitai Wang, Yuchen Sun, Qingming Huang
2024HICEScore: A Hierarchical Metric for Image Captioning Evaluation.
Zequn Zeng, Jianqiao Sun, Hao Zhang, Tiansheng Wen, Yudi Su, Yan Xie, Zhengjue Wang, Bo Chen
2024HINER: Neural Representation for Hyperspectral Image.
Junqi Shi, Mingyi Jiang, Ming Lu, Tong Chen, Xun Cao, Zhan Ma
2024HKDSME: Heterogeneous Knowledge Distillation for Semi-supervised Singing Melody Extraction Using Harmonic Supervision.
Shuai Yu, Xiaoliang He, Ke Chen, Yi Yu
2024HMR-Adapter: A Lightweight Adapter with Dual-Path Cross Augmentation for Expressive Human Mesh Recovery.
Wenhao Shen, Wanqi Yin, Hao Wang, Chen Wei, Zhongang Cai, Lei Yang, Guosheng Lin
2024HOGDA: Boosting Semi-supervised Graph Domain Adaptation via High-Order Structure-Guided Adaptive Feature Alignment.
Jun Dan, Weiming Liu, Mushui Liu, Chunfeng Xie, Shunjie Dong, Guofang Ma, Yanchao Tan, Jiazheng Xing
2024HPC: Hierarchical Progressive Coding Framework for Volumetric Video.
Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya Zhang, Yanfeng Wang
2024HS-Surf: A Novel High-Frequency Surface Shell Radiance Field to Improve Large-Scale Scene Rendering.
Jiongming Qin, Fei Luo, Tuo Cao, Wenju Xu, Chunxia Xiao
2024Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models.
Chaoya Jiang, Hongrui Jia, Mengfan Dong, Wei Ye, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang
2024Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs.
Peng Ding, Jingyu Wu, Jun Kuang, Dan Ma, Xuezhi Cao, Xunliang Cai, Shi Chen, Jiajun Chen, Shujian Huang
2024HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting.
Wenquan Lu, Yufei Xu, Jing Zhang, Chaoyue Wang, Dacheng Tao
2024Harmfully Manipulated Images Matter in Multimodal Misinformation Detection.
Bing Wang, Shengsheng Wang, Changchun Li, Renchu Guan, Ximing Li
2024HarmonicNeRF: Geometry-Informed Synthetic View Augmentation for 3D Scene Reconstruction in Driving Scenarios.
Xiaochao Pan, Jiawei Yao, Hongrui Kou, Tong Wu, Canran Xiao
2024Harmony Everything! Masked Autoencoders for Video Harmonization.
Yuhang Li, Jincen Jiang, Xiaosong Yang, Youdong Ding, Jian Jun Zhang
2024Harmony in Diversity: Improving All-in-One Image Restoration via Multi-Task Collaboration.
Gang Wu, Junjun Jiang, Kui Jiang, Xianming Liu
2024Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanced Video Large Language Model.
Jianing Zhao, Jingjing Wang, Yujie Jin, Jiamin Luo, Guodong Zhou
2024HazeSpace2M: A Dataset for Haze Aware Single Image Dehazing.
Md Tanvir Islam, Nasir Rahim, Saeed Anwar, Muhammad Saqib, Sambit Bakshi, Khan Muhammad
2024HcaNet: Haze-concentration-aware Network for Real-scene Dehazing with Codebook Priors.
Yi Liu, Jiachen Li, Yanchun Ma, Qing Xie, Yongjian Liu
2024HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets.
Yili Jin, Xize Duan, Fangxin Wang, Xue Liu
2024Hearing the Moment with MetaEcho! From Physical to Virtual in Synchronized Sound Recording.
Zheng Wei, Yuzheng Chen, Wai Tong, Xuan Zong, Huamin Qu, Xian Xu, Lik-Hang Lee
2024HeroMaker: Human-centric Video Editing with Motion Priors.
Shiyu Liu, Zibo Zhao, Yihao Zhi, Yiqun Zhao, Binbin Huang, Shuo Wang, Ruoyu Wang, Michael Xuan, Zhengxin Li, Shenghua Gao
2024Heterogeneity-Aware Federated Deep Multi-View Clustering towards Diverse Feature Representations.
Xiaorui Jiang, Zhongyi Ma, Yulin Fu, Yong Liao, Pengyuan Zhou
2024Heterogeneous Graph Guided Contrastive Learning for Spatially Resolved Transcriptomics Data.
Xiao He, Chang Tang, Xinwang Liu, Chuankun Li, Shan An, Zhenglai Li
2024Heterophilic Graph Invariant Learning for Out-of-Distribution of Fraud Detection.
Lingfei Ren, Ruimin Hu, Zheng Wang, Yilin Xiao, Dengshi Li, Junhang Wu, Yilong Zang, Jinzhang Hu, Zijun Huang
2024Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models.
Haibo Yang, Yang Chen, Yingwei Pan, Ting Yao, Zhineng Chen, Chong-Wah Ngo, Tao Mei
2024HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
Linhui Xiao, Xiaoshan Yang, Fang Peng, Yaowei Wang, Changsheng Xu
2024HideMIA: Hidden Wavelet Mining for Privacy-Enhancing Medical Image Analysis.
Xun Lin, Yi Yu, Zitong Yu, Ruohan Meng, Jiale Zhou, Ajian Liu, Yizhong Liu, Shuai Wang, Wenzhong Tang, Zhen Lei, Alex C. Kot
2024Hierarchical Debiasing and Noisy Correction for Cross-domain Video Tube Retrieval.
Jingqiao Xiu, Mengze Li, Wei Ji, Jingyuan Chen, Hanbin Zhao, Shin'ichi Satoh, Roger Zimmermann
2024Hierarchical Multi-label Learning for Incremental Multilingual Text Recognition.
Xiao-Qian Liu, Minghui Liu, Zhen-Duo Chen, Xin Luo, Xin-Shun Xu
2024Hierarchical Perceptual and Predictive Analogy-Inference Network for Abstract Visual Reasoning.
Wentao He, Jianfeng Ren, Ruibin Bai, Xudong Jiang
2024High Fidelity Aggregated Planar Prior Assisted PatchMatch Multi-View Stereo.
Jie Liang, Rongjie Wang, Rui Peng, Zhe Zhang, Kaiqiang Xiong, Ronggang Wang
2024Higher-Order Vision-Language Alignment for Social Media Prediction.
Mingsheng Tu, Tianjiao Wan, Qisheng Xu, Xinhao Jiang, Kele Xu, Cheng Yang
2024HighlightRemover: Spatially Valid Pixel Learning for Image Specular Highlight Removal.
Ling Zhang, Yidong Ma, Zhi Jiang, Weilei He, Zhongyun Bao, Gang Fu, Wenju Xu, Chunxia Xiao
2024Highly Efficient No-reference 4K Video Quality Assessment with Full-Pixel Covering Sampling and Training Strategy.
Xiaoheng Tan, Jiabin Zhang, Yuhui Quan, Jing Li, Yajing Wu, Zilin Bian
2024Highly Transferable Diffusion-based Unrestricted Adversarial Attack on Pre-trained Vision-Language Models.
Wenzhuo Xu, Kai Chen, Ziyi Gao, Zhipeng Wei, Jingjing Chen, Yu-Gang Jiang
2024HmPEAR: A Dataset for Human Pose Estimation and Action Recognition.
Yitai Lin, Zhijie Wei, Wanfa Zhang, Xiping Lin, Yudi Dai, Chenglu Wen, Siqi Shen, Lan Xu, Cheng Wang
2024Holistic-CAM: Ultra-lucid and Sanity Preserving Visual Interpretation in Holistic Stage of CNNs.
Pengxu Chen, Huazhong Liu, Jihong Ding, Jiawen Luo, Peng Tan, Laurence T. Yang
2024Hunting Blemishes: Language-guided High-fidelity Face Retouching Transformer with Limited Paired Data.
Le Jiang, Yan Huang, Lianxin Xie, Wen Xue, Cheng Liu, Si Wu, Hau-San Wong
2024Hybrid Cost Volume for Memory-Efficient Optical Flow.
Yang Zhao, Gangwei Xu, Gang Wu
2024HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression.
Lei Lu, Yanyue Xie, Wei Jiang, Wei Wang, Xue Lin, Yanzhi Wang
2024Hydrodynamics-Informed Neural Network for Simulating Dense Crowd Motion Patterns.
Yanshan Zhou, Pingrui Lai, Jiaqi Yu, Yingjie Xiong, Hua Yang
2024HyperTime: Hyperparameter Optimization for Combating Temporal Distribution Shifts.
Shaokun Zhang, Yiran Wu, Zhonghua Zheng, Qingyun Wu, Chi Wang
2024Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding.
Minghui Wu, Chenxu Zhao, Anyang Su, Donglin Di, Tianyu Fu, Da An, Min He, Ya Gao, Meng Ma, Kun Yan, Ping Wang
2024Hypergraph-guided Intra- and Inter-category Relation Modeling for Fine-grained Visual Recognition.
Lu Chen, Qiangchang Wang, Zhaohui Li, Yilong Yin
2024IBMEA: Exploring Variational Information Bottleneck for Multi-modal Entity Alignment.
Taoyu Su, Jiawei Sheng, Shicheng Wang, Xinghua Zhang, Hongbo Xu, Tingwen Liu
2024IC-Mapper: Instance-Centric Spatio-Temporal Modeling for Online Vectorized Map Construction.
Jiangtong Zhu, Zhao Yang, Yinan Shi, Jianwu Fang, Jianru Xue
2024IF-Garments: Reconstructing Your Intersection-Free Multi-Layered Garments from Monocular Videos.
Mingyang Sun, Qipeng Yan, Zhuoer Liang, Dongliang Kou, Dingkang Yang, Ruisheng Yuan, Xiao Zhao, Mingcheng Li, Lihua Zhang
2024IGSPAD: Inverting 3D Gaussian Splatting for Pose-agnostic Anomaly Detection.
Bolin Jiang, Yuqiu Xie, Jiawei Li, Naiqi Li, Bin Chen, Shu-Tao Xia
2024IconDM: Text-Guided Icon Set Expansion Using Diffusion Models.
Jiawei Lin, Zhaoyun Jiang, Jiaqi Guo, Shizhao Sun, Ting Liu, Zijiang Yang, Jian-Guang Lou, Dongmei Zhang
2024Identity-Driven Multimedia Forgery Detection via Reference Assistance.
Junhao Xu, Jingjing Chen, Xue Song, Feng Han, Haijun Shan, Yu-Gang Jiang
2024Illumination Distribution Prior for Low-light Image Enhancement.
Chao Wang, Yang Zhou, Liangtian He, Fenglai Lin, Hongming Chen, Liang-Jian Deng
2024Image-free Pre-training for Low-Level Vision.
Siyang Wang, Jinghao Zhang, Jie Huang, Feng Zhao
2024ImageBind3D: Image as Binding Step for Controllable 3D Generation.
Zhenqiang Li, Jie Li, Yangjie Cao, Jiayi Wang, Runfeng Lv
2024Imbalanced Multi-instance Multi-label Learning via Coding Ensemble and Adaptive Thresholds.
Xinyue Zhang, Tingjin Luo, Yueying Liu, Chenping Hou
2024Importance-aware Shared Parameter Subspace Learning for Domain Incremental Learning.
Shiye Wang, Changsheng Li, Jialin Tang, Xing Gong, Ye Yuan, Guoren Wang
2024Improved Weighted Tensor Schatten
Yinghui Sun, Xingfeng Li, Quansen Sun, Min-Ling Zhang, Zhenwen Ren
2024Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives.
Zhangchi Feng, Richong Zhang, Zhijie Nie
2024Improving Interaction Comfort in Authoring Task in AR-HRI through Dynamic Dual-Layer Interaction Adjustment.
Yunqiang Pei, Kaiyue Zhang, Hongrong Yang, Yong Tao, Qihang Tang, Jialei Tang, Guoqing Wang, Zhitao Liu, Ning Xie, Peng Wang, Yang Yang, Hengtao Shen
2024Improving Out-of-Distribution Detection with Disentangled Foreground and Background Features.
Choubo Ding, Guansong Pang
2024Improving the Training of the GANs with Limited Data via Dual Adaptive Noise Injection.
Zhaoyu Zhang, Yang Hua, Guanxiong Sun, Hui Wang, Seán F. McLoone
2024In Situ 3D Scene Synthesis for Ubiquitous Embodied Interfaces.
Haiyan Jiang, Leiyu Song, Dongdong Weng, Zhe Sun, Huiying Li, Xiaonuo Dongye, Zhenliang Zhang
2024In-Context Learning for Zero-shot Medical Report Generation.
Rui Liu, Mingjie Li, Shen Zhao, Ling Chen, Xiaojun Chang, Lina Yao
2024InMu-Net: Advancing Multi-modal Intent Detection via Information Bottleneck and Multi-sensory Processing.
Zhihong Zhu, Xuxin Cheng, Zhaorun Chen, Yuyan Chen, Yunyan Zhang, Xian Wu, Yefeng Zheng, Bowen Xing
2024InNeRF: Learning Interpretable Radiance Fields for Generalizable 3D Scene Representation and Rendering.
Dan Wang, Xinrui Cui
2024Incremental Learning via Robust Parameter Posterior Fusion.
Wenju Sun, Qingyong Li, Siyu Zhang, Wen Wang, Yangli-ao Geng
2024Inferring 3D Occupancy Fields through Implicit Reasoning on Silhouette Images.
Baorui Ma, Yu-Shen Liu, Matthias Zwicker, Zhizhong Han
2024Information Diffusion Prediction with Graph Neural Ordinary Differential Equation Network.
Ding Wang, Wei Zhou, Songlin Hu
2024Information Fusion with Knowledge Distillation for Fine-grained Remote Sensing Object Detection.
Sheng Zhang, Xi Yang
2024Informative Point cloud Dataset Extraction for Classification via Gradient-based Points Moving.
Wenxiao Zhang, Ziqi Wang, Li Xu, Xun Yang, Jun Liu
2024Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting.
Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang
2024InsVP: Efficient Instance Visual Prompting from Image Itself.
Zichen Liu, Yuxin Peng, Jiahuan Zhou
2024Instance-Level Panoramic Audio-Visual Saliency Detection and Ranking.
Ruohao Guo, Dantong Niu, Liao Qu, Yanyu Qi, Ji Shi, Wenzhen Yue, Bowei Xing, Taiyan Chen, Xianghua Ying
2024Instance-aware Fine-grained Micro-action Recognition.
Chen Wang, Xun Mei, Feng Zhang
2024InstantAS: Minimum Coverage Sampling for Arbitrary-Size Image Generation.
Changshuo Wang, Mingzhe Yu, Lei Wu, Lei Meng, Xiang Li, Xiangxu Meng
2024Integrating Content-Semantics-World Knowledge to Detect Stress from Videos.
Yang Ding, Yi Dai, Xin Wang, Ling Feng, Lei Cao, Huijun Zhang
2024Integrating Stickers into Multimodal Dialogue Summarization: A Novel Dataset and Approach for Enhancing Social Media Interaction.
Yuanchen Shi, Fang Kong
2024Interactive Segmentation by Considering First-Click Intentional Ambiguity.
Kangpeng Hu, Quansen Sun, Yinghui Sun, Tao Wang
2024Interpretable Matching of Optical-SAR Image via Dynamically Conditioned Diffusion Models.
Shuiping Gou, Xin Wang, Xinlin Wang, Yunzhi Chen
2024Introducing Common Null Space of Gradients for Gradient Projection Methods in Continual Learning.
Chengyi Yang, Mingda Dong, Xiaoyue Zhang, Jiayin Qi, Aimin Zhou
2024Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation.
Chihaya Matsuhira, Marc A. Kastner, Takahiro Komamizu, Takatsugu Hirayama, Ichiro Ide
2024It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment.
Jinkai Zheng, Xinchen Liu, Boyue Zhang, Chenggang Yan, Jiyong Zhang, Wu Liu, Yongdong Zhang
2024JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement.
Yuhui Wu, Guoqing Wang, Zhiwen Wang, Yang Yang, Tianyu Li, Malu Zhang, Chongyi Li, Heng Tao Shen
2024Joint Homophily and Heterophily Relational Knowledge Distillation for Efficient and Compact 3D Object Detection.
Shidi Chen, Lili Wei, Liqian Liang, Congyan Lang
2024Joint-Motion Mutual Learning for Pose Estimation in Video.
Sifan Wu, Haipeng Chen, Yifang Yin, Sihao Hu, Runyang Feng, Yingying Jiao, Ziqi Yang, Zhenguang Liu
2024KEBR: Knowledge Enhanced Self-Supervised Balanced Representation for Multimodal Sentiment Analysis.
Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Fuji Ren
2024KNN Transformer with Pyramid Prompts for Few-Shot Learning.
Wenhao Li, Qiangchang Wang, Peng Zhao, Yilong Yin
2024Knowledge-Aware Artifact Image Synthesis with LLM-Enhanced Prompting and Multi-Source Supervision.
Shengguang Wu, Zhenglun Chen, Qi Su
2024LD-BFR: Vector-Quantization-Based Face Restoration Model with Latent Diffusion Enhancement.
Yuzhen Du, Teng Hu, Ran Yi, Lizhuang Ma
2024LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention.
Zewen Du, Zhenjiang Hu, Guiyu Zhao, Ying Jin, Hongbin Ma
2024LDCNet: Long-Distance Context Modeling for Large-Scale 3D Point Cloud Scene Semantic Segmentation.
Shoutong Luo, Zhengxing Sun, Yi Wang, Yunhan Sun, Chendi Zhu
2024LDStega: Practical and Robust Generative Image Steganography based on Latent Diffusion Models.
Yinyin Peng, Yaofei Wang, Donghui Hu, Kejiang Chen, Xianjin Rong, Weiming Zhang
2024LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound.
Xuechen Guo, Wenhao Chai, Shiyan Li, Gaoang Wang
2024LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description.
Yizhang Jin, Jian Li, Jiangning Zhang, Jianlong Hu, Zhenye Gan, Xin Tan, Yong Liu, Yabiao Wang, Chengjie Wang, Lizhuang Ma
2024LMM-PCQA: Assisting Point Cloud Quality Assessment with LMM.
Zicheng Zhang, Haoning Wu, Yingjie Zhou, Chunyi Li, Wei Sun, Chaofeng Chen, Xiongkuo Min, Xiaohong Liu, Weisi Lin, Guangtao Zhai
2024LOVD: Large-and-Open Vocabulary Object Detection.
Shiyu Tang, Zhaofan Luo, Yifan Wang, Lijun Wang, Huchuan Lu, Weibo Su, Libo Liu
2024Label Decoupling and Reconstruction: A Two-Stage Training Framework for Long-tailed Multi-label Medical Image Recognition.
Jie Huang, Zhao-Min Chen, Xiaoqin Zhang, Yisu Ge, Lusi Ye, Guodao Zhang, Huiling Chen
2024Label Text-aided Hierarchical Semantics Mining for Panoramic Activity Recognition.
Tianshan Liu, Kin-Man Lam, Bing-Kun Bao
2024Label-Efficient Emotion and Sentiment Analysis.
Sicheng Zhao, Guoli Jia, Xiaopeng Hong, Yanyan Zhao, Jianhua Tao
2024LampMark: Proactive Deepfake Detection via Training-Free Landmark Perceptual Watermarks.
Tianyi Wang, Mengxiao Huang, Harry Cheng, Xiao Zhang, Zhiqi Shen
2024LaneCMKT: Boosting Monocular 3D Lane Detection with Cross-Modal Knowledge Transfer.
Runkai Zhao, Heng Wang, Weidong Cai
2024Language-Driven Interactive Shadow Detection.
Hongqiu Wang, Wei Wang, Haipeng Zhou, Huihui Xu, Shaozhi Wu, Lei Zhu
2024Language-Guided Visual Prompt Compensation for Multi-Modal Remote Sensing Image Classification with Modality Absence.
Ling Huang, Wenqian Dong, Song Xiao, Jiahui Qu, Yuanbo Yang, Yunsong Li
2024Laplacian Matrix Learning for Point Cloud Attribute Compression with Ternary Search-Based Adaptive Block Partition.
Changhao Peng, Wei Gao
2024Large Multi-modality Model Assisted AI-Generated Image Quality Assessment.
Puyi Wang, Wei Sun, Zicheng Zhang, Jun Jia, Yanwei Jiang, Zhichao Zhang, Xiongkuo Min, Guangtao Zhai
2024Large Multimodal Models as Social Multimedia Analysis Engines.
Jiebo Luo
2024Large Point-to-Gaussian Model for Image-to-3D Generation.
Longfei Lu, Huachen Gao, Tao Dai, Yaohua Zha, Zhi Hou, Junta Wu, Shu-Tao Xia
2024Latent Representation Reorganization for Face Privacy Protection.
Zhengzhong Kuang, Jianan Lu, Chenhui Hong, Haobin Huang, Suguo Zhu, Xiaowei Zhao, Jun Yu, Jianping Fan
2024Learnable Negative Proposals Using Dual-Signed Cross-Entropy Loss for Weakly Supervised Video Moment Localization.
Sunoh Kim, Daeho Um, Hyunjun Choi, Jin Young Choi
2024Learning A Low-Level Vision Generalist via Visual Task Prompt.
Xiangyu Chen, Yihao Liu, Yuandong Pu, Wenlong Zhang, Jiantao Zhou, Yu Qiao, Chao Dong
2024Learning Backward Compatible Representations.
Niccolò Biondi, Simone Ricci, Federico Pernici, Alberto Del Bimbo
2024Learning Context with Priors for 3D Interacting Hand-Object Pose Estimation.
Zengsheng Kuang, Changxing Ding, Huan Yao
2024Learning Cross-Spectral Prior for Image Super-Resolution.
Chenxi Ma, Weimin Tan, Shili Zhou, Bo Yan
2024Learning Dual Enhanced Representation for Contrastive Multi-view Clustering.
Guoliang Zou, Yangdong Ye, Tongji Chen, Shizhe Hu
2024Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring.
Hu Gao, Bowen Ma, Ying Zhang, Jingfan Yang, Jing Yang, Depeng Dang
2024Learning Exposure Correction in Dynamic Scenes.
Jin Liu, Bo Wang, Chuanming Wang, Huiyuan Fu, Huadong Ma
2024Learning Geometry Consistent Neural Radiance Fields from Sparse and Unposed Views.
Qi Zhang, Chi Huang, Qian Zhang, Nan Li, Wei Feng
2024Learning Optimal Combination Patterns for Lightweight Stereo Image Super-Resolution.
Hu Gao, Jing Yang, Ying Zhang, Jingfan Yang, Bowen Ma, Depeng Dang
2024Learning Realistic Sketching: A Dual-agent Reinforcement Learning Approach.
Ji Qiu, Peng Lu, Xujun Peng, Wenhao Guo, Zhaoran Zhao, Xiangtao Dong
2024Learning Spectral-Decomposited Tokens for Domain Generalized Semantic Segmentation.
Jingjun Yi, Qi Bi, Hao Zheng, Haolan Zhan, Wei Ji, Yawen Huang, Yuexiang Li, Yefeng Zheng
2024Learning Unknowns from Unknowns: Diversified Negative Prototypes Generator for Few-shot Open-Set Recognition.
Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Yuhua Li, Ruixuan Li
2024Learning from Concealed Labels.
Zhongnian Li, Meng Wei, Peng Ying, Tongfeng Sun, Xinzheng Xu
2024Learning from Distinction: Mitigating Backdoors Using a Low-Capacity Model.
Haosen Sun, Yiming Li, Xixiang Lyu, Jing Ma
2024Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis.
Xianbing Zhao, Lizhen Qu, Tao Feng, Jianfei Cai, Buzhou Tang
2024Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor.
Jiali Chen, Xusen Hei, Yuqi Xue, Yuancheng Wei, Jiayuan Xie, Yi Cai, Qing Li
2024Learning to Handle Large Obstructions in Video Frame Interpolation.
Libo Long, Xiao Hu, Jochen Lang
2024Learning to Transfer Heterogeneous Translucent Materials from a 2D Image to 3D Models.
Xiaogang Wang, Yuhang Cheng, Ziyang Fan, Kai Xu
2024Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition.
Yuxiang Yang, Lu Wen, Xinyi Zeng, Yuanyuan Xu, Xi Wu, Jiliu Zhou, Yan Wang
2024LearningPCC: A PyTorch Library for Learning-Based Point Cloud Compression.
Liang Xie, Wei Gao
2024Less is More: Adaptive Feature Selection and Fusion for Eye Contact Detection.
Fuyan Ma, Yiran He, Bin Sun, Shutao Li
2024Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding.
Jongbhin Woo, Hyeonggon Ryu, Youngjoon Jang, Jae-Won Cho, Joon Son Chung
2024Leveraging Knowledge of Modality Experts for Incomplete Multimodal Learning.
Wenxin Xu, Hexin Jiang, Xuefeng Liang
2024Leveraging RGB-Pressure for Whole-body Human-to-Humanoid Motion Imitation.
Yi Lu, Shenghao Ren, Qiu Shen, Xun Cao
2024Leveraging Weak Cross-Modal Guidance for Coherence Modelling via Iterative Learning.
Yi Bin, Junrong Liao, Yujuan Ding, Haoxuan Li, Yang Yang, See-Kiong Ng, Heng Tao Shen
2024LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields.
Tang Tao, Longfei Gao, Guangrun Wang, Yixing Lao, Peng Chen, Hengshuang Zhao, Dayang Hao, Xiaodan Liang, Mathieu Salzmann, Kaicheng Yu
2024Linearly-evolved Transformer for Pan-sharpening.
Junming Hou, Zihan Cao, Naishan Zheng, Xuan Li, Xiaoyu Chen, Xinyang Liu, Xiaofeng Cong, Danfeng Hong, Man Zhou
2024LinkThief: Combining Generalized Structure Knowledge with Node Similarity for Link Stealing Attack against GNN.
Yuxing Zhang, Siyuan Meng, Chunchun Chen, Mengyao Peng, Hongyan Gu, Xinli Huang
2024ListenFormer: Responsive Listening Head Generation with Non-autoregressive Transformers.
Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li
2024Lite-Mind: Towards Efficient and Robust Brain Representation Learning.
Zixuan Gong, Qi Zhang, Guangyin Bao, Lei Zhu, Yu Zhang, Ke Liu, Liang Hu, Duoqian Miao
2024LiteGfm: A Lightweight Self-supervised Monocular Depth Estimation Framework for Artifacts Reduction via Guided Image Filtering.
Zhilin He, Yawei Zhang, Jingchang Mu, Xiaoyue Gu, Tianhao Gu
2024LiteQUIC: Improving QoE of Video Streams by Reducing CPU Overhead of QUIC.
Pengqiang Bi, Yifei Zou, Mengbai Xiao, Dongxiao Yu, Yijun Li, Zhixiong Liu, Qun Xie
2024Live on the Hump: Self Knowledge Distillation via Virtual Teacher-Students Mutual Learning.
Shuang Wang, Pengyi Hao, Fuli Wu, Cong Bai
2024LoFormer: Local Frequency Transformer for Image Deblurring.
Xintian Mao, Jiansheng Wang, Xingran Xie, Qingli Li, Yan Wang
2024LoMOE: Localized Multi-Object Editing via Multi-Diffusion.
Goirik Chakrabarty, Aditya Chandrasekar, Ramya Hebbalaguppe, Prathosh AP
2024Loc4Plan: Locating Before Planning for Outdoor Vision and Language Navigation.
Huilin Tian, Jingke Meng, Wei-Shi Zheng, Yuan-Ming Li, Junkai Yan, Yunong Zhang
2024LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field.
Jiyang Li, Lechao Cheng, Zhangye Wang, Tingting Mu, Jingxuan He
2024Low-rank Prompt Interaction for Continual Vision-Language Retrieval.
Weicai Yan, Ye Wang, Wang Lin, Zirun Guo, Zhou Zhao, Tao Jin
2024Lumos: Optimizing Live 360-degree Video Upstreaming via Spatial-Temporal Integrated Neural Enhancement.
Beizhang Guo, Juntao Bao, Baili Chai, Di Wu, Miao Hu
2024MAC 2024: Micro-Action Analysis Grand Challenge.
Dan Guo, Xiaobai Li, Kun Li, Haoyu Chen, Jingjing Hu, Guoying Zhao, Yi Yang, Meng Wang
2024MAF-ID: Multi-Agent Framework for Interactive Dubbing through Deep Video Understanding.
Zhanbin Hu, Xiaodong He, Renzhou Pan, Xianzhou Zeng, Chenming Fan, Qiang Zhu
2024MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance.
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou
2024MAGIC: Rethinking Dynamic Convolution Design for Medical Image Segmentation.
Shijie Li, Yunbin Tu, Qingyuan Xiang, Zheng Li
2024MAJL: A Model-Agnostic Joint Learning Framework for Music Source Separation and Pitch Estimation.
Haojie Wei, Jun Yuan, Rui Zhang, Quanyu Dai, Yueguo Chen
2024MB2C: Multimodal Bidirectional Cycle Consistency for Learning Robust Visual Neural Representations.
Yayun Wei, Lei Cao, Hao Li, Yilin Dong
2024MDDR: Multi-modal Dual-Attention aggregation for Depression Recognition.
Wei Zhang, En Zhu, Juan Chen, Yunpeng Li
2024MDR: Multi-stage Decoupled Relational Knowledge Distillation with Adaptive Stage Selection.
Jiaqi Wang, Lu Lu, Mingmin Chi, Jian Chen
2024MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation.
Xiaofeng Mao, Zhengkai Jiang, Qilin Wang, Chencan Fu, Jiangning Zhang, Jiafu Wu, Yabiao Wang, Chengjie Wang, Wei Li, Mingmin Chi
2024MEGC2024: ACM Multimedia 2024 Facial Micro-Expression Grand Challenge.
John See, Jingting Li, Adrian K. Davison, Gen-Bing Liong, Moi Hoon Yap, Wen-Huang Cheng, Xiaobai Li, Xiaopeng Hong, Su-Jing Wang
2024MFMS: Learning Modality-Fused and Modality-Specific Features for Deepfake Detection and Localization Tasks.
Yi Zhang, Changtao Miao, Man Luo, Jianshu Li, Wenzhong Deng, Weibin Yao, Zhe Li, Bingyu Hu, Weiwei Feng, Tao Gong, Qi Chu
2024MFRGN: Multi-scale Feature Representation Generalization Network for Ground-to-Aerial Geo-localization.
Yuntao Wang, Jinpu Zhang, Ruonan Wei, Wenbo Gao, Yuehuan Wang
2024MGR-Dark: A Large Multimodal Video Dataset and RGB-IR Benchmark for Gesture Recognition in Darkness.
Yuanyuan Shi, Yunan Li, Siyu Liang, Huizhou Chen, Qiguang Miao
2024MICM: Rethinking Unsupervised Pretraining for Enhanced Few-shot Learning.
Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Zhimeng Huang, Yuhua Li, Ruixuan Li
2024MIRACLE: An Online, Explainable Multimodal Interactive Concept Learning System.
Ansel Blume, Khanh Duy Nguyen, Zhenhailong Wang, Yangyi Chen, Michal Shlapentokh-Rothman, Xiaomeng Jin, Jeonghwan Kim, Zhen Zhu, Jiateng Liu, Kuan-Hao Huang, Mankeerat Sidhu, Xuanming Zhang, Vivian Liu, Raunak Sinha, Te-Lin Wu, Abhay Zala, Elias Stengel-Eskin, Da Yin, Yao Xiao, Utkarsh Mall, Zhou Yu, Kai-Wei Chang, Camille Cobb, Karrie Karahalios, Lydia B. Chilton, Mohit Bansal, Nanyun Peng, Carl Vondrick, Derek Hoiem, Heng Ji
2024MLP Embedded Inverse Tone Mapping.
Panjun Liu, Jiacheng Li, Lizhi Wang, Zheng-Jun Zha, Zhiwei Xiong
2024MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models.
Haoxuan Li, Zhengmao Yang, Yunshan Ma, Yi Bin, Yang Yang, Tat-Seng Chua
2024MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation.
Mingzhen Sun, Weining Wang, Yanyuan Qiao, Jiahui Sun, Zihan Qin, Longteng Guo, Xinxin Zhu, Jing Liu
2024MMAL: Multi-Modal Analytic Learning for Exemplar-Free Audio-Visual Class Incremental Tasks.
Xianghu Yue, Xueyi Zhang, Yiming Chen, Chengwei Zhang, Mingrui Lao, Huiping Zhuang, Xinyuan Qian, Haizhou Li
2024MMDFND: Multi-modal Multi-Domain Fake News Detection.
Yu Tong, Weihai Lu, Zhe Zhao, Song Lai, Tong Shi
2024MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion.
Yanglin Deng, Tianyang Xu, Chunyang Cheng, Xiao-Jun Wu, Josef Kittler
2024MMF: Winning Solution to Social Media Popularity Prediction Challenge 2024.
Yu-Shi Lin, Anthony J. T. Lee
2024MMHead: Towards Fine-grained Multi-modal 3D Facial Animation.
Sijing Wu, Yunhao Li, Yichao Yan, Huiyu Duan, Ziwei Liu, Guangtao Zhai
2024MPT: Multi-grained Prompt Tuning for Text-Video Retrieval.
Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen
2024MSFNet: Multi-Scale Fusion Network for Brain-Controlled Speaker Extraction.
Cunhang Fan, Jingjing Zhang, Hongyu Zhang, Wang Xiang, Jianhua Tao, Xinhui Li, Jiangyan Yi, Dianbo Sui, Zhao Lv
2024MSTA3D: Multi-scale Twin-attention for 3D Instance Segmentation.
Duc Dang Trung Tran, Byeongkeun Kang, Yeejin Lee
2024MTSNet: Joint Feature Adaptation and Enhancement for Text-Guided Multi-view Martian Terrain Segmentation.
Yang Fang, Xuefeng Rao, Xinbo Gao, Weisheng Li, Zijian Min
2024MUSCAT: A Multimodal mUSic Collection for Automatic Transcription of Real Recordings and Image Scores.
Alejandro Galán-Cuenca, Jose J. Valero-Mas, Juan C. Martinez-Sevilla, Antonio Hidalgo-Centeno, Antonio Pertusa, Jorge Calvo-Zaragoza
2024MVP-Net: Multi-View Depth Image Guided Cross-Modal Distillation Network for Point Cloud Upsampling.
Jiade Chen, Jin Wang, Yunhui Shi, Nam Ling, Baocai Yin
2024MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllability and Generalizability.
Buyu Liu, Kai Wang, Yansong Liu, Jun Bao, Tingting Han, Jun Yu
2024Magic Clothing: Controllable Garment-Driven Image Synthesis.
Weifeng Chen, Tao Gu, Yuhao Xu, Arlene Chen
2024MagicCartoon: 3D Pose and Shape Estimation for Bipedal Cartoon Characters.
Yu-Pei Song, Yuantong Liu, Xiao Wu, Qi He, Zhaoquan Yuan, Ao Luo
2024MagicFight: Personalized Martial Arts Combat Video Generation.
Jiancheng Huang, Mingfu Yan, Songyan Chen, Yi Huang, Shifeng Chen
2024MagicVFX: Visual Effects Synthesis in Just Minutes.
Jiaqi Guo, Lianli Gao, Junchen Zhu, Jiaxin Zhang, Siyang Li, Jingkuan Song
2024Make Privacy Renewable! Generating Privacy-Preserving Faces Supporting Cancelable Biometric Recognition.
Tao Wang, Yushu Zhang, Xiangli Xiao, Lin Yuan, Zhihua Xia, Jian Weng
2024Making Large Language Models Perform Better in Knowledge Graph Completion.
Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Wen Zhang, Huajun Chen
2024Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model.
Xu Han, Yuan Tang, Zhaoxuan Wang, Xianzhi Li
2024MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion.
Chencan Fu, Yabiao Wang, Jiangning Zhang, Zhengkai Jiang, Xiaofeng Mao, Jiafu Wu, Weijian Cao, Chengjie Wang, Yanhao Ge, Yong Liu
2024MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model.
Kang Zeng, Hao Shi, Jiacheng Lin, Siyu Li, Jintao Cheng, Kaiwei Wang, Zhiyong Li, Kailun Yang
2024MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model.
Changcheng Xiao, Qiong Cao, Zhigang Luo, Long Lan
2024MappingFormer: Learning Cross-modal Feature Mapping for Visible-to-infrared Image Translation.
Haining Wang, Na Li, Huijie Zhao, Yan Wen, Yi Su, Yuqiang Fang
2024MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation.
Xiao Zhao, Xukun Zhang, Dingkang Yang, Mingyang Sun, Mingcheng Li, Shunli Wang, Lihua Zhang
2024MaskMentor: Unlocking the Potential of Masked Self-Teaching for Missing Modality RGB-D Semantic Segmentation.
Zhida Zhao, Jia Li, Lijun Wang, Yifan Wang, Huchuan Lu
2024Maskable Retentive Network for Video Moment Retrieval.
Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Meng Wang
2024Masked Random Noise for Communication-Efficient Federated Learning.
Shiwei Li, Yingyi Cheng, Haozhao Wang, Xing Tang, Shijie Xu, Weihong Luo, Yuhua Li, Dugang Liu, Xiuqiang He, Ruixuan Li
2024Masked Snake Attention for Fundus Image Restoration with Vessel Preservation.
Xiaohuan Ding, Yangrui Gong, Tianyi Shi, Zihang Huang, Gangwei Xu, Xin Yang
2024MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets.
Zeyu Li, Ruitong Gan, Chuanchen Luo, Yuxi Wang, Jiaheng Liu, Ziwei Zhu, Qing Li, Xucheng Yin, Man Zhang, Zhaoxiang Zhang, Junran Peng
2024Maximizing Feature Distribution Variance for Robust Neural Networks.
Hao Yang, Min Wang, Zhengfei Yu, Zhi Zeng, Mingrui Lao, Yun Zhou
2024Measure and Improve Your Food: Ingredient Estimation Based Nutrition Calculator.
Liangyu Wang, Yoko Yamakata, Ryoma Maeda, Kiyoharu Aizawa
2024Medical Report Generation via Multimodal Spatio-Temporal Fusion.
Xin Mei, Rui Mao, Xiaoyan Cai, Libin Yang, Erik Cambria
2024MegaSurf: Scalable Large Scene Neural Surface Reconstruction.
Yusen Wang, Kaixuan Zhou, Wenxiao Zhang, Chunxia Xiao
2024Mesh Denoising Using Filtering Coefficients Jointly Aware of Noise and Geometry.
Xingtao Wang, Xianqi Zhang, Wenxue Cui, Ruiqin Xiong, Xiaopeng Fan, Debin Zhao
2024Mesh-Centric Gaussian Splatting for Human Avatar Modelling with Real-time Dynamic Mesh Reconstruction.
Ruiqi Zhang, Jie Chen
2024MetaDragonBoat: Exploring Paddling Techniques of Virtual Dragon Boating in a Metaverse Campus.
Wei He, Xiang Li, Shengtian Xu, Yuzheng Chen, Chan-In Sio, Ge Lin Kan, Lik-Hang Lee
2024MetaEnzyme: Meta Pan-Enzyme Learning for Task-Adaptive Redesign.
Jiangbin Zheng, Han Zhang, Qianqing Xu, An-Ping Zeng, Stan Z. Li
2024MetaRepair: Learning to Repair Deep Neural Networks from Repairing Experiences.
Yun Xing, Qing Guo, Xiaofeng Cao, Ivor W. Tsang, Lei Ma
2024MiNet: Weakly-Supervised Camouflaged Object Detection through Mutual Interaction between Region and Edge Cues.
Yuzhen Niu, Lifen Yang, Rui Xu, Yuezhou Li, Yuzhong Chen
2024Micro-Action Recognition via Hierarchical Fusion and Inference.
Fan Gong, Jialiang Chen, Jiajun Zhu, Qijian Bao, Fei Gao, Renshu Gu, Gang Xu
2024Micro-Expression Spotting Based on Optical Flow Feature with Boundary Calibration.
Jun Yu, Yaohui Zhang, Gongpeng Zhao, Peng He, Zerui Zhang, Zhongpeng Cai, Qingsong Liu, Jianqing Sun, Jiaen Liang
2024Miko: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery.
Feihong Lu, Weiqi Wang, Yangyifei Luo, Ziqin Zhu, Qingyun Sun, Baixuan Xu, Haochen Shi, Shiqi Gao, Qian Li, Yangqiu Song, Jianxin Li
2024Minerva: Enhancing Quantum Network Performance for High-Fidelity Multimedia Transmission.
Tingting Li, Ziming Zhao, Jianwei Yin
2024MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors.
Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen
2024Mitigate Catastrophic Remembering via Continual Knowledge Purification for Noisy Lifelong Person Re-Identification.
Kunlun Xu, Haozhuo Zhang, Yu Li, Yuxin Peng, Jiahuan Zhou
2024Mitigating Sample Selection Bias with Robust Domain Adaption in Multimedia Recommendation.
Jiaye Lin, Qing Li, Guorui Xie, Zhongxu Guan, Yong Jiang, Ting Xu, Zhong Zhang, Peilin Zhao
2024Mitigating Social Biases in Text-to-Image Diffusion Models via Linguistic-Aligned Attention Guidance.
Yue Jiang, Yueming Lyu, Ziwen He, Bo Peng, Jing Dong
2024Mitigating Social Hazards: Early Detection of Fake News via Diffusion-Guided Propagation Path Generation.
Litian Zhang, Xiaoming Zhang, Chaozhuo Li, Ziyi Zhou, Jiacheng Liu, Feiran Huang, Xi Zhang
2024Mitigating World Biases: A Multimodal Multi-View Debiasing Framework for Fake News Video Detection.
Zhi Zeng, Minnan Luo, Xiangzheng Kong, Huan Liu, Hao Guo, Hao Yang, Zihan Ma, Xiang Zhao
2024Mixed Prototype Correction for Causal Inference in Medical Image Classification.
Yajie Zhang, Zhi-An Huang, Zhiliang Hong, Songsong Wu, Jibin Wu, Kay Chen Tan
2024MoBA: Mixture of Bi-directional Adapter for Multi-modal Sarcasm Detection.
Yifeng Xie, Zhihong Zhu, Xin Chen, Zhanpeng Chen, Zhiqi Huang
2024MoS
Heng Jia, Yunqiu Xu, Linchao Zhu, Guang Chen, Yufei Wang, Yi Yang
2024MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models.
Xiaomin Li, Xu Jia, Qinghe Wang, Haiwen Diao, Mengmeng Ge, Pengxiang Li, You He, Huchuan Lu
2024Modal-Enhanced Semantic Modeling for Fine-Grained 3D Human Motion Retrieval.
Haoyu Shi, Huaiwen Zhang
2024Modality-Balanced Learning for Multimedia Recommendation.
Jinghao Zhang, Guofan Liu, Qiang Liu, Shu Wu, Liang Wang
2024Model-Based Non-Independent Distortion Cost Design for Effective JPEG Steganography.
Yuanfeng Pan, Wenkang Su, Jiangqun Ni, Qingliang Liu, Yulin Zhang, Donghua Jiang
2024ModelLock: Locking Your Model With a Spell.
Yifeng Gao, Yuhua Sun, Xingjun Ma, Zuxuan Wu, Yu-Gang Jiang
2024Modeling Event-level Causal Representation for Video Classification.
Yuqing Wang, Lei Meng, Haokai Ma, Haibei Huang, Xiangxu Meng
2024Monocular Human-Object Reconstruction in the Wild.
Chaofan Huo, Ye Shi, Jingya Wang
2024Motion-aware Latent Diffusion Models for Video Frame Interpolation.
Zhilin Huang, Yijie Yu, Ling Yang, Chujun Qin, Bing Zheng, Xiawu Zheng, Zikun Zhou, Yaowei Wang, Wenming Yang
2024MovingColor: Seamless Fusion of Fine-grained Video Color Enhancement.
Yi Dong, Yuxi Wang, Zheng Fang, Wenqi Ouyang, Xianhui Lin, Zhiqi Shen, Peiran Ren, Xuansong Xie, Qingming Huang
2024Multi-Granularity Hand Action Detection.
Ting Zhe, Jing Zhang, Yongqian Li, Yong Luo, Han Hu, Dacheng Tao
2024Multi-Instance Multi-Label Learning for Text-motion Retrieval.
Yang Yang, Liyuan Cao, Haoyu Shi, Huaiwen Zhang
2024Multi-Label Learning with Block Diagonal Labels.
Leqi Shen, Sicheng Zhao, Yifeng Zhang, Hui Chen, Jundong Zhou, Pengzhang Liu, Yongjun Bao, Guiguang Ding
2024Multi-Modal Inductive Framework for Text-Video Retrieval.
Qian Li, Yucheng Zhou, Cheng Ji, Feihong Lu, Jianian Gong, Shangguang Wang, Jianxin Li
2024Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition.
Jinfu Liu, Chen Chen, Mengyuan Liu
2024Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection.
Shixuan Gao, Pingping Zhang, Tianyu Yan, Huchuan Lu
2024Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization.
Ruijie Tao, Zhan Shi, Yidi Jiang, Duc-Tuan Truong, Eng Siong Chng, Massimo Alioto, Haizhou Li
2024Multi-View Clustering Based on Deep Non-negative Tensor Factorization.
Wei Feng, Dongyuan Wei, Qianqian Wang, Bo Dong, Quanxue Gao
2024Multi-fineness Boundaries and the Shifted Ensemble-aware Encoding for Point Cloud Semantic Segmentation.
Ziming Wang, Boxiang Zhang, Ming Ma, Yue Wang, Taoli Du, Wenhui Li
2024Multi-grained Correspondence Learning of Audio-language Models for Few-shot Audio Recognition.
Shengwei Zhao, Linhai Xu, Yuying Liu, Shaoyi Du
2024Multi-modal Auto-regressive Modeling via Visual Tokens.
Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du
2024Multi-modal Denoising Diffusion Pre-training for Whole-Slide Image Classification.
Wei Lou, Guanbin Li, Xiang Wan, Haofeng Li
2024Multi-scale Change-Aware Transformer for Remote Sensing Image Change Detection.
Huan Chen, Tingfa Xu, Zhenxiang Chen, Peifu Liu, Huiyan Bai, Jianan Li
2024Multi-view Feature Extraction via Tunable Prompts is Enough for Image Manipulation Localization.
Xuntao Liu, Yuzhou Yang, Haoyue Wang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang, Sheng Li
2024Multi-view Self-Supervised Contrastive Learning for Multivariate Time Series.
Yuhan Wu, Xiyu Meng, Yang He, Junru Zhang, Haowen Zhang, Yabo Dong, Dongming Lu
2024Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans.
Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei Zhang, Ruoxiu Xiao
2024MultiColor: Image Colorization by Learning from Multiple Color Spaces.
Xiangcheng Du, Zhao Zhou, Xingjiao Wu, Yanlong Wang, Zhuoyao Wang, Yingbin Zheng, Cheng Jin
2024MultiDAN: Unsupervised, Multistage, Multisource and Multitarget Domain Adaptation for Semantic Segmentation of Remote Sensing Images.
Yuxiang Cai, Yongheng Shang, Jianwei Yin
2024MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili.
Han Wang, Tan Rui Yang, Usman Naseem, Roy Ka-Wei Lee
2024MultiMediate'24: Multi-Domain Engagement Estimation.
Philipp Müller, Michal Balazia, Tobias Baur, Michael Dietz, Alexander Heimerl, Anna Penzkofer, Dominik Schiller, François Brémond, Jan Alexandersson, Elisabeth André, Andreas Bulling
2024Multimedia Information Retrieval in XR.
Rahel Arnold, Werner Bailer, Ralph Gasser, Björn Þór Jónsson, Omar Shahbaz Khan, Heiko Schuldt, Florian Spiess, Lucia Vadicamo
2024Multimodal Contextual Interactions of Entities: A Modality Circular Fusion Approach for Link Prediction.
Jing Yang, Shundong Yang, Yuan Gao, Jieming Yang, Laurence T. Yang
2024Multimodal Emotion Recognition Calibration in Conversations.
Geng Tu, Feng Xiong, Bin Liang, Hui Wang, Xi Zeng, Ruifeng Xu
2024Multimodal Fusion via Hypergraph Autoencoder and Contrastive Learning for Emotion Recognition in Conversation.
Zijian Yi, Ziming Zhao, Zhishu Shen, Tiehua Zhang
2024Multimodal Inplace Prompt Tuning for Open-set Object Detection.
Guilin Li, Mengdan Zhang, Xiawu Zheng, Peixian Chen, Zihan Wang, Yunhang Shen, Mingchen Zhuge, Chenglin Wu, Fei Chao, Ke Li, Xing Sun, Rongrong Ji
2024Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval.
Yabing Wang, Le Wang, Qiang Zhou, Zhibin Wang, Hao Li, Gang Hua, Wei Tang
2024Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond.
Soyeon Caren Han, Feiqi Cao, Josiah Poon, Roberto Navigli
2024Multimodal Low-light Image Enhancement with Depth Information.
Zhen Wang, Dongyuan Li, Guang Li, Ziqing Zhang, Renhe Jiang
2024Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model.
Fuqiang Niu, Zebang Cheng, Xianghua Fu, Xiaojiang Peng, Genan Dai, Yin Chen, Hu Huang, Bowen Zhang
2024Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition.
Kai Shao, Rui Wang, Yixue Hao, Long Hu, Min Chen, Hans-Arno Jacobsen
2024Multimodal Unlearnable Examples: Protecting Data against Multimodal Contrastive Learning.
Xinwei Liu, Xiaojun Jia, Yuan Xun, Siyuan Liang, Xiaochun Cao
2024Multimodal-aware Multi-intention Learning for Recommendation.
Wei Yang, Qingchen Yang
2024Multiple Kernel Clustering with Shifted Laplacian on Grassmann Manifold.
Xi Wu, Chuang Huang, Xinliu Liu, Fei Zhou, Zhenwen Ren
2024Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm.
Yuning Wu, Jiatong Shi, Yifeng Yu, Yuxun Tang, Tao Qian, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, Qin Jin
2024NFT1000: A Cross-Modal Dataset For Non-Fungible Token Retrieval.
Shuxun Wang, Yunfei Lei, Ziqi Zhang, Wei Liu, Haowei Liu, Li Yang, Bing Li, Wenjuan Li, Jin Gao, Weiming Hu
2024NNVISR: Bring Neural Network Video Interpolation and Super Resolution into Video Processing Framework.
Yuan Tong, Mengshun Hu, Zheng Wang
2024Narrowing the Gap between Vision and Action in Navigation.
Yue Zhang, Parisa Kordjamshidi
2024Natural Language Induced Adversarial Images.
Xiaopei Zhu, Peiyang Xu, Guanning Zeng, Yinpeng Dong, Xiaolin Hu
2024Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments.
Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu
2024Navigating Weight Prediction with Diet Diary.
Yinxuan Gui, Bin Zhu, Jingjing Chen, Chong Wah Ngo, Yu-Gang Jiang
2024Neighbor Does Matter: Curriculum Global Positive-Negative Sampling for Vision-Language Pre-training.
Bin Huang, Feng He, Qi Wang, Hong Chen, Guohao Li, Zhifan Feng, Xin Wang, Wenwu Zhu
2024Neural Boneprint: Person Identification from Bones Using Generative Contrastive Deep Learning.
Chaoqun Niu, Dongdong Chen, Jizhe Zhou, Jian Wang, Xiang Luo, Quan-Hui Liu, Yuan Li, Jiancheng Lv
2024Neural Interaction Energy for Multi-Agent Trajectory Prediction.
Kaixin Shen, Ruijie Quan, Linchao Zhu, Jun Xiao, Yi Yang
2024New Job, New Gender? Measuring the Social Bias in Image Generation Models.
Wenxuan Wang, Haonan Bai, Jen-tse Huang, Yuxuan Wan, Youliang Yuan, Haoyi Qiu, Nanyun Peng, Michael R. Lyu
2024Non-Overlapped Multi-View Weak-Label Learning Guided by Multiple Correlations.
Kaixiang Wang, Xiaojian Ding, Fan Yang
2024Non-uniform Timestep Sampling: Towards Faster Diffusion Model Training.
Tianyi Zheng, Cong Geng, Peng-Tao Jiang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang, Bo Li
2024Not All Frequencies Are Created Equal: Towards a Dynamic Fusion of Frequencies in Time-Series Forecasting.
Xingyu Zhang, Siyu Zhao, Zeen Song, Huijie Guo, Jianqi Zhang, Changwen Zheng, Wenwen Qiang
2024Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval using Language.
Xiang Fang, Wanlong Fang, Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Renfu Li, Zichuan Xu, Lixing Chen, Panpan Zheng, Yu Cheng
2024Not All Pairs are Equal: Hierarchical Learning for Average-Precision-Oriented Video Retrieval.
Yang Liu, Qianqian Xu, Peisong Wen, Siran Dai, Qingming Huang
2024NovaChart: A Large-scale Dataset towards Chart Understanding and Generation of Multimodal Large Language Models.
Linmei Hu, Duokang Wang, Yiming Pan, Jifan Yu, Yingxia Shao, Chong Feng, Liqiang Nie
2024OSNeRF: On-demand Semantic Neural Radiance Fields for Fast and Robust 3D Object Reconstruction.
Rui Xu, Gaolei Li, Changze Li, Zhaohui Yang, Yuchen Liu, Mingzhe Chen
2024ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation.
Stanislav Frolov, Brian B. Moser, Sebastian Palacio, Andreas Dengel
2024Object-Level Pseudo-3D Lifting for Distance-Aware Tracking.
Haoyuan Jin, Xuesong Nie, Yunfeng Yan, Xi Chen, Zhihang Zhu, Donglian Qi
2024Observe before Generate: Emotion-Cause aware Video Caption for Multimodal Emotion Cause Generation in Conversations.
Fanfan Wang, Heqing Ma, Xiangqing Shen, Jianfei Yu, Rui Xia
2024OmniStitch: Depth-Aware Stitching Framework for Omnidirectional Vision with Multiple Cameras.
Sooho Kim, Soyeon Hong, KyungSoo Park, Hyunsouk Cho, Kyung-Ah Sohn
2024On-the-fly Point Feature Representation for Point Clouds Analysis.
Jiangyi Wang, Zhongyao Cheng, Na Zhao, Jun Cheng, XuLei Yang
2024Once-for-all: Efficient Visual Face Privacy Protection via Person-specific Veils.
Zixuan Yang, Yushu Zhang, Tao Wang, Zhongyun Hua, Zhihua Xia, Jian Weng
2024One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity.
Naibo Wang, Yuchen Deng, Wenjie Feng, Shichen Fan, Jianwei Yin, See-Kiong Ng
2024One-Stage Fair Multi-View Spectral Clustering.
Rongwen Li, Haiyang Hu, Liang Du, Jiarong Chen, Bingbing Jiang, Peng Zhou
2024One-bit Deep Hashing: Towards Resource-Efficient Hashing Model with Binary Neural Network.
Liyang He, Zhenya Huang, Chenglong Liu, Rui Li, Runze Wu, Qi Liu, Enhong Chen
2024One-shot In-context Part Segmentation.
Zhenqi Dai, Ting Liu, Xingxing Zhang, Yunchao Wei, Yanning Zhang
2024One-shot-but-not-degraded Federated Learning.
Hui Zeng, Minrui Xu, Tongqing Zhou, Xinyi Wu, Jiawen Kang, Zhiping Cai, Dusit Niyato
2024OneChart: Purify the Chart Structural Extraction via One Auxiliary Token.
Jinyue Chen, Lingyu Kong, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang
2024Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting.
Yuanyuan Liu, Yuxuan Huang, Shuyang Liu, Yibing Zhan, Zijing Chen, Zhe Chen
2024Open-Sourcing VR2Gather: A Collaborative Social VR System for Adaptive Multi-Party Real Time Communication.
Jack Jansen, Thomas Röggla, Silvia Rossi, Irene Viola, Pablo César
2024Open-Vocabulary Audio-Visual Semantic Segmentation.
Ruohao Guo, Liao Qu, Dantong Niu, Yanyu Qi, Wenzhen Yue, Ji Shi, Bowei Xing, Xianghua Ying
2024Open-Vocabulary Video Scene Graph Generation via Union-aware Semantic Alignment.
Ziyue Wu, Junyu Gao, Changsheng Xu
2024OpenAVE: Moving towards Open Set Audio-Visual Event Localization.
Jiale Yu, Baopeng Zhang, Zhu Teng, Jianping Fan
2024OpenDIC: An Open-Source Library and Performance Evaluation for Deep-learning-based Image Compression.
Wei Gao, Huiming Zheng, Chenhao Zhang, Kaiyu Zheng, Zhuozhen Yu, Yuan Li, Hua Ye, Yongchi Zhang
2024OpenLEAF: A Novel Benchmark for Open-Domain Interleaved Image-Text Generation.
Jie An, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Kevin Lin, Zicheng Liu, Lijuan Wang, Jiebo Luo
2024OpenSEP: An Open Source Subjective Experiment Platform.
Hang Yuan, Wei Gao, Wenxu Gao
2024Optical Flow-Guided 6DoF Object Pose Tracking with an Event Camera.
Zibin Liu, Banglei Guan, Yang Shang, Shunkun Liang, Zhenbao Yu, Qifeng Yu
2024Optimizing AIGC Image Detection: Strategies in Data Augmentation and Model Architecture.
Huihui Fu
2024Optimizing the Baseline Approach for the 2024 ACM Multimedia Grand Challenge in Artificial Intelligence Generated Image Detection.
Jin Chen
2024Overcoming Spatial-Temporal Catastrophic Forgetting for Federated Class-Incremental Learning.
Hao Yu, Xin Yang, Xin Gao, Yihui Feng, Hao Wang, Yan Kang, Tianrui Li
2024Overcoming the Pitfalls of Vision-Language Model for Image-Text Retrieval.
Feifei Zhang, Sijia Qu, Fan Shi, Changsheng Xu
2024P-BiC: Ultra-High-Definition Image Moiré Patterns Removal via Patch Bilateral Compensation.
Zeyu Xiao, Zhihe Lu, Xinchao Wang
2024P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task.
Weiye Xu, Min Wang, Wengang Zhou, Houqiang Li
2024PAIR: Pre-denosing Augmented Image Retrieval Model for Defending Adversarial Patches.
Ziyang Zhou, Pinghui Wang, Zi Liang, Ruofei Zhang, Haitao Bai
2024PASSION: Towards Effective Incomplete Multi-Modal Medical Image Segmentation with Imbalanced Missing Rates.
Junjie Shi, Caozhi Shang, Zhaobin Sun, Li Yu, Xin Yang, Zengqiang Yan
2024PC
Yue Duan, Zhangxuan Gu, Zhenzhe Ying, Lei Qi, Changhua Meng, Yinghuan Shi
2024PCHMVision: An Open-Source Library of Point Cloud Compression for Human and Machine Vision.
Liang Xie, Wei Gao
2024PD-Refiner: An Underlying Surface Inheritance Refiner with Adaptive Edge-Aware Supervision for Point Cloud Denoising.
Chengwei Zhang, Xueyi Zhang, Xianghu Yue, Mingrui Lao, Tao Jiang, Jiawei Wang, Fubo Zhang, Longyong Chen
2024PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution.
Zuoyan Zhao, Hui Xue, Pengfei Fang, Shipeng Zhu
2024PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction.
Zening Lin, Jiapeng Wang, Teng Li, Wenhui Liao, Dayi Huang, Longfei Xiong, Lianwen Jin
2024PFFAA: Prototype-based Feature and Frequency Alteration Attack for Semantic Segmentation.
Zhidong Yu, Zhenbo Shi, Xiaoman Liu, Wei Yang
2024PIMT: Physics-Based Interactive Motion Transition for Hybrid Character Animation.
Yanbin Deng, Zheng Li, Ning Xie, Wei Zhang
2024PIP: Detecting Adversarial Examples in Large Vision-Language Models via Attention Patterns of Irrelevant Probe Questions.
Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingwu Sun, Yu Wang
2024PRISM: PRogressive dependency maxImization for Scale-invariant image Matching.
Xudong Cai, Yongcai Wang, Lun Luo, Minhang Wang, Deying Li, Jintao Xu, Weihao Gu, Rui Ai
2024PROMOTE: Prior-Guided Diffusion Model with Global-Local Contrastive Learning for Exemplar-Based Image Translation.
Guojin Zhong, Yihu Guo, Jin Yuan, Qianjun Zhang, Weili Guan, Long Chen
2024PRTGS: Precomputed Radiance Transfer of Gaussian Splats for Real-Time High-Quality Relighting.
Yijia Guo, Yuanxi Bai, Liwen Hu, Ziyi Guo, Mianzhi Liu, Yu Cai, Tiejun Huang, Lei Ma
2024PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection.
Yingjie Gao, Yanan Zhang, Ziyue Huang, Nanqing Liu, Di Huang
2024PSM: Learning Probabilistic Embeddings for Multi-scale Zero-Shot Soundscape Mapping.
Subash Khanal, Eric Xing, Srikumar Sastry, Aayush Dhakal, Zhexiao Xiong, Adeel Ahmad, Nathan Jacobs
2024PSSD-Transformer: Powerful Sparse Spike-Driven Transformer for Image Semantic Segmentation.
Hongzhi Wang, Xiubo Liang, Tao Zhang, Yue Gu, Weidong Geng
2024PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models.
Zining Wang, Jinyang Guo, Ruihao Gong, Yang Yong, Aishan Liu, Yushi Huang, Jiaheng Liu, Xianglong Liu
2024PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis.
Meng Luo, Hao Fei, Bobo Li, Shengqiong Wu, Qian Liu, Soujanya Poria, Erik Cambria, Mong-Li Lee, Wynne Hsu
2024Parameter-Efficient Complementary Expert Learning for Long-Tailed Visual Recognition.
Lixiang Ru, Xin Guo, Lei Yu, Yingying Zhang, Jiangwei Lao, Jian Wang, Jingdong Chen, Yansheng Li, Ming Yang
2024Parameter-efficient is not Sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions.
Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jing Bai
2024Part-level Reconstruction for Self-Supervised Category-level 6D Object Pose Estimation with Coarse-to-Fine Correspondence Optimization.
Zerui Zhang, Jun Yu, Liangxian Cui, Qiang Ling, Tianyu Liu
2024Partial Multi-label Learning Based On Near-Far Neighborhood Label Enhancement And Nonlinear Guidance.
Yu Chen, Yanan Wu, Na Han, Xiaozhao Fang, Bingzhi Chen, Jie Wen
2024Partially Aligned Cross-modal Retrieval via Optimal Transport-based Prototype Alignment Learning.
Junsheng Wang, Tiantian Gong, Yan Yan
2024PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction.
Hao Wu, Fan Xu, Chong Chen, Xian-Sheng Hua, Xiao Luo, Haixin Wang
2024PathUp: Patch-wise Timestep Tracking for Multi-class Large Pathology Image Synthesising Diffusion Model.
Jingxiong Li, Sunyi Zheng, Chenglu Zhu, Yuxuan Sun, Pingyi Chen, Zhongyi Shui, Yunlong Zhang, Honglin Li, Lin Yang
2024Peeling Back the Layers: Interpreting the Storytelling of ViT.
Jingjie Zeng, Zhihao Yang, Qi Yang, Liang Yang, Hongfei Lin
2024PerFRDiff: Personalised Weight Editing for Multiple Appropriate Facial Reaction Generation.
Hengde Zhu, Xiangyu Kong, Weicheng Xie, Xin Huang, Linlin Shen, Lu Liu, Hatice Gunes, Siyang Song
2024Perceive before Respond: Improving Sticker Response Selection by Emotion Distillation and Hard Mining.
Wuyou Xia, Shengzhe Liu, Rong Qin, Guoli Jia, Eunil Park, Jufeng Yang
2024PercepLIE: A New Path to Perceptual Low-Light Image Enhancement.
Cong Wang, Chengjin Yu, Jie Mu, Wei Wang
2024Perceptual Visual Similarity from EEG: Prediction and Image Generation.
Carlos de la Torre-Ortiz, Tuukka Ruotsalo
2024Perceptual-Distortion Balanced Image Super-Resolution is a Multi-Objective Optimization Problem.
Qiwen Zhu, Yanjie Wang, Shilv Cai, Liqun Chen, Jiahuan Zhou, Luxin Yan, Sheng Zhong, Xu Zou
2024PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation.
Yunze Liu, Changxi Chen, Chenjing Ding, Li Yi
2024Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization.
Henglei Lv, Jiayu Xiao, Liang Li
2024PixelFade: Privacy-preserving Person Re-identification with Noise-guided Progressive Replacement.
Delong Zhang, Yi-Xing Peng, Xiao-Ming Wu, Ancong Wu, Weishi Zheng
2024PlacidDreamer: Advancing Harmony in Text-to-3D Generation.
Shuo Huang, Shikun Sun, Zixuan Wang, Xiaoyu Qin, Yanmin Xiong, Yuan Zhang, Pengfei Wan, Di Zhang, Jia Jia
2024Point Cloud Compression, Enhancement and Applications: From 3D Perception to Large Models.
Wei Gao, Ge Li
2024Point Cloud Densification for 3D Gaussian Splatting from Sparse Input Views.
Kin-Chung Chan, Jun Xiao, Hana Lebeta Goshu, Kin-Man Lam
2024Point Cloud Reconstruction Is Insufficient to Learn 3D Representations.
Weichen Xu, Jian Cao, Tianhao Fu, Ruilong Ren, Zicong Hu, Xixin Cao, Xing Zhang
2024Point Cloud Upsampling with Geometric Algebra Driven Inverse Heat Dissipation.
Wenqiang Xu, Wenrui Dai, Ziyang Zheng, Chenglin Li, Junni Zou, Hongkai Xiong
2024Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast.
Guofan Fan, Zekun Qi, Wenkai Shi, Kaisheng Ma
2024Poisoning for Debiasing: Fair Recognition via Eliminating Bias Uncovered in Data Poisoning.
Yi Zhang, Zhefeng Wang, Rui Hu, Xinyu Duan, Yi Zheng, Baoxing Huai, Jiarun Han, Jitao Sang
2024Portrait Shadow Removal via Self-Exemplar Illumination Equalization.
Qian Huang, Cheng Xu, Guiqing Li, Ziheng Wu, Shengxin Liu, Shengfeng He
2024Practical Deep Learning Models for QIM-based VoIP Steganalysis.
Cheng Zhang
2024Predicting the Unseen: A Novel Dataset for Hidden Intention Localization in Pre-abnormal Analysis.
Zehao Qi, Ruixu Zhang, Xinyi Hu, Wenxuan Liu, Zheng Wang
2024PriFU: Capturing Task-Relevant Information Without Adversarial Learning.
Xiuli Bi, Yang Hu, Bo Liu, Weisheng Li, Pamela C. Cosman, Bin Xiao
2024PrimKD: Primary Modality Guided Multimodal Fusion for RGB-D Semantic Segmentation.
Zhiwei Hao, Zhongyu Xiao, Yong Luo, Jianyuan Guo, Jing Wang, Li Shen, Han Hu
2024PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering.
Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
2024Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval.
Yiyang Jiang, Wengyu Zhang, Xulu Zhang, Xiaoyong Wei, Chang Wen Chen, Qing Li
2024Prior Metadata-Driven RAW Reconstruction: Eliminating the Need for Per-Image Metadata.
Wencheng Han, Chen Zhang, Yang Zhou, Wentao Liu, Chen Qian, Chengzhong Xu, Jianbing Shen
2024Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning.
Lei Liu, Li Liu, Yawen Cui
2024Private Gradient Estimation is Useful for Generative Modeling.
Bochao Liu, Pengju Wang, Weijia Guo, Yong Li, Liansheng Zhuang, Weiping Wang, Shiming Ge
2024ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification.
Can Cui, Siteng Huang, Wenxuan Song, Pengxiang Ding, Min Zhang, Donglin Wang
2024Probabilistic Distillation Transformer: Modelling Uncertainties for Visual Abductive Reasoning.
Wanru Xu, Zhenjiang Miao, Yi Tian, Yigang Cen, Lili Wan, Xiaole Ma
2024Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization.
Geuntaek Lim, Hyunwoo Kim, Joonsoo Kim, Yukyung Choi
2024Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024 - 1 November 2024
Jianfei Cai, Mohan S. Kankanhalli, Balakrishnan Prabhakaran, Susanne Boll, Ramanathan Subramanian, Liang Zheng, Vivek K. Singh, Pablo César, Lexing Xie, Dong Xu
2024Product2IMG: Prompt-Free E-commerce Product Background Generation with Diffusion Model and Self-Improved LMM.
Tingfeng Cao, Junsheng Kong, Xue Zhao, Wenqing Yao, Junwei Ding, Jinhui Zhu, Jiandong Zhang
2024Progressive Local and Non-Local Interactive Networks with Deeply Discriminative Training for Image Deraining.
Cong Wang, Liyan Wang, Jie Mu, Chengjin Yu, Wei Wang
2024Progressive Point Cloud Denoising with Cross-Stage Cross-Coder Adaptive Edge Graph Convolution Network.
Wu Chen, Hehe Fan, Qiuping Jiang, Chao Huang, Yi Yang
2024Progressive Prototype Evolving for Dual-Forgetting Mitigation in Non-Exemplar Online Continual Learning.
Qiwei Li, Yuxin Peng, Jiahuan Zhou
2024Prompt-Guided Image-Adaptive Neural Implicit Lookup Tables for Interpretable Image Enhancement.
Satoshi Kosugi
2024Prompt2Poster: Automatically Artistic Chinese Poster Creation from Prompt Only.
Shaodong Wang, Yunyang Ge, Liuhan Chen, Haiyang Zhou, Qian Wang, Xinhua Cheng, Li Yuan
2024Prompting Continual Person Search.
Pengcheng Zhang, Xiaohan Yu, Xiao Bai, Jin Zheng, Xin Ning
2024Prompting to Adapt Foundational Segmentation Models.
Jie Hu, Jie Li, Yue Ma, Liujuan Cao, Songan Zhang, Wei Zhang, Guannan Jiang, Rongrong Ji
2024Prototype-Guided Dual-Transformer Reasoning for Video Individual Counting.
Rui Li, Yishu Liu, Huafeng Li, Jinxing Li, Guangming Lu
2024Prototypical Prompting for Text-to-image Person Re-identification.
Shuanglin Yan, Jun Liu, Neng Dong, Liyan Zhang, Jinhui Tang
2024Purified Distillation: Bridging Domain Shift and Category Gap in Incremental Object Detection.
Shilong Jia, Tingting Wu, Yingying Fang, Tieyong Zeng, Guixu Zhang, Zhi Li
2024Q-Ground: Image Quality Grounding with Large Multi-modality Models.
Chaofeng Chen, Sensen Yang, Haoning Wu, Liang Liao, Zicheng Zhang, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin
2024Q-MoE: Connector for MLLMs with Text-Driven Routing.
Hanzi Wang, Jiamin Ren, Yifeng Ding, Lei Ren, Huixing Jiang, Wei Chen, Fangxiang Feng, Xiaojie Wang
2024Q-SNNs: Quantized Spiking Neural Networks.
Wenjie Wei, Yu Liang, Ammar Belatreche, Yichen Xiao, Honglin Cao, Zhenbang Ren, Guoqing Wang, Malu Zhang, Yang Yang
2024QE-BEV: Query Evolution for Bird's Eye View Object Detection in Varied Contexts.
Jiawei Yao, Yingxin Lai, Hongrui Kou, Tong Wu, Ruixi Liu
2024QNCD: Quantization Noise Correction for Diffusion Models.
Huanpeng Chu, Wei Wu, Chengjie Zang, Kun Yuan
2024QPT-V2: Masked Image Modeling Advances Visual Scoring.
Qizhi Xie, Kun Yuan, Yunpeng Qu, Mingda Wu, Ming Sun, Chao Zhou, Jihong Zhu
2024QS-NeRV: Real-Time Quality-Scalable Decoding with Neural Representation for Videos.
Chang Wu, Guancheng Quan, Gang He, Xin-Quan Lai, Yunsong Li, Wenxin Yu, Xianmeng Lin, Cheng Yang
2024QVD: Post-training Quantization for Video Diffusion Models.
Shilong Tian, Hong Chen, Chengtao Lv, Yu Liu, Jinyang Guo, Xianglong Liu, Shengxi Li, Hao Yang, Tao Xie
2024Query Augmentation with Brain Signals.
Ziyi Ye, Jingtao Zhan, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Christina Lioma, Tuukka Ruotsalo
2024QueryMatch: A Query-based Contrastive Learning Framework for Weakly Supervised Visual Grounding.
Shengxin Chen, Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Rongrong Ji
2024R
Green Rosh K. S, B. H. Pawan Prasad, Lokesh R. Boregowda, Kaushik Mitra
2024R4D-planes: Remapping Planes For Novel View Synthesis and Self-Supervised Decoupling of Monocular Videos.
Junyuan Guo, Hao Tang, Teng Wang, Chao Wang
2024RAG-Guided Large Language Models for Visual Spatial Description with Adaptive Hallucination Corrector.
Jun Yu, Yunxiang Zhang, Zerui Zhang, Zhao Yang, Gongpeng Zhao, Fengzhao Sun, Fanrui Zhang, Qingsong Liu, Jianqing Sun, Jiaen Liang, Yaohui Zhang
2024RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues.
Tianrui Pan, Jie Liu, Bohan Wang, Jie Tang, Gangshan Wu
2024RCA: Region Conditioned Adaptation for Visual Abductive Reasoning.
Hao Zhang, Ee Yeo Keat, Basura Fernando
2024RDLNet: A Novel and Accurate Real-world Document Localization Method.
Yaqiang Wu, Zhen Xu, Yong Duan, Yanlai Wu, Qinghua Zheng, Hui Li, Xiaochen Hu, Lianwen Jin
2024REmoNet: Reducing Emotional Label Noise via Multi-regularized Self-supervision.
Wei-Bang Jiang, Yu-Ting Lan, Bao-Liang Lu
2024RFFNet: Towards Robust and Flexible Fusion for Low-Light Image Denoising.
Qiang Wang, Yuning Cui, Yawen Li, Yaping Ruan, Ben Zhu, Wenqi Ren
2024RHKH: Relational Hypergraph Neural Network for Link Prediction on N-ary Knowledge Hypergraph.
Yuzhuo Wang, Junwei He, Hongzhi Wang
2024ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision.
Liang Xie, Wei Gao, Huiming Zheng, Ge Li
2024RSC-SNN: Exploring the Trade-off Between Adversarial Robustness and Accuracy in Spiking Neural Networks via Randomized Smoothing Coding.
Keming Wu, Man Yao, Yuhong Chou, Xuerui Qiu, Rui Yang, Bo Xu, Guoqi Li
2024RSNN: Recurrent Spiking Neural Networks for Dynamic Spatial-Temporal Information Processing.
Qi Xu, Xuanye Fang, Yaxin Li, Jiangrong Shen, De Ma, Yi Xu, Gang Pan
2024RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining.
Hongtao Wu, Yijun Yang, Huihui Xu, Weiming Wang, Jinni Zhou, Lei Zhu
2024Rainmer: Learning Multi-view Representations for Comprehensive Image Deraining and Beyond.
Wu Ran, Peirong Ma, Zhiquan He, Hong Lu
2024RainyScape: Unsupervised Rainy Scene Reconstruction using Decoupled Neural Rendering.
Xianqiang Lyu, Hui Liu, Junhui Hou
2024Rate-aware Compression for NeRF-based Volumetric Video.
Zhiyu Zhang, Guo Lu, Huanxiong Liang, Zhengxue Cheng, Anni Tang, Li Song
2024RayFormer: Improving Query-Based Multi-Camera 3D Object Detection via Ray-Centric Strategies.
Xiaomeng Chu, Jiajun Deng, Guoliang You, Yifan Duan, Yao Li, Yanyong Zhang
2024ReCoS: A Novel Benchmark for Cross-Modal Image-Text Retrieval in Complex Real-Life Scenarios.
Xiaojun Chen, Jimeng Lou, Wenxi Huang, Ting Wan, Qin Zhang, Min Yang
2024ReCorD: Reasoning and Correcting Diffusion for HOI Generation.
Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Lo, Yi-Ning Huang, Terence Lin, Jhih-Ciang Wu, Hong-Han Shuai, Wen-Huang Cheng
2024ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks.
Zejun Li, Ye Wang, Mengfei Du, Qingwen Liu, Binhao Wu, Jiwen Zhang, Chengxing Zhou, Zhihao Fan, Jie Fu, Jingjing Chen, Zhongyu Wei, Xuanjing Huang
2024ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack.
Ziyi Gao, Kai Chen, Zhipeng Wei, Tingshu Mou, Jingjing Chen, Zhiyu Tan, Hao Li, Yu-Gang Jiang
2024ReWiTe: Realistic Wide-angle and Telephoto Dual Camera Fusion Dataset via Beam Splitter Camera Rig.
Chunli Peng, Xuan Dong, Tiantian Cao, Zhengqing Li, Kun Dong, Weixin Li
2024Real-time Parameter Evaluation of High-speed Microfluidic Droplets using Continuous Spike Streams.
Bo Xiong, Changqing Su, Zihan Lin, Yanqin Chen, You Zhou, Zhen Cheng, Zhaofei Yu, Tiejun Huang
2024Realistic Full-Body Motion Generation from Sparse Tracking with State Space Model.
Kun Dong, Jian Xue, Zehai Niu, Xing Lan, Ke Lu, Qingyuan Liu, Xiaoyu Qin
2024Reason-and-Execute Prompting: Enhancing Multi-Modal Large Language Models for Solving Geometry Questions.
Xiuliang Duan, Dating Tan, Liangda Fang, Yuyu Zhou, Chaobo He, Ziliang Chen, Lusheng Wu, Guanliang Chen, Zhiguo Gong, Weiqi Luo, Quanlong Guan
2024Reconstructing, Understanding, and Analyzing Relief Type Cultural Heritage from a Single Old Photo.
Jiao Pan, Liang Li, Hiroshi Yamaguchi, Kyoko Hasegawa, Fadjar Ibnu Thufail, Brahmantara, Xiaojuan Ban, Satoshi Tanaka
2024RefMask3D: Language-Guided Transformer for 3D Referring Segmentation.
Shuting He, Henghui Ding
2024RefScale: Multi-temporal Assisted Image Rescaling in Repetitive Observation Scenarios.
Zhen Zhang, Jing Xiao, Liang Liao, Mi Wang
2024Reference-based Burst Super-resolution.
Seonggwan Ko, Yeong Jun Koh, Donghyeon Cho
2024Regional Attention For Shadow Removal.
Hengxing Liu, Mingjia Li, Xiaojie Guo
2024Regularized Contrastive Partial Multi-view Outlier Detection.
Yijia Wang, Qianqian Xu, Yangbangyan Jiang, Siran Dai, Qingming Huang
2024RelScene: A Benchmark and baseline for Spatial Relations in text-driven 3D Scene Generation.
Zhaoda Ye, Xinhan Zheng, Yang Liu, Yuxin Peng
2024Relational Diffusion Distillation for Efficient Image Generation.
Weilun Feng, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Yongjun Xu
2024Reliable Attribute-missing Multi-view Clustering with Instance-level and feature-level Cooperative Imputation.
Dayu Hu, Suyuan Liu, Jun Wang, Junpu Zhang, Siwei Wang, Xingchen Hu, Xinzhong Zhu, Chang Tang, Xinwang Liu
2024Reliable Model Watermarking: Defending against Theft without Compromising on Evasion.
Hongyu Zhu, Sichu Liang, Wentao Hu, Fangqi Li, Ju Jia, Shi-Lin Wang
2024Remembering is Not Applying: Interpretable Knowledge Tracing for Problem-solving Processes.
Tao Huang, Xinjia Ou, Huali Yang, Shengze Hu, Jing Geng, Junjie Hu, Zhuoran Xu
2024Report-Concept Textual-Prompt Learning for Enhancing X-ray Diagnosis.
Xiongjun Zhao, Zhengyu Liu, Fen Liu, Guanting Li, Yutao Dou, Shaoliang Peng
2024Reproducibility Companion Paper: Aesthetics-Driven Virtual Time-Lapse Photography Generation.
Xin Jin, Longteng Jiang, Yihao Zhang, Lihua Lu, Xiaobo Gao, Boyan Dong
2024Reproducing the Past: A Dataset for Benchmarking Inscription Restoration.
Shipeng Zhu, Hui Xue, Na Nie, Chenjie Zhu, Haiyue Liu, Pengfei Fang
2024ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding.
Minghang Zheng, Jiahua Zhang, Qingchao Chen, Yuxin Peng, Yang Liu
2024ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images.
Weiqi Li, Shijie Zhao, Bin Chen, Xinhua Cheng, Junlin Li, Li Zhang, Jian Zhang
2024Resisting Over-Smoothing in Graph Neural Networks via Dual-Dimensional Decoupling.
Wei Shen, Mang Ye, Wenke Huang
2024Restoring Real-World Degraded Events Improves Deblurring Quality.
Yeqing Shen, Shang Li, Kun Song
2024Rethinking Image Editing Detection in the Era of Generative AI Revolution.
Zhihao Sun, Haipeng Fang, Juan Cao, Xinying Zhao, Danding Wang
2024Rethinking Impersonation and Dodging Attacks on Face Recognition Systems.
Fengfan Zhou, Qianyu Zhou, Bangjie Yin, Hui Zheng, Xuequan Lu, Lizhuang Ma, Hefei Ling
2024Rethinking the Architecture Design for Efficient Generic Event Boundary Detection.
Ziwei Zheng, Zechuan Zhang, Yulin Wang, Shiji Song, Gao Huang, Le Yang
2024Rethinking the Effect of Uninformative Class Name in Prompt Learning.
Fengmao Lv, Changru Nie, Jianyang Zhang, Guowu Yang, Guosheng Lin, Xiao Wu, Tianrui Li
2024Rethinking the Implicit Optimization Paradigm with Dual Alignments for Referring Remote Sensing Image Segmentation.
Yuwen Pan, Rui Sun, Yuan Wang, Tianzhu Zhang, Yongdong Zhang
2024Rethinking the One-shot Object Detection: Cross-Domain Object Search.
Yupeng Zhang, Shuqi Zheng, Ruize Han, Yuzhong Feng, Junhui Hou, Linqi Song, Wei Feng, Liang Wan
2024Reverse2Complete: Unpaired Multimodal Point Cloud Completion via Guided Diffusion.
Wenxiao Zhang, Hossein Rahmani, Xun Yang, Jun Liu
2024Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval.
Yang Du, Yuqi Liu, Qin Jin
2024Reversing Structural Pattern Learning with Biologically Inspired Knowledge Distillation for Spiking Neural Networks.
Qi Xu, Yaxin Li, Xuanye Fang, Jiangrong Shen, Qiang Zhang, Gang Pan
2024Revisiting Knowledge Tracing: A Simple and Powerful Model.
Xiaoxuan Shen, Fenghua Yu, Yaqi Liu, Ruxia Liang, Qian Wan, Kai Yang, Jianwen Sun
2024Revisiting Unsupervised Temporal Action Localization: The Primacy of High-Quality Actionness and Pseudolabels.
Han Jiang, Haoyu Tang, Ming Yan, Ji Zhang, Mingzhu Xu, Yupeng Hu, Jihua Zhu, Liqiang Nie
2024Revisiting Vision-Language Features Adaptation and Inconsistency for Social Media Popularity Prediction.
Chih-Chung Hsu, Chia-Ming Lee, Yu-Fan Lin, Yi-Shiuan Chou, Chih-Yu Jian, Chi-Han Tsai
2024Revolutionizing Lung Cancer Diagnostics with eyonis
Benoit Huet
2024RoCo: Robust Cooperative Perception By Iterative Object Matching and Pose Adjustment.
Zhe Huang, Shuo Wang, Yongcai Wang, Wanting Li, Deying Li, Lei Wang
2024RoSe: Rotation-Invariant Sequence-Aware Consensus for Robust Correspondence Pruning.
Yizhang Liu, Weiwei Zhou, Yanping Li, Shengjie Zhao
2024Robust Contrastive Cross-modal Hashing with Noisy Labels.
Longan Wang, Yang Qin, Yuan Sun, Dezhong Peng, Xi Peng, Peng Hu
2024Robust Live Streaming over LEO Satellite Constellations: Measurement, Analysis, and Handover-Aware Adaptation.
Hao Fang, Haoyuan Zhao, Jianxin Shi, Miao Zhang, Guanzhen Wu, Yi Ching Chou, Feng Wang, Jiangchuan Liu
2024Robust Multimodal Sentiment Analysis of Image-Text Pairs by Distribution-Based Feature Recovery and Fusion.
Daiqing Wu, Dongbao Yang, Yu Zhou, Can Ma
2024Robust Prototype Completion for Incomplete Multi-view Clustering.
Honglin Yuan, Shiyun Lai, Xingfeng Li, Jian Dai, Yuan Sun, Zhenwen Ren
2024Robust Pseudo-label Learning with Neighbor Relation for Unsupervised Visible-Infrared Person Re-Identification.
Xiangbo Yin, Jiangming Shi, Yachao Zhang, Yang Lu, Zhizhong Zhang, Yuan Xie, Yanyun Qu
2024Robust Variational Contrastive Learning for Partially View-unaligned Clustering.
Changhao He, Hongyuan Zhu, Peng Hu, Xi Peng
2024RobustFace: Adaptive Mining of Noise and Hard Samples for Robust Face Recognitions.
Yang Xin, Yu Zhou, Jianmin Jiang
2024Room2XR: Virtual Interactive Collaboration in Real-world Scenes.
Hung-Jui Guo, Hiranya Garbha Kumar, Minhas Kamal, Balakrishnan Prabhakaran
2024S
Chen Hui, Haiqi Zhu, Shuya Yan, Shaohui Liu, Feng Jiang, Debin Zhao
2024S2TD-Face: Reconstruct a Detailed 3D Face with Controllable Texture from a Single Sketch.
Zidu Wang, Xiangyu Zhu, Jiang Yu, Tianshuo Zhang, Zhen Lei
2024SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification.
Heng Fang, Sheng Huang, Wenhao Tang, Luwen Huangfu, Bo Liu
2024SAR-SLAM: Self-Attentive Rendering-based SLAM with Neural Point Cloud Encoding.
Xudong Lv, Zhiwei He, Yuxiang Yang, Jiahao Nie, Jing Zhang
2024SAT3D: Image-driven Semantic Attribute Transfer in 3D.
Zhijun Zhai, Zengmao Wang, Xiaoxiao Long, Kaixuan Zhou, Bo Du
2024SATO: Stable Text-to-Motion Framework.
Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen
2024SATPose: Improving Monocular 3D Pose Estimation with Spatial-aware Ground Tactility.
Lishuang Zhan, Enting Ying, Jiabao Gan, Shihui Guo, Boyu Gao, Yipeng Qin
2024SCPSN: Spectral Clustering-based Pyramid Super-resolution Network for Hyperspectral Images.
Yong Yang, Aoqi Zhao, Shuying Huang, Xiaozheng Wang, Yajing Fan
2024SCREEN: A Benchmark for Situated Conversational Recommendation.
Dongding Lin, Jian Wang, Chak Tou Leong, Wenjie Li
2024SDePR: Fine-Grained Leaf Image Retrieval with Structural Deep Patch Representation.
Xin Chen, Bin Wang, Jinzheng Jiang, Kunkun Zhang, Yongsheng Gao
2024SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval.
Longtao Jiang, Min Wang, Zecheng Li, Yao Fang, Wengang Zhou, Houqiang Li
2024SFP: Spurious Feature-Targeted Pruning for Out-of-Distribution Generalization.
Yingchun Wang, Jingcai Guo, Song Guo, Yi Liu, Jie Zhang, Weizhan Zhang
2024SI-BiViT: Binarizing Vision Transformers with Spatial Interaction.
Peng Yin, Xiaosu Zhu, Jingkuan Song, Lianli Gao, Heng Tao Shen
2024SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection.
Zishuo Wang, Wenhao Zhou, Jinglin Xu, Yuxin Peng
2024SIRLUT: Simulated Infrared Fusion Guided Image-adaptive 3D Lookup Tables for Lightweight Image Enhancement.
Kaijiang Li, Hao Li, Haining Li, Peisen Wang, Chunyi Guo, Wenfeng Jiang
2024SM
Yihao Liu, Feng Xue, Anlong Ming, Mingshuai Zhao, Huadong Ma, Nicu Sebe
2024SMART: Self-Weighted Multimodal Fusion for Diagnostics of Neurodegenerative Disorders.
Qiuhui Chen, Yi Hong
2024SMP Challenge Summary: Social Media Prediction Challenge.
Bo Wu, Peiye Liu, Qiushi Huang, Zhaoyang Zeng, Jia Wang, Bei Liu, Jiebo Luo, Wen-Huang Cheng
2024SOAP: Enhancing Spatio-Temporal Relation and Motion Information Capturing for Few-Shot Action Recognition.
Wenbo Huang, Jinghui Zhang, Xuwei Qian, Zhen Wu, Meng Wang, Lei Zhang
2024SOIL: Contrastive Second-Order Interest Learning for Multimodal Recommendation.
Hongzu Su, Jingjing Li, Fengling Li, Ke Lu, Lei Zhu
2024SSAT-Adapter: Enhancing Vision-Language Model Few-shot Learning with Auxiliary Tasks.
Bowen Chen, Yun Sing Koh, Gillian Dobbie
2024SSL: A Self-similarity Loss for Improving Generative Image Super-resolution.
Du Chen, Zhengqiang Zhang, Jie Liang, Lei Zhang
2024STAR-VP: Improving Long-term Viewport Prediction in 360° Videos via Space-aligned and Time-varying Fusion.
Baoqi Gao, Daoxu Sheng, Lei Zhang, Qi Qi, Bo He, Zirui Zhuang, Jingyu Wang
2024Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking.
Zhiyuan Ma, Guoli Jia, Biqing Qi, Bowen Zhou
2024SafePaint: Anti-forensic Image Inpainting with Domain Adaptation.
Dunyun Chen, Xin Liao, Xiaoshuai Wu, Shiwei Chen
2024Saliency-Guided Fine-Grained Temporal Mask Learning for Few-Shot Action Recognition.
Shuo Zheng, Yuanjie Dang, Peng Chen, Ruohong Huan, Dongdong Zhao, Ronghua Liang
2024Sample Efficiency Matters: Training Multimodal Conversational Recommendation Systems in a Small Data Setting.
Haoyang Su, Wenzhe Du, Xiaoliang Wang, Cam-Tu Nguyen
2024Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models.
Haonan Zheng, Wen Jiang, Xinyang Deng, Wenrui Li
2024Sampling to Distill: Knowledge Transfer from Open-World Data.
Yuzheng Wang, Zhaoyu Chen, Jie Zhang, Dingkang Yang, Zuhao Ge, Yang Liu, Siao Liu, Yunquan Sun, Wenqiang Zhang, Lizhe Qi
2024Scalable Multi-Source Pre-training for Graph Neural Networks.
Mingkai Lin, Wenzhong Li, Xiaobin Hong, Sanglu Lu
2024Scalable Multi-view Unsupervised Feature Selection with Structure Learning and Fusion.
Chenglong Zhang, Xinyan Liang, Peng Zhou, Zhaolong Ling, Yingwei Zhang, Xingyu Wu, Weiguo Sheng, Bingbing Jiang
2024Scalable Super-Resolution Neural Operator.
Lei Han, Xuesong Zhang
2024ScaleTraversal: Creating Multi-Scale Biomedical Animation with Limited Hardware Resources.
Richen Liu, Hansheng Wang, Hailong Wang, Siru Chen, Chufan Lai, Ayush Kumar, Siming Chen
2024ScanTD: 360° Scanpath Prediction based on Time-Series Diffusion.
Yujia Wang, Fang-Lue Zhang, Neil A. Dodgson
2024Scene Diffusion: Text-driven Scene Image Synthesis Conditioning on a Single 3D Model.
Xuan Han, Yihao Zhao, Mingyu You
2024Scene Graph Driven Hybrid Interactive VR Teleconferencing.
Mingyuan Wu, Ruifan Ji, Haozhen Zheng, Jiaxi Li, Beitong Tian, Bo Chen, Ruixiao Zhang, Jacob Chakareski, Michael Zink, Ramesh K. Sitaraman, Klara Nahrstedt
2024SceneExpander: Real-Time Scene Synthesis for Interactive Floor Plan Editing.
Shao-Kui Zhang, Junkai Huang, Liang Yue, Jia-Tong Zhang, Jia-Hong Liu, Yu-Kun Lai, Song-Hai Zhang
2024ScenePhotographer: Object-Oriented Photography for Residential Scenes.
Shao-Kui Zhang, Hanxi Zhu, Xuebin Chen, Jinghuan Chen, Zhike Peng, Ziyang Chen, Yong-Liang Yang, Song-Hai Zhang
2024Score-Based Image-to-Image Brownian Bridge.
Peiyong Wang, Bohan Xiao, Qisheng He, Carri Glide-Hurst, Ming Dong
2024See or Guess: Counterfactually Regularized Image Captioning.
Qian Cao, Xu Chen, Ruihua Song, Xiting Wang, Xinting Huang, Yuchen Ren
2024Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer.
Jiaming Lei, Lin Li, Chunping Wang, Jun Xiao, Long Chen
2024Seeing Beyond Words: Multimodal Aspect-Level Complaint Detection in Ecommerce Videos.
Rishikesh Devanathan, Apoorva Singh, A. S. Poornash, Sriparna Saha
2024Seeing Text in the Dark: Algorithm and Benchmark.
Chengpei Xu, Hao Fu, Long Ma, Wenjing Jia, Chengqi Zhang, Feng Xia, Xiaoyu Ai, Binghao Li, Wenjie Zhang
2024SegTalker: Segmentation-based Talking Face Generation with Mask-guided Local Editing.
Lingyu Xiong, Xize Cheng, Jintao Tan, Xianjia Wu, Xiandong Li, Lei Zhu, Fei Ma, Minglei Li, Huang Xu, Zhihui Hu
2024Segment Anything with Precise Interaction.
Mengzhen Liu, Mengyu Wang, Henghui Ding, Yilong Xu, Yao Zhao, Yunchao Wei
2024SelM: Selective Mechanism based Audio-Visual Segmentation.
Jiaxu Li, Songsong Yu, Yifan Wang, Lijun Wang, Huchuan Lu
2024Selection and Reconstruction of Key Locals: A Novel Specific Domain Image-Text Retrieval Method.
Yu Liao, Xinfeng Zhang, Rui Yang, Jianwei Tao, Bai Liu, Zhipeng Hu, Shuang Wang, Zeng Zhao
2024Selective Vision-Language Subspace Projection for Few-shot CLIP.
Xingyu Zhu, Beier Zhu, Yi Tan, Shuo Wang, Yanbin Hao, Hanwang Zhang
2024Self-Adaptive Fine-grained Multi-modal Data Augmentation for Semi-supervised Muti-modal Coreference Resolution.
Li Zheng, Boyu Chen, Hao Fei, Fei Li, Shengqiong Wu, Lizi Liao, Donghong Ji, Chong Teng
2024Self-Supervised Emotion Representation Disentanglement for Speech-Preserving Facial Expression Manipulation.
Zhihua Xu, Tianshui Chen, Zhijing Yang, Chunmei Qing, Yukai Shi, Liang Lin
2024Self-Supervised Visual Preference Alignment.
Ke Zhu, Liang Zhao, Zheng Ge, Xiangyu Zhang
2024Self-derived Knowledge Graph Contrastive Learning for Recommendation.
Lei Shi, Jiapeng Yang, Pengtao Lv, Lu Yuan, Feifei Kou, Jia Luo, Mingying Xu
2024SemGIR: Semantic-Guided Image Regeneration Based Method for AI-generated Image Detection and Attribution.
Xiao Yu, Kejiang Chen, Kai Zeng, Han Fang, Zijin Yang, Xiuwei Shang, Yuang Qi, Weiming Zhang, Nenghai Yu
2024SemNFT: A Semantically Enhanced Decentralized Middleware for Digital Asset Immortality.
Lehao Lin, Hong Kang, Xinyao Sun, Wei Cai
2024Semantic Alignment for Multimodal Large Language Models.
Tao Wu, Mengze Li, Jingyuan Chen, Wei Ji, Wang Lin, Jinyang Gao, Kun Kuang, Zhou Zhao, Fei Wu
2024Semantic Aware Just Noticeable Differences for VVC Compressed Text Screen Content Images.
Kaifang Yang, Xinrong Zhao, Yanchao Gong
2024Semantic Codebook Learning for Dynamic Recommendation Models.
Zheqi Lv, Shaoxuan He, Tianyu Zhan, Shengyu Zhang, Wenqiao Zhang, Jingyuan Chen, Zhou Zhao, Fei Wu
2024Semantic Distillation from Neighborhood for Composed Image Retrieval.
Yifan Wang, Wuliang Huang, Lei Li, Chun Yuan
2024Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval.
Zhenyu Yang, Shengsheng Qian, Dizhan Xue, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu
2024Semantic-Aware and Quality-Aware Interaction Network for Blind Video Quality Assessment.
Jianjun Xiang, Yuanjie Dang, Peng Chen, Ronghua Liang, Ruohong Huan, Nan Gao
2024Semantic-aware Next-Best-View for Multi-DoFs Mobile System in Search-and-Acquisition based Visual Perception.
Xiaotong Yu, Changwen Chen
2024Semantic-aware Representation Learning for Homography Estimation.
Yuhan Liu, Qianxin Huang, Siqi Hui, Jingwen Fu, Sanping Zhou, Kangyi Wu, Pengna Li, Jinjun Wang
2024Semantics-Aware Image Aesthetics Assessment using Tag Matching and Contrastive Ranking.
Zhichao Yang, Leida Li, Pengfei Chen, Jinjian Wu, Weisheng Dong
2024Semi-supervised Camouflaged Object Detection from Noisy Data.
Yuanbin Fu, Jie Ying, Houlei Lv, Xiaojie Guo
2024Semi-supervised Visible-Infrared Person Re-identification via Modality Unification and Confidence Guidance.
Xiying Zheng, Yukang Zhang, Yang Lu, Hanzi Wang
2024Sentiment-oriented Sarcasm Integration for Video Sentiment Analysis Enhancement with Sarcasm Assistance.
Junlin Fang, Wenya Wang, Guosheng Lin, Fengmao Lv
2024Serial Section Microscopy Image Inpainting Guided by Axial Optical Flow.
Yiran Cheng, Bintao He, Fa Zhang, Renmin Han
2024Shape-Guided Clothing Warping for Virtual Try-On.
Xiaoyu Han, Shunyuan Zheng, Zonglin Li, Chenyang Wang, Xin Sun, Quanling Meng
2024Shapley Value-based Contrastive Alignment for Multimodal Information Extraction.
Wen Luo, Yu Xia, Tianshu Shen, Sujian Li
2024ShiftMorph: A Fast and Robust Convolutional Neural Network for 3D Deformable Medical Image Registration.
Lijian Yang, Weisheng Li, Yucheng Shu, Jian-Xun Mi, Yuping Huang, Bin Xiao
2024Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition.
Muxin Pu, Mei Kuan Lim, Chun Yong Chong
2024SimCEN: Simple Contrast-enhanced Network for CTR Prediction.
Honghao Li, Lei Sang, Yi Zhang, Yiwen Zhang
2024SimCLIP: Refining Image-Text Alignment with Simple Prompts for Zero-/Few-shot Anomaly Detection.
Chenghao Deng, Haote Xu, Xiaolu Chen, Haodi Xu, Xiaotong Tu, Xinghao Ding, Yue Huang
2024Similarity Preserving Transformer Cross-Modal Hashing for Video-Text Retrieval.
Qianxin Huang, Siyao Peng, Xiaobo Shen, Yunhao Yuan, Shirui Pan
2024Simple Yet Effective: Structure Guided Pre-trained Transformer for Multi-modal Knowledge Graph Reasoning.
Ke Liang, Lingyuan Meng, Yue Liu, Meng Liu, Wei Wei, Suyuan Liu, Wenxuan Tu, Siwei Wang, Sihang Zhou, Xinwang Liu
2024SimpliGuard: Robust Mesh Simplification In the Wild.
Peibin Chen, Xijin Zhang, Daniel Kang Du
2024Simplifying Cross-modal Interaction via Modality-Shared Features for RGBT Tracking.
Liqiu Chen, Yuqing Huang, Hengyu Li, Zikun Zhou, Zhenyu He
2024Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation.
Wangguandong Zheng, Haifeng Xia, Rui Chen, Libo Sun, Ming Shao, Siyu Xia, Zhengming Ding
2024SkipVSR: Adaptive Patch Routing for Video Super-Resolution with Inter-Frame Mask.
Zekun Ai, Xiaotong Luo, Yanyun Qu, Yuan Xie
2024SleepMG: Multimodal Generalizable Sleep Staging with Inter-modal Balance of Classification and Domain Discrimination.
Shuo Ma, Yingwei Zhang, Qiqi Zhang, Yiqiang Chen, Haoran Wang, Ziyu Jia
2024Sniffing Threatening Open-World Objects in Autonomous Driving by Open-Vocabulary Models.
Yulin He, Siqi Wang, Wei Chen, Tianci Xun, Yusong Tan
2024Sophia-in-Audition: Virtual Production with a Robot Performer.
Taotao Zhou, Teng Xu, Dong Zhang, Yuyang Jiao, Peijun Xu, Yaoyu He, Lan Xu, Jingyi Yu
2024Sparse Query Dense: Enhancing 3D Object Detection with Pseudo Points.
Yujian Mo, Yan Wu, Junqiao Zhao, Zhenjie Hou, Weiquan Huang, Yinghao Hu, Jijun Wang, Jun Yan
2024SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer.
Wenxi Li, Yuchen Guo, Jilai Zheng, Haozhe Lin, Chao Ma, Lu Fang, Xiaokang Yang
2024SparseInteraction: Sparse Semantic Guidance for Radar and Camera 3D Object Detection.
Shaoqing Xu, Shengyin Jiang, Fang Li, Li Liu, Ziying Song, Bo Yang, Zhixin Yang
2024Spatial-Temporal Context Model for Remote Sensing Imagery Compression.
Jinxiao Zhang, Runmin Dong, Juepeng Zheng, Mengxuan Chen, Lixian Zhang, Yi Zhao, Haohuan Fu
2024Spatio-temporal Heterogeneous Federated Learning for Time Series Classification with Multi-view Orthogonal Training.
Chenrui Wu, Haishuai Wang, Xiang Zhang, Zhen Fang, Jiajun Bu
2024Spatiotemporal Fine-grained Video Description for Short Videos.
Te Yang, Jian Jia, Bo Wang, Yanhua Cheng, Yan Li, Dongze Hao, Xipeng Cao, Quan Chen, Han Li, Peng Jiang, Xiangyu Zhu, Zhen Lei
2024Spatiotemporal Graph Guided Multi-modal Network for Livestreaming Product Retrieval.
Xiaowan Hu, Yiyi Chen, Yan Li, Minquan Wang, Haoqian Wang, Quan Chen, Han Li, Peng Jiang
2024SpecGaussian with Latent Features: A High-quality Modeling of the View-dependent Appearance for 3D Gaussian Splatting.
Zhiru Wang, Shiyun Xie, Chengwei Pan, Guoping Wang
2024Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation.
Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling
2024SpeechCraft: A Fine-Grained Expressive Speech Dataset with Natural Language Description.
Zeyu Jin, Jia Jia, Qixin Wang, Kehan Li, Shuoyi Zhou, Songtao Zhou, Xiaoyu Qin, Zhiyong Wu
2024SpeechEE: A Novel Benchmark for Speech Event Extraction.
Bin Wang, Meishan Zhang, Hao Fei, Yu Zhao, Bobo Li, Shengqiong Wu, Wei Ji, Min Zhang
2024SpikeGS: 3D Gaussian Splatting from Spike Streams with High-Speed Camera Motion.
Jiyuan Zhang, Kang Chen, Shiyan Chen, Yajing Zheng, Tiejun Huang, Zhaofei Yu
2024StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework.
Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng
2024StarStream: Live Video Analytics over Space Networking.
Miao Zhang, Jiaxing Li, Haoyuan Zhao, Linfeng Shen, Jiangchuan Liu
2024Stay Focused is All You Need for Adversarial Robustness.
Bingzhi Chen, Ruihan Liu, Yishu Liu, Xiaozhao Fang, Jiahui Pan, Guangming Lu, Zheng Zhang
2024StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model.
Ziyin Zhou, Ke Sun, Zhongxi Chen, Huafeng Kuang, Xiaoshuai Sun, Rongrong Ji
2024Stochastic Context Consistency Reasoning for Domain Adaptive Object Detection.
Yiming Cui, Liang Li, Jiehua Zhang, Chenggang Yan, Hongkui Wang, Shuai Wang, Heng Jin, Li Wu
2024Streamable Portrait Video Editing with Probabilistic Pixel Correspondence.
Xiaodi Li
2024Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation.
Chaomin Shen, Yaomin Huang, Haokun Zhu, Jinsong Fan, Guixu Zhang
2024Style-conditional Prompt Token Learning for Generalizable Face Anti-spoofing.
Jiabao Guo, Huan Liu, Yizhi Luo, Xueli Hu, Hang Zou, Yuan Zhang, Hui Liu, Bo Zhao
2024StylizedFacePoint: Facial Landmark Detection for Stylized Characters.
Shengran Cheng, Chuhang Ma, Ye Pan
2024Subjective and Objective Quality-of-Experience Assessment for 3D Talking Heads.
Yingjie Zhou, Zicheng Zhang, Wei Sun, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai
2024Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment.
Tengchuan Kou, Xiaohong Liu, Zicheng Zhang, Chunyi Li, Haoning Wu, Xiongkuo Min, Guangtao Zhai, Ning Liu
2024Superpixel-based Efficient Sampling for Learning Neural Fields from Large Input.
Zhongwei Xuan, Zunjie Zhu, Shuai Wang, Haibing Yin, Hongkui Wang, Ming Lu
2024Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution.
Junxiong Lin, Zen Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, Shaoqi Yan, Shuyong Gao, Wenqiang Zhang
2024Sustainable Self-evolution Adversarial Training.
Wenxuan Wang, Chenglei Wang, Huihui Qi, Menghao Ye, Xuelin Qian, Peng Wang, Yanning Zhang
2024Swarical: An Integrated Hierarchical Approach to Localizing Flying Light Specks.
Hamed Alimohammadzadeh, Shahram Ghandeharizadeh
2024SymAttack: Symmetry-aware Imperceptible Adversarial Attacks on 3D Point Clouds.
Keke Tang, Zhensu Wang, Weilong Peng, Lujie Huang, Le Wang, Peican Zhu, Wenping Wang, Zhihong Tian
2024SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning.
Xiaoda Yang, Xize Cheng, Dongjie Fu, Minghui Fang, Jialong Zuo, Shengpeng Ji, Zhou Zhao, Tao Jin
2024Synergetic Prototype Learning Network for Unbiased Scene Graph Generation.
Ruonan Zhang, Ziwei Shang, Fengjuan Wang, Zhaoqilin Yang, Shan Cao, Yigang Cen, Gaoyun An
2024SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses.
Chaolei Tan, Zihang Lin, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, Jian-Fang Hu
2024T2I-Scorer: Quantitative Evaluation on Text-to-Image Generation via Fine-Tuned Large Multi-Modal Models.
Haoning Wu, Xiele Wu, Chunyi Li, Zicheng Zhang, Chaofeng Chen, Xiaohong Liu, Guangtao Zhai, Weisi Lin
2024T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval.
Yili Li, Jing Yu, Keke Gai, Bang Liu, Gang Xiong, Qi Wu
2024TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization.
Kien T. Pham, Jingye Chen, Qifeng Chen
2024TAS: Personalized Text-guided Audio Spatialization.
Zhaojian Li, Bin Zhao, Yuan Yuan
2024TAVGBench: Benchmarking Text to Audible-Video Generation.
Yuxin Mao, Xuyang Shen, Jing Zhang, Zhen Qin, Jinxing Zhou, Mochu Xiang, Yiran Zhong, Yuchao Dai
2024TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection.
Shengyang Sun, Jiashen Hua, Junyi Feng, Dongxu Wei, Baisheng Lai, Xiaojin Gong
2024TGCA-PVT: Topic-Guided Context-Aware Pyramid Vision Transformer for Sticker Emotion Recognition.
Jian Chen, Wei Wang, Yuzhu Hu, Junxin Chen, Han Liu, Xiping Hu
2024TS-ILM: Class Incremental Learning for Online Action Detection.
Xiaochen Li, Jian Cheng, Ziying Xia, Zichong Chen, Junhao Shi, Zhicheng Dong, Nyima Tashi
2024TUT4CRS: Time-aware User-preference Tracking for Conversational Recommendation System.
Dongxiao He, Jinghan Zhang, Xiaobao Wang, Meng Ge, Zhiyong Feng, Longbiao Wang, Xiaoke Ma
2024TVPR: Text-to-Video Person Retrieval and a New Benchmark.
Xu Zhang, Fan Ni, Guannan Dong, Aichun Zhu, Jianhui Wu, Mingcheng Ni, Hui Liu
2024Tag Tree-Guided Multi-grained Alignment for Multi-Domain Short Video Recommendation.
Yuting Zhang, Zhao Zhang, Yiqing Wu, Ying Sun, Fuzhen Zhuang, Wenhui Yu, Lantao Hu, Han Li, Kun Gai, Zhulin An, Yongjun Xu
2024TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning.
Jinglun Li, Xinyu Zhou, Kaixun Jiang, Lingyi Hong, Pinxue Guo, Zhaoyu Chen, Weifeng Ge, Wenqiang Zhang
2024Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization.
Navonil Majumder, Chia-Yu Hung, Deepanway Ghosal, Wei-Ning Hsu, Rada Mihalcea, Soujanya Poria
2024Tangram-Splatting: Optimizing 3D Gaussian Splatting Through Tangram-inspired Shape Priors.
Yi Wang, Ningze Zhong, Minglin Chen, Longguang Wang, Yulan Guo
2024Task-Adapter: Task-specific Adaptation of Image Models for Few-shot Action Recognition.
Congqi Cao, Yueran Zhang, Yating Yu, Qinyi Lv, Lingtong Min, Yanning Zhang
2024Task-Conditional Adapter for Multi-Task Dense Prediction.
Fengze Jiang, Shuling Wang, Xiaojin Gong
2024Task-Interaction-Free Multi-Task Learning with Efficient Hierarchical Feature Representation.
Shalayiding Sirejiding, Bayram Bayramli, Yuxiang Lu, Yuwen Yang, Tamam Alsarhan, Hongtao Lu, Yue Ding
2024Task-Oriented Multi-Bitstream Optimization for Image Compression and Transmission via Optimal Transport.
Sa Yan, Nuowen Kan, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong
2024TeRF: Text-driven and Region-aware Flexible Visible and Infrared Image Fusion.
Hebaixu Wang, Hao Zhang, Xunpeng Yi, Xinyu Xiang, Leyuan Fang, Jiayi Ma
2024Temporal Enhancement for Video Affective Content Analysis.
Xin Li, Shangfei Wang, Xuandong Huang
2024Temporal-Informative Adapters in VideoMAE V2 and Multi-Scale Feature Fusion for Micro-Expression Spotting-then-Recognize.
Jun Yu, Gongpeng Zhao, Yaohui Zhang, Peng He, Zerui Zhang, Zhao Yang, Qingsong Liu, Jianqing Sun, Jiaen Liang
2024Test-Time Training on Graphs with Large Language Models (LLMs).
Jiaxin Zhang, Yiqi Wang, Xihong Yang, Siwei Wang, Yu Feng, Yu Shi, Ruichao Ren, En Zhu, Xinwang Liu
2024Text-Region Matching for Multi-Label Image Recognition with Missing Labels.
Leilei Ma, Hongxing Xie, Lei Wang, Yanping Fu, Dengdi Sun, Haifeng Zhao
2024Text-prompt Camouflaged Instance Segmentation with Graduated Camouflage Learning.
Zhentao He, Changqun Xia, Shengye Qiao, Jia Li
2024TextGaze: Gaze-Controllable Face Generation with Natural Language.
Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang
2024The ACM Multimedia 2024 Visual Spatial Description Grand Challenge.
Yu Zhao, Hao Fei, Bobo Li, Meishan Zhang, Min Zhang
2024The Room: Design and Embodiment of Spaces as Social Beings.
Federico Espositi, Andrea Bonarini
2024Thinking Temporal Automatic White Balance: Datasets, Models and Benchmarks.
Chunxiao Li, Shuyang Wang, Xuejing Kang, Anlong Ming
2024TiVA: Time-Aligned Video-to-Audio Generation.
Xihua Wang, Yuyue Wang, Yihan Wu, Ruihua Song, Xu Tan, Zehua Chen, Hongteng Xu, Guodong Sui
2024Time-Frequency Domain Fusion Enhancement for Audio Super-Resolution.
Ye Tian, Zhe Wang, Jianguo Sun, Liguo Zhang
2024TimeNeRF: Building Generalizable Neural Radiance Fields across Time from Few-Shot Input Views.
Hsiang-Hui Hung, Huu-Phu Do, Yung-Hui Li, Ching-Chun Huang
2024Timeline and Boundary Guided Diffusion Network for Video Shadow Detection.
Haipeng Zhou, Hongqiu Wang, Tian Ye, Zhaohu Xing, Jun Ma, Ping Li, Qiong Wang, Lei Zhu
2024Toward Explainable Physical Audiovisual Commonsense Reasoning.
Daoming Zong, Chaoyue Ding, Kaitao Chen
2024Toward Timeliness-Enhanced Loss Recovery for Large-Scale Live Streaming.
Bo Wu, Tong Li, Cheng Luo, Xu Yan, Fuyu Wang, Xinle Du, Ke Xu
2024Towards Artist-Like Painting Agents with Multi-Granularity Semantic Alignment.
Zhangli Hu, Ye Chen, Zhongyin Zhao, Jinfan Liu, Bilian Ke, Bingbing Ni
2024Towards Distortion-Debiased Blind Image Quality Assessment.
Lize Zhou, Xiaoqi Wang, Jian Xiong, Xianzhong Long, Hao Gao
2024Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation.
Muquan Li, Dongyang Zhang, Tao He, Xiurui Xie, Yuan-Fang Li, Ke Qin
2024Towards Effective Federated Graph Anomaly Detection via Self-boosted Knowledge Distillation.
Jinyu Cai, Yunhe Zhang, Zhoumin Lu, Wenzhong Guo, See-Kiong Ng
2024Towards Efficient and Diverse Generative Model for Unconditional Human Motion Synthesis.
Hua Yu, Weiming Liu, Jiapeng Bai, Xu Gui, Yaqing Hou, Yew-Soon Ong, Qiang Zhang
2024Towards Emotion-enriched Text-to-Motion Generation via LLM-guided Limb-level Emotion Manipulating.
Tan Yu, Jingjing Wang, Jiawen Wang, Jiamin Luo, Guodong Zhou
2024Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning.
Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, Joemon M. Jose
2024Towards Engagement Prediction: A Cross-Modality Dual-Pipeline Approach using Visual and Audio Features.
Deepak Kumar, Surbhi Madan, Pradeep Singh, Abhinav Dhall, Balasubramanian Raman
2024Towards Flexible Evaluation for Generative Visual Question Answering.
Huishan Ji, Qingyi Si, Zheng Lin, Weiping Wang
2024Towards High-performance Spiking Transformers from ANN to SNN Conversion.
Zihan Huang, Xinyu Shi, Zecheng Hao, Tong Bu, Jianhao Ding, Zhaofei Yu, Tiejun Huang
2024Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning.
Hongze Zhu, Guoyang Xie, Chengbin Hou, Tao Dai, Can Gao, Jinbao Wang, Linlin Shen
2024Towards Labeling-free Fine-grained Animal Pose Estimation.
Dan Zeng, Yu Zhu, Shuiwang Li, Qijun Zhao, Qiaomu Shen, Bo Tang
2024Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks.
Xian Zhong, Shengwang Hu, Wenxuan Liu, Wenxin Huang, Jianhao Ding, Zhaofei Yu, Tiejun Huang
2024Towards Medical Vision-Language Contrastive Pre-training via Study-Oriented Semantic Exploration.
Bo Liu, Zexin Lu, Yan Wang
2024Towards Multi-view Consistent Graph Diffusion.
Jielong Lu, Zhihao Wu, Zhaoliang Chen, Zhiling Cai, Shiping Wang
2024Towards Multimodal-augmented Pre-trained Language Models via Self-balanced Expectation-Maximization Iteration.
Xianwei Zhuang, Xuxin Cheng, Zhihong Zhu, Zhanpeng Chen, Hongxiang Li, Yuexian Zou
2024Towards Open-vocabulary HOI Detection with Calibrated Vision-language Models and Locality-aware Queries.
Zhenhao Yang, Xin Liu, Deqiang Ouyang, Guiduo Duan, Dongyang Zhang, Tao He, Yuan-Fang Li
2024Towards Photorealistic Video Colorization via Gated Color-Guided Image Diffusion Models.
Jiaxing Li, Hongbo Zhao, Yijun Wang, Jianxin Lin
2024Towards Practical Human Motion Prediction with LiDAR Point Clouds.
Xiao Han, Yiming Ren, Yichen Yao, Yujing Sun, Yuexin Ma
2024Towards Real-time Video Compressive Sensing on Mobile Devices.
Miao Cao, Lishun Wang, Huan Wang, Guoqing Wang, Xin Yuan
2024Towards Robust Physical-world Backdoor Attacks on Lane Detection.
Xinwei Zhang, Aishan Liu, Tianyuan Zhang, Siyuan Liang, Xianglong Liu
2024Towards Robustness Prompt Tuning with Fully Test-Time Adaptation for CLIP's Zero-Shot Generalization.
Ran Wang, Hua Zuo, Zhen Fang, Jie Lu
2024Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach.
Qihe Pan, Zhen Zhao, Zicheng Wang, Sifan Long, Yiming Wu, Wei Ji, Haoran Liang, Ronghua Liang
2024Towards Stricter Black-box Integrity Verification of Deep Neural Network Models.
Chaoxiang He, Xiaofan Bai, Xiaojing Ma, Bin B. Zhu, Pingyi Hu, Jiayun Fu, Hai Jin, Dongmei Zhang
2024Towards Trustworthy MetaShopping: Studying Manipulative Audiovisual Designs in Virtual-Physical Commercial Platforms.
Esmée Henrieke Anne de Haas, Lik-Hang Lee, Yiming Huang, Carlos Bermejo, Pan Hui, Zijun Lin
2024Towards Video-based Activated Muscle Group Estimation in the Wild.
Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen
2024TrGa: Reconsidering the Application of Graph Neural Networks in Two-View Correspondence Pruning.
Luanyuan Dai, Xiaoyu Du, Jinhui Tang
2024Tracing Training Progress: Dynamic Influence Based Selection for Active Learning.
Tianjiao Wan, Kele Xu, Long Lan, Zijian Gao, Dawei Feng, Bo Ding, Huaimin Wang
2024Tracking-forced Referring Video Object Segmentation.
Ruxue Yan, Wenya Guo, Xubo Liu, Xumeng Liu, Ying Zhang, Xiaojie Yuan
2024TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios.
Lihao Liu, Yanqi Cheng, Zhongying Deng, Shujun Wang, Dongdong Chen, Xiaowei Hu, Pietro Liò, Carola-Bibiane Schönlieb, Angelica I. Avilés-Rivero
2024Training Pansharpening Networks at Full Resolution Using Degenerate Invariance.
Yichang Qu, Bing Li, Jie Huang, Feng Zhao
2024Training Spatial-Frequency Visual Prompts and Probabilistic Clusters for Accurate Black-Box Transfer Learning.
Wonwoo Cho, Kangyeol Kim, Saemee Choi, Jaegul Choo
2024Training-Free Feature Reconstruction with Sparse Optimization for Vision-Language Models.
Yi Zhang, Ke Yu, Angelica I. Avilés-Rivero, Jiyuan Jia, Yushun Tang, Zhihai He
2024Traj2Former: A Local Context-aware Snapshot and Sequential Dual Fusion Transformer for Trajectory Classification.
Yuan Xie, Yichen Zhang, Yifang Yin, Sheng Zhang, Ying Zhang, Rajiv Ratn Shah, Roger Zimmermann, Guoqing Xiao
2024TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment.
Qinfeng Li, Zhiqiang Shen, ZhengHan Qin, Yangfan Xie, Xuhong Zhang, Tianyu Du, Sheng Cheng, Xun Wang, Jianwei Yin
2024TransNet V2: An Effective Deep Network Architecture for Fast Shot Transition Detection.
Tomás Soucek, Jakub Lokoc
2024Transferable Adversarial Facial Images for Privacy Protection.
Minghui Li, Jiangxiong Wang, Hao Zhang, Ziqi Zhou, Shengshan Hu, Xiaobing Pei
2024Transferring to Real-World Layouts: A Depth-aware Framework for Scene Adaptation.
Mu Chen, Zhedong Zheng, Yi Yang
2024Translating Motion to Notation: Hand Labanotation for Intuitive and Comprehensive Hand Movement Documentation.
Ling Li, Wenrui Yang, Xinchun Yu, Junliang Xing, Xiao-Ping Zhang
2024TreeReward: Improve Diffusion Model via Tree-Structured Feedback Learning.
Jiacheng Zhang, Jie Wu, Huafeng Kuang, Haiming Zhang, Yuxi Ren, Weifeng Chen, Manlin Zhang, Xuefeng Xiao, Guanbin Li
2024Triple Alignment Strategies for Zero-shot Phrase Grounding under Weak Supervision.
Pengyue Lin, Ruifan Li, Yuzhe Ji, Zhihan Yu, Fangxiang Feng, Zhanyu Ma, Xiaojie Wang
2024Trust Prophet or Not? Taking a Further Verification Step toward Accurate Scene Text Recognition.
Anna Zhu, Ke Xiao, Bo Zhou, Runmin Wang
2024Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos.
Zhengze Xu, Mengting Chen, Zhao Wang, Linyu Xing, Zhonghua Zhai, Nong Sang, Jinsong Lan, Shuai Xiao, Changxin Gao
2024Tutorial: Large Language-Vision Model in Society.
Kaicheng Yu, Zhuang Shao, Siyuan Qi, Dongfang Liu
2024Two Teachers Are Better Than One: Semi-supervised Elliptical Object Detection by Dual-Teacher Collaborative Guidance.
Yu Liu, Longhan Feng, Qi Jia, Zezheng Liu, Zi-Huang Cao
2024Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer.
Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li
2024U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight.
Tongtong Feng, Xin Wang, Feilin Han, Leping Zhang, Wenwu Zhu
2024U2USim - A UAV Telepresence Simulation Platform with Multi-agent Sensing and Dynamic Environment.
Feilin Han, Leping Zhang, Xin Wang, Ke-Ao Zhao, Ying Zhong, Ziyi Su, Tongtong Feng, Wenwu Zhu
2024UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents.
Yi Tu, Chong Zhang, Ya Guo, Huan Chen, Jinyang Tang, Huijia Zhu, Qi Zhang
2024UVMap-ID: A Controllable and Personalized UV Map Generative Model.
Weijie Wang, Jichao Zhang, Chang Liu, Xia Li, Xingqian Xu, Humphrey Shi, Nicu Sebe, Bruno Lepri
2024Uncertainty-Aware Pseudo-Labeling and Dual Graph Driven Network for Incomplete Multi-View Multi-Label Classification.
Wulin Xie, Xiaohuan Lu, Yadong Liu, Jiang Long, Bob Zhang, Shuping Zhao, Jie Wen
2024Uncovering Capabilities of Model Pruning in Graph Contrastive Learning.
Junran Wu, Xueyuan Chen, Shangzhe Li
2024Understanding and Tackling Scattering and Reflective Flare for Mobile Camera Systems.
Fengbo Lan, Chang Wen Chen
2024Understanding the Impact of AI-Generated Content on Social Media: The Pixiv Case.
Yiluo Wei, Gareth Tyson
2024Uni-DlLoRA: Style Fine-Tuning for Fashion Image Translation.
Fangjian Liao, Xingxing Zou, Waikeung Wong
2024Uni-YOLO: Vision-Language Model-Guided YOLO for Robust and Fast Universal Detection in the Open World.
Xudong Wang, Weihong Ren, Xi'ai Chen, Huijie Fan, Yandong Tang, Zhi Han
2024UniDense: Unleashing Diffusion Models with Meta-Routers for Universal Few-Shot Dense Prediction.
Lintao Dong, Wei Zhai, Zheng-Jun Zha
2024UniGM: Unifying Multiple Pre-trained Graph Models via Adaptive Knowledge Aggregation.
Jintao Chen, Fan Wang, Shengye Pang, Siwei Tan, Mingshuai Chen, Tiancheng Zhao, Meng Xi, Jianwei Yin
2024UniL: Point Cloud Novelty Detection through Multimodal Pre-training.
Yuhan Wang, Mofei Song
2024UniQ: Unified Decoder with Task-specific Queries for Efficient Scene Graph Generation.
Xinyao Liao, Wei Wei, Dangyang Chen, Yuanyuan Fu
2024UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis.
Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie
2024Unifying Spike Perception and Prediction: A Compact Spike Representation Model Using Multi-scale Correlation.
Kexiang Feng, Chuanmin Jia, Siwei Ma, Wen Gao
2024Universal Frequency Domain Perturbation for Single-Source Domain Generalization.
Chuang Liu, Yichao Cao, Xiu Su, Haogang Zhu
2024Unleashing the Power of Generic Segmentation Model: A Simple Baseline for Infrared Small Target Detection.
Mingjin Zhang, Chi Zhang, Qiming Zhang, Yunsong Li, Xinbo Gao, Jing Zhang
2024Unlimited Vision: Professional Composition by Yourself.
Xin Jin, Liaoruxing Zhang, Longteng Jiang, Dandan Li
2024Unpaired Photo-realistic Image Deraining with Energy-informed Diffusion Model.
Yuanbo Wen, Tao Gao, Ting Chen
2024Unraveling Motion Uncertainty for Local Motion Deblurring.
Zeyu Xiao, Zhihe Lu, Michael Bi Mi, Zhiwei Xiong, Xinchao Wang
2024Unseen No More: Unlocking the Potential of CLIP for Generative Zero-shot HOI Detection.
Yixin Guo, Yu Liu, Jianghao Li, Weimin Wang, Qi Jia
2024Unsupervised Image-to-Video Adaptation via Category-aware Flow Memory Bank and Realistic Video Generation.
Kenan Huang, Junbao Zhuo, Shuhui Wang, Chi Su, Qingming Huang, Huimin Ma
2024Unsupervised Multi-view Pedestrian Detection.
Mengyin Liu, Chao Zhu, Shiqi Ren, Xu-Cheng Yin
2024Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models.
Qiao Li, Xiaomeng Fu, Xi Wang, Jin Liu, Xingyu Gao, Jiao Dai, Jizhong Han
2024Unveiling and Mitigating Bias in Audio Visual Segmentation.
Peiwen Sun, Honggang Zhang, Di Hu
2024UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation.
Siru Zhong, Xixuan Hao, Yibo Yan, Ying Zhang, Yangqiu Song, Yuxuan Liang
2024Utilizing Speaker Profiles for Impersonation Audio Detection.
Hao Gu, Jiangyan Yi, Chenglong Wang, Yong Ren, Jianhua Tao, Xinrui Yan, Yujie Chen, Xiaohui Zhang
2024Utilizing Very High-resolution Optical RGB Satellite Imagery in Geo-information Extraction for Fine-scale Map-making.
Wenmiao Hu
2024V
Xuanyu Zhang, Youmin Xu, Runyi Li, Jiwen Yu, Weiqi Li, Zhipei Xu, Jian Zhang
2024VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer.
Humen Zhong, Zhibo Yang, Zhaohai Li, Peng Wang, Jun Tang, Wenqing Cheng, Cong Yao
2024VLMEvalKit: An Open-Source ToolKit for Evaluating Large Multi-Modality Models.
Haodong Duan, Junming Yang, Yuxuan Qiao, Xinyu Fang, Lin Chen, Yuan Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Jiaqi Wang, Dahua Lin, Kai Chen
2024VR-DiagNet: Medical Volumetric and Radiomic Diagnosis Networks with Interpretable Clinician-like Optimizing Visual Inspection.
Shouyu Chen, Liang Hu, Tangwei Ye, Zhongyuan Lai, Qi Zhang, Ke Liu, Usman Naseem, Ke Sun, Nengjun Zhu
2024VR-Mediated Cognitive Defusion: A Comparative Study for Managing Negative Thoughts.
Kento Shigyo, Yifan Cao, Kentaro Takahira, Mingming Fan, Huamin Qu
2024VRDistill: Vote Refinement Distillation for Efficient Indoor 3D Object Detection.
Ze Yuan, Jinyang Guo, Dakai An, Junran Wu, He Zhu, Jianhao Li, Xueyuan Chen, Ke Xu, Jiaheng Liu
2024Vaccine Misinformation Detection in X using Cooperative Multimodal Framework.
Usman Naseem, Adam G. Dunn, Matloob Khushi, Jinman Kim
2024VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness.
Rongyu Zhang, Zefan Cai, Huanrui Yang, Zidong Liu, Denis A. Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, Li Du, Shanghang Zhang
2024Vi2ACT: Video-enhanced Cross-modal Co-learning with Representation Conditional Discriminator for Few-shot Human Activity Recognition.
Kang Xia, Wenzhong Li, Yimiao Shao, Sanglu Lu
2024Video Anomaly Detection via Progressive Learning of Multiple Proxy Tasks.
Menghao Zhang, Jingyu Wang, Qi Qi, Pengfei Ren, Haifeng Sun, Zirui Zhuang, Huazheng Wang, Lei Zhang, Jianxin Liao
2024Video Bokeh Rendering: Make Casual Videography Cinematic.
Yawen Luo, Min Shi, Liao Shen, Yachuan Huang, Zixuan Ye, Juewen Peng, Zhiguo Cao
2024Video Editing Chatbot: Language-Driven Video Compositing System.
Ying Ma, Xinyan Yang, Aiqi Wang, Jianglin Zeng, Shaofei Liu
2024View Gap Matters: Cross-view Topology and Information Decoupling for Multi-view Clustering.
Fangdi Wang, Jiaqi Jin, Zhibin Dong, Xihong Yang, Yu Feng, Xinwang Liu, Xinzhong Zhu, Siwei Wang, Tianrui Liu, En Zhu
2024View-consistent Object Removal in Radiance Fields.
Yiren Lu, Jing Ma, Yu Yin
2024ViewPCGC: View-Guided Learned Point Cloud Geometry Compression.
Huiming Zheng, Wei Gao, Zhuozhen Yu, Tiesong Zhao, Ge Li
2024Vigo: Audiovisual Fake Detection and Segment Localization.
Diego Pérez-Vieites, Juan José Moreira-Pérez, Ángel Aragón-Kifute, Raquel Román-Sarmiento, Rubén Castro-González
2024Virtual Agent Positioning Driven by Personal Characteristics.
Jingjing Liu, Youyi Zheng, Kun Zhou
2024Virtual Visual-Guided Domain-Shadow Fusion via Modal Exchanging for Domain-Specific Multi-Modal Neural Machine Translation.
Zhenyu Hou, Junjun Guo
2024VisHanfu: An Interactive System for the Promotion of Hanfu Knowledge via Cross-Shaped Flat Structure.
Minjing Yu, Lingzhi Zeng, Xinxin Du, Jenny Sheng, Qiantian Liao, Yong-Jin Liu
2024Visual Grounding with Multi-modal Conditional Adaptation.
Ruilin Yao, Shengwu Xiong, Yichen Zhao, Yi Rong
2024Visual Question Answering Driven Eye Tracking Paradigm for Identifying Children with Autism Spectrum Disorder.
Jiansong Qi, Yaping Huang, Ying Zhang, Sihui Zhang, Mei Tian, Yi Tian, Fanchao Meng, Lin Guan, Tianyi Chang
2024Visual-Language Collaborative Representation Network for Broad-Domain Few-Shot Image Classification.
Qianyu Guo, Jieji Ren, Haofen Wang, Tianxing Wu, Weifeng Ge, Wenqiang Zhang
2024Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning.
Xiangyan Qu, Jing Yu, Keke Gai, Jiamin Zhuang, Yuanmin Tang, Gang Xiong, Gaopeng Gou, Qi Wu
2024Visual-linguistic Cross-domain Feature Learning with Group Attention and Gamma-correct Gated Fusion for Extracting Commonsense Knowledge.
Jialu Zhang, Xinyi Wang, Chenglin Yao, Jianfeng Ren, Xudong Jiang
2024VmambaSCI: Dynamic Deep Unfolding Network with Mamba for Compressive Spectral Imaging.
Mingjin Zhang, Longyi Li, Wenxuan Shi, Jie Guo, Yunsong Li, Xinbo Gao
2024VoCAPTER: Voting-based Pose Tracking for Category-level Articulated Object via Inter-frame Priors.
Li Zhang, Zean Han, Yan Zhong, Qiaojun Yu, Xingyu Wu, Xue Wang, Rujing Wang
2024VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation.
Rongjie Huang, Yongqi Wang, Ruofan Hu, Xiaoshan Xu, Zhiqing Hong, Dongchao Yang, Xize Cheng, Zehan Wang, Ziyue Jiang, Zhenhui Ye, Luping Liu, Siqi Zheng, Zhou Zhao
2024VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling.
Yixuan Zhou, Xiaoyu Qin, Zeyu Jin, Shuoyi Zhou, Shun Lei, Songtao Zhou, Zhiyong Wu, Jia Jia
2024VoxelTrack: Exploring Multi-level Voxel Representation for 3D Point Cloud Object Tracking.
Yuxuan Lu, Jiahao Nie, Zhiwei He, Hongjie Gu, Xudong Lv
2024VrdONE: One-stage Video Visual Relation Detection.
Xinjie Jiang, Chenxi Zheng, Xuemiao Xu, Bangzhen Liu, Weiying Zheng, Huaidong Zhang, Shengfeng He
2024WSEL: EEG Feature Selection with Weighted Self-expression Learning for Incomplete Multi-dimensional Emotion Recognition.
Xueyuan Xu, Li Zhuo, Jinxin Lu, Xia Wu
2024Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement.
Wenbin Zou, Hongxia Gao, Weipeng Yang, Tongtong Liu
2024WaveDN: A Wavelet-based Training-free Zero-shot Enhancement for Vision-Language Models.
Jiulin Li, Mengyu Yang, Ye Tian, Lanshan Zhang, Yongchun Lu, Jice Liu, Wendong Wang
2024WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition.
Lianghui Zhu, Junwei Zhou, Yan Liu, Xin Hao, Wenyu Liu, Xinggang Wang
2024Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering.
Haibo Wang, Chenghang Lai, Yixuan Sun, Weifeng Ge
2024Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts.
Peng Wu, Xuerong Zhou, Guansong Pang, Zhiwei Yang, Qingsen Yan, Peng Wang, Yanning Zhang
2024What's the Real: A Novel Design Philosophy for Robust AI-Synthesized Voice Detection.
Xuan Hai, Xin Liu, Yuan Tan, Gang Liu, Song Li, Weina Niu, Rui Zhou, Xiaokang Zhou
2024When ControlNet Meets Inexplicit Masks: A Case Study of ControlNet on its Contour-following Ability.
Wenjie Xuan, Yufei Xu, Shanshan Zhao, Chaoyue Wang, Juhua Liu, Bo Du, Dacheng Tao
2024When, Where, and What? A Benchmark for Accident Anticipation and Localization with Large Language Models.
Haicheng Liao, Yongkang Li, Chengyue Wang, Yanchen Guan, Kahou Tam, Chunlin Tian, Li Li, Chengzhong Xu, Zhenning Li
2024White-box Multimodal Jailbreaks Against Large Vision-Language Models.
Ruofan Wang, Xingjun Ma, Hanxu Zhou, Chuanjun Ji, Guangnan Ye, Yu-Gang Jiang
2024WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge.
Wenbin Wang, Liang Ding, Li Shen, Yong Luo, Han Hu, Dacheng Tao
2024WorldGPT: Empowering LLM as Multimodal World Model.
Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang
2024X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation.
Pinxue Guo, Wanyun Li, Hao Huang, Lingyi Hong, Xinyu Zhou, Zhaoyu Chen, Jinglun Li, Kaixun Jiang, Wei Zhang, Wenqiang Zhang
2024XMeCap: Meme Caption Generation with Sub-Image Adaptability.
Yuyan Chen, Songzhou Yan, Zhihong Zhu, Zhixu Li, Yanghua Xiao
2024ZePo: Zero-Shot Portrait Stylization with Faster Sampling.
Jin Liu, Huaibo Huang, Jie Cao, Ran He
2024Zenith: Real-time Identification of DASH Encrypted Video Traffic with Distortion.
Weitao Tang, Jianqiang Li, Meijie Du, Die Hu, Qingyun Liu
2024Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion.
Yingxuan Li, Ryota Hinami, Kiyoharu Aizawa, Yusuke Matsui
2024Zero-Shot Controllable Image-to-Video Animation via Motion Decomposition.
Shoubin Yu, Jacob Zhiyuan Fang, Jian Zheng, Gunnar A. Sigurdsson, Vicente Ordonez, Robinson Piramuthu, Mohit Bansal
2024iControl3D: An Interactive System for Controllable 3D Scene Generation.
Xingyi Li, Yizheng Wu, Jun Cen, Juewen Peng, Kewei Wang, Ke Xian, Zhe Wang, Zhiguo Cao, Guosheng Lin
2024mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model.
Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang
2024rPPG-HiBa: Hierarchical Balanced Framework for Remote Physiological Measurement.
Yin Wang, Hao Lu, Ying-Cong Chen, Li Kuang, MengChu Zhou, Shuiguang Deng
2024uvg266: Open-Source VVC Intra Encoder.
Marko Viitanen, Joose Sainio, Kari Siivonen, Alexandre Mercat, Jarno Vanne
2024uvgComm: Open Software for Low-Latency Multi-party Video Communication.
Joni Räsänen, Heikki Tampio, Alexandre Mercat, Jarno Vanne