| 2025 | Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation. Luoxu Jin, Hiroshi Watanabe |
| 2025 | An Exploration of User Biometric Identification In XR Applications Based On User Head Movement. Owen Dossett, Ke Lyu, Maohong Liao, Han Li, Xianglong Feng |
| 2025 | Anderson Accelerated Residual Solver for Total Variation Models in Image Processing. Yuanhao Gong, Yongfei Guo |
| 2025 | Blind Image Super-Resolution with Local and Global Dual-Guidance. Yajun Qiu, Shuyuan Zhu, Lantao Yu, Bing Zeng |
| 2025 | CG-SMFNet: Consensus-Guided Selective Multimodal Fusion for Weakly Supervised Temporal Action Localization. Peng Liu, Zitai Jiang |
| 2025 | Carbon-Efficient Internet Video Streaming. Zichen Zhu, Tian Guo, Sheng Wei |
| 2025 | CompBench: Benchmarking and Comparing Image Generation with Large Multimodal Models. Jiarui Wang, Huiyu Duan, Yuke Xing, Yiling Xu, Guangtao Zhai, Xiongkuo Min |
| 2025 | Cross-Modal Thermal Image Compression via RGB Side Information. Sayush Maharjan, Raghunath Sai Puttagunta, Zach Button, Zhu Li |
| 2025 | D3Net: Dual-Path Decoupling-Distillation for Adaptive Fusion in Continual Egocentric Learning. Chenghao Qi, Heqian Qiu, Zhaofeng Shi, Lanxiao Wang, Hanwen Zhang, Xinyu Chen, Hongliang Li |
| 2025 | DBAB: A Dual-Branch Adaptive Balance Framework with Optimized Plasticity Branch for Class-Incremental Learning. Xinyu Chen, Heqian Qiu, Chenghao Qi, Ruisong Dai, Hongliang Li |
| 2025 | DFR: A Decompose-Fuse-Reconstruct Framework for Multi-Modal Few-Shot Segmentation. Shuai Chen, Fanman Meng, Xiwei Zhang, Haoran Wei, Chenhao Wu, Qingbo Wu, Hongliang Li |
| 2025 | Data-independent Beamforming for End-to-end Multichannel Multi-speaker ASR. Can Cui, Paul Magron, Mostafa Sadeghi, Emmanuel Vincent |
| 2025 | Dynamic Gaussian Streams for Volumetric Video via Codebook-Based Quantization. Zhehao Shen, Yiwen Cai, Yuanji Lu, Yu Hong, Yize Wu, Meihan Zheng, Yingliang Zhang, Lan Xu |
| 2025 | EPINET-Lite: Rethinking Mixed Convolutions for Efficient Light Field Disparity Estimation Network. Ali Hassan, Tingting Zhang, Karen Egiazarian, Mårten Sjöström |
| 2025 | Efficient Generative Defect Synthesis for Industrial Anomaly Detection on MVTec AD. Avinash Kumar Sharma, Tushar Shinde |
| 2025 | Efficient Polyp Detection via Wavelet-Driven Boundary Enhancement and Temporal Consistency. Hanwen Zhang, Heqian Qiu, Lanxiao Wang, Chenghao Qi, Ruisong Dai, Hongliang Li |
| 2025 | Explicit Residual-Based Scalable Image Coding for Humans and Machines. Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe |
| 2025 | Exploring Cross-Stage Adversarial Transferability in Class-Incremental Continual Learning. Jungwoo Kim, Jong-Seok Lee |
| 2025 | FPG-NAS: FLOPs-Aware Gated Differentiable Neural Architecture Search for Efficient 6DoF Pose Estimation. Nassim Ali Ousalah, Peyman Rostami, Anis Kacem, Enjie Ghorbel, Emmanuel Koumandakis, Djamila Aouada |
| 2025 | FPGA Accelerated One-Sided Box Filter for Edge-Preserving Image Processing. Yongfei Guo, Xudong Niu, Chizhi Zhang, Yuanhao Gong |
| 2025 | Flexibly Constrained Tucker Decomposition for High-Order Spectral Analysis. Fei He, Houji Du, Nipon Theera-Umpon, Yipeng Liu, Ce Zhu |
| 2025 | Frequency-Weighted Training Losses for Phoneme-Level DNN-based Speech Enhancement. Nasser-Eddine Monir, Paul Magron, Romain Serizel |
| 2025 | Guided Diffusion for the Extension of Machine Vision to Human Visual Perception. Takahiro Shindo, Yui Tatsumi, Taiju Watanabe, Hiroshi Watanabe |
| 2025 | HGS_OFAT: High-fidelity Gaussian SLAM based on Optical Flow Assisted Tracking. Zhenyong Li, Shanxin Zhang, Yuxiang Liu, Chuanfen Feng, Hui Ji, Jiande Sun |
| 2025 | HyperDiff: Hypergraph Guided Diffusion Model for 3D Human Pose Estimation. Bing Han, Yuhua Huang, Pan Gao |
| 2025 | IEEE International Workshop on Multimedia Signal Processing, MMSP 2025, Beijing, China, September 21-23, 2025 |
| 2025 | IdCo: Joint Identification and Contrastive Learning for Masked Face Recognition. Qingtong Xu, Chao Zhang, Ao Li, Xiaoning Liu, Ce Zhu |
| 2025 | Infant Cry Detection In Noisy Environment Using Blueprint Separable Convolutions and Time-Frequency Recurrent Neural Network. Haolin Yu, Yanxiong Li |
| 2025 | LSS3D: Learnable Spatial Shifting for Consistent and High-Quality 3D Generation from Single-Image. Zhuojiang Cai, Yiheng Zhang, Meitong Guo, Mingdao Wang, Yuwang Wang |
| 2025 | Latent Space Stability vs. Perceptual Sensitivity: A Study of Visual Encoders under Distortion. Abderrezzaq Sendjasni, Mohamed-Chaker Larabi |
| 2025 | Learned Image Codec with Progressive Multi-Scale Probability Model for Streaming in Unreliable Communication Channels. Honglei Zhang, A. Burakhan Koyuncu, Jukka I. Ahonen, Nannan Zou, Francesco Cricri |
| 2025 | Learning 3D mesh saliency from spiral patch features. Olivier Lézoray, Anass Nouri |
| 2025 | Lightweight DNN for Full-Band Speech Denoising on Mobile Devices: Exploiting Long and Short Temporal Patterns. Konstantinos Drossos, Mikko Heikkinen, Paschalis Tsiaflakis |
| 2025 | Lightweight Steel Surface Defect Detection via Knowledge Distillation. Tao Lu, Gaochang Wu |
| 2025 | Low Latency Immersive Visual Communication with Scalable Gaussian Splatting Coding. Lingyu Shi, Jiaqi Zou, Songlin Sun, Geert Van der Auwera, Zhu Li |
| 2025 | MGFT: Multi-Geometric Fusion Transformer for Robust Point Cloud Registration. Yuxiang Liu, Shanxin Zhang, Zhenyong Li, Chuanfen Feng, Hui Ji, Jiande Sun |
| 2025 | Meta Learning for Adaptive Disentangled User Preference Integration Toward Multimodal Recommendation. Zhenchao Wu, Hongteng Xu, Xu Chen |
| 2025 | Meta Learning-based Multimodal Recommendation with Adaptive User Modality-Aware Preference Integration. Zhenchao Wu, Hongteng Xu, Xu Chen |
| 2025 | Multimodal Federated Learning for Personalized Clothing Recommendation. Xinhui Yu, Sophie Liu, Chunhua Wu |
| 2025 | Music Source Restoration. Yongyi Zang, Zheqi Dai, Mark D. Plumbley, Qiuqiang Kong |
| 2025 | NeRFCompressor: Enhancing Dynamic Scene Representation for Efficient 6-DoF Object Transportation. Jin Zhou, Mufeng Zhu, Yao Liu, Songqing Chen |
| 2025 | OrthCal: Synergizing Orthogonal Contrastive Learning and Prototype Calibration for Few-Shot Class-Incremental Learning. Ruisong Dai, Hanwen Zhang, Xinyu Chen, Chenghao Qi, Heqian Qiu, Hongliang Li |
| 2025 | PromptGS: Visual Prompting for Tiny Object Reconstruction in 3DGS Optimization. Xun Wang, Xutao Xue, Xubing Kang, Siyuan Li, Shayer Shabab Utsho, Kun Li, Mengqi Ji |
| 2025 | Prototype Embedding Optimization for Human-Object Interaction Detection in Livestreaming: PeO-HOI. Menghui Zhang, Jing Zhang, Lin Chen, Li Zhuo |
| 2025 | Real-Time Distortion Detection for PTZ Camera Systems. Zhuobin Yuan, Rui Dai, Rayan Alghamdi |
| 2025 | Real-Time View Synthesis with Multiplane Image Network using Multimodal Supervision. Manu Gond, Mohammadreza Shamshirgarha, Emin Zerman, Sebastian Knorr, Mårten Sjöström |
| 2025 | Reinforcement Learning-Based Dynamic Resource Allocation for Aerial 360° Video VR Streaming. Jacob Chakareski, Lingdong Wang, Nicholas Mastronarde |
| 2025 | Restore Anything Anywhere: Targeted Image Restoration with Object Segmentation and Text Guidance. Yen-Ku Yeh, Chun-Hao Yang, Kun-tai Wu, Yan-Tsung Peng, Chun-Rong Huang, Jun-Cheng Chen |
| 2025 | Rethinking Document Layout Analysis through Text Clustering via Multi-Modal Graph Convolution Networks. Wenxi Li, Chenyang Lyu, Wei Ji, Liting Zhou, Cathal Gurrin, Yuchen Guo |
| 2025 | S-LAM3D: Segmentation-Guided Monocular 3D Object Detection via Feature Space Fusion. Diana-Alexandra Sas, Florin Oniga |
| 2025 | Secure INN-based Steganography via Model Smoothing and Adversarial Attacks. Weixiang Zhao, Fei Shang, Jin Li, Jingyang Wen, Xiangui Kang, Z. Jane Wang |
| 2025 | Secure protection of 3D content through reversible geometric deformation. Khélian Larvet, Jean-Pierre Pedeboy, William Puech |
| 2025 | SpatialGeo: Boosting Spatial Reasoning in Multimodal LLMs via Geometry-Semantics Fusion. Jiajie Guo, Qingpeng Zhu, Jin Zeng, Xiaolong Wu, Changyong He, Weida Wang |
| 2025 | Sphere-GAN: a GAN-based approach for saliency estimation in 360° videos. Mahmoud Z. A. Wahba, Sara Baldoni, Federica Battisti |
| 2025 | Structure-Preserving Patch Decoding for Efficient Neural Video Representation. Taiga Hayami, Kakeru Koizumi, Hiroshi Watanabe |
| 2025 | Subjective Visual Quality Assessment of Compressed Light Field Images: Learning-based vs. Conventional Methods. Emin Zerman, Soheib Takhtardeshir, Anthony Trioux, Jianlong Qin, Wenjie Wu, Roger Olsson, Mårten Sjöström |
| 2025 | Tackling Re-buffering in Adaptive Video Streaming over Dynamic Networks: A Generative AI Approach. Duc V. Nguyen |
| 2025 | Task-Aware Optimized Color Image Demosaicing. Lei Xiong, Zihao Wang, Boyuan Zhang, Feiyu Chen, Shuyuan Zhu, Bing Zeng |
| 2025 | Touch-Augmented Gaussian Splatting for Enhanced 3D Scene Reconstruction. Yue Gao, Xiao Xu, Eckehard G. Steinbach, Daniel E. Lucani, Qi Zhang |
| 2025 | Towards Low-Latency Tracking of Multiple Speakers With Short-Context Speaker Embeddings. Taous Iatariene, Alexandre Guérin, Romain Serizel |
| 2025 | Towards Volumetric Video: a Technical Overview of Immersive Media. Shi Pan, Hongshuai Li, Zhengxian Yang, Le Wang, Cheng Su, Liqian Ma, Hua Du, Borong Lin, Tao Yu |
| 2025 | White-box Differentiable Model of Perceived Localisation. Antoine R. Souchaud, Pedro Lladó, Annika Neidhardt, Zoran Cvetkovic, Enzo De Sena |