1. Diffusion-based diverse audio captioning with retrieval-guided Langevin dynamics (NSTL, National Science and Technology Library)
Zhu, Yonggang | Men, Aidong... - Information Fusion - 2025, 114 - 13 pages
2. InVideo Search: Scene Description Clustering and Integrating Image and Audio Captioning for Enhanced Video Search
Almira Asif Khan | Muhammed... - Distributed Computing and Internet Technology - International Conference on Distributed Computing and Internet Technology - 2025 - pp. 195-208 - 14 pages
Luntian Mou | Peize Li... - Social Robotics - International Conference on Social Robotics - 2025 - pp. 282-292 - 11 pages
4. MCANet: Multimodal Caption Aware Training-Free Video Anomaly Detection via Large Language Model
Prabhu Prasad Dev | Raju Hazari... - Pattern Recognition, Part XXXII - International Conference on Pattern Recognition - 2025 - pp. 362-379 - 18 pages
5. VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Jing Liu | Sihan Chen... - IEEE Transactions on Pattern Analysis and Machine Intelligence - 2025, 47(2) - pp. 708-724 - 17 pages
6. Audio-Guided Visual Knowledge Representation
Fei Yu | Zhiguo Wan... - Database Systems for Advanced Applications - International Conference on Database Systems for Advanced Applications | International Workshop on Big Data Management and Service | International Workshop on Graph Data Management and Analysis | International Workshop on Big Data Quality Management | Workshop on Emerging Results in Data Science and Engineering - 2025 - pp. 129-146 - 18 pages
7. Towards a Multimodal Framework for Remote Sensing Image Change Retrieval and Captioning
Roger Ferrod | Luigi Di Caro... - Discovery Science, Part II - International Conference on Discovery Science - 2025 - pp. 231-245 - 15 pages
8. Diffusion-Based Multimodal Video Captioning
Jaakko Kainulainen | Zixin Guo... - Computer Vision - ACCV 2024, Part III - Asian Conference on Computer Vision - 2025 - pp. 148-165 - 18 pages
9. An efficient deep learning-based video captioning framework using multi-modal features
Soumya Varma | Dinesh Peter James - Expert Systems - 2025, 42(2) - pp. e12920.1-e12920.16 - 16 pages - Cited by: 2
10. AD2AT: Audio Description to Alternative Text, a Dataset of Alternative Text from Movies
Elise Lincker | Camille Guinaudeau... - MultiMedia Modeling, Part I - International Conference on MultiMedia Modeling - 2025 - pp. 58-71 - 14 pages