Tao Jin  (金涛)


Research Interests: Multimedia Analysis, Computer Vision, Natural Language Learning, Transfer Learning
Address: Hangzhou/Ningbo, Zhejiang Province
Email: jint_zju@zju.edu.cn

Education


Work Experience

  • Research Intern at Taobao Research
    Taobao Lab
    June 2020 - Sep 2020       Hangzhou, China
  • Research Intern at Kuake Research
    Kuake Lab
    Nov 2019 - Feb 2020       Hangzhou, China

Supervised and Co-supervised Students

  • School of Software, Zhejiang University
    Wang Lin (2021, linwanglw@zju.edu.cn, National Scholarship, PhD at ZJU),
    Linjun Li (2021, lilinjun21@zju.edu.cn, National Scholarship, Beidou Plan of Meituan),
    Xize Cheng (2021, chengxize@zju.edu.cn, National Scholarship, PhD at ZJU),
    Ye Wang (2021, yew@zju.edu.cn, National Scholarship, Daka of Tencent & 2-1 of ByteDance),
    Zirun Guo (2024),
    Weicai Yan (2024),
    Dongjie Fu (2024),
    Xiaoda Yang (2024)

Publications (* denotes equal contribution, & denotes corresponding author)

  1. Exploring Embodied Emotion Through A Large-Scale Egocentric Video Dataset
    Wang Lin, Tao Jin, Zhou Zhao, Chang Yao, Jingyuan Chen,
    NeurIPS, 2024

  2. Action Imitation in Common Action Space for Customized Action Image Synthesis
    Wang Lin, Jingyuan Chen, Zirun Guo, Tao Jin, Zhou Zhao,
    NeurIPS, 2024

  3. Balancing Multimodal Learning with Classifier-guided Gradient Modulation
    Zirun Guo, Tao Jin&,
    NeurIPS, 2024

  4. AudioVSR: Enhancing Video Speech Recognition with Audio Data
    Xiaoda Yang, Xize Cheng, Tao Jin&,
    EMNLP, 2024

  5. Calibrating Prompt from History for Continual Vision-Language Retrieval and Grounding
    Tao Jin, Weicai Yan, Ye Wang, Zhou Zhao,
    ACM MM, 2024

  6. Boosting Speech Recognition Robustness to Modality-Distortion with Contrast-Augmented Prompts
    Dongjie Fu, Xize Cheng, Xiaoda Yang, Tao Jin&, Zhou Zhao,
    ACM MM, 2024

  7. SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning
    Xiaoda Yang, Xize Cheng, Dongjie Fu, Tao Jin&, Zhou Zhao,
    ACM MM, 2024

  8. Low-rank Prompt Interaction for Continual Vision-Language Retrieval
    Weicai Yan, Ye Wang, Tao Jin&, Zhou Zhao,
    ACM MM, 2024

  9. TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation
    Xize Cheng, Tao Jin, Zhou Zhao,
    ACL, 2024

  10. Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation
    Songju Lei, Xize Cheng&, Tao Jin, Zhou Zhao,
    ACL, 2024

  11. Rethinking the Multimodal Correlation of Multimodal Sequential Learning
    Tao Jin, Zhou Zhao,
    ACL, 2024

  12. Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition
    Zirun Guo, Tao Jin&, Zhou Zhao,
    ACL, 2024

  13. Two-Stream Generative Recommender with Behavior-Semantic Collaboration
    Ye Wang, Jiahao Xun, Minjie Hong, Jieming Zhu, Tao Jin, Zhou Zhao,
    KDD, 2024

  14. Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt
    Yongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao,
    NAACL, 2024

  15. Non-confusing Generation of Customized Concepts in Diffusion Models
    Wang Lin, Jingyuan Chen, Tao Jin, Zhou Zhao,
    ICML, 2024

  16. Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion
    Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Tao Jin, Zhou Zhao,
    ICML, 2024

  17. MPOD123: One Image to 3D Content Generation Using Mask-enhanced Progressive Optimization
    Jimin Xu*, Tianbao Wang*, Tao Jin&, Zhou Zhao,
    CVPR, 2024

  18. Rethinking Missing Modality Learning from a Decoding Perspective
    Tao Jin, Zhou Zhao,
    ACM MM, 2023

  19. Exploring Group-Based Video Captioning with Efficient Relational Approximation
    Wang Lin*, Tao Jin*, Ye Wang, Zhou Zhao,
    ICCV, 2023

  20. Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
    Xize Cheng*, Tao Jin*, Linjun Li, Zhou Zhao,
    ICCV, 2023

  21. Multi-Granularity Relational Attention Network for Audio-Visual QA
    Linjun Li*, Tao Jin*, Wang Lin, Hao Jiang, Zhou Zhao,
    TCSVT, 2023

  22. OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment
    Xize Cheng*, Tao Jin*, Linjun Li, Wang Lin, Xinyu Duan,
    ACL, 2023

  23. TAVT: Towards Transferable Audio-Visual Text Generation
    Wang Lin*, Tao Jin*, Ye Wang, Wenwen Pan, Xize Cheng, Linjun Li, Zhou Zhao
    ACL, 2023

  24. Semantic-Conditioned Dual Adaptation for Query-based Visual Segmentation
    Ye Wang*, Tao Jin*, Wang Lin, Xize Cheng, Linjun Li, Zhou Zhao
    ACL, 2023

  25. Contrastive Token-Wise Meta-Learning for Unseen Temporal-Aligned Translation
    Linjun Li*, Tao Jin*, Xize Cheng, Ye Wang, Wang Lin, Rongjie Huang, Zhou Zhao,
    ACL, 2023

  26. DATE: Domain Adaptive Product Seeker for E-commerce
    Haoyuan Li, Hao Jiang, Tao Jin, Mengyan Li, Yan Chen, Zhijie Lin, Yang Zhao, Zhou Zhao,
    CVPR, 2023

  27. Gloss Attention for Gloss-free Sign Language Translation
    Aoxiong Yin, Tianyun Zhong, Li Tang, Weike Jin, Tao Jin, Zhou Zhao,
    CVPR, 2023

  28. Interaction Augmented Transformer with Decoupled Decoding for Video Captioning
    Tao Jin, Zhou Zhao, Peng Wang, Jun Yu, Fei Wu,
    Neurocomputing, 2022

  29. MC-SLT: Towards Low-Resource Signer-Adaptive Sign Language Translation
    Tao Jin, Zhou Zhao, Meng Zhang, Xingshan Zeng,
    ACM MM, 2022

  30. Prior Knowledge and Memory Enriched Transformer for Sign Language Translation
    Tao Jin, Zhou Zhao, Meng Zhang, Xingshan Zeng,
    ACL, 2022

  31. Generalizable Multi-Linear Attention Network
    Tao Jin, Zhou Zhao,
    NeurIPS, 2021

  32. Contrastive Disentangled Meta-Learning for Signer-Independent Sign Language Translation
    Tao Jin, Zhou Zhao,
    ACM MM, 2021

  33. Dual Low-Rank Multimodal Fusion
    Tao Jin*, Siyu Huang*, Yingming Li, Zhongfei Zhang
    EMNLP, 2020

  34. SBAT: Video Captioning with Sparse Boundary-Aware Transformer
    Tao Jin, Siyu Huang, Ming Chen, Yingming Li, Zhongfei Zhang
    IJCAI, 2020

  35. Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning
    Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang,
    EMNLP, 2019

  36. Recurrent Convolutional Video Captioning with Global and Local Attention
    Tao Jin, Yingming Li, Zhongfei Zhang,
    Neurocomputing, 2019


Contest

