Yapeng Tian

Assistant Professor, The University of Texas at Dallas

yapeng.tian@utdallas.edu

ECSS 4.211

Bio

I am an assistant professor in the Computer Science Department of UT Dallas and lead the Computer Vision and Multimodal Computing (CVMC) Lab. I am interested in solving core computer vision, computer audition, and machine learning problems and applying the developed learning approaches to broad AI applications.

Before coming to UTD, I finished my PhD at the University of Rochester, advised by Chenliang Xu; my master's degree at Tsinghua University, working with Wenming Yang; and my B.E. degree at Xidian University. I was a visiting student at SIAT, advised by Yu Qiao. I did internships at Adobe Research with Dingzeyu Li and at Meta with Alexander Richard.

Call for papers:
We are organizing the ELVM (Efficient Large Vision Models) workshop at CVPR 2024. We are looking for both short and long paper submissions. All accepted long papers will be published as part of the CVPR'24 workshop proceedings.

News

  • 03/2024: One paper accepted at NAACL 2024.
  • 03/2024: One journal article accepted at Medical Image Analysis.
  • 03/2024: One journal article accepted at IEEE TMM.
  • 03/2024: Received UTD Undergraduate Research Apprenticeship Program (URAP) award.
  • 02/2024: We are organizing the ELVM (Efficient Large Vision Models) workshop at CVPR 2024.
  • 02/2024: One paper accepted at CVPR 2024.
  • 02/2024: One paper accepted at CHI 2024.
  • 10/2023: Invited talk at UTD-DFWCSTA Battle of the Brains - Conference & Contest for K12 students.
  • 10/2023: Invited lightning talk at Workshop on Imaging and Data Science.
  • 10/2023: One paper accepted at WACV 2024.
  • 10/2023: Listed in 2022 World's Top 2% Scientists by Stanford University.
  • 10/2023: Invited talk at Do Good with Data Webinar for K12 students.
  • 09/2023: Two papers accepted at NeurIPS 2023.
  • 09/2023: One paper accepted at UIST 2023.
  • 09/2023: Five papers accepted at ICCV AV4D workshop.
  • 07/2023: Three papers accepted at ICCV 2023.
  • 06/2023: One paper accepted at MICCAI 2023. We are organizing a Cardiac MRI Reconstruction Challenge in conjunction with MICCAI 2023.
  • 06/2023: Invited talk at Sight and Sound Workshop @ CVPR 2023.
  • 06/2023: I will serve as an SPC for AAAI 2024.
  • 06/2023: Received an Adobe Research Gift.
  • 06/2023: One journal paper accepted at IEEE Transactions on Image Processing.
  • 05/2023: Three papers accepted at CVPR Sight and Sound Workshop.
  • 03/2023: I will serve as an Executive Area Chair for VALSE.
  • 03/2023: Received an inaugural Undergraduate Research Apprenticeship Program (URAP) award.
  • 03/2023: Received a Cisco Faculty Research Award.
  • 03/2023: I will be co-organizing a Cardiac MRI Reconstruction Challenge in conjunction with MICCAI 2023.
  • 02/2023: Three papers accepted at CVPR 2023.
  • 02/2023: Please check out our new AV-NeRF paper. In this work, we pose and tackle a Real-World Audio-Visual Scene Synthesis problem.
  • 02/2023: One journal paper accepted at IEEE Signal Processing Letters.
  • 02/2023: One journal paper accepted at IEEE Transactions on Neural Networks and Learning Systems.
  • 01/2023: Two papers accepted at ICLR 2023.
  • 11/2022: Selected for the 2023 AAAI New Faculty Highlights Program.
  • 10/2022: Invited talk at AV4D Workshop @ ECCV 2022.
  • 09/2022: One paper accepted at NeurIPS 2022. Congratulations to Shentong!
  • 09/2022: Two papers accepted at ECCV@AV4D 2022.
  • 08/2022: Please check out our new article "Learning in Audio-visual Context: A Review, Analysis, and New Perspective."
  • 08/2022: I started as an assistant professor in CS at UTD.
  • 07/2022: I will serve as a Senior Program Committee (SPC) Member for AAAI 2023.
  • 07/2022: One paper accepted at ECCV 2022.
  • 06/2022: One paper accepted at MICCAI 2022.
  • 06/2022: Successfully defended my dissertation! Thanks to everyone who supported me and helped me along the way.
  • 04/2022: I will attend CVPR'22 Doctoral Consortium.
  • 03/2022: Two works, on audio-visual question answering and MRI super-resolution, are accepted by CVPR 2022.
  • 12/2021: Two papers are accepted by AAAI 2022.
  • 10/2021: One paper on sounding object localization is accepted by BMVC 2021!
  • 07/2021: One paper on video matting is accepted by ICCV 2021!
  • 03/2021: Our two works, on co-learning sounding object visual grounding and sound separation and on audio-visual robustness, are accepted by CVPR 2021!
  • 02/2021: We will co-organize a CVPR 2021 Tutorial on Audio-visual Scene Understanding!
  • 01/2021: Co-organized the WACV 2021 Tutorial on Audio-visual Scene Understanding. More details can be found on our website.
  • 10/2020: I was in the top 10% of high-scoring reviewers for NeurIPS 2020!
  • 07/2020: Our audio-visual video parsing work got accepted by ECCV 2020 as a Spotlight.
  • 05/2020: Our three papers will be presented at the CVPR 2020 Sight and Sound workshop.
  • 02/2020: Two papers on video restoration got accepted by CVPR 2020! Congratulations to all co-authors!
  • 01/2020: RDN is accepted by IEEE TPAMI! Congratulations to Yulun!
  • 12/2019: Please check our deep audio prior paper.
  • 08/2019: One paper is accepted by IEEE TIP. Congratulations to Xuechen!
  • 07/2019: One paper is accepted by ICCV 2019. Congratulations to Wei!
  • 05/2019: Our two works, on audio-visual event localization and audio-visual video captioning, will be presented at the CVPR 2019 Sight and Sound workshop.
  • 02/2019: I will serve as an ICCV 2019 reviewer.
  • 12/2018: Two papers are posted on arXiv. Please watch the corresponding demos.
  • 07/2018: One paper is accepted by ECCV 2018! The AVE dataset and code have been released.
  • 02/2018: One paper is accepted by CVPR 2018. Congratulations to Yulun!
  • 07/2017: I received 'Outstanding Graduate of Tsinghua University' and 'Outstanding Master Thesis Award'.
  • 03/2017: I will join Prof. Chenliang Xu's lab to pursue a PhD degree at University of Rochester!

Students

Students at UTD:
Siva Sai Nagender Vasireddy (PhD student; Fall 2022)
Shijian Deng (PhD student; Spring 2023)
Saksham Singh Kushwaha (PhD student; Summer 2023)

Ziru Huang (Visiting student; Tsinghua University; Spring 2024)
Aditya Kulkarni (Undergraduate student; Fall 2024)
Atmin Mehul Sheth (Undergraduate student; Summer 2023)
William Doan (Undergraduate student; Fall 2023)
Anikait Bharadwaj (K12 (Frisco ISD); Spring 2024)
Michael Yang (K12; Summer 2023)
Matthew Wang (K12; Summer 2023)
Zeke Barnett (K12; Spring 2023)

Collaborated External Students:
Tianyu Yang (PhD student at University of Notre Dame)
Shentong Mo (PhD student at Carnegie Mellon University)
Yiyang Nan (Graduate student at Brown University)
Kai Wang (PhD student at University of Toronto)

Alumni:
Yuxin Ye (Graduate student at Tsinghua University)
Yichen Chi (Graduate student at Tsinghua University)
Junhao Gu (PhD student at Tsinghua University)
Jiamiao Zhang (Graduate student at Tsinghua University)
Sen Fang (Undergraduate student at Victoria University, Next: PhD student at Rutgers University)
Sasha Kaplan (Undergraduate student; Spring 2023)
Sisi Aarukapalli (Undergraduate student; Summer 2023)
Harsh Singh (PhD student at UTD; Spring and Summer 2023; Next: CV MSC at MBZUAI)
Yulang Wu (Graduate student at UTD CS, Spring 2023; Next: Postdoc at University of California San Francisco)
Guangyao Li (PhD student at Renmin University of China, Fall 2020 - Spring 2023)
Shijian Deng (Graduate student at University of Rochester; next: PhD student at UTD)
Hai Wang (Graduate student at Tsinghua University; next: PhD student at UCL)
Sizhe Li (Undergraduate student at University of Rochester; next: Visiting student at MIT)
Yiyang Su (Undergraduate student at University of Rochester; next: PhD student at Michigan State University)
Rohan Sharma (Graduate student at University of Rochester; next: PhD student at SUNY Buffalo)
Chenxiao Guan (Undergraduate student at University of Rochester; next: Graduate student at CMU)

Publications

Most recent publications on Google Scholar.
indicates equal contribution.

Text-to-Audio Generation Synchronized with Videos

Shentong Mo, Jing Shi, Yapeng Tian

Preprint'24.

DAVIS: High-Quality Audio-Visual Separation with Generative Diffusion Models

Chao Huang, Susan Liang, Yapeng Tian, Anurag Kumar, Chenliang Xu

Preprint'23.

SignDiff: Learning Diffusion Models for American Sign Language Production

Sen Fang, Chunyu Sui, Xuedong Zhang, Yapeng Tian

Preprint'23.

Towards Long Form Audio-visual Video Understanding

Wenxuan Hou, Guangyao Li, Yapeng Tian, Di Hu

Preprint'23.

EgoVSR: Towards High-Quality Egocentric Video Super-Resolution

Yichen Chi, Junhao Gu, Jiamiao Zhang, Wenming Yang, Yapeng Tian

Preprint'23.

Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Yake Wei, Di Hu, Yapeng Tian, Xuelong Li

Preprint'22.

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Tanvir Mahmud, Yapeng Tian, Diana Marculescu

CVPR'24: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

OSCaR: Object State Captioning and State Change Representation

Nguyen Nguyen, Jing Bi, Ali Vosoughi, Yapeng Tian, Pooyan Fazli, Chenliang Xu

NAACL'24: The North American Chapter of the Association for Computational Linguistics (Findings)

SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers

Zheng Ning, Brianna Wimer, Kaiwen Jiang, Keyi Chen, Jerrick Ban, Yapeng Tian, Yuhang Zhao, Toby Li

CHI'24: The ACM Conference on Human Factors in Computing Systems.

STADNet: Spatial-Temporal Attention-Guided Dual-Path Network for cardiac cine MRI super-resolution

Jun Lyu, Shuo Wang, Yapeng Tian, Jing Zou, Shunjie Dong, Chengyan Wang, Angelica I Aviles-Rivero, Jing Qin

MIA'24: Medical Image Analysis.

Unveiling cross modality bias in visual question answering: A causal view with possible worlds vqa

Ali Vosoughi, Shijian Deng, Songyang Zhang, Yapeng Tian, Chenliang Xu, Jiebo Luo

TMM'24: IEEE Transactions on Multimedia.

LAVSS: Location-Guided Audio-Visual Spatial Audio Separation

Yuxin Ye, Wenming Yang, Yapeng Tian

WACV'24: Winter Conference on Applications of Computer Vision.

Disentangled counterfactual learning for physical audiovisual commonsense reasoning

Changsheng Lv, Shuai Zhang, Yapeng Tian, Mengshi Qi, Huadong Ma

NeurIPS'23: The Annual Conference on Neural Information Processing Systems.

AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis

Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

NeurIPS'23: The Annual Conference on Neural Information Processing Systems.

PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data

Zheng Zhang, Zheng Ning, Chenliang Xu, Yapeng Tian, Toby Jia-Jun Li

UIST'23: ACM Symposium on User Interface Software and Technology.

Towards Robust Active Speaker Detection

Siva Sai Nagender Vasireddy, Chenxu Zhang, Xiaohu Guo, Yapeng Tian

ICCVW'23: ICCV AV4D Workshop.

Position-Aware Audio-Visual Separation for Spatial Audio

Yuxin Ye, Wenming Yang, Yapeng Tian

ICCVW'23: ICCV AV4D Workshop.

Towards Better Egocentric Action Understanding in a Multi-Input Multi-Output View

Wenxuan Hou, Ruoxuan Feng, Yixin Xu, Yapeng Tian, Di Hu

ICCVW'23: ICCV AV4D Workshop.

Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields

Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

ICCVW'23: ICCV AV4D Workshop.

Separating Invisible Sounds Toward Universal Audio-Visual Scene-Aware Sound Separation

Yiyang Su, Ali Vosoughi, Shijian Deng, Yapeng Tian, Chenliang Xu

ICCVW'23: ICCV AV4D Workshop.

Audio-Visual Class-Incremental Learning

Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian

ICCV'23: IEEE/CVF International Conference on Computer Vision.

Class-Incremental Grouping Network for Continual Audio-Visual Learning

Shentong Mo, Weiguo Pian, Yapeng Tian

ICCV'23: IEEE/CVF International Conference on Computer Vision.

DiffIR: Efficient Diffusion Model for Image Restoration

Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Luc Van Gool

ICCV'23: IEEE/CVF International Conference on Computer Vision.

Dual Arbitrary Scale Super-Resolution for Multi-Contrast MRI

Jiamiao Zhang, Yichen Chi, Jun Lyu, Wenming Yang, Yapeng Tian

MICCAI'23: Medical Image Computing and Computer-Assisted Intervention.

Meta-Learning based Degradation Representation for Blind Super-Resolution

Bin Xia, Yapeng Tian, Yulun Zhang, Yucheng Hang, Wenming Yang, Qingmin Liao

TIP'23: IEEE Transactions on Image Processing.

AV-SAM: Segment Anything Model Meets Audio-Visual Localization and Segmentation

Shentong Mo, Yapeng Tian

CVPRW'23: CVPR Sight and Sound Workshop.

DiffAVA: Personalized Text-to-Audio Generation with Visual Alignment

Shentong Mo, Jing Shi, Yapeng Tian

CVPRW'23: CVPR Sight and Sound Workshop.

AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis

Susan Liang, Chao Huang, Yapeng Tian, Anurag Kumar, Chenliang Xu

CVPRW'23: CVPR Sight and Sound Workshop.

Audio-Visual Grouping Network for Sound Localization from Mixtures

Shentong Mo, Yapeng Tian

CVPR'23: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Egocentric Audio-Visual Object Localization

Chao Huang, Yapeng Tian, Anurag Kumar, and Chenliang Xu

CVPR'23: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Structured Sparsity Learning for Efficient Video Super-Resolution

Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc Van Gool

CVPR'23: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Knowledge Distillation based Degradation Estimation for Blind Super-Resolution

Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc Van Gool

ICLR'23: International Conference on Learning Representations.

Basic Binary Convolution Unit for Binarized Image Restoration Network

Bin Xia, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Radu Timofte, Luc Van Gool

ICLR'23: International Conference on Learning Representations.

Stdan: deformable attention network for space-time video super-resolution

Hai Wang, Xiaoyu Xiang, Yapeng Tian, Wenming Yang, Qingmin Liao

TNNLS'23: IEEE Transactions on Neural Networks and Learning Systems.

GDSSR: Toward Real-World Ultra-High-Resolution Image Super-Resolution

Yichen Chi, Wenming Yang, Yapeng Tian

SPL'23: IEEE Signal Processing Letters.

Towards Unified, Explainable, and Robust Multisensory Perception

Yapeng Tian

AAAI'23: AAAI Conference on Artificial Intelligence. (NFH program)

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing

Shentong Mo, Yapeng Tian

NeurIPS'22: The Annual Conference on Neural Information Processing Systems.

Learning Spatio-Temporal Downsampling for Effective Video Upscaling

Xiaoyu Xiang, Yapeng Tian, Vijay Rengarajan, Lucas Young, Bo Zhu, Rakesh Ranjan

ECCV'22: European Conference on Computer Vision.

Audio-Visual Scene Understanding Towards Unified, Explainable, and Robust Multisensory Perception

Yapeng Tian

PhD Thesis

DuDoCAF: Dual-Domain Cross-Attention Fusion with Recurrent Transformer for Fast Multi-contrast MR Imaging

Jun Lyu, Bin Sui, Chengyan Wang, Yapeng Tian, Qi Dou, and Jing Qin

MICCAI'22: Medical Image Computing and Computer Assisted Intervention.

Audio-Visual Object Localization in Egocentric Videos

Chao Huang, Yapeng Tian, Anurag Kumar, and Chenliang Xu

CVPRW'22: CVPR Workshops

Egocentric audio-visual learning.

Learning to Answer Questions in Dynamic Audio-Visual Scenarios

Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, and Di Hu

CVPR'22 Oral: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Transformer-empowered Multi-contrast MRI Super-Resolution

Guangyuan Li, Jun Lv, Yapeng Tian, Qi Dou, Chengyan Wang, Chenliang Xu, Jing Qin

CVPR'22: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-based Super-Resolution

Bin Xia, Yapeng Tian, Yucheng Hang, Wenming Yang, Qingmin Liao, Jie Zhou

AAAI'22: The AAAI Conference on Artificial Intelligence.

Efficient Non-Local Contrastive Attention for Image Super-Resolution

Bin Xia, Yucheng Hang, Yapeng Tian, Wenming Yang, Qingmin Liao, Jie Zhou

AAAI'22: The AAAI Conference on Artificial Intelligence.

Space-Time Memory Network for Sounding Object Localization in Videos

Sizhe Li, Yapeng Tian, and Chenliang Xu

BMVC'21: The British Machine Vision Conference.

Video Matting via Consistency-Regularized Graph Neural Networks

Tiantian Wang, Sifei Liu, Yapeng Tian, Kai Li, and Ming-Hsuan Yang

ICCV'21: IEEE/CVF International Conference on Computer Vision.

Can audio-visual integration strengthen robustness under multimodal attacks?

Yapeng Tian and Chenliang Xu

CVPR'21: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

Yapeng Tian, Di Hu, and Chenliang Xu

CVPR'21: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing

Yapeng Tian, Dingzeyu Li, and Chenliang Xu

ECCV'20 Spotlight: European Conference on Computer Vision.

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan Allebach, and Chenliang Xu

CVPR'20: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

TDAN: Temporally Deformable Alignment Network for Video Super-Resolution

Yapeng Tian, Yulun Zhang, Yun Fu, and Chenliang Xu

CVPR'20: IEEE/CVF Conference on Computer Vision and Pattern Recognition.

This is the first work that uses deformable alignment to address video restoration.

Deep Audio Prior

Yapeng Tian, Chenliang Xu, and Dingzeyu Li

CVPRW'20: CVPR Workshops.

Residual Dense Network for Image Super-Resolution

Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

TPAMI'20: IEEE Transactions on Pattern Analysis and Machine Intelligence.

CFSNet: Toward a Controllable Feature Space for Image Restoration

Wei Wang, Ruiming Guo, Yapeng Tian, and Wenming Yang

ICCV'19: IEEE/CVF International Conference on Computer Vision.

Interpretable and Controllable Audio-Visual Video Captioning

Yapeng Tian, Chenxiao Guan, Goodman Justin, Marc Moore, and Chenliang Xu

CVPRW'19: CVPR Workshops.

Multisensory interpretability in terms of the audio-visual video captioning task.

LCSCNet: Linear Compressing Based Skip-Connecting Network for ISR

Wenming Yang, Xuechen Zhang, Yapeng Tian, Wei Wang, Jing-Hao Xue, Qingmin Liao

TIP'19: IEEE Trans. Image Processing.

Deep Learning for Single Image Super-Resolution: A Brief Review

Wenming Yang, Xuechen Zhang, Yapeng Tian, Wei Wang, JingHao Xue, Qingmin Liao

TMM'19: IEEE Trans. Multimedia.

Audio-Visual Event Localization in Unconstrained Videos

Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu

ECCV'18: European Conference on Computer Vision.

Residual Dense Network for Image Super-Resolution

Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

CVPR'18 Spotlight: IEEE/CVF Conf. on Computer Vision and Pattern Recognition.

NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results

Timofte et al.

CVPRW'17: CVPR Workshops.

Consistent Coding Scheme for Single-Image Super-Resolution

Wenming Yang, Yapeng Tian, Fei Zhou, Qingmin Liao, Hai Chen, Chenglin Zheng

TMM'16: IEEE Trans. Multimedia. (First student author)

Anchored Neighborhood Regression based SISR from Self-examples

Yapeng Tian, Fei Zhou, Wenming Yang, Xuesen Shang, Qingmin Liao

ICIP'16: IEEE International Conference on Image Processing.

SISR Using Clustering-Based Global Regression and Propagation Filtering

Wenming Yang, Yapeng Tian, Fei Zhou, ..., Qingmin Liao

ACPR'15 Oral: Asian Conference on Pattern Recognition. (First student author)

Teaching

Service

Organizer:

Senior Program Committee or Area Chair:

  • AAAI: AAAI Conference on Artificial Intelligence, 2023-2024

Session Chair:

  • AAAI 2023 (Multimodal Learning, Low-Level & Physics-based Vision)

Conference Program Committee/Reviewer:

  • CVPR: IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • ICCV: IEEE/CVF International Conference on Computer Vision
  • ECCV: European Conference on Computer Vision
  • NeurIPS: Conference on Neural Information Processing Systems
  • ICLR: International Conference on Learning Representations
  • AAAI: AAAI Conference on Artificial Intelligence
  • ICML: International Conference on Machine Learning
  • WACV: Winter Conference on Applications of Computer Vision
  • ACCV: Asian Conference on Computer Vision
  • MICCAI: International Conference on Medical Image Computing & Computer Assisted Intervention

Journal Reviewer:

  • TPAMI: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • TMLR: The Transactions on Machine Learning Research
  • TIP: IEEE Transactions on Image Processing
  • TNNLS: IEEE Transactions on Neural Networks and Learning Systems
  • TMM: IEEE Transactions on Multimedia
  • TCSVT: IEEE Transactions on Circuits and Systems for Video Technology
  • TASLP: IEEE/ACM Transactions on Audio, Speech and Language Processing
  • Scientific Reports–Nature
  • CGF: Computer Graphics Forum
  • CVIU: Computer Vision and Image Understanding
  • SPIC: Signal Processing: Image Communication
  • IEEE Access

Talks and Seminars:

  • Learning Semantic-aware Grouping for Weakly-Supervised Audio-Visual Scene Understanding
    Sight and Sound Workshop @ CVPR, June 2023
  • Human-Multisensory AI Collaboration: Opportunities and Challenges
    AV4D Workshop @ ECCV, Oct. 2022
  • UTD CS Mixer, Oct. 2022
  • Audio-Visual Scene Understanding Towards Unified, Explainable, and Robust Multisensory Perception
    KTH Dive-Deep Seminar, Dec. 2021
    RIT PhD Colloquium Series, Oct. 2021
  • Audio-Visual Video Understanding, IIAI Seminar, Sep. 2021
  • The Future of Audio-Visual Research Panel Discussion, VALSE Webinar, Nov. 2021

Awards

Undergraduate Research Apprenticeship Program (URAP) award, 2023 and 2024
Cisco Faculty Research Award, 2023
AAAI New Faculty Highlights, 2023
CVPR Doctoral Consortium, 2022
Top 10% of High-Scoring Reviewers for NeurIPS, 2020
Outstanding Graduate of Tsinghua University, 2017
Outstanding Master Thesis Award, Tsinghua University, 2017
National Scholarship, Tsinghua University, 2016

Vitæ

Full CV in PDF.

  • University of Texas at Dallas 2022 - now
    Assistant Professor
    Department of Computer Science
  • University of Rochester 2017 - 2022
    Ph.D. Student
    Department of Computer Science
  • Meta Sep. 2021 - Jan. 2022
    Research Intern
    Reality Labs
  • Adobe Summer 2021
    Research Intern
    Creative Intelligence Lab
  • Adobe Summer 2019
    Research Intern
    Creative Intelligence Lab
  • Tsinghua University 2014 - 2017
    M.E. Student
    Department of Electronic Engineering
  • Chinese Academy of Sciences Nov. 2016 - May 2017
    Visiting Student
    Shenzhen Institutes of Advanced Technology
  • Xidian University 2009 - 2013
    B.E. Student
    School of Electronic Engineering

This website was built with Jekyll based on a template from Martin Saveski.