Publications
You can also find the full list of my articles on my Google Scholar profile.
2024
Efficient Large Language Models: A Survey Zhongwei Wan, Xin Wang, Che Liu, Samiul Alam, Yu Zheng, Jiachen Liu, Zhongnan Qu, Shen Yan, Yi Zhu, Quanlu Zhang, Mosharaf Chowdhury, Mi Zhang Transactions on Machine Learning Research (TMLR) 2024 arxiv code
2023
PreDiff: Precipitation Nowcasting with Latent Diffusion Models Zhihan Gao, Xingjian Shi, Boran Han, Hao Wang, Xiaoyong Jin, Danielle Maddix, Yi Zhu, Mu Li, Yuyang Wang Conference on Neural Information Processing Systems (NeurIPS) 2023 arxiv code
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition Shuhuai Ren, Aston Zhang, Yi Zhu, Shuai Zhang, Shuai Zheng, Mu Li, Alex Smola, Xu Sun Conference on Neural Information Processing Systems (NeurIPS) 2023 arxiv code
GFM: Building Geospatial Foundation Models via Continual Pretraining Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen, Mu Li International Conference on Computer Vision (ICCV) 2023 arxiv code
Motion-Guided Masking for Spatiotemporal Representation Learning David Fan, Jue Wang, Shuai Liao, Yi Zhu, Vimal Bhat, Hector Santos-Villalobos, Rohith MV, Xinyu Li International Conference on Computer Vision (ICCV) 2023 arxiv code
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li Association for Computational Linguistics (ACL) 2023 arxiv code
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation Yash Patel, Yusheng Xie, Yi Zhu, Srikar Appalaraju, R. Manmatha arXiv preprint arXiv:2302.03432 2023 arxiv code
AIM: Adapting Image Models for Efficient Video Understanding Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li International Conference on Learning Representations (ICLR) 2023 arxiv code
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations Andrii Zadaianchuk, Matthaeus Kleindessner, Yi Zhu, Francesco Locatello, Thomas Brox International Conference on Learning Representations (ICLR) 2023 arxiv
2022
What Makes for Good Tokenizers in Vision Transformer? Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2022 arxiv IEEE journal code
Are Multimodal Models Robust to Image and Text Perturbations? Jielin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li arXiv preprint arXiv:2212.08044 2022 arxiv
SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning M Saiful Bari, Aston Zhang, Shuai Zheng, Xingjian Shi, Yi Zhu, Shafiq Joty, Mu Li arXiv preprint arXiv:2212.10929 2022 arxiv
Visual Prompt Tuning for Test-time Domain Adaptation Yunhe Gao, Xingjian Shi, Yi Zhu, Hao Wang, Zhiqiang Tang, Xiong Zhou, Mu Li, Dimitris N. Metaxas arXiv preprint arXiv:2210.04831 2022 arxiv
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting Zhihan Gao, Xingjian Shi, Hao Wang, Yi Zhu, Yuyang Wang, Mu Li, Dit-Yan Yeung Conference on Neural Information Processing Systems (NeurIPS) 2022 arxiv code
MixGen: A New Multi-Modal Data Augmentation Xiaoshuai Hao, Yi Zhu, Srikar Appalaraju, Aston Zhang, Wanqian Zhang, Bo Li, Mu Li arXiv preprint arXiv:2206.08358 2022 arxiv code
Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex Smola, and Zhangyang Wang International Conference on Machine Learning (ICML) 2022 Long Oral arxiv code
Pixel-level Correspondence for Self-Supervised Learning from Video Yash Sharma, Yi Zhu, Chris Russell, Thomas Brox International Conference on Machine Learning (ICML) 2022 Workshop arxiv
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue arXiv preprint arXiv:2203.13249 2022 arxiv code
NUTA: Non-uniform Temporal Aggregation for Action Recognition Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Hao Chen, Joseph Tighe IEEE Winter Conference on Applications of Computer Vision (WACV) 2022 arxiv
2021
Improving Semantic Segmentation via Efficient Self-Training Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li and Alexander Smola IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021 arxiv IEEE journal code
Blending Anti-Aliasing into Vision Transformer Shengju Qian, Hao Shao, Yi Zhu, Mu Li, Jiaya Jia Conference on Neural Information Processing Systems (NeurIPS) 2021 arxiv code
Progressive Coordinate Transforms for Monocular 3D Object Detection Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue Conference on Neural Information Processing Systems (NeurIPS) 2021 arxiv code
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations Mohammadreza Zolfaghari, Yi Zhu, Peter Gehler, Thomas Brox International Conference on Computer Vision (ICCV) 2021 arxiv code
VidTr: Video Transformer Without Convolutions Xinyu Li, Yanyi Zhang, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, Ivan Marsic, Joseph Tighe International Conference on Computer Vision (ICCV) 2021 arxiv code
SelfNorm and CrossNorm for Out-of-Distribution Robustness Zhiqiang Tang, Yunhe Gao, Yi Zhu, Zhi Zhang, Mu Li, Dimitris Metaxas International Conference on Computer Vision (ICCV) 2021 arxiv code
A Unified Efficient Pyramid Transformer for Semantic Segmentation Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo Wu, Yanwei Fu, Mu Li International Conference on Computer Vision (ICCV) 2021 Workshop arxiv code
Video Contrastive Learning with Global Context Haofei Kuang, Yi Zhu, Zhi Zhang, Xinyu Li, Joseph Tighe, Sören Schwertfeger, Cyrill Stachniss, Mu Li International Conference on Computer Vision (ICCV) 2021 Workshop arxiv code
Domain Consensus Clustering for Universal Domain Adaptation Guangrui Li, Guoliang Kang, Yi Zhu, Yunchao Wei, Yi Yang Computer Vision and Pattern Recognition (CVPR) 2021 arxiv code
AutoAdapt: Automated Segmentation Network Search for Unsupervised Domain Adaptation Xueqing Deng, Yi Zhu, Yuxin Tian, Shawn Newsam Computer Vision and Pattern Recognition (CVPR) 2021 Workshop arxiv
Scale Aware Adaptation for Land-Cover Classification in Remote Sensing Imagery Xueqing Deng, Yi Zhu, Yuxin Tian, Shawn Newsam IEEE Winter Conference on Applications of Computer Vision (WACV) 2021 arxiv code
2020
A Comprehensive Study of Deep Video Action Recognition Yi Zhu, Xinyu Li, Chunhui Liu, Mohammadreza Zolfaghari, Yuanjun Xiong, Chongruo Wu, Zhi Zhang, Joseph Tighe, R. Manmatha, Mu Li arXiv preprint arXiv:2012.06567 2020 arxiv code
Towards Good Practices in Self-supervised Representation Learning Srikar Appalaraju, Yi Zhu, Yusheng Xie, István Fehérvári Conference on Neural Information Processing Systems (NeurIPS) 2020 Workshop arxiv
Improving Semantic Segmentation via Self-Training Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li and Alexander Smola arXiv preprint arXiv:2004.14960 2020 arxiv code
ResNeSt: Split-Attention Networks Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li and Alexander Smola arXiv preprint arXiv:2004.08955 2020 arxiv code
Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior Hu Zhang, Linchao Zhu, Yi Zhu and Yi Yang European Conference on Computer Vision (ECCV) 2020 arxiv code
Cross-Time and Orientation-Invariant Overhead Image Geolocalization Using Deep Local Features Yuxin Tian, Xueqing Deng, Yi Zhu and Shawn Newsam IEEE Winter Conference on Applications of Computer Vision (WACV) 2020 arxiv code
GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng and Yi Zhu Journal of Machine Learning Research (JMLR) 2020 arxiv code
2019
Motion-Aware Feature for Improved Video Anomaly Detection Yi Zhu and Shawn Newsam British Machine Vision Conference (BMVC) 2019 arxiv
Improving Semantic Segmentation via Video Propagation and Label Relaxation Yi Zhu, Karan Sapra, Fitsum A. Reda, Kevin J. Shih, Shawn Newsam, Andrew Tao and Bryan Catanzaro Computer Vision and Pattern Recognition (CVPR) 2019 Oral arxiv code
Fine-Grained Land Use Classification at the City Scale Using Ground-Level Images Yi Zhu, Xueqing Deng and Shawn Newsam IEEE Transactions on Multimedia (TMM) 2019 arxiv
2018
Hidden Two-Stream Convolutional Networks for Action Recognition Yi Zhu, Zhenzhong Lan, Shawn Newsam and Alexander G Hauptmann Asian Conference on Computer Vision (ACCV) 2018 arxiv code
Random Temporal Skipping for Multirate Video Analysis Yi Zhu and Shawn Newsam Asian Conference on Computer Vision (ACCV) 2018 arxiv
Gated Transfer Network for Transfer Learning Yi Zhu, Jia Xue, and Shawn Newsam Asian Conference on Computer Vision (ACCV) 2018 arxiv code
What Is It Like Down There? Generating Dense Ground-Level Views and Image Features From Overhead Imagery Using Conditional Generative Adversarial Networks Xueqing Deng, Yi Zhu, and Shawn Newsam ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2018 Oral arxiv
Towards Universal Representation for Unseen Action Recognition Yi Zhu, Yang Long, Yu Guan, Shawn Newsam and Ling Shao Computer Vision and Pattern Recognition (CVPR) 2018 arxiv
Learning Optical Flow via Dilated Networks and Occlusion Reasoning Yi Zhu and Shawn Newsam IEEE International Conference on Image Processing (ICIP) 2018 arxiv
Spatial Morphing Kernel Regression For Feature Interpolation Xueqing Deng, Yi Zhu, and Shawn Newsam IEEE International Conference on Image Processing (ICIP) 2018 arxiv
2017
Large-Scale Mapping of Human Activity using Geo-Tagged Videos Yi Zhu, Sen Liu and Shawn Newsam ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2017 arxiv
DenseNet for Dense Flow Yi Zhu and Shawn Newsam IEEE International Conference on Image Processing (ICIP) 2017 Oral arxiv
Guided Optical Flow Learning Yi Zhu, Zhenzhong Lan, Shawn Newsam and Alexander G Hauptmann Computer Vision and Pattern Recognition (CVPR) 2017 Workshop arxiv code
Deep Local Video Feature for Action Recognition Zhenzhong Lan, Yi Zhu, Alexander G Hauptmann and Shawn Newsam Computer Vision and Pattern Recognition (CVPR) 2017 Workshop Oral arxiv
Efficient Action Detection in Untrimmed Videos via Multi-Task Learning Yi Zhu and Shawn Newsam IEEE Winter Conference on Applications of Computer Vision (WACV) 2017 Oral arxiv
2016
Spatio-Temporal Sentiment Hotspot Detection using Geotagged Photos Yi Zhu and Shawn Newsam ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2016 arxiv
Depth2Action: Exploring Embedded Depth for Large-Scale Action Recognition Yi Zhu and Shawn Newsam European Conference on Computer Vision (ECCV) 2016 Workshop Oral arxiv
UC Merced Submission to the ActivityNet Challenge 2016 Yi Zhu and Shawn Newsam CVPR 2016 ActivityNet challenge. Untrimmed Video Classification Track arxiv
Before 2016
Yi Zhu and Shawn Newsam, Land Use Classification using Convolutional Neural Networks Applied to Ground-Level Images, ACM International Conference on Advances in Geographic Information Systems (SIGSPATIAL) 2015 (Best Poster Award) arxiv
Yi Zhu, Lingjia Liu and Jianzhong Zhang, Joint Angle and Delay Estimation for 2D Active Broadband MIMO-OFDM Systems, IEEE Global Communications Conference (GLOBECOM) 2013 arxiv
Yi Zhu, Lingjia Liu, Anding Wang, Krishna Sayana and Jianzhong Zhang, DoA Estimation and Capacity Analysis for 2D Active Massive MIMO Systems, IEEE International Conference on Communications (ICC) 2013 arxiv