Tingkai Liu 刘廷恺

NeuroAI Researcher, Cold Spring Harbor Laboratory

I am an independent researcher at Cold Spring Harbor Laboratory, part of the new NeuroAI research program, where I work at the intersection of artificial and biological intelligence.

From 2023 to 2024, I worked at ByteDance Inc. (AML), where I specialized in large language model pre-/post-training and multi-modal foundation models.

I earned my MS/PhD in Electrical Engineering from Columbia University in December 2022, where I worked on theoretical and computational neuroscience. Prior to attending Columbia, I earned my BS at Rice University (Go Owls!).

news

Dec 3, 2024 Our new work, ElastiFormer, is live on arXiv! We propose a post-training technique that adapts pretrained Transformer models (e.g., LLMs, ViTs, VLMs) into elastic counterparts via self-distillation. We show that compute savings of 20% to 50% can be achieved for different components of the Transformer architecture, with compute further reduced by adding very low-rank (rank-1) LoRA weights trained via the same distillation objective.
Aug 10, 2024 I’m presenting two of our papers at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL) in Bangkok, Thailand, August 11–15, 2024:
  1. (Oral) Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
  2. (Poster) DeVAn: Dense Video Annotation for Video-Language Models

research

My research spans large language model pre-/post-training, multi-modal foundation models, and computational neuroscience.
1. Liu, Junzhang, Liu, Tingkai, Sui, Yueyuan, and Xia, Stephen
    2024

We introduce ElastiFormer, a post-training technique that self-distills any pretrained Transformer model into an elastic counterpart with variable inference-time compute.

2. Jian, Yiren, Liu, Tingkai, Tao, Yunzhe, Zhang, Chunhui, Vosoughi, Soroush, and Yang, Hongxia
    Annual Meeting of the Association for Computational Linguistics (ACL), 2024
    (Oral Presentation)

We introduce EVLGen, a single-stage, single-loss framework based on Token Merging for pre-training computationally intensive vision-language generative models on top of frozen pre-trained large language models.

  3. Liu, Tingkai, Tao, Yunzhe, Liu, Haogeng, Fan, Qihang, Zhou, Ding, Huang, Huaibo, He, Ran, and Yang, Hongxia
    Annual Meeting of the Association for Computational Linguistics (ACL), 2024

We present a novel task and a human-annotated dataset for evaluating video-language models’ ability to generate captions and summaries for real-world video clips.

4. Zhou, Haotian, Liu, Tingkai, Ma, Qianli, Yuan, Jianbo, Liu, Pengfei, You, Yang, and Yang, Hongxia
    arXiv, Oct 2023

We propose a loss-based data selection method for LLM post-training that compresses the Alpaca dataset by 16x while improving model performance on AlpacaEval.

5. Ma, Qianli, Zhou, Haotian, Liu, Tingkai, Yuan, Jianbo, Liu, Pengfei, You, Yang, and Yang, Hongxia
    arXiv, Oct 2023

We propose a heuristic greedy search algorithm based on step-level rewards that improves LLM reasoning capabilities during inference. We also present a new automated method for generating large amounts of step-level reward signal for code generation tasks based on mutation testing.

6. Liu, Haogeng, Fan, Qihang, Liu, Tingkai, Yang, Linjie, Tao, Yunzhe, Huang, Huaibo, He, Ran, and Yang, Hongxia
    arXiv, Oct 2023

    We propose Video-Teller, a video-language foundation model with fine-grained modality alignment to enhance video-to-text generation tasks.

7. Lazar, Aurel A., Yeh, Chung-Heng, and Liu, Tingkai
    US Patent US11674937B2, Jun 2023

We developed and patented a biomimetic method and apparatus for encoding odorants.

8. Lazar, Aurel A., Liu, Tingkai, and Yeh, Chung-Heng
    PLOS Computational Biology, Apr 2023

    We provide theoretical and computational evidence for the functional logic of the Antennal Lobe as a robust odorant object identity recovery processor with ON-OFF event-based processing.

9. Lazar, Aurel A., Liu, Tingkai, Turkcan, Mehmet K., and Zhou, Yiyin
    eLife, Feb 2021

    We developed FlyBrainLab, an open-source computing platform that integrates 3D exploration and visualization of diverse datasets with interactive exploration of modeled executable brain circuits in Drosophila.

10. Lazar, Aurel A., Liu, Tingkai, and Yeh, Chung-Heng
    ICASSP 2020, May 2020

We present a biomimetic odorant encoding machine for sampling, reconstruction, and robust representation of odorant identity in the Drosophila olfactory system.

11. Lazar, Aurel A., Liu, Tingkai, and Zhou, Yiyin
    bioRxiv, Sep 2022

    We demonstrate both theoretically and computationally that the Divisive Normalization Processor (DNP) is an invertible operator that can faithfully represent input information given sufficient output samples, with application to different sensory modalities.

12. Lazar, Aurel A., Liu, Tingkai, Yeh, Chung-Heng, and Zhou, Yiyin
    bioRxiv, Sep 2022

    We propose a feedback divisive normalization architecture of the Mushroom Body Calyx circuit in the Drosophila olfactory system for odorant demixing. We show that the biological network is highly optimized for processing odorant mixtures.

13. Ukani, N. H., Yeh, C.-H., Tomkins, A., Zhou, Y., Florescu, D., Ortiz, C. L., Huang, Y.-C., Wang, C.-T., Turkcan, M. K., Liu, Tingkai, Richmond, P., Lo, C.-C., Coca, D., Chiang, A.-S., and Lazar, A. A.
    bioRxiv, Mar 2019

    The Fruit Fly Brain Observatory is a platform that aims to bridge the gap between structural and functional data in neuroscience research on Drosophila. It provides tools for exploring and analyzing various types of fruit fly brain data, including connectome and functional imaging data.