Faisal Ladhak

Hi! I'm a Ph.D. student in the Computer Science Department at Columbia University. I am part of the Natural Language Processing (NLP) group and I am advised by Kathleen McKeown. I am also a Visiting Student Researcher at Stanford University working with Tatsu Hashimoto and Dan Jurafsky. I am interested in building safe, reliable, and truthful generation systems and improving upon existing evaluation methods in natural language generation. My current research focuses on faithfulness and bias in generation systems. Prior to joining Columbia, I worked as a Senior Applied Scientist at Amazon Web Services AI.


Contact: faisal[at]cs[dot]columbia[dot]edu



Publications

  • When do Pre-training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization
    Faisal Ladhak, Esin Durmus, Mirac Suzgun, Tianyi Zhang, Dan Jurafsky, Kathleen McKeown, Tatsunori Hashimoto
    To appear in EACL, 2023.
  • Benchmarking Large Language Models for News Summarization
    Tianyi Zhang*, Faisal Ladhak*, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto
    Preprint, 2023.
    [paper] [bib]
  • Tracing and Removing Data Errors in Natural Language Generation Datasets
    Faisal Ladhak, Esin Durmus, Tatsunori Hashimoto
    Preprint, 2022.
    [paper] [bib]
  • Holistic Evaluation of Language Models
    Preprint, 2022.
    [paper] [bib]
  • Evaluating Human-Language Model Interaction
    Preprint, 2022.
    [paper] [bib]
  • Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
    Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, Aylin Caliskan
    Preprint, 2022.
    [paper] [bib]
  • ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
    Badr AlKhamissi*, Faisal Ladhak*, Srinivasan Iyer, Veselin Stoyanov, Zornitsa Kozareva, Xian Li, Pascale Fung, Lambert Mathias, Asli Celikyilmaz, Mona Diab
    Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
    [paper] [bib]
  • Improving Faithfulness by Augmenting Negative Summaries from Fake Documents
    Tianshu Wang, Faisal Ladhak, Esin Durmus, He He
    Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
    [paper] [bib]
  • Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization
    Faisal Ladhak*, Esin Durmus*, He He, Claire Cardie, Kathleen McKeown
    Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
    [paper] [bib]
  • Spurious Correlations in Reference-Free Evaluation of Text Generation
    Esin Durmus*, Faisal Ladhak*, Tatsunori Hashimoto
    Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
    [paper] [bib]
  • Constrained Regeneration for Cross-Lingual Query-Focused Extractive Summarization
    Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuscakova, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard, Kathleen McKeown
    Proceedings of the 29th International Conference on Computational Linguistics (COLING), 2022.
    [paper] [bib]
  • CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing
    Proceedings of The Workshop on Automatic Summarization for Creative Writing, 2022.
    [team] [paper] [bib]
  • On the Opportunities and Risks of Foundation Models
    Preprint, 2021.
    [paper] [bib]
  • Segmenting Subtitles for Correcting ASR Segmentation Errors
    David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuscakova, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown
    Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
    [paper] [bib]
  • The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
    Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), 2021.
    [team] [paper] [bib]
  • Incorporating Terminology Constraints in Automatic Post-Editing
    David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat and Kathleen McKeown.
    WMT, 2020.
    [paper] [bib]
  • WikiLingua: A New Benchmark Dataset for Crosslingual Abstractive Summarization
    Faisal Ladhak, Esin Durmus, Claire Cardie and Kathleen McKeown.
    Findings of EMNLP, 2020.
    [paper] [bib]
  • To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
    Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak and Yaser Al-Onaizan.
    Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
    [paper] [bib]
  • Exploring Content Selection in Summarization of Novel Chapters
    Faisal Ladhak, Bryan Li, Yaser Al-Onaizan and Kathleen McKeown.
    Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
    [paper] [bib]
  • Determining Relative Argument Specificity and Stance for Complex Argumentative Structures
    Esin Durmus, Faisal Ladhak and Claire Cardie.
    Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2019.
    [paper] [bib]
  • The Role of Pragmatic and Discourse Context in Determining Argument Impact
    Esin Durmus, Faisal Ladhak and Claire Cardie.
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
    [paper] [bib]
  • A neural interlingua for multilingual machine translation
    Yichao Lu, Phillip Keung, Faisal Ladhak, Vikas Bhardwaj, Shaonan Zhang and Jason Sun.
    Proceedings of the Conference on Machine Translation (WMT), 2018.
    [paper] [bib]
  • LatticeRnn: Recurrent Neural Networks Over Lattices
    Faisal Ladhak, Ankur Gandhe, Markus Dreyer, Lambert Mathias, Ariya Rastrow and Björn Hoffmeister.
    Interspeech, 2016.
    [paper] [bib]
  • Using Deep Learning to Enhance Cancer Diagnosis and Classification
    Rasool Fakoor, Faisal Ladhak, Azade Nazi, Manfred Huber.
    In Proceedings of the ICML Workshop on the Role of Machine Learning in Transforming Healthcare, 2013.
    [paper] [bib]