Arman Cohan

Assistant Professor of Computer Science
Office: Room 332
Office Address:
17 Hillhouse Avenue
New Haven, CT 06511
Mailing Address:
P.O. Box 208285
New Haven, CT 06520
Education:

  • Ph.D. in Computer Science, Georgetown University


My research spans various problems at the intersection of Machine Learning and NLP, including language modeling, representation learning, retrieval, and applications in specialized domains.

Selected Awards & Honors:

  • Dr. Harold N. Glassman Distinguished Doctoral Dissertation Award in Science (2019)
  • EMNLP 2017 Best Long Paper Award (2017)

Selected Publications:

For a full list of publications, please see my Google Scholar page.


  • Question-Evidence Similarity Learning for Long-Context Question Answering (NAACL, 2022)
  • PRIMERA: Pyramid-Based Masked Sentence Pre-training for Multi-Document Summarization (ACL, 2022)
  • FLEX: Unifying Evaluation for Few-Shot NLP (NeurIPS, 2021)
  • CDLM: Cross-Document Language Modeling (EMNLP, 2021)
  • Longformer: The Long Document Transformer (2020)
  • SPECTER: Document-level Representation Learning using Citation-Informed Transformers (ACL, 2020)
  • SciBERT: A Pretrained Language Model for Scientific Text (EMNLP, 2019)
  • Structural Scaffolds for Citation Intent Classification in Scientific Publications (NAACL, 2019)
  • Pretrained Language Models for Sequential Sentence Classification (EMNLP, 2019)
  • A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents (NAACL, 2018)