================================================== -->

Suchir Salhan

Suchir Salhan is a Computer Science PhD candidate at the University of Cambridge, working on Small Language Models under the supervision of Prof Paula Buttery. He is also the Founder of Per Capita Media, Cambridge University's newest independent media organisation.

To contact me, email sas245@cam.ac.uk.

Cambridge Small Language Models: Cognitively-Inspired Alternatives to Transformer-based LLMs

My research is concerned with building scalable Small Language Models. Industry-led efforts have built competitive LLMs that have fundamentally shifted the job of the contemporary academic Natural Language Processing researcher in various ways. Even so, several fundamental, open questions remain that are not obviously ancillary to commercial AI research labs.

Modern Natural Language Processing (NLP) is dominated by large, highly parameterised neural networks pre-trained on extremely large web-mined text corpora. Training and inference with such models are costly in practice, which incentivises the use of smaller counterparts, and the benefits of the pre-train/fine-tune paradigm are unclear for domain-specific downstream tasks. Additionally, theoretical linguists and cognitive scientists have highlighted several weaknesses of state-of-the-art foundation models.

The Cambridge Small Language Models technical 'blog' focuses on interesting techniques and perspectives from collaborators and other Machine Learning, NLP and Cognitive Science researchers that broadly relate to the above issues.

Recent News:

I organise the Natural Language & Information Processing (NLIP) Seminars in the Department of Computer Science & Technology, University of Cambridge.

Handouts & Teaching Materials:

I delivered my first guest lecture, on Language Model Evaluation, in November 2024 for an MPhil course at the University of Cambridge with Prof Buttery and Dr Fermín Moscoso del Prado Martín. At the age of 22, this was a great opportunity and privilege for me so early in my “formal” academic career. In Lent 2025, I am the Teaching Assistant (a new role equivalent to Lead Demonstrator) for CST IA Machine Learning & Real World Data and a supervisor for Machine Learning & Bayesian Inference (MLBI) [CST Part II]. I am also supervising an MPhil Project on Language Model Stability.

The Cambridge Tripos Page contains a (non-exhaustive) set of resources for the Computer Science and Linguistics Triposes.