Suchir Salhan

Suchir Salhan is a Computer Science PhD candidate at the University of Cambridge, working on Small Language Models under the supervision of Prof Paula Buttery. He is the Founder of Per Capita Media, Cambridge University's newest independent media organisation.
To contact me, email sas245@cam.ac.uk.
Small Language Models: Cognitively-Inspired Alternatives to Transformer-based LLMs
My research is concerned with building scalable small language models. While industry-led efforts have produced competitive LLMs that have fundamentally reshaped the work of the contemporary (academic) Natural Language Processing researcher, several fundamental open questions remain that are not an obvious priority for commercial AI research labs.
Recent News:
I am very excited to announce that I will be co-supervising several Small Language Model Undergraduate Research Opportunities Programme (UROP) internships based on Pico, our learning dynamics framework. Stay tuned for more information about the UROPs and the application process, or email sas245@cam.ac.uk or pjb48@cam.ac.uk if you have any questions.
In Easter 2025, Dr Weiwei Sun and I are co-organising a reading group on Computational Models of Language. We meet weekly in the Department, with a hybrid option for those joining online, to discuss Neural Grammar Induction and Computational Models of Acquisition, Change and Language Evolution. To join or express interest, email sas245@cam.ac.uk or ws390@cam.ac.uk.
March 2025 - We released PicoLM, the Cambridge Small Language Model & Learning Dynamics Framework. Check out the YouTube video put together by Zeb Goriely: Introducing PicoLM | YouTube.
March 2025 - I attended and presented a poster at HumanCLAIM in Göttingen, Germany, kindly organised by Prof Lisa Beinborn.
March 2025 - I delivered a presentation in Tübingen on “Human Validated Grammar Profiles for Language Models” in a colloquium organised by Prof Detmar Meurers.
In Lent 2025, I am the Teaching Assistant (a new role equivalent to Lead Demonstrator) for CST IA Machine Learning & Real World Data, and I am a supervisor for Machine Learning & Bayesian Inference (MLBI) [CST Part II].
I organise the Natural Language & Information Processing (NLIP) Seminars in the Department of Computer Science & Technology, University of Cambridge. These are weekly seminars with speakers presenting their research on Language Models, Machine Learning, Cognitive Modelling and more traditional topics in Computational Linguistics.
November 2024 - I delivered my first guest lecture, on Language Model Evaluation, for an MPhil course at the University of Cambridge with Prof Buttery and Dr Fermín Moscoso del Prado Martín. At 22, this was a great opportunity and privilege so early in my “formal” academic career.
November 2024 - I presented my MEng thesis at EMNLP in Miami. See the ACL Anthology paper here: https://aclanthology.org/2024.conll-babylm.15/. Thanks to my Part III supervisors (Zeb, Richard, Andrew and Paula) for all their help!
Team & Collaborators
I am very grateful to a number of academics, researchers and PhD students with whom I am currently working or have previously worked:
- Bianca Ganescu – my MPhil student, currently completing a thesis on small multimodal (vision-language) models for the MPhil in Advanced Computer Science. Co-supervised with Dr Andrew Caines and Prof Paula Buttery.
- The Cambridge PicoLM Team: Richard Diehl Martinez (and his MPhil/Part III students: Yuval Weiss and David Demitri Africa), Ryan Daniels (a Machine Learning Engineer with the Accelerate Programme for Scientific Discovery), and Prof Paula Buttery.
- Zeb Goriely, Pietro Lesci and Julius Cheng.
- XFACT (KAIST AI) and NAVER Cloud: Jiwoo Hong (KAIST), Dr James Thorne, Jeonghoon Kim (NAVER Cloud, KAIST AI) and Woojin Chung.
- Dr Fermín Moscoso del Prado Martín – we are working on information-theoretic models of diachronic phonological typology. Paul Siewert and I have also been thinking about category-theoretic approaches in linguistics.
- Yuan Gao (CST), Mila Marcheva (CST) and Nuria Bosch Masip (TAL).
- Gabrielle Gaudeau, Dr Diana Galvan Sosa, Dr Donya Rooein (MilaNLP, Bocconi University), Dr Zheng Yuan and Hongyi (Sam) Gu (KCL/NetMind.AI).
- Dr Konstantinos Voudouris (Institute for Human-Centered AI at Helmholtz Munich).
I am always open to discussing research ideas that might (or might not!) be related to these directions!
Being acknowledged for providing feedback on a paper draft is a very kind and generous token of appreciation! Here is a running list of papers in which I have been acknowledged:
- Moscoso del Prado Martín (2025) Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance
- Goriely & Buttery (2025) IPA-CHILDES & G2P+: Feature-Rich Resources for Cross-Lingual Phonology and Phonemic Language Modeling
Cambridge Lecture Notes and Supervision Sheets
Here is my lecture handout on Language Model Evaluation for the L95 Module for the MPhil in Advanced Computer Science: L95 Lecture | Michaelmas 2024
I am currently in the process of collating a collection of lecture notes, handouts and supervision materials for courses I am supervising (largely in the Computer Science Tripos) or have previously taken.
I supervise several courses for Cambridge Computer Science undergraduates:
- CST IA (First Year): Introduction to Probability, Machine Learning & Real World Data (Supervisor & Teaching Assistant/Lead Demonstrator).
- CST IB (Second Year): Artificial Intelligence, Formal Models of Language.
- CST II (Third Year): Machine Learning & Bayesian Inference.