Please note: This PhD seminar will take place in DC 3301.
Nandan Thakur, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Jimmy Lin
Traditional IR evaluation (e.g., TREC, following the Cranfield paradigm) constructs test collections from fixed corpora and pooled relevance judgments, a practice that captures few of the challenges faced by RAG applications. This talk begins with the limitations of prevalent IR benchmarks, which suffer from stale data, incomplete relevance labels, or overly simplistic queries. In particular, we motivate why retrieval evaluation must evolve and why modern systems demand a shift in the metrics used for IR evaluation. We then describe FreshStack, a holistic benchmark that addresses these gaps by constructing test collections from recent StackOverflow Q&As and GitHub documents, reflecting real-world programming questions, and we discuss the diversity-focused metrics it uses for evaluation. The goal is to give practitioners insight into the limits of traditional IR evaluation and to guide them toward more realistic, robust evaluation of IR systems in modern RAG applications.
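To make "diversity-focused metric" concrete, below is a minimal sketch of one such measure, alpha-nDCG, which rewards documents covering not-yet-seen relevant nuggets and discounts redundant results. The function name, the doc_nuggets mapping, and the default alpha and k values are illustrative assumptions for this sketch, not code or settings taken from FreshStack.

import math
from typing import Dict, List, Set

def alpha_ndcg(ranking: List[str],
               doc_nuggets: Dict[str, Set[str]],
               alpha: float = 0.5,
               k: int = 10) -> float:
    # A document's gain decays by (1 - alpha) for every earlier occurrence
    # of each nugget it covers, so redundant results contribute less.
    def dcg(docs: List[str]) -> float:
        seen: Dict[str, int] = {}
        score = 0.0
        for rank, doc in enumerate(docs[:k], start=1):
            gain = sum((1 - alpha) ** seen.get(n, 0)
                       for n in doc_nuggets.get(doc, set()))
            for n in doc_nuggets.get(doc, set()):
                seen[n] = seen.get(n, 0) + 1
            score += gain / math.log2(rank + 1)
        return score

    # The ideal ranking is approximated greedily: at each position, pick
    # the document with the largest novelty-discounted marginal gain.
    candidates = [d for d, nugs in doc_nuggets.items() if nugs]
    ideal: List[str] = []
    covered: Dict[str, int] = {}
    while candidates and len(ideal) < k:
        best = max(candidates,
                   key=lambda d: sum((1 - alpha) ** covered.get(n, 0)
                                     for n in doc_nuggets[d]))
        ideal.append(best)
        for n in doc_nuggets[best]:
            covered[n] = covered.get(n, 0) + 1
        candidates.remove(best)

    ideal_score = dcg(ideal)
    return dcg(ranking) / ideal_score if ideal_score > 0 else 0.0

Under this kind of measure, a ranking whose top documents all repeat the same nugget scores lower than one that spreads coverage across distinct nuggets, which is the behaviour diversity-focused evaluation is meant to reward.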