Module Tutor: Dell Zhang
Time: Tuesday evenings 6pm - 9pm (Spring Term)
Venue: Online (Blackboard Collaborate) on Moodle
Code: COIY064H7
Teaching Assistant: Ehshan Veerabangsa (e.veerabangsa@bbk.ac.uk) - Coursework Marking & Asynchronous Support
Dan Jurafsky and James H. Martin. Speech and Language Processing, 2nd edition, Pearson, 2008. Companion Website (3rd edition draft)
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008. Companion Website
Week | Date | Session 1 | Session 2 |
---|---|---|---|
1 | 11/01/2022 | [IIR-00] [SLP-01] Motivation [slides] | [IIR-01] Boolean Retrieval [slides] [classwork-p] [classwork-s] |
2 | 18/01/2022 | [IIR-02] The Term Vocabulary and Postings Lists [slides] | [SLP-02] [SLP-08a] Regular Expressions, Text Normalization, Sequence Labeling [slides] [slides] |
3 | 25/01/2022 | [IIR-03] Dictionaries and Tolerant Retrieval [slides] [classwork-p] [classwork-s] | [SLP-02] Edit Distance [slides] |
4 | 01/02/2022 | [IIR-05] Index Compression [slides] [classwork-p] [classwork-s] | [IIR-06] Scoring, Term Weighting, and the Vector Space Model [slides] [classwork-p] [classwork-s] [example] |
5 | 08/02/2022 | [IIR-08] Evaluation in Information Retrieval [slides] [example] | [IIR-11] Probabilistic Information Retrieval [slides] [example] |
6 | 15/02/2022 | Reading Week: no lecture for any students. Materials to read: [SLP-20a] Lexicons for Sentiment, Affect, and Connotation (1/2) [slides] | [SLP-24] Chatbots and Dialogue Systems [slides] |
-- | 20/02/2022 | Coursework Part 1 - Submission Deadline | |
7 | 22/02/2022 | [IIR-12] Language Models for Information Retrieval [slides] [example] | [SLP-03a] Language Modeling with N-Grams [slides] |
8 | 01/03/2022 | [SLP-B] Spelling Correction and the Noisy Channel [slides] | [IIR-13] [SLP-04] Text Classification, Naive Bayes, and Sentiment Analysis [slides] [slides] [slides] [example] |
9 | 08/03/2022 | [IIR-14] Vector Space Classification [slides] [demo] [example] | [SLP-05] Logistic Regression [slides] |
10 | 15/03/2022 | [IIR-18] Matrix Decompositions and Latent Semantic Indexing [slides] [article] | [SLP-06] Vector Semantics [slides] [slides] [slides] |
11 | 22/03/2022 | [SLP-07] Neural Nets and Neural Language Models [slides] | [SLP-09] Deep Learning Architectures for Sequence Processing [slides] |
-- | 03/04/2022 | Coursework Part 2 - Submission Deadline | |
-- | Tuesday 03/05/2022 6pm - 9pm | Revision Lecture [slides] Past Exam Paper 2008-09 2009-10 2010-11 2011-12 2012-13 2013-14 2014-15 2015-16 2016-17 2017-18 2018-19 2019-20 2020-21 | |
-- | ---------- | Index Construction [slides] | Computing Scores in a Complete Search System [slides] |
-- | ---------- | Relevance Feedback and Query Expansion [slides] | XML Retrieval [slides] |
-- | ---------- | Flat Clustering [slides] [demo] [example] | Hierarchical Clustering [slides] [example] |
-- | ---------- | Support Vector Machines & Machine Learning on Documents [slides] | Near-Duplicates and Shingling [slides] [classwork-p] [classwork-s] |
-- | ---------- | Suffix Trees [slides] [example] | Probabilistic Topic Models [slides] |
Coursework: 20%
Part 1 [Reassessment]
Normal deadline: Fri 05/08/2022 13:00
Cut-off deadline: Fri 19/08/2022 13:00
Part 2 [Reassessment]
Normal deadline: Fri 05/08/2022 13:00
Cut-off deadline: Fri 19/08/2022 13:00
Please submit your solutions as a PDF file through Moodle.
Examination: 80%
Past exam papers can be found at Birkbeck eLibrary.
MSc students committed to excellence are welcome to contact me for project ideas.
Python [A Short Course for BGRS and BPSN]
Apache Lucene
Terrier IR Platform
The Lemur Project
Python Package - Whoosh
David Forsyth and Jean Ponce: An Introduction to Probability.
Peter Norvig: How to Write a Spelling Corrector.
Peter Norvig: Natural Language Corpus Data, in Beautiful Data: The Stories Behind Elegant Data Solutions.
Paul Graham: A Plan for Spam.
Paul Graham: Better Bayesian Filtering.
Robert M. Bell et al.: The Million Dollar Programming Prize, IEEE Spectrum, May 2009.
Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd edition, Prentice Hall, 2010. (Chapter 22 Natural Language Processing)
Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edition, Addison Wesley, 2010.
Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, international edition, Pearson Education, 2009.
Stefan Büttcher, Charles Clarke, and Gordon Cormack, Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010.
David Grossman and Ophir Frieder, Information Retrieval: Algorithms and Heuristics, 2nd edition, Springer, 2004.
Jeffrey Dean: Challenges in Building Large-Scale Information Retrieval Systems (WSDM-2009 Keynote Speech). [VideoLecture]
UC Berkeley Course SIMS141: Search Engines: Technology, Society, and Business [Guest Lecture Videos].
Michael McCandless, Erik Hatcher, and Otis Gospodnetic, Lucene in Action, 2nd edition, Manning, 2010.
Toby Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications, O'Reilly, 2007.
Matthew Russell, Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites, O'Reilly, 2011.
Satnam Alag, Collective Intelligence in Action, Manning, 2008.
Haralambos Marmanis and Dmitry Babenko, Algorithms of the Intelligent Web, Manning, 2009.
Ron Zacharski, A Programmer's Guide to Data Mining, Free Online eBook.
Hans Rosling: The Joy of Stats [Video].
Stanford Course CS276/LING286: Information Retrieval and Web Mining
Stuttgart Course: Introduction to Information Retrieval
MSU Course CSE484: Information Retrieval
Cornell Course CS430/INFO430: Information Retrieval
UNT Course CSCE5200: Information Retrieval and Web Search
UIUC Course CS410: Introduction to Text Information Systems (Spring 2008)
UIUC Course CS598: Integrative Intelligent Information Systems (Spring 2008)
UMass Course CS646: Information Retrieval
UCSC Course ISM260: Information Retrieval
UTexas Course CS 371R: Information Retrieval and Web Search
UPenn Course CIS 430: Introduction to Human Language Technology
PSU Course IST 441: Information Retrieval and Search Engines
UNC Course INLS 490-154: Introduction to Information Retrieval System Design and Implementation (Fall 2008)
IIT Course CS429: Introduction to Information Retrieval
Columbia Course COMS 6998: Search Engine Technology
Colorado Course CSCI 7000-001: Introduction to Information Retrieval
JHU Course 605.744: Information Retrieval (Spring 2009)
UCL Course M052: Information Retrieval