Specialization courses
Information Retrieval and Mining
Course contents: Web search basics. Document preprocessing, analysis, storage and indexing. Retrieval models (Boolean/Vector Space/Probabilistic). Tolerant retrieval. Evaluation measures and standard test collections. Document clustering (flat/hierarchical). Document classification (Naïve Bayes and vector space). Link analysis. Frequent itemset mining. Language models.
Assessment: The course grade is based on two components. Programming projects (possibly involving a personal examination) and/or exercises (in-class or homework) jointly account for 50% of the final grade, and a 3-hour written final examination accounts for the remaining 50%. These percentages may vary (+/-10%) each year. To complete the course successfully, a student must score at least 50% on the written final examination and achieve a weighted average of 50% or higher.
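The pass rule above combines two conditions, which a short sketch may make easier to check. The function name `course_result`, the 0-100 score scale, and the reading of "+/-10%" as a coursework weight between 40% and 60% are assumptions made for illustration only; the actual weights are set by the course each year.

```python
def course_result(coursework, exam, coursework_weight=0.5):
    """Return (final_grade, passed) for scores on an assumed 0-100 scale.

    coursework: combined score for projects and/or exercises
    exam: score on the written final examination
    coursework_weight: share of the final grade given to coursework
        (assumed to stay between 0.4 and 0.6, i.e. 50% +/- 10 points)
    """
    if not 0.4 <= coursework_weight <= 0.6:
        raise ValueError("coursework weight is assumed to lie in [0.4, 0.6]")

    # Weighted average of the two components.
    final_grade = coursework_weight * coursework + (1 - coursework_weight) * exam

    # Both conditions must hold: at least 50% on the written exam
    # and a weighted average of at least 50%.
    passed = exam >= 50 and final_grade >= 50
    return final_grade, passed
```

For example, under the default 50/50 split, `course_result(80, 40)` returns `(60.0, False)`: the weighted average is above 50%, but the written examination score is not, so the course is not passed.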