Text Retrieval and Search Engines
Coursera
Course Summary
Search engines are essential tools for managing and mining big text data. Learn how search engines work, the major search algorithms, and how to optimize search accuracy.
Course Description
Recent years have seen dramatic growth in natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than by computer systems or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to the many other kinds of knowledge that we encode in text. This course covers search engine technologies, which play an important role in any data mining application involving text data, for two reasons. First, while the raw data for a particular problem may be large, often only a relatively small subset is relevant, and a search engine is an essential tool for quickly finding that subset in a large text collection. Second, search engines help analysts interpret patterns discovered in the data by letting them examine the relevant original text. You will learn the basic concepts, principles, and major techniques of text retrieval, the underlying science of search engines.
Course Syllabus
This course covers the following topics:
- Introduction to text data mining
- Basic concepts in text retrieval
- Information retrieval models
- Implementation of a search engine
- Evaluation of search engines
- Advanced search engine technologies
Recommended Background
Basic knowledge of data structures and proficiency in programming with either C++ or Java are recommended. Basic knowledge of probability and statistics is helpful, but not required.
Course Format
The course consists of video lectures accompanied by quizzes and peer-graded assignments.