Speaker: Jian-Yun Nie, University of Montreal
Time: July 13, 2011, 10:00-11:30 AM
Venue: Room 101b, Ligong Annex Building (理工配楼)
Topic:
Dependency Models for Information Retrieval
Speaker biography:
Jian-Yun Nie is a professor at the University of Montreal. He graduated from Southeast University (B.Sc.) and the University of Grenoble (PhD). His research focuses on IR and NLP, and he has worked on topics such as IR models, cross-lingual and multilingual IR, IR using query logs, and machine translation. He has published more than 150 research papers in journals and conferences. In 1999, he won the “Best Paper” award at the SIGIR conference. He has been a regular PC member for a number of conferences such as SIGIR, CIKM and ACL, and is on the editorial boards of 7 international journals. In particular, he is serving as a general co-chair of SIGIR 2011 in Beijing. Jian-Yun Nie has been an invited professor/researcher at several institutions such as Microsoft Research, Yahoo, and the Chinese University of Hong Kong, and he is currently a visiting professor at Peking University.
Abstract:
Accounting for term dependencies is a fundamental problem in IR, and many attempts have been made to address it. In this talk, we will review a series of such studies. In general, when a dependency is detected in a query, it is imposed as an additional requirement on the retrieval process. However, we observed that not all detected dependencies are equally useful for IR. For example, while the dependency within “black Monday” is highly useful, the one in “financial investment” is much less so. Indeed, in the latter case, using the two separate words can perform equally well, and imposing the dependency on retrieved documents may even be harmful. We therefore developed a new dependency model in which dependencies are integrated according to their utility for IR. Different types of dependency are used, and the utility of each dependency is estimated using a machine learning method. We will show that the model performs better than previous dependency models on several test collections. This result shows that an IR model should take into account not only term dependencies, but also their utility.