Combining Background Knowledge and Learned Topics

Topics in Cognitive Science 3 (1):18-47 (2011)
  Copy   BIBTEX

Abstract

Statistical topic models provide a general data - driven framework for automated discovery of high-level knowledge from large collections of text documents. Although topic models can potentially discover a broad range of themes in a data set, the interpretability of the learned topics is not always ideal. Human-defined concepts, however, tend to be semantically richer due to careful selection of words that define the concepts, but they may not span the themes in a data set exhaustively. In this study, we review a new probabilistic framework for combining a hierarchy of human-defined semantic concepts with a statistical topic model to seek the best of both worlds. Results indicate that this combination leads to systematic improvements in generalization performance as well as enabling new techniques for inferring and visualizing the content of a document

Other Versions

No versions found

Links

PhilArchive



    Upload a copy of this work     Papers currently archived: 101,551

External links

Setup an account with your affiliations in order to access resources via your University's proxy server

Through your library

Similar books and articles

肝機能検査データからの因果モデルの構築.寺野 隆雄 稲田 政則 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:708-715.
Concepts and the Innate Mind.Eric A. Margolis - 1995 - Dissertation, Rutgers the State University of New Jersey - New Brunswick

Analytics

Added to PP
2010-08-11

Downloads
127 (#172,749)

6 months
6 (#873,397)

Historical graph of downloads
How can I increase my downloads?