Slide 2: Topic modeling
Models of a collection of composites
Composites are documents
Parts are words (or phrases, n-grams)
Two outputs:
for each topic, the probability of selecting a particular part when sampling from that topic
for each document (composite), the probability of selecting a particular topic when sampling from that document
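As a rough illustration of the two outputs, the sketch below stores each one as a table whose rows sum to 1. The vocabulary, the two topics, and all numbers are hypothetical values made up for this example, not results from any fitted model. The helper `word_probability` is also an assumption: it combines the two tables by summing over topics, which follows from treating a document as a mixture of topics.

```python
# Hypothetical toy vocabulary and numbers, chosen only for illustration.
vocab = ["gene", "cell", "market", "stock"]

# Output 1: for each topic, the chance of selecting each part (word).
topic_word = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0, loosely "biology"
    [0.05, 0.05, 0.50, 0.40],  # topic 1, loosely "finance"
]

# Output 2: for each document, the chance of selecting each topic.
doc_topic = [
    [0.9, 0.1],  # document 0 is mostly topic 0
    [0.2, 0.8],  # document 1 is mostly topic 1
]


def word_probability(doc, word):
    """Chance of drawing `word` from document `doc`, summed over topics."""
    w = vocab.index(word)
    return sum(
        doc_topic[doc][k] * topic_word[k][w] for k in range(len(topic_word))
    )


if __name__ == "__main__":
    for d in range(len(doc_topic)):
        print(d, {w: round(word_probability(d, w), 3) for w in vocab})
```

Each row in either table is a probability distribution, so a model with K topics, V parts, and D documents yields a K-by-V table and a D-by-K table.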
Slide 3: Assumptions
semantic information can be derived from a word-document co-occurrence matrix;
a topic is a probability distribution over words
to make a new document, one chooses a distribution over topics
for each word in that document, one chooses a topic at random according to this distribution, and draws a word from that topic.
Resulting document is a mixture of topics
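The generative story above can be sketched in a few lines. This is a minimal sketch, not a full topic model: the function name `generate_document`, its parameters, and the toy data are all hypothetical, and the document's distribution over topics is passed in directly because the slide does not say how it is chosen.

```python
import random


def generate_document(topic_dist, topics, vocab, length, seed=0):
    """Sample `length` words following the generative assumptions.

    topic_dist: the document's distribution over topics (assumed given).
    topics: one distribution over `vocab` per topic.
    """
    rng = random.Random(seed)
    words = []
    for _ in range(length):
        # Choose a topic at random according to the document's distribution.
        k = rng.choices(range(len(topics)), weights=topic_dist)[0]
        # Draw a word from the chosen topic's distribution over words.
        words.append(rng.choices(vocab, weights=topics[k])[0])
    return words


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    vocab = ["gene", "cell", "market", "stock"]
    topics = [
        [0.50, 0.40, 0.05, 0.05],
        [0.05, 0.05, 0.50, 0.40],
    ]
    print(generate_document([0.7, 0.3], topics, vocab, length=10))
```

Because every word gets its own topic draw, a single document can contain words from several topics, which is what makes it a mixture.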