Vector Space Model
Contents:
Bag-of-words matching
Overview of the vector space model
Query and document vectors
Dot product and Euclidean distance
Term weighting
Inverse document frequency (idf)
Feature selection with tf-idf
Document length normalization
State-of-the-art retrieval formula
tf-idf weighted sum
Cosine and Jaccard coefficient
Cosine with tf-idf weights
p-norm and chi-squared distance
Phrases and multi-word features
Applications of the VSM