Tokenization, stemming and synonyms


Contents:

  1. IR4.1 Vocabulary mismatch in IR
  2. IR4.2 Causes of vocabulary mismatch
  3. IR4.3 How to tokenize text
  4. IR4.4 Tokenizing Asian languages
  5. IR4.5 Morphological variation
  6. IR4.6 Stemming algorithms
  7. IR4.7 Porter and Krovetz stemmers
  8. IR4.8 Character n-grams
  9. IR4.9 Query-based stemming
  10. IR4.10 Removing stopwords
  11. IR4.11 Synonymy and polysemy in IR
  12. IR4.12 MeSH thesaurus
  13. IR4.13 Finding synonyms in WordNet
  14. IR4.14 Statistical synonyms
  15. IR4.15 Examples of statistical synonyms
  16. IR4.16 Cosine similarity and the correlation coefficient
  17. IR4.17 Generalized vector space model
  18. IR4.18 Synonym expansion: Google tilde
  19. IR4.19 User interaction in information retrieval
  20. IR4.20 Relevance Feedback
  21. IR4.21 Rocchio feedback algorithm
  22. IR4.22 Rocchio algorithm illustration
  23. IR4.23 Pseudo-relevance feedback
  24. IR4.24 Illustration of pseudo-relevance feedback
  25. IR4.25 Why pseudo feedback works