Code for the Text Engineering seminar (see the seminar plan)

package ir (Information Retrieval)

|_. Package|_. Contents|_. Resources/Dependencies|_. Literature|
|basic|Corpus, linear search, term-document matrix|Shakespeare|IIR ch. 1|
|boole|Inverted index, list intersection, preprocessing, positional index, PositionalIntersect| |IIR ch. 1 + 2|
|ranked|Ranked retrieval: term weighting, vector space model| |IIR ch. 6 + 7|
|evaluation|Evaluation: precision, recall, F-measure| |IIR ch. 8|
|lucene|Lucene: indexer and searcher|lucene-core, lucene-queryparser, lucene-analyzers-common|Lucene in Action|
|web|Crawler, WebDocument|commons-io, nekohtml, jrobotx|IIR ch. 19 + 20|
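
To give a feel for the boole topics (inverted index and list intersection), here is a minimal sketch in Java. It is not the seminar's code: the class name @InvertedIndexSketch@ and its methods are hypothetical, tokenization is a simple split on non-word characters, and the intersection follows the merge idea described in IIR ch. 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class for illustration; names are not taken from the seminar code.
class InvertedIndexSketch {

    // term -> sorted list of document ids (the postings list)
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    // Documents are indexed in ascending id order, so every postings list stays sorted.
    InvertedIndexSketch(List<String> documents) {
        for (int docId = 0; docId < documents.size(); docId++) {
            for (String token : documents.get(docId).toLowerCase().split("\\W+")) {
                if (token.isEmpty()) continue;
                List<Integer> list = postings.computeIfAbsent(token, k -> new ArrayList<>());
                // Skip duplicates when a term occurs several times in one document.
                if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                    list.add(docId);
                }
            }
        }
    }

    List<Integer> postingsFor(String term) {
        return postings.getOrDefault(term.toLowerCase(), List.of());
    }

    // Merge-style intersection of two sorted postings lists.
    static List<Integer> intersect(List<Integer> p1, List<Integer> p2) {
        List<Integer> answer = new ArrayList<>();
        int i = 0, j = 0;
        while (i < p1.size() && j < p2.size()) {
            int a = p1.get(i), b = p2.get(j);
            if (a == b) {
                answer.add(a);
                i++;
                j++;
            } else if (a < b) {
                i++;
            } else {
                j++;
            }
        }
        return answer;
    }

    // Boolean AND query over two terms.
    List<Integer> and(String t1, String t2) {
        return intersect(postingsFor(t1), postingsFor(t2));
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch(List.of(
                "Brutus killed Caesar",
                "Caesar was ambitious",
                "Brutus and Caesar and Calpurnia"));
        System.out.println(index.and("Brutus", "Caesar")); // prints [0, 2]
    }
}
```

Because both lists are sorted, the intersection takes time linear in their combined length, which is the main advantage over scanning every document as in the linear search of the basic package.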

package tm (Text Mining)

|_. Package|_. Contents|_. Resources/Dependencies|_. Literature|
|document|Document, Topics, TermIndex, FeatureVector| | |
|corpus|Corpus, DB, DocumentIndex, Crawler|db4o, crawler (see package ir.web)| |
|classification|TextClassifier, Naive Bayes| | |
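
As a rough illustration of the classification topic, the following sketch shows a multinomial Naive Bayes text classifier with add-one smoothing. It is an assumption about what such a classifier could look like, not the seminar's TextClassifier: the class name @NaiveBayesSketch@, the whitespace tokenizer, and the labels in the usage example are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration; not the seminar's TextClassifier.
class NaiveBayesSketch {

    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Simplistic tokenizer (assumption): lower-case, split on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    // Record one labeled training document.
    void train(String label, String text) {
        totalDocs++;
        docsPerClass.merge(label, 1, Integer::sum);
        Map<String, Integer> counts = termCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum);
            tokensPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Return the label with the highest log posterior, or null if nothing was trained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // Log prior: share of training documents carrying this label.
            double score = Math.log((double) docsPerClass.get(label) / totalDocs);
            Map<String, Integer> counts = termCounts.get(label);
            int classTokens = tokensPerClass.getOrDefault(label, 0);
            for (String token : tokenize(text)) {
                if (!vocabulary.contains(token)) continue; // ignore terms never seen in training
                int count = counts.getOrDefault(token, 0);
                // Add-one (Laplace) smoothed term likelihood.
                score += Math.log((count + 1.0) / (classTokens + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.train("china", "Chinese Beijing Chinese");
        nb.train("china", "Chinese Chinese Shanghai");
        nb.train("china", "Chinese Macao");
        nb.train("other", "Tokyo Japan Chinese");
        System.out.println(nb.classify("Chinese Chinese Chinese Tokyo Japan")); // prints china
    }
}
```

Scores are summed as logarithms rather than multiplied as probabilities, which avoids floating-point underflow on longer documents.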