Python framework for fast Vector Space Modelling
This package contains the pure Python implementation of gensim. If you don't need the highly optimized version of word2vec, installing this package is sufficient. Otherwise, installing the "python-gensim-addons" package is strongly recommended.

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.

Features:

* All algorithms are memory-independent w.r.t. the corpus size (can process input larger than RAM).
* Intuitive interfaces
  - easy to plug in your own input corpus/datastream (trivial streaming API)
  - easy to extend with other Vector Space algorithms (trivial transformation API)
* Efficient implementations of popular algorithms, such as online Latent Semantic Analysis (LSA/LSI), Latent Dirichlet Allocation (LDA), Random Projections (RP), Hierarchical Dirichlet Process (HDP) or word2vec deep learning.
* Distributed computing: can run Latent Semantic Analysis and Latent Dirichlet Allocation on a cluster of computers, and word2vec on multiple cores.
* Extensive HTML documentation and tutorials.
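The streaming interface mentioned in the feature list can be illustrated with a minimal sketch in plain Python. It does not import gensim; the class name `StreamedCorpus`, the toy documents and the hand-written vocabulary are all assumptions made for illustration. The point it shows is that a corpus only needs to be an iterable that yields one sparse vector (a list of `(token_id, count)` pairs) per document, so the whole collection never has to sit in memory at once.

```python
from collections import Counter


class StreamedCorpus:
    """Hypothetical helper: yields one bag-of-words vector per document."""

    def __init__(self, lines, vocabulary):
        self.lines = lines            # any re-iterable source of text lines
        self.vocabulary = vocabulary  # dict mapping token -> integer id

    def __iter__(self):
        # Documents are converted one at a time, so only a single
        # document is held in memory during iteration.
        for line in self.lines:
            counts = Counter(
                self.vocabulary[token]
                for token in line.lower().split()
                if token in self.vocabulary
            )
            # Sparse vector: sorted list of (token_id, count) pairs.
            yield sorted(counts.items())


# Toy data, assumed for illustration only.
documents = [
    "human machine interface",
    "graph of trees",
    "graph minors and trees",
]
vocabulary = {
    "human": 0, "machine": 1, "interface": 2,
    "graph": 3, "trees": 4, "minors": 5,
}

corpus = StreamedCorpus(documents, vocabulary)
vectors = list(corpus)  # materialized here only to inspect the result
```

In practice the `lines` argument could be an object that reads a file lazily on every pass, which is what allows input larger than RAM. Tokens missing from the vocabulary (here "of" and "and") are simply skipped.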
Release | Stable | Testing |
---|---|---|
EPEL 7 | 0.10.0-1.el7 | - |
You can contact the maintainers of this package via email at python-gensim dash maintainers at fedoraproject dot org.