The nltk library for Python contains a lot of useful data in addition to its functions. One convenient data set is a large list of English words, accessible like so:
from nltk.corpus import words
word_list = words.words()
# prints 236736
print(len(word_list))
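A common use for this list is checking whether a string is a known English word. The sketch below is one way to do that, assuming the "words" corpus is already installed; the helper names build_vocabulary, is_known_word and load_nltk_vocabulary are made up for illustration and are not part of nltk. Converting the list to a set makes each lookup fast, and lowercasing is a design choice here because the corpus contains capitalized entries.

```python
def build_vocabulary(word_list):
    """Return a lowercase set of words for fast membership tests."""
    return {w.lower() for w in word_list}


def is_known_word(word, vocabulary):
    """Check whether `word` is in the vocabulary, ignoring case."""
    return word.lower() in vocabulary


def load_nltk_vocabulary():
    """Build the vocabulary from nltk's word list.

    Assumes nltk is installed and the "words" corpus has been downloaded.
    """
    from nltk.corpus import words

    return build_vocabulary(words.words())
```

With the corpus available, vocabulary = load_nltk_vocabulary() followed by is_known_word("Hello", vocabulary) performs the lookup. The same helpers work on any iterable of strings, which is handy for trying them out on a small hand-written list.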
You will probably first have to download the word list using nltk's download() function. The following code should give you a GUI window to select the data you want (look for "words" under the "Corpora" tab):
import nltk
nltk.download()
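If you would rather skip the GUI, for example on a server or in a script, nltk.download() also accepts the name of a package. The following is a minimal sketch, assuming the corpus is registered under the identifier "words" and stored at the resource path "corpora/words"; the function name ensure_words_corpus is hypothetical.

```python
def ensure_words_corpus():
    """Download the "words" corpus only if it is not already installed.

    Returns True if the corpus was already present, False if a download
    was triggered.
    """
    import nltk

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find("corpora/words")
        return True
    except LookupError:
        # Fetch just this one package, without opening the GUI.
        nltk.download("words")
        return False
```

Calling ensure_words_corpus() once before importing the word list keeps a script from failing on a machine where the data has never been downloaded.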