One way to assign integers to words is to build a vocabulary in which the words are sorted by frequency, and then give lower integers to the more frequent words.

1) Using a dictionary

from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

text = "A barber is a person. a barber is good person. a barber is huge person. he Knew A Secret! The Secret He Kept is huge secret. Huge secret. His barber kept his word. a barber kept his word. His barber kept his secret. But keeping and keeping such a huge secret to himsel...
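
The frequency-based numbering described above can be sketched roughly as follows. This is a minimal illustration and not the original post's code: it uses only the standard library, a regex split stands in for the NLTK tokenizers, and `sample_text`, `stop_words`, `min_length`, and `build_word_to_index` are hypothetical names chosen for this example.

```python
import re
from collections import Counter

# Hypothetical sample text and stopword list, used only for illustration.
sample_text = "A barber is a person. a barber is good person. a barber is huge person."
stop_words = {"a", "is", "the", "he", "his"}


def build_word_to_index(text, stop_words, min_length=3):
    """Map each word to an integer, giving lower integers to more frequent words."""
    # Lowercase and keep runs of letters; a simple stand-in for a real tokenizer.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Drop stopwords and words shorter than min_length.
    tokens = [t for t in tokens if t not in stop_words and len(t) >= min_length]
    counts = Counter(tokens)
    # Sort by descending frequency; ties are broken alphabetically
    # so that the numbering is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Indices start at 1, so the most frequent word gets 1.
    return {word: index for index, (word, _) in enumerate(ordered, start=1)}


word_to_index = build_word_to_index(sample_text, stop_words)
encoded = [word_to_index[w] for w in ["barber", "person", "huge"]]
print(word_to_index)  # {'barber': 1, 'person': 2, 'good': 3, 'huge': 4}
print(encoded)        # [1, 2, 4]
```

Starting the indices at 1 leaves 0 free for other uses such as padding, which is a common convention but an assumption here rather than something the excerpt states.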
Original post: Integer Encoding (정수 인코딩)