
Word Embeddings

Wikipedia defines word embedding as the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Word embeddings are a way to transform words in text into numerical vectors so that they can be analyzed by standard machine learning algorithms that require numerical vectors as input.
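
A minimal sketch of this idea, assuming a toy three-word vocabulary and a randomly initialised embedding matrix (in a real system the matrix is learned from data rather than random):

  import numpy as np

  vocab = ["king", "queen", "apple"]           # hypothetical toy vocabulary
  word_to_index = {w: i for i, w in enumerate(vocab)}

  embedding_dim = 4                            # small dimension, chosen for illustration
  rng = np.random.default_rng(42)
  embedding_matrix = rng.normal(size=(len(vocab), embedding_dim))

  def embed(word):
      """Look up the dense real-valued vector for a word."""
      return embedding_matrix[word_to_index[word]]

  print(embed("king"))   # a 4-dimensional vector usable as input to an ML algorithm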

One-hot encoding produces sparse, high-dimensional vectors in which every pair of words is equally distant, so it captures no notion of similarity between words. To overcome these limitations, the NLP community has borrowed techniques from information retrieval (IR) to vectorize text using the document as the context. Notable techniques are TF-IDF, latent semantic analysis (LSA), and topic modeling. However, these representations capture a slightly different, document-centric idea of semantic similarity.
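
A sketch of this document-centric vectorization using TF-IDF followed by LSA; it assumes scikit-learn is available, and the toy corpus is made up for illustration:

  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.decomposition import TruncatedSVD
  from sklearn.metrics.pairwise import cosine_similarity

  corpus = [
      "the cat sat on the mat",
      "a dog sat on the rug",
      "stock markets fell sharply today",
  ]

  # TF-IDF: each document becomes a sparse vector over the vocabulary.
  tfidf = TfidfVectorizer()
  doc_term = tfidf.fit_transform(corpus)

  # LSA: reduce the TF-IDF matrix to a low-dimensional latent space via SVD.
  lsa = TruncatedSVD(n_components=2, random_state=0)
  doc_topics = lsa.fit_transform(doc_term)

  # Note that the similarity computed here is between documents, not words,
  # which is the document-centric view mentioned above.
  print(cosine_similarity(doc_topics))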

