
Word Embeddings

Wikipedia defines word embedding as the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers.

Word embeddings are a way to transform words in text into numerical vectors so that they can be analyzed by standard machine learning algorithms that require numerical vectors as input.
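
A minimal sketch of this idea, assuming a toy three-word vocabulary and a randomly initialised embedding matrix (in a real system the matrix is learned from data rather than random):

  import numpy as np

  vocab = ["king", "queen", "apple"]           # hypothetical toy vocabulary
  word_to_index = {w: i for i, w in enumerate(vocab)}

  embedding_dim = 4                            # small dimension, chosen for illustration
  rng = np.random.default_rng(42)
  embedding_matrix = rng.normal(size=(len(vocab), embedding_dim))

  def embed(word):
      """Look up the dense real-valued vector for a word."""
      return embedding_matrix[word_to_index[word]]

  print(embed("king"))   # a 4-dimensional vector usable as input to an ML algorithm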

One-hot encoding produces sparse, high-dimensional vectors in which every pair of words is equally distant, so it captures no notion of similarity between words. To overcome these limitations, the NLP community has borrowed techniques from information retrieval (IR) to vectorize text using the document as the context. Notable techniques are TF-IDF, latent semantic analysis (LSA), and topic modeling. However, these representations capture a slightly different, document-centric idea of semantic similarity.
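
A sketch of this document-centric vectorization using TF-IDF followed by LSA; it assumes scikit-learn is available, and the toy corpus is made up for illustration:

  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.decomposition import TruncatedSVD
  from sklearn.metrics.pairwise import cosine_similarity

  corpus = [
      "the cat sat on the mat",
      "a dog sat on the rug",
      "stock markets fell sharply today",
  ]

  # TF-IDF: each document becomes a sparse vector over the vocabulary.
  tfidf = TfidfVectorizer()
  doc_term = tfidf.fit_transform(corpus)

  # LSA: reduce the TF-IDF matrix to a low-dimensional latent space via SVD.
  lsa = TruncatedSVD(n_components=2, random_state=0)
  doc_topics = lsa.fit_transform(doc_term)

  # Note that the similarity computed here is between documents, not words,
  # which is the document-centric view mentioned above.
  print(cosine_similarity(doc_topics))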

