Word Vectorization: How LLMs Learned to Write Like Humans
Word vectorization is a technique used by large language models (LLMs) to learn how to write like humans. It transforms words into numbers, or vectors, that capture relationships between words based on how often they co-occur in documents. These word vectors can be manipulated with mathematical operations such as addition and subtraction, which lets LLMs capture context and complete sentences. Models like BERT and GPT-3 build on this idea, learning word representations from large amounts of data during training in order to generate human-like text.
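The arithmetic the summary mentions can be sketched with toy vectors. This is a minimal illustration, not output from a real model: the 3-dimensional vectors and their values are invented for the example, and real embeddings have hundreds of dimensions. The classic demonstration is that "king" − "man" + "woman" lands nearest "queen" under cosine similarity:

```python
import math

# Toy 3-dimensional word vectors (illustrative values, not from a trained model).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.5, 0.9, 0.1],
    "woman": [0.5, 0.2, 0.8],
}

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Vector arithmetic: "king" - "man" + "woman".
target = [k - m + w for k, m, w in zip(vectors["king"],
                                       vectors["man"],
                                       vectors["woman"])]

# Find the word (other than the starting word) closest to the result.
nearest = max((w for w in vectors if w != "king"),
              key=lambda w: cosine(vectors[w], target))
print(nearest)  # → queen
```

With these toy values the arithmetic works out exactly; in a trained model the result is only approximately closest to "queen", which is why the nearest-neighbor search is needed.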
Company
Deepgram
Date published
March 13, 2023
Author(s)
Jose Nicholas Francisco
Word count
1893
Language
English