
How Do Language Models Handle Obscure Words?

What's this blog post about?

Language models (LMs) handle obscure or unknown words by drawing on context clues from surrounding words: they learn the likelihood of a word appearing given its neighbors, an idea rooted in distributional semantics. Even so, LMs can struggle with out-of-vocabulary (OOV) words that never appeared in their training data.

One approach to handling OOV words is to replace them with the most likely word given the surrounding context. Another is to break words down into smaller morphemes or subwords, which helps LMs infer the meanings of unfamiliar words from familiar pieces. This approach is not equally effective across languages, however, because morphological structures differ widely.

To improve LMs' handling of OOV words, researchers recommend incorporating linguistic information beyond morphology into language models. This could involve developing separate tokenizers tailored to different language families and their distinct morphological structures. By better capturing the complexity and diversity of human languages, LMs can handle unknown or obscure words more effectively across contexts.
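The subword idea the summary describes can be sketched with a toy greedy longest-match segmenter in the style of WordPiece. The vocabulary below is a made-up example, not any real tokenizer's vocabulary; the `##` prefix marking word-internal pieces is borrowed from the WordPiece convention.

```python
def subword_tokenize(word, vocab):
    """Greedy longest-match segmentation (WordPiece-style sketch).

    Splits a word into the longest vocabulary pieces it can find,
    so an unseen word like 'unbreakable' is covered by familiar
    subwords instead of mapping to a single unknown token.
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            # Word-internal pieces carry a '##' prefix, per the
            # WordPiece convention.
            candidate = word[start:end] if start == 0 else "##" + word[start:end]
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No known subword covers this span: fall back to an
            # unknown token for the whole word.
            return ["[UNK]"]
        tokens.append(piece)
        start = end
    return tokens

# Hypothetical miniature vocabulary for illustration only.
vocab = {"un", "##break", "##able", "break", "##ing", "[UNK]"}
print(subword_tokenize("unbreakable", vocab))  # ['un', '##break', '##able']
```

Because 'unbreakable' decomposes into pieces the model has seen ('un', 'break', 'able'), the model can relate it to known words even if the full form never occurred in training, which is the intuition behind subword handling of OOV words.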

Company
Deepgram

Date published
Sept. 8, 2023

Author(s)
Brad Nikkel

Word count
1710

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.