---
id: lexical-richness
title: Lexical Richness
---
The literature uses several terms for this measurement (it is sometimes called vocabulary diversity, among others), and there is more than one way to quantify it. We chose the Moving-Average Type-Token Ratio (MATTR), described by Covington and McFall[^1], as an appropriate measure of vocabulary richness. MATTR computes the ratio of unique words (types) to total words (tokens) within a fixed-length window that slides across the text, then averages those ratios. Simply put, it quantifies how varied the words used in speech are, which can serve as a proxy for some clinical measurements.
[^1]: Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian knot: The moving-average type-token ratio (MATTR). *Journal of Quantitative Linguistics*, 17(2), 94-100.
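
As an illustration of the idea, here is a minimal Python sketch of a MATTR calculation. It is not the OpenDBM implementation: the function names `tokenize` and `mattr`, the regex-based tokenizer, the default window of 50 tokens, and the fallback to a plain type-token ratio for short texts are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplistic, hypothetical tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def mattr(tokens, window=50):
    """Moving-average type-token ratio over a list of tokens.

    `window` is an assumed default, not a value taken from OpenDBM.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not tokens:
        raise ValueError("at least one token is required")

    # Assumption: for texts shorter than the window, fall back to the plain TTR.
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)

    # Type-token ratio of every full window, sliding one token at a time.
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


tokens = tokenize("The cat sat on the mat, the end.")
score = mattr(tokens, window=4)
print(round(score, 3))
```

With a window of 4, the eight tokens above yield five windows, three with four unique words and two with three, so the score is the mean of 1.0, 1.0, 1.0, 0.75 and 0.75. Because every window has the same length, the result does not drift downward as the text grows, which is the main weakness of a plain type-token ratio.
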
## Derived Variables
| Variable | Description |
| ----------- | ----------- |
| `nlp_mattr_mean` | **Lexical richness**, measured using the moving-average type-token ratio (MATTR). |