
The Birth of Conceptometry: Measuring the Wealth of Ideas in Texts

By Luigi Usai

For decades, textual analysis has been dominated by surface metrics — word counts, lexical density, readability scores — useful but incomplete when the research question is the content of thought rather than the form of language. Conceptometry is a new scientific discipline created to fill that gap: a systematic, computational framework to measure the conceptual richness of a text in relation to its length and structure.

What is Conceptometry?

Conceptometry quantifies how many distinct concepts a text contains, how those concepts are distributed, and how complex or abstract they are. Rather than counting words, it counts and weights ideas. It formalizes four primary metrics, made concrete in the sketch after this list:

  • DCg — Raw Conceptual Density: the number of unique concepts divided by the token count.
  • DCp — Weighted Conceptual Density: concepts weighted by semantic depth and abstraction before normalizing by length.
  • IRC — Conceptual Redundancy Index: the degree to which concepts are repeated or paraphrased.
  • EI — Informational Efficiency: how effectively textual space is used to convey distinct concepts.
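
This article does not publish the exact formulas, so the following is only a minimal formalization consistent with the definitions above, stated as an illustrative assumption (C = the set of unique concepts in the text, M = the total number of concept mentions, N = the token count, Fd and Fa = the depth and abstraction weights from the pipeline below):

    DCg = |C| / N
    DCp = ( Σ_{c ∈ C} Fd(c) · Fa(c) ) / N
    IRC = 1 − |C| / M
    EI  = DCg · (1 − IRC)

Under this reading, a text that repeats the same few ideas has M much larger than |C|, so IRC approaches 1 and EI collapses toward zero, matching the intuition of a verbose but concept-poor passage.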

How it works — an overview

Conceptometry combines Natural Language Processing (NLP), ontology linking, and psycholinguistic measures. A compact pipeline, sketched in code after this list:

  1. Preprocessing (tokenization, lemmatization, normalization)
  2. Concept extraction (NER, concept linking to WordNet/BabelNet/ConceptNet)
  3. Complexity weighting (semantic depth Fd, abstraction factor Fa)
  4. Metric computation (DCg, DCp, IRC, EI) and visualization
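
As an illustration only (the reference implementation has not yet been published), a minimal end-to-end sketch of this pipeline might use spaCy for step 1 and a naive WordNet lookup via NLTK for step 2; Fd is approximated here by normalized synset depth, Fa is omitted for brevity, and the step-4 formulas repeat the assumptions stated above:

    # Illustrative Conceptometry pipeline sketch; assumed design, not the
    # reference implementation. Requires: pip install spacy nltk, then
    # python -m spacy download en_core_web_sm and nltk.download("wordnet").
    import spacy
    from nltk.corpus import wordnet as wn

    nlp = spacy.load("en_core_web_sm")
    MAX_DEPTH = 20.0  # rough upper bound on WordNet hypernym depth

    def conceptometry_metrics(text: str) -> dict:
        doc = nlp(text)  # step 1: tokenization, lemmatization, tagging
        mentions = []    # step 2: every linked concept occurrence
        for tok in doc:
            if tok.is_stop or not tok.is_alpha:
                continue
            synsets = wn.synsets(tok.lemma_)
            if synsets:
                mentions.append(synsets[0])  # naive first-sense linking
        unique = set(mentions)
        n = max(len(doc), 1)
        # step 3: Fd approximated by normalized hypernym depth (Fa omitted)
        fd = {s: min(s.min_depth() / MAX_DEPTH, 1.0) for s in unique}
        # step 4: metric computation under the assumed formulas
        dcg = len(unique) / n
        dcp = sum(fd.values()) / n
        irc = 1 - len(unique) / len(mentions) if mentions else 0.0
        ei = dcg * (1 - irc)
        return {"DCg": dcg, "DCp": dcp, "IRC": irc, "EI": ei}

    print(conceptometry_metrics(
        "Conceptometry measures the density of distinct ideas in a text."))

A production pipeline would replace the first-sense lookup with proper word-sense disambiguation and entity linking (BabelNet or ConceptNet, as listed in step 2), but the control flow would remain the same.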

Figure — Conceptometry pipeline (schematic): Raw Text → NLP & Concept Extraction (NER • Linking • Disambiguation) → Concept Graph → Metrics (DCg • DCp • IRC • EI).

Why this matters

Conceptometry reframes questions of textual quality and complexity in terms of the ideas conveyed, not merely the words used. This change has direct implications for:

  • Education — adapt curricula and reading materials to conceptual load rather than only lexical difficulty.
  • Scientific communication — measure clarity and density of argumentation across papers and disciplines.
  • Content moderation & QA — detect verbose but concept-poor AI outputs or identify high-value informational passages (see the toy example after this list).
  • Digital humanities — map conceptual evolution in corpora over time.
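
As a toy illustration of the moderation use case, the conceptometry_metrics sketch above can compare a dense passage with a repetitive one; the thresholds one would set in practice remain an open empirical question:

    # Toy comparison, reusing the illustrative conceptometry_metrics above.
    dense = ("Photosynthesis converts light energy into chemical energy, "
             "producing glucose and oxygen from carbon dioxide and water.")
    verbose = ("This is really very important. It is important because it "
               "matters, and it matters because it is so very important.")

    for label, passage in (("dense", dense), ("verbose", verbose)):
        m = conceptometry_metrics(passage)
        print(f"{label}: EI = {m['EI']:.3f}, IRC = {m['IRC']:.3f}")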

Scientific foundations and validation

Conceptometry integrates established work from multiple fields: lexical density and register studies (Ure, 1971), frequency analysis and corpora (Francis & Kucera, 1982), psycholinguistic concreteness norms (Brysbaert et al., 2014), and large-scale lexical networks such as WordNet (Miller, 1990). To be scientifically credible, the discipline requires:

  1. rigorous operational definitions of what constitutes a “concept” in a text;
  2. open algorithms for extraction and weighting (Fd, Fa) that are reproducible;
  3. validation on benchmark corpora (e.g., Brown, COCA) and human-rated concept-mapping studies.

Early prototypes show that concept-weighted metrics correlate with expert judgements of conceptual density in academic texts; robust validation remains a priority for the coming months.
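
A validation run of that kind might look like the following sketch, which rank-correlates DCp with expert ratings; the passages and ratings below are invented placeholders for illustration, not study data:

    # Hypothetical validation sketch: Spearman correlation between DCp and
    # human ratings of conceptual density (all values are placeholders).
    from scipy.stats import spearmanr

    passages = [
        "The cat sat on the mat.",
        "Entropy quantifies the uncertainty of a probability distribution.",
        "Gravity, inertia, and momentum jointly govern orbital mechanics.",
    ]
    expert_ratings = [1.0, 3.5, 4.5]  # placeholder 1-5 density scores

    dcp_scores = [conceptometry_metrics(p)["DCp"] for p in passages]
    rho, p_value = spearmanr(dcp_scores, expert_ratings)
    print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")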

Short roadmap

The near-term development plan for Conceptometry includes: (1) publishing open-source reference code; (2) assembling annotated corpora with concept maps; (3) workshops to refine Fd and Fa operationalizations; (4) interdisciplinary collaborations between NLP, cognitive psychology, and pedagogy.

Selected references

  • Ure, J. (1971). Lexical density and register differentiation. In G. E. Perren & J. L. M. Trim (Eds.), Applications of Linguistics (pp. 443–452). Cambridge University Press.
  • Charles, W. G. (1988). The categorization of sentential contexts. Journal of Psycholinguistic Research, 17(5), 403–411.
  • Francis, W. N., & Kucera, H. (1982). Frequency Analysis of English Usage: Lexicon and Grammar. Houghton Mifflin.
  • Leacock, C., Towell, G., & Voorhees, E. M. (1993). Towards building contextual representations of word senses using statistical models. In Proc. ACL/SIGLEX Workshop, 10–20.
  • Miller, G. A. (1990). WordNet: An on-line lexical database. International Journal of Lexicography, 3(4), 235–312.
  • Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46, 904–911. DOI: 10.3758/s13428-013-0403-5
  • Usai, L. (2025). The invention of Conceptometry. Zenodo. DOI: 10.5281/zenodo.16789573
