Unstandardized Accounting Terminology

Terms & Concepts


Overview

We construct two alternative accounting dictionaries, one following a top-down approach and one following a bottom-up approach. Each dictionary consists of:

  • Term Lists: Unique accounting terms appearing in financial reports
  • Concept Lists: Mappings showing which terms describe the same underlying accounting concepts (i.e., synonyms)

Top-Down: Terms are collected from IFRS, US GAAP, and UK GAAP authoritative sources and from specialized accounting dictionaries. Terms explicitly classified as synonyms are grouped by concept. The lists are refined using a GPT-based procedure and manual validation, then restricted to terminology observed in our global corpus.

Bottom-Up: Terms are extracted from XBRL filings on EDGAR by parsing Exhibit 101.LAB files, which map XBRL taxonomy tags to the natural-language labels firms use in their reports. To reduce noise, a term must appear in at least 20 distinct filings. We then apply a majority disambiguation rule, removing terms that appear in fewer than 5% of all filings for a given concept. An international version is created by repeating these steps for non-US firms cross-listed in the United States that file 20-F reports using the IFRS Taxonomy.
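The two bottom-up filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the input is assumed to be a flat list of hypothetical (filing_id, tag, term) tuples already extracted from the 101.LAB files, the 20-filing minimum is assumed to be counted per term across all tags, and the 5% share is assumed to be measured against the number of distinct filings that report the tag. The function name and parameters are illustrative.

```python
from collections import defaultdict

MIN_FILINGS = 20   # a term must appear in at least 20 distinct filings
MIN_SHARE = 0.05   # a term must appear in at least 5% of a concept's filings


def filter_terms(records, min_filings=MIN_FILINGS, min_share=MIN_SHARE):
    """Return {tag: set of retained terms}.

    `records` is an iterable of (filing_id, tag, term) tuples. This flat
    layout is a hypothetical stand-in for the parsed label data.
    """
    term_filings = defaultdict(set)  # term -> filings that use the term
    tag_filings = defaultdict(set)   # tag -> filings that report the tag
    pair_filings = defaultdict(set)  # (tag, term) -> filings using term for tag

    for filing_id, tag, term in records:
        term_filings[term].add(filing_id)
        tag_filings[tag].add(filing_id)
        pair_filings[(tag, term)].add(filing_id)

    kept = defaultdict(set)
    for (tag, term), filings in pair_filings.items():
        # Noise filter (assumed to apply to the term across all tags).
        if len(term_filings[term]) < min_filings:
            continue
        # Disambiguation filter: share of the tag's filings that use this term.
        share = len(filings) / len(tag_filings[tag])
        if share < min_share:
            continue
        kept[tag].add(term)
    return dict(kept)
```

With this reading, a label that is common overall but attached to a given tag in only a handful of filings is dropped for that tag while remaining available for the tags where it is widely used.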


Term Lists

The complete list of unique accounting terms identified in each dictionary. Each term has a unique identifier (TID) and an n-gram count indicating how many words it contains (one word, two words, etc.).
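As a rough illustration of this structure, the sketch below builds a small term list in Python. The sequential integer TIDs, the whitespace-based word count, and the column names TERM and NGRAM are assumptions made for the example; the actual identifiers and column names in the Excel file may differ.

```python
def build_term_list(terms):
    """Assign a TID and an n-gram count to each unique term.

    TIDs are sequential integers over the alphabetically sorted terms and
    the n-gram count is the number of whitespace-separated words. Both
    choices are illustrative assumptions.
    """
    unique_terms = sorted(set(terms))
    return [
        {"TID": tid, "TERM": term, "NGRAM": len(term.split())}
        for tid, term in enumerate(unique_terms, start=1)
    ]
```

For example, `build_term_list(["net income", "revenue", "net income"])` yields two rows, with "net income" counted as a 2-gram and "revenue" as a 1-gram.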

📥 Download: Excel File (2.9 MB)

  • Top-Down
  • Bottom-Up (10-Ks)
  • Bottom-Up (20-Fs)

Concept Lists

These lists show which terms are synonyms (i.e., describe the same underlying accounting concept). Terms sharing the same concept ID (CID) are used interchangeably to describe the same economic item.
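One way to use such a mapping is to look up all synonyms of a given term through its CID. The Python sketch below assumes the concept list has been loaded as hypothetical (cid, term) pairs; the function name and the input layout are illustrative, not part of the published files.

```python
from collections import defaultdict


def synonyms_by_term(rows):
    """Map each term to the set of other terms that share a CID with it.

    `rows` is an iterable of (cid, term) pairs, a hypothetical flattened
    form of a concept list.
    """
    terms_by_cid = defaultdict(set)
    for cid, term in rows:
        terms_by_cid[cid].add(term)

    synonyms = defaultdict(set)
    for terms in terms_by_cid.values():
        for term in terms:
            # Every other term under the same CID is a synonym.
            synonyms[term] |= terms - {term}
    return dict(synonyms)
```

A term that is the only member of its concept maps to an empty set, and a term listed under several CIDs collects the synonyms from all of them.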

📥 Download: Excel File (2.7 MB)

  • Top-Down
  • Bottom-Up (10-Ks)
  • Bottom-Up (20-Fs)

Most Common Terms

The same terms as in the Term Lists above, ranked by frequency of use across all reports in our corpus. For each term, the lists report how often it appears (FREQUENCY), the number of observations (OBS), and the number of distinct firms using it (FIRMS). This reveals which accounting terms are most prevalent in reporting practice.

📥 Download: Excel File (5.1 MB)

  • Top-Down
  • Bottom-Up (10-Ks)
  • Bottom-Up (20-Fs)

Most Common Concepts

The same concepts as in Concept Lists above, but ranked by frequency and showing all synonymous terms used for each concept. For each concept, you can see the different terms firms use along with usage statistics (FREQUENCY, OBS, FIRMS). This illustrates the extent of terminology variation for the most frequently reported accounting concepts.
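A concept-level ranking of this kind can be sketched as follows in Python. The example assumes hypothetical (cid, term, frequency) rows and takes a concept's frequency to be the sum of its terms' FREQUENCY values; that aggregation is an assumption for illustration. OBS and FIRMS are left out because distinct-firm counts cannot simply be summed across synonymous terms.

```python
from collections import defaultdict


def rank_concepts(rows):
    """Rank concepts by total term frequency, most frequent first.

    `rows` is an iterable of (cid, term, frequency) tuples (a hypothetical
    layout). Returns a list of (cid, total_frequency, terms), where `terms`
    is a list of (term, frequency) pairs sorted by descending frequency.
    """
    terms_by_cid = defaultdict(list)
    for cid, term, frequency in rows:
        terms_by_cid[cid].append((term, frequency))

    ranked = []
    for cid, terms in terms_by_cid.items():
        terms.sort(key=lambda item: item[1], reverse=True)
        total = sum(freq for _, freq in terms)
        ranked.append((cid, total, terms))

    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

The number of terms listed under each ranked concept gives a simple view of how much terminology variation that concept exhibits.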

📥 Download: Excel File (4.8 MB)

  • Top-Down
  • Bottom-Up (10-Ks)
  • Bottom-Up (20-Fs)
 

© 2025 | Supplementary materials for JAE submission