Unstandardized Accounting Terminology
Supplementary Materials
Abstract
The communication of accounting information requires a domain-specific vocabulary, and, in specialized languages, standardization is considered a key to clear communication, i.e., one term should only be assigned to one concept and vice versa. In practice, accounting terminology is unstandardized and produces undue complexity. We provide the first large-sample evidence on the level and the implications of unstandardized accounting terminology for a global corpus of annual reports. Our study shows that unstandardized accounting terminology is widespread and has economic consequences, increasing human and computerized information processing costs.
About This Repository
This repository provides supplementary materials for our paper, Unstandardized Accounting Terminology. Datasets, concept mappings, and our code for company name matching are available for download.
What You’ll Find:
Terms & Concepts: Complete accounting dictionaries constructed using top-down (authoritative sources) and bottom-up (XBRL filings) approaches. Includes term lists, concept lists showing synonyms, and frequency statistics across our global corpus of annual reports.
Database Item Matching: Mappings between Compustat and Worldscope data items and our accounting concepts. These mappings enable analysis of how terminology standardization affects data collection by commercial providers.
Name Matching Package: An R package (rMatching) for fuzzy matching of firm names across databases. Developed for matching Perfect Information filings to Datastream, but applicable to any general name matching task. Available on GitHub.
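To illustrate the general idea behind fuzzy firm-name matching, here is a minimal sketch in Python using only the standard library. It is not the rMatching API and does not reproduce the package's method. The suffix list, the function names (`normalize`, `best_match`), and the 0.85 threshold are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form suffixes to drop before comparing names.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "ltd", "limited",
                  "plc", "ag", "sa", "co", "company"}


def normalize(name):
    """Lowercase, strip punctuation, and remove common legal-form suffixes."""
    cleaned = name.lower().replace(".", "")       # "A.G." -> "ag"
    cleaned = re.sub(r"[^a-z0-9 ]", " ", cleaned)  # other punctuation -> space
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar candidate name,
    or None if no candidate reaches the (assumed) similarity threshold."""
    target = normalize(name)
    scored = [
        (cand, SequenceMatcher(None, target, normalize(cand)).ratio())
        for cand in candidates
    ]
    best = max(scored, key=lambda pair: pair[1], default=None)
    if best is None or best[1] < threshold:
        return None
    return best


if __name__ == "__main__":
    print(best_match("Siemens A.G.",
                     ["Siemens AG", "Siemens Energy AG", "Nestle SA"]))
```

Normalizing before scoring means that differences in legal form or punctuation do not dominate the similarity score. The threshold trades off false matches against missed matches and would need tuning for any real pair of databases.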