Mapping the Evolution of Threat Perceptions in Global Security Documents with Natural Language Processing
International Relations
Security
Methods
Narratives
Big Data
Abstract
This study presents a pioneering analysis of the evolving landscape of global security issues as depicted in national security documents over the past quarter-century. Employing a comprehensive corpus—the most extensive of its kind—we harness Natural Language Processing (NLP) technologies to dissect and understand the changing nature of threats, risks, and priorities as articulated by states worldwide.
The advent of these documents has created a novel international discursive reality, with official narratives on security and defence now prolific. Our project marks the first attempt to map and scrutinize this corpus in its entirety, offering a systematic examination of the ways in which security concerns are conceptualized and evolve over time.
We developed an NLP methodology that clusters semantically similar conceptualizations across the corpus. The technique identifies similar meanings expressed in diverse wording with a high degree of accuracy, which significantly reduces subjective bias and removes the need for inter-coder reliability checks. Our approach enables us to trace the prevalence of security topics quantitatively, establish correlations with an array of variables, and render these insights through a variety of visualizations.
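The idea of scoring sentences by semantic similarity to a topic can be sketched in a few lines. The sketch below is illustrative only and is not the study's pipeline: it assumes a crude bag-of-words vector as a stand-in for whatever semantic representation the method actually uses, and the function names (`vectorize`, `cosine`, `rank_sentences`) and example sentences are hypothetical.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag-of-words count vector; a crude stand-in for a semantic embedding (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_sentences(topic, sentences):
    """Return (score, sentence) pairs, most similar to the topic first."""
    topic_vec = vectorize(topic)
    scored = [(cosine(topic_vec, vectorize(s)), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical sentences, invented for illustration only.
sentences = [
    "Cyber attacks on critical infrastructure are a growing threat.",
    "Climate change will increase the risk of regional instability.",
    "The armed forces will modernise their naval fleet.",
]
ranked = rank_sentences("threat of cyber attacks on infrastructure", sentences)
for score, sentence in ranked:
    print(f"{score:.2f}  {sentence}")
```

A word-count vector only matches shared vocabulary, so it would miss paraphrases that a genuinely semantic representation is meant to capture; the ranking logic, however, stays the same whichever representation supplies the scores.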
By breaking the data down along multiple variables (document type, publication year, UN region, development status, alliance affiliations, and issuing authority), we reveal patterns that open up a range of research questions, particularly concerning the inclusion, distribution, and conceptual framing of security issues.
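As a minimal sketch of how topic prevalence might be broken down by a metadata variable, assume each document is a record with metadata fields and a set of topics detected in it. The record layout, field names, and values below are hypothetical and invented for illustration; they are not drawn from the corpus.

```python
from collections import defaultdict

def prevalence_by(records, variable, topic):
    """Share of documents in each group of `variable` that contain `topic`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        group = rec[variable]
        totals[group] += 1
        if topic in rec["topics"]:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}

# Hypothetical document records (illustrative values only).
records = [
    {"year": 2001, "region": "Europe", "topics": {"terrorism"}},
    {"year": 2001, "region": "Asia", "topics": {"terrorism", "cyber"}},
    {"year": 2020, "region": "Europe", "topics": {"cyber", "climate"}},
    {"year": 2020, "region": "Asia", "topics": {"cyber"}},
]

by_year = prevalence_by(records, "year", "cyber")
by_region = prevalence_by(records, "region", "cyber")
print(by_year)
print(by_region)
```

The same function works for any of the variables listed above, since it only needs the name of the metadata field to group on.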
Our NLP tools also enrich qualitative analysis. Analysts can sift through data files of extracted sentences, ranked by similarity scores and re-arrangeable by other criteria, to discern the semantic structures shaping security discourse. This empowers analysts to distinguish between recognized and emerging narratives within the field of security studies, such as the formal 'grammar' of securitization involving existential threats, points of no return, and possible resolutions, as contrasted with constructs of hypothetical risks or policy priorities.
Determining what constitutes a significant level of thematic similarity requires expert judgment, which places human expertise at the core of our methodology. This ensures that the topics curated for analysis are robust and yield evidence for hypothesis testing and subsequent evaluation. Topics that return fewer results signal lower relevance in the corpus, while topics that return no results indicate their absence from the discourse, with benchmarks set against random, non-relevant topics.
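One way a benchmark against non-relevant topics could inform the expert's cut-off is sketched below. The abstract does not specify how the benchmark is computed, so the rule used here (mean plus `k` standard deviations of the control scores) is purely an assumption, as are the function names and all scores, which are made up for illustration.

```python
import statistics

def baseline_threshold(control_scores, k=2.0):
    """Assumed rule: mean plus k population standard deviations of control-topic scores."""
    return statistics.mean(control_scores) + k * statistics.pstdev(control_scores)

def count_hits(scores, threshold):
    """Number of sentences scoring strictly above the threshold."""
    return sum(1 for s in scores if s > threshold)

# Hypothetical similarity scores for random, non-relevant control topics.
control_scores = [0.10, 0.12, 0.08, 0.11, 0.09]
threshold = baseline_threshold(control_scores)

# Hypothetical similarity scores for two candidate topics.
topic_scores = {
    "cyber": [0.62, 0.55, 0.13, 0.09],
    "space debris": [0.11, 0.07, 0.10, 0.12],
}
hits = {topic: count_hits(scores, threshold) for topic, scores in topic_scores.items()}
print(round(threshold, 3), hits)
```

Under this assumed rule, a topic with no scores above the control-derived threshold would be treated as absent from the corpus, while the final decision on where to place the cut-off would still rest with the analyst.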
Advantages of our methodological framework include objectivity at the coding stage, comprehensiveness in evaluation, reproducibility, amenability to statistical scrutiny, and standardization, which allows for collaborative interoperability among different research entities within a shared semantic landscape.
In conclusion, this study not only redefines our understanding of how security issues are articulated and perceived globally but also sets a new standard for analysing complex discursive realities in an increasingly interconnected and security-conscious world.