Our idea

Our project, carried out by two units, one from the University of Milan and one from the University of Insubria, studies the English metalanguage that was created to analyse and compare, appraise and classify, teach and learn the vernacular languages of Europe between 1500 and 1700, before the development of comparative philology and the institutionalisation of linguistics as an academic discipline. 

MetaLing Corpus: Creating a corpus of English linguistics metalanguage from the 16th to the 18th century 

Our project studies the English metalanguage that was created to analyse and compare, appraise and classify, teach and learn the vernacular languages of Europe between 1500 and 1700, i.e. before the development of comparative philology and the institutionalisation of linguistics as an academic discipline. To this end, we will build a corpus of texts dedicated to or including observations on vernacular languages, which, in the period under review, are found in works with a wide variety of aims and fields (Van Hal 2019). Through extensive archival research and corpus compilation, the project, which is situated in the history of English for Specific Purposes (ESP), aims to assess the genres and text types involved in the circulation of linguistic knowledge, and thus to shed light on unconventional texts and voices alongside the major works and figures on which scholarship has naturally concentrated. The core of our study will be a diachronic analysis of the terminology, discursive strategies and descriptive metaphors used to discuss language in these texts.

Our method for corpus collection combines human and computational tools (Moretti 2000) to analyse available sources and compile an inventory of authors and works representative of early modern linguistic metalanguage in English. For the purposes of this project, we intend to collect a meaningful corpus (Sangiacomo et al. 2022), in the sense that it both corroborates existing scholarly knowledge about some major aspects of the evolution of linguistics and its metalanguage in English and provides new insights into facets of this evolution that have not previously been observed. For these reasons, we aim to build an open corpus whose composition may change over time, also benefitting from future external contributions. In terms of workflow, we will proceed as follows: the large amount of scraped information will be cleaned, simplified and tokenised with the NLTK Python library; subsequently, keywords and collocations will be consolidated, analysed and processed through lexicon extraction techniques (Anglin 2019; Lahti et al. 2019). The corpus will be published in open access so that other researchers can query it freely.
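
As a rough illustration of the workflow just described, the sketch below runs cleaning, simplification, tokenisation and collocate counting on a single made-up sentence. It uses only the Python standard library as a stand-in for the NLTK tools the project relies on, so it should not be read as the project's actual pipeline. The sample text, the seed terms, the stopword list, the long-s normalisation and the window size are all assumptions introduced for illustration.

```python
import re
from collections import Counter

# All of the values below are illustrative assumptions, not project data.
SAMPLE = ("The Engliſh tongue is a mingled tongue; the Scottish dialect "
          "differeth from the English tongue.")
SEED_TERMS = {"tongue", "dialect"}      # hypothetical metalinguistic keywords
STOPWORDS = {"the", "is", "a", "from"}  # hypothetical function-word list


def clean(text):
    """Lowercase, modernise the long s, and collapse runs of whitespace."""
    text = text.lower().replace("ſ", "s")
    return re.sub(r"\s+", " ", text).strip()


def tokenise(text):
    """Split cleaned text into alphabetic word tokens, dropping punctuation."""
    return re.findall(r"[a-z]+", text)


def simplify(tokens):
    """Remove function words so that content words dominate the counts."""
    return [tok for tok in tokens if tok not in STOPWORDS]


def collocates(tokens, node_terms, window=2):
    """Count words occurring within `window` tokens of each node term."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in node_terms:
            continue
        start, end = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[(tok, tokens[j])] += 1
    return counts


tokens = simplify(tokenise(clean(SAMPLE)))
counts = collocates(tokens, SEED_TERMS)
for (node, collocate), freq in counts.most_common(5):
    print(node, collocate, freq)
```

In a real run, NLTK's own tokenisers and collocation finders would normally replace these hand-written functions, and the seed terms would come from the inventory of authors and works rather than from a fixed set.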

The corpus will serve several purposes. It will help raise awareness of the significance of linguistics and philology in multilingual Europe, and of the importance of these studies for advancing our knowledge of a long tradition of contact, exchange and even conflict between the linguistic and cultural identities of Europe. It will be a scholarly and didactic tool, and the terminology extracted from it will provide data of interest for open-source dictionaries and lexical repertoires. The study is a timely contribution to the ongoing debate on the development of the discourse of the humanities as an inherently interdisciplinary field.

Anglin K. L. 2019. “Gather-Narrow-Extract: A Framework for Studying Local Policy Variation Using Web-Scraping and Natural Language Processing”, Journal of Research on Educational Effectiveness, 12(4), 685-706.

Lahti L., Marjanen J., Roivainen H., Tolonen M. 2019. “Bibliographic Data Science and the History of the Book (c. 1500–1800)”, Cataloging & Classification Quarterly, 57(1), 5-23.

Moretti F. 2000. “Conjectures on World Literature”, New Left Review, 1, 54.

Sangiacomo A., Tanasescu R., Donker S., Hogenbirk H. 2022. “Mapping the evolution of early modern natural philosophy: corpus collection and authority acknowledgement”, Annals of Science, 79(1), 1-39.

Van Hal T. 2019. “Early Modern Views on Language and Languages (ca. 1450-1800)”, in Oxford Research Encyclopedia of Linguistics, Oxford UP.

Team
  • Angela Andreani (Principal Investigator), University of Milan
  • Daniel Russo (Associated Investigator), University of Insubria
  • Martin Petkov Ruskov, University of Milan
  • Simona Turbanti, University of Milan
  • Vahid Asadi, University of Milan
Conference presentations
  • Andreani A., “Labelling Language Variety and Diversity in the English Renaissance”, 71st Renaissance Society of America (RSA) Annual Meeting, 20 March 2025, Boston (MA), USA.
  • Russo D., “Corpus-Driven Metalinguistic Explorations: Analyzing Language Discussions in Early Modern English Sources”, 71st Renaissance Society of America (RSA) Annual Meeting, 22 March 2025, Boston (MA), USA.
  • Russo D., Andreani A., “Varieties of metalinguistic awareness in English, 1500-1700”, XVII Convegno Internazionale CIRSIL, 18 October 2024, University of Trento.
  • Russo D., Andreani A., “Mapping the history of language-related terminology in English (1500-1700): A corpus-based collocate approach”, Henry Sweet Society Colloquium 2023, 4 September 2023, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal.
  • Russo D., “Navigating the Digital Frontier: A Study of Concurrent Translation (CT) Practices among Italian Professionals”, How can AI translate?, 22 April 2024, University of Naples Federico II, Naples.
  • Andreani A., “Metalinguistic labelling in Florio’s A Worlde of Words (1598)”, International conference John and/or Giovanni, Tradition and Innovation in Florian Studies, 13-14 June 2024, Sapienza University of Rome.
  • Russo D., Andreani A., “A review of the metalinguistic labelling in R.C. Alston’s collection of texts regarding Cant and Dialects”, International Conference on the History of the Language Sciences (ICHoLS), 26-30 August 2024, Tbilisi State University, Tbilisi (Georgia).
Publications
  • Andreani A., Russo D. (2023), “Building a Corpus of the Metalanguage of English Linguistics 1500-1700: Methodological Issues”, Linguistica e Filologia, 43, pp. 151-174.
Seminar Series

In Spring 2024, we organised a seminar series dedicated to corpus building, with a particular focus on historical corpora. This series explored experimental approaches for constructing and querying these specialised corpora. Participants had the opportunity to explore innovative methodologies, discuss best practices, and engage with experts in the field.

Symposium
[Image: Symposium poster]

The symposium "Addressing Technical Challenges in Corpus Linguistics Research" took place on 11 April 2025 in the Aula Magna of the University of Insubria, Sant'Abbondio, Como. The event brought together distinguished scholars in the field, including Alicia Rodríguez Álvarez (University of Las Palmas de Gran Canaria), Isabel Sofía Moskowich-Spiegel Fandiño and Luís Miguel Puente Castelo (University of A Coruña), Walter Giordano (University of Naples Federico II), and Silvia Bernardini and Adriano Ferraresi (University of Bologna). Together, they explored current technical issues and advances shaping corpus linguistic research, offering valuable insights for both academics and practitioners.

[Image: Russo and Andreani at the Notte dei ricercatori]
  • Andreani A., Russo D., “Storie e curiosità della lingua inglese”, with the students of the Istituto Comprensivo Statale Don Rimoldi, Notte dei ricercatori, University of Insubria, Varese, 27 September 2024.
Sponsors

The project is carried out with the support of the Italian Ministry of University and Research, PRIN 2022 call, “MetaLing Corpus: Creating a corpus of English linguistics metalanguage from the 16th to the 18th century”, ref.: 202233C93X, funded by the European Union (NextGenerationEU), PNRR Mission 4 - Component 2 - Investment 1.1.

For information

Angela Andreani, University of Milan (Principal Investigator) angela.andreani@unimi.it

Daniel Russo, University of Insubria (Associated Investigator) daniel.russo@uninsubria.it