Using the methods and software of textual statistics together with the concepts of discourse analysis, this thesis analyses a corpus of 297 speeches delivered by the President of the Republic of Cameroon, Paul Biya, between 1982 and 2002, and brings to light their main lexical, rhetorical and structural characteristics. After identifying some of the thematic axes of the corpus, the lexical analysis continues with a lexicometric study of the evolution of vocabulary. Statistical methods then help to capture certain enunciative phenomena as a function of lexical time and of adaptation to the audience. Finally, two types of "ritual" speeches are analysed: the speeches to the Youth, delivered by the President every year on February 10, on the eve of the Youth Day celebration, and the end-of-year greetings addressed on December 31 to the Nation and to foreign diplomats. Taking these two types into account makes it possible to pursue the study of adaptation to the audience, first by comparing the Youth speeches with the rest of the presidential corpus, then by contrasting the greetings to the Nation with the greetings to the diplomats. The analysis shows a diachronic change of vocabulary and a clear adaptation to the audience.
This paper analyzes both the network of actors and the network of discourses mobilized in the controversy around Professor Didier Raoult and his hydroxychloroquine-based therapeutic proposal against COVID-19. To confirm our hypothesis, we implement a sophisticated and innovative research method on a corpus of 1.2 million tweets, which combines network analysis with lexicometric analysis. We show that the reaction peaks on Twitter were linked to important media events. Moreover, many groups clustered around the accounts of political figures and media outlets that received numerous mentions. Trump's and Bolsonaro's supporter groups also connected with the French-speaking pro-Raoult groups. The messages of the pro-Raoult groups combined anti-science conspiracy theories with a critique of the political economy of liberalism and its impasses.
Intro -- Preface -- Contents -- List of Figures -- List of Tables -- List of Abbreviations -- 1. Introduction: Qualitative Data Analysis in a Digital World -- 1.1. The Emergence of "Digital Humanities" -- 1.2. Digital Text and Social Science Research -- 1.3. Example Study: Research Question and Data Set -- 1.3.1. Democratic Demarcation -- 1.3.2. Data Set -- 1.4. Contributions and Structure of the Study -- 2. Computer-Assisted Text Analysis in the Social Sciences -- 2.1. Text as Data between Quality and Quantity -- 2.2. Text as Data for Natural Language Processing -- 2.2.1. Modeling Semantics -- 2.2.2. Linguistic Preprocessing -- 2.2.3. Text Mining Applications -- 2.3. Types of Computational Qualitative Data Analysis -- 2.3.1. Computational Content Analysis -- 2.3.2. Computer-Assisted Qualitative Data Analysis -- 2.3.3. Lexicometrics for Corpus Exploration -- 2.3.4. Machine Learning -- 3. Integrating Text Mining Applications for Complex Analysis -- 3.1. Document Retrieval -- 3.1.1. Requirements -- 3.1.2. Key Term Extraction -- 3.1.3. Retrieval with Dictionaries -- 3.1.4. Contextualizing Dictionaries -- 3.1.5. Scoring Co-Occurrences -- 3.1.6. Evaluation -- 3.1.7. Summary of Lessons Learned -- 3.2. Corpus Exploration -- 3.2.1. Requirements -- 3.2.2. Identification and Evaluation of Topics -- 3.2.3. Clustering of Time Periods -- 3.2.4. Selection of Topics -- 3.2.5. Term Co-Occurrences -- 3.2.6. Keyness of Terms -- 3.2.7. Sentiments of Key Terms -- 3.2.8. Semantically Enriched Co-Occurrence Graphs -- 3.2.9. Summary of Lessons Learned -- 3.3. Classification for Qualitative Data Analysis -- 3.3.1. Requirements -- 3.3.2. Experimental Data -- 3.3.3. Individual Classification -- 3.3.4. Training Set Size and Semantic Smoothing -- 3.3.5. Classification for Proportion and Trend Analysis -- 3.3.6. Active Learning -- 3.3.7. Summary of Lessons Learned.
Nation-building, one of the major challenges for most political leaders in African countries in the pre- and post-independence period, has not been an easy task, either in practice or symbolically. Indeed, the very first challenge was to forge a unique identity corresponding to the aspirations of their people. After independence, however, this project faced enormous difficulties in materializing. Moreover, the word nation, despite being of recent use according to the philological study by Marcel Mauss (Mauss, 2013), undoubtedly reveals among African politicians of the postcolonial period a common desire to embark on a path of unification of their peoples beyond their specificities and their cultural diversity. But from a symbolic point of view, specific to the politicians of French West Africa, this concept of nation does not correspond to the African reality, because it embodies a whole imaginary that sometimes goes beyond the national framework stricto sensu. The aim of this research is therefore to return to the issues of national construction, both material and imaginary, that the Socialist Party faced during the colonial and postcolonial period. Through a textometric analysis of the political discourse of two illustrious speakers and main leaders of the Senegalese Socialist Party, L. S. Senghor and Abdou Diouf, this thesis seeks to track down traces of this African identity steeped in foreign identities, a cultural heritage that seems to seal the fate of the peoples of West Africa in general and Senegal in particular. The thesis is thus a means of questioning the survival of the African tradition in Senegalese political discourse, and more particularly in that of the Socialist Party, the architect of independence, whose history coincides with that of Senegal in many respects.
This thesis studies the discourse of the European Union on the European judicial area between 1996 and 1999. Using discourse analysis and lexicometrics within the theoretical framework of argumentative semantics, it explores the meaning of the words associated with the expression "area of freedom, security and justice". The research mainly aims to understand how the discourse reappropriates universal values in order to construct, through argumentative processes, a European identity embodied by the European judicial area. After presenting the political context and the linguistic theories behind the methods of analysis, the thesis examines the conditions of discourse production and its semantic particularities. It argues that the discourse constructs a triadic identity, built on the three values of its emblem (freedom, security and justice) and resting on three pillars: institutions, organised crime and citizens. The thesis then puts forward the idea of a "discursive normalisation" that helps, through its linguistic forms, to legitimate this identity construction and its political stakes. It also defines a process of "argumentative manipulation" that results from this "discursive normalisation" and relies on a very distinctive linguistic pattern. From these results, the research concludes that normalisation and manipulation generate a circularity of the discourse, based on the conflict between two discursive orientations: security on the one hand, and democracy and the rule of law on the other.
The financial crisis stemming from the US subprime mortgage meltdown was a setback for the European economy that still affects many European countries. The European Central Bank (ECB), a key monetary policy institution, is one of the major actors shaping European neoliberal financial discourse. An earlier lexicometric analysis of our corpus, which comprises the speeches delivered by ECB presidents between 2003 and 2016, identified the term sovereignty as a specific form (one with a high specificity index) in certain years. This article analyzes how the ECB justifies the transfer of sovereignty (financial, national, and so on) from the States to the European Union and its institutions, and what the ECB appeals to when it uses this highly symbolic concept of public discourse. Ultimately, it aims to determine how the ECB, "Europe's illicit emperor" (Bouchard 2013), sets out its guidelines of economic domination. To that end, the article analyzes quantitatively (using Lexico 3.6) and qualitatively the forms in this lexical family (sovereignty, sovereignties, sovereign, sovereigns) to trace their evolution across the speeches of the ECB presidents. It takes the theory and practice of discourse analysis as its starting point and studies the contexts in which the selected terms appear in order to shed light on the implications of these changing backdrops.
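The abstract above does not say how the specificity index of sovereignty is computed. In lexicometrics a common formulation is the hypergeometric model: given a corpus of T tokens in which a form occurs F times, how surprising is it to see at least f occurrences in a part of t tokens? The sketch below assumes that formulation. The function name and the toy counts are hypothetical and are not taken from the ECB corpus or from Lexico 3.6.

```python
from math import comb


def specificity_pvalue(T, t, F, f):
    """Probability of observing at least f occurrences of a form in a
    corpus part of t tokens, when the form occurs F times in the whole
    corpus of T tokens (hypergeometric model, upper tail).

    A small value means the form is over-represented in that part.
    """
    if not (0 < t <= T and 0 <= F <= T and 0 <= f <= min(F, t)):
        raise ValueError("inconsistent counts")
    total = comb(T, t)  # number of ways to draw the part from the corpus
    upper = min(F, t)   # the form cannot occur more often than this in the part
    tail = sum(comb(F, k) * comb(T - F, t - k) for k in range(f, upper + 1))
    return tail / total


# Toy counts (hypothetical): 10 tokens overall, a part of 5 tokens,
# a form occurring twice overall and twice in the part.
p = specificity_pvalue(T=10, t=5, F=2, f=2)
print(round(p, 4))
```

Exact binomial coefficients keep the sketch short but become slow for corpora of realistic size; a practical implementation would probably work in log space instead.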
Purpose: Global changes require a shift to an energy model with low greenhouse gas emissions. Deep geothermal energy promises to be a renewable and limitless energy. As a result, it is presented as a viable energy source in the context of the energy transition. In France, geothermal energy will expand beyond its historical regions, specifically the Paris Basin, to develop throughout the country. In Switzerland, geothermal energy is being developed from the ground up. This extension will require the development of new technical and economic models, most notably through the use of industrial demonstrators to adapt to new geological contexts. However, the development of this technology results in a number of technical issues, including earthquakes, and generates considerable controversy. To investigate the political construction of the resource, we use the conceptual framework of Political Ecology and Actor-Network Theory. Methods: We carried out case studies through fieldwork in France and Switzerland, based on semi-structured interviews and press analysis (qualitative methods and lexicometrics). Conclusions: We propose a more systemic framework of analysis to explain the challenges of deploying deep geothermal energy in less-than-ideal environments. While most discussions focus on social acceptance by local communities, we propose going beyond that. Indeed, we demonstrate that this is only one of several parameters, which also include technical frontiers, administrative regulation, political support, and industrial strategies. We also propose a more critical and radical approach: the difficulties of implementing deep geothermal energy reveal the internal contradictions that run through ecological modernization.
"Two developments in computational text analysis may change the way qualitative data analysis in social sciences is performed: 1. the availability of digital text worth investigating is growing rapidly, and 2. the improvement of algorithmic information extraction approaches, also called text mining, allows for further bridging the gap between qualitative and quantitative text analysis. The key factor here is the inclusion of context in computational linguistic models, which extends conventional computational content analysis towards the extraction of meaning. To clarify the methodological differences between various computer-assisted text analysis approaches, the article suggests a typology from the perspective of a qualitative researcher. This typology shows compatibilities between manual qualitative data analysis methods and computational, rather quantitative approaches for large-scale mixed-method text analysis designs." (author's abstract)
The broad public debate on climate change and its effects has now reached policymakers. There is great uncertainty about the true extent of climate change and about its consequences. A better understanding of our climate system is therefore urgently needed. To predict future climate accurately, the climate changes of the past have to be understood. Instrumental measurement records cover only the last 100 years and so are not sufficient for this purpose. Historical Climatology, on the other hand, provides tried and tested methods for reconstructing the climate of roughly the past 1000 years. High-quality results depend on including a large number of sources in the analysis. Consequently, both the number of sources used and their geographical coverage must be considerably increased in the future. Many potential sources are already available in libraries in printed or indeed digital form.
The central question asked in this thesis is therefore: can modern, software-supported methods of data mining also be applied to historical texts? The goal is to let the individual researcher in Historical Climatology work more efficiently, so that more sources can be analysed in less time while maintaining or raising the standard of quality. The newly developed software Konkordanz, which provides semi-automatic text extraction, makes the step of extracting quotations from the sources considerably more efficient. In the experiment, semi-automatic extraction needed only a quarter of the time required for manual extraction, with a precision of around 95%. It can be expected that these figures will improve further as experience with the new method grows. New search and scoring algorithms were developed for Konkordanz. They are based on approaches from natural language processing, lexicometrics and data mining. The design of the human-machine interaction drew on established user interface design techniques. The analysis of travel diaries poses a particular challenge because the traveller's location changes constantly. Historical climatological studies can only be performed once these locations are pinpointed. In the diaries analysed in this thesis, mostly Arabic place names were transcribed into English ad hoc, by ear. To pinpoint these locations, a phonetic search based on the DoubleMetaphone algorithm was implemented, complemented by a newly developed phonetic rating. Development and testing of the algorithms drew on established approaches from data mining and machine learning. The phonetic search finds both more relevant and fewer irrelevant locations than the previously available search methods. With the newly developed search, nearly twice as many of Buckingham's travel locations could be pinpointed correctly as with the best previously available "fuzzy" GeoNames search.
Even with the newly developed methods for the semi-automatic retrieval of relevant passages and the coding support provided by the StatementsManager, the reconstruction of the travel route remains one of the most time-consuming steps in the analysis of travel diaries. Therefore, a RouteFinder algorithm was developed. From a list of place names and the expected mean distance between the places, it computes the most probable travel routes. The algorithm automatically detects place names in the input that cannot be located, or can only be located incorrectly. The development of the algorithm followed graph-theoretical approaches; its optimization built on data mining, multivariate data analysis and machine learning. The results of the automated route reconstruction are fascinating. The newly developed methods were applied to James Silk Buckingham's six travel diaries, which were already available digitally in online archives. From December 1815 to December 1816, Buckingham travelled from Alexandria in Egypt through Palestine, Syria, Mesopotamia and Persia to India. The timing of his journey is of climatological interest because it followed the eruption of Mount Tambora (Indonesia) in the spring of 1815. In Europe and North America, the year 1816 went down in history as the "year without a summer" because the eruption modified atmospheric circulation patterns. Buckingham's localised and coded weather notes were transformed into weather tables, which condense the weather-relevant statements from more than 3100 pages of travel accounts into three pages. During his travels, Buckingham experienced two droughts: one in the southern Levant, which apparently began in the winter of 1815/1816, and another in Persia, which had already lasted two or three years, depending on the region. Comparison with a modern climate model shows that, after a Tambora-like eruption, the model predicts a course of weather precisely opposite to the one Buckingham observed.
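The abstract describes the RouteFinder only by its inputs (a list of place names and the expected mean distance between consecutive places) and its output (the most probable routes). The following sketch is not the author's algorithm. It illustrates one simple way such a problem could be framed, assuming each place name has already been resolved to a few candidate coordinates and that a route is more plausible when its leg lengths stay close to the expected distance. All names and coordinates are hypothetical, and the sketch does not attempt to detect names that cannot be located.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def most_plausible_route(candidates, expected_km):
    """Pick one candidate per stop so that the legs deviate least from expected_km.

    candidates: one list per stop, in travel order, of (lat, lon) tuples.
    Returns the index of the chosen candidate for each stop.
    Uses dynamic programming over consecutive stops (a shortest path
    through a layered graph).
    """
    if not candidates or any(not stop for stop in candidates):
        raise ValueError("every stop needs at least one candidate")
    cost = [0.0] * len(candidates[0])  # best cost of reaching each first-stop candidate
    back = []                          # back-pointers, one list per leg
    for prev, cur in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for q in cur:
            options = [cost[i] + abs(haversine_km(p, q) - expected_km)
                       for i, p in enumerate(prev)]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path backwards from the last stop.
    idx = min(range(len(cost)), key=cost.__getitem__)
    route = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        route.append(idx)
    route.reverse()
    return route


# Hypothetical stops: the second name is ambiguous (one nearby and one
# distant candidate); legs of roughly 55 km are expected.
stops = [[(0.0, 0.0)], [(0.0, 0.5), (0.0, 40.0)], [(0.0, 1.0)]]
print(most_plausible_route(stops, expected_km=55.0))  # prints [0, 0, 0]
```

Scoring legs by absolute deviation from the expected distance is a deliberate simplification; a probabilistic model of daily travel distance would be a natural refinement.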