Small corpus, great institution - and an attempt to understand them
The New Year's speech by the President of the Republic is one of the most important political speeches in Finland. We have gathered all the speeches from 1935 to 2007 into a corpus of the speeches in written form. Our objective is to explore what the speeches are like in terms of linguistic choices and as a set or type of texts. We are also interested in the social dimensions of the speeches and the ideological meanings produced in them. This paper presents our research questions and methods of analysis rather than empirical results. We present the method and project we have decided to call "Teko" (from text to corpus), which is based on the compilation and structuring of small, mutually comparable corpora, and on detailed quantitative (corpus-linguistic) and qualitative analysis (text analysis applying, e.g., the process analysis of Systemic Functional Grammar). We consider the following research positions and the analytical questions related to them: the uniformity of the speeches as compared to another set of texts, namely news (e.g. based on morphological features that have been analysed semi-automatically); the internal uniformity of the speeches judging by how the speakers refer to themselves (differences arising from the speakers on the one hand and the topics on the other); and the uniformity of the speeches on the basis of process analysis (distribution of processes by presidents and topics). Our fundamental question in this paper is how the quantitative analysis of a small corpus can be connected to a qualitative analysis of individual texts.