In this paper we present a corpus of transcribed Lithuanian parliamentary speeches, prepared in a format suited to a range of authorship identification tasks. The corpus consists of approximately 111 thousand texts (24 million words). Each text corresponds to one parliamentary speech delivered during an ordinary session within the 7 parliamentary terms between March 10, 1990 and December 23, 2013. The texts are grouped into 147 categories corresponding to individual authors, so they can be used for authorship attribution; they are also grouped by age, gender and political views, which makes them suitable for author profiling. Because short texts make an author's speaking style harder to recognize and are ambiguous with respect to the style of other authors, we included only texts containing at least 100 words. To make each category as comprehensive and representative as possible, we included only authors who delivered at least 200 speeches. All texts are lemmatized, morphologically and syntactically annotated, and tokenized into character n-grams. Statistical information about the corpus is also available. We have also demonstrated that the corpus can be used effectively in authorship attribution and author profiling tasks with supervised machine learning methods. The corpus structure also allows its use with unsupervised machine learning methods, for the creation of rule-based methods, and in various linguistic analyses.
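The two inclusion criteria and the character n-gram tokenization described above can be illustrated with a minimal sketch. It is not the pipeline used to build the corpus: the `(author, text)` pair layout, the function names, whitespace-based word counting, and the choice to count an author's speeches after the length filter are all assumptions made for illustration.

```python
from collections import Counter

MIN_WORDS = 100      # minimum words per text, as stated in the abstract
MIN_SPEECHES = 200   # minimum speeches per author, as stated in the abstract


def filter_corpus(speeches, min_words=MIN_WORDS, min_speeches=MIN_SPEECHES):
    """Keep speeches that are long enough and whose author spoke often enough.

    `speeches` is a list of (author, text) pairs (a hypothetical layout).
    Words are counted by splitting on whitespace, and an author's speeches
    are counted after the length filter; both choices are assumptions.
    """
    long_enough = [(a, t) for a, t in speeches if len(t.split()) >= min_words]
    per_author = Counter(a for a, _ in long_enough)
    return [(a, t) for a, t in long_enough if per_author[a] >= min_speeches]


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

With small thresholds for demonstration, `filter_corpus(speeches, min_words=3, min_speeches=2)` keeps only the authors who have at least two texts of three or more words, and `char_ngrams("seimas", 3)` splits the word into its four overlapping trigrams.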
Compiled from the fundamental laws, statutes, orders of the Military Department, circulars of the General Staff, opinions of the General Staff, circular instructions from the authorities, orders of the military districts, decisions of the Chief Military Court, and other legal documents up to January 1, 1899. ; Mode of access: Internet.
The article features the meeting held on November 21–22, 1915, at which representatives of the authorities, cities and public organizations discussed the reception of World War I refugees and the organization of assistance to them within the Irkutsk Governorate General. The study is based on the meeting minutes and periodical press materials, introduced into scholarly circulation here for the first time, and describes the participants, the main topics, and the results of the discussion. The research shows that the authorities did not initially plan to accommodate refugees in such a remote region but were forced to send them farther east because the western regions were overcrowded. The cities of Eastern Siberia did not have time to prepare for such a massive resettlement. People were hastily resettled, feeding and medical stations were organized, and ration supplies were arranged. These efforts were funded by government subsidies (although treasury aid was minimal and arrived late) and by the resources of the resettlement department, the All-Russian Union of Cities, and local charitable organizations. The participants proposed creating a general Siberian committee for assistance to refugees, organizing a refugee support system at the uyezd level, and placing all expenses for the maintenance of refugees and deportees exclusively on the state.
The paper examines the role of the author's image as a factor that determines the variable functioning of a text in the process of its interpretation by an ordinary native speaker. The study draws on a linguistic experiment that revealed semantic versions of the source text resulting from the influence of the author's image on the addressee's interpretive activity. The novelty of the approach is that the image of the author is treated as a portrait of a political leader and his or her communicative intentions, modeled by the addressee and reconstructed from the analysis of interpretive texts produced through the perception and interpretation of the source text. The aim of the work is to substantiate the claim that the image of the author influences the addressee's interpretation of the text. The material was studied using the method of linguistic text analysis. The interpretive texts obtained show that the image of the author of a political text, shaped in everyday political consciousness under the influence of various factors, affects how an ordinary native speaker perceives the text and the addressee's interpretive activity. This is confirmed by the fact that the same text, attributed to different politicians, received different interpretations depending on how the recipient perceived the politician designated as its author.
This article analyzes the socio-political views of a specific naive author (a 75-year-old woman with secondary education and a working-class occupation), presented in handwritten form. It identifies certain basic attitudes characteristic of the everyday consciousness of this type of author: People and power ("The Tsar is good, but the boyars are bad"); Russia is in trouble (realized in the topics: the collapse of the USSR, the glorious past is forgotten, enemies are all around, the dominance of emigrants, the deplorable condition of the village, former scientists and military men have become homeless); Pride in the country and belief in its rescue. The basic attitudes identified in the article reflect a certain layer of national socio-political everyday consciousness, which is most fully embodied in the older generation formed in the Soviet period. Such attitudes have distant origins and have settled in the deep layers of consciousness, or perhaps the subconscious. They have features of archetypes, do not disappear completely, and are reflected in different forms of behavior, including written speech activity. The presented material may be of interest not only to philologists but also to specialists in related disciplines such as sociologists, historians, philosophers and psychologists.
The article discusses the content of the first Russian manual on pediatric traumatology and the key points of the biography and work of its author, Nikolai G. Dame. The introduction briefly outlines the history of the first Russian manuals on traumatology for adult patients and notes that pediatric traumatology developed within the framework of pediatric surgery. The first part, likely to be of greatest interest to readers, presents previously unpublished reviews by well-known orthopedic traumatologists of the first (1950) and second (1960) editions, which reveal the book's content well. The second part presents the published autobiography written by N.G. Dame, together with an account, based on archival documents, of his work as a military field surgeon during the Great Patriotic War. The final part covers N.G. Dame's role in creating an original school of pediatric trauma surgeons. The merit of N.G. Dame lies in creating a manual that outlines the basics of diagnosing and treating injuries of all types and localizations in children. He believed that a traumatologist should have the necessary knowledge and skills in emergency general surgery, neurotraumatology, and other surgical specialties. He thus anticipated the creation of injury surgery, whose basic provisions were carried over from the practice of military field surgery.