Language models in Russian linguistics
In: Rossijskij gumanitarnyj žurnal: Liberal arts in Russia, Band 6, Heft 2, S. 165
ISSN: 2312-6442
10473 Ergebnisse
Sortierung:
In: Rossijskij gumanitarnyj žurnal: Liberal arts in Russia, Band 6, Heft 2, S. 165
ISSN: 2312-6442
Convolutional Neural Networks (CNNs) have shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and it has mainly focused on static classification tasks, such as sentence classification for Sentiment Analysis or relation extraction. In this work, we study the application of CNNs to language modeling, a dynamic, sequential prediction task that needs models to capture local as well as long-range dependency information. Our contribution is twofold. First, we show that CNNs achieve 11-26% better absolute performance than feed-forward neural language models, demonstrating their potential for language representation even in sequential tasks. As for recurrent models, our model outperforms RNNs but is below state of the art LSTM models. Second, we gain some understanding of the behavior of the model, showing that CNNs in language act as feature detectors at a high level of abstraction, like in Computer Vision, and that the model can profitably use information from as far as 16 words before the target. ; This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 655577 (LOVe); ERC 2011 Starting Independent Research Grant n. 283554 (COMPOSES) and the Erasmus Mundus Scholarship for Joint Master Programs.
BASE
SSRN
In: Political analysis: PA ; the official journal of the Society for Political Methodology and the Political Methodology Section of the American Political Science Association, S. 1-5
ISSN: 1476-4989
Abstract
A recent paper by Häffner et al. (2023, Political Analysis 31, 481–499) introduces an interpretable deep learning approach for domain-specific dictionary creation, where it is claimed that the dictionary-based approach outperforms finetuned language models in predictive accuracy while retaining interpretability. We show that the dictionary-based approach's reported superiority over large language models, BERT specifically, is due to the fact that most of the parameters in the language models are excluded from finetuning. In this letter, we first discuss the architecture of BERT models, then explain the limitations of finetuning only the top classification layer, and lastly we report results where finetuned language models outperform the newly proposed dictionary-based approach by 27% in terms of
$R^2$
and 46% in terms of mean squared error once we allow these parameters to learn during finetuning. Researchers interested in large language models, text classification, and text regression should find our results useful. Our code and data are publicly available.
SSRN
SSRN
Comunicació presentada al 58th Annual Meeting of the Association for Computational Linguistics celebrat del 5 al 10 de juliol de 2020 de manera virtual. ; Language models keep track of complex information about the preceding context – including, e.g., syntactic relations in a sentence. We investigate whether they also capture information beneficial for resolving pronominal anaphora in English. We analyze two state of the art models with LSTM and Transformer architectures, via probe tasks and analysis on a coreference annotated corpus. The Transformer outperforms the LSTM in all analyses. Our results suggest that language models are more successful at learning grammatical constraints than they are at learning truly referential information, in the sense of capturing the fact that we use language to refer to entities in the world. However, we find traces of the latter aspect, too. ; This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 715154), and from the Spanish Ramón y Cajal programme (grant RYC-2015-18907). We thankfully acknowledge the computer resources at CTE-POWER and the technical support provided by Barcelona Supercomputing Center (RES-IM2019-3-0006).
BASE
In: Frontiers in political science, Band 5
ISSN: 2673-3145
Large Language Models (LLMs) are a type of artificial intelligence that uses information from very large datasets to model the use of language and generate content. While LLMs like GPT-3 have been used widely in many applications, the recent public release of OpenAI's ChatGPT has opened more debate about the potential uses and abuses of LLMs. In this paper, we provide a brief introduction to LLMs and discuss their potential application in political science and political methodology. We use two examples of LLMs from our recent research to illustrate how LLMs open new areas of research. We conclude with a discussion of how researchers can use LLMs in their work, and issues that researchers need to be aware of regarding using LLMs in political science and political methodology.
In: Risk, hazards & crisis in public policy
ISSN: 1944-4079
AbstractThe widespread embrace of Large Language Models (LLMs) integrated with chatbot interfaces, such as ChatGPT, represents a potentially critical moment in the development of risk communication and management. In this article, we consider the implications of the current wave of LLM‐based chat programs for risk communication. We examine ChatGPT‐generated responses to 24 different hazard situations. We compare these responses to guidelines published for public consumption on the US Department of Homeland Security's Ready.gov website. We find that, although ChatGPT did not generate false or misleading responses, ChatGPT responses were typically less than optimal in terms of their similarity to guidances from the federal government. While delivered in an authoritative tone, these responses at times omitted important information and contained points of emphasis that were substantially different than those from Ready.gov. Moving forward, it is critical that researchers and public officials both seek to harness the power of LLMs to inform the public and acknowledge the challenges represented by a potential shift in information flows away from public officials and experts and towards individuals.
SSRN
SSRN
Videolectures are currently being digitised all over the world for its enormous value as reference resource. Many of these lectures are accompanied with slides. The slides offer a great opportunity for improving ASR systems performance. We propose a simple yet powerful extension to the linear interpolation of language models for adapting language models with slide information. Two types of slides are considered, correct slides, and slides automatic extracted from the videos with OCR. Furthermore, we compare both time aligned and unaligned slides. Results report an improvement of up to 3.8 % absolute WER points when using correct slides. Surprisingly, when using automatic slides obtained with poor OCR quality, the ASR system still improves up to 2.2 absolute WER points. ; The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no 287755 (transLectures). Also supported by the Spanish Government (Plan E, iTrans2 TIN2009-14511). ; Martínez-Villaronga, A.; Del Agua Teba, MA.; Andrés Ferrer, J.; Juan Císcar, A. (2013). Language model adaptation for video lectures transcription. En Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IInstitute of Electrical and Electronics Engineers (IEEE). 8450-8454. https://doi.org/10.1109/ICASSP.2013.6639314 ; S ; 8450 ; 8454
BASE
SSRN
In: The Chinese University of Hong Kong Faculty of Law Research Paper No. 2024-04
SSRN
SSRN