Search results
17 results
English-Bulgarian and Bulgarian-English dictionary of intelligence and counterintelligence terminology
In: "Strogo sekretno" series
Bulgarian-English parallel corpus MaCoCu-bg-en 1.0
In: http://hdl.handle.net/11356/1521
The Bulgarian-English parallel corpus MaCoCu-bg-en 1.0 was built by crawling the ".bg" and ".бг" internet top-level domains in 2021, extending the crawl dynamically to other domains as well. The entire crawling process was carried out by the MaCoCu crawler (https://github.com/macocu/MaCoCu-crawler). Websites containing documents in both target languages were identified and processed using the tool Bitextor (https://github.com/bitextor/bitextor). Considerable effort was devoted to cleaning the extracted text to provide a high-quality parallel corpus. This was achieved by removing boilerplate and near-duplicate paragraphs, as well as documents that are not in one of the targeted languages. Document and segment alignment as implemented in Bitextor were carried out, and BicleanerAI (https://github.com/bitextor/bicleaner-ai) and Bifixer (https://github.com/bitextor/bifixer) were used for fixing, cleaning, and deduplicating the final version of the corpus.
While the TXT format consists solely of pairs of source and target segments (one or several sentences), each segment pair in the TMX format is accompanied by the following metadata:
- source and target document URL;
- quality score as provided by the tool BicleanerAI;
- translation direction identification: the source segment in each segment pair was identified by using a probabilistic model;
- personal information identification ("biroamer-entities"): segments containing personal information are flagged, so final users of the corpus can decide whether to use these segments;
- language variants: the language variant of English (British or American) was identified for every segment pair at document and domain level.
Notice and take down: Should you consider that our data contains material that is owned by you and should therefore not be reproduced here, please:
(1) Clearly identify yourself, with detailed contact data such as an address, telephone number or email address at which you can be contacted.
(2) Clearly identify the copyrighted work claimed to be infringed.
(3) Clearly identify the material that is claimed to be infringing and information reasonably sufficient in order to allow us to locate the material.
(4) Please write to the contact person for this resource whose email is available in the full item record.
We will comply with legitimate requests by removing the affected sources from the next release of the corpus.
This action has received funding from the European Union's Connecting Europe Facility 2014-2020 - CEF Telecom, under Grant Agreement No. INEA/CEF/ICT/A2020/2278341. This communication reflects only the author's view. The Agency is not responsible for any use that may be made of the information it contains.
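The per-segment metadata described for the TMX format can be used to filter segment pairs, for example by quality score. The following is only a minimal sketch under stated assumptions: the property name "score-bicleaner-ai", the placement of properties directly under each translation unit, and the sample data are all hypothetical and should be checked against the actual TMX file.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample, NOT real corpus data. The property name
# "score-bicleaner-ai" is an assumption made for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <prop type="score-bicleaner-ai">0.93</prop>
      <tuv xml:lang="bg"><seg>Добър ден.</seg></tuv>
      <tuv xml:lang="en"><seg>Good afternoon.</seg></tuv>
    </tu>
    <tu>
      <prop type="score-bicleaner-ai">0.21</prop>
      <tuv xml:lang="bg"><seg>Начало</seg></tuv>
      <tuv xml:lang="en"><seg>Contact us</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_pairs(tmx_text, min_score=0.5, score_prop="score-bicleaner-ai"):
    """Return (score, {lang: segment}) for units scoring at least min_score."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        # Collect the unit-level properties into a dictionary.
        props = {p.get("type"): (p.text or "") for p in tu.findall("prop")}
        try:
            score = float(props[score_prop])
        except (KeyError, ValueError):
            continue  # skip units without a usable score
        if score < min_score:
            continue
        segs = {
            tuv.get(XML_LANG): (tuv.findtext("seg") or "")
            for tuv in tu.findall("tuv")
        }
        pairs.append((score, segs))
    return pairs


kept = read_pairs(SAMPLE_TMX, min_score=0.5)
```

The threshold of 0.5 is arbitrary here; a suitable cut-off depends on the intended use of the corpus.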
Slovene-English parallel corpus MaCoCu-sl-en 1.0
In: http://hdl.handle.net/11356/1523
The Slovene-English parallel corpus MaCoCu-sl-en 1.0 was built by crawling the ".si" internet top-level domain in 2021, extending the crawl dynamically to other domains as well. The entire crawling process was carried out by the MaCoCu crawler (https://github.com/macocu/MaCoCu-crawler). Websites containing documents in both target languages were identified and processed using the tool Bitextor (https://github.com/bitextor/bitextor). Considerable effort was devoted to cleaning the extracted text to provide a high-quality parallel corpus. This was achieved by removing boilerplate and near-duplicate paragraphs, as well as documents that are not in one of the targeted languages. Document and segment alignment as implemented in Bitextor were carried out, and BicleanerAI (https://github.com/bitextor/bicleaner-ai) and Bifixer (https://github.com/bitextor/bifixer) were used for fixing, cleaning, and deduplicating the final version of the corpus.
While the TXT format consists solely of pairs of source and target segments (one or several sentences), each segment pair in the TMX format is accompanied by the following metadata:
- source and target document URL;
- quality score as provided by the tool BicleanerAI;
- translation direction identification: the source segment in each segment pair was identified by using a probabilistic model;
- personal information identification ("biroamer-entities"): segments containing personal information are flagged, so final users of the corpus can decide whether to use these segments;
- language variants: the language variant of English (British or American) was identified for every segment pair at document and domain level.
Notice and take down: Should you consider that our data contains material that is owned by you and should therefore not be reproduced here, please:
(1) Clearly identify yourself, with detailed contact data such as an address, telephone number or email address at which you can be contacted.
(2) Clearly identify the copyrighted work claimed to be infringed.
(3) Clearly identify the material that is claimed to be infringing and information reasonably sufficient in order to allow us to locate the material.
(4) Please write to the contact person for this resource whose email is available in the full item record.
We will comply with legitimate requests by removing the affected sources from the next release of the corpus.
This action has received funding from the European Union's Connecting Europe Facility 2014-2020 - CEF Telecom, under Grant Agreement No. INEA/CEF/ICT/A2020/2278341. This communication reflects only the author's view. The Agency is not responsible for any use that may be made of the information it contains.
O STILISTIKI IN NJENEM POMENU: JEZIKOSLOVNA DISCIPLINA V DRUŽBI 21. STOLETJA
In: Teorija in praksa, pp. 203-220
Through an analysis of books on stylistics written in English, German, Czech, Slovak and Croatian, we describe the development of stylistics, its predecessors, its independence from literary studies, and the contemporary situation. We focus on Slovenian linguistic stylistics, based on an analysis and review of entries containing the keyword "stylistics" in the Slovenian bibliographic catalogue Cobiss+. By reviewing and analysing the stylistic publications of Tomo Korošec, who devoted the largest part of his research to media stylistics, we substantiate his contribution to Slovenian theoretical stylistics. The main findings of our comprehensive analysis are that stylistic research in Slovenia has been intense since the 1960s, that an important part of this research relates to the work of Tomo Korošec and that, alongside theoretical stylistics, it is important to include school stylistics as part of general education at all levels.
Keywords: linguistic stylistics, history of stylistics, media stylistics, journalistic stylistics, stylistics of advertising, linguistic education, rhetoric
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
ParlaMint is a multilingual set of comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the corpora are marked as belonging to the COVID-19 period (after October 2019), or being "reference" (before that date). The corpora have extensive metadata, including aspects of the parliament; the speakers (name, gender, MP status, party affiliation, party coalition/opposition); are structured into time-stamped terms, sessions and meetings; with speeches being marked by the speaker and their role (e.g. chair, regular speaker). The speeches also contain marked-up transcriber comments, such as gaps in the transcription, interruptions, applause, etc. Note that some corpora have further information, e.g. the year of birth of the speakers, links to their Wikipedia articles, their membership in various committees, etc. The corpora are encoded according to the Parla-CLARIN TEI recommendation (https://clarin-eric.github.io/parla-clarin/), but have been validated against the compatible, but much stricter ParlaMint schemas. This entry contains the linguistically marked-up version of the corpus, while the text version is available at http://hdl.handle.net/11356/1388. The ParlaMint.ana linguistic annotation includes tokenization, sentence segmentation, lemmatisation, Universal Dependencies part-of-speech, morphological features, and syntactic dependencies, and the 4-class CoNLL-2003 named entities. Some corpora also have further linguistic annotations, such as PoS tagging or named entities according to language-specific schemes, with their corpus TEI headers giving further details on the annotation vocabularies and tools. 
The compressed files include the ParlaMint.ana XML TEI-encoded linguistically annotated corpus; the derived corpus in CoNLL-U with TSV speech metadata; and the vertical files (with registry file), suitable for use with CQP-based concordancers, such as CWB, noSketch Engine or KonText. Also included is the 2.0 release of the data and scripts available at the GitHub repository of the ParlaMint project.
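The CoNLL-U files mentioned above follow the standard ten-column, tab-separated layout, with comment lines starting with "#" and blank lines separating sentences. As a minimal sketch of how such a file could be read, the snippet below parses a tiny hand-written sample; the sample sentence is illustrative and not taken from ParlaMint, and multiword-token or empty-node lines are kept as ordinary rows rather than handled specially.

```python
# The ten standard CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Hand-written sample, NOT real ParlaMint data. Tabs are added with
# "\t".join so that they survive copy and paste.
SAMPLE = "\n".join([
    "# text = We agree.",
    "\t".join(["1", "We", "we", "PRON", "_", "_", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "agree", "agree", "VERB", "_", "_", "0", "root", "_", "SpaceAfter=No"]),
    "\t".join(["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
    "",
])


def read_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dictionaries."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            continue  # ignore malformed rows in this sketch
        current.append(dict(zip(FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


sentences = read_conllu(SAMPLE)
upos_tags = [token["upos"] for token in sentences[0]]
```

For real use, a dedicated CoNLL-U library would likely be a better choice than this hand-rolled reader, since it would also validate the annotation.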
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.1
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the corpora are marked as belonging to the COVID-19 period (from November 1st 2019), or being "reference" (before that date). The corpora have extensive metadata, including aspects of the parliament; the speakers (name, gender, MP status, party affiliation, party coalition/opposition); are structured into time-stamped terms, sessions and meetings; with speeches being marked by the speaker and their role (e.g. chair, regular speaker). The speeches also contain marked-up transcriber comments, such as gaps in the transcription, interruptions, applause, etc. Note that some corpora have further information, e.g. the year of birth of the speakers, links to their Wikipedia articles, their membership in various committees, etc. The corpora are encoded according to the Parla-CLARIN TEI recommendation (https://clarin-eric.github.io/parla-clarin/), but have been validated against the compatible, but much stricter ParlaMint schemas. This entry contains the linguistically marked-up version of the corpus, while the text version is available at http://hdl.handle.net/11356/1432. The ParlaMint.ana linguistic annotation includes tokenization, sentence segmentation, lemmatisation, Universal Dependencies part-of-speech, morphological features, and syntactic dependencies, and the 4-class CoNLL-2003 named entities. Some corpora also have further linguistic annotations, such as PoS tagging or named entities according to language-specific schemes, with their corpus TEI headers giving further details on the annotation vocabularies and tools. 
The compressed files include the ParlaMint.ana XML TEI-encoded linguistically annotated corpus; the derived corpus in CoNLL-U with TSV speech metadata; and the vertical files (with registry file), suitable for use with CQP-based concordancers, such as CWB, noSketch Engine or KonText. Also included is the 2.1 release of the data and scripts available at the GitHub repository of the ParlaMint project. As opposed to the previous version 2.0, this version corrects some errors in various corpora and adds the information on upper / lower house for bicameral parliaments. The vertical files have also been changed to make them easier to use in the concordancers.
Nazaj v šolo: politično branje Althusserja
In: Filozofski vestnik: FV, Vol. 35, No. 1, pp. 73-92
ISSN: 0353-4510
Pregled in analiza aktualnih konceptov socialnega dela z ljudmi z demenco
In: Socialno delo: časopis za teorijo in prakso, Vol. 61, No. 2-3
Overview and analysis of current concepts in social work with people with dementia
The article presents the research project "Long-term care of people with dementia in social work theory and practice", the first Slovene national study in the field of research on the social dimensions of dementia. The first part presents the conceptual background of social work, which is the link between social work with people with dementia and the paradigmatic changes in long-term care. The second part presents the importance of the development of long-term care for people with dementia, the third part presents the purpose and objectives of the research project, and the fourth part elaborates the conceptual background, which is the basic guiding principle of the research in the project. Particular emphasis is placed on the methodological selection of current foreign scientific articles dealing with the topic presented, which have been published in the last twenty years in English. The results of the analysis show that three conceptual orientations prevail in the field of social work with people with dementia: (1) exploration of needs, (2) destigmatisation and anti-discrimination of people with dementia, and (3) participation of people with dementia in processes of help and support. In the concluding part, the author relates the findings of the analysis to the contemporary starting points of social work with people with dementia in Slovenia.
Problematika cenzure in lokalizacije pri prevajanju besedila kultnih kart Yu-Gi-Oh! ; The Issue of Censorship and Localisation in Translations of Card Texts in the Cult Trading Card Game Yu-Gi-Oh!
In: Maribor
We live in a time of technological advances, which makes it seem as if the gap between nations and cultures is shrinking every day. In everyday life we constantly encounter traits or elements of foreign cultures that have become embedded in our own, intertwining different cultures and forming new, hybridized ones. Even though it is getting easier to access other cultures and, consequently, to integrate foreign culture-specific elements into our own, there are certain aspects in which cultures still differ greatly from one another. One of the main differences is the receptiveness of different cultures to certain content. Society's perception of such content varies depending on its culture, which often leads to difficulties in transferring it from one culture to another. This hurdle is often overcome by means of censorship or localisation. This master's thesis discusses the use of both these means of text modification in the translation of card texts in the cult card game Yu-Gi-Oh! Trading Card Game. We focus on the importance of censorship and localisation and on how frequently they occur in the English and German card translations. Yu-Gi-Oh! Trading Card Game originated in Japan and therefore contains many culture-specific elements. During the game's transition from the Eastern to the Western market, many of these elements were either replaced or completely removed. Translating such content requires not only good knowledge of both languages but also considerable originality and flexibility on the translator's part in order to localise successfully and thereby preserve the original idea. The thesis analyses the translation procedures used in the English and German card translations to conceal or change content from the original card texts, and presents the impact of these changes on both the original idea and the gameplay itself.