machine learning | Pollux - Fachinformationsdienst Politikwissenschaft

Abstract
This paper investigates how unsupervised machine learning methods might make hermeneutic interpretive text analysis more objective in the social sciences. Through a close examination of the uses of topic modeling, a popular unsupervised approach in the social sciences, it argues that the primary way in which unsupervised learning supports interpretation is by allowing interpreters to discover unanticipated information in larger and more diverse corpora and by improving the transparency of the interpretive process. This view highlights that unsupervised modeling does not eliminate the researchers' judgments from the process of producing evidence for social scientific theories. The paper shows this by distinguishing between two prevalent attitudes toward topic modeling, i.e., topic realism and topic instrumentalism. Under neither can modeling provide social scientific evidence without the researchers' interpretive engagement with the original text materials. Thus the unsupervised text analysis cannot improve the objectivity of interpretation by alleviating the problem of underdetermination in interpretive debate. The paper argues that the sense in which unsupervised methods can improve objectivity is by providing researchers with the resources to justify to others that their interpretations are correct. This kind of objectivity seeks to reduce suspicions in collective debate that interpretations are the products of arbitrary processes influenced by the researchers' idiosyncratic decisions or starting points. The paper discusses this view in relation to alternative approaches to formalizing interpretation and identifies several limitations on what unsupervised learning can be expected to achieve in terms of supporting interpretive work.

Access: Open Access

Article (electronic) #54, 1 November 2022

Hiding opinions from machine learning

In: PNAS Nexus, Volume 1, Issue 5

Waniek, Marcin; Magdy, Walid; Rahwan, Talal; Contractor, Noshir

ISSN: 2752-6542

Abstract
Recent breakthroughs in machine learning and big data analysis are allowing our online activities to be scrutinized at an unprecedented scale, and our private information to be inferred without our consent or knowledge. Here, we focus on algorithms designed to infer the opinions of Twitter users toward a growing number of topics, and consider the possibility of modifying the profiles of these users in the hope of hiding their opinions from such algorithms. We ran a survey to understand the extent of this privacy threat, and found evidence suggesting that a significant proportion of Twitter users wish to avoid revealing at least some of their opinions about social, political, and religious issues. Moreover, our participants were unable to reliably identify the Twitter activities that reveal one's opinion to such algorithms. Given these findings, we consider the possibility of fighting AI with AI, i.e., instead of relying on human intuition, people may have a better chance at hiding their opinion if they modify their Twitter profiles following advice from an automated assistant. We propose a heuristic that identifies which Twitter accounts the users should follow or mention in their tweets, and show that such a heuristic can effectively hide the user's opinions. Altogether, our study highlights the risk associated with developing machine learning algorithms that analyze people's profiles, and demonstrates the potential to develop countermeasures that preserve the basic right of choosing which of our opinions to share with the world.

Access: Open Access

Article (electronic) #55, 14 September 2023

Machine Learning for Data Linkage

In: International Journal of Population Data Science (IJPDS), Volume 8, Issue 2

Ellum, Rhosanna; Lewis, Alex; Xhaferaj, Kristina; Shipsey, Rachel; Račinskij, Viktor; White, Zoe

ISSN: 2399-4908

Data linkage traditionally uses deterministic and probabilistic methods. Alternatively, machine learning methods can be applied as classification algorithms, using the data to inform decisions. This project compared the quality, in terms of precision and recall, of traditional methods with selected machine learning methods when applied to a standard linkage problem.
Two supervised methods, gradient boosted trees (GBT) and multiple layered perceptron classifier (MLPC), and one unsupervised method, maximum entropy classification (MEC), were implemented. The England and Wales 2021 Census to Census Coverage Survey (CCS) linkage was used as a gold-standard (GS) linked dataset to provide training samples for the supervised methods as well as testing samples for all methods. The F1 score (harmonic mean of precision and recall) was used to compare the performance of the models and to determine the optimal parameters and thresholds.
The Splink implementation of Fellegi-Sunter with Expectation Maximisation was used as a baseline for comparison.
The methods, trained on a sample of the GS, were used to link census and CCS data. All methods performed well with MEC achieving the highest precision (99.79%) but lowest recall (96.36%). The MLPC model achieved the highest F1 score (98.94%).
To understand the implications of not retraining supervised models for each dataset, the models were also used to link Census to a health dataset. The supervised models were not retrained using the health data; instead, the optimised GS models were applied. MEC had the lowest precision (96.51%) but the highest recall (98.48%) and highest F1 score (97.49%). With F1 scores of 96.99% and 96.14% respectively, the GBT and MLPC supervised models were not far behind in performance, despite not being trained using health data.
We have shown that machine learning methods can be used effectively for data linkage problems. Unsurprisingly, supervised models perform best when trained on and applied to the same data. Further research into generic training may allow us to use both supervised and unsupervised machine learning models for future data linkage.
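The abstract above defines the F1 score as the harmonic mean of precision and recall. As a minimal sketch of that formula (the function name `f1_score` and the variable names are illustrative assumptions, not taken from the project's code), the reported precision and recall for MEC on the Census-to-health linkage can be plugged in to check that they agree with the reported F1 score of 97.49% up to rounding:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Convention assumed here: F1 is 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract for MEC on the Census-to-health linkage:
# precision 96.51%, recall 98.48%.
mec_health_f1 = f1_score(0.9651, 0.9848)
print(f"MEC F1 on the health linkage: {mec_health_f1:.2%}")
```

Because the inputs are themselves rounded to two decimal places, the recomputed value can differ from the published one in the last digit.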

Access: Open Access

Article (electronic) #56, 2023

Counterfeit Currency Detection using Machine Learning

In: Beigh, T. M., Arivazagan, J., & Venkatesan, V. P. (2023). Counterfeit Currency Detection using Machine Learning. Journal of Emerging Technologies and Innovative Research, 10(3), 356–358. ISSN-2349-5162

Beigh, Tabiya; J., Arivazagan; Venkatesan, V. Prasanna

Access: Open Access

SSRN

Article (electronic) #57, 25 May 2010