Die Analyse kategorialer Daten: anwendungsorientierte Einführung in Logit-Modellierung und kategoriale Regression
In: Lehr- und Handbücher der Statistik
In: Lehr- und Handbücher der Statistik
In: Psychometrika, Vol. 87, No. 4, pp. 1238-1269
A comprehensive class of models is proposed that can be used for continuous, binary, ordered categorical and count-type responses. The difficulty of items is described by difficulty functions, which replace the item difficulty parameters that are typically used in item response models. They crucially determine the response distribution and make the models very flexible with regard to the range of distributions that are covered. The model class contains several widely used models, such as the binary Rasch model and the graded response model, as special cases, allows for simplifications, and offers a distribution-free alternative to count-type items. A major strength of the models is that they can be used for mixed item formats, when different types of items are combined to measure abilities or attitudes. This is an immediate consequence of the comprehensive modeling approach, in which difficulty functions automatically adapt to the response distribution. Basic properties of the model class are shown. Several real data sets are used to illustrate the flexibility of the models.
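As a point of reference for one special case named in the abstract above, the binary Rasch model gives the probability of a positive response as a logistic function of the difference between person ability and item difficulty. The following is a minimal sketch of that textbook formula only, not of the article's difficulty functions; the function and parameter names are illustrative assumptions.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a positive response under the binary Rasch model.

    P(Y = 1) = exp(ability - difficulty) / (1 + exp(ability - difficulty))
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


if __name__ == "__main__":
    # A person whose ability equals the item difficulty succeeds with probability 0.5.
    print(rasch_probability(ability=0.0, difficulty=0.0))
```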
In: WIREs Computational Statistics, Vol. 14, No. 2, pp. 1-28
Ordinal models can be seen as being composed of simpler, in particular binary, models. This view of ordinal models makes it possible to derive a taxonomy of models that includes basic ordinal regression models, models with more complex parameterizations, the class of hierarchically structured models, and the more recently developed finite mixture models. The structured overview that is given covers existing models and shows how models can be extended to account for further effects of explanatory variables. Particular attention is given to the modeling of additional heterogeneity such as dispersion effects. The modeling is embedded into the framework of response styles, and the exact meaning of heterogeneity terms in ordinal models is investigated. It is shown that the meaning of terms is crucially determined by the type of model that is used. Moreover, it is demonstrated how models with a complex category-specific effect structure can be simplified to obtain simpler models that fit sufficiently well. The fitting of models is illustrated by use of a real data set, and a short overview of existing software is given.
In: Journal of Classification, Vol. 39, No. 2, pp. 241-263
Existing ordinal trees and random forests typically use scores that are assigned to the ordered categories, which implies that a higher scale level is used. Versions of ordinal trees are proposed that take the scale level seriously and avoid the assignment of artificial scores. The construction principle is based on an investigation of the binary models that are implicitly used in parametric ordinal regression. These building blocks can be fitted by trees and combined in a similar way as in parametric models. The obtained trees use the ordinal scale level only. Since binary trees and random forests are constituent elements of the proposed trees, one can exploit the wide range of binary trees that have already been developed. A further topic is the potentially poor performance of random forests, which seems to have been neglected in the literature. Ensembles that include parametric models are proposed to obtain prediction methods that tend to perform well in a wide range of settings. The performance of the methods is evaluated empirically by using several data sets.
In: International Statistical Review, Vol. 89, No. 1, pp. 18-35
Appropriate modelling of Likert-type items should account for the scale level and the specific role of the neutral middle category, which is present in most Likert-type items that are in common use. Powerful hierarchical models that account for both aspects are proposed. To avoid biased estimates, the models separate the neutral category when modelling the effects of explanatory variables on the outcome. The main model that is advocated uses binary response models as building blocks in a hierarchical way. It has the advantage that it can be easily extended to include response style effects and non-linear smooth effects of explanatory variables. By simple transformation of the data, available software for binary response variables can be used to fit the model. The proposed hierarchical model can be used to investigate the effects of covariates on single Likert-type items and also for the analysis of a combination of items. For both cases, estimation tools are provided. The usefulness of the approach is illustrated by applying the methodology to a large data set.
In: Sociological methodology, Vol. 51, No. 1, pp. 86-111
ISSN: 1467-9531
In this article, a modeling strategy is proposed that accounts for heterogeneity in nominal responses that is typically ignored when using common multinomial logit models. Heterogeneity can arise from unobserved variance heterogeneity, but it may also represent uncertainty in choosing from alternatives or, more generally, result from varying coefficients determined by effect modifiers. It is demonstrated that the bias in parameter estimation in multinomial logit models can be substantial if heterogeneity is present but ignored. The modeling strategy avoids biased estimates and allows researchers to investigate which variables determine uncertainty in choice behavior. Several applications demonstrate the usefulness of the model.
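For orientation, the common multinomial logit model that the abstract above takes as its baseline turns one linear predictor per alternative into choice probabilities. The sketch below shows only that standard baseline, not the article's heterogeneity extension; the function name and the convention of passing one predictor per alternative (with the reference alternative fixed at 0) are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def multinomial_logit_probs(linear_predictors: Sequence[float]) -> List[float]:
    """Choice probabilities P(Y = r) = exp(eta_r) / sum_s exp(eta_s).

    Subtracting the maximum predictor before exponentiating avoids overflow
    and leaves the probabilities unchanged.
    """
    largest = max(linear_predictors)
    weights = [math.exp(eta - largest) for eta in linear_predictors]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three alternatives; the first is the reference with predictor 0.
    print(multinomial_logit_probs([0.0, 0.5, -1.0]))
```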
In: Statistical papers, Vol. 29, No. 1, pp. 257-269
ISSN: 1613-9798
In: Political analysis: PA ; the official journal of the Society for Political Methodology and the Political Methodology Section of the American Political Science Association, pp. 1-18
ISSN: 1476-4989
Valence is a crucial concept in studying spatial voting and party competition. The widely adopted approach is to rely on intercepts of vote choice models and to infer, based on their size and direction, how valence affects party strategies in empirical settings. The approach suffers from fundamental statistical flaws. This contribution provides the statistical fundamentals to advance the empirical modeling of valence. It proposes an appropriate modeling approach to interpret intercepts as valences and alternate specifications to parameterize the effects of valence.
In: Statistical Methods & Applications, Vol. 32, No. 1, pp. 129-148
Different voters behave differently at the polls, different students make different university choices, or different countries choose different health care systems. Many research questions important to social scientists concern choice behavior, which involves dealing with nominal dependent variables. Drawing on the principle of maximum random utility, we propose applying a flexible and general heterogeneous multinomial logit model to study differences in choice behavior. The model systematically accounts for heterogeneity that classical models do not capture, indicates the strength of heterogeneity, and permits examining which explanatory variables cause heterogeneity. As the proposed approach allows incorporating theoretical expectations about heterogeneity into the analysis of nominal dependent variables, it can be applied to a wide range of research problems. Our empirical example uses individual-level survey data to demonstrate the benefits of the model in studying heterogeneity in electoral decisions.
In: International Statistical Review, Vol. 90, No. 2, pp. 306-327
The potential of location-shift models to find adequate models between the proportional odds model and the non-proportional odds model is investigated. It is demonstrated that these models are very useful in ordinal modelling. While proportional odds models are often too simple, non-proportional odds models are typically unnecessarily complicated and seem widely dispensable. In addition, the class of location-shift models is extended to allow for smooth effects. The additive location-shift model contains two functions for each explanatory variable, one for the location and one for dispersion. It is much sparser than hard-to-handle additive models with category-specific covariate functions but more flexible than common vector generalised additive models. An R package is provided that is able to fit parametric and additive location-shift models.
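The proportional odds model that the abstract above uses as its simple endpoint can be written as P(Y <= r | x) = F(theta_r - eta), with F the logistic function, ordered thresholds theta_r, and a single linear predictor eta shared by all categories. The sketch below computes category probabilities under that one common sign convention; it does not implement the location-shift extension, and all names are illustrative assumptions.

```python
import math
from typing import List, Sequence


def logistic(x: float) -> float:
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


def proportional_odds_probs(thresholds: Sequence[float], eta: float) -> List[float]:
    """Category probabilities under the cumulative logit model.

    thresholds must be strictly increasing; k thresholds give k + 1 categories.
    """
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(Y <= r); the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


if __name__ == "__main__":
    print(proportional_odds_probs(thresholds=[-1.0, 1.0], eta=0.3))
```

Because eta is the same for every threshold, a covariate shifts all cumulative probabilities in the same direction, which is the restriction that the non-proportional odds model relaxes.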
In: Statistics and Computing, Vol. 31, No. 2, pp. 1-12
In binary and ordinal regression one can distinguish between a location component and a scaling component. While the former determines the location within the range of the response categories, the latter indicates variance heterogeneity. Ignoring a scaling component that is present has been shown to produce misleading effects, so it is important to account for potential scaling effects in the regression model, which is not possible with available recursive partitioning methods. The proposed recursive partitioning method yields two trees: one for the location and one for the scaling. They show in a simple interpretable way how variables interact to determine the binary or ordinal response. The developed algorithm controls for the global significance level and automatically selects the variables that have an impact on the response. The modeling approach is illustrated by several real-world applications.
In: Statistica Neerlandica: journal of the Netherlands Society for Statistics and Operations Research, Vol. 72, No. 3, pp. 224-245
ISSN: 1467-9574
Mixture models for ordinal responses in the tradition of CUB models use the uniform distribution to account for the uncertainty of respondents. A model is proposed that uses more flexible distributions in the uncertainty component: the discretized Beta distribution makes it possible to account for response styles, in particular the preference for middle or extreme categories. The proposal is compared with traditional CUB models in simulation studies, and its use is illustrated by two applications.
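The traditional CUB model referred to above mixes a shifted binomial distribution (the preference component) with a discrete uniform distribution (the uncertainty component). The sketch below computes that standard mixture only; the discretized Beta component proposed in the article is not implemented, and the function and parameter names are illustrative assumptions.

```python
from math import comb
from typing import List


def cub_probs(m: int, pi: float, xi: float) -> List[float]:
    """Category probabilities of a standard CUB model on categories 1..m.

    P(R = r) = pi * C(m-1, r-1) * (1 - xi)**(r-1) * xi**(m-r) + (1 - pi) / m
    pi weights the shifted binomial part, (1 - pi) the uniform part.
    """
    if m < 2 or not 0.0 <= pi <= 1.0 or not 0.0 <= xi <= 1.0:
        raise ValueError("need m >= 2 and pi, xi in [0, 1]")
    return [
        pi * comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
        + (1.0 - pi) / m
        for r in range(1, m + 1)
    ]


if __name__ == "__main__":
    print(cub_probs(m=5, pi=0.7, xi=0.3))
```

Setting pi to 0 reduces the model to the pure uniform distribution, which is the component the article replaces with a more flexible alternative.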
In: The journal of mathematical sociology, Vol. 28, No. 2, pp. 125-146
ISSN: 1545-5874
In: ZA-Information / Zentralarchiv für Empirische Sozialforschung, No. 29, pp. 81-93
'For modeling discrete durations with only a few distinct values, discrete hazard rate models prove to be an adequate instrument. Approaches of this kind are motivated by a sequential mechanism. Using the program system GLAMOUR, a comparison is carried out between the former West Germany and Hungary with regard to the characteristic 'first sexual contact of adolescents'.' (author's abstract)
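The sequential mechanism behind discrete hazard models can be made concrete: the hazard at period t is the probability that the event occurs in t given that it has not occurred earlier, so P(T = t) is the hazard at t times the probability of surviving all earlier periods. The sketch below illustrates that general identity only; it is unrelated to the GLAMOUR program system, and the names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def discrete_time_distribution(hazards: Sequence[float]) -> Tuple[List[float], float]:
    """Turn discrete hazards into event-time probabilities.

    P(T = t) = hazard_t * prod_{s < t} (1 - hazard_s)
    Returns the probabilities for each period and the probability that no
    event occurs within the observed periods.
    """
    probs = []
    survival = 1.0  # probability of no event before the current period
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        probs.append(h * survival)
        survival *= 1.0 - h
    return probs, survival


if __name__ == "__main__":
    probs, still_at_risk = discrete_time_distribution([0.5, 0.5])
    print(probs, still_at_risk)
```

Each period contributes one binary decision (event versus continuation), which is why such models can be fitted with tools for binary responses.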
Introduction -- Univariate description and exploration of data -- Multivariate description and exploration -- Probability theory -- Discrete random variables -- Continuous random variables -- More on random variables and distributions -- Multidimensional random variables -- Parameter estimation -- Hypothesis testing -- Special testing problems -- Regression analysis -- Analysis of variance -- Time series -- Introduction to R -- Tables.