"Text data is important for many domains, from healthcare to marketing to the digital humanities, but specialized approaches are necessary to create features for machine learning from language. Supervised Machine Learning for Text Analysis in R explains how to preprocess text data for modeling, train models, and evaluate model performance using tools from the tidyverse and tidymodels ecosystem. Models like these can be used to make predictions for new observations, to understand what natural language features or characteristics contribute to differences in the output, and more. If you are already familiar with the basics of predictive modeling, use the comprehensive, detailed examples in this book to extend your skills to the domain of natural language processing"--
"Individual-Based Models of Cultural Evolution shows readers how to create individual-based models of cultural evolution using the programming language R. The field of cultural evolution has emerged in the last few decades as a thriving, interdisciplinary effort to understand cultural change and cultural diversity within an evolutionary framework and using evolutionary tools, concepts and methods"--
Modern Applied Regressions creates an intricate and colorful mural with mosaics of categorical and limited response variable (CLRV) models using both Bayesian and Frequentist approaches. Written for graduate students, junior researchers, and quantitative analysts in behavioral, health, and social sciences, this text provides details for doing Bayesian and frequentist data analysis of CLRV models. Each chapter can be read and studied separately with R coding snippets and template interpretation for easy replication. Along with the doing part, the text provides basic and accessible statistical theories behind these models and uses a narrative style to recount their origins and evolution. This book first scaffolds both Bayesian and frequentist paradigms for regression analysis, and then moves on to different types of categorical and limited response variable models, including binary, ordered, multinomial, count, and survival regression. Each of the middle four chapters discusses a major type of CLRV regression that subsumes an array of important variants and extensions. The discussion of all major types usually begins with the history and evolution of the prototypical model, followed by the formulation of basic statistical properties and an elaboration on the doing part of the model and its extension. The doing part typically includes R code, results, and their interpretation. The last chapter discusses advanced modeling and predictive techniques--multilevel modeling, causal inference and propensity score analysis, and machine learning--that are largely built with the toolkits designed for the CLRV models previously covered. The online resources for this book, including R and Stan code and supplementary notes, can be accessed at https://sites.google.com/site/socjunxu/home/statistics/modernapplied-regressions.
Web technologies are increasingly relevant to scientists working with data, for both accessing data and creating rich dynamic and interactive displays. The XML and JSON data formats are widely used in Web services, regular Web pages and JavaScript code, and visualization formats such as SVG and KML for Google Earth and Google Maps. In addition, scientists use HTTP and other network protocols to scrape data from Web pages, access REST and SOAP Web Services, and interact with NoSQL databases and text search applications. This book provides a practical hands-on introduction to these technologies, including high-level functions the authors have developed for data scientists. It describes strategies and approaches for extracting data from HTML, XML, and JSON formats and how to programmatically access data from the Web. Along with these general skills, the authors illustrate several applications that are relevant to data scientists, such as reading and writing spreadsheet documents both locally and via GoogleDocs, creating interactive and dynamic visualizations, building spatial-temporal displays with Google Earth, and generating code from descriptions of data structures to read and write data. These topics demonstrate the rich possibilities and opportunities to do new things with these modern technologies. The book contains many examples and case studies that readers can use directly and adapt to their own work.
Frontmatter -- Things People do with Censored Data that are Just Wrong -- Three Approaches for Censored Data -- Reporting Limits -- Reporting, Storing, and Using Censored Data -- Plotting Censored Data -- Computing Summary Statistics and Totals -- Computing Interval Estimates -- What can be done when all Data are below the Reporting Limit? -- Comparing Two Groups -- Comparing Three or more Groups -- Correlation -- Regression and Trends -- Multivariate Methods for Censored Data -- The NADA for R Software -- Appendix: Datasets -- References -- Index -- Statistics in Practice.
Like its bestselling predecessor, Multilevel Modeling Using R, Second Edition provides the reader with a helpful guide to conducting multilevel data modeling using the R software environment. After reviewing standard linear models, the authors present the basics of multilevel models and explain how to fit these models using R. They then show how to employ multilevel modeling with longitudinal data and demonstrate the valuable graphical options in R. The book also describes models for categorical dependent variables in both single level and multilevel data. This thoroughly updated revision gives the reader state-of-the-art tools to launch their own investigations in multilevel modeling and gain insight into their research.
This book is a comprehensive introduction to the power of R for marketing research professionals. The text describes statistical models from a conceptual point of view with a minimal amount of mathematics, assuming only an introductory knowledge of statistics. Hands-on chapters accelerate the learning curve by asking readers to interact with R from the beginning. Core topics include the R language, basic statistics, linear modeling and data visualization, which is presented as an integral part of the analysis. Later chapters cover more advanced topics but are intended to be accessible to all analysts. These sections examine logistic regression, customer segmentation, hierarchical linear modeling, market basket analysis, structural equation modeling, and conjoint analysis in R. The text uniquely presents Bayesian models with a minimally complex approach, demonstrating and explaining Bayesian methods along with traditional analyses for analysis of variance, linear models, and conjoint analysis based on metrics and choices. With its emphasis on data visualization, model evaluation, and developing statistical intuition, this book provides guidance for any analyst looking to develop or improve skills in R for marketing applications.
What's in this book (Read this first!) -- Part I The basics: models, probability, Bayes' rule and R: Introduction: credibility, models, and parameters; The R programming language; What is this stuff called probability?; Bayes' rule -- Part II All the fundamentals applied to inferring a binomial probability: Inferring a binomial probability via exact mathematical analysis; Markov chain Monte Carlo; JAGS; Hierarchical models; Model comparison and hierarchical modeling; Null hypothesis significance testing; Bayesian approaches to testing a point ("Null") hypothesis; Goals, power, and sample size; Stan -- Part III The generalized linear model: Overview of the generalized linear model; Metric-predicted variable on one or two groups; Metric predicted variable with one metric predictor; Metric predicted variable with multiple metric predictors; Metric predicted variable with one nominal predictor; Metric predicted variable with multiple nominal predictors; Dichotomous predicted variable; Nominal predicted variable; Ordinal predicted variable; Count predicted variable; Tools in the trunk -- Bibliography -- Index