"This classroom-tested book fills a major gap in graduate- and professional-level data science and social science education. It can be used to train a new generation of social data scientists to tackle real-world problems, and to improve the skills and competencies of applied social scientists and public policy practitioners. It empowers you to use the massive and rapidly growing amounts of available data to interpret economic and social activities in a scientific and rigorous manner."--
Cover -- Half Title -- Title -- Copyright -- Contents -- Preface -- Editors -- Contributors

1: Introduction
  1.1: Why this book?
  1.2: Defining big data and its value
  1.3: Social science, inference, and big data
  1.4: Social science, data quality, and big data
  1.5: New tools for new data
  1.6: The book's "use case"
  1.7: The structure of the book
    1.7.1: Part I: Capture and curation
    1.7.2: Part II: Modeling and analysis
    1.7.3: Part III: Inference and ethics
  1.8: Resources

I: Capture and Curation

2: Working with Web Data and APIs
  2.1: Introduction
  2.2: Scraping information from the web
    2.2.1: Obtaining data from the HHMI website
    2.2.2: Limits of scraping
  2.3: New data in the research enterprise
  2.4: A functional view
    2.4.1: Relevant APIs and resources
    2.4.2: RESTful APIs, returned data, and Python wrappers
  2.5: Programming against an API
  2.6: Using the ORCID API via a wrapper
  2.7: Quality, scope, and management
  2.8: Integrating data from multiple sources
    2.8.1: The Lagotto API
    2.8.2: Working with a corpus
  2.9: Working with the graph of relationships
    2.9.1: Citation links between articles
    2.9.2: Categories, sources, and connections
    2.9.3: Data availability and completeness
    2.9.4: The value of sparse dynamic data
  2.10: Bringing it together: Tracking pathways to impact
    2.10.1: Network analysis approaches
    2.10.2: Future prospects and new data sources
  2.11: Summary
  2.12: Resources
  2.13: Acknowledgements and copyright

3: Record Linkage
  3.1: Motivation
  3.2: Introduction to record linkage
  3.3: Preprocessing data for record linkage
  3.4: Indexing and blocking
  3.5: Matching
    3.5.1: Rule-based approaches
    3.5.2: Probabilistic record linkage
    3.5.3: Machine learning approaches to linking
    3.5.4: Disambiguating networks
We develop a preliminary version of an Integrated Longitudinal Business Database (ILBD) that combines administrative records and survey data for all employer and nonemployer business units in the United States. Unlike other large-scale business databases, the ILBD tracks business transitions from nonemployer to employer status. This feature of the ILBD opens a new frontier for the study of business formation, early life-cycle dynamics, and the precursors to job creation in the U.S. economy. As of 2000, there are 5.4 million nonfarm business firms with employees and another 15.5 million with no employees. Our analysis focuses on 40 industries that account for nearly half of nonemployers and 36 percent of nonemployer revenues. Within these industries, nonemployers account for 14 percent of business revenues. About 220,000 of the seven million nonemployers in our selected industries hire workers and migrate to the employer universe over a three-year horizon. These "Migrants" account for 20 percent of revenue among young employers (those three years or less since first hire). Compared with other nonemployers, the revenue of Migrants grows very rapidly in the year prior to, and the year of, transition to employer status.
Frontmatter -- Contents -- Prefatory Note

Introduction: Big Data for Twenty-First-Century Economic Statistics: The Future Is Now

I. Toward Comprehensive Use of Big Data in Economic Statistics
  1. Reengineering Key National Economic Indicators
  2. Big Data in the US Consumer Price Index
  3. Improving Retail Trade Data Products Using Alternative Data Sources
  4. From Transaction Data to Economic Statistics
  5. Improving the Accuracy of Economic Measurement with Multiple Data Sources

II. Uses of Big Data for Classification
  6. Transforming Naturally Occurring Text Data into Economic Statistics
  7. Automating Response Evaluation for Franchising Questions on the 2017 Economic Census
  8. Using Public Data to Generate Industrial Classification Codes

III. Uses of Big Data for Sectoral Measurement
  9. Nowcasting the Local Economy
  10. Unit Values for Import and Export Price Indexes
  11. Quantifying Productivity Growth in the Delivery of Important Episodes of Care within the Medicare Program Using Insurance Claims and Administrative Data
  12. Valuing Housing Services in the Era of Big Data

IV. Methodological Challenges and Advances
  13. Off to the Races
  14. A Machine Learning Analysis of Seasonal and Cyclical Sales in Weekly Scanner Data
  15. Estimating the Benefits of New Products

Contributors -- Author Index -- Subject Index