This poster highlights the United States Census Bureau Data Repository, which preserves and disseminates survey instruments, specifications, data dictionaries, codebooks, and other materials provided by the U.S. Census Bureau. The Inter‐university Consortium for Political and Social Research (ICPSR), the host of this data repository, has also listed additional Census‐related data collections from its larger holdings.
CoreTrustSeal Data Repository certification is meant to demonstrate to researchers that data repositories are taking appropriate measures to ensure sustainable and trustworthy data infrastructures. Certification also improves repository processes and procedures through measurement against a community standard. CoreTrustSeal began issuing certifications in 2017 and replaces the Data Seal of Approval (DSA) certification and World Data System (WDS) Regular Members certification. The Inter-university Consortium for Political and Social Research (ICPSR) was one of the first six data repositories to earn the Data Seal of Approval in 2011. ICPSR earned the World Data System certification in 2013. This presentation discusses ICPSR's experiences going through the new CoreTrustSeal certification process. We discuss the effort required to apply for certification, the differences from the earlier repository certifications, and the lessons learned from the process.
DOI: 10.12685/027.7-2-1-49. The Inter-university Consortium for Political and Social Research (ICPSR) was founded over 50 years ago "to further the development of research in political science" (Miller 1963:11). Since that summer of 1962, the scope and range of services ICPSR offers have expanded significantly to encompass the wider social and behavioral research community, and its holdings have grown from a handful of data collections to thousands. With over 750 consortial members from around the world, ICPSR is now a leader in preserving, curating, and providing access to scientific data so others can reuse the data and validate research findings. Much of the success of ICPSR can be traced back to the consortial model upon which the organization was founded, with members providing funding, input, and a sense of community. This article describes the history and current status of the consortium, and discusses upcoming challenges and opportunities.
In 2010, ICPSR began a long process of recovering data from Gordon Streib's Cornell Study of Occupational Retirement (CSOR). Because the unique data fill a gap in our understanding of US retirement history, we determined that an extensive data recovery project was warranted. This paper describes the scope of the data collection and the steps in ICPSR's recovery process. Though the data recovery was ultimately successful, this paper documents the time invested and the costs associated with this kind of recovery work. It also highlights the value of these data for future research on gender and retirement in a historical context. In addition to the publicly available data resulting from this project, extensive paper medical records are housed at ICPSR for on-site analysis or for a future digitization project. These records would provide unique health information on older women and men traced over a period of time in the 1950s, and they represent future work for ICPSR to undertake.
Science functions best within a liberal democracy. Every hypothesis test is an expression of doubt, as it carries with it the implication that a particular presumption may be incorrect (Kruschke 1998), whereas authoritarianism punishes challenges to prescribed beliefs. Consequently, science can lead to true innovation and improvements in knowledge only when laws and social norms permit dissent.
In 2019, the American Economic Association (AEA) adopted a Data and Code Availability Policy "to improve the reproducibility and transparency of materials supporting research published in the AEA journals by providing improved guidance on the types of materials required, increased quality control, and more review earlier in the publication process." The AEA initiative is one of the most comprehensive reproducibility and data/code sharing initiatives in the social sciences. In this presentation, we review the AEA workflow, including how the AEA assesses compliance with the policy and the accuracy of the information by running code to reproduce the reported results. We also demonstrate the newly established AEA Data and Code Repository at the Inter-university Consortium for Political and Social Research (ICPSR), which facilitates the AEA's workflow and review. Each data collection in the repository receives a persistent digital identifier (DOI), as well as descriptive metadata to increase findability, including JEL codes and subject terms. Data collections are also linked back to the journal article. Additionally, the AEA migrated their entire back archive of more than 3,000 data and code supplements to the AEA Data and Code Repository at ICPSR. This represents almost two decades of required data sharing associated with AEA journal publications. http://deepblue.lib.umich.edu/bitstream/2027.42/156061/1/Lyle ICPSR MIDAS Reproducibility Challenge 2020.pdf
Structured Data Transformation Language (SDTL) provides structured, machine actionable representations of data transformation commands found in statistical analysis software. The Continuous Capture of Metadata for Statistical Data Project (C2Metadata) created SDTL as part of an automated system that captures provenance metadata from data transformation scripts and adds variable derivations to standard metadata files. SDTL also has potential for auditing scripts and for translating scripts between languages. SDTL is expressed in a set of JSON schemas, which are machine actionable and easily serialized to other formats. Statistical software languages have a number of special features that have been carried into SDTL. We explain how SDTL handles differences among statistical languages and complex operations, such as merging files and reshaping data tables from "wide" to "long".
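As a rough illustration of the idea described above, the following Python sketch represents one transformation command as a JSON-serializable structure and derives a variable-level provenance record from it. The field names (`target`, `expression`, `arguments`, and so on), the sample command, and the helper functions are hypothetical simplifications chosen for this example; they are not taken from the published SDTL schemas or from the C2Metadata software.

```python
import json

# Hypothetical, simplified record loosely modeled on the idea of a structured
# transformation command. Field names are illustrative assumptions, not the
# published SDTL JSON schema.
compute_command = {
    "command": "Compute",
    "source_language": "SPSS",
    "source_text": "COMPUTE bmi = weight / (height * height).",
    "target": "bmi",
    "expression": {
        "function": "division",
        "arguments": [
            {"variable": "weight"},
            {
                "function": "multiplication",
                "arguments": [{"variable": "height"}, {"variable": "height"}],
            },
        ],
    },
}


def source_variables(expression):
    """Return the sorted, de-duplicated variable names used in an expression tree."""
    if "variable" in expression:
        return [expression["variable"]]
    names = set()
    for argument in expression.get("arguments", []):
        names.update(source_variables(argument))
    return sorted(names)


def derivation_record(command):
    """Build a small provenance record: which variable was derived from which inputs."""
    return {
        "derived_variable": command["target"],
        "derived_from": source_variables(command["expression"]),
        "original_command": command["source_text"],
    }


record = derivation_record(compute_command)
print(json.dumps(record, indent=2))
```

Because the command is held as a plain nested structure rather than as source text, the same walk over the expression tree would work regardless of which statistical package the original script was written in, which is the property that makes a language-neutral representation useful for provenance capture.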
Journal editors have a large amount of power to advance open science in their respective fields by incentivising and mandating open policies and practices at their journals. The Data PASS Journal Editors Discussion Interface (JEDI, an online community for social science journal editors: www.dpjedi.org) has collated several resources on embedding open science in journal editing (www.dpjedi.org/resources). However, it can be overwhelming as an editor new to open science practices to know where to start. For this reason, we created a guide for journal editors on how to get started with open science. The guide outlines steps that editors can take to implement open policies and practices within their journal, and goes through the what, why, how, and worries of each policy and practice. This manuscript introduces and summarizes the guide (full guide: https://doi.org/10.31219/osf.io/hstcx).