Abstract: Data practices in biomedical research often rely on standards that build on normative assumptions regarding privacy and involve 'ethics work.' In an increasingly datafied research environment, identifiability gains a new temporal and spatial dimension, especially in regard to genomic data. In this paper, we analyze how genomic identifiability is considered as a specific data issue in a recent controversial case: publication of the genome sequence of the HeLa cell line. Considering developments in the sociotechnological and data environment, such as big data, biomedical, recreational, and research uses of genomics, our analysis highlights what it means to be (re-)identifiable in the postgenomic era. By showing how the risk of genomic identifiability is not a specificity of the HeLa controversy, but rather a systematic data issue, we argue that a new conceptualization is needed. With the notion of post-identifiability as a sociotechnological situation, we show how past assumptions and ideas about future possibilities come together in the case of genomic identifiability. We conclude by discussing how kinship, temporality, and openness are subject to renewed negotiations along with the changing understandings and expectations of identifiability and status of genomic data.
Infectious diseases are caused by pathogenic micro-organisms, which can be bacteria, viruses, parasites or fungi. The diseases can be spread through many different routes, either directly or indirectly. Military personnel are at high risk of contracting infections, in particular vector-borne and zoonotic infections, during overseas deployments, where they may be exposed to endemic or emerging infections to which they do not have immunity. Additionally, overcrowded settings with poor sanitation pose a high risk of disease. Genomics is having a transformational impact on medicine. It is enabling advances in accurate diagnosis of infectious disease, development of effective and targeted treatment strategies and opportunities to assess pathogenicity. Further, it supports the detection and surveillance of infectious diseases, the development and assessment of vaccines, as well as the assessment and prediction of anti-microbial resistance. These capabilities are all key military needs to protect personnel in this inter-connected world. The advances in sequencing technologies have resulted in an explosion of genomic data. However, making sense of genomic data requires advances in computational analysis technologies together with cross-disciplinary scientific approaches, skill sets and people. There are extensive reference databases of genomic data. One such open access database is PubMLST.org: it contains well-curated genomes for more than 100 microbial species and genera integrated with provenance and phenotype information. All levels of sequence data, from single gene sequences up to and including complete, finished genomes, can be accessed on this platform. This data is, however, both large and complex, and is intractable to analyse and understand using traditional analysis tools.
This paper will discuss the challenges of analysing such genomic data for bacterial infections and consider the application of bioinformatics tools and techniques to analyse and communicate microbial genomic data in healthcare.
This research seeks to identify changing patterns of raw data availability and their correlations with implementations of open mandate policies. With a list of 13,785 journal articles whose authors archived datasets in a popular biomedical data repository after these articles were published in journals, this research uses regression analysis to test the correlations between data contributions and mandate implementations. It finds that both funder-based and publisher-based mandates have a strong impact on scholars' likelihood of contributing to open data repositories. Evidence also suggests that such policies have changed the habits of authors in selecting publishing venues: open access journals appear to be preferred by those authors whose projects are sponsored by federal government agencies, and these journals are also highly ranked in the biomedical fields. Various stakeholders, particularly institutional administrators and open access professionals, may find the findings of this research helpful for adjusting data management policies to increase the number of quality free datasets and enhance data usability. The data-sharing example in biomedical studies provides a good case to show the importance of policy-making in the reshaping of scholarly communication.
Background: Neuroblastoma is a heterogeneous disease with diverse clinical outcomes. Current risk group models require improvement as patients within the same risk group can still show variable prognosis. Recently collected genome-wide datasets provide opportunities to infer neuroblastoma subtypes in a more unified way. Within this context, data integration is critical as different molecular characteristics can contain complementary signals. To this end, we utilized the genomic datasets available for the SEQC cohort patients to develop supervised and unsupervised models that can predict disease prognosis. Results: Our supervised model trained on the SEQC cohort can accurately predict overall survival and event-free survival profiles of patients in two independent cohorts. We also performed extensive experiments to assess the prediction accuracy of high risk patients and patients without MYCN amplification. Our results from this part suggest that clinical endpoints can be predicted accurately across multiple cohorts. To explore the data in an unsupervised manner, we used an integrative clustering strategy named multi-view kernel k-means (MVKKM) that can effectively integrate multiple high-dimensional datasets with varying weights. We observed that integrating different gene expression datasets results in a better patient stratification compared to using these datasets individually. Also, our identified subgroups provide a better Cox regression model fit compared to the existing risk group definitions. Conclusion: Altogether, our results indicate that integration of multiple genomic characterizations enables the discovery of subtypes that improve over existing definitions of risk groups. Effective prediction of survival times will have a direct impact on choosing the right therapies for patients.
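The integrative clustering idea named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each dataset ("view") is turned into an RBF kernel, the kernels are combined with weights fixed by hand (a simplification of integrating views with varying weights), and ordinary kernel k-means is run on the combined kernel. All function names, the farthest-first initialisation, the `gamma` value, the weights, and the toy data are hypothetical choices made for illustration only.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def combine_views(kernels, weights):
    """Weighted sum of per-view kernels; weights are normalised to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels))


def farthest_first_init(K, n_clusters):
    """Deterministic initial labels: pick mutually distant seed points
    in feature space, then assign every point to its nearest seed."""
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K  # squared feature-space distances
    seeds = [0]
    for _ in range(1, n_clusters):
        seeds.append(int(d2[:, seeds].min(axis=1).argmax()))
    return d2[:, seeds].argmin(axis=1)


def kernel_kmeans(K, n_clusters, n_iter=50):
    """Plain kernel k-means on a precomputed kernel matrix K."""
    n = K.shape[0]
    diag = np.diag(K)
    labels = farthest_first_init(K, n_clusters)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_jl K_jl
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


# Toy data (invented): two well-separated groups of 20 samples, seen in two noisy views.
rng = np.random.default_rng(0)
centers = np.repeat([[0.0, 0.0], [5.0, 5.0]], 20, axis=0)
view_a = centers + rng.normal(scale=0.3, size=centers.shape)
view_b = centers + rng.normal(scale=0.6, size=centers.shape)

K = combine_views([rbf_kernel(view_a, 0.5), rbf_kernel(view_b, 0.5)], [0.7, 0.3])
labels = kernel_kmeans(K, n_clusters=2)
```

In this sketch the view weights are inputs rather than learned quantities, so changing `[0.7, 0.3]` is the only way to let one dataset dominate the clustering.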
As genomic researchers are encouraged to engage in broad genomic data sharing, American Indian/Alaska Native/Native Hawaiian (AI/AN/NH) leaders have raised questions about ownership of data and biospecimens and concerns over emerging challenges and potential threats to tribal sovereignty. Using a community-engaged research approach, we conducted 42 semi-structured interviews with tribal leaders, clinicians, researchers, policy makers, and tribal research review board members about their perspectives on ethical issues related to genetics in AI/AN/NH communities. We report findings related to perspectives on genetic research, data sharing, and envisioning stronger oversight and management of data. In particular, participants voiced concerns about different models of data sharing, infrastructure and logistics for housing data, and who should have authority to grant access to data. The results will ultimately guide policy-making and the creation of guidelines and new strategies for tribes to drive the research agenda and promote ethically and culturally appropriate research. Ethn Dis. 2019;29(Suppl 3):659-668; doi:10.18865/ed.29.S3.659
M.F. and S.O.M.D. are supported by Genome Quebec, Genome Canada, the Government of Canada, and the Ministère de l'Économie, Innovation et Exportation du Québec (Can-SHARE grant 141210); S.O.M.D. is supported by the Canadian Institutes of Health Research (grants EP1-120608; EP2-120609); M.H. is supported by BD2K NIH/NCI 5U54HG007990-02; S. Scollen, S.V., M.B., I.L., J.T., S.U.-R., S.d.l.T., M.L., H.S. and the EGA are supported by ELIXIR, the research infrastructure for life-science data. This work was supported by ELIXIR-EXCELERATE, funded by the European Commission within the Research Infrastructures programme of Horizon 2020, grant agreement number 676559 (J.D.S., I.L.), the Wellcome Trust grant numbers WT201535/Z/16/Z (P.F.) and WT098051 (S.K., D.L., P.F.). A.J.B. is supported by the European Union FP7 Programme 'EMIF' IMI-JU grant no. 115372, and H2020 Programme 'GCOF' grant no. 643439.