In applied statistical data analysis, overdispersion is a common feature. It can be addressed using both multiplicative and additive random effects. A multiplicative model for count data incorporates a gamma random effect as a multiplicative factor into the mean, whereas an additive model assumes a normally distributed random effect, entered into the linear predictor. Using Bayesian principles, these ideas are applied to longitudinal count data, based on the so-called combined model. The performance of the additive and multiplicative approaches is compared using a simulation study. ; The authors gratefully acknowledge support from IAP research Network P7/06 of the Belgian Government (Belgian Science Policy). The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government – department EWI.
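To make the distinction concrete, the sketch below simulates longitudinal counts under additive, multiplicative, and combined random-effects formulations. All parameter values, dimensions, and the Poisson response are illustrative assumptions, not those of the study.

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_times = 200, 5
x = np.tile(np.arange(n_times), n_subjects)      # time covariate, long format
beta0, beta1 = 0.5, 0.2                          # hypothetical fixed effects

# Additive model: normal random intercept b_i in the linear predictor
b = rng.normal(0.0, 0.5, n_subjects).repeat(n_times)
y_add = rng.poisson(np.exp(beta0 + beta1 * x + b))

# Multiplicative model: gamma frailty scaling the mean (E[theta] = 1,
# so the marginal mean is unchanged while variance is inflated)
theta = rng.gamma(shape=2.0, scale=0.5, size=n_subjects * n_times)
y_mult = rng.poisson(np.exp(beta0 + beta1 * x) * theta)

# Combined model: both random effects at once
y_comb = rng.poisson(np.exp(beta0 + beta1 * x + b) * theta)

# Overdispersion shows up as variance exceeding the mean
print(y_comb.mean(), y_comb.var())
```

Fitting such models, e.g. by Bayesian MCMC as in the paper, would proceed on data of exactly this structure; the variance exceeding the mean is the overdispersion signature both random effects are meant to capture.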
Alternative splicing is a common phenomenon in which a single gene gives rise to multiple transcript isoforms. The process is strictly guided and involves a multitude of proteins and regulatory complexes; aberrant splicing events have been linked to genetic disorders. Therefore, understanding the mechanisms of alternative splicing regulation and the differences in splicing events between diseased and healthy tissues is crucial in advancing personalized medicine and drug development. We propose a linear mixed model, Random Effects for the Identification of Differential Splicing (REIDS), for the identification of alternative splicing events using Human Transcriptome Arrays (HTA). For each exon, a splicing score is calculated based on two scores, an exon score and an array score. Junction information is used to rank the identified exons from strongly confident to less confident candidates for alternative splicing. The design of junctions is also discussed, to highlight the complexity of exon-exon and exon-junction interactions. Based on a list of RT-PCR-validated probe sets, REIDS outperforms AltAnalyze and iGEMS in percentage recall. ; The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government - department EWI. We thank the Flemish Government and the University of Hasselt for the BOF scholarship funding the research of M.V.M. The funding agency played no role in the design and collection of the present study.
Background The evaluation and validation of surrogate endpoints have been studied extensively in the last decade. Prentice [1] and Freedman, Graubard and Schatzkin [2] laid the foundations for the evaluation of surrogate endpoints in randomized clinical trials. Later, Buyse et al. [5] proposed a meta-analytic methodology, producing different methods for different settings, which was further studied by Alonso and Molenberghs [9] in their unifying approach based on information theory. Purpose In this article, we focus our attention on trial-level surrogacy and propose alternative procedures to evaluate this surrogacy measure that do not pre-specify the type of association. A promising correction based on cross-validation is investigated, as well as the construction of confidence intervals for this measure. Methods To avoid making assumptions about the type of relationship between the treatment effects and about their distribution, a collection of alternative methods, based on regression trees, bagging, random forests, and support vector machines, combined with bootstrap-based confidence intervals and, should one wish, with a cross-validation-based correction, is proposed and applied. We apply the various strategies to data from three clinical studies: in ophthalmology, in advanced colorectal cancer, and in schizophrenia. Results The results obtained for the three case studies are compared; they indicate that random forest and bagging models produce larger estimated values for the surrogacy measure, which are in general more stable, with narrower confidence intervals, than those from linear regression and support vector regression. For the advanced colorectal cancer studies, we even found a trial-level surrogacy considerably different from what has been reported.
Limitations In general, the alternative methods are more computationally demanding; in particular, the calculation of the confidence intervals requires more computing time than the delta-method counterpart. Conclusions First, more flexible modeling techniques can be used, allowing for other types of association. Second, when no cross-validation-based correction is applied, overly optimistic trial-level surrogacy estimates will be found; cross-validation is therefore highly recommended. Third, the use of the delta method to calculate confidence intervals is not recommended, since it rests on assumptions valid only in very large samples and may produce range-violating limits. We therefore recommend bootstrap methods as a general alternative. The information-theoretic approach produces results comparable to the bagging and random forest approaches when the cross-validation correction is applied. It is also worth observing that, even in cases where the linear model is a good option, bagging methods perform well and yield narrower confidence intervals. ; The authors gratefully acknowledge support from FWO-Vlaanderen Research Project 'Sensitivity Analysis for Incomplete and Coarse Data' and Belgian IUAP/PAI network # P6/03 'Statistical Techniques and Modeling for Complex Substantive Questions with Complex Data' of the Belgian Government (Belgian Science Policy). The authors also gratefully acknowledge support from FWO-Vlaanderen Research Project G.0151.05.
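As an illustration of the strategy described above, the sketch below estimates trial-level surrogacy by regressing simulated trial-specific treatment effects on the true endpoint against those on the surrogate with a bagging model, and builds a percentile bootstrap confidence interval. The data, sample sizes, and tuning values are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
n_trials = 50
alpha = rng.normal(0, 1, n_trials)                  # treatment effects on surrogate
beta = 0.9 * alpha + rng.normal(0, 0.3, n_trials)   # effects on true endpoint

def r2_trial(a, b):
    """Trial-level surrogacy: squared correlation between observed and
    bagging-predicted treatment effects on the true endpoint."""
    model = BaggingRegressor(DecisionTreeRegressor(max_depth=3),
                             n_estimators=50, random_state=0)
    model.fit(a.reshape(-1, 1), b)
    return np.corrcoef(model.predict(a.reshape(-1, 1)), b)[0, 1] ** 2

est = r2_trial(alpha, beta)

# Percentile bootstrap confidence interval (respects the [0, 1] range,
# unlike a delta-method interval)
boot = []
for _ in range(100):
    idx = rng.integers(0, n_trials, n_trials)
    boot.append(r2_trial(alpha[idx], beta[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(round(est, 2), round(lo, 2), round(hi, 2))
```

Note that without a cross-validation-based correction these in-sample estimates are optimistic, exactly as the Conclusions above warn.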
not only definition of a conceptual model that includes an intestinal compartment containing the intestinal microbiota, but also a way to determine the related kinetic constants in an in vitro model and to scale the resulting kinetic constants to the whole organism. Using pyrrolizidine N-oxides as model compounds, it was shown that anaerobic fecal incubations provide a way to define kinetic constants for pyrrolizidine alkaloid N-oxide reduction by the intestinal microbiota of both rats and humans. The Vmax values thus obtained require subsequent scaling to the whole organism. This can be done in various ways, including i) fitting PBK-model-predicted data to available in vivo data, ii) scaling based on the fecal fraction of body weight, and iii) using the bacterial counts and volumes of the various intestinal compartments. Using the PBK models thus obtained, the role of the intestinal microbiota in the bioactivation of pyrrolizidine N-oxides to the parent PAs can be taken into account. The PBK models enable definition of relative potency factors for the N-oxides relative to their parent PAs by comparison of the areas under the concentration-time curves for the parent PA upon dosing either the N-oxide or an equimolar amount of the parent PA. This reveals that the relative potency of the N-oxides is predicted to be lower than that of the corresponding parent PAs. Nanomaterials, present in many products nowadays, have different physico-chemical, biological and toxicological properties compared to the same material in bulk form. Therefore, their potential toxicity needs to be analyzed to ensure their safety for humans, animals and the environment. In vivo studies are commonly used to assess the risk of chemicals.
However, due to the large amounts of nanomaterials that can be produced, and to reduce animal testing and financial and time costs, there is a growing interest in incorporating in vitro and in silico models in the risk assessment process and in increasing the focus on in vitro-in vivo extrapolation (IVIVE) methods. To support research on nanomaterial toxicity, including IVIVE, several project initiatives aim to enhance the usability of nanomaterial data by developing databases containing information from various available experimental nanomaterial datasets. The H2020 NanoInformaTIX project is one of these initiatives. Based on data collected over the last decades, this project aims to build a user-friendly platform for risk management of engineered nanomaterials. Our aim is to utilize the data gathered within the NanoInformaTIX project to develop an approach for performing in vitro-in vivo extrapolation analyses for nanomaterial risk assessment. To select, from the vast amounts of data in the NanoInformaTIX database, the nanomaterials to be analyzed for dose-response trends, we developed a software tool (the R package NMTox) to explore the database and to identify monotonic trends in the dose-response relationship of nanomaterials for toxicity endpoints. In the second stage, dose-response models are fitted (using a software tool that is currently being developed) to the nanomaterials for which a monotonic trend is identified. These nanomaterial/toxicity-endpoint combinations will be the focus for in vitro to in vivo extrapolation. As a case study, we focus on the cell viability data available within the NanoInformaTIX database. Inference was performed by testing the significance of monotonic dose-response relationships using likelihood ratio tests.
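A minimal sketch of this kind of monotonic trend screening, assuming a decreasing cell-viability endpoint: an isotonic (monotone) fit is compared against a flat fit, with a permutation null standing in for the likelihood ratio reference distribution. The doses, responses, and cutoffs below are invented for illustration and do not come from the NanoInformaTIX database or the NMTox package.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(7)
dose = np.repeat([0.0, 1.0, 3.0, 10.0, 30.0], 6)            # hypothetical doses
viability = 100 - 10.0 * np.log1p(dose) + rng.normal(0, 4, dose.size)

def trend_stat(d, y):
    """Drop in residual sum of squares of a monotone fit vs a flat fit."""
    iso = IsotonicRegression(increasing=False).fit(d, y)
    rss_iso = np.sum((y - iso.predict(d)) ** 2)
    rss_flat = np.sum((y - y.mean()) ** 2)
    return rss_flat - rss_iso        # >= 0, since a constant is monotone

obs = trend_stat(dose, viability)
# Permutation null: shuffle responses over doses to break any trend
perm = [trend_stat(dose, rng.permutation(viability)) for _ in range(500)]
p_value = (1 + sum(p >= obs for p in perm)) / 501
print(round(p_value, 3))
```

Subsets with a small permutation p-value would be carried forward to the dose-response modelling stage.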
A high number of data subsets were tested. Poly- and perfluoroalkyl substances (PFASs), like perfluorooctanoic acid (PFOA) and perfluorooctanesulfonic acid (PFOS), have been reported to cause liver toxicity in experimental animals and to disturb lipid homeostasis in experimental animals and humans. To obtain more insight into the cellular effects of PFASs on the human liver, we assessed the effects of 19 PFASs on gene expression in HepaRG cells, performing microarray studies for PFOS and RT-PCR analyses of selected genes for all PFASs. We also assessed the biokinetics of PFOS and PFOA in the in vitro model, determining time- and concentration-dependent cell-associated PFAS levels. BMDExpress analysis of the PFOS microarray data points to various affected processes, with cholesterol biosynthesis (downregulated) and ATF4-related signaling (upregulated) among the processes with the lowest BMC values. Results from RT-PCR analyses for genes related to ATF4 signaling, cholesterol biosynthesis, PPAR signaling and other sensitive genes point to differences in potency among the tested PFASs. The shorter-chain PFASs (PFPeA, PFHxA, PFBS, HFPO-DA) affected only the expression of PPAR-regulated genes. Interestingly, BMC values for different PFASs related to ATF4 activation were correlated with those for decreased expression of the cholesterogenic genes, suggesting a possible relation between these processes. Of the tested PFASs, HFPO-TA was shown to be the most potent modulator of gene expression. The in vitro biokinetic data indicate that, under the applied culture conditions, maximum cell-associated PFOS and PFOA levels are reached around 1 hour after exposure and remain stable up to the end of exposure (24 hours), being up to 10-fold higher for PFOS than for PFOA, depending on the nominal concentrations applied.
Altogether, the study provides mechanistic insights into the effects and relative potencies of PFASs on the human liver, pointing to a possible association between ATF4 signaling and the PFAS-induced decrease in cholesterogenic gene expression. In combination with in vitro biokinetic data, as obtained for PFOS and PFOA in the present study, these results will be used as a basis for quantitative in vitro to in vivo extrapolations (QIVIVE), with the application of physiologically-based kinetic (PBK) modelling, to assess whether effects found in vitro can be expected at relevant in vivo exposure scenarios. The intestinal microbiome is able to affect the susceptibility to a wide range of pharmaceutical and foodborne chemicals through a broad range of reactions. An example is the reduction of foodborne pyrrolizidine alkaloid N-oxides to the parent pyrrolizidine alkaloids (PAs), enabling their further bioactivation in the liver to reactive toxic pyrrole metabolites. Including the reactions of the intestinal microbiome in physiologically based kinetic (PBK) models requires ; European Union's Horizon 2020 research and innovation programme [814426]
When the true relationship between a covariate and an outcome is nonlinear, one should use a nonlinear mean structure that can take this pattern into account. In this article, the fractional polynomial modeling framework, which assumes a prespecified set of powers, is extended to a nonlinear fractional polynomial framework (NLFP). Inferences are drawn in a Bayesian fashion. The proposed modeling paradigm is applied to predict the long-term persistence of vaccine-induced anti-HPV antibodies. In addition, the subject-specific posterior probability of being above a threshold value at a given time is calculated. The model is compared with a power-law model using the deviance information criterion (DIC). The newly proposed model is found to fit better than the power-law model. A sensitivity analysis showed the results to be relatively independent of the prior distribution of the power. Supplementary materials for this article are available online. ; IAP research Network of Belgian Government (Belgian Science Policy) [P7/06]
In infectious diseases, it is important to predict the long-term persistence of vaccine-induced antibodies and to estimate the time points where the individual titers fall below the threshold value for protection. This article focuses on HPV-16/18, and uses a so-called fractional-polynomial model to this effect, derived in a data-driven fashion. Initially, model selection was done from among the second- and first-order fractional polynomials on the one hand, and the linear mixed model on the other. According to a functional selection procedure, the first-order fractional polynomial was selected. Apart from the fractional polynomial model, we also fitted a power-law model, which is a special case of the fractional polynomial model. Both models were compared using Akaike's Information Criterion. Over the observation period, the fractional polynomial model fitted the data better than the power-law model; this, of course, does not imply that it fits better over the long run, and hence caution is warranted when prediction is of interest. Therefore, we point out that the persistence of the anti-HPV responses induced by these vaccines can only be ascertained empirically by long-term follow-up analysis. ; The authors gratefully acknowledge support from IAP research Network P6/03 of the Belgian Government (Belgian Science Policy). They also thank the study participants and clinical investigators from the Phase IIb primary efficacy study (NCT00689741). Finally, they thank the laboratory personnel for their contribution in performing the assays.
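A first-order fractional polynomial can be sketched as follows: the response is regressed on a single transformed time t^p, with p drawn from the conventional power set and p = 0 denoting log(t) by convention, and the power is chosen by AIC. The synthetic antibody-decay data below are illustrative only, not the HPV-16/18 titers of the study.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(1, 60, 80)                       # months since vaccination
log_titer = 8.0 - 1.5 * np.log(t) + rng.normal(0, 0.3, t.size)  # synthetic decay

powers = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]         # conventional FP1 power set

def fp_transform(t, p):
    return np.log(t) if p == 0 else t ** p       # p = 0 denotes log(t)

def fit_rss(x, y):
    """Residual sum of squares of an intercept + slope least-squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))

# AIC = n*log(RSS/n) + 2k; k = 3 (intercept, slope, error variance) for every
# power, so the power with the smallest RSS also minimizes AIC
n = t.size
aics = {p: n * np.log(fit_rss(fp_transform(t, p), log_titer) / n) + 2 * 3
        for p in powers}
best_p = min(aics, key=aics.get)
print(best_p)
```

Here the data were generated with a log-time mean, so the selection recovers p = 0; the power-law model corresponds to a fixed transformation rather than one selected from the power set.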
This paper aims to develop a probability-based model involving the use of direct likelihood formulation and generalised linear modelling in order to estimate important disease parameters from real data. The force of infection and the recovery rate, or per capita loss of infection, are the parameters of interest. The problem of dealing with time-varying disease parameters is also addressed by fitting piecewise-constant parameters over time. The findings of the current paper are comparable to estimates from an independent approach suggested by White et al.,21 which employed Bayesian MCMC modelling via WinBUGS. ; The authors gratefully acknowledge the financial support from The Wellcome Trust (Grant No. 061584), and the IUAP research network Nr. P5/24 of the Belgian Government (Belgian Science Policy). Shaun Ramroop would like to thank the NRF of South Africa for funding his PhD work (THUTHUKA-Researchers in training Ref. No: TTK2005081700004). Mahidol-Oxford Tropical Medicine Research Unit is funded by the Wellcome Trust of Great Britain. This article is published with the permission of the Director of KEMRI.
Several studies have demonstrated that the metabolite composition of plasma may indicate the presence of lung cancer. The metabolism of cancer is characterized by enhanced glucose uptake and glycolysis, which is exploited by 18F-FDG positron emission tomography (PET) in the work-up and management of cancer. This study aims to explore relationships between 1H-NMR spectroscopy derived plasma metabolite concentrations and the uptake of labeled glucose (18F-FDG) in lung cancer tissue. PET parameters of interest are maximal standardized uptake values (SUVmax), total body metabolic active tumor volumes (MATVWB) and total body total lesion glycolysis (TLGWB) values. Patients with high values of these parameters have higher plasma concentrations of N-acetylated glycoproteins, which suggests an upregulation of hexosamine biosynthesis. High MATVWB and TLGWB values are associated with higher concentrations of glucose, glycerol, N-acetylated glycoproteins, threonine, aspartate and valine, and lower levels of sphingomyelins and phosphatidylcholines appearing at the surface of lipoproteins. These higher concentrations of glucose and of non-carbohydrate glucose precursors such as amino acids and glycerol suggest involvement of the gluconeogenesis pathway. The lower plasma concentration of those phospholipids points to a higher need for membrane synthesis. Our results indicate that the metabolic reprogramming in cancer is more complex than the initially described Warburg effect. ; This study is part of the Limburg Clinical Research Program (LCRP) UHasselt-ZOL-Jessa, supported by the foundation Limburg Sterk Merk, province of Limburg, Flemish government, Hasselt University, Ziekenhuis Oost-Limburg and Jessa Hospital
In this article, we discuss methods to select three different types of genes (treatment related, response related, or both) and investigate whether they can serve as biomarkers for a binary outcome variable. We consider an extension of the joint model introduced by Lin et al. (2010) and Tilahun et al. (2010) for a continuous response. As the model has certain drawbacks in a binary setting, we also present a way to use classical selection methods to identify subgroups of genes that are treatment and/or response related. We evaluate their potential to serve as biomarkers by applying diagonal linear discriminant analysis (DLDA) to predict the response level. ; We gratefully acknowledge support of the IAP research network P6/03 of the Belgian Government (Belgian Science Policy).
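DLDA itself is simple enough to sketch: each sample is assigned to the class whose mean is closest in a per-gene standardized distance, ignoring gene-gene covariances. The expression matrix below is simulated and the effect sizes are arbitrary; this is not the authors' gene-selection procedure, only the classifier applied afterwards.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical expression matrix: 40 samples x 100 genes, binary response
n0, n1, g = 20, 20, 100
X = np.vstack([rng.normal(0, 1, (n0, g)),
               rng.normal(0, 1, (n1, g))])
X[n0:, :5] += 2.0                       # first 5 genes are response related
y = np.array([0] * n0 + [1] * n1)

def dlda_fit(X, y):
    means = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
    # Pooled per-gene variance; DLDA uses a diagonal covariance matrix
    resid = X - means[y]
    var = resid.var(axis=0, ddof=2)
    return means, var

def dlda_predict(X, means, var):
    # Squared standardized distance to each class mean, summed over genes
    d = ((X[:, None, :] - means[None, :, :]) ** 2 / var).sum(axis=2)
    return d.argmin(axis=1)

means, var = dlda_fit(X, y)
acc = (dlda_predict(X, means, var) == y).mean()
print(acc)
```

The diagonal covariance assumption is what makes DLDA usable when the number of genes far exceeds the number of samples.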
The force of infection, describing the rate at which a susceptible person acquires an infection, is a key parameter in models estimating the infectious disease burden, and the effectiveness and cost-effectiveness of infectious disease prevention. Since Muench formulated the first catalytic model to estimate the force of infection from current status data in 1934, exactly 75 years ago, several authors addressed the estimation of this parameter by more advanced statistical methods, while applying these to seroprevalence and reported incidence/case notification data. In this paper we present an historical overview, discussing the relevance of Muench's work, and we explain the wide array of newer methods with illustrations on pre-vaccination serological survey data of two airborne infections: rubella and parvovirus B19. We also provide guidance on deciding which method(s) to apply to estimate the force of infection, given a particular set of data. ; We thank the editor and both referees for their valuable suggestions that have led to an improved version of the manuscript. This work was supported by research project (MSM 0021620839), funded by 'SIMID', a strategic basic research project funded by the institute for the Promotion of Innovation by Science and Technology in Flanders (IWT) (project number 06008); by the Fund of Scientific Research (FWO, Research Grant G039304) in Flanders, Belgium; and by the TAP research network (no. P6/03) of the Belgian Government (Belgian Science Policy). The R-code used to analyse the datasets in this manuscript is available from the authors.
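Muench's catalytic model assumes a constant force of infection λ, so that the probability of being seropositive by age a is 1 − exp(−λa). The sketch below fits λ by maximum likelihood on synthetic current-status data; the true λ, age range, and sample sizes are assumptions for illustration, not the rubella or parvovirus B19 survey data.

```python
import numpy as np

rng = np.random.default_rng(11)
true_foi = 0.10                         # assumed per-year force of infection
ages = np.arange(1.0, 41.0)
n_per_age = 50
# Current-status serosurvey: P(seropositive at age a) = 1 - exp(-lambda * a)
pos = rng.binomial(n_per_age, 1 - np.exp(-true_foi * ages))

def neg_loglik(lam):
    """Binomial negative log-likelihood of the catalytic model."""
    p = 1 - np.exp(-lam * ages)
    return -np.sum(pos * np.log(p) + (n_per_age - pos) * np.log1p(-p))

# Maximum likelihood over a grid of candidate FOI values
grid = np.linspace(0.01, 0.5, 981)
lam_hat = grid[np.argmin([neg_loglik(l) for l in grid])]
print(round(lam_hat, 3))
```

The more advanced methods surveyed in the paper relax exactly this constant-λ assumption, e.g. by letting the force of infection vary with age.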
This article aims to develop a probability-based model involving the use of direct likelihood formulation and generalised linear modelling (GLM) approaches useful in estimating important disease parameters from longitudinal or repeated measurement data. The current application is based on infection with respiratory syncytial virus. The force of infection and the recovery rate, or per capita loss of infection, are the parameters of interest. However, because of limitations arising from the study design and, subsequently, the data generated, only the force of infection is estimable. The problem of dealing with time-varying disease parameters is also addressed in the article by fitting piecewise-constant parameters over time via the GLM approach. The current model formulation is based on that published in White LJ, Buttery J, Cooper B, Nokes DJ and Medley GF. Rotavirus within day care centres in Oxfordshire, UK: characterization of partial immunity. Journal of the Royal Society Interface 2008; 5: 1481-1490, with an application to rotavirus transmission and immunity. ; The authors gratefully acknowledge the financial support from The Wellcome Trust (Grant No. 061584), and the IUAP research network Nr. P5/24 of the Belgian Government (Belgian Science Policy). Shaun Ramroop would like to thank the NRF of South Africa for funding his PhD work (THUTHUKA-Researchers in training Ref. No: TTK2005081700004). Mahidol-Oxford Tropical Medicine Research Unit is funded by the Wellcome Trust of Great Britain. This article is published with the permission of the Director of KEMRI.
Microarrays enable the expression levels of thousands of genes to be measured simultaneously. However, only a small fraction of these genes are expected to be expressed under different experimental conditions. Filtering has therefore been introduced as a step in the microarray preprocessing pipeline. Gene filtering aims to reduce the dimensionality of the data by removing redundant features prior to the actual statistical analysis. Previous filtering methods focus on the Affymetrix platform and cannot easily be ported to the Illumina platform. We therefore developed a filtering method for Illumina bead arrays, implemented in the R package beadarrayFilter. In this paper, the main functions in the package are highlighted and, using many examples, we illustrate how beadarrayFilter can be used to filter bead arrays. ; We acknowledge the support from IAP research network grant nr. P6/03 of the Belgian government (Belgian Science Policy), SymBioSys, the Katholieke Universiteit Leuven center of Excellence on Computational Systems Biology, (EF/05/007), and Bioframe of the institute for the Promotion of Innovation by Science and technology in Flanders (IWT: 060045/KUL-BIO-M$S-PLANT).
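One common filtering idea for bead-level data, which can serve as a rough stand-in for the approach implemented in beadarrayFilter, scores each probe by its intraclass correlation: the share of variance attributable to differences between arrays rather than to bead-level noise. The data, cutoff, and dimensions below are hypothetical and the sketch is not the package's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(9)
n_probes, n_arrays, n_beads = 200, 10, 30
# Synthetic bead-level intensities: the first half of the probes vary
# across arrays (informative), the second half are pure noise
array_effect = rng.normal(0, 1.0, (n_probes // 2, n_arrays))
flat = np.zeros((n_probes - n_probes // 2, n_arrays))
signal = np.vstack([array_effect, flat])
data = signal[:, :, None] + rng.normal(0, 0.5, (n_probes, n_arrays, n_beads))

def icc(probe):
    """One-way ANOVA (arrays as groups) estimate of the between-array
    variance share for a single probe of shape (n_arrays, n_beads)."""
    k = probe.shape[1]                                  # beads per array
    grand = probe.mean()
    msb = k * np.sum((probe.mean(axis=1) - grand) ** 2) / (probe.shape[0] - 1)
    msw = np.sum((probe - probe.mean(axis=1, keepdims=True)) ** 2) \
          / (probe.size - probe.shape[0])
    return (msb - msw) / (msb + (k - 1) * msw)

iccs = np.array([icc(p) for p in data])
keep = iccs > 0.5                       # hypothetical cutoff
print(keep[:100].mean(), keep[100:].mean())
```

Probes whose intensity barely changes across arrays carry no information for downstream testing and are discarded, shrinking the multiple-testing burden.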
The use of semi-parametric mixed models has proven useful in a wide variety of settings. Here, we focus on the application of the methodology in the particular case of a cross-over design with relatively long sequences of repeated measurements within each treatment period and for each subject. Other than an overall measure of the difference between each one of the experimental groups and the control group, specific time point comparisons may also be of interest. To that effect, we propose the use of flexible semi-parametric mixed models, enabling the construction of simulation-based simultaneous confidence bands. The bands take into account both between- and within-subject variabilities, while simultaneously correcting for multiple time point comparisons. Owing to the relatively long sequences of measurements per subject, the presence of serially correlated errors is anticipated and investigated. We illustrate how several formulations of semi-parametric mixed models can be fitted and how simulation-based simultaneous confidence bands can be constructed using SAS PROC MIXED. Copyright (C) 2008 John Wiley & Sons, Ltd. ; We gratefully acknowledge the financial support from the IAP research network no. P5/24 of the Belgian Government (Belgian Science Policy) and from th
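The simulation-based band construction can be sketched in a few lines: given pointwise estimates, standard errors, and an estimated covariance across time points, draws from the corresponding multivariate normal give the distribution of the maximal standardized deviation, whose 95% quantile replaces the pointwise 1.96. The curve and AR(1)-style covariance below are invented; the paper itself carries this out within SAS PROC MIXED.

```python
import numpy as np

rng = np.random.default_rng(13)
n_times = 20
# Hypothetical estimated treatment-vs-control curve and its covariance
est = 0.5 * np.sin(np.linspace(0, np.pi, n_times))
se = np.full(n_times, 0.2)
idx = np.arange(n_times)
cov = np.outer(se, se) * 0.6 ** np.abs(np.subtract.outer(idx, idx))

# Simulate the max standardized deviation over all time points to get a
# critical value that controls the simultaneous (not pointwise) coverage
draws = rng.multivariate_normal(np.zeros(n_times), cov, size=5000)
max_t = np.max(np.abs(draws) / se, axis=1)
c_sim = np.quantile(max_t, 0.95)
band_lo, band_hi = est - c_sim * se, est + c_sim * se

# The simultaneous critical value exceeds the pointwise 1.96
print(round(c_sim, 2))
```

Serial correlation enters through the covariance matrix: the stronger the correlation between neighbouring time points, the closer the simultaneous critical value gets to the pointwise one.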
Dissolved oxygen (DO) is an essential controlling factor in the performance of facultative and maturation ponds, since both benefit greatly from algal photosynthetic oxygenation. The rate of this photosynthesis strongly depends on the time of day and on the location in a pond system, whose roles have been overlooked in previous guidelines on pond operation and maintenance (O&M). To elucidate these influences, a linear mixed effect model (LMM) was built on data collected from three intensive sampling campaigns in a waste stabilization pond in Cuenca, Ecuador. Within two parallel lines of facultative and maturation ponds, nine locations were sampled at two depths in each pond. In general, the output of the mixed model indicated high spatial autocorrelation of the data and wide spatiotemporal variation of the oxygen level among and within the ponds. In particular, different ponds showed different patterns of oxygen dynamics, associated with many factors including flow behavior, sludge accumulation, algal distribution, influent fluctuation, and pond function. Moreover, a substantial temporal change in the oxygen level between day and night, from zero to above 20 mg O₂/L, was observed. Algal photosynthetic activity appeared to be the main driver of these variations in the model, as it was facilitated by the intensive solar radiation at high altitude. Since these diurnal and spatial patterns can supply a large amount of useful information on pond performance, recommendations on DO monitoring and regulation were delivered. More importantly, as the mixed model showed high predictive performance, i.e., high goodness of fit (R² of 0.94) and low mean absolute error, we recommend this advanced statistical technique as an effective tool for dealing with high autocorrelation of data in pond systems. ; This research was performed in the context of the VLIR Ecuador Biodiversity Network project.
This project was funded by the Vlaamse Interuniversitaire Raad-Universitaire Ontwikkelingssamenwerking (VLIR-UOS), which supports partnerships between universities and university colleges in Flanders and the South. We are grateful to ETAPA for allowing us to use their facilities and wastewater treatment pond system to perform this research. We thank four anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions. Long Ho is supported by the special research fund (BOF) of Ghent University. Duy Tan Pham is supported by a PhD grant of the Vietnamese government.
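A linear mixed model of the type used above, with a random intercept per pond capturing between-pond variation in the oxygen level, can be sketched with statsmodels. The simulated diurnal DO pattern and all parameter values are hypothetical, not the Cuenca measurements.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(21)
# Hypothetical DO data: 4 ponds x 9 locations, sampled hourly 08:00-15:00
ponds = np.repeat(np.arange(4), 9 * 8)
hour = np.tile(np.repeat(np.arange(8, 16), 9), 4).astype(float)
pond_effect = rng.normal(0, 2.0, 4)[ponds]       # between-pond variation
do = 2.0 + 1.1 * (hour - 8) + pond_effect + rng.normal(0, 1.0, ponds.size)
df = pd.DataFrame({"do": do, "hour": hour, "pond": ponds})

# Random intercept per pond; the fixed slope captures the diurnal DO rise
model = smf.mixedlm("do ~ hour", df, groups=df["pond"]).fit()
slope = model.params["hour"]
print(round(slope, 2))
```

In the study's setting, further random terms or correlation structures would be needed for the spatial autocorrelation among sampling locations within a pond; this sketch shows only the simplest grouping level.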