Additive models are flexible regression tools that handle linear as well as nonlinear terms. The latter are typically modelled via smoothing splines. Additive mixed models extend additive models to include random terms when the data are sampled according to cluster designs (e.g., longitudinal). These models find applications in the study of phenomena like growth, certain disease mechanisms and energy consumption in humans, when repeated measurements are available. In this paper, we propose a novel additive mixed model for quantile regression. Our methods are motivated by an application to physical activity based on a dataset with more than half a million accelerometer measurements in children of the UK Millennium Cohort Study. In a simulation study, we assess the proposed methods against existing alternatives.
The aim of this chapter is to provide an overview of recent developments in principal component analysis (PCA) methods when the data are incomplete. Missing data bring uncertainty into the analysis and their treatment requires statistical approaches that are tailored to cope with specific missing data processes (i.e., ignorable and nonignorable mechanisms). Since the publication of the classic textbook by Jolliffe, which includes a short, same-titled section on the missing data problem in PCA, there have been a few methodological contributions that hinge upon a probabilistic approach to PCA. In this chapter, we unify methods for ignorable and nonignorable missing data in a general likelihood framework. We also provide real data examples to illustrate the application of these methods using the R language and environment for statistical computing and graphics.
In regression applications, the presence of nonlinearity and correlation among observations poses computational challenges not only in traditional settings such as least squares regression, but also (and especially) when the objective function is non-smooth, as in the case of quantile regression. In this paper, we develop methods for the modeling and estimation of nonlinear conditional quantile functions when data are clustered within two-level nested designs. This work represents an extension of the linear quantile mixed models of Geraci and Bottai (2014, Statistics and Computing). We develop a novel algorithm which is a blend of a smoothing algorithm for quantile regression and a second-order Laplacian approximation for nonlinear mixed models. To assess the proposed methods, we present a simulation study and two applications, one in pharmacokinetics and one related to growth curve modeling in agriculture.
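The non-smooth objective mentioned in the abstract above can be illustrated with the standard quantile check (pinball) loss, which has a kink at zero. The following is only a minimal Python sketch of that loss for a constant-only model; it is not the smoothing algorithm or the mixed-model estimator developed in the paper, and the function names `check_loss` and `best_constant` are hypothetical, introduced here for illustration.

```python
import numpy as np


def check_loss(residuals, tau):
    """Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0}).

    The loss is piecewise linear with a kink at r = 0, which is why
    the quantile regression objective is not differentiable there.
    """
    residuals = np.asarray(residuals, dtype=float)
    indicator = (residuals < 0).astype(float)
    return residuals * (tau - indicator)


def best_constant(y, tau):
    """Return the observed value that minimises the summed check loss.

    Illustration only: for a constant-only model, a minimiser of the
    summed check loss is a sample quantile of y at level tau.
    """
    y = np.asarray(y, dtype=float)
    totals = [check_loss(y - c, tau).sum() for c in y]
    return y[int(np.argmin(totals))]


# Toy data with one large outlier (made-up values for illustration).
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
median_fit = best_constant(y, 0.5)
upper_fit = best_constant(y, 0.9)
```

With `tau = 0.5` the minimiser is the sample median, which is unaffected by the outlier. Raising `tau` moves the fit towards the upper tail of the data.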
In statistical applications, the normal and the Laplace distributions are often contrasted: the former as a standard tool of analysis, the latter as its robust counterpart. I discuss the convolutions of these two popular distributions and their applications in research. I consider four models within a simple 2×2 scheme which is of practical interest in the analysis of clustered (e.g., longitudinal) data. In my view, these models, some of which are less known than others by the majority of applied researchers, constitute a ‘family’ of sensible alternatives when modelling issues arise. In three examples, I revisit data published recently in the epidemiological and clinical literature as well as a classic biological dataset.
In mediation analysis, the effect of an exposure (or treatment) on an outcome variable is decomposed into two components: a direct effect, which pertains to an immediate influence of the exposure on the outcome, and an indirect effect, which the exposure exerts on the outcome through a third variable called the mediator. Our motivating example concerns the relationship between maternal smoking (the exposure, X), birthweight (the mediator, M), and infant mortality (the outcome, Y), which has attracted the interest of epidemiologists and statisticians for many years. We introduce new causal estimands, named u-specific direct and indirect effects, which describe the direct and indirect effects of the exposure on the outcome at a specific quantile u of the mediator, 0 < u < 1. Under sequential ignorability, we derive an interesting and novel decomposition of u-specific indirect effects. The components of this decomposition have a straightforward interpretation and can provide new insights into the complexity of the mechanisms underlying the indirect effect. We illustrate the proposed methods using data on infant mortality in the US population. We provide analytical evidence that supports the hypothesis that the risk of sudden infant death syndrome is not predicted by changes in the birthweight distribution.
Physical activity and inactivity are two independent dimensions over which children aggregate into distinct behavioural profiles. Read my new article ‘Probabilistic principal component analysis to identify profiles of physical activity behaviours in the presence of non-ignorable missing data’ in the Journal of the Royal Statistical Society: Series C at http://onlinelibrary.wiley.com/doi/10.1111/rssc.12105/abstract.
Read my new article ‘Improved transformation-based quantile regression’ in the Canadian Journal of Statistics at http://onlinelibrary.wiley.com/doi/10.1002/cjs.11240/abstract.