for Bayesian Bootstraps for Massive Data By projecteuclid.org Published On :: Thu, 19 Mar 2020 22:02 EDT Andrés F. Barrientos, Víctor Peña. Source: Bayesian Analysis, Volume 15, Number 2, 363--388.Abstract: In this article, we present data-subsetting algorithms that allow for the approximate and scalable implementation of the Bayesian bootstrap. They are analogous to two existing algorithms in the frequentist literature: the bag of little bootstraps (Kleiner et al., 2014) and the subsampled double bootstrap (Sengupta et al., 2016). Our algorithms have appealing theoretical and computational properties that are comparable to those of their frequentist counterparts. Additionally, we provide a strategy for performing lossless inference for a class of functionals of the Bayesian bootstrap and briefly introduce extensions to the Dirichlet Process. Full Article
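For readers unfamiliar with the technique the abstract above builds on, the following is a minimal sketch of the basic Bayesian bootstrap for a sample mean. It is not the data-subsetting algorithm of the article. The function name `bayesian_bootstrap_mean` and its parameters are hypothetical, chosen only for illustration. Each posterior draw reweights the observations with Dirichlet(1, ..., 1) weights, generated here by normalising independent Exponential(1) variables.

```python
import random


def bayesian_bootstrap_mean(data, draws=1000, seed=0):
    """Return posterior draws of the mean under the basic Bayesian bootstrap.

    Hypothetical helper for illustration only; not taken from the article.
    """
    rng = random.Random(seed)
    n = len(data)
    samples = []
    for _ in range(draws):
        # Dirichlet(1, ..., 1) weights: normalised Exponential(1) draws.
        g = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(g)
        # Weighted mean of the observed data under this set of weights.
        samples.append(sum(w * x for w, x in zip(g, data)) / total)
    return samples


if __name__ == "__main__":
    posterior = bayesian_bootstrap_mean([1.0, 2.0, 3.0, 4.0], draws=2000)
    print(sum(posterior) / len(posterior))
```

Every draw touches all n observations, which is the cost that motivates subsetting strategies when n is very large.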
for High-Dimensional Posterior Consistency for Hierarchical Non-Local Priors in Regression By projecteuclid.org Published On :: Mon, 13 Jan 2020 04:00 EST Xuan Cao, Kshitij Khare, Malay Ghosh. Source: Bayesian Analysis, Volume 15, Number 1, 241--262.Abstract: The choice of tuning parameters in Bayesian variable selection is a critical problem in modern statistics. In particular, for Bayesian linear regression with non-local priors, the scale parameter in the non-local prior density is an important tuning parameter which reflects the dispersion of the non-local prior density around zero, and implicitly determines the size of the regression coefficients that will be shrunk to zero. Current approaches treat the scale parameter as given, and suggest choices based on prior coverage/asymptotic considerations. In this paper, we consider the fully Bayesian approach introduced in (Wu, 2016) with the pMOM non-local prior and an appropriate Inverse-Gamma prior on the tuning parameter to analyze the underlying theoretical property. Under standard regularity assumptions, we establish strong model selection consistency in a high-dimensional setting, where $p$ is allowed to increase at a polynomial rate with $n$ or even at a sub-exponential rate with $n$. Through simulation studies, we demonstrate that our model selection procedure can outperform other Bayesian methods which treat the scale parameter as given, and commonly used penalized likelihood methods, in a range of simulation settings. Full Article
for Bayesian Design of Experiments for Intractable Likelihood Models Using Coupled Auxiliary Models and Multivariate Emulation By projecteuclid.org Published On :: Mon, 13 Jan 2020 04:00 EST Antony Overstall, James McGree. Source: Bayesian Analysis, Volume 15, Number 1, 103--131.Abstract: A Bayesian design is given by maximising an expected utility over a design space. The utility is chosen to represent the aim of the experiment and its expectation is taken with respect to all unknowns: responses, parameters and/or models. Although straightforward in principle, there are several challenges to finding Bayesian designs in practice. Firstly, the utility and expected utility are rarely available in closed form and require approximation. Secondly, the design space can be of high-dimensionality. In the case of intractable likelihood models, these problems are compounded by the fact that the likelihood function, whose evaluation is required to approximate the expected utility, is not available in closed form. A strategy is proposed to find Bayesian designs for intractable likelihood models. It relies on the development of an automatic, auxiliary modelling approach, using multivariate Gaussian process emulators, to approximate the likelihood function. This is then combined with a copula-based approach to approximate the marginal likelihood (a quantity commonly required to evaluate many utility functions). These approximations are demonstrated on examples of stochastic process models involving experimental aims of both parameter estimation and model comparison. Full Article
for Bayesian Estimation Under Informative Sampling with Unattenuated Dependence By projecteuclid.org Published On :: Mon, 13 Jan 2020 04:00 EST Matthew R. Williams, Terrance D. Savitsky. Source: Bayesian Analysis, Volume 15, Number 1, 57--77.Abstract: An informative sampling design leads to unit inclusion probabilities that are correlated with the response variable of interest. However, multistage sampling designs may also induce higher order dependencies, which are ignored in the literature when establishing consistency of estimators for survey data under a condition requiring asymptotic independence among the unit inclusion probabilities. This paper constructs new theoretical conditions that guarantee that the pseudo-posterior, which uses sampling weights based on first order inclusion probabilities to exponentiate the likelihood, is consistent not only for survey designs which have asymptotic factorization, but also for survey designs that induce residual or unattenuated dependence among sampled units. The use of the survey-weighted pseudo-posterior, together with our relaxed requirements for the survey design, establish a wide variety of analysis models that can be applied to a broad class of survey data sets. Using the complex sampling design of the National Survey on Drug Use and Health, we demonstrate our new theoretical result on multistage designs characterized by a cluster sampling step that expresses within-cluster dependence. We explore the impact of multistage designs and order based sampling. Full Article
for The Bayesian Update: Variational Formulations and Gradient Flows By projecteuclid.org Published On :: Mon, 13 Jan 2020 04:00 EST Nicolas Garcia Trillos, Daniel Sanz-Alonso. Source: Bayesian Analysis, Volume 15, Number 1, 29--56.Abstract: The Bayesian update can be viewed as a variational problem by characterizing the posterior as the minimizer of a functional. The variational viewpoint is far from new and is at the heart of popular methods for posterior approximation. However, some of its consequences seem largely unexplored. We focus on the following one: defining the posterior as the minimizer of a functional gives a natural path towards the posterior by moving in the direction of steepest descent of the functional. This idea is made precise through the theory of gradient flows, allowing to bring new tools to the study of Bayesian models and algorithms. Since the posterior may be characterized as the minimizer of different functionals, several variational formulations may be considered. We study three of them and their three associated gradient flows. We show that, in all cases, the rate of convergence of the flows to the posterior can be bounded by the geodesic convexity of the functional to be minimized. Each gradient flow naturally suggests a nonlinear diffusion with the posterior as invariant distribution. These diffusions may be discretized to build proposals for Markov chain Monte Carlo (MCMC) algorithms. By construction, the diffusions are guaranteed to satisfy a certain optimality condition, and rates of convergence are given by the convexity of the functionals. We use this observation to propose a criterion for the choice of metric in Riemannian MCMC methods. Full Article
for Scalable Bayesian Inference for the Inverse Temperature of a Hidden Potts Model By projecteuclid.org Published On :: Mon, 13 Jan 2020 04:00 EST Matthew Moores, Geoff Nicholls, Anthony Pettitt, Kerrie Mengersen. Source: Bayesian Analysis, Volume 15, Number 1, 1--27.Abstract: The inverse temperature parameter of the Potts model governs the strength of spatial cohesion and therefore has a major influence over the resulting model fit. A difficulty arises from the dependence of an intractable normalising constant on the value of this parameter and thus there is no closed-form solution for sampling from the posterior distribution directly. There is a variety of computational approaches for sampling from the posterior without evaluating the normalising constant, including the exchange algorithm and approximate Bayesian computation (ABC). A serious drawback of these algorithms is that they do not scale well for models with a large state space, such as images with a million or more pixels. We introduce a parametric surrogate model, which approximates the score function using an integral curve. Our surrogate model incorporates known properties of the likelihood, such as heteroskedasticity and critical temperature. We demonstrate this method using synthetic data as well as remotely-sensed imagery from the Landsat-8 satellite. We achieve up to a hundredfold improvement in the elapsed runtime, compared to the exchange algorithm or ABC. An open-source implementation of our algorithm is available in the R package bayesImageS. Full Article
for Hierarchical Normalized Completely Random Measures for Robust Graphical Modeling By projecteuclid.org Published On :: Thu, 19 Dec 2019 22:10 EST Andrea Cremaschi, Raffaele Argiento, Katherine Shoemaker, Christine Peterson, Marina Vannucci. Source: Bayesian Analysis, Volume 14, Number 4, 1271--1301.Abstract: Gaussian graphical models are useful tools for exploring network structures in multivariate normal data. In this paper we are interested in situations where data show departures from Gaussianity, therefore requiring alternative modeling distributions. The multivariate $t$-distribution, obtained by dividing each component of the data vector by a gamma random variable, is a straightforward generalization to accommodate deviations from normality such as heavy tails. Since different groups of variables may be contaminated to a different extent, Finegold and Drton (2014) introduced the Dirichlet $t$-distribution, where the divisors are clustered using a Dirichlet process. In this work, we consider a more general class of nonparametric distributions as the prior on the divisor terms, namely the class of normalized completely random measures (NormCRMs). To improve the effectiveness of the clustering, we propose modeling the dependence among the divisors through a nonparametric hierarchical structure, which allows for the sharing of parameters across the samples in the data set. This desirable feature enables us to cluster together different components of multivariate data in a parsimonious way. We demonstrate through simulations that this approach provides accurate graphical model inference, and apply it to a case study examining the dependence structure in radiomics data derived from The Cancer Imaging Atlas. Full Article
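The scale-mixture construction mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: it uses an identity scale matrix and a single shared divisor, and it follows the standard textbook construction in which the divisor is the square root of a Gamma variable with shape nu/2 and mean 1. The name `sample_multivariate_t` is hypothetical.

```python
import math
import random


def sample_multivariate_t(dim, nu, rng):
    """Draw one dim-dimensional t vector (identity scale, nu degrees of freedom).

    Hypothetical helper for illustration only; not taken from the article.
    """
    # Standard normal vector.
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Shared divisor: tau ~ Gamma(shape=nu/2, scale=2/nu), so E[tau] = 1.
    tau = rng.gammavariate(nu / 2.0, 2.0 / nu)
    # Dividing every component by sqrt(tau) gives heavy-tailed marginals.
    return [zi / math.sqrt(tau) for zi in z]


if __name__ == "__main__":
    rng = random.Random(1)
    print(sample_multivariate_t(3, 10.0, rng))
```

Using one divisor per vector gives the classical multivariate t. Giving each component, or each cluster of components, its own divisor is the kind of generalization the abstract discusses.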
for Calibration Procedures for Approximate Bayesian Credible Sets By projecteuclid.org Published On :: Thu, 19 Dec 2019 22:10 EST Jeong Eun Lee, Geoff K. Nicholls, Robin J. Ryder. Source: Bayesian Analysis, Volume 14, Number 4, 1245--1269.Abstract: We develop and apply two calibration procedures for checking the coverage of approximate Bayesian credible sets, including intervals estimated using Monte Carlo methods. The user has an ideal prior and likelihood, but generates a credible set for an approximate posterior based on some approximate prior and likelihood. We estimate the realised posterior coverage achieved by the approximate credible set. This is the coverage of the unknown “true” parameter if the data are a realisation of the user’s ideal observation model conditioned on the parameter, and the parameter is a draw from the user’s ideal prior. In one approach we estimate the posterior coverage at the data by making a semi-parametric logistic regression of binary coverage outcomes on simulated data against summary statistics evaluated on simulated data. In another we use Importance Sampling from the approximate posterior, windowing simulated data to fall close to the observed data. We illustrate our methods on four examples. Full Article
for Bayesian Functional Forecasting with Locally-Autoregressive Dependent Processes By projecteuclid.org Published On :: Thu, 19 Dec 2019 22:10 EST Guillaume Kon Kam King, Antonio Canale, Matteo Ruggiero. Source: Bayesian Analysis, Volume 14, Number 4, 1121--1141.Abstract: Motivated by the problem of forecasting demand and offer curves, we introduce a class of nonparametric dynamic models with locally-autoregressive behaviour, and provide a full inferential strategy for forecasting time series of piecewise-constant non-decreasing functions over arbitrary time horizons. The model is induced by a non Markovian system of interacting particles whose evolution is governed by a resampling step and a drift mechanism. The former is based on a global interaction and accounts for the volatility of the functional time series, while the latter is determined by a neighbourhood-based interaction with the past curves and accounts for local trend behaviours, separating these from pure noise. We discuss the implementation of the model for functional forecasting by combining a population Monte Carlo and a semi-automatic learning approach to approximate Bayesian computation which require limited tuning. We validate the inference method with a simulation study, and carry out predictive inference on a real dataset on the Italian natural gas market. Full Article
for Variance Prior Forms for High-Dimensional Bayesian Variable Selection By projecteuclid.org Published On :: Thu, 19 Dec 2019 22:10 EST Gemma E. Moran, Veronika Ročková, Edward I. George. Source: Bayesian Analysis, Volume 14, Number 4, 1091--1119.Abstract: Consider the problem of high dimensional variable selection for the Gaussian linear model when the unknown error variance is also of interest. In this paper, we show that the use of conjugate shrinkage priors for Bayesian variable selection can have detrimental consequences for such variance estimation. Such priors are often motivated by the invariance argument of Jeffreys (1961). Revisiting this work, however, we highlight a caveat that Jeffreys himself noticed; namely that biased estimators can result from inducing dependence between parameters a priori. In a similar way, we show that conjugate priors for linear regression, which induce prior dependence, can lead to such underestimation in the Bayesian high-dimensional regression setting. Following Jeffreys, we recommend as a remedy to treat regression coefficients and the error variance as independent a priori. Using such an independence prior framework, we extend the Spike-and-Slab Lasso of Ročková and George (2018) to the unknown variance case. This extended procedure outperforms both the fixed variance approach and alternative penalized likelihood methods on simulated data. On the protein activity dataset of Clyde and Parmigiani (1998), the Spike-and-Slab Lasso with unknown variance achieves lower cross-validation error than alternative penalized likelihood methods, demonstrating the gains in predictive accuracy afforded by simultaneous error variance estimation. The unknown variance implementation of the Spike-and-Slab Lasso is provided in the publicly available R package SSLASSO (Ročková and Moran, 2017). Full Article
for Bayes Factors for Partially Observed Stochastic Epidemic Models By projecteuclid.org Published On :: Tue, 11 Jun 2019 04:00 EDT Muteb Alharthi, Theodore Kypraios, Philip D. O’Neill. Source: Bayesian Analysis, Volume 14, Number 3, 927--956.Abstract: We consider the problem of model choice for stochastic epidemic models given partial observation of a disease outbreak through time. Our main focus is on the use of Bayes factors. Although Bayes factors have appeared in the epidemic modelling literature before, they can be hard to compute and little attention has been given to fundamental questions concerning their utility. In this paper we derive analytic expressions for Bayes factors given complete observation through time, which suggest practical guidelines for model choice problems. We adapt the power posterior method for computing Bayes factors so as to account for missing data and apply this approach to partially observed epidemics. For comparison, we also explore the use of a deviance information criterion for missing data scenarios. The methods are illustrated via examples involving both simulated and real data. Full Article
for Extrinsic Gaussian Processes for Regression and Classification on Manifolds By projecteuclid.org Published On :: Tue, 11 Jun 2019 04:00 EDT Lizhen Lin, Niu Mu, Pokman Cheung, David Dunson. Source: Bayesian Analysis, Volume 14, Number 3, 907--926.Abstract: Gaussian processes (GPs) are very widely used for modeling of unknown functions or surfaces in applications ranging from regression to classification to spatial processes. Although there is an increasingly vast literature on applications, methods, theory and algorithms related to GPs, the overwhelming majority of this literature focuses on the case in which the input domain corresponds to a Euclidean space. However, particularly in recent years with the increasing collection of complex data, it is commonly the case that the input domain does not have such a simple form. For example, it is common for the inputs to be restricted to a non-Euclidean manifold, a case which forms the motivation for this article. In particular, we propose a general extrinsic framework for GP modeling on manifolds, which relies on embedding of the manifold into a Euclidean space and then constructing extrinsic kernels for GPs on their images. These extrinsic Gaussian processes (eGPs) are used as prior distributions for unknown functions in Bayesian inferences. Our approach is simple and general, and we show that the eGPs inherit fine theoretical properties from GP models in Euclidean spaces. We consider applications of our models to regression and classification problems with predictors lying in a large class of manifolds, including spheres, planar shape spaces, a space of positive definite matrices, and Grassmannians. Our models can be readily used by practitioners in biological sciences for various regression and classification problems, such as disease diagnosis or detection. Our work is also likely to have impact in spatial statistics when spatial locations are on the sphere or other geometric spaces. Full Article
for Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection By projecteuclid.org Published On :: Tue, 11 Jun 2019 04:00 EDT Mengyang Gu. Source: Bayesian Analysis, Volume 14, Number 3, 877--905.Abstract: Gaussian stochastic process (GaSP) has been widely used in two fundamental problems in uncertainty quantification, namely the emulation and calibration of mathematical models. Some objective priors, such as the reference prior, are studied in the context of emulating (approximating) computationally expensive mathematical models. In this work, we introduce a new class of priors, called the jointly robust prior, for both the emulation and calibration. This prior is designed to maintain various advantages from the reference prior. In emulation, the jointly robust prior has an appropriate tail decay rate as the reference prior, and is computationally simpler than the reference prior in parameter estimation. Moreover, the marginal posterior mode estimation with the jointly robust prior can separate the influential and inert inputs in mathematical models, while the reference prior does not have this property. We establish the posterior propriety for a large class of priors in calibration, including the reference prior and jointly robust prior in general scenarios, but the jointly robust prior is preferred because the calibrated mathematical model typically predicts the reality well. The jointly robust prior is used as the default prior in two new R packages, called “RobustGaSP” and “RobustCalibration”, available on CRAN for emulation and calibration, respectively. Full Article
for Probability Based Independence Sampler for Bayesian Quantitative Learning in Graphical Log-Linear Marginal Models By projecteuclid.org Published On :: Tue, 11 Jun 2019 04:00 EDT Ioannis Ntzoufras, Claudia Tarantola, Monia Lupparelli. Source: Bayesian Analysis, Volume 14, Number 3, 797--823.Abstract: We introduce a novel Bayesian approach for quantitative learning for graphical log-linear marginal models. These models belong to curved exponential families that are difficult to handle from a Bayesian perspective. The likelihood cannot be analytically expressed as a function of the marginal log-linear interactions, but only in terms of cell counts or probabilities. Posterior distributions cannot be directly obtained, and Markov Chain Monte Carlo (MCMC) methods are needed. Finally, a well-defined model requires parameter values that lead to compatible marginal probabilities. Hence, any MCMC should account for this important restriction. We construct a fully automatic and efficient MCMC strategy for quantitative learning for such models that handles these problems. While the prior is expressed in terms of the marginal log-linear interactions, we build an MCMC algorithm that employs a proposal on the probability parameter space. The corresponding proposal on the marginal log-linear interactions is obtained via parameter transformation. We exploit a conditional conjugate setup to build an efficient proposal on probability parameters. The proposed methodology is illustrated by a simulation study and a real dataset. Full Article
for Low Information Omnibus (LIO) Priors for Dirichlet Process Mixture Models By projecteuclid.org Published On :: Tue, 11 Jun 2019 04:00 EDT Yushu Shi, Michael Martens, Anjishnu Banerjee, Purushottam Laud. Source: Bayesian Analysis, Volume 14, Number 3, 677--702.Abstract: Dirichlet process mixture (DPM) models provide flexible modeling for distributions of data as an infinite mixture of distributions from a chosen collection. Specifying priors for these models in individual data contexts can be challenging. In this paper, we introduce a scheme which requires the investigator to specify only simple scaling information. This is used to transform the data to a fixed scale on which a low information prior is constructed. Samples from the posterior with the rescaled data are transformed back for inference on the original scale. The low information prior is selected to provide a wide variety of components for the DPM to generate flexible distributions for the data on the fixed scale. The method can be applied to all DPM models with kernel functions closed under a suitable scaling transformation. Construction of the low information prior, however, is kernel dependent. Using DPM-of-Gaussians and DPM-of-Weibulls models as examples, we show that the method provides accurate estimates of a diverse collection of distributions that includes skewed, multimodal, and highly dispersed members. With the recommended priors, repeated data simulations show performance comparable to that of standard empirical estimates. Finally, we show weak convergence of posteriors with the proposed priors for both kernels considered. Full Article
for A Bayesian Nonparametric Multiple Testing Procedure for Comparing Several Treatments Against a Control By projecteuclid.org Published On :: Fri, 31 May 2019 22:05 EDT Luis Gutiérrez, Andrés F. Barrientos, Jorge González, Daniel Taylor-Rodríguez. Source: Bayesian Analysis, Volume 14, Number 2, 649--675.Abstract: We propose a Bayesian nonparametric strategy to test for differences between a control group and several treatment regimes. Most of the existing tests for this type of comparison are based on the differences between location parameters. In contrast, our approach identifies differences across the entire distribution, avoids strong modeling assumptions over the distributions for each treatment, and accounts for multiple testing through the prior distribution on the space of hypotheses. The proposal is compared to other commonly used hypothesis testing procedures under simulated scenarios. Two real applications are also analyzed with the proposed methodology. Full Article
for Alleviating Spatial Confounding for Areal Data Problems by Displacing the Geographical Centroids By projecteuclid.org Published On :: Fri, 31 May 2019 22:05 EDT Marcos Oliveira Prates, Renato Martins Assunção, Erica Castilho Rodrigues. Source: Bayesian Analysis, Volume 14, Number 2, 623--647.Abstract: Spatial confounding between the spatial random effects and the fixed-effects covariates has recently been identified and shown to lead to misleading interpretation of model results. Techniques to alleviate this problem are based on decomposing the spatial random effect and fitting a restricted spatial regression. In this paper, we propose a different approach: a transformation of the geographic space to ensure that the unobserved spatial random effect added to the regression is orthogonal to the fixed effects covariates. Our approach, named SPOCK, has the additional benefit of providing a fast and simple computational method to estimate the parameters. Also, it does not constrain the distribution class assumed for the spatial error term. A simulation study and real data analyses are presented to better understand the advantages of the new method in comparison with the existing ones. Full Article
for Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation By projecteuclid.org Published On :: Wed, 13 Mar 2019 22:00 EDT Marko Järvenpää, Michael U. Gutmann, Arijus Pleska, Aki Vehtari, Pekka Marttinen. Source: Bayesian Analysis, Volume 14, Number 2, 595--622.Abstract: Approximate Bayesian computation (ABC) is a method for Bayesian inference when the likelihood is unavailable but simulating from the model is possible. However, many ABC algorithms require a large number of simulations, which can be costly. To reduce the computational cost, Bayesian optimisation (BO) and surrogate models such as Gaussian processes have been proposed. Bayesian optimisation enables one to intelligently decide where to evaluate the model next but common BO strategies are not designed for the goal of estimating the posterior distribution. Our paper addresses this gap in the literature. We propose to compute the uncertainty in the ABC posterior density, which is due to a lack of simulations to estimate this quantity accurately, and define a loss function that measures this uncertainty. We then propose to select the next evaluation location to minimise the expected loss. Experiments show that the proposed method often produces the most accurate approximations as compared to common BO strategies. Full Article
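As background for the abstract above, the following is a minimal sketch of plain rejection ABC, the baseline whose simulation cost motivates surrogate-based approaches. It is not the acquisition rule proposed in the article. The toy model (Normal data with unit variance, a Uniform(-5, 5) prior on the mean, the sample mean as summary statistic), the tolerance `eps`, and the name `abc_rejection` are all assumptions made for illustration.

```python
import random


def abc_rejection(observed, n_accept=200, eps=0.2, seed=0):
    """Return approximate posterior draws of a Normal mean via rejection ABC.

    Hypothetical helper for illustration only; not taken from the article.
    """
    rng = random.Random(seed)
    n = len(observed)
    obs_summary = sum(observed) / n
    accepted = []
    while len(accepted) < n_accept:
        # 1. Propose a parameter value from the prior.
        theta = rng.uniform(-5.0, 5.0)
        # 2. Simulate a data set from the model at that value.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
        # 3. Keep theta only if the simulated summary is close to the observed one.
        if abs(sum(simulated) / n - obs_summary) < eps:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    draws = abc_rejection([1.8, 2.1, 2.4, 1.9, 2.3])
    print(sum(draws) / len(draws))
```

Most proposals are rejected, so the number of model simulations far exceeds the number of accepted draws. That is the cost the abstract refers to when it says many ABC algorithms require a large number of simulations.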
for A Bayesian Nonparametric Spiked Process Prior for Dynamic Model Selection By projecteuclid.org Published On :: Wed, 13 Mar 2019 22:00 EDT Alberto Cassese, Weixuan Zhu, Michele Guindani, Marina Vannucci. Source: Bayesian Analysis, Volume 14, Number 2, 553--572.Abstract: In many applications, investigators monitor processes that vary in space and time, with the goal of identifying temporally persistent and spatially localized departures from a baseline or “normal” behavior. In this manuscript, we consider the monitoring of pneumonia and influenza (P&I) mortality, to detect influenza outbreaks in the continental United States, and propose a Bayesian nonparametric model selection approach to take into account the spatio-temporal dependence of outbreaks. More specifically, we introduce a zero-inflated conditionally identically distributed species sampling prior which allows borrowing information across time and assigning data to clusters associated with either a null or an alternate process. Spatial dependences are accounted for by means of a Markov random field prior, which allows the selection to be informed by inferences conducted at nearby locations. We show how the proposed modeling framework performs in an application to the P&I mortality data and in a simulation study, and compare with common threshold methods for detecting outbreaks over time, with more recent Markov switching based models, and with spike-and-slab Bayesian nonparametric priors that do not take into account spatio-temporal dependence. Full Article
for Efficient Bayesian Regularization for Graphical Model Selection By projecteuclid.org Published On :: Wed, 13 Mar 2019 22:00 EDT Suprateek Kundu, Bani K. Mallick, Veera Baladandayuthapani. Source: Bayesian Analysis, Volume 14, Number 2, 449--476.Abstract: There has been an intense development in the Bayesian graphical model literature over the past decade; however, most of the existing methods are restricted to moderate dimensions. We propose a novel graphical model selection approach for large dimensional settings where the dimension increases with the sample size, by decoupling model fitting and covariance selection. First, a full model based on a complete graph is fit under a novel class of mixtures of inverse–Wishart priors, which induce shrinkage on the precision matrix under an equivalence with Cholesky-based regularization, while enabling conjugate updates. Subsequently, a post-fitting model selection step uses penalized joint credible regions to perform model selection. This allows our methods to be computationally feasible for large dimensional settings using a combination of straightforward Gibbs samplers and efficient post-fitting inferences. Theoretical guarantees in terms of selection consistency are also established. Simulations show that the proposed approach compares favorably with competing methods, both in terms of accuracy metrics and computation times. We apply this approach to a cancer genomics data example. Full Article
for Variational Message Passing for Elaborate Response Regression Models By projecteuclid.org Published On :: Wed, 13 Mar 2019 22:00 EDT M. W. McLean, M. P. Wand. Source: Bayesian Analysis, Volume 14, Number 2, 371--398.Abstract: We build on recent work concerning message passing approaches to approximate fitting and inference for arbitrarily large regression models. The focus is on regression models where the response variable is modeled to have an elaborate distribution, which is loosely defined to mean a distribution that is more complicated than common distributions such as those in the Bernoulli, Poisson and Normal families. Examples of elaborate response families considered here are the Negative Binomial and $t$ families. Variational message passing is more challenging due to some of the conjugate exponential families being non-standard and numerical integration being needed. Nevertheless, a factor graph fragment approach means the requisite calculations only need to be done once for a particular elaborate response distribution family. Computer code can be compartmentalized, including that involving numerical integration. A major finding of this work is that the modularity of variational message passing extends to elaborate response regression models. Full Article
for Bayesian Effect Fusion for Categorical Predictors By projecteuclid.org Published On :: Wed, 13 Mar 2019 22:00 EDT Daniela Pauger, Helga Wagner. Source: Bayesian Analysis, Volume 14, Number 2, 341--369.Abstract: We propose a Bayesian approach to obtain a sparse representation of the effect of a categorical predictor in regression type models. As this effect is captured by a group of level effects, sparsity cannot only be achieved by excluding single irrelevant level effects or the whole group of effects associated to this predictor but also by fusing levels which have essentially the same effect on the response. To achieve this goal, we propose a prior which allows for almost perfect as well as almost zero dependence between level effects a priori. This prior can alternatively be obtained by specifying spike and slab prior distributions on all effect differences associated to this categorical predictor. We show how restricted fusion can be implemented and develop an efficient MCMC (Markov chain Monte Carlo) method for posterior computation. The performance of the proposed method is investigated on simulated data and we illustrate its application on real data from EU-SILC (European Union Statistics on Income and Living Conditions). Full Article
for Statistical Inference for the Evolutionary History of Cancer Genomes By projecteuclid.org Published On :: Tue, 03 Mar 2020 04:00 EST Khanh N. Dinh, Roman Jaksik, Marek Kimmel, Amaury Lambert, Simon Tavaré. Source: Statistical Science, Volume 35, Number 1, 129--144.Abstract: Recent years have seen considerable work on inference about cancer evolution from mutations identified in cancer samples. Much of the modeling work has been based on classical models of population genetics, generalized to accommodate time-varying cell population size. Reverse-time, genealogical views of such models, commonly known as coalescents, have been used to infer aspects of the past of growing populations. Another approach is to use branching processes, the simplest scenario being the classical linear birth-death process. Inference from evolutionary models of DNA often exploits summary statistics of the sequence data, a common one being the so-called Site Frequency Spectrum (SFS). In a bulk tumor sequencing experiment, we can estimate for each site at which a novel somatic point mutation has arisen, the proportion of cells that carry that mutation. These numbers are then grouped into collections of sites which have similar mutant fractions. We examine how the SFS based on birth-death processes differs from those based on the coalescent model. This may stem from the different sampling mechanisms in the two approaches. However, we also show that despite this, they are quantitatively comparable for the range of parameters typical for tumor cell populations. We also present a model of tumor evolution with selective sweeps, and demonstrate how it may help in understanding the history of a tumor as well as the influence of data pre-processing. We illustrate the theory with applications to several examples from The Cancer Genome Atlas tumors. Full Article
for Risk Models for Breast Cancer and Their Validation By projecteuclid.org Published On :: Tue, 03 Mar 2020 04:00 EST Adam R. Brentnall, Jack Cuzick. Source: Statistical Science, Volume 35, Number 1, 14--30.Abstract: Strategies to prevent cancer and diagnose it early when it is most treatable are needed to reduce the public health burden from rising disease incidence. Risk assessment is playing an increasingly important role in targeting individuals in need of such interventions. For breast cancer many individual risk factors have been well understood for a long time, but the development of a fully comprehensive risk model has not been straightforward, in part because there have been limited data where joint effects of an extensive set of risk factors may be estimated with precision. In this article we first review the approach taken to develop the IBIS (Tyrer–Cuzick) model, and describe recent updates. We then review and develop methods to assess calibration of models such as this one, where the risk of disease allowing for competing mortality over a long follow-up time or lifetime is estimated. The breast cancer risk model and calibration assessment methods are demonstrated using a cohort of 132,139 women attending mammography screening in the State of Washington, USA. Full Article
for Gaussianization Machines for Non-Gaussian Function Estimation Models By projecteuclid.org Published On :: Wed, 08 Jan 2020 04:00 EST T. Tony Cai. Source: Statistical Science, Volume 34, Number 4, 635--656.Abstract: A wide range of nonparametric function estimation models have been studied individually in the literature. Among them the homoscedastic nonparametric Gaussian regression is arguably the best known and understood. Inspired by the asymptotic equivalence theory, Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046) and Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433) developed a unified approach to turn a collection of non-Gaussian function estimation models into a standard Gaussian regression and any good Gaussian nonparametric regression method can then be used. These Gaussianization Machines have two key components, binning and transformation. When combined with BlockJS, a wavelet thresholding procedure for Gaussian regression, the procedures are computationally efficient with strong theoretical guarantees. Technical analysis given in Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046) and Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433) shows that the estimators attain the optimal rate of convergence adaptively over a large set of Besov spaces and across a collection of non-Gaussian function estimation models, including robust nonparametric regression, density estimation, and nonparametric regression in exponential families. The estimators are also spatially adaptive. The Gaussianization Machines significantly extend the flexibility and scope of the theories and methodologies originally developed for the conventional nonparametric Gaussian regression. This article aims to provide a concise account of the Gaussianization Machines developed in Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046), Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433). Full Article
for Conditionally Conjugate Mean-Field Variational Bayes for Logistic Models By projecteuclid.org Published On :: Fri, 11 Oct 2019 04:03 EDT Daniele Durante, Tommaso Rigon. Source: Statistical Science, Volume 34, Number 3, 472--485.Abstract: Variational Bayes (VB) is a common strategy for approximate Bayesian inference, but simple methods are only available for specific classes of models including, in particular, representations having conditionally conjugate constructions within an exponential family. Models with logit components are an apparently notable exception to this class, due to the absence of conjugacy among the logistic likelihood and the Gaussian priors for the coefficients in the linear predictor. To facilitate approximate inference within this widely used class of models, Jaakkola and Jordan ( Stat. Comput. 10 (2000) 25–37) proposed a simple variational approach which relies on a family of tangent quadratic lower bounds of the logistic log-likelihood, thus restoring conjugacy between these approximate bounds and the Gaussian priors. This strategy is still implemented successfully, but few attempts have been made to formally understand the reasons underlying its excellent performance. Following a review on VB for logistic models, we cover this gap by providing a formal connection between the above bound and a recent Pólya-gamma data augmentation for logistic regression. Such a result places the computational methods associated with the aforementioned bounds within the framework of variational inference for conditionally conjugate exponential family models, thereby allowing recent advances for this class to be inherited also by the methods relying on Jaakkola and Jordan ( Stat. Comput. 10 (2000) 25–37). Full Article
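The tangent quadratic lower bound mentioned in the abstract above can be checked numerically. The sketch below uses the standard textbook form of the Jaakkola and Jordan bound, log σ(x) ≥ log σ(ξ) + (x − ξ)/2 − λ(ξ)(x² − ξ²) with λ(ξ) = tanh(ξ/2)/(4ξ). The function names (`log_sigmoid`, `jj_lambda`, `jj_lower_bound`) are illustrative choices, not taken from the article, and the code is not the authors' implementation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def jj_lambda(xi):
    """lambda(xi) = tanh(xi / 2) / (4 * xi); its limit as xi -> 0 is 1/8."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)


def jj_lower_bound(x, xi):
    """Quadratic-in-x lower bound on log_sigmoid(x), tangent at x = +/- xi."""
    return log_sigmoid(xi) + (x - xi) / 2.0 - jj_lambda(xi) * (x * x - xi * xi)


if __name__ == "__main__":
    xi = 2.0  # variational parameter (hypothetical value for illustration)
    for x in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(x, log_sigmoid(x), jj_lower_bound(x, xi))
```

Because the bound is quadratic in x, it combines with a Gaussian prior on the coefficients to give a Gaussian approximate posterior, which is the conjugacy the abstract refers to.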
for User-Friendly Covariance Estimation for Heavy-Tailed Distributions By projecteuclid.org Published On :: Fri, 11 Oct 2019 04:03 EDT Yuan Ke, Stanislav Minsker, Zhao Ren, Qiang Sun, Wen-Xin Zhou. Source: Statistical Science, Volume 34, Number 3, 454--471.Abstract: We provide a survey of recent results on covariance estimation for heavy-tailed distributions. By unifying ideas scattered in the literature, we propose user-friendly methods that facilitate practical implementation. Specifically, we introduce elementwise and spectrumwise truncation operators, as well as their $M$-estimator counterparts, to robustify the sample covariance matrix. Different from the classical notion of robustness that is characterized by the breakdown property, we focus on the tail robustness which is evidenced by the connection between nonasymptotic deviation and confidence level. The key insight is that estimators should adapt to the sample size, dimensionality and noise level to achieve optimal tradeoff between bias and robustness. Furthermore, to facilitate practical implementation, we propose data-driven procedures that automatically calibrate the tuning parameters. We demonstrate their applications to a series of structured models in high dimensions, including the bandable and low-rank covariance matrices and sparse precision matrices. Numerical studies lend strong support to the proposed methods. Full Article
for The Geometry of Continuous Latent Space Models for Network Data By projecteuclid.org Published On :: Fri, 11 Oct 2019 04:03 EDT Anna L. Smith, Dena M. Asta, Catherine A. Calder. Source: Statistical Science, Volume 34, Number 3, 428--453.Abstract: We review the class of continuous latent space (statistical) models for network data, paying particular attention to the role of the geometry of the latent space. In these models, the presence/absence of network dyadic ties are assumed to be conditionally independent given the dyads’ unobserved positions in a latent space. In this way, these models provide a probabilistic framework for embedding network nodes in a continuous space equipped with a geometry that facilitates the description of dependence between random dyadic ties. Specifically, these models naturally capture homophilous tendencies and triadic clustering, among other common properties of observed networks. In addition to reviewing the literature on continuous latent space models from a geometric perspective, we highlight the important role the geometry of the latent space plays on properties of networks arising from these models via intuition and simulation. Finally, we discuss results from spectral graph theory that allow us to explore the role of the geometry of the latent space, independent of network size. We conclude with conjectures about how these results might be used to infer the appropriate latent space geometry from observed networks. Full Article
for Comment: Contributions of Model Features to BART Causal Inference Performance Using ACIC 2016 Competition Data By projecteuclid.org Published On :: Fri, 12 Apr 2019 04:00 EDT Nicole Bohme Carnegie. Source: Statistical Science, Volume 34, Number 1, 90--93.Abstract: With a thorough exposition of the methods and results of the 2016 Atlantic Causal Inference Competition, Dorie et al. have set a new standard for reproducibility and comparability of evaluations of causal inference methods. In particular, the open-source R package aciccomp2016, which permits reproduction of all datasets used in the competition, will be an invaluable resource for evaluation of future methodological developments. Building upon results from Dorie et al., we examine whether a set of potential modifications to Bayesian Additive Regression Trees (BART)—multiple chains in model fitting, using the propensity score as a covariate, targeted maximum likelihood estimation (TMLE), and computing symmetric confidence intervals—have a stronger impact on bias, RMSE, and confidence interval coverage in combination than they do alone. We find that bias in the estimate of SATT is minimal, regardless of the BART formulation. For purposes of CI coverage, however, all proposed modifications are beneficial—alone and in combination—but use of TMLE is least beneficial for coverage and results in considerably wider confidence intervals. Full Article
for Comment on “Automated Versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition” By projecteuclid.org Published On :: Fri, 12 Apr 2019 04:00 EDT Susan Gruber, Mark J. van der Laan. Source: Statistical Science, Volume 34, Number 1, 82--85.Abstract: Dorie and co-authors (DHSSC) are to be congratulated for initiating the ACIC Data Challenge. Their project engaged the community and accelerated research by providing a level playing field for comparing the performance of a priori specified algorithms. DHSSC identified themes concerning characteristics of the DGP, properties of the estimators, and inference. We discuss these themes in the context of targeted learning. Full Article
for Matching Methods for Causal Inference: A Review and a Look Forward By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Elizabeth A. Stuart. Source: Statist. Sci., Volume 25, Number 1, 1--21.Abstract: When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970s, work on matching methods has examined how to best choose treated and control subjects for comparison. Matching methods are gaining popularity in fields such as economics, epidemiology, medicine and political science. However, until now the literature and related advice has been scattered across disciplines. Researchers who are interested in using matching methods, or developing methods related to matching, do not have a single place to turn to learn about past and current research. This paper provides a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and providing a summary of where the literature on matching methods is now and where it should be headed. Full Article
for We thank you for not smoking / design : Biman Mullick. By search.wellcomelibrary.org Published On :: London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?] Full Article
for We thank you for not smoking / Biman Mullick. By search.wellcomelibrary.org Published On :: London : Cleanair, [1988?] Full Article
for Smoking is bad for your image / design : Biman Mullick. By search.wellcomelibrary.org Published On :: [London?], [199-?] Full Article
for The 2019 Victoria’s Secret Fashion Show Is Canceled After Facing Backlash for Lack of Body Diversity By www.health.com Published On :: Fri, 22 Nov 2019 13:30:29 -0500 The reaction on social media has been fierce. Full Article
for Blake Lively's Favorite Affordable Jeans Brand Is Having a Major Sale Right Now By www.health.com Published On :: Mon, 25 Nov 2019 16:40:00 -0500 Here's everything you need to know about Old Navy's Black Friday and Cyber Monday plans. Full Article
for Healthy Holiday Gift Ideas for Women By www.health.com Published On :: Tue, 26 Nov 2019 12:41:16 -0500 Treat the babe in your life to one (or two or three) of these indulgent gifts. Full Article
for Editor’s Pick: Gifts for Your Tech-Obsessed Friend By www.health.com Published On :: Tue, 26 Nov 2019 12:49:30 -0500 A guide to the tech gadgets even your hard-to-shop-for friends and family members will love. Full Article
for Gabrielle Union's Mesmerizing Tie Dye Activewear Set Is On Sale for Black Friday By www.health.com Published On :: Wed, 27 Nov 2019 15:35:02 -0500 The rainbow sports bra and leggings set from Splits59 is a must-have for anyone craving a pop of color in their workout wardrobe. Full Article
for These Nordstrom Cyber Monday Deals Are Giving Black Friday a Run for Its Money By www.health.com Published On :: Wed, 21 Nov 2018 10:40:05 -0500 This is not a drill: You can get up to 50% off at Nordstrom right now. Full Article
for Katie Holmes’s Affordable Sneakers Are the Star of Her Latest Outfit By www.health.com Published On :: Thu, 05 Dec 2019 16:45:45 -0500 Meghan Markle is also a fan of the comfy shoes. Full Article
for These Clark Booties Are Actually Comfortable Enough to Wear All Day—and They’re on Sale By www.health.com Published On :: Sun, 08 Dec 2019 09:36:02 -0500 You can save 50% right now. Full Article
for Forget Black Booties, Amal Clooney and J.Lo Are Wearing This Weather-Resistant Boot Trend Instead By www.health.com Published On :: Tue, 10 Dec 2019 15:31:21 -0500 And it’s on sale at Nordstrom. Full Article
for Sweatsuits Should Be Your Cozy Day Uniform—and These Are Our Favorites From Amazon By www.health.com Published On :: Wed, 11 Dec 2019 11:39:15 -0500 This retro style is making a comeback for a reason. Full Article
for Nike Launches Zoom Pulse Sneakers for Medical Workers Who Are On Their Feet All Day By www.health.com Published On :: Fri, 13 Dec 2019 13:45:17 -0500 The new style is available to shop today. Full Article
for Multisensory Integration and the Society for Neuroscience: Then and Now By www.jneurosci.org Published On :: 2020-01-02 Barry E. Stein. Jan 2, 2020; 40:3-11. Viewpoints Full Article
for Optimization of a GCaMP Calcium Indicator for Neural Activity Imaging By www.jneurosci.org Published On :: 2012-10-03 Jasper Akerboom. Oct 3, 2012; 32:13819-13840. Cellular Full Article
for Dissociable Intrinsic Connectivity Networks for Salience Processing and Executive Control By www.jneurosci.org Published On :: 2007-02-28 William W. Seeley. Feb 28, 2007; 27:2349-2356. Behavioral/Systems/Cognitive Full Article