pr

Functional weak limit theorem for a local empirical process of non-stationary time series and its application

Ulrike Mayer, Henryk Zähle, Zhou Zhou.

Source: Bernoulli, Volume 26, Number 3, 1891--1911.

Abstract:
We derive a functional weak limit theorem for a local empirical process of a wide class of piece-wise locally stationary (PLS) time series. The latter result is applied to derive the asymptotics of weighted empirical quantiles and weighted V-statistics of non-stationary time series. The class of admissible underlying time series is illustrated by means of PLS linear processes and PLS ARCH processes.




pr

Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids

Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard.

Source: Bernoulli, Volume 26, Number 3, 1727--1764.

Abstract:
We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $alpha$-differential privacy and provide a lower bound on the rate of convergence over Besov spaces $mathcal{B}^{s}_{pq}$ under mean integrated $mathbb{L}^{r}$-risk. This lower bound is deteriorated compared to the standard setup without privacy, and reveals a twofold elbow effect. In order to fulfill the privacy requirement, we suggest adding suitably scaled Laplace noise to empirical wavelet coefficients. Upper bounds within (at most) a logarithmic factor are derived under the assumption that $alpha$ stays bounded as $n$ increases: A linear but non-adaptive wavelet estimator is shown to attain the lower bound whenever $pgeq r$ but provides a slower rate of convergence otherwise. An adaptive non-linear wavelet estimator with appropriately chosen smoothing parameters and thresholding is shown to attain the lower bound within a logarithmic factor for all cases.




pr

On the eigenproblem for Gaussian bridges

Pavel Chigansky, Marina Kleptsyna, Dmytro Marushkevych.

Source: Bernoulli, Volume 26, Number 3, 1706--1726.

Abstract:
Spectral decomposition of the covariance operator is one of the main building blocks in the theory and applications of Gaussian processes. Unfortunately, it is notoriously hard to derive in a closed form. In this paper, we consider the eigenproblem for Gaussian bridges. Given a base process, its bridge is obtained by conditioning the trajectories to start and terminate at the given points. What can be said about the spectrum of a bridge, given the spectrum of its base process? We show how this question can be answered asymptotically for a family of processes, including the fractional Brownian motion.




pr

Influence of the seed in affine preferential attachment trees

David Corlin Marchand, Ioan Manolescu.

Source: Bernoulli, Volume 26, Number 3, 1665--1705.

Abstract:
We study randomly growing trees governed by the affine preferential attachment rule. Starting with a seed tree $S$, vertices are attached one by one, each linked by an edge to a random vertex of the current tree, chosen with a probability proportional to an affine function of its degree. This yields a one-parameter family of preferential attachment trees $(T_{n}^{S})_{ngeq |S|}$, of which the linear model is a particular case. Depending on the choice of the parameter, the power-laws governing the degrees in $T_{n}^{S}$ have different exponents. We study the problem of the asymptotic influence of the seed $S$ on the law of $T_{n}^{S}$. We show that, for any two distinct seeds $S$ and $S'$, the laws of $T_{n}^{S}$ and $T_{n}^{S'}$ remain at uniformly positive total-variation distance as $n$ increases. This is a continuation of Curien et al. ( J. Éc. Polytech. Math. 2 (2015) 1–34), which in turn was inspired by a conjecture of Bubeck et al. ( IEEE Trans. Netw. Sci. Eng. 2 (2015) 30–39). The technique developed here is more robust than previous ones and is likely to help in the study of more general attachment mechanisms.




pr

On the probability distribution of the local times of diagonally operator-self-similar Gaussian fields with stationary increments

Kamran Kalbasi, Thomas Mountford.

Source: Bernoulli, Volume 26, Number 2, 1504--1534.

Abstract:
In this paper, we study the local times of vector-valued Gaussian fields that are ‘diagonally operator-self-similar’ and whose increments are stationary. Denoting the local time of such a Gaussian field around the spatial origin and over the temporal unit hypercube by $Z$, we show that there exists $lambdain(0,1)$ such that under some quite weak conditions, $lim_{n ightarrow+infty}frac{sqrt[n]{mathbb{E}(Z^{n})}}{n^{lambda}}$ and $lim_{x ightarrow+infty}frac{-logmathbb{P}(Z>x)}{x^{frac{1}{lambda}}}$ both exist and are strictly positive (possibly $+infty$). Moreover, we show that if the underlying Gaussian field is ‘strongly locally nondeterministic’, the above limits will be finite as well. These results are then applied to establish similar statements for the intersection local times of diagonally operator-self-similar Gaussian fields with stationary increments.




pr

A characterization of the finiteness of perpetual integrals of Lévy processes

Martin Kolb, Mladen Savov.

Source: Bernoulli, Volume 26, Number 2, 1453--1472.

Abstract:
We derive a criterium for the almost sure finiteness of perpetual integrals of Lévy processes for a class of real functions including all continuous functions and for general one-dimensional Lévy processes that drifts to plus infinity. This generalizes previous work of Döring and Kyprianou, who considered Lévy processes having a local time, leaving the general case as an open problem. It turns out, that the criterium in the general situation simplifies significantly in the situation, where the process has a local time, but we also demonstrate that in general our criterium can not be reduced. This answers an open problem posed in ( J. Theoret. Probab. 29 (2016) 1192–1198).




pr

On stability of traveling wave solutions for integro-differential equations related to branching Markov processes

Pasha Tkachov.

Source: Bernoulli, Volume 26, Number 2, 1354--1380.

Abstract:
The aim of this paper is to prove stability of traveling waves for integro-differential equations connected with branching Markov processes. In other words, the limiting law of the left-most particle of a (time-continuous) branching Markov process with a Lévy non-branching part is demonstrated. The key idea is to approximate the branching Markov process by a branching random walk and apply the result of Aïdékon [ Ann. Probab. 41 (2013) 1362–1426] on the limiting law of the latter one.




pr

A new McKean–Vlasov stochastic interpretation of the parabolic–parabolic Keller–Segel model: The one-dimensional case

Denis Talay, Milica Tomašević.

Source: Bernoulli, Volume 26, Number 2, 1323--1353.

Abstract:
In this paper, we analyze a stochastic interpretation of the one-dimensional parabolic–parabolic Keller–Segel system without cut-off. It involves an original type of McKean–Vlasov interaction kernel. At the particle level, each particle interacts with all the past of each other particle by means of a time integrated functional involving a singular kernel. At the mean-field level studied here, the McKean–Vlasov limit process interacts with all the past time marginals of its probability distribution in a similarly singular way. We prove that the parabolic–parabolic Keller–Segel system in the whole Euclidean space and the corresponding McKean–Vlasov stochastic differential equation are well-posed for any values of the parameters of the model.




pr

Rates of convergence in de Finetti’s representation theorem, and Hausdorff moment problem

Emanuele Dolera, Stefano Favaro.

Source: Bernoulli, Volume 26, Number 2, 1294--1322.

Abstract:
Given a sequence ${X_{n}}_{ngeq 1}$ of exchangeable Bernoulli random variables, the celebrated de Finetti representation theorem states that $frac{1}{n}sum_{i=1}^{n}X_{i}stackrel{a.s.}{longrightarrow }Y$ for a suitable random variable $Y:Omega ightarrow [0,1]$ satisfying $mathsf{P}[X_{1}=x_{1},dots ,X_{n}=x_{n}|Y]=Y^{sum_{i=1}^{n}x_{i}}(1-Y)^{n-sum_{i=1}^{n}x_{i}}$. In this paper, we study the rate of convergence in law of $frac{1}{n}sum_{i=1}^{n}X_{i}$ to $Y$ under the Kolmogorov distance. After showing that a rate of the type of $1/n^{alpha }$ can be obtained for any index $alpha in (0,1]$, we find a sufficient condition on the distribution of $Y$ for the achievement of the optimal rate of convergence, that is $1/n$. Besides extending and strengthening recent results under the weaker Wasserstein distance, our main result weakens the regularity hypotheses on $Y$ in the context of the Hausdorff moment problem.




pr

Characterization of probability distribution convergence in Wasserstein distance by $L^{p}$-quantization error function

Yating Liu, Gilles Pagès.

Source: Bernoulli, Volume 26, Number 2, 1171--1204.

Abstract:
We establish conditions to characterize probability measures by their $L^{p}$-quantization error functions in both $mathbb{R}^{d}$ and Hilbert settings. This characterization is two-fold: static (identity of two distributions) and dynamic (convergence for the $L^{p}$-Wasserstein distance). We first propose a criterion on the quantization level $N$, valid for any norm on $mathbb{R}^{d}$ and any order $p$ based on a geometrical approach involving the Voronoï diagram. Then, we prove that in the $L^{2}$-case on a (separable) Hilbert space, the condition on the level $N$ can be reduced to $N=2$, which is optimal. More quantization based characterization cases in dimension 1 and a discussion of the completeness of a distance defined by the quantization error function can be found at the end of this paper.




pr

Interacting reinforced stochastic processes: Statistical inference based on the weighted empirical means

Giacomo Aletti, Irene Crimaldi, Andrea Ghiglietti.

Source: Bernoulli, Volume 26, Number 2, 1098--1138.

Abstract:
This work deals with a system of interacting reinforced stochastic processes , where each process $X^{j}=(X_{n,j})_{n}$ is located at a vertex $j$ of a finite weighted directed graph, and it can be interpreted as the sequence of “actions” adopted by an agent $j$ of the network. The interaction among the dynamics of these processes depends on the weighted adjacency matrix $W$ associated to the underlying graph: indeed, the probability that an agent $j$ chooses a certain action depends on its personal “inclination” $Z_{n,j}$ and on the inclinations $Z_{n,h}$, with $h eq j$, of the other agents according to the entries of $W$. The best known example of reinforced stochastic process is the Pólya urn. The present paper focuses on the weighted empirical means $N_{n,j}=sum_{k=1}^{n}q_{n,k}X_{k,j}$, since, for example, the current experience is more important than the past one in reinforced learning. Their almost sure synchronization and some central limit theorems in the sense of stable convergence are proven. The new approach with weighted means highlights the key points in proving some recent results for the personal inclinations $Z^{j}=(Z_{n,j})_{n}$ and for the empirical means $overline{X}^{j}=(sum_{k=1}^{n}X_{k,j}/n)_{n}$ given in recent papers (e.g. Aletti, Crimaldi and Ghiglietti (2019), Ann. Appl. Probab. 27 (2017) 3787–3844, Crimaldi et al. Stochastic Process. Appl. 129 (2019) 70–101). In fact, with a more sophisticated decomposition of the considered processes, we can understand how the different convergence rates of the involved stochastic processes combine. From an application point of view, we provide confidence intervals for the common limit inclination of the agents and a test statistics to make inference on the matrix $W$, based on the weighted empirical means. In particular, we answer a research question posed in Aletti, Crimaldi and Ghiglietti (2019).




pr

A Bayesian nonparametric approach to log-concave density estimation

Ester Mariucci, Kolyan Ray, Botond Szabó.

Source: Bernoulli, Volume 26, Number 2, 1070--1097.

Abstract:
The estimation of a log-concave density on $mathbb{R}$ is a canonical problem in the area of shape-constrained nonparametric inference. We present a Bayesian nonparametric approach to this problem based on an exponentiated Dirichlet process mixture prior and show that the posterior distribution converges to the log-concave truth at the (near-) minimax rate in Hellinger distance. Our proof proceeds by establishing a general contraction result based on the log-concave maximum likelihood estimator that prevents the need for further metric entropy calculations. We further present computationally more feasible approximations and both an empirical and hierarchical Bayes approach. All priors are illustrated numerically via simulations.




pr

A unified principled framework for resampling based on pseudo-populations: Asymptotic theory

Pier Luigi Conti, Daniela Marella, Fulvia Mecatti, Federico Andreis.

Source: Bernoulli, Volume 26, Number 2, 1044--1069.

Abstract:
In this paper, a class of resampling techniques for finite populations under $pi $ps sampling design is introduced. The basic idea on which they rest is a two-step procedure consisting in: (i) constructing a “pseudo-population” on the basis of sample data; (ii) drawing a sample from the predicted population according to an appropriate resampling design. From a logical point of view, this approach is essentially based on the plug-in principle by Efron, at the “sampling design level”. Theoretical justifications based on large sample theory are provided. New approaches to construct pseudo populations based on various forms of calibrations are proposed. Finally, a simulation study is performed.




pr

Stable processes conditioned to hit an interval continuously from the outside

Leif Döring, Philip Weissmann.

Source: Bernoulli, Volume 26, Number 2, 980--1015.

Abstract:
Conditioning stable Lévy processes on zero probability events recently became a tractable subject since several explicit formulas emerged from a deep analysis using the Lamperti transformations for self-similar Markov processes. In this article, we derive new harmonic functions and use them to explain how to condition stable processes to hit continuously a compact interval from the outside.




pr

Distances and large deviations in the spatial preferential attachment model

Christian Hirsch, Christian Mönch.

Source: Bernoulli, Volume 26, Number 2, 927--947.

Abstract:
This paper considers two asymptotic properties of a spatial preferential-attachment model introduced by E. Jacob and P. Mörters (In Algorithms and Models for the Web Graph (2013) 14–25 Springer). First, in a regime of strong linear reinforcement, we show that typical distances are at most of doubly-logarithmic order. Second, we derive a large deviation principle for the empirical neighbourhood structure and express the rate function as solution to an entropy minimisation problem in the space of stationary marked point processes.




pr

Convergence of the age structure of general schemes of population processes

Jie Yen Fan, Kais Hamza, Peter Jagers, Fima Klebaner.

Source: Bernoulli, Volume 26, Number 2, 893--926.

Abstract:
We consider a family of general branching processes with reproduction parameters depending on the age of the individual as well as the population age structure and a parameter $K$, which may represent the carrying capacity. These processes are Markovian in the age structure. In a previous paper ( Proc. Steklov Inst. Math. 282 (2013) 90–105), the Law of Large Numbers as $K o infty $ was derived. Here we prove the central limit theorem, namely the weak convergence of the fluctuation processes in an appropriate Skorokhod space. We also show that the limit is driven by a stochastic partial differential equation.




pr

Stochastic differential equations with a fractionally filtered delay: A semimartingale model for long-range dependent processes

Richard A. Davis, Mikkel Slot Nielsen, Victor Rohde.

Source: Bernoulli, Volume 26, Number 2, 799--827.

Abstract:
In this paper, we introduce a model, the stochastic fractional delay differential equation (SFDDE), which is based on the linear stochastic delay differential equation and produces stationary processes with hyperbolically decaying autocovariance functions. The model departs from the usual way of incorporating this type of long-range dependence into a short-memory model as it is obtained by applying a fractional filter to the drift term rather than to the noise term. The advantages of this approach are that the corresponding long-range dependent solutions are semimartingales and the local behavior of the sample paths is unaffected by the degree of long memory. We prove existence and uniqueness of solutions to the SFDDEs and study their spectral densities and autocovariance functions. Moreover, we define a subclass of SFDDEs which we study in detail and relate to the well-known fractionally integrated CARMA processes. Finally, we consider the task of simulating from the defining SFDDEs.




pr

Robust modifications of U-statistics and applications to covariance estimation problems

Stanislav Minsker, Xiaohan Wei.

Source: Bernoulli, Volume 26, Number 1, 694--727.

Abstract:
Let $Y$ be a $d$-dimensional random vector with unknown mean $mu $ and covariance matrix $Sigma $. This paper is motivated by the problem of designing an estimator of $Sigma $ that admits exponential deviation bounds in the operator norm under minimal assumptions on the underlying distribution, such as existence of only 4th moments of the coordinates of $Y$. To address this problem, we propose robust modifications of the operator-valued U-statistics, obtain non-asymptotic guarantees for their performance, and demonstrate the implications of these results to the covariance estimation problem under various structural assumptions.




pr

A unified approach to coupling SDEs driven by Lévy noise and some applications

Mingjie Liang, René L. Schilling, Jian Wang.

Source: Bernoulli, Volume 26, Number 1, 664--693.

Abstract:
We present a general method to construct couplings of stochastic differential equations driven by Lévy noise in terms of coupling operators. This approach covers both coupling by reflection and refined basic coupling which are often discussed in the literature. As applications, we prove regularity results for the transition semigroups and obtain successful couplings for the solutions to stochastic differential equations driven by additive Lévy noise.




pr

Normal approximation for sums of weighted $U$-statistics – application to Kolmogorov bounds in random subgraph counting

Nicolas Privault, Grzegorz Serafin.

Source: Bernoulli, Volume 26, Number 1, 587--615.

Abstract:
We derive normal approximation bounds in the Kolmogorov distance for sums of discrete multiple integrals and weighted $U$-statistics made of independent Bernoulli random variables. Such bounds are applied to normal approximation for the renormalized subgraph counts in the Erdős–Rényi random graph. This approach completely solves a long-standing conjecture in the general setting of arbitrary graph counting, while recovering recent results obtained for triangles and improving other bounds in the Wasserstein distance.




pr

Tail expectile process and risk assessment

Abdelaati Daouia, Stéphane Girard, Gilles Stupfler.

Source: Bernoulli, Volume 26, Number 1, 531--556.

Abstract:
Expectiles define a least squares analogue of quantiles. They are determined by tail expectations rather than tail probabilities. For this reason and many other theoretical and practical merits, expectiles have recently received a lot of attention, especially in actuarial and financial risk management. Their estimation, however, typically requires to consider non-explicit asymmetric least squares estimates rather than the traditional order statistics used for quantile estimation. This makes the study of the tail expectile process a lot harder than that of the standard tail quantile process. Under the challenging model of heavy-tailed distributions, we derive joint weighted Gaussian approximations of the tail empirical expectile and quantile processes. We then use this powerful result to introduce and study new estimators of extreme expectiles and the standard quantile-based expected shortfall, as well as a novel expectile-based form of expected shortfall. Our estimators are built on general weighted combinations of both top order statistics and asymmetric least squares estimates. Some numerical simulations and applications to actuarial and financial data are provided.




pr

Weak convergence of quantile and expectile processes under general assumptions

Tobias Zwingmann, Hajo Holzmann.

Source: Bernoulli, Volume 26, Number 1, 323--351.

Abstract:
We show weak convergence of quantile and expectile processes to Gaussian limit processes in the space of bounded functions endowed with an appropriate semimetric which is based on the concepts of epi- and hypo- convergence as introduced in A. Bücher, J. Segers and S. Volgushev (2014), ‘ When Uniform Weak Convergence Fails: Empirical Processes for Dependence Functions and Residuals via Epi- and Hypographs ’, Annals of Statistics 42 . We impose assumptions for which it is known that weak convergence with respect to the supremum norm generally fails to hold. For quantiles, we consider stationary observations, where the marginal distribution function is assumed to be strictly increasing and continuous except for finitely many points and to admit strictly positive – possibly infinite – left- and right-sided derivatives. For expectiles, we focus on independent and identically distributed (i.i.d.) observations. Only a finite second moment and continuity at the boundary points but no further smoothness properties of the distribution function are required. We also show consistency of the bootstrap for this mode of convergence in the i.i.d. case for quantiles and expectiles.




pr

Prediction and estimation consistency of sparse multi-class penalized optimal scoring

Irina Gaynanova.

Source: Bernoulli, Volume 26, Number 1, 286--322.

Abstract:
Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the corresponding prediction and estimation consistency results have been lacking. We bridge this gap by providing probabilistic bounds on out-of-sample prediction error and estimation error of multi-class penalized optimal scoring allowing for diverging number of classes.




pr

A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables

Michael V. Boutsikas, Eutichia Vaggelatou

Source: Bernoulli, Volume 16, Number 2, 301--330.

Abstract:
Let X 1 , X 2 , …, X n be a sequence of independent or locally dependent random variables taking values in ℤ + . In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum ∑ i =1 n X i and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This “smoothness factor” is of order O( σ −2 ), according to a heuristic argument, where σ 2 denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.




pr

Discover Protestant nonconformity in England and Wales / Paul Blake.

Dissenters, Religious -- Great Britain.




pr

Our Lady of Grace family page of history : a bookweek bicentennial project / edited by Janeen Brian.

Our Lady of Grace School (Glengowrie, S.A.)




pr

Austin-Area District Looks for Digital/Blended Learning Program; Baltimore Seeks High School Literacy Program

The Round Rock Independent School District in Texas is looking for a digital curriculum and blended learning program. Baltimore is looking for a comprehensive high school literacy program.

The post Austin-Area District Looks for Digital/Blended Learning Program; Baltimore Seeks High School Literacy Program appeared first on Market Brief.



  • Purchasing Alert
  • Curriculum / Digital Curriculum
  • Educational Technology/Ed-Tech
  • Learning Management / Student Information Systems
  • Procurement / Purchasing / RFPs

pr

ACT and Teachers’ Union Partner to Provide Remote Learning Resources Amid Pandemic

ACT and the American Federation of Teachers are partnering to provide free resources as educators increasingly switch to distance learning amid the COVID-19 pandemic.

The post ACT and Teachers’ Union Partner to Provide Remote Learning Resources Amid Pandemic appeared first on Market Brief.




pr

Item 01: Captain Vernon Sturdee diary, 25 April, 1915 to 2 July, 1915




pr

Item 07: A Journal of ye [the] Proceedings of his Majesty's Sloop Swallow, Captain Phillip [Philip] Carteret Commander, Commencing ye [the] 23 of July 1766 and ended [4 July 1767]




pr

Item 08: A Logg [Log] Book of the proceedings on Board His Majesty's Ship Swallow, Captain Philip Carteret Commander Commencing from the 20th August 1766 and Ending [21st May 1768]




pr

Item 13: Swallow 1767, A journal of the proceedings on Board His Majesty's Sloop Swallow, commencing the 1st of March 1767 and Ended the 7th of July 1767




pr

Sydney in 1848 : illustrated by copper-plate engravings of its principal streets, public buildings, churches, chapels, etc. / from drawings by Joseph Fowles.




pr

Russia probe transcripts released by House Intelligence Committee

Reaction and analysis from Fox News contributor Byron York and former Florida Attorney General Pam Bondi.





pr

Pence aimed to project normalcy during his trip to Iowa, but coronavirus got in the way

Vice President Pence’s trip to Iowa shows how the Trump administration’s aims to move past coronavirus are sometimes complicated by the virus itself.





pr

Brazil's Amazon: Surge in deforestation as military prepares to deploy

The military is preparing to deploy to the region to try to stop illegal logging and mining.





pr

The accusation against Joe Biden has Democrats rediscovering the value of due process

Some Democrats took "Believe Women" literally until Joe Biden was accused. Now they're relearning that guilt-by-accusation doesn't serve justice.





pr

Pence press secretary tests positive for coronavirus

The news comes shortly after a valet who served meals to President Trump also tested positive for the virus.





pr

Bayesian Sparse Multivariate Regression with Asymmetric Nonlocal Priors for Microbiome Data Analysis

Kurtis Shuler, Marilou Sison-Mangus, Juhee Lee.

Source: Bayesian Analysis, Volume 15, Number 2, 559--578.

Abstract:
We propose a Bayesian sparse multivariate regression method to model the relationship between microbe abundance and environmental factors for microbiome data. We model abundance counts of operational taxonomic units (OTUs) with a negative binomial distribution and relate covariates to the counts through regression. Extending conventional nonlocal priors, we construct asymmetric nonlocal priors for regression coefficients to efficiently identify relevant covariates and their effect directions. We build a hierarchical model to facilitate pooling of information across OTUs that produces parsimonious results with improved accuracy. We present simulation studies that compare variable selection performance under the proposed model to those under Bayesian sparse regression models with asymmetric and symmetric local priors and two frequentist models. The simulations show the proposed model identifies important covariates and yields coefficient estimates with favorable accuracy compared with the alternatives. The proposed model is applied to analyze an ocean microbiome dataset collected over time to study the association of harmful algal bloom conditions with microbial communities.




pr

A Loss-Based Prior for Variable Selection in Linear Regression Methods

Cristiano Villa, Jeong Eun Lee.

Source: Bayesian Analysis, Volume 15, Number 2, 533--558.

Abstract:
In this work we propose a novel model prior for variable selection in linear regression. The idea is to determine the prior mass by considering the worth of each of the regression models, given the number of possible covariates under consideration. The worth of a model consists of the information loss and the loss due to model complexity. While the information loss is determined objectively, the loss expression due to model complexity is flexible and, the penalty on model size can be even customized to include some prior knowledge. Some versions of the loss-based prior are proposed and compared empirically. Through simulation studies and real data analyses, we compare the proposed prior to the Scott and Berger prior, for noninformative scenarios, and with the Beta-Binomial prior, for informative scenarios.




pr

Additive Multivariate Gaussian Processes for Joint Species Distribution Modeling with Heterogeneous Data

Jarno Vanhatalo, Marcelo Hartmann, Lari Veneranta.

Source: Bayesian Analysis, Volume 15, Number 2, 415--447.

Abstract:
Species distribution models (SDM) are a key tool in ecology, conservation and management of natural resources. Two key components of the state-of-the-art SDMs are the description for species distribution response along environmental covariates and the spatial random effect that captures deviations from the distribution patterns explained by environmental covariates. Joint species distribution models (JSDMs) additionally include interspecific correlations which have been shown to improve their descriptive and predictive performance compared to single species models. However, current JSDMs are restricted to hierarchical generalized linear modeling framework. Their limitation is that parametric models have trouble in explaining changes in abundance due, for example, highly non-linear physical tolerance limits which is particularly important when predicting species distribution in new areas or under scenarios of environmental change. On the other hand, semi-parametric response functions have been shown to improve the predictive performance of SDMs in these tasks in single species models. Here, we propose JSDMs where the responses to environmental covariates are modeled with additive multivariate Gaussian processes coded as linear models of coregionalization. These allow inference for wide range of functional forms and interspecific correlations between the responses. We propose also an efficient approach for inference with Laplace approximation and parameterization of the interspecific covariance matrices on the Euclidean space. We demonstrate the benefits of our model with two small scale examples and one real world case study. We use cross-validation to compare the proposed model to analogous semi-parametric single species models and parametric single and joint species models in interpolation and extrapolation tasks. The proposed model outperforms the alternative models in all cases. We also show that the proposed model can be seen as an extension of the current state-of-the-art JSDMs to semi-parametric models.




pr

A New Bayesian Approach to Robustness Against Outliers in Linear Regression

Philippe Gagnon, Alain Desgagné, Mylène Bédard.

Source: Bayesian Analysis, Volume 15, Number 2, 389--414.

Abstract:
Linear regression is ubiquitous in statistical analysis. It is well understood that conflicting sources of information may contaminate the inference when the classical normality of errors is assumed. The contamination caused by the light normal tails follows from an undesirable effect: the posterior concentrates in an area in between the different sources with a large enough scaling to incorporate them all. The theory of conflict resolution in Bayesian statistics (O’Hagan and Pericchi (2012)) recommends to address this problem by limiting the impact of outliers to obtain conclusions consistent with the bulk of the data. In this paper, we propose a model with super heavy-tailed errors to achieve this. We prove that it is wholly robust, meaning that the impact of outliers gradually vanishes as they move further and further away from the general trend. The super heavy-tailed density is similar to the normal outside of the tails, which gives rise to an efficient estimation procedure. In addition, estimates are easily computed. This is highlighted via a detailed user guide, where all steps are explained through a simulated case study. The performance is shown using simulation. All required code is given.




pr

Dynamic Quantile Linear Models: A Bayesian Approach

Kelly C. M. Gonçalves, Hélio S. Migon, Leonardo S. Bastos.

Source: Bayesian Analysis, Volume 15, Number 2, 335--362.

Abstract:
The paper introduces a new class of models, named dynamic quantile linear models, which combines dynamic linear models with distribution-free quantile regression producing a robust statistical method. Bayesian estimation for the dynamic quantile linear model is performed using an efficient Markov chain Monte Carlo algorithm. The paper also proposes a fast sequential procedure suited for high-dimensional predictive modeling with massive data, where the generating process is changing over time. The proposed model is evaluated using synthetic and well-known time series data. The model is also applied to predict annual incidence of tuberculosis in the state of Rio de Janeiro and compared with global targets set by the World Health Organization.




pr

A Novel Algorithmic Approach to Bayesian Logic Regression (with Discussion)

Aliaksandr Hubin, Geir Storvik, Florian Frommlet.

Source: Bayesian Analysis, Volume 15, Number 1, 263--333.

Abstract:
Logic regression was developed more than a decade ago as a tool to construct predictors from Boolean combinations of binary covariates. It has been mainly used to model epistatic effects in genetic association studies, which is very appealing due to the intuitive interpretation of logic expressions to describe the interaction between genetic variations. Nevertheless logic regression has (partly due to computational challenges) remained less well known than other approaches to epistatic association mapping. Here we will adapt an advanced evolutionary algorithm called GMJMCMC (Genetically modified Mode Jumping Markov Chain Monte Carlo) to perform Bayesian model selection in the space of logic regression models. After describing the algorithmic details of GMJMCMC we perform a comprehensive simulation study that illustrates its performance given logic regression terms of various complexity. Specifically GMJMCMC is shown to be able to identify three-way and even four-way interactions with relatively large power, a level of complexity which has not been achieved by previous implementations of logic regression. We apply GMJMCMC to reanalyze QTL (quantitative trait locus) mapping data for Recombinant Inbred Lines in Arabidopsis thaliana and from a backcross population in Drosophila where we identify several interesting epistatic effects. The method is implemented in an R package which is available on github.




pr

High-Dimensional Posterior Consistency for Hierarchical Non-Local Priors in Regression

Xuan Cao, Kshitij Khare, Malay Ghosh.

Source: Bayesian Analysis, Volume 15, Number 1, 241--262.

Abstract:
The choice of tuning parameters in Bayesian variable selection is a critical problem in modern statistics. In particular, for Bayesian linear regression with non-local priors, the scale parameter in the non-local prior density is an important tuning parameter which reflects the dispersion of the non-local prior density around zero, and implicitly determines the size of the regression coefficients that will be shrunk to zero. Current approaches treat the scale parameter as given, and suggest choices based on prior coverage/asymptotic considerations. In this paper, we consider the fully Bayesian approach introduced in (Wu, 2016) with the pMOM non-local prior and an appropriate Inverse-Gamma prior on the tuning parameter to analyze the underlying theoretical property. Under standard regularity assumptions, we establish strong model selection consistency in a high-dimensional setting, where $p$ is allowed to increase at a polynomial rate with $n$ or even at a sub-exponential rate with $n$ . Through simulation studies, we demonstrate that our model selection procedure can outperform other Bayesian methods which treat the scale parameter as given, and commonly used penalized likelihood methods, in a range of simulation settings.




pr

Learning Semiparametric Regression with Missing Covariates Using Gaussian Process Models

Abhishek Bishoyi, Xiaojing Wang, Dipak K. Dey.

Source: Bayesian Analysis, Volume 15, Number 1, 215--239.

Abstract:
Missing data often appear as a practical problem while applying classical models in the statistical analysis. In this paper, we consider a semiparametric regression model in the presence of missing covariates for nonparametric components under a Bayesian framework. As it is known that Gaussian processes are a popular tool in nonparametric regression because of their flexibility and the fact that much of the ensuing computation is parametric Gaussian computation. However, in the absence of covariates, the most frequently used covariance functions of a Gaussian process will not be well defined. We propose an imputation method to solve this issue and perform our analysis using Bayesian inference, where we specify the objective priors on the parameters of Gaussian process models. Several simulations are conducted to illustrate effectiveness of our proposed method and further, our method is exemplified via two real datasets, one through Langmuir equation, commonly used in pharmacokinetic models, and another through Auto-mpg data taken from the StatLib library.




pr

Determinantal Point Process Mixtures Via Spectral Density Approach

Ilaria Bianchini, Alessandra Guglielmi, Fernando A. Quintana.

Source: Bayesian Analysis, Volume 15, Number 1, 187--214.

Abstract:
We consider mixture models where location parameters are a priori encouraged to be well separated. We explore a class of determinantal point process (DPP) mixture models, which provide the desired notion of separation or repulsion. Instead of using the rather restrictive case where analytical results are partially available, we adopt a spectral representation from which approximations to the DPP density functions can be readily computed. For the sake of concreteness the presentation focuses on a power exponential spectral density, but the proposed approach is in fact quite general. We later extend our model to incorporate covariate information in the likelihood and also in the assignment to mixture components, yielding a trade-off between repulsiveness of locations in the mixtures and attraction among subjects with similar covariates. We develop full Bayesian inference, and explore model properties and posterior behavior using several simulation scenarios and data illustrations. Supplementary materials for this article are available online (Bianchini et al., 2019).




pr

Bayesian Network Marker Selection via the Thresholded Graph Laplacian Gaussian Prior

Qingpo Cai, Jian Kang, Tianwei Yu.

Source: Bayesian Analysis, Volume 15, Number 1, 79--102.

Abstract:
Selecting informative nodes over large-scale networks becomes increasingly important in many research areas. Most existing methods focus on the local network structure and incur heavy computational costs for the large-scale problem. In this work, we propose a novel prior model for Bayesian network marker selection in the generalized linear model (GLM) framework: the Thresholded Graph Laplacian Gaussian (TGLG) prior, which adopts the graph Laplacian matrix to characterize the conditional dependence between neighboring markers accounting for the global network structure. Under mild conditions, we show the proposed model enjoys the posterior consistency with a diverging number of edges and nodes in the network. We also develop a Metropolis-adjusted Langevin algorithm (MALA) for efficient posterior computation, which is scalable to large-scale networks. We illustrate the superiorities of the proposed method compared with existing alternatives via extensive simulation studies and an analysis of the breast cancer gene expression dataset in the Cancer Genome Atlas (TCGA).




pr

Latent Nested Nonparametric Priors (with Discussion)

Federico Camerlenghi, David B. Dunson, Antonio Lijoi, Igor Prünster, Abel Rodríguez.

Source: Bayesian Analysis, Volume 14, Number 4, 1303--1356.

Abstract:
Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalizing to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.




pr

Calibration Procedures for Approximate Bayesian Credible Sets

Jeong Eun Lee, Geoff K. Nicholls, Robin J. Ryder.

Source: Bayesian Analysis, Volume 14, Number 4, 1245--1269.

Abstract:
We develop and apply two calibration procedures for checking the coverage of approximate Bayesian credible sets, including intervals estimated using Monte Carlo methods. The user has an ideal prior and likelihood, but generates a credible set for an approximate posterior based on some approximate prior and likelihood. We estimate the realised posterior coverage achieved by the approximate credible set. This is the coverage of the unknown “true” parameter if the data are a realisation of the user’s ideal observation model conditioned on the parameter, and the parameter is a draw from the user’s ideal prior. In one approach we estimate the posterior coverage at the data by making a semi-parametric logistic regression of binary coverage outcomes on simulated data against summary statistics evaluated on simulated data. In another we use Importance Sampling from the approximate posterior, windowing simulated data to fall close to the observed data. We illustrate our methods on four examples.