nc Development of tolerance and cross-tolerance to psychomotor effects of benzodiazepines in man / by Kari Aranko. By search.wellcomelibrary.org Published On :: Helsinki : Department of Pharmacology and Toxicology, University of Helsinki, 1985. Full Article
nc The university chemical dependency project : final report : November 1 1986 / Steven A. Bloch, Steven Ungerleider. By search.wellcomelibrary.org Published On :: [Indiana] : Integrated Research Services, Inc., 1986. Full Article
nc Newsletter of the Parapsychology Foundation, Inc. By search.wellcomelibrary.org Published On :: [New York, N.Y.] : [The Foundation] [195-?]-1970. Full Article
nc Archive of the Association Culturelle Franco-Australienne By feedproxy.google.com Published On :: 29/09/2015 12:00:00 AM Full Article
nc Illuminated address presented to Andrew Lynch, 1925 By feedproxy.google.com Published On :: 30/09/2015 12:00:00 AM Full Article
nc Pam Liell papers relating to ‘Scrolls’ Book Club, 1994-2008 including correspondence with Alex Buzo, 1994-1998 By feedproxy.google.com Published On :: 1/10/2015 12:00:00 AM Full Article
nc Correspondence relating to Lewis Harold Bell Lasseter, 1931 By feedproxy.google.com Published On :: 9/10/2015 12:00:00 AM Full Article
nc Selected Poems of Henry Lawson: Correspondence: Vol.1 By feedproxy.google.com Published On :: 29/10/2015 12:00:00 AM Full Article
nc Sydney in 1848 : illustrated by copper-plate engravings of its principal streets, public buildings, churches, chapels, etc. / from drawings by Joseph Fowles. By feedproxy.google.com Published On :: 28/04/2016 12:00:00 AM Full Article
nc Top three Ruthy Hebard moments: NCAA record for consecutive FGs etched her place in history By sports.yahoo.com Published On :: Fri, 03 Apr 2020 23:08:48 GMT Over four years in Eugene, Ruthy Hebard has made a name for herself with reliability and dynamic play. She's had many memorable moments in a Duck uniform. But her career day against Washington State (34 points), her moment reaching 2,000 career points and her NCAA record for consecutive made FGs (2018) top the list. Against the Trojans, she set the record (30) and later extended it to 33. Full Article video Sports
nc Kobe, Duncan, Garnett headline Basketball Hall of Fame class By sports.yahoo.com Published On :: Sat, 04 Apr 2020 16:12:32 GMT Kobe Bryant was already immortal. Bryant and fellow NBA greats Tim Duncan and Kevin Garnett headlined a nine-person group announced Saturday as this year’s class of enshrinees into the Naismith Memorial Basketball Hall of Fame. Two-time NBA champion coach Rudy Tomjanovich finally got his call, as did longtime Baylor women’s coach Kim Mulkey, 1,000-game winner Barbara Stevens of Bentley and three-time Final Four coach Eddie Sutton. Full Article article Sports
nc WNBA Draft Profile: Productive forward Ruthy Hebard has uncanny handling, scoring, rebounding ability By sports.yahoo.com Published On :: Thu, 09 Apr 2020 21:52:59 GMT Ruthy Hebard, who ranks 2nd in Oregon history in points (2,368) and 3rd in rebounds (1,299), prepares to play in the WNBA following four years in Eugene. Hebard is the Oregon and Pac-12 all-time leader in career field-goal percentage (65.1) and averaged 17.3 points per game and a career-high 9.6 rebounds per game as a senior. Full Article video Sports
nc NCAA women's hoops committee moves away from RPI to NET By sports.yahoo.com Published On :: Mon, 04 May 2020 20:31:26 GMT The women's basketball committee will start using the NCAA Evaluation Tool instead of RPI to help evaluate teams for the tournament starting with the upcoming season. “It’s an exciting time for the game as we look to the future,” said Nina King, senior deputy athletics director and chief of staff at Duke, who will chair the Division I Women’s Basketball Committee next season. “We felt after much analysis that the women’s basketball NET, which will be determined by who you played, where you played, how efficiently you played and the result of the game, is a more accurate tool and should be used by the committee going forward.” Full Article article Sports
nc Stanford's Tara VanDerveer on Haley Jones' versatile freshman year: 'It was really incredible' By sports.yahoo.com Published On :: Fri, 08 May 2020 16:17:17 GMT During Friday's "Pac-12 Perspective," Stanford head coach Tara VanDerveer spoke about Haley Jones' positionless game and how the Cardinal used the dynamic freshman in 2019-20. Download and listen wherever you get your podcasts. Full Article video Sports
nc Pac-12 women's basketball student-athletes reflect on the influence of their moms ahead of Mother's Day By sports.yahoo.com Published On :: Fri, 08 May 2020 21:24:08 GMT Pac-12 student-athletes give shout-outs to their moms ahead of Mother's Day on May 10th, 2020 including UCLA's Michaela Onyenwere, Oregon's Sabrina Ionescu and Satou Sabally, Arizona's Aari McDonald, Cate Reese, and Lacie Hull, Stanford's Kiana Williams, USC's Endyia Rogers, and Aliyah Jeune, and Utah's Brynna Maxwell. Full Article video Sports
nc NCAA lays out 9-step plan to resume sports By sports.yahoo.com Published On :: Fri, 01 May 2020 19:28:30 GMT The process is based on the U.S. three-phase federal guidelines for easing social distancing and re-opening non-essential businesses. Full Article article Sports
nc Nonparametric confidence intervals for conditional quantiles with large-dimensional covariates By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Laurent Gardes. Source: Electronic Journal of Statistics, Volume 14, Number 1, 661--701.Abstract: The first part of the paper is dedicated to the construction of a $\gamma$-nonparametric confidence interval for a conditional quantile with a level depending on the sample size. When this level tends to 0 or 1 as the sample size increases, the conditional quantile is said to be extreme and is located in the tail of the conditional distribution. The proposed confidence interval is constructed by approximating the distribution of the order statistics selected with a nearest neighbor approach by a Beta distribution. We show that its coverage probability converges to the preselected probability $\gamma$ and its accuracy is illustrated on a simulation study. When the dimension of the covariate increases, the coverage probability of the confidence interval can be very different from $\gamma$. This is a well known consequence of the data sparsity especially in the tail of the distribution. In a second part, a dimension reduction procedure is proposed in order to select more appropriate nearest neighbors in the right tail of the distribution and in turn to obtain a better coverage probability for extreme conditional quantiles. This procedure is based on the Tail Conditional Independence assumption introduced in (Gardes, Extremes, pp. 57–95, 18(3), 2018). Full Article
nc Statistical convergence of the EM algorithm on Gaussian mixture models By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Ruofei Zhao, Yuanzhi Li, Yuekai Sun. Source: Electronic Journal of Statistics, Volume 14, Number 1, 632--660.Abstract: We study the convergence behavior of the Expectation Maximization (EM) algorithm on Gaussian mixture models with an arbitrary number of mixture components and mixing weights. We show that as long as the means of the components are separated by at least $\Omega(\sqrt{\min\{M,d\}})$, where $M$ is the number of components and $d$ is the dimension, the EM algorithm converges locally to the global optimum of the log-likelihood. Further, we show that the convergence rate is linear and characterize the size of the basin of attraction to the global optimum. Full Article
nc On the Letac-Massam conjecture and existence of high dimensional Bayes estimators for graphical models By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Emanuel Ben-David, Bala Rajaratnam. Source: Electronic Journal of Statistics, Volume 14, Number 1, 580--604.Abstract: The Wishart distribution defined on the open cone of positive-definite matrices plays a central role in multivariate analysis and multivariate distribution theory. Its domain of parameters is often referred to as the Gindikin set. In recent years, varieties of useful extensions of the Wishart distribution have been proposed in the literature for the purposes of studying Markov random fields and graphical models. In particular, generalizations of the Wishart distribution, referred to as Type I and Type II (graphical) Wishart distributions introduced by Letac and Massam in Annals of Statistics (2007) play important roles in both frequentist and Bayesian inference for Gaussian graphical models. These distributions have been especially useful in high-dimensional settings due to the flexibility offered by their multiple-shape parameters. Concerning Type I and Type II Wishart distributions, a conjecture of Letac and Massam concerns the domain of multiple-shape parameters of these distributions. The conjecture also has implications for the existence of Bayes estimators corresponding to these high dimensional priors. The conjecture, which was first posed in the Annals of Statistics, has now been an open problem for about 10 years. In this paper, we give a necessary condition for the Letac and Massam conjecture to hold. More precisely, we prove that if the Letac and Massam conjecture holds on a decomposable graph, then no two separators of the graph can be nested within each other. For this, we analyze Type I and Type II Wishart distributions on appropriate Markov equivalent perfect DAG models and succeed in deriving the aforementioned necessary condition. 
This condition in particular identifies a class of counterexamples to the conjecture. Full Article
nc Recovery of simultaneous low rank and two-way sparse coefficient matrices, a nonconvex approach By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Ming Yu, Varun Gupta, Mladen Kolar. Source: Electronic Journal of Statistics, Volume 14, Number 1, 413--457.Abstract: We study the problem of recovery of matrices that are simultaneously low rank and row and/or column sparse. Such matrices appear in recent applications in cognitive neuroscience, imaging, computer vision, macroeconomics, and genetics. We propose a GDT (Gradient Descent with hard Thresholding) algorithm to efficiently recover matrices with such structure, by minimizing a bi-convex function over a nonconvex set of constraints. We show linear convergence of the iterates obtained by GDT to a region within statistical error of an optimal solution. As an application of our method, we consider multi-task learning problems and show that the statistical error rate obtained by GDT is near optimal compared to minimax rate. Experiments demonstrate competitive performance and much faster running speed compared to existing methods, on both simulations and real data sets. Full Article
nc Parseval inequalities and lower bounds for variance-based sensitivity indices By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Olivier Roustant, Fabrice Gamboa, Bertrand Iooss. Source: Electronic Journal of Statistics, Volume 14, Number 1, 386--412.Abstract: The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol’ sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert basis. In this frame, we revisit the computation of the Sobol’ indices with Parseval equalities and give general lower bounds for these indices obtained by truncation. The case of the eigenfunctions system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put in action both on toy and real life models demonstrating their accuracy. Full Article
nc Sparse equisigned PCA: Algorithms and performance bounds in the noisy rank-1 setting By projecteuclid.org Published On :: Mon, 27 Apr 2020 22:02 EDT Arvind Prasadan, Raj Rao Nadakuditi, Debashis Paul. Source: Electronic Journal of Statistics, Volume 14, Number 1, 345--385.Abstract: Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a sum-type decision statistic outperforms schemes that utilize the $\ell_{1}$ and $\ell_{2}$ norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance on video data where the interest is in identifying objects. Full Article
nc Bayesian variance estimation in the Gaussian sequence model with partial information on the means By projecteuclid.org Published On :: Mon, 27 Apr 2020 22:02 EDT Gianluca Finocchio, Johannes Schmidt-Hieber. Source: Electronic Journal of Statistics, Volume 14, Number 1, 239--271.Abstract: Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $\sigma^{2}$ is biased and inconsistent. This raises the question whether the posterior is able to correct the MLE in this case. By developing a new proving strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators outperforming the correctly calibrated MLE in a numerical simulation study. Full Article
nc On the predictive potential of kernel principal components By projecteuclid.org Published On :: Wed, 15 Apr 2020 04:02 EDT Ben Jones, Andreas Artemiou, Bing Li. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1--23.Abstract: We give a probabilistic analysis of a phenomenon in statistics which, until recently, has not received a convincing explanation. This phenomenon is that the leading principal components tend to possess more predictive power for a response variable than lower-ranking ones despite the procedure being unsupervised. Our result, in its most general form, shows that the phenomenon goes far beyond the context of linear regression and classical principal components: if an arbitrary distribution for the predictor $X$ and an arbitrary conditional distribution for $Y\vert X$ are chosen, then any measurable function $g(Y)$, subject to a mild condition, tends to be more correlated with the higher-ranking kernel principal components than with the lower-ranking ones. The “arbitrariness” is formulated in terms of unitary invariance; the tendency is then explicitly quantified by exploring how unitary invariance relates to the Cauchy distribution. The most general results, for technical reasons, are shown for the case where the kernel space is finite dimensional. The occurrence of this tendency in real world databases is also investigated to show that our results are consistent with observation. Full Article
nc Posterior contraction and credible sets for filaments of regression functions By projecteuclid.org Published On :: Tue, 14 Apr 2020 22:01 EDT Wei Li, Subhashis Ghosal. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1707--1743.Abstract: A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered as an important lower dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to the filament estimation in regression context and study the posterior contraction rates using a finite random series of B-splines basis. Compared with the kernel-estimation method, this has a theoretical advantage as the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f:\mathbb{R}^{2}\mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $\alpha \geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-\alpha)/(2(1+\alpha))}$. Secondly, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. We demonstrate the success of our proposed method in simulations and one application to earthquake data. Full Article
nc Nonconcave penalized estimation in sparse vector autoregression model By projecteuclid.org Published On :: Wed, 01 Apr 2020 04:00 EDT Xuening Zhu. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1413--1448.Abstract: High dimensional time series, whose temporal and cross-sectional dependency can be captured by the vector autoregression (VAR) model, have received considerable attention recently. To tackle the high dimensionality, penalization methods are widely employed. However, theoretically, the existing studies of the penalization methods mainly focus on i.i.d. data, and therefore cannot quantify the effect of the dependence level on the convergence rate. In this work, we use the spectral properties of the time series to quantify the dependence and derive a nonasymptotic upper bound for the estimation errors. By focusing on the nonconcave penalization methods, we manage to establish the oracle properties of the penalized VAR model estimation by considering the effects of temporal and cross-sectional dependence. Extensive numerical studies are conducted to compare the finite sample performance using different penalization functions. Lastly, an air pollution data set from mainland China is analyzed for illustration. Full Article
nc Differential network inference via the fused D-trace loss with cross variables By projecteuclid.org Published On :: Tue, 24 Mar 2020 22:01 EDT Yichong Wu, Tiejun Li, Xiaoping Liu, Luonan Chen. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1269--1301.Abstract: Detecting the change of biological interaction networks is of great importance in biological and medical research. We proposed a simple loss function, named as CrossFDTL, to identify the network change or differential network by estimating the difference between two precision matrices under Gaussian assumption. The CrossFDTL is a natural fusion of the D-trace loss for the considered two networks by imposing the $\ell_{1}$ penalty to the differential matrix to ensure sparsity. The key point of our method is to utilize the cross variables, which correspond to the sum and difference of two precision matrices instead of using their original forms. Moreover, we developed an efficient minimization algorithm for the proposed loss function and further rigorously proved its convergence. Numerical results showed that our method outperforms the existing methods in both accuracy and convergence speed for the simulated and real data. Full Article
nc Consistency and asymptotic normality of Latent Block Model estimators By projecteuclid.org Published On :: Mon, 23 Mar 2020 22:02 EDT Vincent Brault, Christine Keribin, Mahendra Mariadassou. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1234--1268.Abstract: The Latent Block Model (LBM) is a model-based method to cluster simultaneously the $d$ columns and $n$ rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior are rather sparse and most results are limited to the binary setting. We prove here theoretical guarantees in the valued settings. We show that under some mild conditions on the parameter space, and in an asymptotic regime where $\log(d)/n$ and $\log(n)/d$ tend to $0$ when $n$ and $d$ tend to infinity, (1) the maximum-likelihood estimate of the complete model (with known labels) is consistent and (2) the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models. This equivalence allows us to transfer the asymptotic consistency, and under mild conditions, asymptotic normality, to the maximum likelihood estimate under the observed model. Moreover, the variational estimator is also consistent and, under the same conditions, asymptotically normal. Full Article
nc Sparsely observed functional time series: estimation and prediction By projecteuclid.org Published On :: Thu, 27 Feb 2020 22:04 EST Tomáš Rubín, Victor M. Panaretos. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1137--1210.Abstract: Functional time series analysis, whether based on time or frequency domain methodology, has traditionally been carried out under the assumption of complete observation of the constituent series of curves, assumed stationary. Nevertheless, as is often the case with independent functional data, it may well happen that the data available to the analyst are not the actual sequence of curves, but relatively few and noisy measurements per curve, potentially at different locations in each curve’s domain. Under this sparse sampling regime, neither the established estimators of the time series’ dynamics nor their corresponding theoretical analysis will apply. The subject of this paper is to tackle the problem of estimating the dynamics and of recovering the latent process of smooth curves in the sparse regime. Assuming smoothness of the latent curves, we construct a consistent nonparametric estimator of the series’ spectral density operator and use it to develop a frequency-domain recovery approach, that predicts the latent curve at a given time by borrowing strength from the (estimated) dynamic correlations in the series across time. This new methodology is seen to comprehensively outperform a naive recovery approach that would ignore temporal dependence and use only methodology employed in the i.i.d. setting and hinging on the lag zero covariance. Further to predicting the latent curves from their noisy point samples, the method fills in gaps in the sequence (curves nowhere sampled), denoises the data, and serves as a basis for forecasting. Means of providing corresponding confidence bands are also investigated. 
A simulation study interestingly suggests that sparse observation for a longer time period may provide better performance than dense observation for a shorter period, in the presence of smoothness. The methodology is further illustrated by application to an environmental data set on fair-weather atmospheric electricity, which naturally leads to a sparse functional time series. Full Article
nc Reduction problems and deformation approaches to nonstationary covariance functions over spheres By projecteuclid.org Published On :: Tue, 11 Feb 2020 22:03 EST Emilio Porcu, Rachid Senoussi, Enner Mendoza, Moreno Bevilacqua. Source: Electronic Journal of Statistics, Volume 14, Number 1, 890--916.Abstract: The paper considers reduction problems and deformation approaches for nonstationary covariance functions on the $(d-1)$-dimensional spheres, $\mathbb{S}^{d-1}$, embedded in the $d$-dimensional Euclidean space. Given a covariance function $C$ on $\mathbb{S}^{d-1}$, we chase a pair $(R,\Psi)$, for a function $R:[-1,+1]\to \mathbb{R}$ and a smooth bijection $\Psi$, such that $C$ can be reduced to a geodesically isotropic one: $C(\mathbf{x},\mathbf{y})=R(\langle \Psi(\mathbf{x}),\Psi(\mathbf{y})\rangle)$, with $\langle \cdot,\cdot \rangle$ denoting the dot product. The problem finds motivation in recent statistical literature devoted to the analysis of global phenomena, defined typically over the sphere of $\mathbb{R}^{3}$. The application domains considered in the manuscript make the problem mathematically challenging. We show the uniqueness of the representation in the reduction problem. Then, under some regularity assumptions, we provide an inversion formula to recover the bijection $\Psi$, when it exists, for a given $C$. We also give sufficient conditions for reducibility. Full Article
nc On a Metropolis–Hastings importance sampling estimator By projecteuclid.org Published On :: Mon, 10 Feb 2020 04:01 EST Daniel Rudolf, Björn Sprungk. Source: Electronic Journal of Statistics, Volume 14, Number 1, 857--889.Abstract: A classical approach for approximating expectations of functions w.r.t. partially known distributions is to compute the average of function values along a trajectory of a Metropolis–Hastings (MH) Markov chain. A key part in the MH algorithm is a suitable acceptance/rejection of a proposed state, which ensures the correct stationary distribution of the resulting Markov chain. However, the rejection of proposals causes highly correlated samples. In particular, when a state is rejected it is not taken any further into account. In contrast to that we consider a MH importance sampling estimator which explicitly incorporates all proposed states generated by the MH algorithm. The estimator satisfies a strong law of large numbers as well as a central limit theorem, and, in addition to that, we provide an explicit mean squared error bound. Remarkably, the asymptotic variance of the MH importance sampling estimator does not involve any correlation term in contrast to its classical counterpart. Moreover, although the analyzed estimator uses the same amount of information as the classical MH estimator, it can outperform the latter in scenarios of moderate dimensions as indicated by numerical experiments. Full Article
nc Detection of sparse positive dependence By projecteuclid.org Published On :: Wed, 29 Jan 2020 22:01 EST Ery Arias-Castro, Rong Huang, Nicolas Verzelen. Source: Electronic Journal of Statistics, Volume 14, Number 1, 702--730.Abstract: In a bivariate setting, we consider the problem of detecting a sparse contamination or mixture component, where the effect manifests itself as a positive dependence between the variables, which are otherwise independent in the main component. We first look at this problem in the context of a normal mixture model. In essence, the situation reduces to a univariate setting where the effect is a decrease in variance. In particular, a higher criticism test based on the pairwise differences is shown to achieve the detection boundary defined by the (oracle) likelihood ratio test. We then turn to a Gaussian copula model where the marginal distributions are unknown. Standard invariance considerations lead us to consider rank tests. In fact, a higher criticism test based on the pairwise rank differences achieves the detection boundary in the normal mixture model, although not in the very sparse regime. We do not know of any rank test that has any power in that regime. Full Article
nc On Mahalanobis Distance in Functional Settings By Published On :: 2020 Mahalanobis distance is a classical tool in multivariate analysis. We suggest here an extension of this concept to the case of functional data. More precisely, the proposed definition concerns those statistical problems where the sample data are real functions defined on a compact interval of the real line. The obvious difficulty for such a functional extension is the non-invertibility of the covariance operator in infinite-dimensional cases. Unlike other recent proposals, our definition is suggested and motivated in terms of the Reproducing Kernel Hilbert Space (RKHS) associated with the stochastic process that generates the data. The proposed distance is a true metric; it depends on a unique real smoothing parameter which is fully motivated in RKHS terms. Moreover, it shares some properties of its finite dimensional counterpart: it is invariant under isometries, it can be consistently estimated from the data and its sampling distribution is known under Gaussian models. An empirical study for two statistical applications, outliers detection and binary classification, is included. The results are quite competitive when compared to other recent proposals in the literature. Full Article
nc Generalized probabilistic principal component analysis of correlated data By Published On :: 2020 Principal component analysis (PCA) is a well-established tool in machine learning and data processing. The principal axes in PCA were shown to be equivalent to the maximum marginal likelihood estimator of the factor loading matrix in a latent factor model for the observed data, assuming that the latent factors are independently distributed as standard normal distributions. However, the independence assumption may be unrealistic for many scenarios such as modeling multiple time series, spatial processes, and functional data, where the outcomes are correlated. In this paper, we introduce the generalized probabilistic principal component analysis (GPPCA) to study the latent factor model for multiple correlated outcomes, where each factor is modeled by a Gaussian process. Our method generalizes the previous probabilistic formulation of PCA (PPCA) by providing the closed-form maximum marginal likelihood estimator of the factor loadings and other parameters. Based on the explicit expression of the precision matrix in the marginal likelihood that we derived, the number of the computational operations is linear to the number of output variables. Furthermore, we also provide the closed-form expression of the marginal likelihood when other covariates are included in the mean structure. We highlight the advantage of GPPCA in terms of the practical relevance, estimation accuracy and computational convenience. Numerical studies of simulated and real data confirm the excellent finite-sample performance of the proposed approach. Full Article
nc Expectation Propagation as a Way of Life: A Framework for Bayesian Inference on Partitioned Data By Published On :: 2020 A common divide-and-conquer approach for Bayesian computation with big data is to partition the data, perform local inference for each piece separately, and combine the results to obtain a global posterior approximation. While being conceptually and computationally appealing, this method involves the problematic need to also split the prior for the local inferences; these weakened priors may not provide enough regularization for each separate computation, thus eliminating one of the key advantages of Bayesian methods. To resolve this dilemma while still retaining the generalizability of the underlying local inference method, we apply the idea of expectation propagation (EP) as a framework for distributed Bayesian inference. The central idea is to iteratively update approximations to the local likelihoods given the state of the other approximations and the prior. The present paper has two roles: we review the steps that are needed to keep EP algorithms numerically stable, and we suggest a general approach, inspired by EP, for approaching data partitioning problems in a way that achieves the computational benefits of parallelism while allowing each local update to make use of relevant information from the other sites. In addition, we demonstrate how the method can be applied in a hierarchical context to make use of partitioning of both data and parameters. The paper describes a general algorithmic framework, rather than a specific algorithm, and presents an example implementation for it. Full Article
nc High-Dimensional Interactions Detection with Sparse Principal Hessian Matrix By Published On :: 2020 In the statistical learning framework for regression, interactions are the contributions to the response variable from the products of the explanatory variables. In high-dimensional problems, detecting interactions is challenging due to combinatorial complexity and limited data information. We consider detecting interactions by exploring their connections with the principal Hessian matrix. Specifically, we propose a one-step synthetic approach for estimating the principal Hessian matrix by a penalized M-estimator. An alternating direction method of multipliers (ADMM) is proposed to efficiently solve the encountered regularized optimization problem. Based on the sparse estimator, we detect the interactions by identifying its nonzero components. Our method directly targets the interactions, and it requires no structural assumption on the hierarchy of the interaction effects. We show that our estimator is theoretically valid, computationally efficient, and practically useful for detecting the interactions in a broad spectrum of scenarios. Full Article
nc Convergences of Regularized Algorithms and Stochastic Gradient Methods with Random Projections By Published On :: 2020 We study the least-squares regression problem over a Hilbert space, covering nonparametric regression over a reproducing kernel Hilbert space as a special case. We first investigate regularized algorithms adapted to a projection operator on a closed subspace of the Hilbert space. We prove convergence results with respect to variants of norms, under a capacity assumption on the hypothesis space and a regularity condition on the target function. As a result, we obtain optimal rates for regularized algorithms with randomized sketches, provided that the sketch dimension is proportional to the effective dimension up to a logarithmic factor. As a byproduct, we obtain similar results for Nyström regularized algorithms. Our results provide optimal, distribution-dependent rates that do not have any saturation effect for sketched/Nyström regularized algorithms, considering both the attainable and non-attainable cases, in the well-conditioned regimes. We then study stochastic gradient methods with projection over the subspace, allowing multi-pass over the data and minibatches, and we derive similar optimal statistical convergence results. Full Article
nc GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing By Published On :: 2020 We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs, to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customization. Leveraging the MXNet ecosystem, the deep learning models in GluonCV and GluonNLP can be deployed onto a variety of platforms with different programming languages. The Apache 2.0 license has been adopted by GluonCV and GluonNLP to allow for software distribution, modification, and usage. Full Article
nc Targeted Fused Ridge Estimation of Inverse Covariance Matrices from Multiple High-Dimensional Data Classes By Published On :: 2020 We consider the problem of jointly estimating multiple inverse covariance matrices from high-dimensional data consisting of distinct classes. An $\ell_2$-penalized maximum likelihood approach is employed. The suggested approach is flexible and generic, incorporating several other $\ell_2$-penalized estimators as special cases. In addition, the approach allows specification of target matrices through which prior knowledge may be incorporated and which can stabilize the estimation procedure in high-dimensional settings. The result is a targeted fused ridge estimator that is of use when the precision matrices of the constituent classes are believed to chiefly share the same structure while potentially differing in a number of locations of interest. It has many applications in (multi)factorial study designs. We focus on the graphical interpretation of precision matrices with the proposed estimator then serving as a basis for integrative or meta-analytic Gaussian graphical modeling. Situations are considered in which the classes are defined by data sets and subtypes of diseases. The performance of the proposed estimator in the graphical modeling setting is assessed through extensive simulation experiments. Its practical usability is illustrated by the differential network modeling of 12 large-scale gene expression data sets of diffuse large B-cell lymphoma subtypes. The estimator and its related procedures are incorporated into the R-package rags2ridges. Full Article
nc On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms By Published On :: 2020 This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory. Full Article
nc Generalized Nonbacktracking Bounds on the Influence By Published On :: 2020 This paper develops deterministic upper and lower bounds on the influence measure in a network, more precisely, the expected number of nodes that a seed set can influence in the independent cascade model. In particular, our bounds exploit r-nonbacktracking walks and Fortuin-Kasteleyn-Ginibre (FKG) type inequalities, and are computed by message passing algorithms. Further, we provide parameterized versions of the bounds that control the trade-off between efficiency and accuracy. Finally, the tightness of the bounds is illustrated on various network models. Full Article
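The influence measure that these bounds target can be made concrete with a plain Monte Carlo estimate. The sketch below is not the message passing method of the paper; it only simulates the independent cascade model directly. The graph representation (a dict of `(neighbor, probability)` lists) and the function names are assumptions made for illustration.

```python
import random

def simulate_cascade(graph, seeds, rng):
    """Run one independent cascade and return the number of activated nodes.

    graph maps each node to a list of (neighbor, activation_probability) pairs.
    Each newly activated node gets exactly one chance to activate each neighbor.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def estimate_influence(graph, seeds, n_runs, seed=0):
    """Average cascade size over n_runs simulations (Monte Carlo estimate)."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(graph, seeds, rng) for _ in range(n_runs))
    return total / n_runs

# Directed path 0 -> 1 -> 2 with edge probability 0.5.
# The exact influence of seed {0} is 1 + 0.5 + 0.25 = 1.75.
path = {0: [(1, 0.5)], 1: [(2, 0.5)]}
estimate = estimate_influence(path, seeds=[0], n_runs=20000)
print(estimate)
```

Simulation of this kind needs many runs for a stable estimate, which is one reason deterministic upper and lower bounds are attractive.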
nc Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping By Published On :: 2020 Consider an unknown smooth function $f: [0,1]^d \rightarrow \mathbb{R}$, and assume we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + \eta_i) \bmod 1$, for $x_i \in [0,1]^d$, where $\eta_i$ denotes the noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) \bmod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness regularized least-squares problem -- a quadratically constrained quadratic program (QCQP) -- where the variables are constrained to lie on the unit circle. Our proposed approach is based on solving its relaxation, which is a trust-region sub-problem and hence solvable efficiently. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, as well as random Gaussian and Bernoulli noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our proposed approach via extensive numerical simulations on synthetic data, along with a simple least-squares based solution for the unwrapping stage, that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise, when taking as input the denoised modulo $1$ samples. Finally, we also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in synthetic aperture radar interferometry (InSAR), we are able to solve instances of real-world data with a million sample points in under 10 seconds, on a personal laptop. Full Article
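The angular embedding mentioned in the abstract can be illustrated in a few lines. The sketch below only shows the embedding of mod 1 values onto the unit circle and its inverse; it does not implement the QCQP, its trust-region relaxation, or the unwrapping stage. The function names are assumptions made for illustration.

```python
import numpy as np

def angular_embedding(y):
    """Map mod 1 values y in [0, 1) to points exp(2*pi*i*y) on the unit circle."""
    return np.exp(2j * np.pi * np.asarray(y, dtype=float))

def from_angular(z):
    """Map unit-circle points back to mod 1 values in [0, 1)."""
    return np.mod(np.angle(z) / (2 * np.pi), 1.0)

# Two samples on either side of a wrap-around: far apart as real numbers,
# but close together once embedded on the circle.
a, b = 0.99, 0.01
wrap_gap = abs(a - b)
circle_gap = abs(angular_embedding(a) - angular_embedding(b))
print(wrap_gap, circle_gap)
```

Working on the circle removes the artificial jump at the wrap-around point, so a smoothness penalty applied to the embedded values treats neighboring samples such as 0.99 and 0.01 as nearly equal.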
nc Learning with Fenchel-Young losses By Published On :: 2020 Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and more generally structured prediction. Understanding the core principles and theoretical properties underpinning these losses is key to choosing the right loss for the right problem, as well as to creating new losses which combine their strengths. In this paper, we introduce Fenchel-Young losses, a generic way to construct a convex loss function for a regularized prediction function. We provide an in-depth study of their properties in a very broad setting, covering all the aforementioned supervised learning tasks, and revealing new connections between sparsity, generalized entropies, and separation margins. We show that Fenchel-Young losses unify many well-known loss functions and make it easy to create useful new ones. Finally, we derive efficient predictive and training algorithms, making Fenchel-Young losses appealing both in theory and practice. Full Article
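One well-known instance of the unification described in the abstract can be checked numerically. A Fenchel-Young loss has the form $L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle \theta, y \rangle$. Taking $\Omega$ to be the negative Shannon entropy on the probability simplex gives $\Omega^*(\theta) = \log \sum_j e^{\theta_j}$, and the loss reduces to the usual cross-entropy for a one-hot target. The helper names below are assumptions made for illustration.

```python
import numpy as np

def logsumexp(theta):
    """Numerically stable log-sum-exp: the conjugate of negative entropy on the simplex."""
    m = np.max(theta)
    return float(m + np.log(np.sum(np.exp(theta - m))))

def neg_entropy(p):
    """Omega(p) = sum_j p_j log p_j, with the convention 0 log 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz])))

def fenchel_young_loss(theta, y):
    """L(theta; y) = Omega*(theta) + Omega(y) - <theta, y> for Shannon negentropy."""
    return logsumexp(theta) + neg_entropy(y) - float(np.dot(theta, y))

def softmax(theta):
    e = np.exp(theta - np.max(theta))
    return e / e.sum()

theta = np.array([2.0, -1.0, 0.5])       # scores
y = np.array([0.0, 1.0, 0.0])            # one-hot target for class 1
fy = fenchel_young_loss(theta, y)
ce = float(-np.log(softmax(theta)[1]))   # standard cross-entropy
print(fy, ce)
```

The loss is non-negative and vanishes when the target equals the regularized prediction, here `softmax(theta)`.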
nc Causal Discovery Toolbox: Uncovering causal relationships in Python By Published On :: 2020 This paper presents a new open source Python framework for causal discovery from observational data and domain background knowledge, aimed at causal graph and causal mechanism modeling. The cdt package implements an end-to-end approach, recovering the direct dependencies (the skeleton of the causal graph) and the causal relationships between variables. It includes algorithms from the `Bnlearn' and `Pcalg' packages, together with algorithms for pairwise causal discovery such as ANM. Full Article
nc Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification By Published On :: 2020 High dimensional data often contain multiple facets, and several clustering patterns can co-exist under different variable subspaces, also known as the views. While multi-view clustering algorithms have been proposed, uncertainty quantification remains difficult; a particular challenge lies in the high complexity of estimating the cluster assignment probability under each view, and sharing information among views. In this article, we propose an approximate Bayes approach that treats the similarity matrices generated over the views as rough first-stage estimates for the co-assignment probabilities; in its Kullback-Leibler neighborhood, we obtain a refined low-rank matrix, formed by the pairwise product of simplex coordinates. Interestingly, each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we let each view draw a parameterization from a few candidates, leading to dimension reduction. With high model flexibility, the estimation can be efficiently carried out as a continuous optimization problem, and hence benefits from gradient-based computation. The theory establishes the connection of this model to a random partition distribution under multiple views. Compared to single-view clustering approaches, substantially more interpretable results are obtained when clustering brains from a human traumatic brain injury study, using high-dimensional gene expression data. Full Article
nc Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables By Published On :: 2020 We consider the problem of learning causal models from observational data generated by linear non-Gaussian acyclic causal models with latent variables. Without considering the effect of latent variables, the inferred causal relationships among the observed variables are often wrong. Under the faithfulness assumption, we propose a method to check whether there exists a causal path between any two observed variables. From this information, we can obtain the causal order among the observed variables. The next question is whether the causal effects can be uniquely identified as well. We show that causal effects among observed variables cannot be identified uniquely under the assumptions of faithfulness and non-Gaussianity of exogenous noises alone. However, we are able to propose an efficient method that identifies the set of all possible causal effects that are compatible with the observational data. We present additional structural conditions on the causal graph under which causal effects among observed variables can be determined uniquely. Furthermore, we provide necessary and sufficient graphical conditions for unique identification of the number of variables in the system. Experiments on synthetic data and real-world data show the effectiveness of our proposed algorithm for learning causal models. Full Article
nc Switching Regression Models and Causal Inference in the Presence of Discrete Latent Variables By Published On :: 2020 Given a response $Y$ and a vector $X = (X^1, \dots, X^d)$ of $d$ predictors, we investigate the problem of inferring direct causes of $Y$ among the vector $X$. Models for $Y$ that use all of its causal covariates as predictors enjoy the property of being invariant across different environments or interventional settings. Given data from such environments, this property has been exploited for causal discovery. Here, we extend this inference principle to situations in which some (discrete-valued) direct causes of $Y$ are unobserved. Such cases naturally give rise to switching regression models. We provide sufficient conditions for the existence, consistency and asymptotic normality of the MLE in linear switching regression models with Gaussian noise, and construct a test for the equality of such models. These results allow us to prove that the proposed causal discovery method obtains asymptotic false discovery control under mild conditions. We provide an algorithm, make available code, and test our method on simulated data. It is robust against model violations and outperforms state-of-the-art approaches. We further apply our method to a real data set, where we show that it outputs not only causal predictors but also a process-based clustering of data points, which could be of additional interest to practitioners. Full Article
nc Branch and Bound for Piecewise Linear Neural Network Verification By Published On :: 2020 The success of Deep Learning and its potential use in many safety-critical applications has motivated research on formal verification of Neural Network (NN) models. In this context, verification involves proving or disproving that an NN model satisfies certain input-output properties. Despite the reputation of learned NN models as black boxes, and the theoretical hardness of proving useful properties about them, researchers have been successful in verifying some classes of models by exploiting their piecewise linear structure and taking insights from formal methods such as Satisfiability Modulo Theory. However, these methods are still far from scaling to realistic neural networks. To facilitate progress on this crucial area, we exploit the Mixed Integer Linear Programming (MIP) formulation of verification to propose a family of algorithms based on Branch-and-Bound (BaB). We show that our family contains previous verification methods as special cases. With the help of the BaB framework, we make three key contributions. Firstly, we identify new methods that combine the strengths of multiple existing approaches, accomplishing significant performance improvements over previous state of the art. Secondly, we introduce an effective branching strategy on ReLU non-linearities. This branching strategy allows us to efficiently and successfully deal with high input dimensional problems with convolutional network architecture, on which previous methods fail frequently. Finally, we propose comprehensive test data sets and benchmarks which include a collection of previously released test cases. We use the data sets to conduct a thorough experimental comparison of existing and new algorithms and to provide an inclusive analysis of the factors impacting the hardness of verification problems. Full Article
nc A Convex Parametrization of a New Class of Universal Kernel Functions By Published On :: 2020 The accuracy and complexity of a kernel learning algorithm are determined by the set of kernels over which it is able to optimize. An ideal set of kernels should: admit a linear parameterization (tractability); be dense in the set of all kernels (accuracy); and every member should be universal so that the hypothesis space is infinite-dimensional (scalability). Currently, there is no class of kernel that meets all three criteria - e.g. Gaussians are not tractable or accurate; polynomials are not scalable. We propose a new class that meets all three criteria - the Tessellated Kernel (TK) class. Specifically, the TK class: admits a linear parameterization using positive matrices; is dense in all kernels; and every element in the class is universal. This implies that the use of TK kernels for learning the kernel can obviate the need for selecting candidate kernels in algorithms such as SimpleMKL and parameters such as the bandwidth. Numerical testing on soft margin Support Vector Machine (SVM) problems shows that algorithms using TK kernels outperform other kernel learning algorithms and neural networks. Furthermore, our results show that when the ratio of the number of training data to features is high, the improvement of TK over MKL increases significantly. Full Article
nc Ancestral Gumbel-Top-k Sampling for Sampling Without Replacement By Published On :: 2020 We develop ancestral Gumbel-Top-$k$ sampling: a generic and efficient method for sampling without replacement from discrete-valued Bayesian networks, which includes multivariate discrete distributions, Markov chains and sequence models. The method uses an extension of the Gumbel-Max trick to sample without replacement by finding the top $k$ of perturbed log-probabilities among all possible configurations of a Bayesian network. Despite the exponentially large domain, the algorithm has a complexity linear in the number of variables and sample size $k$. Our algorithm allows the number of parallel processors $m$ to be set, trading off the number of iterations against the total cost (iterations times $m$) of running the algorithm. For $m = 1$ the algorithm has minimum total cost, whereas for $m = k$ the number of iterations is minimized, and the resulting algorithm is known as Stochastic Beam Search. We provide extensions of the algorithm and discuss a number of related algorithms. We analyze the properties of ancestral Gumbel-Top-$k$ sampling and compare against alternatives on randomly generated Bayesian networks with different levels of connectivity. In the context of (deep) sequence models, we show its use as a method to generate diverse but high-quality translations and statistical estimates of translation quality and entropy. Full Article
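The basic Gumbel-Top-$k$ idea behind the abstract can be sketched for a single flat categorical distribution: add independent Gumbel(0, 1) noise to each log-probability and keep the $k$ largest perturbed values. This is not the ancestral variant of the paper, which avoids enumerating all configurations of a Bayesian network; the function name and the toy distribution are assumptions made for illustration.

```python
import numpy as np

def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement from a categorical distribution.

    Perturbs each log-probability with independent Gumbel(0, 1) noise and
    returns the indices of the k largest perturbed values, in sampled order.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    perturbed = log_probs + rng.gumbel(size=log_probs.shape)
    return np.argsort(-perturbed)[:k]

rng = np.random.default_rng(0)
probs = np.array([0.5, 0.3, 0.15, 0.05])
sample = gumbel_top_k(np.log(probs), k=3, rng=rng)
print(sample)
```

With `k=1` this reduces to the Gumbel-Max trick, so the first returned index is an exact draw from `probs`.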