c AgileAssets v7.5 Improves Flexibility, Field Productivity for Tunnel... By www.prweb.com Published On :: Web and mobile applications enhance efficiency and data accuracy using satellite maps and offline capabilities.(PRWeb April 23, 2020)Read the full story at https://www.prweb.com/releases/agileassets_v7_5_improves_flexibility_field_productivity_for_tunnel_inspections_asset_maintenance/prweb17071093.htm Full Article
c Gun Rights: California Gun Owners & Ammo Dealers Fire Back Against... By www.prweb.com Published On :: Ammunition Depot comments on Judge Roger T. Benitez ruling that Californians may again purchase ammo without a background check and order ammo online.(PRWeb April 24, 2020)Read the full story at https://www.prweb.com/releases/gun_rights_california_gun_owners_ammo_dealers_fire_back_against_proposition_63/prweb17075447.htm Full Article
c Jamboree Begins Construction on Capstone Development to Change... By www.prweb.com Published On :: In a public-private partnership to develop housing, resident services and hope for 102 working families in Haster Orangewood community, Jamboree Housing Corporation and the City of Anaheim announce...(PRWeb April 27, 2020)Read the full story at https://www.prweb.com/releases/jamboree_begins_construction_on_capstone_development_to_change_trajectory_of_neighborhood_in_anaheim_ca/prweb17073166.htm Full Article
c Suntuity AirWorks Offering FREE Assistance in Drone Acquisition... By www.prweb.com Published On :: The drones and programs will be fully paid for by the DOJ as part of the $850 million funding that has been allocated to help public safety departments fight the spread of COVID-19. This includes...(PRWeb April 30, 2020)Read the full story at https://www.prweb.com/releases/suntuity_airworks_offering_free_assistance_in_drone_acquisition_through_850mm_federal_grant_assistance_program_for_public_safety_agencies/prweb17090555.htm Full Article
c New York State YMCAs are “Open For Good” By www.prweb.com Published On :: With New York on PAUSE, the Alliance of New York State YMCAs will showcase how YMCAs are staying “Open For Good” to meet the needs of their community during the COVID-19 crisis on Giving Tuesday...(PRWeb May 02, 2020)Read the full story at https://www.prweb.com/releases/new_york_state_ymcas_are_open_for_good/prweb17088694.htm Full Article
c Viable Policy Pathways Expand Access to Renewable Energy for... By www.prweb.com Published On :: Newly launched REBA Institute shares research suggesting multiple policy pathways increase access, lower costs and drive decarbonization of the electricity sector.(PRWeb May 05, 2020)Read the full story at https://www.prweb.com/releases/viable_policy_pathways_expand_access_to_renewable_energy_for_commercial_industrial_sector/prweb17099869.htm Full Article
c Health Worker Data Alliance: Monitoring Emotional, Physical and... By www.prweb.com Published On :: Surveys provide secure, anonymous feedback from staff at all levels of healthcare organizations(PRWeb May 06, 2020)Read the full story at https://www.prweb.com/releases/health_worker_data_alliance_monitoring_emotional_physical_and_occupational_health_of_healthcare_workers_during_covid_19/prweb17101008.htm Full Article
c Colorado Court Rules STRmix Is “Relevant and Reliable” Practice for... By www.prweb.com Published On :: Defendant’s Motion to Exclude Expert Testimony regarding evidence generated by STRmix denied.(PRWeb May 08, 2020)Read the full story at https://www.prweb.com/releases/colorado_court_rules_strmix_is_relevant_and_reliable_practice_for_interpreting_likelihood_ratios/prweb17101548.htm Full Article
c Penalized generalized empirical likelihood with a diverging number of general estimating equations for censored data By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Niansheng Tang, Xiaodong Yan, Xingqiu Zhao. Source: The Annals of Statistics, Volume 48, Number 1, 607--627.Abstract: This article considers simultaneous variable selection and parameter estimation as well as hypothesis testing in censored survival models where a parametric likelihood is not available. For the problem, we utilize certain growing dimensional general estimating equations and propose a penalized generalized empirical likelihood, where the general estimating equations are constructed based on the semiparametric efficiency bound of estimation with given moment conditions. The proposed penalized generalized empirical likelihood estimators enjoy the oracle properties, and the estimator of any fixed dimensional vector of nonzero parameters achieves the semiparametric efficiency bound asymptotically. Furthermore, we show that the penalized generalized empirical likelihood ratio test statistic has an asymptotic central chi-square distribution. The conditions of local and restricted global optimality of weighted penalized generalized empirical likelihood estimators are also discussed. We present a two-layer iterative algorithm for efficient implementation, and investigate its convergence property. The performance of the proposed methods is demonstrated by extensive simulation studies, and a real data example is provided for illustration. Full Article
c Almost sure uniqueness of a global minimum without convexity By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Gregory Cox. Source: The Annals of Statistics, Volume 48, Number 1, 584--606.Abstract: This paper establishes the argmin of a random objective function to be unique almost surely. This paper first formulates a general result that proves almost sure uniqueness without convexity of the objective function. The general result is then applied to a variety of applications in statistics. Four applications are discussed, including uniqueness of M-estimators, both classical likelihood and penalized likelihood estimators, and two applications of the argmin theorem, threshold regression and weak identification. Full Article
c Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spanò. Source: The Annals of Statistics, Volume 48, Number 1, 560--583.Abstract: We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely-used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman $n$-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable $n$-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary. Full Article
c Markov equivalence of marginalized local independence graphs By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Søren Wengel Mogensen, Niels Richard Hansen. Source: The Annals of Statistics, Volume 48, Number 1, 539--559.Abstract: Symmetric independence relations are often studied using graphical representations. Ancestral graphs or acyclic directed mixed graphs with $m$-separation provide classes of symmetric graphical independence models that are closed under marginalization. Asymmetric independence relations appear naturally for multivariate stochastic processes, for instance, in terms of local independence. However, no class of graphs representing such asymmetric independence relations, which is also closed under marginalization, has been developed. We develop the theory of directed mixed graphs with $\mu$-separation and show that this provides a graphical independence model class which is closed under marginalization and which generalizes previously considered graphical representations of local independence. Several graphs may encode the same set of independence relations and this means that in many cases only an equivalence class of graphs can be identified from observational data. For statistical applications, it is therefore pivotal to characterize graphs that induce the same independence relations. Our main result is that for directed mixed graphs with $\mu$-separation each equivalence class contains a maximal element which can be constructed from the independence relations alone. Moreover, we introduce the directed mixed equivalence graph as the maximal graph with dashed and solid edges. This graph encodes all information about the edges that is identifiable from the independence relations, and furthermore it can be computed efficiently from the maximal graph. Full Article
c Averages of unlabeled networks: Geometric characterization and asymptotic behavior By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Eric D. Kolaczyk, Lizhen Lin, Steven Rosenberg, Jackson Walters, Jie Xu. Source: The Annals of Statistics, Volume 48, Number 1, 514--538.Abstract: It is becoming increasingly common to see large collections of network data objects, that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic tools already standard for scalar and vector data. In this paper, our focus is on averages of unlabeled, undirected networks with edge weights. Specifically, we (i) characterize a certain notion of the space of all such networks, (ii) describe key topological and geometric properties of this space relevant to doing probability and statistics thereupon, and (iii) use these properties to establish the asymptotic behavior of a generalized notion of an empirical mean under sampling from a distribution supported on this space. Our results rely on a combination of tools from geometry, probability theory and statistical shape analysis. In particular, the lack of vertex labeling necessitates working with a quotient space modding out permutations of labels. This results in a nontrivial geometry for the space of unlabeled networks, which in turn is found to have important implications on the types of probabilistic and statistical results that may be obtained and the techniques needed to obtain them. Full Article
c Optimal prediction in the linearly transformed spiked model By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Edgar Dobriban, William Leeb, Amit Singer. Source: The Annals of Statistics, Volume 48, Number 1, 491--513.Abstract: We consider the linearly transformed spiked model, where the observations $Y_{i}$ are noisy linear transforms of unobserved signals of interest $X_{i}$: \begin{equation*}Y_{i}=A_{i}X_{i}+\varepsilon_{i},\end{equation*} for $i=1,\ldots,n$. The transform matrices $A_{i}$ are also observed. We model the unobserved signals (or regression coefficients) $X_{i}$ as vectors lying on an unknown low-dimensional space. Given only $Y_{i}$ and $A_{i}$ how should we predict or recover their values? The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting $X_{i}$ by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions. We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods. Full Article
c Efficient estimation of linear functionals of principal components By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Vladimir Koltchinskii, Matthias Löffler, Richard Nickl. Source: The Annals of Statistics, Volume 48, Number 1, 464--490.Abstract: We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations $X_{1},\dots,X_{n}$ in a separable Hilbert space $\mathbb{H}$ with unknown covariance operator $\Sigma$. The complexity of the problem is characterized by its effective rank $\mathbf{r}(\Sigma):=\frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|}$, where $\mathrm{tr}(\Sigma)$ denotes the trace of $\Sigma$ and $\|\Sigma\|$ denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvectors of $\Sigma$. Under the assumption that $\mathbf{r}(\Sigma)=o(n)$, we establish the asymptotic normality and asymptotic properties of the risk of the resulting estimators and prove matching minimax lower bounds, showing their semiparametric optimality. Full Article
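The effective rank defined in the abstract above, trace divided by operator norm, can be illustrated for a finite-dimensional covariance matrix. The following is a minimal sketch, not the authors' code: the function names `operator_norm_psd` and `effective_rank` are hypothetical, the matrix is assumed symmetric positive semidefinite, and the operator norm is approximated by plain power iteration (an assumption made here to keep the example dependency-free).

```python
import math


def operator_norm_psd(sigma, iters=500):
    """Approximate the largest eigenvalue of a symmetric PSD matrix.

    Uses power iteration from a uniform start vector. This assumes the
    start vector is not orthogonal to the leading eigenvector.
    """
    n = len(sigma)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        # Multiply sigma by the current unit vector.
        w = [sum(sigma[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        lam = norm  # ||sigma v|| for unit v tends to the top eigenvalue
    return lam


def effective_rank(sigma):
    """Return tr(sigma) / ||sigma|| for a symmetric PSD matrix."""
    trace = sum(sigma[i][i] for i in range(len(sigma)))
    norm = operator_norm_psd(sigma)
    if norm == 0.0:
        raise ValueError("effective rank is undefined for the zero matrix")
    return trace / norm


if __name__ == "__main__":
    # Diagonal covariance with eigenvalues 4, 2, 1, 1: trace 8, norm 4.
    sigma = [[4.0, 0, 0, 0], [0, 2.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
    print(effective_rank(sigma))
```

The effective rank never exceeds the dimension and equals it for the identity matrix, which is why a condition such as $\mathbf{r}(\Sigma)=o(n)$ can hold even in very high-dimensional spaces when the spectrum decays quickly.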
c Uniformly valid confidence intervals post-model-selection By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST François Bachoc, David Preinerstorfer, Lukas Steinberger. Source: The Annals of Statistics, Volume 48, Number 1, 440--463.Abstract: We suggest general methods to construct asymptotically uniformly valid confidence intervals post-model-selection. The constructions are based on principles recently proposed by Berk et al. ( Ann. Statist. 41 (2013) 802–837). In particular, the candidate models used can be misspecified, the target of inference is model-specific, and coverage is guaranteed for any data-driven model selection procedure. After developing a general theory, we apply our methods to practically important situations where the candidate set of models, from which a working model is selected, consists of fixed design homoskedastic or heteroskedastic linear models, or of binary regression models with general link functions. In an extensive simulation study, we find that the proposed confidence intervals perform remarkably well, even when compared to existing methods that are tailored only for specific model selection procedures. Full Article
c Consistent selection of the number of change-points via sample-splitting By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Changliang Zou, Guanghui Wang, Runze Li. Source: The Annals of Statistics, Volume 48, Number 1, 413--439.Abstract: In multiple change-point analysis, one of the major challenges is to estimate the number of change-points. Most existing approaches attempt to minimize a Schwarz information criterion which balances a term quantifying model fit with a penalization term accounting for model complexity that increases with the number of change-points and limits overfitting. However, different penalization terms are required to adapt to different contexts of multiple change-point problems and the optimal penalization magnitude usually varies with the model and error distribution. We propose a data-driven selection criterion that is applicable to most kinds of popular change-point detection methods, including binary segmentation and optimal partitioning algorithms. The key idea is to select the number of change-points that minimizes the squared prediction error, which measures the fit of a specified model for a new sample. We develop a cross-validation estimation scheme based on an order-preserved sample-splitting strategy, and establish its asymptotic selection consistency under some mild conditions. Effectiveness of the proposed selection criterion is demonstrated on a variety of numerical experiments and real-data examples. Full Article
c The numerical bootstrap By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Han Hong, Jessie Li. Source: The Annals of Statistics, Volume 48, Number 1, 397--412.Abstract: This paper proposes a numerical bootstrap method that is consistent in many cases where the standard bootstrap is known to fail and where the $m$-out-of-$n$ bootstrap and subsampling have been the most commonly used inference approaches. We provide asymptotic analysis under both fixed and drifting parameter sequences, and we compare the approximation error of the numerical bootstrap with that of the $m$-out-of-$n$ bootstrap and subsampling. Finally, we discuss applications of the numerical bootstrap, such as constrained and unconstrained M-estimators converging at both regular and nonstandard rates, Laplace-type estimators, and test statistics for partially identified models. Full Article
c Concentration and consistency results for canonical and curved exponential-family models of random graphs By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Michael Schweinberger, Jonathan Stewart. Source: The Annals of Statistics, Volume 48, Number 1, 374--396.Abstract: Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likelihood and $M$-estimators of a wide range of canonical and curved exponential-family models of random graphs with local dependence. All results are nonasymptotic and applicable to random graphs with finite populations of nodes, although asymptotic consistency results can be obtained as well. In addition, we show that additional structure can facilitate subgraph-to-graph estimation, and present concentration results for subgraph-to-graph estimators. As an application, we consider popular curved exponential-family models of random graphs, with local dependence induced by transitivity and parameter vectors whose dimensions depend on the number of nodes. Full Article
c The multi-armed bandit problem: An efficient nonparametric solution By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Hock Peng Chan. Source: The Annals of Statistics, Volume 48, Number 1, 346--373.Abstract: Lai and Robbins (Adv. in Appl. Math. 6 (1985) 4–22) and Lai (Ann. Statist. 15 (1987) 1091–1114) provided efficient parametric solutions to the multi-armed bandit problem, showing that arm allocation via upper confidence bounds (UCB) achieves minimum regret. These bounds are constructed from the Kullback–Leibler information of the reward distributions, estimated from specified parametric families. In recent years, there has been renewed interest in the multi-armed bandit problem due to new applications in machine learning algorithms and data analytics. Nonparametric arm allocation procedures like $\epsilon$-greedy, Boltzmann exploration and BESA were studied, and modified versions of the UCB procedure were also analyzed under nonparametric settings. However, unlike UCB these nonparametric procedures are not efficient under general parametric settings. In this paper, we propose efficient nonparametric procedures. Full Article
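To make the idea of arm allocation via upper confidence bounds concrete, here is a minimal sketch of the widely known UCB1 index rule on Bernoulli arms. It is a swapped-in illustration: it is neither the Kullback–Leibler-based bounds of Lai and Robbins nor the nonparametric procedures proposed in the paper. The function name `ucb1`, the Bernoulli reward model and the arm means are all assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, rounds, rng):
    """Run UCB1 on simulated Bernoulli arms and return pull counts per arm.

    arm_means holds the (hypothetical) success probabilities, used only
    to simulate rewards; the allocation rule itself never reads them.
    """
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise its estimate
        else:
            # Index = empirical mean + exploration bonus sqrt(2 ln t / n_a).
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    pulls = ucb1([0.2, 0.8], rounds=2000, rng=random.Random(0))
    print(pulls)
```

The exploration bonus shrinks as an arm is pulled more often, so an arm that looks poor is revisited only occasionally, which is what keeps the number of suboptimal pulls growing slowly with the horizon.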
c Testing for principal component directions under weak identifiability By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Davy Paindaveine, Julien Remy, Thomas Verdebout. Source: The Annals of Statistics, Volume 48, Number 1, 324--345.Abstract: We consider the problem of testing, on the basis of a $p$-variate Gaussian random sample, the null hypothesis $\mathcal{H}_{0}:\boldsymbol{\theta}_{1}=\boldsymbol{\theta}_{1}^{0}$ against the alternative $\mathcal{H}_{1}:\boldsymbol{\theta}_{1}\neq \boldsymbol{\theta}_{1}^{0}$, where $\boldsymbol{\theta}_{1}$ is the “first” eigenvector of the underlying covariance matrix and $\boldsymbol{\theta}_{1}^{0}$ is a fixed unit $p$-vector. In the classical setup where eigenvalues $\lambda_{1}>\lambda_{2}\geq \cdots \geq \lambda_{p}$ are fixed, the Anderson (Ann. Math. Stat. 34 (1963) 122–148) likelihood ratio test (LRT) and the Hallin, Paindaveine and Verdebout (Ann. Statist. 38 (2010) 3245–3299) Le Cam optimal test for this problem are asymptotically equivalent under the null hypothesis, hence also under sequences of contiguous alternatives. We show that this equivalence does not survive asymptotic scenarios where $\lambda_{n1}/\lambda_{n2}=1+O(r_{n})$ with $r_{n}=O(1/\sqrt{n})$. For such scenarios, the Le Cam optimal test still asymptotically meets the nominal level constraint, whereas the LRT severely overrejects the null hypothesis. Consequently, the former test should be favored over the latter one whenever the two largest sample eigenvalues are close to each other. By relying on Le Cam’s asymptotic theory of statistical experiments, we study the non-null and optimality properties of the Le Cam optimal test in the aforementioned asymptotic scenarios and show that the null robustness of this test is not obtained at the expense of power. Our asymptotic investigation is extensive in the sense that it allows $r_{n}$ to converge to zero at an arbitrary rate. While we restrict to single-spiked spectra of the form $\lambda_{n1}>\lambda_{n2}=\cdots =\lambda_{np}$ to make our results as striking as possible, we extend our results to the more general elliptical case. Finally, we present an illustrative real data example. Full Article
c Sparse high-dimensional regression: Exact scalable algorithms and phase transitions By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Dimitris Bertsimas, Bart Van Parys. Source: The Annals of Statistics, Volume 48, Number 1, 300--323.Abstract: We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes $n$ and number of regressors $p$ in the 100,000s, that is, two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples $n$ increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for small number of samples $n$, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative over heuristic methods available at present. Full Article
c Bootstrap confidence regions based on M-estimators under nonstandard conditions By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Stephen M. S. Lee, Puyudi Yang. Source: The Annals of Statistics, Volume 48, Number 1, 274--299.Abstract: Suppose that a confidence region is desired for a subvector $\theta$ of a multidimensional parameter $\xi=(\theta,\psi)$, based on an M-estimator $\hat{\xi}_{n}=(\hat{\theta}_{n},\hat{\psi}_{n})$ calculated from a random sample of size $n$. Under nonstandard conditions $\hat{\xi}_{n}$ often converges at a nonregular rate $r_{n}$, in which case consistent estimation of the distribution of $r_{n}(\hat{\theta}_{n}-\theta)$, a pivot commonly chosen for confidence region construction, is most conveniently effected by the $m$ out of $n$ bootstrap. The above choice of pivot has three drawbacks: (i) the shape of the region is either subjectively prescribed or controlled by a computationally intensive depth function; (ii) the region is not transformation equivariant; (iii) $\hat{\xi}_{n}$ may not be uniquely defined. To resolve the above difficulties, we propose a one-dimensional pivot derived from the criterion function, and prove that its distribution can be consistently estimated by the $m$ out of $n$ bootstrap, or by a modified version of the perturbation bootstrap. This leads to a new method for constructing confidence regions which are transformation equivariant and have shapes driven solely by the criterion function. A subsampling procedure is proposed for selecting $m$ in practice. Empirical performance of the new method is illustrated with examples drawn from different nonstandard M-estimation settings. Extension of our theory to row-wise independent triangular arrays is also explored. Full Article
c Statistical inference for model parameters in stochastic gradient descent By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Xi Chen, Jason D. Lee, Xin T. Tong, Yichen Zhang. Source: The Annals of Statistics, Volume 48, Number 1, 251--273.Abstract: The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing works focus on the convergence of the objective function or the error of the obtained solution, we investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain smoothness conditions. Our main contributions are twofold. First, in the fixed dimension setup, we propose two consistent estimators of the asymptotic covariance of the average iterate from SGD: (1) a plug-in estimator, and (2) a batch-means estimator, which is computationally more efficient and only uses the iterates from SGD. Both proposed estimators allow us to construct asymptotically exact confidence intervals and hypothesis tests. Second, for high-dimensional linear regression, using a variant of the SGD algorithm, we construct a debiased estimator of each regression coefficient that is asymptotically normal. This gives a one-pass algorithm for computing both the sparse regression coefficients and confidence intervals, which is computationally attractive and applicable to online data. Full Article
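The batch-means idea in the abstract above, building a confidence interval from the SGD iterates alone, can be sketched on a toy problem. The example below is a simplification under stated assumptions and not the paper's estimator: it estimates a scalar mean with squared-error loss, uses equal-sized batches (the paper's batching scheme may differ), and the function name `averaged_sgd_with_batch_means`, the step-size schedule and the 1.96 normal quantile are choices made here for illustration.

```python
import math
import random


def averaged_sgd_with_batch_means(samples, step=0.5, decay=0.6, num_batches=20):
    """Averaged SGD for a scalar mean, with a batch-means interval.

    Minimises the expected loss 0.5 * (theta - x)^2 one sample at a time
    and returns (averaged iterate, approximate 95% half-width).
    """
    theta = 0.0
    iterates = []
    for t, x in enumerate(samples, start=1):
        eta = step * t ** (-decay)   # decaying step size eta_t
        theta -= eta * (theta - x)   # stochastic gradient step
        iterates.append(theta)

    batch = len(iterates) // num_batches
    used = batch * num_batches       # drop any leftover iterates
    theta_bar = sum(iterates[:used]) / used

    # Mean of the iterates inside each equal-sized batch.
    means = [
        sum(iterates[k * batch:(k + 1) * batch]) / batch
        for k in range(num_batches)
    ]
    # Spread of batch means estimates the variance of sqrt(n) * theta_bar.
    var_hat = batch * sum((m - theta_bar) ** 2 for m in means) / (num_batches - 1)
    half_width = 1.96 * math.sqrt(var_hat / used)
    return theta_bar, half_width


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(3.0, 1.0) for _ in range(20000)]
    estimate, hw = averaged_sgd_with_batch_means(data)
    print(estimate - hw, estimate + hw)
```

Because only the stored iterates are used, no Hessian or gradient covariance has to be computed, which is the computational advantage the abstract attributes to the batch-means estimator over the plug-in alternative.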
c Spectral and matrix factorization methods for consistent community detection in multi-layer networks By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Subhadeep Paul, Yuguo Chen. Source: The Annals of Statistics, Volume 48, Number 1, 230--250.Abstract: We consider the problem of estimating a consensus community structure by combining information from multiple layers of a multi-layer network using methods based on the spectral clustering or a low-rank matrix factorization. As a general theme, these “intermediate fusion” methods involve obtaining a low column rank matrix by optimizing an objective function and then using the columns of the matrix for clustering. However, the theoretical properties of these methods remain largely unexplored. In the absence of statistical guarantees on the objective functions, it is difficult to determine if the algorithms optimizing the objectives will return good community structures. We investigate the consistency properties of the global optimizer of some of these objective functions under the multi-layer stochastic blockmodel. For this purpose, we derive several new asymptotic results showing consistency of the intermediate fusion techniques along with the spectral clustering of mean adjacency matrix under a high dimensional setup, where the number of nodes, the number of layers and the number of communities of the multi-layer graph grow. Our numerical study shows that the intermediate fusion techniques outperform late fusion methods, namely spectral clustering on aggregate spectral kernel and module allegiance matrix in sparse networks, while they outperform the spectral clustering of mean adjacency matrix in multi-layer networks that contain layers with both homophilic and heterophilic communities. Full Article
c Optimal rates for community estimation in the weighted stochastic block model By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Min Xu, Varun Jog, Po-Ling Loh. Source: The Annals of Statistics, Volume 48, Number 1, 183--204.Abstract: Community identification in a network is an important problem in fields such as social science, neuroscience and genetics. Over the past decade, stochastic block models (SBMs) have emerged as a popular statistical framework for this problem. However, SBMs have an important limitation in that they are suited only for networks with unweighted edges; in various scientific applications, disregarding the edge weights may result in a loss of valuable information. We study a weighted generalization of the SBM, in which observations are collected in the form of a weighted adjacency matrix and the weight of each edge is generated independently from an unknown probability density determined by the community membership of its endpoints. We characterize the optimal rate of misclustering error of the weighted SBM in terms of the Renyi divergence of order 1/2 between the weight distributions of within-community and between-community edges, substantially generalizing existing results for unweighted SBMs. Furthermore, we present a computationally tractable algorithm based on discretization that achieves the optimal error rate. Our method is adaptive in the sense that the algorithm, without assuming knowledge of the weight densities, performs as well as the best algorithm that knows the weight densities. Full Article
c New $G$-formula for the sequential causal effect and blip effect of treatment in sequential causal inference By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Xiaoqin Wang, Li Yin. Source: The Annals of Statistics, Volume 48, Number 1, 138--160.Abstract: In sequential causal inference, two types of causal effects are of practical interest, namely, the causal effect of the treatment regime (called the sequential causal effect) and the blip effect of treatment on the potential outcome after the last treatment. The well-known $G$-formula expresses these causal effects in terms of the standard parameters. In this article, we obtain a new $G$-formula that expresses these causal effects in terms of the point observable effects of treatments similar to treatment in the framework of single-point causal inference. Based on the new $G$-formula, we estimate these causal effects by maximum likelihood via point observable effects with methods extended from single-point causal inference. We are able to increase precision of the estimation without introducing biases by an unsaturated model imposing constraints on the point observable effects. We are also able to reduce the number of point observable effects in the estimation by treatment assignment conditions. Full Article
c Model assisted variable clustering: Minimax-optimal recovery and algorithms By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer, Nicolas Verzelen. Source: The Annals of Statistics, Volume 48, Number 1, 111--137.Abstract: The problem of variable clustering is that of estimating groups of similar components of a $p$-dimensional vector $X=(X_{1},\ldots,X_{p})$ from $n$ independent copies of $X$. There exists a large number of algorithms that return data-dependent groups of variables, but their interpretation is limited to the algorithm that produced them. An alternative is model-based clustering, in which one begins by defining population level clusters relative to a model that embeds notions of similarity. Algorithms tailored to such models yield estimated clusters with a clear statistical interpretation. We take this view here and introduce the class of $G$-block covariance models as a background model for variable clustering. In such models, two variables in a cluster are deemed similar if they have similar associations with all other variables. This can arise, for instance, when groups of variables are noise corrupted versions of the same latent factor. We quantify the difficulty of clustering data generated from a $G$-block covariance model in terms of cluster proximity, measured with respect to two related, but different, cluster separation metrics. We derive minimax cluster separation thresholds, which are the metric values below which no algorithm can recover the model-defined clusters exactly, and show that they are different for the two metrics. We therefore develop two algorithms, COD and PECOK, tailored to $G$-block covariance models, and study their minimax-optimality with respect to each metric. Of independent interest is the fact that the analysis of the PECOK algorithm, which is based on a corrected convex relaxation of the popular $K$-means algorithm, provides the first statistical analysis of such algorithms for variable clustering. Additionally, we compare our methods with another popular clustering method, spectral clustering. Extensive simulation studies, as well as our data analyses, confirm the applicability of our approach. Full Article
c Robust sparse covariance estimation by thresholding Tyler’s M-estimator By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST John Goes, Gilad Lerman, Boaz Nadler. Source: The Annals of Statistics, Volume 48, Number 1, 86--110.Abstract: Estimating a high-dimensional sparse covariance matrix from a limited number of samples is a fundamental task in contemporary data analysis. Most proposals to date, however, are not robust to outliers or heavy tails. Toward bridging this gap, in this work we consider estimating a sparse shape matrix from $n$ samples following a possibly heavy-tailed elliptical distribution. We propose estimators based on thresholding either Tyler’s M-estimator or its regularized variant. We prove that in the joint limit as the dimension $p$ and the sample size $n$ tend to infinity with $p/n\to \gamma >0$, our estimators are minimax rate optimal. Results on simulated data support our theoretical analysis. Full Article
c Rerandomization in $2^{K}$ factorial experiments By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Xinran Li, Peng Ding, Donald B. Rubin. Source: The Annals of Statistics, Volume 48, Number 1, 43--63.Abstract: With many pretreatment covariates and treatment factors, the classical factorial experiment often fails to balance covariates across multiple factorial effects simultaneously. Therefore, it is intuitive to restrict the randomization of the treatment factors to satisfy certain covariate balance criteria, possibly conforming to the tiers of factorial effects and covariates based on their relative importances. This is rerandomization in factorial experiments. We study the asymptotic properties of this experimental design under the randomization inference framework without imposing any distributional or modeling assumptions of the covariates and outcomes. We derive the joint asymptotic sampling distribution of the usual estimators of the factorial effects, and show that it is symmetric, unimodal and more “concentrated” at the true factorial effects under rerandomization than under the classical factorial experiment. We quantify this advantage of rerandomization using the notions of “central convex unimodality” and “peakedness” of the joint asymptotic sampling distribution. We also construct conservative large-sample confidence sets for the factorial effects. Full Article
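The rerandomization idea in the preceding abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes a $2^{2}$ design with equal cell sizes and a single covariate, and it accepts an assignment when the covariate mean difference for each of the three factorial contrasts (two main effects and the interaction) is at most `threshold` standard deviations of the covariate. The function names, the acceptance rule and the `threshold` parameter are hypothetical choices made for the example; the balance criteria studied in the paper may differ.

```python
import random
from statistics import mean, pstdev

# The three factorial contrasts of a 2^2 design, as functions of a cell (z1, z2).
CONTRASTS = (
    lambda z: z[0],         # main effect of factor 1
    lambda z: z[1],         # main effect of factor 2
    lambda z: z[0] * z[1],  # two-factor interaction
)


def contrast_imbalance(assign, x):
    """Absolute difference in covariate means between the +1 and -1 sides
    of each factorial contrast."""
    out = []
    for g in CONTRASTS:
        plus = [xi for z, xi in zip(assign, x) if g(z) == 1]
        minus = [xi for z, xi in zip(assign, x) if g(z) == -1]
        out.append(abs(mean(plus) - mean(minus)))
    return out


def rerandomize(x, threshold, seed=0, max_draws=10000):
    """Redraw balanced 2^2 assignments until every contrast is balanced.

    Returns (assignment, number_of_draws). `threshold` is expressed in
    standard deviations of the covariate (an assumption of this sketch).
    """
    n = len(x)
    if n == 0 or n % 4:
        raise ValueError("number of units must be a positive multiple of 4")
    cells = [(a, b) for a in (-1, 1) for b in (-1, 1)] * (n // 4)
    rng = random.Random(seed)
    scale = pstdev(x)
    for draw in range(1, max_draws + 1):
        rng.shuffle(cells)  # one complete randomization
        if max(contrast_imbalance(cells, x)) <= threshold * scale:
            return list(cells), draw
    raise RuntimeError("no acceptable assignment found; loosen the threshold")
```

A tighter `threshold` gives better balance but requires more draws on average, which is the trade-off a practitioner would tune.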
c The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Emmanuel J. Candès, Pragya Sur. Source: The Annals of Statistics, Volume 48, Number 1, 27--42.Abstract: This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp “phase transition.” We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n\rightarrow \kappa $, we show that if the problem is sufficiently high dimensional in the sense that $\kappa >h_{\mathrm{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa <h_{\mathrm{MLE}}$, the MLE asymptotically exists with probability one. Full Article
c Two-step semiparametric empirical likelihood inference By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Francesco Bravo, Juan Carlos Escanciano, Ingrid Van Keilegom. Source: The Annals of Statistics, Volume 48, Number 1, 1--26.Abstract: In both parametric and certain nonparametric statistical models, the empirical likelihood ratio satisfies a nonparametric version of Wilks’ theorem. For many semiparametric models, however, the commonly used two-step (plug-in) empirical likelihood ratio is not asymptotically distribution-free, that is, its asymptotic distribution contains unknown quantities, and hence Wilks’ theorem breaks down. This article suggests a general approach to restore Wilks’ phenomenon in two-step semiparametric empirical likelihood inferences. The main insight consists in using as the moment function in the estimating equation the influence function of the plug-in sample moment. The proposed method is general; it leads to a chi-squared limiting distribution with known degrees of freedom; it is efficient; it does not require undersmoothing; and it is less sensitive to the first-step than alternative methods, which is particularly appealing for high-dimensional settings. Several examples and simulation studies illustrate the general applicability of the procedure and its excellent finite sample performance relative to competing methods. Full Article
c Detecting relevant changes in the mean of nonstationary processes—A mass excess approach By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Holger Dette, Weichi Wu. Source: The Annals of Statistics, Volume 47, Number 6, 3578--3608.Abstract: This paper considers the problem of testing if a sequence of means $(\mu_{t})_{t=1,\ldots ,n}$ of a nonstationary time series $(X_{t})_{t=1,\ldots ,n}$ is stable in the sense that the difference of the means $\mu_{1}$ and $\mu_{t}$ between the initial time $t=1$ and any other time is smaller than a given threshold, that is $|\mu_{1}-\mu_{t}|\leq c$ for all $t=1,\ldots ,n$. A test for hypotheses of this type is developed using a bias corrected monotone rearranged local linear estimator and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location of the roots of the equation $|\mu_{1}-\mu_{t}|=c$ a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a nonstationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples. Full Article
c Intrinsic Riemannian functional data analysis By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Zhenhua Lin, Fang Yao. Source: The Annals of Statistics, Volume 47, Number 6, 3533--3577.Abstract: In this work we develop a novel and foundational framework for analyzing general Riemannian functional data, in particular a new development of tensor Hilbert spaces along curves on a manifold. Such spaces enable us to derive Karhunen–Loève expansion for Riemannian random processes. This framework also features an approach to compare objects from different tensor Hilbert spaces, which paves the way for asymptotic analysis in Riemannian functional data analysis. Built upon intrinsic geometric concepts such as vector field, Levi-Civita connection and parallel transport on Riemannian manifolds, the developed framework applies to not only Euclidean submanifolds but also manifolds without a natural ambient space. As applications of this framework, we develop intrinsic Riemannian functional principal component analysis (iRFPCA) and intrinsic Riemannian functional linear regression (iRFLR) that are distinct from their traditional and ambient counterparts. We also provide estimation procedures for iRFPCA and iRFLR, and investigate their asymptotic properties within the intrinsic geometry. Numerical performance is illustrated by simulated and real examples. Full Article
c Tracy–Widom limit for Kendall’s tau By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Zhigang Bao. Source: The Annals of Statistics, Volume 47, Number 6, 3504--3532.Abstract: In this paper, we study a high-dimensional random matrix model from nonparametric statistics called the Kendall rank correlation matrix, which is a natural multivariate extension of the Kendall rank correlation coefficient. We establish the Tracy–Widom law for its largest eigenvalue. It is the first Tracy–Widom law for a nonparametric random matrix model, and also the first Tracy–Widom law for a high-dimensional U-statistic. Full Article
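For readers unfamiliar with the object in the preceding abstract, the Kendall rank correlation matrix collects the pairwise Kendall coefficients of the coordinates. The sketch below computes it directly from the definition, assuming no ties within a column; the function names are illustrative, and the quadratic-time pair loop is chosen for clarity rather than speed. It does not compute eigenvalues or anything specific to the Tracy–Widom result.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall rank correlation of two equal-length sequences without ties:
    (concordant pairs - discordant pairs) / (number of pairs)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def kendall_matrix(columns):
    """p x p matrix of pairwise Kendall coefficients for a list of p columns,
    each column holding the n observations of one coordinate."""
    p = len(columns)
    return [[kendall_tau(columns[a], columns[b]) for b in range(p)]
            for a in range(p)]
```

The matrix is symmetric with unit diagonal when the columns are tie-free, so only the upper triangle would need computing in a performance-sensitive setting.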
c Joint convergence of sample autocovariance matrices when $p/n\to 0$ with application By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Monika Bhattacharjee, Arup Bose. Source: The Annals of Statistics, Volume 47, Number 6, 3470--3503.Abstract: Consider a high-dimensional linear time series model where the dimension $p$ and the sample size $n$ grow in such a way that $p/n\to 0$. Let $\hat{\Gamma }_{u}$ be the $u$th order sample autocovariance matrix. We first show that the LSD of any symmetric polynomial in $\{\hat{\Gamma }_{u},\hat{\Gamma }_{u}^{*},u\geq 0\}$ exists under independence and moment assumptions on the driving sequence together with weak assumptions on the coefficient matrices. This LSD result, with some additional effort, implies the asymptotic normality of the trace of any polynomial in $\{\hat{\Gamma }_{u},\hat{\Gamma }_{u}^{*},u\geq 0\}$. We also study similar results for several independent MA processes. We show applications of the above results to statistical inference problems such as in estimation of the unknown order of a high-dimensional MA process and in graphical and significance tests for hypotheses on coefficient matrices of one or several such independent processes. Full Article
c Bootstrapping and sample splitting for high-dimensional, assumption-lean inference By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Alessandro Rinaldo, Larry Wasserman, Max G’Sell. Source: The Annals of Statistics, Volume 47, Number 6, 3438--3469.Abstract: Several new methods have been recently proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and the rest for inference. In this paper, we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-lean approach to inference and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension which we then use to assess the accuracy of regression inference. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. Finally, we elucidate an inference-prediction trade-off: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions. Full Article
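The sample-splitting recipe described in the preceding abstract (one part of the data for model selection, the rest for inference) can be sketched as below. Everything beyond that recipe is an assumption made for the example: the selection rule (largest absolute marginal correlation), the single selected covariate, and the Normal-approximation interval with a heteroscedasticity-robust standard error are illustrative stand-ins, not the parameters or bounds developed in the paper.

```python
import math
import random
from statistics import NormalDist, mean


def _centered(values):
    m = mean(values)
    return [v - m for v in values]


def abs_corr(x, y):
    """Absolute sample correlation between two sequences."""
    xc, yc = _centered(x), _centered(y)
    num = sum(a * b for a, b in zip(xc, yc))
    den = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in yc))
    return abs(num / den)


def split_and_infer(X, y, level=0.95, seed=0):
    """Select a covariate on one half, then build an interval on the other.

    X is a list of rows, y a list of responses.
    Returns (selected_index, slope, (lower, upper)).
    """
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # random split into two halves
    half = n // 2
    sel, inf = idx[:half], idx[half:]

    # Step 1 (selection half): pick the covariate most correlated with y.
    y_sel = [y[i] for i in sel]
    p = len(X[0])
    j = max(range(p), key=lambda k: abs_corr([X[i][k] for i in sel], y_sel))

    # Step 2 (inference half): least-squares slope of y on the chosen covariate.
    xc = _centered([X[i][j] for i in inf])
    yc = _centered([y[i] for i in inf])
    sxx = sum(a * a for a in xc)
    slope = sum(a * b for a, b in zip(xc, yc)) / sxx

    # Robust (sandwich) standard error, then a Normal-approximation interval.
    resid = [b - slope * a for a, b in zip(xc, yc)]
    se = math.sqrt(sum((a * r) ** 2 for a, r in zip(xc, resid))) / sxx
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return j, slope, (slope - z * se, slope + z * se)
```

Because the inference half plays no role in choosing `j`, the interval is not distorted by the selection step; the price is that each step sees only half of the data.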
c Minimax posterior convergence rates and model selection consistency in high-dimensional DAG models based on sparse Cholesky factors By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Kyoungjae Lee, Jaeyong Lee, Lizhen Lin. Source: The Annals of Statistics, Volume 47, Number 6, 3413--3437.Abstract: In this paper we study the high-dimensional sparse directed acyclic graph (DAG) models under the empirical sparse Cholesky prior. Among our results, strong model selection consistency or graph selection consistency is obtained under more general conditions than those in the existing literature. Compared to Cao, Khare and Ghosh [ Ann. Statist. (2019) 47 319–348], the required conditions are weakened in terms of the dimensionality, sparsity and lower bound of the nonzero elements in the Cholesky factor. Furthermore, our result does not require the irrepresentable condition, which is necessary for Lasso-type methods. We also derive the posterior convergence rates for precision matrices and Cholesky factors with respect to various matrix norms. The obtained posterior convergence rates are the fastest among those of the existing Bayesian approaches. In particular, we prove that our posterior convergence rates for Cholesky factors are the minimax or at least nearly minimax depending on the relative size of true sparseness for the entire dimension. The simulation study confirms that the proposed method outperforms the competing methods. Full Article
c A smeary central limit theorem for manifolds with application to high-dimensional spheres By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Benjamin Eltzner, Stephan F. Huckemann. Source: The Annals of Statistics, Volume 47, Number 6, 3360--3381.Abstract: The central limit theorems (CLTs) from the literature for generalized Fréchet means on manifolds (data descriptors assuming values in manifolds, such as intrinsic means, geodesics, etc.) are only valid if a certain empirical process of Hessians of the Fréchet function converges suitably, as in the proof of the prototypical BP-CLT [ Ann. Statist. 33 (2005) 1225–1259]. This is not valid in many realistic scenarios and we provide a new, very general CLT. In particular, this includes scenarios where, in a suitable chart, the sample mean fluctuates asymptotically at a scale $n^{\alpha }$ with exponents $\alpha <1/2$ with a nonnormal distribution. As the BP-CLT yields only fluctuations that are, rescaled with $n^{1/2}$, asymptotically normal, just as the classical CLT for random vectors, these lower rates, somewhat loosely called smeariness, had to date been observed only on the circle. We make the concept of smeariness on manifolds precise, give an example for two-smeariness on spheres of arbitrary dimension, and show that smeariness, although “almost never” occurring, may have serious statistical implications on a continuum of sample scenarios nearby. In fact, this effect increases with dimension, striking in particular in high dimension low sample size scenarios. Full Article
c Hypothesis testing on linear structures of high-dimensional covariance matrix By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Shurong Zheng, Zhao Chen, Hengjian Cui, Runze Li. Source: The Annals of Statistics, Volume 47, Number 6, 3300--3334.Abstract: This paper is concerned with test of significance on high-dimensional covariance structures, and aims to develop a unified framework for testing commonly used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high-dimensional random matrix theory, and establish several highly useful asymptotic results. With the aid of these asymptotic results, we derive the limiting distributions of these two tests under the null and alternative hypotheses. We further show that the quadratic loss based test is asymptotically unbiased. We conduct Monte Carlo simulation study to examine the finite sample performance of the two tests. Our simulation results show that the limiting null distributions approximate their null distributions quite well, and the corresponding asymptotic critical values keep Type I error rate very well. Our numerical comparison implies that the proposed tests outperform existing ones in terms of controlling Type I error rate and power. Our simulation indicates that the test based on quadratic loss seems to have better power than the test based on entropy loss. Full Article
c Sampling and estimation for (sparse) exchangeable graphs By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Victor Veitch, Daniel M. Roy. Source: The Annals of Statistics, Volume 47, Number 6, 3274--3299.Abstract: Sparse exchangeable graphs on $\mathbb{R}_{+}$, and the associated graphex framework for sparse graphs, generalize exchangeable graphs on $\mathbb{N}$, and the associated graphon framework for dense graphs. We develop the graphex framework as a tool for statistical network analysis by identifying the sampling scheme that is naturally associated with the models of the framework, formalizing two natural notions of consistent estimation of the parameter (the graphex) underlying these models, and identifying general consistent estimators in each case. The sampling scheme is a modification of independent vertex sampling that throws away vertices that are isolated in the sampled subgraph. The estimators are variants of the empirical graphon estimator, which is known to be a consistent estimator for the distribution of dense exchangeable graphs; both can be understood as graph analogues to the empirical distribution in the i.i.d. sequence setting. Our results may be viewed as a generalization of consistent estimation via the empirical graphon from the dense graph regime to also include sparse graphs. Full Article
c Quantile regression under memory constraint By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xi Chen, Weidong Liu, Yichen Zhang. Source: The Annals of Statistics, Volume 47, Number 6, 3244--3273.Abstract: This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naive divide-and-conquer approach, which splits data into batches of size $m$, computes the local QR estimator for each batch and then aggregates the estimators via averaging. However, this method only works when $n=o(m^{2})$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregations. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality for the obtained estimator and show that our estimator with only a few rounds of aggregations achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the case that the dimensionality $p$ goes to infinity. The proposed method can also be applied to address the QR problem under distributed computing environment (e.g., in a large-scale sensor network) or for real-time streaming data. Full Article
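The naive divide-and-conquer baseline mentioned in the preceding abstract (split into batches of size $m$, fit locally, average) can be illustrated in its simplest special case. The sketch assumes an intercept-only quantile regression, where the local estimator reduces to an empirical quantile of the batch; the function names are hypothetical, and the refinement scheme proposed in the paper is not implemented here.

```python
import math


def sample_quantile(values, tau):
    """Empirical tau-quantile: the smallest order statistic with at least a
    fraction tau of the data at or below it. In an intercept-only model this
    value minimizes the quantile-regression check loss."""
    if not values or not 0 < tau < 1:
        raise ValueError("need non-empty data and 0 < tau < 1")
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def naive_dc_quantile(data, m, tau):
    """Naive divide-and-conquer estimate: cut the data into consecutive
    batches of size m (a shorter final batch is kept), compute the local
    quantile on each batch, and average the local estimates."""
    if m < 1:
        raise ValueError("batch size m must be positive")
    batches = [data[i:i + m] for i in range(0, len(data), m)]
    local = [sample_quantile(batch, tau) for batch in batches]
    return sum(local) / len(local)
```

Only one batch needs to be held in memory at a time, which is the point of the approach; the abstract's caveat about when this averaging is valid concerns the relation between $n$ and $m$, which the sketch does not check.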
c On partial-sum processes of ARMAX residuals By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Steffen Grønneberg, Benjamin Holcblat. Source: The Annals of Statistics, Volume 47, Number 6, 3216--3243.Abstract: We establish general and versatile results regarding the limit behavior of the partial-sum process of ARMAX residuals. Illustrations include ARMA with seasonal dummies, misspecified ARMAX models with autocorrelated errors, nonlinear ARMAX models, ARMA with a structural break, a wide range of ARMAX models with infinite-variance errors, weak GARCH models and the consistency of kernel estimation of the density of ARMAX errors. Our results identify the limit distributions, and provide a general algorithm to obtain pivot statistics for CUSUM tests. Full Article
c Statistical inference for autoregressive models under heteroscedasticity of unknown form By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Ke Zhu. Source: The Annals of Statistics, Volume 47, Number 6, 3185--3215.Abstract: This paper provides an entire inference procedure for the autoregressive model under (conditional) heteroscedasticity of unknown form with a finite variance. We first establish the asymptotic normality of the weighted least absolute deviations estimator (LADE) for the model. Second, we develop the random weighting (RW) method to estimate its asymptotic covariance matrix, leading to the implementation of the Wald test. Third, we construct a portmanteau test for model checking, and use the RW method to obtain its critical values. As a special weighted LADE, the feasible adaptive LADE (ALADE) is proposed and proved to have the same efficiency as its infeasible counterpart. The importance of our entire methodology based on the feasible ALADE is illustrated by simulation results and the real data analysis on three U.S. economic data sets. Full Article
c Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xin Bing, Marten H. Wegkamp. Source: The Annals of Statistics, Volume 47, Number 6, 3157--3184.Abstract: We consider the multivariate response regression problem with a regression coefficient matrix of low, unknown rank. In this setting, we analyze a new criterion for selecting the optimal reduced rank. This criterion differs notably from the one proposed in Bunea, She and Wegkamp ( Ann. Statist. 39 (2011) 1282–1309) in that it does not require estimation of the unknown variance of the noise, nor does it depend on a delicate choice of a tuning parameter. We develop an iterative, fully data-driven procedure, that adapts to the optimal signal-to-noise ratio. This procedure finds the true rank in a few steps with overwhelming probability. At each step, our estimate increases, while at the same time it does not exceed the true rank. Our finite sample results hold for any sample size and any dimension, even when the number of responses and of covariates grow much faster than the number of observations. We perform an extensive simulation study that confirms our theoretical findings. The new method performs better and is more stable than the procedure of Bunea, She and Wegkamp ( Ann. Statist. 39 (2011) 1282–1309) in both low- and high-dimensional settings. Full Article
c Randomized incomplete $U$-statistics in high dimensions By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xiaohui Chen, Kengo Kato. Source: The Annals of Statistics, Volume 47, Number 6, 3127--3156.Abstract: This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of big data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations tend to be both large, and the computation of the $U$-statistic is prohibitively demanding. Data-dependent inferential procedures such as the empirical bootstrap for $U$-statistics are even more computationally expensive. To overcome such a computational bottleneck, incomplete $U$-statistics obtained by sampling fewer terms of the $U$-statistic are attractive alternatives. In this paper, we introduce randomized incomplete $U$-statistics with sparse weights whose computational cost can be made independent of the order of the $U$-statistic. We derive nonasymptotic Gaussian approximation error bounds for the randomized incomplete $U$-statistics in high dimensions, namely in cases where the dimension $d$ is possibly much larger than the sample size $n$, for both nondegenerate and degenerate kernels. In addition, we propose generic bootstrap methods for the incomplete $U$-statistics that are computationally much less demanding than existing bootstrap methods, and establish finite sample validity of the proposed bootstrap methods. Our methods are illustrated on the application to nonparametric testing for the pairwise independence of a high-dimensional random vector under weaker assumptions than those appearing in the literature. Full Article
c Active ranking from pairwise comparisons and when parametric assumptions do not help By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Reinhard Heckel, Nihar B. Shah, Kannan Ramchandran, Martin J. Wainwright. Source: The Annals of Statistics, Volume 47, Number 6, 3099--3126.Abstract: We consider sequential or active ranking of a set of $n$ items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of prespecified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-$k$ items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comparisons won, and uses these counts to decide whether to stop, or to compare another pair of items, chosen based on confidence intervals specified by the data collected up to that point. We prove that this algorithm succeeds in recovering the ranking using a number of comparisons that is optimal up to logarithmic factors. This guarantee does not depend on whether or not the underlying pairwise probability matrix satisfies a particular structural property, unlike a significant body of past work on pairwise ranking based on parametric models such as the Thurstone or Bradley–Terry–Luce models. It has been a long-standing open question as to whether or not imposing these parametric assumptions allows for improved ranking algorithms. For stochastic comparison models, in which the pairwise probabilities are bounded away from zero, our second contribution is to resolve this issue by proving a lower bound for parametric models. This shows, perhaps surprisingly, that these popular parametric modeling choices offer at most logarithmic gains for stochastic comparisons. Full Article
c Sorted concave penalized regression By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Long Feng, Cun-Hui Zhang. Source: The Annals of Statistics, Volume 47, Number 6, 3069--3098.Abstract: The Lasso is biased. Concave penalized least squares estimation (PLSE) takes advantage of signal strength to reduce this bias, leading to sharper error bounds in prediction, coefficient estimation and variable selection. For prediction and estimation, the bias of the Lasso can also be reduced by taking a smaller penalty level than what selection consistency requires, but such a smaller penalty level depends on the sparsity of the true coefficient vector. The sorted $\ell_{1}$ penalized estimation (Slope) was proposed for adaptation to such smaller penalty levels. However, the advantages of concave PLSE and Slope do not subsume each other. We propose sorted concave penalized estimation to combine the advantages of concave and sorted penalizations. We prove that sorted concave penalties adaptively choose the smaller penalty level and at the same time benefit from signal strength, especially when a significant proportion of signals are stronger than the corresponding adaptively selected penalty levels. A local convex approximation for sorted concave penalties, which extends the local linear and quadratic approximations for separable concave penalties, is developed to facilitate the computation of sorted concave PLSE and proven to possess desired prediction and estimation error bounds. Our analysis of prediction and estimation errors requires the restricted eigenvalue condition on the design, not beyond, and provides selection consistency under a required minimum signal strength condition in addition. Thus, our results also sharpen existing results on concave PLSE by removing the upper sparse eigenvalue component of the sparse Riesz condition. Full Article
c Distributed estimation of principal eigenspaces By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Jianqing Fan, Dong Wang, Kaizheng Wang, Ziwei Zhu. Source: The Annals of Statistics, Volume 47, Number 6, 3009--3031.Abstract: Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location and distributed algorithms for PCA are thus needed. This paper proposes and studies a distributed PCA algorithm: each node machine computes the top $K$ eigenvectors and transmits them to the central server; the central server then aggregates the information from all the node machines and conducts a PCA based on the aggregated information. We investigate the bias and variance for the resulting distributed estimator of the top $K$ eigenvectors. In particular, we show that for distributions with symmetric innovation, the empirical top eigenspaces are unbiased, and hence the distributed PCA is “unbiased.” We derive the rate of convergence for distributed PCA estimators, which depends explicitly on the effective rank of covariance, eigengap, and the number of machines. We show that when the number of machines is not unreasonably large, the distributed PCA performs as well as the whole sample PCA, even without full access of whole data. The theoretical results are verified by an extensive simulation study. We also extend our analysis to the heterogeneous case where the population covariance matrices are different across local machines but share similar top eigenstructures. Full Article
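The two-stage scheme in the preceding abstract (local eigenvectors on each node, then a PCA of the aggregated information on the server) can be sketched for the simplest case $K=1$. Several details are assumptions of this sketch rather than statements from the abstract: the server aggregates by averaging the local projection matrices $vv^{\top}$, eigenvectors are found by plain power iteration from a fixed start vector, and all function names are illustrative.

```python
import math


def cov_matrix(rows):
    """Sample covariance matrix (divisor n) of a list of d-dimensional rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(d)] for i in range(d)]


def top_eigvec(A, iters=500):
    """Leading unit eigenvector of a symmetric PSD matrix by power iteration.
    Assumes the start vector is not orthogonal to the leading eigenvector."""
    d = len(A)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of A")
        v = [c / norm for c in w]
    return v


def distributed_top_direction(blocks):
    """Each block plays the role of one node's data. Nodes send only their
    leading eigenvector; the server averages the projection matrices v v^T
    and returns the leading eigenvector of that average."""
    d = len(blocks[0][0])
    avg = [[0.0] * d for _ in range(d)]
    for block in blocks:
        v = top_eigvec(cov_matrix(block))  # computed locally on the node
        for i in range(d):
            for j in range(d):
                avg[i][j] += v[i] * v[j] / len(blocks)
    return top_eigvec(avg)                 # computed on the central server
```

Averaging projection matrices rather than the eigenvectors themselves sidesteps the sign ambiguity of eigenvectors, since $vv^{\top}$ is unchanged when $v$ is replaced by $-v$. Each node transmits $d$ numbers instead of its raw data.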
c Testing for independence of large dimensional vectors By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Taras Bodnar, Holger Dette, Nestor Parolya. Source: The Annals of Statistics, Volume 47, Number 5, 2977--3008.Abstract: In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis of variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak convergence of linear spectral statistics of central and (conditionally) noncentral Fisher matrices. In particular, a central limit theorem for linear spectral statistics of large dimensional (conditionally) noncentral Fisher matrices is derived which is then used to analyse the power of the tests under the alternative. The theoretical results are illustrated by means of a simulation study where we also compare the new tests with several alternatives, in particular with the commonly used corrected likelihood ratio test. It is demonstrated that the latter test does not keep its nominal level if the dimension of one sub-vector is relatively small compared to the dimension of the other sub-vector. On the other hand, the tests proposed in this paper provide a reasonable approximation of the nominal level in such situations. Moreover, we observe that one of the proposed tests is most powerful under a variety of correlation scenarios. Full Article