
Encyclopedia of renewable and sustainable materials

9780128131961 (print)





Emerging eco-friendly green technologies for wastewater treatment

9789811513909 (electronic bk.)





Ecophysiology of pesticides : interface between pesticide chemistry and plant physiology

Parween, Talat, author.
9780128176146





Diabetes & obesity in women : adolescence, pregnancy, and menopause

Diabetes in women.
9781496390547 (paperback)





Current developments in biotechnology and bioengineering : resource recovery from wastes

0444643222





Consequences of microbial interactions with hydrocarbons, oils, and lipids : biodegradation and bioremediation

9783319445359 (electronic bk.)





Computer security : ESORICS 2019 International Workshops, IOSec, MSTEC, and FINSEC, Luxembourg City, Luxembourg, September 26-27, 2019, Revised Selected Papers

European Symposium on Research in Computer Security (24th : 2019 : Luxembourg, Luxembourg)
9783030420512 (electronic bk.)





Computed body tomography with MRI correlation

9781496370495 (hbk.)





Compression and chronic wound management

9783030011956 (electronic book)





Communications and networking : 14th EAI International Conference, ChinaCom 2019, Shanghai, China, November 29 - December 1, 2019, proceedings.

ChinaCom (Conference) (14th : 2019 : Shanghai, China)
9783030411176





Common problems in the newborn nursery : an evidence and case-based guide

9783319956725 (electronic bk.)





Climate change and food security with emphasis on wheat

9780128195277





Chickpea : crop wild relatives for enhancing genetic gains

9780128183007 (electronic bk.)





Breakfast cereals and how they are made : raw materials, processing, and production

9780128120446 (electronic bk.)





Biology and physiology of freshwater neotropical fishes

9780128158739 (electronic bk.)





Arctic plants of Svalbard : what we learn from the green in the treeless white world

Lee, Yoo Kyung, author
9783030345600 (electronic bk.)





Agri-food industry strategies for healthy diets and sustainability : new challenges in nutrition and public health

9780128172261





Notice of Construction - Woodbine Ave.







New Partnerships Emerge for COVID-19 Relief: Dade County Farm Bureau...

Harvested produce crops feed Florida Department of Corrections’ (FDC) more than 87,000 inmates; action saves food costs while reducing COVID-19 related supply chain impacts.

(PRWeb April 20, 2020)

Read the full story at https://www.prweb.com/releases/new_partnerships_emerge_for_covid_19_relief_dade_county_farm_bureau_teams_with_state_leaders_to_launch_farm_to_inmate_program/prweb17052045.htm







Suntuity AirWorks Offering FREE Assistance in Drone Acquisition...

The drones and programs will be fully paid for by the DOJ as part of the $850 million funding that has been allocated to help public safety departments fight the spread of COVID-19. This includes...

(PRWeb April 30, 2020)

Read the full story at https://www.prweb.com/releases/suntuity_airworks_offering_free_assistance_in_drone_acquisition_through_850mm_federal_grant_assistance_program_for_public_safety_agencies/prweb17090555.htm









Penalized generalized empirical likelihood with a diverging number of general estimating equations for censored data

Niansheng Tang, Xiaodong Yan, Xingqiu Zhao.

Source: The Annals of Statistics, Volume 48, Number 1, 607--627.

Abstract:
This article considers simultaneous variable selection and parameter estimation as well as hypothesis testing in censored survival models where a parametric likelihood is not available. For the problem, we utilize certain growing dimensional general estimating equations and propose a penalized generalized empirical likelihood, where the general estimating equations are constructed based on the semiparametric efficiency bound of estimation with given moment conditions. The proposed penalized generalized empirical likelihood estimators enjoy the oracle properties, and the estimator of any fixed dimensional vector of nonzero parameters achieves the semiparametric efficiency bound asymptotically. Furthermore, we show that the penalized generalized empirical likelihood ratio test statistic has an asymptotic central chi-square distribution. The conditions of local and restricted global optimality of weighted penalized generalized empirical likelihood estimators are also discussed. We present a two-layer iterative algorithm for efficient implementation, and investigate its convergence property. The performance of the proposed methods is demonstrated by extensive simulation studies, and a real data example is provided for illustration.





Almost sure uniqueness of a global minimum without convexity

Gregory Cox.

Source: The Annals of Statistics, Volume 48, Number 1, 584--606.

Abstract:
This paper establishes the argmin of a random objective function to be unique almost surely. This paper first formulates a general result that proves almost sure uniqueness without convexity of the objective function. The general result is then applied to a variety of applications in statistics. Four applications are discussed, including uniqueness of M-estimators, both classical likelihood and penalized likelihood estimators, and two applications of the argmin theorem, threshold regression and weak identification.





Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo

Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spanò.

Source: The Annals of Statistics, Volume 48, Number 1, 560--583.

Abstract:
We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely-used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman $n$-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable $n$-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary.





Averages of unlabeled networks: Geometric characterization and asymptotic behavior

Eric D. Kolaczyk, Lizhen Lin, Steven Rosenberg, Jackson Walters, Jie Xu.

Source: The Annals of Statistics, Volume 48, Number 1, 514--538.

Abstract:
It is becoming increasingly common to see large collections of network data objects, that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic tools already standard for scalar and vector data. In this paper, our focus is on averages of unlabeled, undirected networks with edge weights. Specifically, we (i) characterize a certain notion of the space of all such networks, (ii) describe key topological and geometric properties of this space relevant to doing probability and statistics thereupon, and (iii) use these properties to establish the asymptotic behavior of a generalized notion of an empirical mean under sampling from a distribution supported on this space. Our results rely on a combination of tools from geometry, probability theory and statistical shape analysis. In particular, the lack of vertex labeling necessitates working with a quotient space modding out permutations of labels. This results in a nontrivial geometry for the space of unlabeled networks, which in turn is found to have important implications on the types of probabilistic and statistical results that may be obtained and the techniques needed to obtain them.





Testing for principal component directions under weak identifiability

Davy Paindaveine, Julien Remy, Thomas Verdebout.

Source: The Annals of Statistics, Volume 48, Number 1, 324--345.

Abstract:
We consider the problem of testing, on the basis of a $p$-variate Gaussian random sample, the null hypothesis $\mathcal{H}_{0}:\boldsymbol{\theta}_{1}=\boldsymbol{\theta}_{1}^{0}$ against the alternative $\mathcal{H}_{1}:\boldsymbol{\theta}_{1}\neq \boldsymbol{\theta}_{1}^{0}$, where $\boldsymbol{\theta}_{1}$ is the “first” eigenvector of the underlying covariance matrix and $\boldsymbol{\theta}_{1}^{0}$ is a fixed unit $p$-vector. In the classical setup where eigenvalues $\lambda_{1}>\lambda_{2}\geq \cdots \geq \lambda_{p}$ are fixed, the Anderson (Ann. Math. Stat. 34 (1963) 122–148) likelihood ratio test (LRT) and the Hallin, Paindaveine and Verdebout (Ann. Statist. 38 (2010) 3245–3299) Le Cam optimal test for this problem are asymptotically equivalent under the null hypothesis, hence also under sequences of contiguous alternatives. We show that this equivalence does not survive asymptotic scenarios where $\lambda_{n1}/\lambda_{n2}=1+O(r_{n})$ with $r_{n}=O(1/\sqrt{n})$. For such scenarios, the Le Cam optimal test still asymptotically meets the nominal level constraint, whereas the LRT severely overrejects the null hypothesis. Consequently, the former test should be favored over the latter one whenever the two largest sample eigenvalues are close to each other. By relying on Le Cam’s asymptotic theory of statistical experiments, we study the non-null and optimality properties of the Le Cam optimal test in the aforementioned asymptotic scenarios and show that the null robustness of this test is not obtained at the expense of power. Our asymptotic investigation is extensive in the sense that it allows $r_{n}$ to converge to zero at an arbitrary rate. While we restrict to single-spiked spectra of the form $\lambda_{n1}>\lambda_{n2}=\cdots =\lambda_{np}$ to make our results as striking as possible, we extend our results to the more general elliptical case. Finally, we present an illustrative real data example.





Spectral and matrix factorization methods for consistent community detection in multi-layer networks

Subhadeep Paul, Yuguo Chen.

Source: The Annals of Statistics, Volume 48, Number 1, 230--250.

Abstract:
We consider the problem of estimating a consensus community structure by combining information from multiple layers of a multi-layer network using methods based on the spectral clustering or a low-rank matrix factorization. As a general theme, these “intermediate fusion” methods involve obtaining a low column rank matrix by optimizing an objective function and then using the columns of the matrix for clustering. However, the theoretical properties of these methods remain largely unexplored. In the absence of statistical guarantees on the objective functions, it is difficult to determine if the algorithms optimizing the objectives will return good community structures. We investigate the consistency properties of the global optimizer of some of these objective functions under the multi-layer stochastic blockmodel. For this purpose, we derive several new asymptotic results showing consistency of the intermediate fusion techniques along with the spectral clustering of mean adjacency matrix under a high dimensional setup, where the number of nodes, the number of layers and the number of communities of the multi-layer graph grow. Our numerical study shows that the intermediate fusion techniques outperform late fusion methods, namely spectral clustering on aggregate spectral kernel and module allegiance matrix in sparse networks, while they outperform the spectral clustering of mean adjacency matrix in multi-layer networks that contain layers with both homophilic and heterophilic communities.





Optimal rates for community estimation in the weighted stochastic block model

Min Xu, Varun Jog, Po-Ling Loh.

Source: The Annals of Statistics, Volume 48, Number 1, 183--204.

Abstract:
Community identification in a network is an important problem in fields such as social science, neuroscience and genetics. Over the past decade, stochastic block models (SBMs) have emerged as a popular statistical framework for this problem. However, SBMs have an important limitation in that they are suited only for networks with unweighted edges; in various scientific applications, disregarding the edge weights may result in a loss of valuable information. We study a weighted generalization of the SBM, in which observations are collected in the form of a weighted adjacency matrix and the weight of each edge is generated independently from an unknown probability density determined by the community membership of its endpoints. We characterize the optimal rate of misclustering error of the weighted SBM in terms of the Renyi divergence of order 1/2 between the weight distributions of within-community and between-community edges, substantially generalizing existing results for unweighted SBMs. Furthermore, we present a computationally tractable algorithm based on discretization that achieves the optimal error rate. Our method is adaptive in the sense that the algorithm, without assuming knowledge of the weight densities, performs as well as the best algorithm that knows the weight densities.





New $G$-formula for the sequential causal effect and blip effect of treatment in sequential causal inference

Xiaoqin Wang, Li Yin.

Source: The Annals of Statistics, Volume 48, Number 1, 138--160.

Abstract:
In sequential causal inference, two types of causal effects are of practical interest, namely, the causal effect of the treatment regime (called the sequential causal effect) and the blip effect of treatment on the potential outcome after the last treatment. The well-known $G$-formula expresses these causal effects in terms of the standard parameters. In this article, we obtain a new $G$-formula that expresses these causal effects in terms of the point observable effects of treatments similar to treatment in the framework of single-point causal inference. Based on the new $G$-formula, we estimate these causal effects by maximum likelihood via point observable effects with methods extended from single-point causal inference. We are able to increase precision of the estimation without introducing biases by an unsaturated model imposing constraints on the point observable effects. We are also able to reduce the number of point observable effects in the estimation by treatment assignment conditions.





Two-step semiparametric empirical likelihood inference

Francesco Bravo, Juan Carlos Escanciano, Ingrid Van Keilegom.

Source: The Annals of Statistics, Volume 48, Number 1, 1--26.

Abstract:
In both parametric and certain nonparametric statistical models, the empirical likelihood ratio satisfies a nonparametric version of Wilks’ theorem. For many semiparametric models, however, the commonly used two-step (plug-in) empirical likelihood ratio is not asymptotically distribution-free, that is, its asymptotic distribution contains unknown quantities, and hence Wilks’ theorem breaks down. This article suggests a general approach to restore Wilks’ phenomenon in two-step semiparametric empirical likelihood inferences. The main insight consists in using as the moment function in the estimating equation the influence function of the plug-in sample moment. The proposed method is general; it leads to a chi-squared limiting distribution with known degrees of freedom; it is efficient; it does not require undersmoothing; and it is less sensitive to the first-step than alternative methods, which is particularly appealing for high-dimensional settings. Several examples and simulation studies illustrate the general applicability of the procedure and its excellent finite sample performance relative to competing methods.





Tracy–Widom limit for Kendall’s tau

Zhigang Bao.

Source: The Annals of Statistics, Volume 47, Number 6, 3504--3532.

Abstract:
In this paper, we study a high-dimensional random matrix model from nonparametric statistics called the Kendall rank correlation matrix, which is a natural multivariate extension of the Kendall rank correlation coefficient. We establish the Tracy–Widom law for its largest eigenvalue. It is the first Tracy–Widom law for a nonparametric random matrix model, and also the first Tracy–Widom law for a high-dimensional U-statistic.
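As a rough illustration of the object studied in this abstract, the sketch below computes a Kendall rank correlation matrix from data columns in plain Python. The function names `kendall_tau` and `kendall_matrix` are hypothetical, the tau-a convention (tied pairs contribute zero) is an assumption, and nothing here reproduces the paper's normalization or its eigenvalue analysis.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall rank correlation (tau-a) between two equal-length sequences.

    Each pair of observations contributes +1 if concordant, -1 if
    discordant, and 0 if tied in either coordinate (an assumption).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    total = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        total += (s > 0) - (s < 0)  # sign of the product
    return total / (n * (n - 1) / 2)


def kendall_matrix(columns):
    """Matrix of pairwise Kendall taus; `columns` is a list of p variables."""
    p = len(columns)
    return [[kendall_tau(columns[a], columns[b]) for b in range(p)]
            for a in range(p)]


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [1, 3, 2, 4], [4, 3, 2, 1]]
    for row in kendall_matrix(data):
        print(row)
```

The pairwise loop costs O(n^2) per entry, which is fine for a sketch but not for large samples.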





Joint convergence of sample autocovariance matrices when $p/n\to 0$ with application

Monika Bhattacharjee, Arup Bose.

Source: The Annals of Statistics, Volume 47, Number 6, 3470--3503.

Abstract:
Consider a high-dimensional linear time series model where the dimension $p$ and the sample size $n$ grow in such a way that $p/n\to 0$. Let $\hat{\Gamma}_{u}$ be the $u$th order sample autocovariance matrix. We first show that the limiting spectral distribution (LSD) of any symmetric polynomial in $\{\hat{\Gamma}_{u},\hat{\Gamma}_{u}^{*},u\geq 0\}$ exists under independence and moment assumptions on the driving sequence together with weak assumptions on the coefficient matrices. This LSD result, with some additional effort, implies the asymptotic normality of the trace of any polynomial in $\{\hat{\Gamma}_{u},\hat{\Gamma}_{u}^{*},u\geq 0\}$. We also study similar results for several independent MA processes. We show applications of the above results to statistical inference problems such as in estimation of the unknown order of a high-dimensional MA process and in graphical and significance tests for hypotheses on coefficient matrices of one or several such independent processes.





On testing for high-dimensional white noise

Zeng Li, Clifford Lam, Jianfeng Yao, Qiwei Yao.

Source: The Annals of Statistics, Volume 47, Number 6, 3382--3412.

Abstract:
Testing for white noise is a classical yet important problem in statistics, especially for diagnostic checks in time series modeling and linear regression. For high-dimensional time series in the sense that the dimension $p$ is large in relation to the sample size $T$, the popular omnibus tests including the multivariate Hosking and Li–McLeod tests are extremely conservative, leading to substantial power loss. To develop more relevant tests for high-dimensional cases, we propose a portmanteau-type test statistic which is the sum of squared singular values of the first $q$ lagged sample autocovariance matrices. It, therefore, encapsulates all the serial correlations (up to the time lag $q$) within and across all component series. Using the tools from random matrix theory and assuming both $p$ and $T$ diverge to infinity, we derive the asymptotic normality of the test statistic under both the null and a specific VMA(1) alternative hypothesis. As the actual implementation of the test requires the knowledge of three characteristic constants of the population cross-sectional covariance matrix and the value of the fourth moment of the standardized innovations, nontrivial estimations are proposed for these parameters and their integration leads to a practically usable test. Extensive simulation confirms the excellent finite-sample performance of the new test with accurate size and satisfactory power for a large range of finite $(p,T)$ combinations, therefore, ensuring wide applicability in practice. In particular, the new tests are consistently superior to the traditional Hosking and Li–McLeod tests.
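The raw quantity named in this abstract, the sum of squared singular values of the first $q$ lagged sample autocovariance matrices, can be sketched as below. The sketch relies on the fact that a matrix's squared singular values sum to its squared Frobenius norm, so no decomposition is needed. The function names, the mean-centering and the $1/T$ normalization are assumptions, and the centering and scaling constants that turn this quantity into the paper's actual test are not reproduced.

```python
def lagged_autocov(series, lag):
    """Lag-`lag` sample autocovariance matrix of a T x p series (list of rows).

    Assumed convention: columns are mean-centered and the sum is divided by T.
    """
    T = len(series)
    p = len(series[0])
    mean = [sum(row[a] for row in series) / T for a in range(p)]
    centered = [[row[a] - mean[a] for a in range(p)] for row in series]
    return [[sum(centered[t + lag][a] * centered[t][b]
                 for t in range(T - lag)) / T
             for b in range(p)]
            for a in range(p)]


def sum_squared_singular_values(series, q):
    """Sum over lags 1..q of the squared Frobenius norm of each lagged
    autocovariance matrix, which equals its sum of squared singular values."""
    total = 0.0
    for lag in range(1, q + 1):
        gamma = lagged_autocov(series, lag)
        total += sum(v * v for row in gamma for v in row)
    return total


if __name__ == "__main__":
    toy = [[1.0, 0.5], [-1.0, 0.2], [1.0, -0.3], [-1.0, 0.1]]
    print(sum_squared_singular_values(toy, q=2))
```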





A smeary central limit theorem for manifolds with application to high-dimensional spheres

Benjamin Eltzner, Stephan F. Huckemann.

Source: The Annals of Statistics, Volume 47, Number 6, 3360--3381.

Abstract:
The central limit theorems (CLTs) for generalized Fréchet means (data descriptors assuming values in manifolds, such as intrinsic means, geodesics, etc.) on manifolds from the literature are only valid if a certain empirical process of Hessians of the Fréchet function converges suitably, as in the proof of the prototypical BP-CLT [Ann. Statist. 33 (2005) 1225–1259]. This is not valid in many realistic scenarios, and we provide a new, very general CLT. In particular, this includes scenarios where, in a suitable chart, the sample mean fluctuates asymptotically at a scale $n^{\alpha}$ with exponents $\alpha <1/2$ with a nonnormal distribution. As the BP-CLT yields only fluctuations that are, rescaled with $n^{1/2}$, asymptotically normal, just as the classical CLT for random vectors, these lower rates, somewhat loosely called smeariness, had to date been observed only on the circle. We make the concept of smeariness on manifolds precise, give an example for two-smeariness on spheres of arbitrary dimension, and show that smeariness, although “almost never” occurring, may have serious statistical implications on a continuum of sample scenarios nearby. In fact, this effect increases with dimension, striking in particular in high dimension low sample size scenarios.





Statistical inference for autoregressive models under heteroscedasticity of unknown form

Ke Zhu.

Source: The Annals of Statistics, Volume 47, Number 6, 3185--3215.

Abstract:
This paper provides an entire inference procedure for the autoregressive model under (conditional) heteroscedasticity of unknown form with a finite variance. We first establish the asymptotic normality of the weighted least absolute deviations estimator (LADE) for the model. Second, we develop the random weighting (RW) method to estimate its asymptotic covariance matrix, leading to the implementation of the Wald test. Third, we construct a portmanteau test for model checking, and use the RW method to obtain its critical values. As a special weighted LADE, the feasible adaptive LADE (ALADE) is proposed and proved to have the same efficiency as its infeasible counterpart. The importance of our entire methodology based on the feasible ALADE is illustrated by simulation results and the real data analysis on three U.S. economic data sets.





Active ranking from pairwise comparisons and when parametric assumptions do not help

Reinhard Heckel, Nihar B. Shah, Kannan Ramchandran, Martin J. Wainwright.

Source: The Annals of Statistics, Volume 47, Number 6, 3099--3126.

Abstract:
We consider sequential or active ranking of a set of $n$ items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of prespecified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-$k$ items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comparisons won, and uses these counts to decide whether to stop, or to compare another pair of items, chosen based on confidence intervals specified by the data collected up to that point. We prove that this algorithm succeeds in recovering the ranking using a number of comparisons that is optimal up to logarithmic factors. This guarantee does not depend on whether or not the underlying pairwise probability matrix satisfies a particular structural property, unlike a significant body of past work on pairwise ranking based on parametric models such as the Thurstone or Bradley–Terry–Luce models. It has been a long-standing open question as to whether or not imposing these parametric assumptions allows for improved ranking algorithms. For stochastic comparison models, in which the pairwise probabilities are bounded away from zero, our second contribution is to resolve this issue by proving a lower bound for parametric models. This shows, perhaps surprisingly, that these popular parametric modeling choices offer at most logarithmic gains for stochastic comparisons.





Additive models with trend filtering

Veeranjaneyulu Sadhanala, Ryan J. Tibshirani.

Source: The Annals of Statistics, Volume 47, Number 6, 3032--3068.

Abstract:
We study additive models built with trend filtering, that is, additive models whose components are each regularized by the (discrete) total variation of their $k$th (discrete) derivative, for a chosen integer $k\geq 0$. This results in $k$th degree piecewise polynomial components (e.g., $k=0$ gives piecewise constant components, $k=1$ gives piecewise linear, $k=2$ gives piecewise quadratic, etc.). Analogous to its advantages in the univariate case, additive trend filtering has favorable theoretical and computational properties, thanks in large part to the localized nature of the (discrete) total variation regularizer that it uses. On the theory side, we derive fast error rates for additive trend filtering estimates, and show these rates are minimax optimal when the underlying function is additive and has component functions whose derivatives are of bounded variation. We also show that these rates are unattainable by additive smoothing splines (and by additive models built from linear smoothers, in general). On the computational side, we use backfitting to leverage fast univariate trend filtering solvers; we also describe a new backfitting algorithm whose iterations can be run in parallel, which (as far as we can tell) is the first of its kind. Lastly, we present a number of experiments to examine the empirical performance of trend filtering.
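To make the regularizer in this abstract concrete, the sketch below evaluates the discrete total variation of the $k$th discrete derivative of one component's fitted values as the sum of absolute $(k+1)$th-order differences. The names `discrete_difference` and `trend_filter_penalty` are hypothetical, evenly spaced inputs are assumed, and scaling factors are ignored. It shows the penalty only, not the paper's estimator or its backfitting algorithm.

```python
def discrete_difference(values, order):
    """Apply the first-difference operator `order` times."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def trend_filter_penalty(values, k):
    """Total variation of the k-th discrete derivative of `values`.

    Assumes evenly spaced inputs, so the penalty reduces to the sum of
    absolute (k+1)-th order differences (constant factors dropped).
    """
    return sum(abs(d) for d in discrete_difference(values, k + 1))


if __name__ == "__main__":
    step = [0, 0, 0, 2, 2]   # piecewise constant with one jump
    kink = [0, 1, 2, 2, 2]   # piecewise linear with one kink
    print(trend_filter_penalty(step, k=0))  # penalizes the jump
    print(trend_filter_penalty(kink, k=1))  # penalizes the kink
```

In an additive model, a penalty of this form would be summed over the component functions. A sequence that is exactly a degree-$k$ polynomial on the grid incurs zero penalty, which is why the fitted components come out piecewise polynomial.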





Exact lower bounds for the agnostic probably-approximately-correct (PAC) machine learning model

Aryeh Kontorovich, Iosif Pinelis.

Source: The Annals of Statistics, Volume 47, Number 5, 2822--2854.

Abstract:
We provide an exact nonasymptotic lower bound on the minimax expected excess risk (EER) in the agnostic probably-approximately-correct (PAC) machine learning classification model and identify minimax learning algorithms as certain maximally symmetric and minimally randomized “voting” procedures. Based on this result, an exact asymptotic lower bound on the minimax EER is provided. This bound is of the simple form $c_{\infty}/\sqrt{\nu}$ as $\nu\to\infty$, where $c_{\infty}=0.16997\dots$ is a universal constant, $\nu=m/d$, $m$ is the size of the training sample and $d$ is the Vapnik–Chervonenkis dimension of the hypothesis class. It is shown that the differences between these asymptotic and nonasymptotic bounds, as well as the differences between these two bounds and the maximum EER of any learning algorithms that minimize the empirical risk, are asymptotically negligible, and all these differences are due to ties in the mentioned “voting” procedures. A few easy to compute nonasymptotic lower bounds on the minimax EER are also obtained, which are shown to be close to the exact asymptotic lower bound $c_{\infty}/\sqrt{\nu}$ even for rather small values of the ratio $\nu=m/d$. As an application of these results, we substantially improve existing lower bounds on the tail probability of the excess risk. Among the tools used are Bayes estimation and apparently new identities and inequalities for binomial distributions.
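For a sense of scale, the asymptotic form quoted in this abstract, the constant 0.16997... divided by the square root of the ratio of sample size $m$ to VC dimension $d$, can be evaluated numerically as below. The function name is hypothetical and the constant is truncated to the five digits given in the abstract. The expression is a limiting form for large $m/d$, so its values at small ratios are illustrative only.

```python
import math

# Universal constant from the abstract, truncated to the digits given there.
C_INF = 0.16997


def asymptotic_eer_lower_bound(m, d):
    """Evaluate C_INF / sqrt(m / d) for sample size m and VC dimension d."""
    if m <= 0 or d <= 0:
        raise ValueError("m and d must be positive")
    nu = m / d
    return C_INF / math.sqrt(nu)


if __name__ == "__main__":
    for m, d in [(100, 10), (1000, 10), (10000, 10)]:
        print(m, d, asymptotic_eer_lower_bound(m, d))
```

Because the expression scales as the inverse square root of $m/d$, multiplying the sample size by 100 at fixed $d$ lowers it by a factor of 10.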





A unified treatment of multiple testing with prior knowledge using the p-filter

Aaditya K. Ramdas, Rina F. Barber, Martin J. Wainwright, Michael I. Jordan.

Source: The Annals of Statistics, Volume 47, Number 5, 2790--2821.

Abstract:
There is a significant literature on methods for incorporating knowledge into multiple testing procedures so as to improve their power and precision. Some common forms of prior knowledge include (a) beliefs about which hypotheses are null, modeled by nonuniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) multiple arbitrary partitions of the hypotheses into (possibly overlapping) groups and (d) knowledge of independence, positive or arbitrary dependence between hypotheses or groups, suggesting the use of more aggressive or conservative procedures. We present a unified algorithmic framework called p-filter for global null testing and false discovery rate (FDR) control that allows the scientist to incorporate all four types of prior knowledge (a)–(d) simultaneously, recovering a variety of known algorithms as special cases.





Distance multivariance: New dependence measures for random vectors

Björn Böttcher, Martin Keller-Ressel, René L. Schilling.

Source: The Annals of Statistics, Volume 47, Number 5, 2757--2789.

Abstract:
We introduce two new measures for the dependence of $n\ge 2$ random variables: distance multivariance and total distance multivariance. Both measures are based on the weighted $L^{2}$-distance of quantities related to the characteristic functions of the underlying random variables. These extend distance covariance (introduced by Székely, Rizzo and Bakirov) from pairs of random variables to $n$-tuplets of random variables. We show that total distance multivariance can be used to detect the independence of $n$ random variables and has a simple finite-sample representation in terms of distance matrices of the sample points, where distance is measured by a continuous negative definite function. Under some mild moment conditions, this leads to a test for independence of multiple random vectors which is consistent against all alternatives.





Phase transition in the spiked random tensor with Rademacher prior

Wei-Kuo Chen.

Source: The Annals of Statistics, Volume 47, Number 5, 2734--2756.

Abstract:
We consider the problem of detecting a deformation from a symmetric Gaussian random $p$-tensor $(p\geq 3)$ with a rank-one spike sampled from the Rademacher prior. Recently, in Lesieur et al. (Barbier, Krzakala, Macris, Miolane and Zdeborová (2017)), it was proved that there exists a critical threshold $\beta_{p}$ so that when the signal-to-noise ratio exceeds $\beta_{p}$, one can distinguish the spiked and unspiked tensors and weakly recover the prior via the minimal mean-square-error method. On the other side, Perry, Wein and Bandeira (Perry, Wein and Bandeira (2017)) proved that there exists a $\beta_{p}'<\beta_{p}$ such that no statistical hypothesis test can distinguish these two tensors, in the sense that their total variation distance asymptotically vanishes, when the signal-to-noise ratio is less than $\beta_{p}'$. In this work, we show that $\beta_{p}$ is indeed the critical threshold that strictly separates the distinguishability and indistinguishability between the two tensors under the total variation distance. Our approach is based on a subtle analysis of the high temperature behavior of the pure $p$-spin model with Ising spin, arising initially from the field of spin glasses. In particular, we identify the signal-to-noise criticality $\beta_{p}$ as the critical temperature, distinguishing the high and low temperature behavior, of the Ising pure $p$-spin mean-field spin glass model.





The middle-scale asymptotics of Wishart matrices

Didier Chételat, Martin T. Wells.

Source: The Annals of Statistics, Volume 47, Number 5, 2639--2670.

Abstract:
We study the behavior of a real $p$-dimensional Wishart random matrix with $n$ degrees of freedom when $n,p\rightarrow\infty$ but $p/n\rightarrow 0$. We establish the existence of phase transitions when $p$ grows at the order $n^{(K+1)/(K+3)}$ for every $K\in\mathbb{N}$, and derive expressions for approximating densities between every two phase transitions. To do this, we make use of a novel tool we call the $\mathcal{F}$-conjugate of an absolutely continuous distribution, which is obtained from the Fourier transform of the square root of its density. In the case of the normalized Wishart distribution, this represents an extension of the $t$-distribution to the space of real symmetric matrices.





Doubly penalized estimation in additive regression with high-dimensional data

Zhiqiang Tan, Cun-Hui Zhang.

Source: The Annals of Statistics, Volume 47, Number 5, 2567--2600.

Abstract:
Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_{2}$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst individual component functions. We establish oracle inequalities for the predictive performance of such methods under three simple technical conditions: a sub-Gaussian condition on the noise, a compatibility condition on the design and the functional classes under consideration and an entropy condition on the functional classes. For random designs, the sample compatibility condition can be replaced by its population version under an additional condition to ensure suitable convergence of empirical norms. In homogeneous settings where the complexities of the component functions are of the same order, our results provide a spectrum of minimax convergence rates, from the so-called slow rate without requiring the compatibility condition to the fast rate under the hard sparsity or certain $L_{q}$ sparsity to allow many small components in the true regression function. These results significantly broaden and sharpen existing ones in the literature.