Latest science and technology news


Bootstrapping and sample splitting for high-dimensional, assumption-lean inference

By projecteuclid.org
Published On :: Wed, 30 Oct 2019 22:03 EDT

Alessandro Rinaldo, Larry Wasserman, Max G’Sell.

Source: The Annals of Statistics, Volume 47, Number 6, 3438–3469.

Abstract:
Several new methods have been recently proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and the rest for inference. In this paper, we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-lean approach to inference and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension which we then use to assess the accuracy of regression inference. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. Finally, we elucidate an inference-prediction trade-off: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions.
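The split-then-bootstrap idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `split_and_bootstrap`, the selection rule (keep the `k` covariates with the largest absolute marginal correlation), the half/half split, and the use of coordinate-wise percentile intervals from a pairs bootstrap. The paper's own confidence sets and projection parameters may be constructed differently.

```python
import numpy as np

def split_and_bootstrap(X, y, k=3, n_boot=500, alpha=0.05, seed=0):
    """Illustrative sample splitting followed by a pairs bootstrap."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    sel_idx, inf_idx = perm[: n // 2], perm[n // 2:]

    # Step 1: model selection on the first half only.
    # Hypothetical rule: keep the k covariates with the largest
    # absolute marginal correlation with the response.
    Xs, ys = X[sel_idx], y[sel_idx]
    Xc = Xs - Xs.mean(axis=0)
    yc = ys - ys.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.sort(np.argsort(corr)[-k:])

    # Step 2: inference on the second half, restricted to the selected columns.
    Xi, yi = X[inf_idx][:, selected], y[inf_idx]

    def ols(Xm, ym):
        # Least-squares fit with an intercept; return the slopes only.
        A = np.column_stack([np.ones(len(ym)), Xm])
        coef, *_ = np.linalg.lstsq(A, ym, rcond=None)
        return coef[1:]

    beta_hat = ols(Xi, yi)

    # Pairs bootstrap: resample (x, y) rows of the inference half with replacement.
    m = len(yi)
    boot = np.empty((n_boot, len(selected)))
    for b in range(n_boot):
        rows = rng.integers(0, m, size=m)
        boot[b] = ols(Xi[rows], yi[rows])

    # Coordinate-wise percentile intervals (an assumed, simplified choice).
    lower = np.quantile(boot, alpha / 2, axis=0)
    upper = np.quantile(boot, 1 - alpha / 2, axis=0)
    return selected, beta_hat, lower, upper

# Usage on synthetic data in which only columns 0 and 3 carry signal.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(400, 20))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + data_rng.normal(size=400)
selected, beta_hat, lower, upper = split_and_bootstrap(X, y, k=2)
for j, b, lo, hi in zip(selected, beta_hat, lower, upper):
    print(f"column {j}: estimate {b:.2f}, interval [{lo:.2f}, {hi:.2f}]")
```

Because the selection step never sees the inference half, the intervals do not need a correction for the selection rule. The cost is that each step uses only half of the data, which is the inference-prediction trade-off the abstract mentions.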

Minimax posterior convergence rates and model selection consistency in high-dimensional DAG models based on sparse Cholesky factors

On testing for high-dimensional white noise

A smeary central limit theorem for manifolds with application to high-dimensional spheres

On optimal designs for nonregular models

Hypothesis testing on linear structures of high-dimensional covariance matrix

Sampling and estimation for (sparse) exchangeable graphs

Quantile regression under memory constraint

On partial-sum processes of ARMAX residuals

Statistical inference for autoregressive models under heteroscedasticity of unknown form

Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models

Randomized incomplete $U$-statistics in high dimensions

Active ranking from pairwise comparisons and when parametric assumptions do not help

Sorted concave penalized regression

Additive models with trend filtering

Distributed estimation of principal eigenspaces

Testing for independence of large dimensional vectors

Inference for the mode of a log-concave density

Projected spline estimation of the nonparametric function in high-dimensional partially linear models for massive data

Test for high-dimensional correlation matrices

Eigenvalue distributions of variance components estimators in high-dimensional random effects models

Exact lower bounds for the agnostic probably-approximately-correct (PAC) machine learning model

A unified treatment of multiple testing with prior knowledge using the p-filter

Distance multivariance: New dependence measures for random vectors

Phase transition in the spiked random tensor with Rademacher prior

An operator theoretic approach to nonparametric mixture models

Linear hypothesis testing for high dimensional generalized linear models

The middle-scale asymptotics of Wishart matrices

Semiparametrically point-optimal hybrid rank tests for unit roots

Doubly penalized estimation in additive regression with high-dimensional data

Semi-supervised inference: General theory and estimation of means

A knockoff filter for high-dimensional selective inference

Property testing in high-dimensional Ising models

Isotonic regression in general dimensions

The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics

Cross validation for locally stationary processes

Dynamic network models and graphon estimation

On testing conditional qualitative treatment effects

Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression

Convergence rates of least squares regression estimators with heavy-tailed errors

On deep learning as a remedy for the curse of dimensionality in nonparametric regression

Negative association, ordering and convergence of resampling methods

Spectral method and regularized MLE are both optimal for top-$K$ ranking

Generalized cluster trees and singular measures

Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem

Understanding Software Migration. part 1

Understanding Software Migration. part 2

Naming Conventions and Coding Standards

Retailer improves business operations by integrating Shopify, POS and SYSPRO

Visix adds Microsoft Power BI Widget to AxisTV Signage Suite

Argonne Scientist Elected as Fellow of the American Physical Society

Hourglass-shaped stent could relieve the intense chest pain caused by microvascular disease

Academy of Science, Engineering and Medicine of Florida names two FSU professors Rising Stars

Nurses' Extraordinary Experiences During the COVID-19 Pandemic

WashU Expert: 'X-odus' Creates Growing Challenges for Brand Marketing

Argonne Researchers Highlight Breakthroughs in Supercomputing and AI at SC24

University of Central Florida's A Team with A Dream secures gold at the DOE's 10th CyberForce Competition®

Cedars-Sinai Experts Available for Interviews During American College of Rheumatology Convergence 2024

NJ Becomes First State to Have Statewide Law Enforcement & Mental Health Alternative Response Program in Nation
