Latest at news

Curse of dimensionality and related issues in nonparametric functional regression

By projecteuclid.org
Published On :: Thu, 14 Apr 2011 08:17 EDT

Gery Geenens

Source: Statist. Surv., Volume 5, 30--43.

Abstract:
Recently, some nonparametric regression ideas have been extended to the case of functional regression. Within that framework, the main concern arises from the infinite dimensional nature of the explanatory objects. Specifically, in the classical multivariate regression context, it is well-known that any nonparametric method is affected by the so-called “curse of dimensionality”, caused by the sparsity of data in high-dimensional spaces, resulting in a decrease in fastest achievable rates of convergence of regression function estimators toward their target curve as the dimension of the regressor vector increases. Therefore, it is not surprising to find dramatically bad theoretical properties for the nonparametric functional regression estimators, leading many authors to condemn the methodology. Nevertheless, a closer look at the meaning of the functional data under study and on the conclusions that the statistician would like to draw from it allows to consider the problem from another point-of-view, and to justify the use of slightly modified estimators. In most cases, it can be entirely legitimate to measure the proximity between two elements of the infinite dimensional functional space via a semi-metric, which could prevent those estimators suffering from what we will call the “curse of infinite dimensionality”.

References:
[1] Ait-Saïdi, A., Ferraty, F., Kassa, K. and Vieu, P. (2008). Cross-validated estimations in the single-functional index model, Statistics, 42, 475–494.

[2] Aneiros-Perez, G. and Vieu, P. (2008). Nonparametric time series prediction: A semi-functional partial linear modeling, J. Multivariate Anal., 99, 834–857.

[3] Baillo, A. and Grané, A. (2009). Local linear regression for functional predictor and scalar response, J. Multivariate Anal., 100, 102–111.

[4] Burba, F., Ferraty, F. and Vieu, P. (2009). k-Nearest Neighbour method in functional nonparametric regression, J. Nonparam. Stat., 21, 453–469.

[5] Cardot, H., Ferraty, F. and Sarda, P. (1999). Functional linear model, Stat. Probabil. Lett., 45, 11–22.

[6] Crambes, C., Kneip, A. and Sarda, P. (2009). Smoothing splines estimators for functional linear regression, Ann. Statist., 37, 35–72.

[7] Delsol, L. (2009). Advances on asymptotic normality in nonparametric functional time series analysis, Statistics, 43, 13–33.

[8] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications, Chapman and Hall, London.

[9] Fan, J. and Zhang, J.-T. (2000). Two-step estimation of functional linear models with application to longitudinal data, J. Roy. Stat. Soc. B, 62, 303–322.

[10] Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis, Springer-Verlag, New York.

[11] Ferraty, F., Laksaci, A. and Vieu, P. (2006). Estimating Some Characteristics of the Conditional Distribution in Nonparametric Functional Models, Statist. Inf. Stoch. Proc., 9, 47–76.

[12] Ferraty, F., Mas, A. and Vieu, P. (2007). Nonparametric regression on functional data: inference and practical aspects, Aust. NZ. J. Stat., 49, 267–286.

[13] Ferraty, F., Van Keilegom, I. and Vieu, P. (2010). On the validity of the bootstrap in nonparametric functional regression, Scand. J. Stat., 37, 286–306.

[14] Ferraty, F., Laksaci, A., Tadj, A. and Vieu, P. (2010). Rate of uniform consistency for nonparametric estimates with functional variables, J. Stat. Plan. Inf., 140, 335–352.

[15] Ferraty, F. and Romain, Y. (2011). Oxford handbook on functional data analysis (Eds), Oxford University Press.

[16] Gasser, T., Hall, P. and Presnell, B. (1998). Nonparametric estimation of the mode of a distribution of random curves, J. Roy. Stat. Soc. B, 60, 681–691.

[17] Geenens, G. (2011). A nonparametric functional method for signature recognition, Manuscript.

[18] Härdle, W., Müller, M., Sperlich, S. and Werwatz, A. (2004). Nonparametric and semiparametric models, Springer-Verlag, Berlin.

[19] James, G.M. (2002). Generalized linear models with functional predictors, J. Roy. Stat. Soc. B, 64, 411–432.

[20] Masry, E. (2005). Nonparametric regression estimation for dependent functional data: asymptotic normality, Stochastic Process. Appl., 115, 155–177.

[21] Nadaraya, E.A. (1964). On estimating regression, Theory Probab. Applic., 9, 141–142.

[22] Quintela-Del-Rio, A. (2008). Hazard function given a functional variable: nonparametric estimation under strong mixing conditions, J. Nonparam. Stat., 20, 413–430.

[23] Rachdi, M. and Vieu, P. (2007). Nonparametric regression for functional data: automatic smoothing parameter selection, J. Stat. Plan. Inf., 137, 2784–2801.

[24] Ramsay, J. and Silverman, B.W. (1997). Functional Data Analysis, Springer-Verlag, New York.

[25] Ramsay, J. and Silverman, B.W. (2002). Applied functional data analysis; methods and case study, Springer-Verlag, New York.

[26] Ramsay, J. and Silverman, B.W. (2005). Functional Data Analysis, 2nd Edition, Springer-Verlag, New York.

[27] Stone, C.J. (1982). Optimal global rates of convergence for nonparametric regression, Ann. Stat., 10, 1040–1053.

[28] Watson, G.S. (1964). Smooth regression analysis, Sankhya A, 26, 359–372.

[29] Yeung, D.T., Chang, H., Xiong, Y., George, S., Kashi, R., Matsumoto, T. and Rigoll, G. (2004). SVC2004: First International Signature Verification Competition, Proceedings of the International Conference on Biometric Authentication (ICBA), Hong Kong, July 2004.

Curse of dimensionality and related issues in nonparametric functional regression

Data confidentiality: A review of methods for statistical disclosure limitation and methods for assessing privacy

Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview

Primal and dual model representations in kernel-based learning

Discrete variations of the fractional Brownian motion in the presence of outliers and an additive noise

A survey of cross-validation procedures for model selection

Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules

Start your Chinese Family Search at the State Library of...

Arctic Amplification of Anthropogenic Forcing: A Vector Autoregressive Analysis. (arXiv:2005.02535v1 [econ.EM] CROSS LISTED)

Unsupervised Pre-trained Models from Healthy ADLs Improve Parkinson's Disease Classification of Gait Patterns. (arXiv:2005.02589v2 [cs.LG] UPDATED)

Statistical errors in Monte Carlo-based inference for random elements. (arXiv:2005.02532v2 [math.ST] UPDATED)

Generating Thermal Image Data Samples using 3D Facial Modelling Techniques and Deep Learning Methodologies. (arXiv:2005.01923v2 [cs.CV] UPDATED)

Interpreting Rate-Distortion of Variational Autoencoder and Using Model Uncertainty for Anomaly Detection. (arXiv:2005.01889v2 [cs.LG] UPDATED)

How many modes can a constrained Gaussian mixture have?. (arXiv:2005.01580v2 [math.ST] UPDATED)

Is the NUTS algorithm correct?. (arXiv:2005.01336v2 [stat.CO] UPDATED)

Can a powerful neural network be a teacher for a weaker neural network?. (arXiv:2005.00393v2 [cs.LG] UPDATED)

Data-Space Inversion Using a Recurrent Autoencoder for Time-Series Parameterization. (arXiv:2005.00061v2 [stat.ML] UPDATED)

Short-term forecasts of COVID-19 spread across Indian states until 1 May 2020. (arXiv:2004.13538v2 [q-bio.PE] UPDATED)

A bimodal gamma distribution: Properties, regression model and applications. (arXiv:2004.12491v2 [stat.ME] UPDATED)

A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging. (arXiv:2004.12314v3 [cs.CV] UPDATED)

Excess registered deaths in England and Wales during the COVID-19 pandemic, March 2020 and April 2020. (arXiv:2004.11355v4 [stat.AP] UPDATED)

On a phase transition in general order spline regression. (arXiv:2004.10922v2 [math.ST] UPDATED)

A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting. (arXiv:2004.09612v3 [cs.LG] UPDATED)

Deep transfer learning for improving single-EEG arousal detection. (arXiv:2004.05111v2 [cs.CV] UPDATED)

Strong Converse for Testing Against Independence over a Noisy channel. (arXiv:2004.00775v2 [cs.IT] UPDATED)

Capturing and Explaining Trajectory Singularities using Composite Signal Neural Networks. (arXiv:2003.10810v2 [cs.LG] UPDATED)

Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach. (arXiv:2003.02157v2 [physics.soc-ph] UPDATED)

Mnemonics Training: Multi-Class Incremental Learning without Forgetting. (arXiv:2002.10211v3 [cs.CV] UPDATED)

A Distributionally Robust Area Under Curve Maximization Model. (arXiv:2002.07345v2 [math.OC] UPDATED)

Statistical aspects of nuclear mass models. (arXiv:2002.04151v3 [nucl-th] UPDATED)

Cyclic Boosting -- an explainable supervised machine learning algorithm. (arXiv:2002.03425v2 [cs.LG] UPDATED)

On the impact of selected modern deep-learning techniques to the performance and celerity of classification models in an experimental high-energy physics use case. (arXiv:2002.01427v3 [physics.data-an] UPDATED)

Restricting the Flow: Information Bottlenecks for Attribution. (arXiv:2001.00396v3 [stat.ML] UPDATED)

A priori generalization error for two-layer ReLU neural network through minimum norm solution. (arXiv:1912.03011v3 [cs.LG] UPDATED)

Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space. (arXiv:1912.02400v2 [cs.LG] UPDATED)

$V$-statistics and Variance Estimation. (arXiv:1912.01089v2 [stat.ML] UPDATED)

Sampling random graph homomorphisms and applications to network data analysis. (arXiv:1910.09483v2 [math.PR] UPDATED)

Bayesian factor models for multivariate categorical data obtained from questionnaires. (arXiv:1910.04283v2 [stat.AP] UPDATED)

Differentiable Sparsification for Deep Neural Networks. (arXiv:1910.03201v2 [cs.LG] UPDATED)

DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs. (arXiv:1909.13003v4 [cs.LG] UPDATED)

Margin-Based Generalization Lower Bounds for Boosted Classifiers. (arXiv:1909.12518v4 [cs.LG] UPDATED)

Estimating drift parameters in a non-ergodic Gaussian Vasicek-type model. (arXiv:1909.06155v2 [math.PR] UPDATED)

Additive Bayesian variable selection under censoring and misspecification. (arXiv:1907.13563v3 [stat.ME] UPDATED)

Convergence rates for optimised adaptive importance samplers. (arXiv:1903.12044v4 [stat.CO] UPDATED)

An n-dimensional Rosenbrock Distribution for MCMC Testing. (arXiv:1903.09556v4 [stat.CO] UPDATED)

Learned Step Size Quantization. (arXiv:1902.08153v3 [cs.LG] UPDATED)

FNNC: Achieving Fairness through Neural Networks. (arXiv:1811.00247v3 [cs.LG] UPDATED)

Multi-scale analysis of lead-lag relationships in high-frequency financial markets. (arXiv:1708.03992v3 [stat.ME] UPDATED)

Semiparametric Optimal Estimation With Nonignorable Nonresponse Data. (arXiv:1612.09207v3 [stat.ME] UPDATED)

Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes. (arXiv:1212.4137v2 [stat.ML] UPDATED)

The finish line: Attachment of Signs

The Finish Line: Katrina One Year After

The Finish Line: A Case Study: What is Causing This?

The Finish Line: EPS Vs. Polyisocyanurate Insulation

The Finish Line: Design Features

Building Product Transparency— Be Careful What You Ask For

Anti-LEED Legislation

Hydronic Floor Heating

Alternatives to LEED

Embodied Energy of Building Materials

Buying a New Water Heater

Climate Change

VIDEO: The Great Heights of the Building Arts

American Industrial Partners to Acquire PPG’s Architectural Coatings Business

NCS Trust ‘sad and disappointed’ at government plans to shut it down

Subscribe To Our Newsletter