orm Consistency and asymptotic normality of Latent Block Model estimators By projecteuclid.org Published On :: Mon, 23 Mar 2020 22:02 EDT Vincent Brault, Christine Keribin, Mahendra Mariadassou. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1234--1268.Abstract: The Latent Block Model (LBM) is a model-based method to cluster simultaneously the $d$ columns and $n$ rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior is rather sparse and most results are limited to the binary setting. We prove here theoretical guarantees in the valued settings. We show that under some mild conditions on the parameter space, and in an asymptotic regime where $log (d)/n$ and $log (n)/d$ tend to $0$ when $n$ and $d$ tend to infinity, (1) the maximum-likelihood estimate of the complete model (with known labels) is consistent and (2) the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models. This equivalence allows us to transfer the asymptotic consistency, and under mild conditions, asymptotic normality, to the maximum likelihood estimate under the observed model. Moreover, the variational estimator is also consistent and, under the same conditions, asymptotically normal. Full Article
orm Reduction problems and deformation approaches to nonstationary covariance functions over spheres By projecteuclid.org Published On :: Tue, 11 Feb 2020 22:03 EST Emilio Porcu, Rachid Senoussi, Enner Mendoza, Moreno Bevilacqua. Source: Electronic Journal of Statistics, Volume 14, Number 1, 890--916.Abstract: The paper considers reduction problems and deformation approaches for nonstationary covariance functions on the $(d-1)$-dimensional spheres, $mathbb{S}^{d-1}$, embedded in the $d$-dimensional Euclidean space. Given a covariance function $C$ on $mathbb{S}^{d-1}$, we chase a pair $(R,Psi)$, for a function $R:[-1,+1] o mathbb{R}$ and a smooth bijection $Psi$, such that $C$ can be reduced to a geodesically isotropic one: $C(mathbf{x},mathbf{y})=R(langle Psi (mathbf{x}),Psi (mathbf{y}) angle )$, with $langle cdot ,cdot angle $ denoting the dot product. The problem finds motivation in recent statistical literature devoted to the analysis of global phenomena, defined typically over the sphere of $mathbb{R}^{3}$. The application domains considered in the manuscript makes the problem mathematically challenging. We show the uniqueness of the representation in the reduction problem. Then, under some regularity assumptions, we provide an inversion formula to recover the bijection $Psi$, when it exists, for a given $C$. We also give sufficient conditions for reducibility. Full Article
orm Estimation of a semiparametric transformation model: A novel approach based on least squares minimization By projecteuclid.org Published On :: Tue, 04 Feb 2020 22:03 EST Benjamin Colling, Ingrid Van Keilegom. Source: Electronic Journal of Statistics, Volume 14, Number 1, 769--800.Abstract: Consider the following semiparametric transformation model $Lambda_{ heta }(Y)=m(X)+varepsilon $, where $X$ is a $d$-dimensional covariate, $Y$ is a univariate response variable and $varepsilon $ is an error term with zero mean and independent of $X$. We assume that $m$ is an unknown regression function and that ${Lambda _{ heta }: heta inTheta }$ is a parametric family of strictly increasing functions. Our goal is to develop two new estimators of the transformation parameter $ heta $. The main idea of these two estimators is to minimize, with respect to $ heta $, the $L_{2}$-distance between the transformation $Lambda _{ heta }$ and one of its fully nonparametric estimators. We consider in particular the nonparametric estimator based on the least-absolute deviation loss constructed in Colling and Van Keilegom (2019). We establish the consistency and the asymptotic normality of the two proposed estimators of $ heta $. We also carry out a simulation study to illustrate and compare the performance of our new parametric estimators to that of the profile likelihood estimator constructed in Linton et al. (2008). Full Article
orm Weighted Message Passing and Minimum Energy Flow for Heterogeneous Stochastic Block Models with Side Information By Published On :: 2020 We study the misclassification error for community detection in general heterogeneous stochastic block models (SBM) with noisy or partial label information. We establish a connection between the misclassification rate and the notion of minimum energy on the local neighborhood of the SBM. We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix. The general SBM considered in this paper allows for unequal-size communities, degree heterogeneity, and different connection probabilities among blocks. We focus on how to optimally weigh the message passing to improve misclassification. Full Article
orm Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions By Published On :: 2020 We consider the standard model of distributed optimization of a sum of functions $F(mathbf z) = sum_{i=1}^n f_i(mathbf z)$, where node $i$ in a network holds the function $f_i(mathbf z)$. We allow for a harsh network model characterized by asynchronous updates, message delays, unpredictable message losses, and directed communication among nodes. In this setting, we analyze a modification of the Gradient-Push method for distributed optimization, assuming that (i) node $i$ is capable of generating gradients of its function $f_i(mathbf z)$ corrupted by zero-mean bounded-support additive noise at each step, (ii) $F(mathbf z)$ is strongly convex, and (iii) each $f_i(mathbf z)$ has Lipschitz gradients. We show that our proposed method asymptotically performs as well as the best bounds on centralized gradient descent that takes steps in the direction of the sum of the noisy gradients of all the functions $f_1(mathbf z), ldots, f_n(mathbf z)$ at each step. Full Article
orm Kymatio: Scattering Transforms in Python By Published On :: 2020 The wavelet scattering transform is an invariant and stable signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks, including PyTorch and TensorFlow/Keras. The transforms are implemented on both CPUs and GPUs, the latter offering a significant speedup over the former. The package also has a small memory footprint. Source code, documentation, and examples are available under a BSD license at https://www.kymat.io. Full Article
orm Estimation of a Low-rank Topic-Based Model for Information Cascades By Published On :: 2020 We consider the problem of estimating the latent structure of a social network based on the observed information diffusion events, or cascades, where the observations for a given cascade consist of only the timestamps of infection for infected nodes but not the source of the infection. Most of the existing work on this problem has focused on estimating a diffusion matrix without any structural assumptions on it. In this paper, we propose a novel model based on the intuition that an information is more likely to propagate among two nodes if they are interested in similar topics which are also prominent in the information content. In particular, our model endows each node with an influence vector (which measures how authoritative the node is on each topic) and a receptivity vector (which measures how susceptible the node is for each topic). We show how this node-topic structure can be estimated from the observed cascades, and prove the consistency of the estimator. Experiments on synthetic and real data demonstrate the improved performance and better interpretability of our model compared to existing state-of-the-art methods. Full Article
orm Adaptive two-treatment three-period crossover design for normal responses By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Uttam Bandyopadhyay, Shirsendu Mukherjee, Atanu Biswas. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 291--303.Abstract: In adaptive crossover design, our goal is to allocate more patients to a promising treatment sequence. The present work contains a very simple three period crossover design for two competing treatments where the allocation in period 3 is done on the basis of the data obtained from the first two periods. Assuming normality of response variables we use a reliability functional for the choice between two treatments. We calculate the allocation proportions and their standard errors corresponding to the possible treatment combinations. We also derive some asymptotic results and provide solutions on related inferential problems. Moreover, the proposed procedure is compared with a possible competitor. Finally, we use a data set to illustrate the applicability of the proposed design. Full Article
orm Multivariate normal approximation of the maximum likelihood estimator via the delta method By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Andreas Anastasiou, Robert E. Gaunt. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 136--149.Abstract: We use the delta method and Stein’s method to derive, under regularity conditions, explicit upper bounds for the distributional distance between the distribution of the maximum likelihood estimator (MLE) of a $d$-dimensional parameter and its asymptotic multivariate normal distribution. Our bounds apply in situations in which the MLE can be written as a function of a sum of i.i.d. $t$-dimensional random vectors. We apply our general bound to establish a bound for the multivariate normal approximation of the MLE of the normal distribution with unknown mean and variance. Full Article
orm Fake uniformity in a shape inversion formula By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Christian Rau. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 549--557.Abstract: We revisit a shape inversion formula derived by Panaretos in the context of a particle density estimation problem with unknown rotation of the particle. A distribution is presented which imitates, or “fakes”, the uniformity or Haar distribution that is part of that formula. Full Article
orm Modified information criterion for testing changes in skew normal model By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Khamis K. Said, Wei Ning, Yubin Tian. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 280--300.Abstract: In this paper, we study the change point problem for the skew normal distribution model from the view of model selection problem. The detection procedure based on the modified information criterion (MIC) for change problem is proposed. Such a procedure has advantage in detecting the changes in early and late stage of a data comparing to the one based on the traditional Schwarz information criterion which is well known as Bayesian information criterion (BIC) by considering the complexity of the models. Due to the difficulty in deriving the analytic asymptotic distribution of the test statistic based on the MIC procedure, the bootstrap simulation is provided to obtain the critical values at the different significance levels. Simulations are conducted to illustrate the comparisons of performance between MIC, BIC and likelihood ratio test (LRT). Such an approach is applied on two stock market data sets to indicate the detection procedure. Full Article
orm On the impact of selected modern deep-learning techniques to the performance and celerity of classification models in an experimental high-energy physics use case. (arXiv:2002.01427v3 [physics.data-an] UPDATED) By arxiv.org Published On :: Beginning from a basic neural-network architecture, we test the potential benefits offered by a range of advanced techniques for machine learning, in particular deep learning, in the context of a typical classification problem encountered in the domain of high-energy physics, using a well-studied dataset: the 2014 Higgs ML Kaggle dataset. The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models. Techniques examined include domain-specific data-augmentation, learning rate and momentum scheduling, (advanced) ensembling in both model-space and weight-space, and alternative architectures and connection methods. Following the investigation, we arrive at a model which achieves equal performance to the winning solution of the original Kaggle challenge, whilst being significantly quicker to train and apply, and being suitable for use with both GPU and CPU hardware setups. These reductions in timing and hardware requirements potentially allow the use of more powerful algorithms in HEP analyses, where models must be retrained frequently, sometimes at short notice, by small groups of researchers with limited hardware resources. Additionally, a new wrapper library for PyTorch called LUMINis presented, which incorporates all of the techniques studied. Full Article
orm Restricting the Flow: Information Bottlenecks for Attribution. (arXiv:2001.00396v3 [stat.ML] UPDATED) By arxiv.org Published On :: Attribution methods provide insights into the decision-making of machine learning models like artificial neural networks. For a given input sample, they assign a relevance score to each individual input variable, such as the pixels of an image. In this work we adapt the information bottleneck concept for attribution. By adding noise to intermediate feature maps we restrict the flow of information and can quantify (in bits) how much information image regions provide. We compare our method against ten baselines using three different metrics on VGG-16 and ResNet-50, and find that our methods outperform all baselines in five out of six settings. The method's information-theoretic foundation provides an absolute frame of reference for attribution values (bits) and a guarantee that regions scored close to zero are not necessary for the network's decision. For reviews: https://openreview.net/forum?id=S1xWh1rYwB For code: https://github.com/BioroboticsLab/IBA Full Article
orm A priori generalization error for two-layer ReLU neural network through minimum norm solution. (arXiv:1912.03011v3 [cs.LG] UPDATED) By arxiv.org Published On :: We focus on estimating emph{a priori} generalization error of two-layer ReLU neural networks (NNs) trained by mean squared error, which only depends on initial parameters and the target function, through the following research line. We first estimate emph{a priori} generalization error of finite-width two-layer ReLU NN with constraint of minimal norm solution, which is proved by cite{zhang2019type} to be an equivalent solution of a linearized (w.r.t. parameter) finite-width two-layer NN. As the width goes to infinity, the linearized NN converges to the NN in Neural Tangent Kernel (NTK) regime citep{jacot2018neural}. Thus, we can derive the emph{a priori} generalization error of two-layer ReLU NN in NTK regime. The distance between NN in a NTK regime and a finite-width NN with gradient training is estimated by cite{arora2019exact}. Based on the results in cite{arora2019exact}, our work proves an emph{a priori} generalization error bound of two-layer ReLU NNs. This estimate uses the intrinsic implicit bias of the minimum norm solution without requiring extra regularity in the loss function. This emph{a priori} estimate also implies that NN does not suffer from curse of dimensionality, and a small generalization error can be achieved without requiring exponentially large number of neurons. In addition the research line proposed in this paper can also be used to study other properties of the finite-width network, such as the posterior generalization error. Full Article
orm Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes. (arXiv:1212.4137v2 [stat.ML] UPDATED) By arxiv.org Published On :: Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector; these are obtained by combining the following factors: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two different ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via a natural alternating maximization (AM) method. We show the the AM method is nontrivially equivalent to GPower (Journ'{e}e et al; JMLR 11:517--553, 2010) for all our formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at i) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), ii) obtaining solutions explaining more variance and iii) dealing with big data problems (our cluster code is able to solve a 357 GB problem in about a minute). Full Article
orm Nonparametric Estimation of the Fisher Information and Its Applications. (arXiv:2005.03622v1 [cs.IT]) By arxiv.org Published On :: This paper considers the problem of estimation of the Fisher information for location from a random sample of size $n$. First, an estimator proposed by Bhattacharya is revisited and improved convergence rates are derived. Second, a new estimator, termed a clipped estimator, is proposed. Superior upper bounds on the rates of convergence can be shown for the new estimator compared to the Bhattacharya estimator, albeit with different regularity conditions. Third, both of the estimators are evaluated for the practically relevant case of a random variable contaminated by Gaussian noise. Moreover, using Brown's identity, which relates the Fisher information and the minimum mean squared error (MMSE) in Gaussian noise, two corresponding consistent estimators for the MMSE are proposed. Simulation examples for the Bhattacharya estimator and the clipped estimator as well as the MMSE estimators are presented. The examples demonstrate that the clipped estimator can significantly reduce the required sample size to guarantee a specific confidence interval compared to the Bhattacharya estimator. Full Article
orm Physics-informed neural network for ultrasound nondestructive quantification of surface breaking cracks. (arXiv:2005.03596v1 [cs.LG]) By arxiv.org Published On :: We introduce an optimized physics-informed neural network (PINN) trained to solve the problem of identifying and characterizing a surface breaking crack in a metal plate. PINNs are neural networks that can combine data and physics in the learning process by adding the residuals of a system of Partial Differential Equations to the loss function. Our PINN is supervised with realistic ultrasonic surface acoustic wave data acquired at a frequency of 5 MHz. The ultrasonic surface wave data is represented as a surface deformation on the top surface of a metal plate, measured by using the method of laser vibrometry. The PINN is physically informed by the acoustic wave equation and its convergence is sped up using adaptive activation functions. The adaptive activation function uses a scalable hyperparameter in the activation function, which is optimized to achieve best performance of the network as it changes dynamically the topology of the loss function involved in the optimization process. The usage of adaptive activation function significantly improves the convergence, notably observed in the current study. We use PINNs to estimate the speed of sound of the metal plate, which we do with an error of 1\%, and then, by allowing the speed of sound to be space dependent, we identify and characterize the crack as the positions where the speed of sound has decreased. Our study also shows the effect of sub-sampling of the data on the sensitivity of sound speed estimates. More broadly, the resulting model shows a promising deep neural network model for ill-posed inverse problems. Full Article
orm Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion. (arXiv:2005.03419v1 [stat.ML]) By arxiv.org Published On :: In the variational relevance vector machine, the gamma distribution is representative as a hyperprior over the noise precision of automatic relevance determination prior. Instead of the gamma hyperprior, we propose to use the inverse gamma hyperprior with a shape parameter close to zero and a scale parameter not necessary close to zero. This hyperprior is associated with the concept of a weakly informative prior. The effect of this hyperprior is investigated through regression to non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged for input data. We confirm that the degrees of freedom in a model is controlled by adjusting the scale parameter and keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However the estimated model using this criterion seems to cause over-fitting. This is because the multiple kernel method makes the model a situation where the dimension of the model is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended prediction information criterion. It is confirmed that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter minimizing extended prediction information criterion. Full Article
orm On a computationally-scalable sparse formulation of the multidimensional and non-stationary maximum entropy principle. (arXiv:2005.03253v1 [stat.CO]) By arxiv.org Published On :: Data-driven modelling and computational predictions based on maximum entropy principle (MaxEnt-principle) aim at finding as-simple-as-possible - but not simpler then necessary - models that allow to avoid the data overfitting problem. We derive a multivariate non-parametric and non-stationary formulation of the MaxEnt-principle and show that its solution can be approximated through a numerical maximisation of the sparse constrained optimization problem with regularization. Application of the resulting algorithm to popular financial benchmarks reveals memoryless models allowing for simple and qualitative descriptions of the major stock market indexes data. We compare the obtained MaxEnt-models to the heteroschedastic models from the computational econometrics (GARCH, GARCH-GJR, MS-GARCH, GARCH-PML4) in terms of the model fit, complexity and prediction quality. We compare the resulting model log-likelihoods, the values of the Bayesian Information Criterion, posterior model probabilities, the quality of the data autocorrelation function fits as well as the Value-at-Risk prediction quality. We show that all of the considered seven major financial benchmark time series (DJI, SPX, FTSE, STOXX, SMI, HSI and N225) are better described by conditionally memoryless MaxEnt-models with nonstationary regime-switching than by the common econometric models with finite memory. This analysis also reveals a sparse network of statistically-significant temporal relations for the positive and negative latent variance changes among different markets. The code is provided for open access. Full Article
orm Detecting Latent Communities in Network Formation Models. (arXiv:2005.03226v1 [econ.EM]) By arxiv.org Published On :: This paper proposes a logistic undirected network formation model which allows for assortative matching on observed individual characteristics and the presence of edge-wise fixed effects. We model the coefficients of observed characteristics to have a latent community structure and the edge-wise fixed effects to be of low rank. We propose a multi-step estimation procedure involving nuclear norm regularization, sample splitting, iterative logistic regression and spectral clustering to detect the latent communities. We show that the latent communities can be exactly recovered when the expected degree of the network is of order log n or higher, where n is the number of nodes in the network. The finite sample performance of the new estimation and inference methods is illustrated through both simulated and real datasets. Full Article
orm Deep Learning Framework for Detecting Ground Deformation in the Built Environment using Satellite InSAR data. (arXiv:2005.03221v1 [cs.CV]) By arxiv.org Published On :: The large volumes of Sentinel-1 data produced over Europe are being used to develop pan-national ground motion services. However, simple analysis techniques like thresholding cannot detect and classify complex deformation signals reliably making providing usable information to a broad range of non-expert stakeholders a challenge. Here we explore the applicability of deep learning approaches by adapting a pre-trained convolutional neural network (CNN) to detect deformation in a national-scale velocity field. For our proof-of-concept, we focus on the UK where previously identified deformation is associated with coal-mining, ground water withdrawal, landslides and tunnelling. The sparsity of measurement points and the presence of spike noise make this a challenging application for deep learning networks, which involve calculations of the spatial convolution between images. Moreover, insufficient ground truth data exists to construct a balanced training data set, and the deformation signals are slower and more localised than in previous applications. We propose three enhancement methods to tackle these problems: i) spatial interpolation with modified matrix completion, ii) a synthetic training dataset based on the characteristics of real UK velocity map, and iii) enhanced over-wrapping techniques. Using velocity maps spanning 2015-2019, our framework detects several areas of coal mining subsidence, uplift due to dewatering, slate quarries, landslides and tunnel engineering works. The results demonstrate the potential applicability of the proposed framework to the development of automated ground motion analysis systems. Full Article
orm Important information: COVID-19 By feedproxy.google.com Published On :: Fri, 13 Mar 2020 04:16:37 +0000 The Library will be closed to the public and to staff from Monday 23 March 2020. Full Article
orm Trusted computing and information security : 13th Chinese conference, CTCIS 2019, Shanghai, China, October 24-27, 2019 By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Chinese Conference on Trusted Computing and Information Security (13th : 2019 : Shanghai, China)Callnumber: OnlineISBN: 9789811534188 (eBook) Full Article
orm Transgender and gender nonconforming health and aging By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783319950310 (electronic bk.) Full Article
orm Structured object-oriented formal language and method : 9th International Workshop, SOFL+MSVL 2019, Shenzhen, China, November 5, 2019, Revised selected papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: SOFL+MSVL (Workshop) (9th : 2019 : Shenzhen, China)Callnumber: OnlineISBN: 9783030414184 (electronic bk.) Full Article
orm Space information networks : 4th International Conference, SINC 2019, Wuzhen, China, September 19-20, 2019, Revised Selected Papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: SINC (Conference) (4th : 2019 : Wuzhen, China)Callnumber: OnlineISBN: 9789811534423 (electronic bk.) Full Article
orm Salt, fat and sugar reduction : sensory approaches for nutritional reformulation of foods and beverages By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: O'Sullivan, Maurice G., authorCallnumber: OnlineISBN: 9780128226124 (electronic bk.) Full Article
orm Models of tree and stand dynamics : theory, formulation and application By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Mäkelä, Annikki, authorCallnumber: OnlineISBN: 9783030357610 Full Article
orm Mental Conditioning to Perform Common Operations in General Surgery Training By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783319911649 978-3-319-91164-9 Full Article
orm Information retrieval technology : 15th Asia Information Retrieval Societies Conference, AIRS 2019, Hong Kong, China, November 7-9, 2019, proceedings By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Asia Information Retrieval Societies Conference (15th : 2019 : Hong Kong, China)Callnumber: OnlineISBN: 9783030428358 Full Article
orm In china's wake : how the commodity boom transformed development strategies in the global south By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Jepson, Nicholas, author.Callnumber: OnlineISBN: 9780231547598 electronic book Full Article
orm Green food processing techniques : preservation, transformation and extraction By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780128153536 Full Article
orm Formation and control of biofilm in various environments By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Kanematsu, Hideyuki, authorCallnumber: OnlineISBN: 9789811522406 (electronic bk.) Full Article
orm Enterprise information systems : 21st International Conference, ICEIS 2019, Heraklion, Crete, Greece, May 3-5, 2019, Revised Selected Papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: International Conference on Enterprise Information Systems (21st : 2019 : Ērakleion, Greece)Callnumber: OnlineISBN: 9783030407834 (electronic bk.) Full Article
orm Optimal prediction in the linearly transformed spiked model By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Edgar Dobriban, William Leeb, Amit Singer. Source: The Annals of Statistics, Volume 48, Number 1, 491--513.Abstract: We consider the linearly transformed spiked model , where the observations $Y_{i}$ are noisy linear transforms of unobserved signals of interest $X_{i}$: egin{equation*}Y_{i}=A_{i}X_{i}+varepsilon_{i},end{equation*} for $i=1,ldots ,n$. The transform matrices $A_{i}$ are also observed. We model the unobserved signals (or regression coefficients) $X_{i}$ as vectors lying on an unknown low-dimensional space. Given only $Y_{i}$ and $A_{i}$ how should we predict or recover their values? The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting $X_{i}$ by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions. We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods. Full Article
orm Uniformly valid confidence intervals post-model-selection By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST François Bachoc, David Preinerstorfer, Lukas Steinberger. Source: The Annals of Statistics, Volume 48, Number 1, 440--463.Abstract: We suggest general methods to construct asymptotically uniformly valid confidence intervals post-model-selection. The constructions are based on principles recently proposed by Berk et al. ( Ann. Statist. 41 (2013) 802–837). In particular, the candidate models used can be misspecified, the target of inference is model-specific, and coverage is guaranteed for any data-driven model selection procedure. After developing a general theory, we apply our methods to practically important situations where the candidate set of models, from which a working model is selected, consists of fixed design homoskedastic or heteroskedastic linear models, or of binary regression models with general link functions. In an extensive simulation study, we find that the proposed confidence intervals perform remarkably well, even when compared to existing methods that are tailored only for specific model selection procedures. Full Article
orm New $G$-formula for the sequential causal effect and blip effect of treatment in sequential causal inference By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Xiaoqin Wang, Li Yin. Source: The Annals of Statistics, Volume 48, Number 1, 138--160.Abstract: In sequential causal inference, two types of causal effects are of practical interest, namely, the causal effect of the treatment regime (called the sequential causal effect) and the blip effect of treatment on the potential outcome after the last treatment. The well-known $G$-formula expresses these causal effects in terms of the standard parameters. In this article, we obtain a new $G$-formula that expresses these causal effects in terms of the point observable effects of treatments similar to treatment in the framework of single-point causal inference. Based on the new $G$-formula, we estimate these causal effects by maximum likelihood via point observable effects with methods extended from single-point causal inference. We are able to increase precision of the estimation without introducing biases by an unsaturated model imposing constraints on the point observable effects. We are also able to reduce the number of point observable effects in the estimation by treatment assignment conditions. Full Article
orm Statistical inference for autoregressive models under heteroscedasticity of unknown form By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Ke Zhu. Source: The Annals of Statistics, Volume 47, Number 6, 3185--3215.Abstract: This paper provides an entire inference procedure for the autoregressive model under (conditional) heteroscedasticity of unknown form with a finite variance. We first establish the asymptotic normality of the weighted least absolute deviations estimator (LADE) for the model. Second, we develop the random weighting (RW) method to estimate its asymptotic covariance matrix, leading to the implementation of the Wald test. Third, we construct a portmanteau test for model checking, and use the RW method to obtain its critical values. As a special weighted LADE, the feasible adaptive LADE (ALADE) is proposed and proved to have the same efficiency as its infeasible counterpart. The importance of our entire methodology based on the feasible ALADE is illustrated by simulation results and the real data analysis on three U.S. economic data sets. Full Article
orm The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Joshua Cape, Minh Tang, Carey E. Priebe. Source: The Annals of Statistics, Volume 47, Number 5, 2405--2439.Abstract: The singular value matrix decomposition plays a ubiquitous role throughout statistics and related fields. Myriad applications including clustering, classification, and dimensionality reduction involve studying and exploiting the geometric structure of singular values and singular vectors. This paper provides a novel collection of technical and theoretical tools for studying the geometry of singular subspaces using the two-to-infinity norm. Motivated by preliminary deterministic Procrustes analysis, we consider a general matrix perturbation setting in which we derive a new Procrustean matrix decomposition. Together with flexible machinery developed for the two-to-infinity norm, this allows us to conduct a refined analysis of the induced perturbation geometry with respect to the underlying singular vectors even in the presence of singular value multiplicity. Our analysis yields singular vector entrywise perturbation bounds for a range of popular matrix noise models, each of which has a meaningful associated statistical inference task. In addition, we demonstrate how the two-to-infinity norm is the preferred norm in certain statistical settings. Specific applications discussed in this paper include covariance estimation, singular subspace recovery, and multiple graph inference. Both our Procrustean matrix decomposition and the technical machinery developed for the two-to-infinity norm may be of independent interest. Full Article
orm New formulation of the logistic-Gaussian process to analyze trajectory tracking data By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Gianluca Mastrantonio, Clara Grazian, Sara Mancinelli, Enrico Bibbona. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2483--2508.Abstract: Improved communication systems, shrinking battery sizes and the price drop of tracking devices have led to an increasing availability of trajectory tracking data. These data are often analyzed to understand animal behavior. In this work, we propose a new model for interpreting the animal movent as a mixture of characteristic patterns, that we interpret as different behaviors. The probability that the animal is behaving according to a specific pattern, at each time instant, is nonparametrically estimated using the Logistic-Gaussian process. Owing to a new formalization and the way we specify the coregionalization matrix of the associated multivariate Gaussian process, our model is invariant with respect to the choice of the reference element and of the ordering of the probability vector components. We fit the model under a Bayesian framework, and show that the Markov chain Monte Carlo algorithm we propose is straightforward to implement. We perform a simulation study with the aim of showing the ability of the estimation procedure to retrieve the model parameters. We also test the performance of the information criterion we used to select the number of behaviors. The model is then applied to a real dataset where a wolf has been observed before and after procreation. The results are easy to interpret, and clear differences emerge in the two phases. Full Article
orm Empirical Bayes analysis of RNA sequencing experiments with auxiliary information By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Kun Liang. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2452--2482.Abstract: Finding differentially expressed genes is a common task in high-throughput transcriptome studies. While traditional statistical methods rank the genes by their test statistics alone, we analyze an RNA sequencing dataset using the auxiliary information of gene length and the test statistics from a related microarray study. Given the auxiliary information, we propose a novel nonparametric empirical Bayes procedure to estimate the posterior probability of differential expression for each gene. We demonstrate the advantage of our procedure in extensive simulation studies and a psoriasis RNA sequencing study. The companion R package calm is available at Bioconductor. Full Article
orm Radio-iBAG: Radiomics-based integrative Bayesian analysis of multiplatform genomic data By projecteuclid.org Published On :: Wed, 16 Oct 2019 22:03 EDT Youyi Zhang, Jeffrey S. Morris, Shivali Narang Aerry, Arvind U. K. Rao, Veerabhadran Baladandayuthapani. Source: The Annals of Applied Statistics, Volume 13, Number 3, 1957--1988.Abstract: Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data, where-in our goals are to simultaneously identify genomic and radiomic, that is, radiology-based imaging markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose Radio-iBAG: Radiomics-based Integrative Bayesian Analysis of Multiplatform Genomic Data , a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic data sets to capture fundamental biological relationships; explores the associations between radiomic markers accompanying genomic information with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features each containing information from a set of related individual radiomic features, reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme data set, where-in we integrate magnetic resonance imaging-based biomarkers along with genomic, epigenomic and transcriptomic data. Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related with patient survival times. Full Article
orm Sequential decision model for inference and prediction on nonuniform hypergraphs with application to knot matching from computational forestry By projecteuclid.org Published On :: Wed, 16 Oct 2019 22:03 EDT Seong-Hwan Jun, Samuel W. K. Wong, James V. Zidek, Alexandre Bouchard-Côté. Source: The Annals of Applied Statistics, Volume 13, Number 3, 1678--1707.Abstract: In this paper, we consider the knot-matching problem arising in computational forestry. The knot-matching problem is an important problem that needs to be solved to advance the state of the art in automatic strength prediction of lumber. We show that this problem can be formulated as a quadripartite matching problem and develop a sequential decision model that admits efficient parameter estimation along with a sequential Monte Carlo sampler on graph matching that can be utilized for rapid sampling of graph matching. We demonstrate the effectiveness of our methods on 30 manually annotated boards and present findings from various simulation studies to provide further evidence supporting the efficacy of our methods. Full Article
orm RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data By projecteuclid.org Published On :: Wed, 16 Oct 2019 22:03 EDT Gaoxiang Jia, Xinlei Wang, Qiwei Li, Wei Lu, Ximing Tang, Ignacio Wistuba, Yang Xie. Source: The Annals of Applied Statistics, Volume 13, Number 3, 1617--1647.Abstract: Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNAs. NanoString nCounter platform is well suited for profiling of FFPE samples and measures gene expression with high sensitivity which may greatly facilitate realization of scientific and clinical values of FFPE samples. However, methodological development for normalization, a critical step when analyzing this type of data, is far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on an overly-simplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not mentioning that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture main patterns and characteristics observed from NanoString data of FFPE samples and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption on housekeeping genes and offers great interpretability. Furthermore, it is applicable to freshly frozen or like samples that can be generally viewed as a reduced case of FFPE samples. Simulation and applications showed the superior performance of RCRnorm. Full Article
orm Concentration of the spectral norm of Erdős–Rényi random graphs By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Gábor Lugosi, Shahar Mendelson, Nikita Zhivotovskiy. Source: Bernoulli, Volume 26, Number 3, 2253--2274.Abstract: We present results on the concentration properties of the spectral norm $|A_{p}|$ of the adjacency matrix $A_{p}$ of an Erdős–Rényi random graph $G(n,p)$. First, we consider the Erdős–Rényi random graph process and prove that $|A_{p}|$ is uniformly concentrated over the range $pin[Clog n/n,1]$. The analysis is based on delocalization arguments, uniform laws of large numbers, together with the entropy method to prove concentration inequalities. As an application of our techniques, we prove sharp sub-Gaussian moment inequalities for $|A_{p}|$ for all $pin[clog^{3}n/n,1]$ that improve the general bounds of Alon, Krivelevich, and Vu ( Israel J. Math. 131 (2002) 259–267) and some of the more recent results of Erdős et al. ( Ann. Probab. 41 (2013) 2279–2375). Both results are consistent with the asymptotic result of Füredi and Komlós ( Combinatorica 1 (1981) 233–241) that holds for fixed $p$ as $n oinfty$. Full Article
orm On Sobolev tests of uniformity on the circle with an extension to the sphere By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Sreenivasa Rao Jammalamadaka, Simos Meintanis, Thomas Verdebout. Source: Bernoulli, Volume 26, Number 3, 2226--2252.Abstract: Circular and spherical data arise in many applications, especially in biology, Earth sciences and astronomy. In dealing with such data, one of the preliminary steps before any further inference, is to test if such data is isotropic, that is, uniformly distributed around the circle or the sphere. In view of its importance, there is a considerable literature on the topic. In the present work, we provide new tests of uniformity on the circle based on original asymptotic results. Our tests are motivated by the shape of locally and asymptotically maximin tests of uniformity against generalized von Mises distributions. We show that they are uniformly consistent. Empirical power comparisons with several competing procedures are presented via simulations. The new tests detect particularly well multimodal alternatives such as mixtures of von Mises distributions. A practically-oriented combination of the new tests with already existing Sobolev tests is proposed. An extension to testing uniformity on the sphere, along with some simulations, is included. The procedures are illustrated on a real dataset. Full Article
orm On estimation of nonsmooth functionals of sparse normal means By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT O. Collier, L. Comminges, A.B. Tsybakov. Source: Bernoulli, Volume 26, Number 3, 1989--2020.Abstract: We study the problem of estimation of $N_{gamma }( heta )=sum_{i=1}^{d}| heta _{i}|^{gamma }$ for $gamma >0$ and of the $ell _{gamma }$-norm of $ heta $ for $gamma ge 1$ based on the observations $y_{i}= heta _{i}+varepsilon xi _{i}$, $i=1,ldots,d$, where $ heta =( heta _{1},dots , heta _{d})$ are unknown parameters, $varepsilon >0$ is known, and $xi _{i}$ are i.i.d. standard normal random variables. We find the non-asymptotic minimax rate for estimation of these functionals on the class of $s$-sparse vectors $ heta $ and we propose estimators achieving this rate. Full Article
orm Random orthogonal matrices and the Cayley transform By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Michael Jauch, Peter D. Hoff, David B. Dunson. Source: Bernoulli, Volume 26, Number 2, 1560--1586.Abstract: Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we present a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish that the Euclidean parameters corresponding to a uniform orthogonal matrix can be approximated asymptotically by independent normals. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof. Full Article
orm Strictly weak consensus in the uniform compass model on $mathbb{Z}$ By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Nina Gantert, Markus Heydenreich, Timo Hirscher. Source: Bernoulli, Volume 26, Number 2, 1269--1293.Abstract: We investigate a model for opinion dynamics, where individuals (modeled by vertices of a graph) hold certain abstract opinions. As time progresses, neighboring individuals interact with each other, and this interaction results in a realignment of opinions closer towards each other. This mechanism triggers formation of consensus among the individuals. Our main focus is on strong consensus (i.e., global agreement of all individuals) versus weak consensus (i.e., local agreement among neighbors). By extending a known model to a more general opinion space, which lacks a “central” opinion acting as a contraction point, we provide an example of an opinion formation process on the one-dimensional lattice $mathbb{Z}$ with weak consensus but no strong consensus. Full Article
orm Normal approximation for sums of weighted $U$-statistics – application to Kolmogorov bounds in random subgraph counting By projecteuclid.org Published On :: Tue, 26 Nov 2019 04:00 EST Nicolas Privault, Grzegorz Serafin. Source: Bernoulli, Volume 26, Number 1, 587--615.Abstract: We derive normal approximation bounds in the Kolmogorov distance for sums of discrete multiple integrals and weighted $U$-statistics made of independent Bernoulli random variables. Such bounds are applied to normal approximation for the renormalized subgraph counts in the Erdős–Rényi random graph. This approach completely solves a long-standing conjecture in the general setting of arbitrary graph counting, while recovering recent results obtained for triangles and improving other bounds in the Wasserstein distance. Full Article