Latest the news

the

David Milliss further papers, 1940s-2010

By feedproxy.google.com
Published On :: 6/10/2015 12:00:00 AM

Full Article


The Most Excellent Order of the British Empire Association (New South Wales) further records, 1979-2012

By feedproxy.google.com
Published On :: 9/10/2015 12:00:00 AM


Signs of the times

By www.sl.nsw.gov.au
Published On :: Thu, 10 Sep 2015 02:50:10 +0000

Since we began digitising the Holtermann negatives to our new standard we have been able to view previously unclear deta


Sizing up the collection

By www.sl.nsw.gov.au
Published On :: Thu, 10 Sep 2015 02:50:11 +0000

The Holtermann Collection Digitisation Project is focused mainly on the original glass plate negatives taken by the Amer


It's all in the detail

By www.sl.nsw.gov.au
Published On :: Thu, 10 Sep 2015 02:50:12 +0000

It's been a productive last few weeks on the project.


Resolving the image

By www.sl.nsw.gov.au
Published On :: Thu, 10 Sep 2015 02:50:12 +0000

As discussed in last week's post we have recently made important decisions on the Holtermann Collection digitisatio


The scanner has arrived

By www.sl.nsw.gov.au
Published On :: Thu, 10 Sep 2015 02:50:14 +0000

The glass plate scanner has now arrived. Though officially known as a Kodak IQ3 Smart XY axis transparency scanner, I t


Oregon's Sabrina Ionescu takes home Naismith Trophy Player of the Year honor

By sports.yahoo.com
Published On :: Fri, 03 Apr 2020 16:05:43 GMT

Sabrina Ionescu is the Naismith Trophy Player of the Year, concluding her illustrious Oregon career with one of the major postseason women's basketball awards. As the only player in college basketball history with 2,000 career points (2,562), 1,000 assists (1,091) and 1,000 rebounds (1,040) and the NCAA all-time leader with 26 triple-doubles, Ionescu has continued to rack up player of the year honors for her remarkable senior season.


Oregon's Ionescu wins women's Naismith Player of the Year

By sports.yahoo.com
Published On :: Fri, 03 Apr 2020 17:27:00 GMT

Already named The Associated Press women's player of the year, Ionescu was awarded the Naismith Trophy for the most outstanding women's basketball player on Friday. Ionescu, who won AP All-American honors three times, shattered the NCAA career triple-double mark with 26 and became the first player in college history to have 2,000 points, 1,000 rebounds and 1,000 assists. Ionescu averaged 17.5 points, 9.1 assists and 8.6 rebounds with eight triple-doubles as a senior this season.


The Class of 2020: A look at basketball's new Hall of Famers

By sports.yahoo.com
Published On :: Sat, 04 Apr 2020 16:20:38 GMT

A look at the newest members of the Naismith Memorial Basketball Hall of Fame, announced on Saturday:


Clean sweep: Oregon's Sabrina Ionescu is unanimous Player of the Year after winning Wooden Award

By sports.yahoo.com
Published On :: Mon, 06 Apr 2020 21:21:52 GMT

Sabrina Ionescu wins the Wooden Award for the second year in a row, becoming the fifth player in the trophy's history to win in back-to-back seasons. With the honor, she completes a sweep of the national postseason player of the year awards. As a senior, Ionescu matched her own single-season mark with eight triple-doubles in 2019-20, and she was incredibly efficient from the field with a career-best 51.8 field goal percentage.


Sabrina Ionescu: The Goat

By sports.yahoo.com
Published On :: Thu, 09 Apr 2020 23:31:26 GMT

Watch "Our Stories Unfinished Business: Sabrina Ionescu and Ruthy Hebard" debuting Wednesday, April 15 at 7 p.m. PT/ 8 p.m. MT on Pac-12 Network.


Aari McDonald on returning for her senior year at Arizona: 'We're ready to set the bar higher'

By sports.yahoo.com
Published On :: Fri, 10 Apr 2020 00:30:39 GMT

Arizona's Aari McDonald and Pac-12 Networks' Ashley Adamson discuss the guard's decision to return for her senior season in Tucson and how she now has the opportunity to be the face of the league. McDonald, the Pac-12 Defensive Player of the Year, was one of the nation's top scorers in 2019-20, averaging 20.6 points per game.


WNBA Draft Profile: UCLA guard Japreece Dean ready to lead at the next level

By sports.yahoo.com
Published On :: Fri, 10 Apr 2020 16:40:28 GMT

UCLA guard Japreece Dean is primed to shine at the next level as she heads to the WNBA Draft in April. The do-it-all point-woman was an All-Pac-12 honoree last season, and one of only seven D-1 hoopers with at least 13 points and 5.5 assists per game.


Oregon's Ionescu looks forward to pro career in the WNBA

By sports.yahoo.com
Published On :: Tue, 14 Apr 2020 23:17:13 GMT

With the spotlight on her growing ever brighter, Sabrina Ionescu is aware she's becoming her own brand. One of the most decorated players in women's college basketball, Ionescu is about to go pro with the WNBA draft coming up Friday. Ionescu said Oregon has prepared her to understand how much impact she can have in the community and on women's basketball.


Pac-12 women's basketball student-athletes reflect on the influence of their moms ahead of Mother's Day

By sports.yahoo.com
Published On :: Fri, 08 May 2020 21:24:08 GMT

Pac-12 student-athletes give shout-outs to their moms ahead of Mother's Day on May 10th, 2020 including UCLA's Michaela Onyenwere, Oregon's Sabrina Ionescu and Satou Sabally, Arizona's Aari McDonald, Cate Reese, and Lacie Hull, Stanford's Kiana Williams, USC's Endyia Rogers, and Aliyah Jeune, and Utah's Brynna Maxwell.


The limiting behavior of isotonic and convex regression estimators when the model is misspecified

By projecteuclid.org
Published On :: Tue, 05 May 2020 22:00 EDT

Eunji Lim.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 2053--2097.

Abstract:
We study the asymptotic behavior of the least squares estimators when the model is possibly misspecified. We consider the setting where we wish to estimate an unknown function $f_{*}:(0,1)^{d}\rightarrow \mathbb{R}$ from observations $(X,Y),(X_{1},Y_{1}),\cdots ,(X_{n},Y_{n})$; our estimator $\hat{g}_{n}$ is the minimizer of $\sum _{i=1}^{n}(Y_{i}-g(X_{i}))^{2}/n$ over $g\in \mathcal{G}$ for some set of functions $\mathcal{G}$. We provide sufficient conditions on the metric entropy of $\mathcal{G}$, under which $\hat{g}_{n}$ converges to $g_{*}$ as $n\rightarrow \infty $, where $g_{*}$ is the minimizer of $\|g-f_{*}\|\triangleq \mathbb{E}(g(X)-f_{*}(X))^{2}$ over $g\in \mathcal{G}$. As corollaries of our theorem, we establish $\|\hat{g}_{n}-g_{*}\|\rightarrow 0$ as $n\rightarrow \infty $ when $\mathcal{G}$ is the set of monotone functions or the set of convex functions. We also make a connection between the convergence rate of $\|\hat{g}_{n}-g_{*}\|$ and the metric entropy of $\mathcal{G}$. As special cases of our finding, we compute the convergence rate of $\|\hat{g}_{n}-g_{*}\|^{2}$ when $\mathcal{G}$ is the set of bounded monotone functions or the set of bounded convex functions.


Statistical convergence of the EM algorithm on Gaussian mixture models

By projecteuclid.org
Published On :: Tue, 05 May 2020 22:00 EDT

Ruofei Zhao, Yuanzhi Li, Yuekai Sun.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 632--660.

Abstract:
We study the convergence behavior of the Expectation Maximization (EM) algorithm on Gaussian mixture models with an arbitrary number of mixture components and mixing weights. We show that as long as the means of the components are separated by at least $\Omega (\sqrt{\min \{M,d\}})$, where $M$ is the number of components and $d$ is the dimension, the EM algorithm converges locally to the global optimum of the log-likelihood. Further, we show that the convergence rate is linear and characterize the size of the basin of attraction to the global optimum.
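
For readers who want a concrete picture of the iteration this abstract analyzes, here is a minimal sketch of EM on a one-dimensional Gaussian mixture. It is an illustration only, not the authors' code: the function name `em_gmm_1d`, the unit-variance simplification, and the fixed iteration count are all assumptions made for this sketch.

```python
import math
import random


def em_gmm_1d(data, means, weights, n_iter=50):
    """EM for a 1-D Gaussian mixture with unit variances (illustrative sketch).

    Only the component means and mixing weights are updated. The common
    1/sqrt(2*pi) factor cancels in the responsibilities, so it is omitted.
    """
    means, weights = list(means), list(weights)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2)
                    for m, w in zip(means, weights)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted means and updated mixing weights.
        for k in range(len(means)):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            weights[k] = nk / len(data)
    return means, weights


# Two well-separated components, with the starting means near the truth.
random.seed(0)
sample = ([random.gauss(-4.0, 1.0) for _ in range(200)]
          + [random.gauss(4.0, 1.0) for _ in range(200)])
fitted_means, fitted_weights = em_gmm_1d(sample, [-3.0, 3.0], [0.5, 0.5])
```

The starting point is deliberately close to the true means, matching the local (rather than global) convergence regime the abstract describes.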


Generalised cepstral models for the spectrum of vector time series

By projecteuclid.org
Published On :: Tue, 05 May 2020 22:00 EDT

Maddalena Cavicchioli.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 605--631.

Abstract:
The paper treats the modeling of stationary multivariate stochastic processes via a frequency domain model expressed in terms of cepstrum theory. The proposed model nests the vector exponential model of [20] as a special case, and extends the generalised cepstral model of [36] to the multivariate setting, answering a question raised by the last authors in their paper. At the same time, we extend the notion of generalised autocovariance function of [35] to vector time series. Then we derive explicit matrix formulas connecting generalised cepstral and autocovariance matrices of the process, and prove the consistency and asymptotic properties of the Whittle likelihood estimators of model parameters. Asymptotic theory for the special case of the vector exponential model is a significant addition to the paper of [20]. We also provide mathematical machinery, based on matrix differentiation, and computational methods to derive our results, which differ significantly from those employed in the univariate case. The utility of the proposed model is illustrated through Monte Carlo simulation from a bivariate process characterized by a high dynamic range, and an empirical application on time varying minimum variance hedge ratios through the second moments of future and spot prices in the corn commodity market.


On the Letac-Massam conjecture and existence of high dimensional Bayes estimators for graphical models

By projecteuclid.org
Published On :: Tue, 05 May 2020 22:00 EDT

Emanuel Ben-David, Bala Rajaratnam.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 580--604.

Abstract:
The Wishart distribution defined on the open cone of positive-definite matrices plays a central role in multivariate analysis and multivariate distribution theory. Its domain of parameters is often referred to as the Gindikin set. In recent years, varieties of useful extensions of the Wishart distribution have been proposed in the literature for the purposes of studying Markov random fields and graphical models. In particular, generalizations of the Wishart distribution, referred to as Type I and Type II (graphical) Wishart distributions introduced by Letac and Massam in Annals of Statistics (2007) play important roles in both frequentist and Bayesian inference for Gaussian graphical models. These distributions have been especially useful in high-dimensional settings due to the flexibility offered by their multiple-shape parameters. Concerning Type I and Type II Wishart distributions, a conjecture of Letac and Massam concerns the domain of multiple-shape parameters of these distributions. The conjecture also has implications for the existence of Bayes estimators corresponding to these high dimensional priors. The conjecture, which was first posed in the Annals of Statistics, has now been an open problem for about 10 years. In this paper, we give a necessary condition for the Letac and Massam conjecture to hold. More precisely, we prove that if the Letac and Massam conjecture holds on a decomposable graph, then no two separators of the graph can be nested within each other. For this, we analyze Type I and Type II Wishart distributions on appropriate Markov equivalent perfect DAG models and succeed in deriving the aforementioned necessary condition. This condition in particular identifies a class of counterexamples to the conjecture.


Gaussian field on the symmetric group: Prediction and learning

By projecteuclid.org
Published On :: Tue, 05 May 2020 22:00 EDT

François Bachoc, Baptiste Broto, Fabrice Gamboa, Jean-Michel Loubes.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 503--546.

Abstract:
In the framework of the supervised learning of a real function defined on an abstract space $mathcal{X}$, Gaussian processes are widely used. The Euclidean case for $mathcal{X}$ is well known and has been widely studied. In this paper, we explore the less classical case where $mathcal{X}$ is the non commutative finite group of permutations (namely the so-called symmetric group $S_{N}$). We provide an application to Gaussian process based optimization of Latin Hypercube Designs. We also extend our results to the case of partial rankings.


Asymptotic properties of the maximum likelihood and cross validation estimators for transformed Gaussian processes

By projecteuclid.org
Published On :: Mon, 27 Apr 2020 22:02 EDT

François Bachoc, José Betancourt, Reinhard Furrer, Thierry Klein.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1962--2008.

Abstract:
The asymptotic analysis of covariance parameter estimation of Gaussian processes has been subject to intensive investigation. However, this asymptotic analysis is very scarce for non-Gaussian processes. In this paper, we study a class of non-Gaussian processes obtained by regular non-linear transformations of Gaussian processes. We provide the increasing-domain asymptotic properties of the (Gaussian) maximum likelihood and cross validation estimators of the covariance parameters of a non-Gaussian process of this class. We show that these estimators are consistent and asymptotically normal, although they are defined as if the process was Gaussian. They do not need to model or estimate the non-linear transformation. Our results can thus be interpreted as a robustness of (Gaussian) maximum likelihood and cross validation towards non-Gaussianity. Our proofs rely on two technical results that are of independent interest for the increasing-domain asymptotic literature of spatial processes. First, we show that, under mild assumptions, coefficients of inverses of large covariance matrices decay at an inverse polynomial rate as a function of the corresponding observation location distances. Second, we provide a general central limit theorem for quadratic forms obtained from transformed Gaussian processes. Finally, our asymptotic results are illustrated by numerical simulations.


Sparse equisigned PCA: Algorithms and performance bounds in the noisy rank-1 setting

By projecteuclid.org
Published On :: Mon, 27 Apr 2020 22:02 EDT

Arvind Prasadan, Raj Rao Nadakuditi, Debashis Paul.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 345--385.

Abstract:
Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a sum-type decision statistic outperforms schemes that utilize the $\ell _{1}$ and $\ell _{2}$ norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance on video data where the interest is in identifying objects.


Bayesian variance estimation in the Gaussian sequence model with partial information on the means

By projecteuclid.org
Published On :: Mon, 27 Apr 2020 22:02 EDT

Gianluca Finocchio, Johannes Schmidt-Hieber.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 239--271.

Abstract:
Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $\sigma^{2}$ is biased and inconsistent. This raises the question whether the posterior is able to correct the MLE in this case. By developing a new proving strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators outperforming the correctly calibrated MLE in a numerical simulation study.


Adaptive estimation in the supremum norm for semiparametric mixtures of regressions

By projecteuclid.org
Published On :: Thu, 23 Apr 2020 22:01 EDT

Heiko Werner, Hajo Holzmann, Pierre Vandekerkhove.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1816--1871.

Abstract:
We investigate a flexible two-component semiparametric mixture of regressions model, in which one of the conditional component distributions of the response given the covariate is unknown but assumed symmetric about a location parameter, while the other is specified up to a scale parameter. The location and scale parameters together with the proportion are allowed to depend nonparametrically on covariates. After settling identifiability, we provide local M-estimators for these parameters which converge in the sup-norm at the optimal rates over Hölder-smoothness classes. We also introduce an adaptive version of the estimators based on the Lepski-method. Sup-norm bounds show that the local M-estimator properly estimates the functions globally, and are the first step in the construction of useful inferential tools such as confidence bands. In our analysis we develop general results about rates of convergence in the sup-norm as well as adaptive estimation of local M-estimators which might be of some independent interest, and which can also be applied in various other settings. We investigate the finite-sample behaviour of our method in a simulation study, and give an illustration to a real data set from bioinformatics.


Exact recovery in block spin Ising models at the critical line

By projecteuclid.org
Published On :: Thu, 23 Apr 2020 22:01 EDT

Matthias Löwe, Kristina Schubert.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1796--1815.

Abstract:
We show how to exactly reconstruct the block structure at the critical line in the so-called Ising block model. This model was recently re-introduced by Berthet, Rigollet and Srivastava in [2]. There the authors show how to exactly reconstruct blocks away from the critical line and they give an upper and a lower bound on the number of observations one needs; thereby they establish a minimax optimal rate (up to constants). Our technique relies on a combination of their methods with fluctuation results obtained in [20]. The latter are extended to the full critical regime. We find that the number of necessary observations depends on whether the interaction parameter between two blocks is positive or negative: In the first case, there are about $N\log N$ observations required to exactly recover the block structure, while in the latter case $\sqrt{N}\log N$ observations suffice.


On the predictive potential of kernel principal components

By projecteuclid.org
Published On :: Wed, 15 Apr 2020 04:02 EDT

Ben Jones, Andreas Artemiou, Bing Li.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1--23.

Abstract:
We give a probabilistic analysis of a phenomenon in statistics which, until recently, has not received a convincing explanation. This phenomenon is that the leading principal components tend to possess more predictive power for a response variable than lower-ranking ones despite the procedure being unsupervised. Our result, in its most general form, shows that the phenomenon goes far beyond the context of linear regression and classical principal components: if an arbitrary distribution for the predictor $X$ and an arbitrary conditional distribution for $Y\vert X$ are chosen, then any measurable function $g(Y)$, subject to a mild condition, tends to be more correlated with the higher-ranking kernel principal components than with the lower-ranking ones. The “arbitrariness” is formulated in terms of unitary invariance; the tendency is then explicitly quantified by exploring how unitary invariance relates to the Cauchy distribution. The most general results, for technical reasons, are shown for the case where the kernel space is finite dimensional. The occurrence of this tendency in real world databases is also investigated to show that our results are consistent with observation.


A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins

By projecteuclid.org
Published On :: Thu, 09 Apr 2020 04:00 EDT

Guanyang Wang.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1690--1706.

Abstract:
Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
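
As background for the comparison made in this abstract, the following is a minimal sketch of one step of the classical swap chain, the baseline the abstract mentions. It is not the Rectangle Loop algorithm proposed in the paper, and the function name `swap_step` and the example matrix are assumptions made for illustration.

```python
import random


def swap_step(matrix, rng):
    """One step of the classical swap chain on a 0/1 matrix (illustrative sketch).

    Picks two distinct rows and two distinct columns uniformly at random. If
    the 2x2 submatrix is a "checkerboard" ([[1, 0], [0, 1]] or
    [[0, 1], [1, 0]]), it is flipped to the other checkerboard; otherwise the
    matrix is left alone. Row and column sums never change.
    Requires at least two rows and two columns.
    """
    r1, r2 = rng.sample(range(len(matrix)), 2)
    c1, c2 = rng.sample(range(len(matrix[0])), 2)
    a, b = matrix[r1][c1], matrix[r1][c2]
    c, d = matrix[r2][c1], matrix[r2][c2]
    if a == d and b == c and a != b:
        matrix[r1][c1], matrix[r1][c2] = b, a
        matrix[r2][c1], matrix[r2][c2] = d, c


rng = random.Random(1)
m = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
row_sums = [sum(row) for row in m]
col_sums = [sum(col) for col in zip(*m)]
for _ in range(1000):
    swap_step(m, rng)
```

Most proposals land on a non-checkerboard submatrix and do nothing, which is one way to see why the chain can be slow on large, very sparse, or very dense matrices.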


Computing the degrees of freedom of rank-regularized estimators and cousins

By projecteuclid.org
Published On :: Thu, 26 Mar 2020 22:03 EDT

Rahul Mazumder, Haolei Weng.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1348--1385.

Abstract:
Estimating a low rank matrix from its linear measurements is a problem of central importance in contemporary statistical analysis. The choice of tuning parameters for estimators remains an important challenge from a theoretical and practical perspective. To this end, Stein’s Unbiased Risk Estimate (SURE) framework provides a well-grounded statistical framework for degrees of freedom estimation. In this paper, we use the SURE framework to obtain degrees of freedom estimates for a general class of spectral regularized matrix estimators—our results generalize beyond the class of estimators that have been studied thus far. To this end, we use a result due to Shapiro (2002) pertaining to the differentiability of symmetric matrix valued functions, developed in the context of semidefinite optimization algorithms. We rigorously verify the applicability of Stein’s Lemma towards the derivation of degrees of freedom estimates; and also present new techniques based on Gaussian convolution to estimate the degrees of freedom of a class of spectral estimators, for which Stein’s Lemma does not directly apply.


Rate optimal Chernoff bound and application to community detection in the stochastic block models

By projecteuclid.org
Published On :: Tue, 24 Mar 2020 22:01 EDT

Zhixin Zhou, Ping Li.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1302--1347.

Abstract:
The Chernoff coefficient is known to be an upper bound on the Bayes error probability in classification problems. In this paper, we will develop a rate optimal Chernoff bound on the Bayes error probability. The new bound is not only an upper bound but also a lower bound on the Bayes error probability up to a constant factor. Moreover, we will apply this result to community detection in the stochastic block models. As a clustering problem, the optimal misclassification rate of the community detection problem can be characterized by our rate optimal Chernoff bound. This can be formalized by deriving a minimax error rate over a certain parameter space of stochastic block models, then achieving such an error rate by a feasible algorithm employing multiple steps of EM type updates.


Differential network inference via the fused D-trace loss with cross variables

By projecteuclid.org
Published On :: Tue, 24 Mar 2020 22:01 EDT

Yichong Wu, Tiejun Li, Xiaoping Liu, Luonan Chen.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1269--1301.

Abstract:
Detecting the change of biological interaction networks is of great importance in biological and medical research. We proposed a simple loss function, named as CrossFDTL, to identify the network change or differential network by estimating the difference between two precision matrices under Gaussian assumption. The CrossFDTL is a natural fusion of the D-trace loss for the considered two networks by imposing the $\ell _{1}$ penalty to the differential matrix to ensure sparsity. The key point of our method is to utilize the cross variables, which correspond to the sum and difference of two precision matrices instead of using their original forms. Moreover, we developed an efficient minimization algorithm for the proposed loss function and further rigorously proved its convergence. Numerical results showed that our method outperforms the existing methods in both accuracy and convergence speed for the simulated and real data.


On the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensions

By projecteuclid.org
Published On :: Mon, 17 Feb 2020 22:06 EST

Karl Ewald, Ulrike Schneider.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 944--969.

Abstract:
We derive expressions for the finite-sample distribution of the Lasso estimator in the context of a linear regression model in low as well as in high dimensions by exploiting the structure of the optimization problem defining the estimator. In low dimensions, we assume full rank of the regressor matrix and present expressions for the cumulative distribution function as well as the densities of the absolutely continuous parts of the estimator. Our results are presented for the case of normally distributed errors, but do not hinge on this assumption and can easily be generalized. Additionally, we establish an explicit formula for the correspondence between the Lasso and the least-squares estimator. We derive analogous results for the distribution in less explicit form in high dimensions where we make no assumptions on the regressor matrix at all. In this setting, we also investigate the model selection properties of the Lasso and show that possibly only a subset of models might be selected by the estimator, completely independently of the observed response vector. Finally, we present a condition for uniqueness of the estimator that is necessary as well as sufficient.


The bias of isotonic regression

By projecteuclid.org
Published On :: Tue, 04 Feb 2020 22:03 EST

Ran Dai, Hyebin Song, Rina Foygel Barber, Garvesh Raskutti.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 801--834.

Abstract:
We study the bias of the isotonic regression estimator. While there is extensive work characterizing the mean squared error of the isotonic regression estimator, relatively little is known about the bias. In this paper, we provide a sharp characterization, proving that the bias scales as $O(n^{-\beta /3})$ up to log factors, where $1\leq \beta \leq 2$ is the exponent corresponding to Hölder smoothness of the underlying mean. Importantly, this result only requires a strictly monotone mean and that the noise distribution has subexponential tails, without relying on symmetric noise or other restrictive assumptions.
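
For readers unfamiliar with the estimator whose bias is studied here, the following is a minimal sketch of least-squares isotonic regression computed with the pool-adjacent-violators procedure. It assumes equal weights and a nondecreasing fit; the function name `isotonic_fit` is hypothetical and not taken from the paper.

```python
def isotonic_fit(y):
    """Least-squares nondecreasing fit to y via pool-adjacent-violators.

    Illustrative sketch with equal weights. Each block stores
    [sum of values, number of values]; adjacent blocks are merged whenever
    the earlier block's mean exceeds the later block's mean.
    """
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        # Every point in a block receives the block mean.
        fitted.extend([total / count] * count)
    return fitted


print(isotonic_fit([1, 3, 2, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The fitted values are block averages of the observations, which is the structure through which noise in the data feeds into the estimator's bias.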


The bias and skewness of M-estimators in regression

By projecteuclid.org
Published On :: Thu, 05 Aug 2010 15:41 EDT

Christopher Withers, Saralees Nadarajah

Source: Electron. J. Statist., Volume 4, 1--14.

Abstract:
We consider M-estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables.


On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms

By
Published On :: 2020

This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory.


The Maximum Separation Subspace in Sufficient Dimension Reduction with Categorical Response

By
Published On :: 2020

Sufficient dimension reduction (SDR) is a very useful concept for exploratory analysis and data visualization in regression, especially when the number of covariates is large. Many SDR methods have been proposed for regression with a continuous response, where the central subspace (CS) is the target of estimation. Various conditions, such as the linearity condition and the constant covariance condition, are imposed so that these methods can estimate at least a portion of the CS. In this paper we study SDR for regression and discriminant analysis with categorical response. Motivated by the exploratory analysis and data visualization aspects of SDR, we propose a new geometric framework to reformulate the SDR problem in terms of manifold optimization and introduce a new concept called Maximum Separation Subspace (MASES). The MASES naturally preserves the “sufficiency” in SDR without imposing additional conditions on the predictor distribution, and directly inspires a semi-parametric estimator. Numerical studies show MASES exhibits superior performance as compared with competing SDR methods in specific settings.


Generalized Nonbacktracking Bounds on the Influence

By
Published On :: 2020

This paper develops deterministic upper and lower bounds on the influence measure in a network, more precisely, the expected number of nodes that a seed set can influence in the independent cascade model. In particular, our bounds exploit r-nonbacktracking walks and Fortuin-Kasteleyn-Ginibre (FKG) type inequalities, and are computed by message passing algorithms. Further, we provide parameterized versions of the bounds that control the trade-off between efficiency and accuracy. Finally, the tightness of the bounds is illustrated on various network models.
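
To make the quantity being bounded concrete, here is a minimal Monte Carlo sketch of the influence of a seed set in the independent cascade model. This is the standard simulation-based estimate, not the deterministic message-passing bounds developed in the paper; the function names, the edge-list representation, and the toy graph are assumptions made for illustration.

```python
import random


def simulate_cascade(edges, seeds, rng):
    """Run one independent cascade and return the number of activated nodes.

    `edges` maps a node to a list of (neighbor, activation_probability)
    pairs. Each newly activated node gets exactly one chance to activate
    each of its currently inactive out-neighbors.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        node = frontier.pop()
        for neighbor, prob in edges.get(node, []):
            if neighbor not in active and rng.random() < prob:
                active.add(neighbor)
                frontier.append(neighbor)
    return len(active)


def estimate_influence(edges, seeds, n_runs=1000, seed=0):
    """Average cascade size over n_runs simulations (Monte Carlo estimate)."""
    rng = random.Random(seed)
    return sum(simulate_cascade(edges, seeds, rng) for _ in range(n_runs)) / n_runs


# Toy path graph: edges 0->1 and 1->2 always fire, edge 2->3 never fires.
toy_edges = {0: [(1, 1.0)], 1: [(2, 1.0)], 2: [(3, 0.0)]}
toy_influence = estimate_influence(toy_edges, [0], n_runs=10)
```

Simulation of this kind needs many runs to be accurate on large networks, which is the cost that deterministic upper and lower bounds aim to avoid.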

Full Article

the

On the Complexity Analysis of the Primal Solutions for the Accelerated Randomized Dual Coordinate Ascent

By
Published On :: 2020

Dual first-order methods are essential techniques for large-scale constrained convex optimization. However, when recovering the primal solutions, we need $T(\epsilon^{-2})$ iterations to achieve an $\epsilon$-optimal primal solution when we apply an algorithm to the non-strongly convex dual problem with $T(\epsilon^{-1})$ iterations to achieve an $\epsilon$-optimal dual solution, where $T(x)$ can be $x$ or $\sqrt{x}$. In this paper, we prove that the iteration complexity of the primal solutions and dual solutions have the same $O\left(\frac{1}{\sqrt{\epsilon}}\right)$ order of magnitude for the accelerated randomized dual coordinate ascent. When the dual function further satisfies the quadratic functional growth condition, by restarting the algorithm at any period, we establish the linear iteration complexity for both the primal solutions and dual solutions even if the condition number is unknown. When applied to the regularized empirical risk minimization problem, we prove the iteration complexity of $O\left(n\log n+\sqrt{\frac{n}{\epsilon}}\right)$ in both primal space and dual space, where $n$ is the number of samples. Our result takes out the $\left(\log \frac{1}{\epsilon}\right)$ factor compared with the methods based on smoothing/regularization or Catalyst reduction. As far as we know, this is the first time that the optimal $O\left(\sqrt{\frac{n}{\epsilon}}\right)$ iteration complexity in the primal space is established for the dual coordinate ascent based stochastic algorithms. We also establish the accelerated linear complexity for some problems with nonsmooth loss, e.g., the least absolute deviation and SVM.

Full Article

the

Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables

By
Published On :: 2020

We consider the problem of learning causal models from observational data generated by linear non-Gaussian acyclic causal models with latent variables. Without considering the effect of latent variables, the inferred causal relationships among the observed variables are often wrong. Under the faithfulness assumption, we propose a method to check whether there exists a causal path between any two observed variables. From this information, we can obtain the causal order among the observed variables. The next question is whether the causal effects can be uniquely identified as well. We show that causal effects among observed variables cannot be identified uniquely under the assumptions of faithfulness and non-Gaussianity of exogenous noises alone. However, we are able to propose an efficient method that identifies the set of all possible causal effects that are compatible with the observational data. We present additional structural conditions on the causal graph under which causal effects among observed variables can be determined uniquely. Furthermore, we provide necessary and sufficient graphical conditions for unique identification of the number of variables in the system. Experiments on synthetic data and real-world data show the effectiveness of our proposed algorithm for learning causal models.

Full Article

the

Switching Regression Models and Causal Inference in the Presence of Discrete Latent Variables

By
Published On :: 2020

Given a response $Y$ and a vector $X = (X^1, \dots, X^d)$ of $d$ predictors, we investigate the problem of inferring direct causes of $Y$ among the vector $X$. Models for $Y$ that use all of its causal covariates as predictors enjoy the property of being invariant across different environments or interventional settings. Given data from such environments, this property has been exploited for causal discovery. Here, we extend this inference principle to situations in which some (discrete-valued) direct causes of $Y$ are unobserved. Such cases naturally give rise to switching regression models. We provide sufficient conditions for the existence, consistency and asymptotic normality of the MLE in linear switching regression models with Gaussian noise, and construct a test for the equality of such models. These results allow us to prove that the proposed causal discovery method obtains asymptotic false discovery control under mild conditions. We provide an algorithm, make available code, and test our method on simulated data. It is robust against model violations and outperforms state-of-the-art approaches. We further apply our method to a real data set, where we show that it outputs not only causal predictors but also a process-based clustering of data points, which could be of additional interest to practitioners.

Full Article

the

Skill Rating for Multiplayer Games. Introducing Hypernode Graphs and their Spectral Theory

By
Published On :: 2020

We consider the skill rating problem for multiplayer games, that is, how to infer player skills from game outcomes in multiplayer games. We formulate the problem as a minimization problem $\arg\min_{s} s^T \Delta s$ where $\Delta$ is a positive semidefinite matrix and $s$ a real-valued function, of which some entries are the skill values to be inferred and other entries are constrained by the game outcomes. We leverage graph-based semi-supervised learning (SSL) algorithms for this problem. We apply our algorithms on several data sets of multiplayer games and obtain very promising results compared to Elo Duelling (see Elo, 1978) and TrueSkill (see Herbrich et al., 2006). As we leverage graph-based SSL algorithms and because games can be seen as relations between sets of players, we then generalize the approach. For this aim, we introduce a new finite model, called hypernode graph, defined to be a set of weighted binary relations between sets of nodes. We define Laplacians of hypernode graphs. Then, we show that the skill rating problem for multiplayer games can be formulated as $\arg\min_{s} s^T \Delta s$ where $\Delta$ is the Laplacian of a hypernode graph constructed from a set of games. From a fundamental perspective, we show that hypernode graph Laplacians are symmetric positive semidefinite matrices with constant functions in their null space. We show that problems on hypernode graphs cannot be solved with graph constructions and graph kernels. We relate hypernode graphs to signed graphs showing that positive relations between groups can lead to negative relations between individuals.
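To illustrate the shape of the minimization problem, here is a minimal sketch for the simpler case of an ordinary weighted graph, not a hypernode graph. For an ordinary graph Laplacian the quadratic form equals the sum of w_ij (s_i - s_j)^2 over the edges, and with some entries of s held fixed the minimizer can be reached by repeatedly replacing each free entry with the weighted average of its neighbours. The function name, the edge-dictionary input format, and the choice of an iterative solver are assumptions made for illustration only.

```python
def harmonic_scores(weights, fixed, sweeps=200):
    """Minimise sum over edges of w_ij * (s_i - s_j)**2 with some s_i fixed.

    weights: dict mapping undirected edges (i, j) to positive weights.
    fixed:   dict mapping a node to its prescribed value.
    Assumes every free node is connected to at least one fixed node.
    """
    neighbours = {}
    for (i, j), w in weights.items():
        neighbours.setdefault(i, []).append((j, w))
        neighbours.setdefault(j, []).append((i, w))

    s = {node: fixed.get(node, 0.0) for node in neighbours}
    free = [node for node in neighbours if node not in fixed]

    for _ in range(sweeps):
        for u in free:
            # Zeroing the gradient with respect to s[u] gives the
            # weighted average of the neighbouring values.
            total = sum(w for _, w in neighbours[u])
            s[u] = sum(w * s[v] for v, w in neighbours[u]) / total
    return s


# Hypothetical path graph 0 - 1 - 2 with the two end values prescribed.
scores = harmonic_scores({(0, 1): 1.0, (1, 2): 1.0}, fixed={0: 0.0, 2: 1.0})
print(scores[1])
```

On this three-node path the free middle node settles halfway between its two fixed neighbours. A direct linear solve on the free block of the Laplacian would give the same answer; the sweep form is used here only to keep the sketch free of external dependencies.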

Full Article

the

Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis

By
Published On :: 2020

This work is concerned with the non-negative rank-1 robust principal component analysis (RPCA), where the goal is to recover the dominant non-negative principal components of a data matrix precisely, where a number of measurements could be grossly corrupted with sparse and arbitrarily large noise. Most of the known techniques for solving the RPCA rely on convex relaxation methods by lifting the problem to a higher dimension, which significantly increases the number of variables. As an alternative, the well-known Burer-Monteiro approach can be used to cast the RPCA as a non-convex and non-smooth $\ell_1$ optimization problem with a significantly smaller number of variables. In this work, we show that the low-dimensional formulation of the symmetric and asymmetric positive rank-1 RPCA based on the Burer-Monteiro approach has a benign landscape, i.e., 1) it does not have any spurious local solution, 2) it has a unique global solution, and 3) its unique global solution coincides with the true components. An implication of this result is that simple local search algorithms are guaranteed to achieve a zero global optimality gap when directly applied to the low-dimensional formulation. Furthermore, we provide strong deterministic and probabilistic guarantees for the exact recovery of the true principal components. In particular, it is shown that a constant fraction of the measurements could be grossly corrupted and yet they would not create any spurious local solution.

Full Article

the

Smoothed Nonparametric Derivative Estimation using Weighted Difference Quotients

By
Published On :: 2020

Derivatives play an important role in bandwidth selection methods (e.g., plug-ins), data analysis and bias-corrected confidence intervals. Therefore, obtaining accurate derivative information is crucial. Although many derivative estimation methods exist, the majority require a fixed design assumption. In this paper, we propose an effective and fully data-driven framework to estimate the first- and second-order derivatives in random design. We establish the asymptotic properties of the proposed derivative estimator, and also propose a fast selection method for the tuning parameters. The performance and flexibility of the method are illustrated via an extensive simulation study.
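The general idea of a weighted difference quotient can be sketched as follows. This is an illustration under stated assumptions rather than the estimator proposed in the paper: it averages k symmetric difference quotients around a point, using weights proportional to j^2 (an assumed choice that favours wider quotients, which are less affected by noise), and it omits the smoothing step and the data-driven tuning that the abstract describes. The function name is hypothetical, and the inputs are assumed to be sorted by x with distinct x values.

```python
def weighted_difference_quotient(x, y, i, k):
    """Estimate the first derivative at x[i] from k symmetric quotients."""
    if i - k < 0 or i + k >= len(x):
        raise ValueError("need k points on each side of index i")

    norm = sum(j * j for j in range(1, k + 1))
    estimate = 0.0
    for j in range(1, k + 1):
        # Assumed weights: proportional to j**2 and summing to one.
        weight = j * j / norm
        quotient = (y[i + j] - y[i - j]) / (x[i + j] - x[i - j])
        estimate += weight * quotient
    return estimate


# Hypothetical unevenly spaced design with a noise-free linear response.
xs = [0.0, 0.3, 0.5, 1.1, 1.2, 2.0, 2.4]
ys = [2.0 * v + 1.0 for v in xs]
print(weighted_difference_quotient(xs, ys, i=3, k=3))
```

Because the weights sum to one, the sketch recovers the slope of a linear function up to floating-point rounding, whatever the spacing of the design points. Here `k` plays the role of a tuning parameter: larger values average more quotients and reduce variance, at the cost of bias when the underlying function is curved.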

Full Article

the

The weight function in the subtree kernel is decisive

By
Published On :: 2020

Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is difficult per se. In this paper, we focus on the subtree kernel, a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000s. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a two-class stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as is usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees that is particularly suitable for tuning parameters. We show through eight real data classification problems the great efficiency of our approach, in particular for small data sets, which also underlines the high importance of the weight function. Finally, a visualization tool of the significant features is derived.

Full Article

the

Scalable Approximate MCMC Algorithms for the Horseshoe Prior

By
Published On :: 2020

The horseshoe prior is frequently employed in Bayesian analysis of high-dimensional models, and has been shown to achieve minimax optimal risk properties when the truth is sparse. While optimization-based algorithms for the extremely popular Lasso and elastic net procedures can scale to dimension in the hundreds of thousands, algorithms for the horseshoe that use Markov chain Monte Carlo (MCMC) for computation are limited to problems an order of magnitude smaller. This is due to high computational cost per step and growth of the variance of time-averaging estimators as a function of dimension. We propose two new MCMC algorithms for computation in these models that have significantly improved performance compared to existing alternatives. One of the algorithms also approximates an expensive matrix product to give orders of magnitude speedup in high-dimensional applications. We prove guarantees for the accuracy of the approximate algorithm, and show that gradually decreasing the approximation error as the chain extends results in an exact algorithm. The scalability of the algorithm is illustrated in simulations with problem size as large as $N=5,000$ observations and $p=50,000$ predictors, and an application to a genome-wide association study with $N=2,267$ and $p=98,385$. The empirical results also show that the new algorithm yields estimates with lower mean squared error and intervals with better coverage, and that it elucidates features of the posterior that were often missed by previous algorithms in high dimensions, including bimodality of posterior marginals indicating uncertainty about which covariates belong in the model.

Full Article

the

Multi-Player Bandits: The Adversarial Case

By
Published On :: 2020

We consider a setting where multiple players sequentially choose among a common set of actions (arms). Motivated by an application to cognitive radio networks, we assume that players incur a loss upon colliding, and that communication between players is not possible. Existing approaches assume that the system is stationary. Yet this assumption is often violated in practice, e.g., due to signal strength fluctuations. In this work, we design the first multi-player bandit algorithm that provably works in arbitrarily changing environments, where the losses of the arms may even be chosen by an adversary. This resolves an open problem posed by Rosenski et al. (2016).

Full Article

the

Portraits of women in the collection

By feedproxy.google.com
Published On :: Thu, 20 Feb 2020 00:02:06 +0000

This NSW Women's Week (2–8 March) we're showcasing portraits and stories of 10 significant women from the Lib

Full Article

the

Researching the Pacific: The Pacific Manuscripts Bureau

By feedproxy.google.com
Published On :: Mon, 27 Apr 2020 05:25:40 +0000

The State Library holds a superb collection of original documents, illustrations, photographs and books about the Pacifi

Full Article

the

Have your say on the Highway 404 Employment Corridor Secondary Plan

By www.eastgwillimbury.ca
Published On :: Mon, 27 Apr 2020 22:16:01 GMT

Full Article

the

Oriented first passage percolation in the mean field limit

By projecteuclid.org
Published On :: Mon, 04 May 2020 04:00 EDT

Nicola Kistler, Adrien Schertzer, Marius A. Schmidt.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 414--425.

Abstract:
The Poisson clumping heuristic has led Aldous to conjecture the value of the oriented first passage percolation on the hypercube in the limit of large dimensions. Aldous’ conjecture has been rigorously confirmed by Fill and Pemantle (Ann. Appl. Probab. 3 (1993) 593–629) by means of a variance reduction trick. We present here a streamlined and, we believe, more natural proof based on ideas that emerged in the study of Derrida’s random energy models.

Full Article

David Milliss further papers, 1940s-2010

The Most Excellent Order of the British Empire Association (New South Wales) further records, 1979-2012

Signs of the times

Sizing up the collection

It's all in the detail

Resolving the image

The scanner has arrived

Oregon's Sabrina Ionescu takes home Naismith Trophy Player of the Year honor

Oregon's Ionescu wins women's Naismith Player of the Year

The Class of 2020: A look at basketball's new Hall of Famers

Clean sweep: Oregon's Sabrina Ionescu is unanimous Player of the Year after winning Wooden Award

Sabrina Ionescu: The Goat

Aari McDonald on returning for her senior year at Arizona: 'We're ready to set the bar higher'

WNBA Draft Profile: UCLA guard Japreece Dean ready to lead at the next level

Oregon's Ionescu looks forward to pro career in the WNBA

Pac-12 women's basketball student-athletes reflect on the influence of their moms ahead of Mother's Day

The limiting behavior of isotonic and convex regression estimators when the model is misspecified

Statistical convergence of the EM algorithm on Gaussian mixture models

Generalised cepstral models for the spectrum of vector time series

On the Letac-Massam conjecture and existence of high dimensional Bayes estimators for graphical models

Gaussian field on the symmetric group: Prediction and learning

Asymptotic properties of the maximum likelihood and cross validation estimators for transformed Gaussian processes

Sparse equisigned PCA: Algorithms and performance bounds in the noisy rank-1 setting

Bayesian variance estimation in the Gaussian sequence model with partial information on the means

Adaptive estimation in the supremum norm for semiparametric mixtures of regressions

Exact recovery in block spin Ising models at the critical line

On the predictive potential of kernel principal components

A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins

Computing the degrees of freedom of rank-regularized estimators and cousins

Rate optimal Chernoff bound and application to community detection in the stochastic block models

Differential network inference via the fused D-trace loss with cross variables

On the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensions

The bias of isotonic regression

The bias and skewness of M-estimators in regression

On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms

The Maximum Separation Subspace in Sufficient Dimension Reduction with Categorical Response

Generalized Nonbacktracking Bounds on the Influence

On the Complexity Analysis of the Primal Solutions for the Accelerated Randomized Dual Coordinate Ascent

Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables

Switching Regression Models and Causal Inference in the Presence of Discrete Latent Variables

Skill Rating for Multiplayer Games. Introducing Hypernode Graphs and their Spectral Theory

Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis

Smoothed Nonparametric Derivative Estimation using Weighted Difference Quotients

The weight function in the subtree kernel is decisive

Scalable Approximate MCMC Algorithms for the Horseshoe Prior

Multi-Player Bandits: The Adversarial Case

Portraits of women in the collection

Researching the Pacific: The Pacific Manuscripts Bureau

Have your say on the Highway 404 Employment Corridor Secondary Plan

Oriented first passage percolation in the mean field limit

The finish line: Attachment of Signs

The Finish Line: Katrina One Year After

The Finish Line: Cast Stone and EIFS

The Finish Line: Changing Stucco to EIFS

The Finish Line: A Case Study: What is Causing This?

The Finish Line: All About Rust

The Finish Line: Backwrapping vs. Edgewrapping

The Finish Line: Cleaning EIFS

The Finish Line: Floor Line Joints

The Finish Line: FAQ's About EIFS Part 1

The Finish Line: Drainage Efficiency

The Finish Line: Earthquakes and EIFS

The Finish Line: Types of EIFS

The Finish Line: Eco-Friendliness of EIFS

The Finish Line: Foam Shapes Revisited
