i Multiparameter Persistence Landscapes By Published On :: 2020 An important problem in the field of Topological Data Analysis is defining topological summaries which can be combined with traditional data analytic tools. In recent work Bubenik introduced the persistence landscape, a stable representation of persistence diagrams amenable to statistical analysis and machine learning tools. In this paper we generalise the persistence landscape to multiparameter persistence modules providing a stable representation of the rank invariant. We show that multiparameter landscapes are stable with respect to the interleaving distance and persistence weighted Wasserstein distance, and that the collection of multiparameter landscapes faithfully represents the rank invariant. Finally we provide example calculations and statistical tests to demonstrate a range of potential applications and how one can interpret the landscapes associated to a multiparameter module. Full Article
i Generalized Optimal Matching Methods for Causal Inference By Published On :: 2020 We develop an encompassing framework for matching, covariate balancing, and doubly-robust methods for causal inference from observational data called generalized optimal matching (GOM). The framework is given by generalizing a new functional-analytical formulation of optimal matching, giving rise to the class of GOM methods, for which we provide a single unified theory to analyze tractability and consistency. Many commonly used existing methods are included in GOM and, using their GOM interpretation, can be extended to optimally and automatically trade off balance for variance and outperform their standard counterparts. As a subclass, GOM gives rise to kernel optimal matching (KOM), which, as supported by new theoretical and empirical results, is notable for combining many of the positive properties of other methods in one. KOM, which is solved as a linearly-constrained convex-quadratic optimization problem, inherits both the interpretability and model-free consistency of matching but can also achieve the $sqrt{n}$-consistency of well-specified regression and the bias reduction and robustness of doubly robust methods. In settings of limited overlap, KOM enables a very transparent method for interval estimation for partial identification and robust coverage. We demonstrate this in examples with both synthetic and real data. Full Article
i Unique Sharp Local Minimum in L1-minimization Complete Dictionary Learning By Published On :: 2020 We study the problem of globally recovering a dictionary from a set of signals via $ell_1$-minimization. We assume that the signals are generated as i.i.d. random linear combinations of the $K$ atoms from a complete reference dictionary $D^*in mathbb R^{K imes K}$, where the linear combination coefficients are from either a Bernoulli type model or exact sparse model. First, we obtain a necessary and sufficient norm condition for the reference dictionary $D^*$ to be a sharp local minimum of the expected $ell_1$ objective function. Our result substantially extends that of Wu and Yu (2015) and allows the combination coefficient to be non-negative. Secondly, we obtain an explicit bound on the region within which the objective value of the reference dictionary is minimal. Thirdly, we show that the reference dictionary is the unique sharp local minimum, thus establishing the first known global property of $ell_1$-minimization dictionary learning. Motivated by the theoretical results, we introduce a perturbation based test to determine whether a dictionary is a sharp local minimum of the objective function. In addition, we also propose a new dictionary learning algorithm based on Block Coordinate Descent, called DL-BCD, which is guaranteed to decrease the obective function monotonically. Simulation studies show that DL-BCD has competitive performance in terms of recovery rate compared to other state-of-the-art dictionary learning algorithms when the reference dictionary is generated from random Gaussian matrices. Full Article
i Community-Based Group Graphical Lasso By Published On :: 2020 A new strategy for probabilistic graphical modeling is developed that draws parallels to community detection analysis. The method jointly estimates an undirected graph and homogeneous communities of nodes. The structure of the communities is taken into account when estimating the graph and at the same time, the structure of the graph is accounted for when estimating communities of nodes. The procedure uses a joint group graphical lasso approach with community detection-based grouping, such that some groups of edges co-occur in the estimated graph. The grouping structure is unknown and is estimated based on community detection algorithms. Theoretical derivations regarding graph convergence and sparsistency, as well as accuracy of community recovery are included, while the method's empirical performance is illustrated in an fMRI context, as well as with simulated examples. Full Article
i Smoothed Nonparametric Derivative Estimation using Weighted Difference Quotients By Published On :: 2020 Derivatives play an important role in bandwidth selection methods (e.g., plug-ins), data analysis and bias-corrected confidence intervals. Therefore, obtaining accurate derivative information is crucial. Although many derivative estimation methods exist, the majority require a fixed design assumption. In this paper, we propose an effective and fully data-driven framework to estimate the first and second order derivative in random design. We establish the asymptotic properties of the proposed derivative estimator, and also propose a fast selection method for the tuning parameters. The performance and flexibility of the method is illustrated via an extensive simulation study. Full Article
i WONDER: Weighted One-shot Distributed Ridge Regression in High Dimensions By Published On :: 2020 In many areas, practitioners need to analyze large data sets that challenge conventional single-machine computing. To scale up data analysis, distributed and parallel computing approaches are increasingly needed. Here we study a fundamental and highly important problem in this area: How to do ridge regression in a distributed computing environment? Ridge regression is an extremely popular method for supervised learning, and has several optimality properties, thus it is important to study. We study one-shot methods that construct weighted combinations of ridge regression estimators computed on each machine. By analyzing the mean squared error in a high-dimensional random-effects model where each predictor has a small effect, we discover several new phenomena. Infinite-worker limit: The distributed estimator works well for very large numbers of machines, a phenomenon we call 'infinite-worker limit'. Optimal weights: The optimal weights for combining local estimators sum to more than unity, due to the downward bias of ridge. Thus, all averaging methods are suboptimal. We also propose a new Weighted ONe-shot DistributEd Ridge regression algorithm (WONDER). We test WONDER in simulation studies and using the Million Song Dataset as an example. There it can save at least 100x in computation time, while nearly preserving test accuracy. Full Article
i The weight function in the subtree kernel is decisive By Published On :: 2020 Tree data are ubiquitous because they model a large variety of situations, e.g., the architecture of plants, the secondary structure of RNA, or the hierarchy of XML files. Nevertheless, the analysis of these non-Euclidean data is difficult per se. In this paper, we focus on the subtree kernel that is a convolution kernel for tree data introduced by Vishwanathan and Smola in the early 2000's. More precisely, we investigate the influence of the weight function from a theoretical perspective and in real data applications. We establish on a 2-classes stochastic model that the performance of the subtree kernel is improved when the weight of leaves vanishes, which motivates the definition of a new weight function, learned from the data and not fixed by the user as usually done. To this end, we define a unified framework for computing the subtree kernel from ordered or unordered trees, that is particularly suitable for tuning parameters. We show through eight real data classification problems the great efficiency of our approach, in particular for small data sets, which also states the high importance of the weight function. Finally, a visualization tool of the significant features is derived. Full Article
i On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics By Published On :: 2020 Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm in stochastic optimization. Recent work by Zhang et al. (2017) presents an analysis for the hitting time of SGLD for the first and second order stationary points. The proof in Zhang et al. (2017) is a two-stage procedure through bounding the Cheeger's constant, which is rather complicated and leads to loose bounds. In this paper, using intuitions from stochastic differential equations, we provide a direct analysis for the hitting times of SGLD to the first and second order stationary points. Our analysis is straightforward. It only relies on basic linear algebra and probability theory tools. Our direct analysis also leads to tighter bounds comparing to Zhang et al. (2017) and shows the explicit dependence of the hitting time on different factors, including dimensionality, smoothness, noise strength, and step size effects. Under suitable conditions, we show that the hitting time of SGLD to first-order stationary points can be dimension-independent. Moreover, we apply our analysis to study several important online estimation problems in machine learning, including linear regression, matrix factorization, and online PCA. Full Article
i Union of Low-Rank Tensor Spaces: Clustering and Completion By Published On :: 2020 We consider the problem of clustering and completing a set of tensors with missing data that are drawn from a union of low-rank tensor spaces. In the clustering problem, given a partially sampled tensor data that is composed of a number of subtensors, each chosen from one of a certain number of unknown tensor spaces, we need to group the subtensors that belong to the same tensor space. We provide a geometrical analysis on the sampling pattern and subsequently derive the sampling rate that guarantees the correct clustering under some assumptions with high probability. Moreover, we investigate the fundamental conditions for finite/unique completability for the union of tensor spaces completion problem. Both deterministic and probabilistic conditions on the sampling pattern to ensure finite/unique completability are obtained. For both the clustering and completion problems, our tensor analysis provides significantly better bound than the bound given by the matrix analysis applied to any unfolding of the tensor data. Full Article
i Representation Learning for Dynamic Graphs: A Survey By Published On :: 2020 Graphs arise naturally in many real-world applications including social networks, recommender systems, ontologies, biology, and computational finance. Traditionally, machine learning models for graphs have been mostly designed for static graphs. However, many applications involve evolving graphs. This introduces important challenges for learning and inference since nodes, attributes, and edges change over time. In this survey, we review the recent advances in representation learning for dynamic graphs, including dynamic knowledge graphs. We describe existing models from an encoder-decoder perspective, categorize these encoders and decoders based on the techniques they employ, and analyze the approaches in each category. We also review several prominent applications and widely used datasets and highlight directions for future research. Full Article
i Estimation of a Low-rank Topic-Based Model for Information Cascades By Published On :: 2020 We consider the problem of estimating the latent structure of a social network based on the observed information diffusion events, or cascades, where the observations for a given cascade consist of only the timestamps of infection for infected nodes but not the source of the infection. Most of the existing work on this problem has focused on estimating a diffusion matrix without any structural assumptions on it. In this paper, we propose a novel model based on the intuition that an information is more likely to propagate among two nodes if they are interested in similar topics which are also prominent in the information content. In particular, our model endows each node with an influence vector (which measures how authoritative the node is on each topic) and a receptivity vector (which measures how susceptible the node is for each topic). We show how this node-topic structure can be estimated from the observed cascades, and prove the consistency of the estimator. Experiments on synthetic and real data demonstrate the improved performance and better interpretability of our model compared to existing state-of-the-art methods. Full Article
i (1 + epsilon)-class Classification: an Anomaly Detection Method for Highly Imbalanced or Incomplete Data Sets By Published On :: 2020 Anomaly detection is not an easy problem since distribution of anomalous samples is unknown a priori. We explore a novel method that gives a trade-off possibility between one-class and two-class approaches, and leads to a better performance on anomaly detection problems with small or non-representative anomalous samples. The method is evaluated using several data sets and compared to a set of conventional one-class and two-class approaches. Full Article
i Scalable Approximate MCMC Algorithms for the Horseshoe Prior By Published On :: 2020 The horseshoe prior is frequently employed in Bayesian analysis of high-dimensional models, and has been shown to achieve minimax optimal risk properties when the truth is sparse. While optimization-based algorithms for the extremely popular Lasso and elastic net procedures can scale to dimension in the hundreds of thousands, algorithms for the horseshoe that use Markov chain Monte Carlo (MCMC) for computation are limited to problems an order of magnitude smaller. This is due to high computational cost per step and growth of the variance of time-averaging estimators as a function of dimension. We propose two new MCMC algorithms for computation in these models that have significantly improved performance compared to existing alternatives. One of the algorithms also approximates an expensive matrix product to give orders of magnitude speedup in high-dimensional applications. We prove guarantees for the accuracy of the approximate algorithm, and show that gradually decreasing the approximation error as the chain extends results in an exact algorithm. The scalability of the algorithm is illustrated in simulations with problem size as large as $N=5,000$ observations and $p=50,000$ predictors, and an application to a genome-wide association study with $N=2,267$ and $p=98,385$. The empirical results also show that the new algorithm yields estimates with lower mean squared error, intervals with better coverage, and elucidates features of the posterior that were often missed by previous algorithms in high dimensions, including bimodality of posterior marginals indicating uncertainty about which covariates belong in the model. Full Article
i High-dimensional Gaussian graphical models on network-linked data By Published On :: 2020 Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful and interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network “cohesion”, which refers to the empirically observed phenomenon of network neighbors sharing similar traits. Full Article
i Identifiability of Additive Noise Models Using Conditional Variances By Published On :: 2020 This paper considers a new identifiability condition for additive noise models (ANMs) in which each variable is determined by an arbitrary Borel measurable function of its parents plus an independent error. It has been shown that ANMs are fully recoverable under some identifiability conditions, such as when all error variances are equal. However, this identifiable condition could be restrictive, and hence, this paper focuses on a relaxed identifiability condition that involves not only error variances, but also the influence of parents. This new class of identifiable ANMs does not put any constraints on the form of dependencies, or distributions of errors, and allows different error variances. It further provides a statistically consistent and computationally feasible structure learning algorithm for the identifiable ANMs based on the new identifiability condition. The proposed algorithm assumes that all relevant variables are observed, while it does not assume faithfulness or a sparse graph. Demonstrated through extensive simulated and real multivariate data is that the proposed algorithm successfully recovers directed acyclic graphs. Full Article
i GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning By Published On :: 2020 When the data is distributed across multiple servers, lowering the communication cost between the servers (or workers) while solving the distributed learning problem is an important problem and is the focus of this paper. In particular, we propose a fast, and communication-efficient decentralized framework to solve the distributed machine learning (DML) problem. The proposed algorithm, Group Alternating Direction Method of Multipliers (GADMM) is based on the Alternating Direction Method of Multipliers (ADMM) framework. The key novelty in GADMM is that it solves the problem in a decentralized topology where at most half of the workers are competing for the limited communication resources at any given time. Moreover, each worker exchanges the locally trained model only with two neighboring workers, thereby training a global model with a lower amount of communication overhead in each exchange. We prove that GADMM converges to the optimal solution for convex loss functions, and numerically show that it converges faster and more communication-efficient than the state-of-the-art communication-efficient algorithms such as the Lazily Aggregated Gradient (LAG) and dual averaging, in linear and logistic regression tasks on synthetic and real datasets. Furthermore, we propose Dynamic GADMM (D-GADMM), a variant of GADMM, and prove its convergence under the time-varying network topology of the workers. Full Article
i Multi-Player Bandits: The Adversarial Case By Published On :: 2020 We consider a setting where multiple players sequentially choose among a common set of actions (arms). Motivated by an application to cognitive radio networks, we assume that players incur a loss upon colliding, and that communication between players is not possible. Existing approaches assume that the system is stationary. Yet this assumption is often violated in practice, e.g., due to signal strength fluctuations. In this work, we design the first multi-player Bandit algorithm that provably works in arbitrarily changing environments, where the losses of the arms may even be chosen by an adversary. This resolves an open problem posed by Rosenski et al. (2016). Full Article
i Portraits of women in the collection By feedproxy.google.com Published On :: Thu, 20 Feb 2020 00:02:06 +0000 This NSW Women's Week (2–8 March) we're showcasing portraits and stories of 10 significant women from the Lib Full Article
i TIGER: using artificial intelligence to discover our collections By feedproxy.google.com Published On :: Tue, 10 Mar 2020 22:01:20 +0000 The State Library of NSW has almost 4 million digital files in its collection. Full Article
i It's only rock’n’roll… but I like it By feedproxy.google.com Published On :: Mon, 16 Mar 2020 00:04:17 +0000 Collecting contemporary music from New South Wales is a developing priority for the Library. Full Article
i COVID-19 collecting drive By feedproxy.google.com Published On :: Mon, 30 Mar 2020 23:02:48 +0000 We need your help! We are collecting posters, flyers and mail-outs appearing in our local neighbourhoods in respo Full Article
i Q&A with Tara June Winch By feedproxy.google.com Published On :: Mon, 27 Apr 2020 01:40:36 +0000 Tara June Winch's profound novel The Yield has won three NSW Premier's Literary Awards prizes this year, inclu Full Article
i Researching the Pacific: The Pacific Manuscripts Bureau By feedproxy.google.com Published On :: Mon, 27 Apr 2020 05:25:40 +0000 The State Library holds a superb collection of original documents, illustrations, photographs and books about the Pacifi Full Article
i Cook commemoration sparks 1970 protest By feedproxy.google.com Published On :: Tue, 28 Apr 2020 04:52:55 +0000 In 1970, celebrations and commemorations were held across the nation for the 200th anniversary of the Endeavour’s visit Full Article
i Access thousands of newspapers and magazines with PressReader By feedproxy.google.com Published On :: Mon, 04 May 2020 03:40:42 +0000 Want to access thousands of newspapers and magazines wherever you are? Full Article
i Q&A with Adam Ferguson By feedproxy.google.com Published On :: Tue, 05 May 2020 05:43:41 +0000 Each year the Library hosts the popular World Press Photo exhibition, bringing together award-winning photographs from t Full Article
i Youth & Community Initiatives Funding available By www.eastgwillimbury.ca Published On :: Thu, 20 Feb 2020 18:27:25 GMT Full Article
i Crime Prevention at Home By www.eastgwillimbury.ca Published On :: Wed, 11 Dec 2019 17:46:59 GMT Full Article
i Health & Active Living Challenge By www.eastgwillimbury.ca Published On :: Wed, 15 Apr 2020 17:19:39 GMT Full Article
i EG Waste Collection By www.eastgwillimbury.ca Published On :: Wed, 15 Apr 2020 17:11:30 GMT Full Article
i Mosquito Control Program By www.eastgwillimbury.ca Published On :: Thu, 16 Apr 2020 21:14:52 GMT Full Article
i Have your say on the Highway 404 Employment Corridor Secondary Plan By www.eastgwillimbury.ca Published On :: Mon, 27 Apr 2020 22:16:01 GMT Full Article
i Town launches new Community Support Hotline By www.eastgwillimbury.ca Published On :: Tue, 28 Apr 2020 23:15:02 GMT Full Article
i Share your fall and winter photos with us! By www.eastgwillimbury.ca Published On :: Sun, 03 May 2020 16:08:06 GMT Full Article
i Branching random walks with uncountably many extinction probability vectors By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Daniela Bertacchi, Fabio Zucca. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 426--438.Abstract: Given a branching random walk on a set $X$, we study its extinction probability vectors $mathbf{q}(cdot,A)$. Their components are the probability that the process goes extinct in a fixed $Asubseteq X$, when starting from a vertex $xin X$. The set of extinction probability vectors (obtained letting $A$ vary among all subsets of $X$) is a subset of the set of the fixed points of the generating function of the branching random walk. In particular here we are interested in the cardinality of the set of extinction probability vectors. We prove results which allow to understand whether the probability of extinction in a set $A$ is different from the one of extinction in another set $B$. In many cases there are only two possible extinction probability vectors and so far, in more complicated examples, only a finite number of distinct extinction probability vectors had been explicitly found. Whether a branching random walk could have an infinite number of distinct extinction probability vectors was not known. We apply our results to construct examples of branching random walks with uncountably many distinct extinction probability vectors. Full Article
i Oriented first passage percolation in the mean field limit By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Nicola Kistler, Adrien Schertzer, Marius A. Schmidt. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 414--425.Abstract: The Poisson clumping heuristic has lead Aldous to conjecture the value of the oriented first passage percolation on the hypercube in the limit of large dimensions. Aldous’ conjecture has been rigorously confirmed by Fill and Pemantle ( Ann. Appl. Probab. 3 (1993) 593–629) by means of a variance reduction trick. We present here a streamlined and, we believe, more natural proof based on ideas emerged in the study of Derrida’s random energy models. Full Article
i Stein characterizations for linear combinations of gamma random variables By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Benjamin Arras, Ehsan Azmoodeh, Guillaume Poly, Yvik Swan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 394--413.Abstract: In this paper we propose a new, simple and explicit mechanism allowing to derive Stein operators for random variables whose characteristic function satisfies a simple ODE. We apply this to study random variables which can be represented as linear combinations of (not necessarily independent) gamma distributed random variables. The connection with Malliavin calculus for random variables in the second Wiener chaos is detailed. An application to McKay Type I random variables is also outlined. Full Article
i Measuring symmetry and asymmetry of multiplicative distortion measurement errors data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Jun Zhang, Yujie Gai, Xia Cui, Gaorong Li. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 370--393.Abstract: This paper studies the measure of symmetry or asymmetry of a continuous variable under the multiplicative distortion measurement errors setting. The unobservable variable is distorted in a multiplicative fashion by an observed confounding variable. First, two direct plug-in estimation procedures are proposed, and the empirical likelihood based confidence intervals are constructed to measure the symmetry or asymmetry of the unobserved variable. Next, we propose four test statistics for testing whether the unobserved variable is symmetric or not. The asymptotic properties of the proposed estimators and test statistics are examined. We conduct Monte Carlo simulation experiments to examine the performance of the proposed estimators and test statistics. These methods are applied to analyze a real dataset for an illustration. Full Article
i Reliability estimation in a multicomponent stress-strength model for Burr XII distribution under progressive censoring By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Raj Kamal Maurya, Yogesh Mani Tripathi. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 345--369.Abstract: We consider estimation of the multicomponent stress-strength reliability under progressive Type II censoring under the assumption that stress and strength variables follow Burr XII distributions with a common shape parameter. Maximum likelihood estimates of the reliability are obtained along with asymptotic intervals when common shape parameter may be known or unknown. Bayes estimates are also derived under the squared error loss function using different approximation methods. Further, we obtain exact Bayes and uniformly minimum variance unbiased estimates of the reliability for the case common shape parameter is known. The highest posterior density intervals are also obtained. We perform Monte Carlo simulations to compare the performance of proposed estimates and present a discussion based on this study. Finally, two real data sets are analyzed for illustration purposes. Full Article
i A Bayesian sparse finite mixture model for clustering data from a heterogeneous population By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Erlandson F. Saraiva, Adriano K. Suzuki, Luís A. Milan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 323--344.Abstract: In this paper, we introduce a Bayesian approach for clustering data using a sparse finite mixture model (SFMM). The SFMM is a finite mixture model with a large number of components $k$ previously fixed where many components can be empty. In this model, the number of components $k$ can be interpreted as the maximum number of distinct mixture components. Then, we explore the use of a prior distribution for the weights of the mixture model that take into account the possibility that the number of clusters $k_{mathbf{c}}$ (e.g., nonempty components) can be random and smaller than the number of components $k$ of the finite mixture model. In order to determine clusters we develop a MCMC algorithm denominated Split-Merge allocation sampler. In this algorithm, the split-merge strategy is data-driven and was inserted within the algorithm in order to increase the mixing of the Markov chain in relation to the number of clusters. The performance of the method is verified using simulated datasets and three real datasets. The first real data set is the benchmark galaxy data, while second and third are the publicly available data set on Enzyme and Acidity, respectively. Full Article
i Bayesian modeling and prior sensitivity analysis for zero–one augmented beta regression models with an application to psychometric data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Danilo Covaes Nogarotto, Caio Lucidius Naberezny Azevedo, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 304--322.Abstract: The interest on the analysis of the zero–one augmented beta regression (ZOABR) model has been increasing over the last few years. In this work, we developed a Bayesian inference for the ZOABR model, providing some contributions, namely: we explored the use of Jeffreys-rule and independence Jeffreys prior for some of the parameters, performing a sensitivity study of prior choice, comparing the Bayesian estimates with the maximum likelihood ones and measuring the accuracy of the estimates under several scenarios of interest. The results indicate, in a general way, that: the Bayesian approach, under the Jeffreys-rule prior, was as accurate as the ML one. Also, different from other approaches, we use the predictive distribution of the response to implement Bayesian residuals. To further illustrate the advantages of our approach, we conduct an analysis of a real psychometric data set including a Bayesian residual analysis, where it is shown that misleading inference can be obtained when the data is transformed. That is, when the zeros and ones are transformed to suitable values and the usual beta regression model is considered, instead of the ZOABR model. Finally, future developments are discussed. Full Article
i Adaptive two-treatment three-period crossover design for normal responses By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Uttam Bandyopadhyay, Shirsendu Mukherjee, Atanu Biswas. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 291--303.Abstract: In adaptive crossover design, our goal is to allocate more patients to a promising treatment sequence. The present work contains a very simple three period crossover design for two competing treatments where the allocation in period 3 is done on the basis of the data obtained from the first two periods. Assuming normality of response variables we use a reliability functional for the choice between two treatments. We calculate the allocation proportions and their standard errors corresponding to the possible treatment combinations. We also derive some asymptotic results and provide solutions on related inferential problems. Moreover, the proposed procedure is compared with a possible competitor. Finally, we use a data set to illustrate the applicability of the proposed design. Full Article
i Symmetrical and asymmetrical mixture autoregressive processes By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Mohsen Maleki, Arezo Hajrajabi, Reinaldo B. Arellano-Valle. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 273--290.Abstract: In this paper, we study the finite mixtures of autoregressive processes assuming that the distribution of innovations (errors) belongs to the class of scale mixture of skew-normal (SMSN) distributions. The SMSN distributions allow a simultaneous modeling of the existence of outliers, heavy tails and asymmetries in the distribution of innovations. Therefore, a statistical methodology based on the SMSN family allows us to use a robust modeling on some non-linear time series with great flexibility, to accommodate skewness, heavy tails and heterogeneity simultaneously. The existence of convenient hierarchical representations of the SMSN distributions facilitates also the implementation of an ECME-type of algorithm to perform the likelihood inference in the considered model. Simulation studies and the application to a real data set are finally presented to illustrate the usefulness of the proposed model. Full Article
i Random environment binomial thinning integer-valued autoregressive process with Poisson or geometric marginal By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Zhengwei Liu, Qi Li, Fukang Zhu. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 251--272.Abstract: To predict time series of counts with small values and remarkable fluctuations, an available model is the $r$ states random environment process based on the negative binomial thinning operator and the geometric marginal. However, we argue that the aforementioned model may suffer from the following two drawbacks. First, under the condition of no prior information, the overdispersed property of the geometric distribution may cause the predictions fluctuate greatly. Second, because of the constraints on the model parameters, some estimated parameters are close to zero in real-data examples, which may not objectively reveal the correlation relationship. For the first drawback, an $r$ states random environment process based on the binomial thinning operator and the Poisson marginal is introduced. For the second drawback, we propose a generalized $r$ states random environment integer-valued autoregressive model based on the binomial thinning operator to model fluctuations of data. Yule–Walker and conditional maximum likelihood estimates are considered and their performances are assessed via simulation studies. Two real-data sets are conducted to illustrate the better performances of the proposed models compared with some existing models. Full Article
i Agnostic tests can control the type I and type II errors simultaneously By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Victor Coscrato, Rafael Izbicki, Rafael B. Stern. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 230--250.Abstract: Despite its common practice, statistical hypothesis testing presents challenges in interpretation. For instance, in the standard frequentist framework there is no control of the type II error. As a result, the non-rejection of the null hypothesis $(H_{0})$ cannot reasonably be interpreted as its acceptance. We propose that this dilemma can be overcome by using agnostic hypothesis tests, since they can control the type I and II errors simultaneously. In order to make this idea operational, we show how to obtain agnostic hypothesis in typical models. For instance, we show how to build (unbiased) uniformly most powerful agnostic tests and how to obtain agnostic tests from standard p-values. Also, we present conditions such that the above tests can be made logically coherent. Finally, we present examples of consistent agnostic hypothesis tests. Full Article
i Recent developments in complex and spatially correlated functional data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Israel Martínez-Hernández, Marc G. Genton. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 204--229.Abstract: As high-dimensional and high-frequency data are being collected on a large scale, the development of new statistical models is being pushed forward. Functional data analysis provides the required statistical methods to deal with large-scale and complex data by assuming that data are continuous functions, for example, realizations of a continuous process (curves) or continuous random field (surfaces), and that each curve or surface is considered as a single observation. Here, we provide an overview of functional data analysis when data are complex and spatially correlated. We provide definitions and estimators of the first and second moments of the corresponding functional random variable. We present two main approaches: The first assumes that data are realizations of a functional random field, that is, each observation is a curve with a spatial component. We call them spatial functional data . The second approach assumes that data are continuous deterministic fields observed over time. In this case, one observation is a surface or manifold, and we call them surface time series . For these two approaches, we describe software available for the statistical analysis. We also present a data illustration, using a high-resolution wind speed simulated dataset, as an example of the two approaches. The functional data approach offers a new paradigm of data analysis, where the continuous processes or random fields are considered as a single entity. We consider this approach to be very valuable in the context of big data. Full Article
i A message from the editorial board By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 203--203. Full Article
i $W^{1,p}$-Solutions of the transport equation by stochastic perturbation By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST David A. C. Mollinedo. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 188--201.Abstract: We consider the stochastic transport equation with a possibly unbounded Hölder continuous vector field. Well-posedness is proved, namely, we show existence, uniqueness and strong stability of $W^{1,p}$-weak solutions. Full Article
i A note on the “L-logistic regression models: Prior sensitivity analysis, robustness to outliers and applications” By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Saralees Nadarajah, Yuancheng Si. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 183--187.Abstract: Da Paz, Balakrishnan and Bazan [Braz. J. Probab. Stat. 33 (2019), 455–479] introduced the L-logistic distribution, studied its properties including estimation issues and illustrated a data application. This note derives a closed form expression for moment properties of the distribution. Some computational issues are discussed. Full Article