ter Share your fall and winter photos with us! By www.eastgwillimbury.ca Published On :: Sun, 03 May 2020 16:08:06 GMT Full Article
ter Stein characterizations for linear combinations of gamma random variables By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Benjamin Arras, Ehsan Azmoodeh, Guillaume Poly, Yvik Swan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 394--413.Abstract: In this paper we propose a new, simple and explicit mechanism allowing to derive Stein operators for random variables whose characteristic function satisfies a simple ODE. We apply this to study random variables which can be represented as linear combinations of (not necessarily independent) gamma distributed random variables. The connection with Malliavin calculus for random variables in the second Wiener chaos is detailed. An application to McKay Type I random variables is also outlined. Full Article
ter A Bayesian sparse finite mixture model for clustering data from a heterogeneous population By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Erlandson F. Saraiva, Adriano K. Suzuki, Luís A. Milan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 323--344.Abstract: In this paper, we introduce a Bayesian approach for clustering data using a sparse finite mixture model (SFMM). The SFMM is a finite mixture model with a large number of components $k$ previously fixed where many components can be empty. In this model, the number of components $k$ can be interpreted as the maximum number of distinct mixture components. Then, we explore the use of a prior distribution for the weights of the mixture model that take into account the possibility that the number of clusters $k_{mathbf{c}}$ (e.g., nonempty components) can be random and smaller than the number of components $k$ of the finite mixture model. In order to determine clusters we develop a MCMC algorithm denominated Split-Merge allocation sampler. In this algorithm, the split-merge strategy is data-driven and was inserted within the algorithm in order to increase the mixing of the Markov chain in relation to the number of clusters. The performance of the method is verified using simulated datasets and three real datasets. The first real data set is the benchmark galaxy data, while second and third are the publicly available data set on Enzyme and Acidity, respectively. Full Article
ter On estimating the location parameter of the selected exponential population under the LINEX loss function By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Mohd Arshad, Omer Abdalghani. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 167--182.Abstract: Suppose that $pi_{1},pi_{2},ldots ,pi_{k}$ be $k(geq2)$ independent exponential populations having unknown location parameters $mu_{1},mu_{2},ldots,mu_{k}$ and known scale parameters $sigma_{1},ldots,sigma_{k}$. Let $mu_{[k]}=max {mu_{1},ldots,mu_{k}}$. For selecting the population associated with $mu_{[k]}$, a class of selection rules (proposed by Arshad and Misra [ Statistical Papers 57 (2016) 605–621]) is considered. We consider the problem of estimating the location parameter $mu_{S}$ of the selected population under the criterion of the LINEX loss function. We consider three natural estimators $delta_{N,1},delta_{N,2}$ and $delta_{N,3}$ of $mu_{S}$, based on the maximum likelihood estimators, uniformly minimum variance unbiased estimator (UMVUE) and minimum risk equivariant estimator (MREE) of $mu_{i}$’s, respectively. The uniformly minimum risk unbiased estimator (UMRUE) and the generalized Bayes estimator of $mu_{S}$ are derived. Under the LINEX loss function, a general result for improving a location-equivariant estimator of $mu_{S}$ is derived. Using this result, estimator better than the natural estimator $delta_{N,1}$ is obtained. We also shown that the estimator $delta_{N,1}$ is dominated by the natural estimator $delta_{N,3}$. Finally, we perform a simulation study to evaluate and compare risk functions among various competing estimators of $mu_{S}$. Full Article
ter A primer on the characterization of the exchangeable Marshall–Olkin copula via monotone sequences By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Natalia Shenkman. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 127--135.Abstract: While derivations of the characterization of the $d$-variate exchangeable Marshall–Olkin copula via $d$-monotone sequences relying on basic knowledge in probability theory exist in the literature, they contain a myriad of unnecessary relatively complicated computations. We revisit this issue and provide proofs where all undesired artefacts are removed, thereby exposing the simplicity of the characterization. In particular, we give an insightful analytical derivation of the monotonicity conditions based on the monotonicity properties of the survival probabilities. Full Article
ter Effects of gene–environment and gene–gene interactions in case-control studies: A novel Bayesian semiparametric approach By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Durba Bhattacharya, Sourabh Bhattacharya. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 71--89.Abstract: Present day bio-medical research is pointing towards the fact that cognizance of gene–environment interactions along with genetic interactions may help prevent or detain the onset of many complex diseases like cardiovascular disease, cancer, type2 diabetes, autism or asthma by adjustments to lifestyle. In this regard, we propose a Bayesian semiparametric model to detect not only the roles of genes and their interactions, but also the possible influence of environmental variables on the genes in case-control studies. Our model also accounts for the unknown number of genetic sub-populations via finite mixtures composed of Dirichlet processes. An effective parallel computing methodology, developed by us harnesses the power of parallel processing technology to increase the efficiencies of our conditionally independent Gibbs sampling and Transformation based MCMC (TMCMC) methods. Applications of our model and methods to simulation studies with biologically realistic genotype datasets and a real, case-control based genotype dataset on early onset of myocardial infarction (MI) have yielded quite interesting results beside providing some insights into the differential effect of gender on MI. Full Article
ter Option pricing with bivariate risk-neutral density via copula and heteroscedastic model: A Bayesian approach By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Lucas Pereira Lopes, Vicente Garibay Cancho, Francisco Louzada. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 801--825.Abstract: Multivariate options are adequate tools for multi-asset risk management. The pricing models derived from the pioneer Black and Scholes method under the multivariate case consider that the asset-object prices follow a Brownian geometric motion. However, the construction of such methods imposes some unrealistic constraints on the process of fair option calculation, such as constant volatility over the maturity time and linear correlation between the assets. Therefore, this paper aims to price and analyze the fair price behavior of the call-on-max (bivariate) option considering marginal heteroscedastic models with dependence structure modeled via copulas. Concerning inference, we adopt a Bayesian perspective and computationally intensive methods based on Monte Carlo simulations via Markov Chain (MCMC). A simulation study examines the bias, and the root mean squared errors of the posterior means for the parameters. Real stocks prices of Brazilian banks illustrate the approach. For the proposed method is verified the effects of strike and dependence structure on the fair price of the option. The results show that the prices obtained by our heteroscedastic model approach and copulas differ substantially from the prices obtained by the model derived from Black and Scholes. Empirical results are presented to argue the advantages of our strategy. Full Article
ter Estimation of parameters in the $operatorname{DDRCINAR}(p)$ model By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Xiufang Liu, Dehui Wang. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 638--673.Abstract: This paper discusses a $p$th-order dependence-driven random coefficient integer-valued autoregressive time series model ($operatorname{DDRCINAR}(p)$). Stationarity and ergodicity properties are proved. Conditional least squares, weighted least squares and maximum quasi-likelihood are used to estimate the model parameters. Asymptotic properties of the estimators are presented. The performances of these estimators are investigated and compared via simulations. In certain regions of the parameter space, simulative analysis shows that maximum quasi-likelihood estimators perform better than the estimators of conditional least squares and weighted least squares in terms of the proportion of within-$Omega$ estimates. At last, the model is applied to two real data sets. Full Article
ter Modified information criterion for testing changes in skew normal model By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Khamis K. Said, Wei Ning, Yubin Tian. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 280--300.Abstract: In this paper, we study the change point problem for the skew normal distribution model from the view of model selection problem. The detection procedure based on the modified information criterion (MIC) for change problem is proposed. Such a procedure has advantage in detecting the changes in early and late stage of a data comparing to the one based on the traditional Schwarz information criterion which is well known as Bayesian information criterion (BIC) by considering the complexity of the models. Due to the difficulty in deriving the analytic asymptotic distribution of the test statistic based on the MIC procedure, the bootstrap simulation is provided to obtain the critical values at the different significance levels. Simulations are conducted to illustrate the comparisons of performance between MIC, BIC and likelihood ratio test (LRT). Such an approach is applied on two stock market data sets to indicate the detection procedure. Full Article
ter Simple tail index estimation for dependent and heterogeneous data with missing values By projecteuclid.org Published On :: Mon, 14 Jan 2019 04:01 EST Ivana Ilić, Vladica M. Veličković. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 1, 192--203.Abstract: Financial returns are known to be nonnormal and tend to have fat-tailed distribution. Also, the dependence of large values in a stochastic process is an important topic in risk, insurance and finance. In the presence of missing values, we deal with the asymptotic properties of a simple “median” estimator of the tail index based on random variables with the heavy-tailed distribution function and certain dependence among the extremes. Weak consistency and asymptotic normality of the proposed estimator are established. The estimator is a special case of a well-known estimator defined in Bacro and Brito [ Statistics & Decisions 3 (1993) 133–143]. The advantage of the estimator is its robustness against deviations and compared to Hill’s, it is less affected by the fluctuations related to the maximum of the sample or by the presence of outliers. Several examples are analyzed in order to support the proofs. Full Article
ter An estimation method for latent traits and population parameters in Nominal Response Model By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Caio L. N. Azevedo, Dalton F. AndradeSource: Braz. J. Probab. Stat., Volume 24, Number 3, 415--433.Abstract: The nominal response model (NRM) was proposed by Bock [ Psychometrika 37 (1972) 29–51] in order to improve the latent trait (ability) estimation in multiple choice tests with nominal items. When the item parameters are known, expectation a posteriori or maximum a posteriori methods are commonly employed to estimate the latent traits, considering a standard symmetric normal distribution as the latent traits prior density. However, when this item set is presented to a new group of examinees, it is not only necessary to estimate their latent traits but also the population parameters of this group. This article has two main purposes: first, to develop a Monte Carlo Markov Chain algorithm to estimate both latent traits and population parameters concurrently. This algorithm comprises the Metropolis–Hastings within Gibbs sampling algorithm (MHWGS) proposed by Patz and Junker [ Journal of Educational and Behavioral Statistics 24 (1999b) 346–366]. Second, to compare, in the latent trait recovering, the performance of this method with three other methods: maximum likelihood, expectation a posteriori and maximum a posteriori. The comparisons were performed by varying the total number of items (NI), the number of categories and the values of the mean and the variance of the latent trait distribution. The results showed that MHWGS outperforms the other methods concerning the latent traits estimation as well as it recoveries properly the population parameters. Furthermore, we found that NI accounts for the highest percentage of the variability in the accuracy of latent trait estimation. Full Article
ter Unlikeness is us : fourteen from the Exeter book By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Exeter book. Selections. EnglishCallnumber: PS 8631 A8489 E94 2018ISBN: 9781554471751 (softcover) Full Article
ter The Grand River watershed : a folk ecology : poems By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Houle, Karen, author.Callnumber: PS 8565 O78 G73 2019ISBN: 9781554471843 paperback Full Article
ter Nights below Foord Street : literature and popular culture in postindustrial Nova Scotia By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Thompson, Peter, 1981- author.Callnumber: PS 8131 N6 T56 2019ISBN: 0773559345 Full Article
ter Novel bodies : disability and sexuality in eighteenth-century British literature By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Farr, Jason S., 1978- author.Callnumber: PR 858 P425 F37 2019ISBN: 9781684481088 hardcover alkaline paper Full Article
ter Globalizing capital : a history of the international monetary system By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Eichengreen, Barry J., author.Callnumber: HG 3881 E347 2019ISBN: 9780691193908 (paperback) Full Article
ter Can $p$-values be meaningfully interpreted without random sampling? By projecteuclid.org Published On :: Thu, 26 Mar 2020 22:02 EDT Norbert Hirschauer, Sven Grüner, Oliver Mußhoff, Claudia Becker, Antje Jantsch. Source: Statistics Surveys, Volume 14, 71--91.Abstract: Besides the inferential errors that abound in the interpretation of $p$-values, the probabilistic pre-conditions (i.e. random sampling or equivalent) for using them at all are not often met by observational studies in the social sciences. This paper systematizes different sampling designs and discusses the restrictive requirements of data collection that are the indispensable prerequisite for using $p$-values. Full Article
ter Scalar-on-function regression for predicting distal outcomes from intensively gathered longitudinal data: Interpretability for applied scientists By projecteuclid.org Published On :: Tue, 05 Nov 2019 22:03 EST John J. Dziak, Donna L. Coffman, Matthew Reimherr, Justin Petrovich, Runze Li, Saul Shiffman, Mariya P. Shiyko. Source: Statistics Surveys, Volume 13, 150--180.Abstract: Researchers are sometimes interested in predicting a distal or external outcome (such as smoking cessation at follow-up) from the trajectory of an intensively recorded longitudinal variable (such as urge to smoke). This can be done in a semiparametric way via scalar-on-function regression. However, the resulting fitted coefficient regression function requires special care for correct interpretation, as it represents the joint relationship of time points to the outcome, rather than a marginal or cross-sectional relationship. We provide practical guidelines, based on experience with scientific applications, for helping practitioners interpret their results and illustrate these ideas using data from a smoking cessation study. Full Article
ter Variable selection methods for model-based clustering By projecteuclid.org Published On :: Thu, 26 Apr 2018 04:00 EDT Michael Fop, Thomas Brendan Murphy. Source: Statistics Surveys, Volume 12, 18--65.Abstract: Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received a lot of attention and research effort in recent years. Even for small size problems, variable selection has been advocated to facilitate the interpretation of the clustering results. This review provides a summary of the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are indicated and illustrated in application to two data analysis examples. Full Article
ter $M$-functionals of multivariate scatter By projecteuclid.org Published On :: Fri, 20 Mar 2015 09:11 EDT Lutz Dümbgen, Markus Pauly, Thomas Schweizer. Source: Statistics Surveys, Volume 9, 32--105.Abstract: This survey provides a self-contained account of $M$-estimation of multivariate scatter. In particular, we present new proofs for existence of the underlying $M$-functionals and discuss their weak continuity and differentiability. This is done in a rather general framework with matrix-valued random variables. By doing so we reveal a connection between Tyler’s (1987a) $M$-functional of scatter and the estimation of proportional covariance matrices. Moreover, this general framework allows us to treat a new class of scatter estimators, based on symmetrizations of arbitrary order. Finally these results are applied to $M$-estimation of multivariate location and scatter via multivariate $t$-distributions. Full Article
ter Finite mixture models and model-based clustering By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Volodymyr Melnykov, Ranjan MaitraSource: Statist. Surv., Volume 4, 80--116.Abstract: Finite mixture models have a long history in statistics, having been used to model population heterogeneity, generalize distributional assumptions, and lately, for providing a convenient yet formal framework for clustering and classification. This paper provides a detailed review into mixture models and model-based clustering. Recent trends as well as open problems in the area are also discussed. Full Article
ter Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Michael P. Fay, Michael A. ProschanSource: Statist. Surv., Volume 4, 1--39.Abstract: In a mathematical approach to hypothesis tests, we start with a clearly defined set of hypotheses and choose the test with the best properties for those hypotheses. In practice, we often start with less precise hypotheses. For example, often a researcher wants to know which of two groups generally has the larger responses, and either a t-test or a Wilcoxon-Mann-Whitney (WMW) test could be acceptable. Although both t-tests and WMW tests are usually associated with quite different hypotheses, the decision rule and p-value from either test could be associated with many different sets of assumptions, which we call perspectives. It is useful to have many of the different perspectives to which a decision rule may be applied collected in one place, since each perspective allows a different interpretation of the associated p-value. Here we collect many such perspectives for the two-sample t-test, the WMW test and other related tests. We discuss validity and consistency under each perspective and discuss recommendations between the tests in light of these many different perspectives. Finally, we briefly discuss a decision rule for testing genetic neutrality where knowledge of the many perspectives is vital to the proper interpretation of the decision rule. Full Article
ter Holtermann and the A&A Photographic Company By feedproxy.google.com Published On :: Thu, 10 Sep 2015 02:50:04 +0000 We recently received a comment about authorship of the Holtermann Collection. Although it may seem a purely historica Full Article
ter Unsupervised Pre-trained Models from Healthy ADLs Improve Parkinson's Disease Classification of Gait Patterns. (arXiv:2005.02589v2 [cs.LG] UPDATED) By arxiv.org Published On :: Application and use of deep learning algorithms for different healthcare applications is gaining interest at a steady pace. However, use of such algorithms can prove to be challenging as they require large amounts of training data that capture different possible variations. This makes it difficult to use them in a clinical setting since in most health applications researchers often have to work with limited data. Less data can cause the deep learning model to over-fit. In this paper, we ask how can we use data from a different environment, different use-case, with widely differing data distributions. We exemplify this use case by using single-sensor accelerometer data from healthy subjects performing activities of daily living - ADLs (source dataset), to extract features relevant to multi-sensor accelerometer gait data (target dataset) for Parkinson's disease classification. We train the pre-trained model using the source dataset and use it as a feature extractor. We show that the features extracted for the target dataset can be used to train an effective classification model. Our pre-trained source model consists of a convolutional autoencoder, and the target classification model is a simple multi-layer perceptron model. We explore two different pre-trained source models, trained using different activity groups, and analyze the influence the choice of pre-trained model has over the task of Parkinson's disease classification. Full Article
ter Interpreting Rate-Distortion of Variational Autoencoder and Using Model Uncertainty for Anomaly Detection. (arXiv:2005.01889v2 [cs.LG] UPDATED) By arxiv.org Published On :: Building a scalable machine learning system for unsupervised anomaly detection via representation learning is highly desirable. One of the prevalent methods is using a reconstruction error from variational autoencoder (VAE) via maximizing the evidence lower bound. We revisit VAE from the perspective of information theory to provide some theoretical foundations on using the reconstruction error, and finally arrive at a simpler and more effective model for anomaly detection. In addition, to enhance the effectiveness of detecting anomalies, we incorporate a practical model uncertainty measure into the metric. We show empirically the competitive performance of our approach on benchmark datasets. Full Article
ter Data-Space Inversion Using a Recurrent Autoencoder for Time-Series Parameterization. (arXiv:2005.00061v2 [stat.ML] UPDATED) By arxiv.org Published On :: Data-space inversion (DSI) and related procedures represent a family of methods applicable for data assimilation in subsurface flow settings. These methods differ from model-based techniques in that they provide only posterior predictions for quantities (time series) of interest, not posterior models with calibrated parameters. DSI methods require a large number of flow simulations to first be performed on prior geological realizations. Given observed data, posterior predictions can then be generated directly. DSI operates in a Bayesian setting and provides posterior samples of the data vector. In this work we develop and evaluate a new approach for data parameterization in DSI. Parameterization reduces the number of variables to determine in the inversion, and it maintains the physical character of the data variables. The new parameterization uses a recurrent autoencoder (RAE) for dimension reduction, and a long-short-term memory (LSTM) network to represent flow-rate time series. The RAE-based parameterization is combined with an ensemble smoother with multiple data assimilation (ESMDA) for posterior generation. Results are presented for two- and three-phase flow in a 2D channelized system and a 3D multi-Gaussian model. The RAE procedure, along with existing DSI treatments, are assessed through comparison to reference rejection sampling (RS) results. The new DSI methodology is shown to consistently outperform existing approaches, in terms of statistical agreement with RS results. The method is also shown to accurately capture derived quantities, which are computed from variables considered directly in DSI. This requires correlation and covariance between variables to be properly captured, and accuracy in these relationships is demonstrated. The RAE-based parameterization developed here is clearly useful in DSI, and it may also find application in other subsurface flow problems. Full Article
ter Short-term forecasts of COVID-19 spread across Indian states until 1 May 2020. (arXiv:2004.13538v2 [q-bio.PE] UPDATED) By arxiv.org Published On :: The very first case of corona-virus illness was recorded on 30 January 2020, in India and the number of infected cases, including the death toll, continues to rise. In this paper, we present short-term forecasts of COVID-19 for 28 Indian states and five union territories using real-time data from 30 January to 21 April 2020. Applying Holt's second-order exponential smoothing method and autoregressive integrated moving average (ARIMA) model, we generate 10-day ahead forecasts of the likely number of infected cases and deaths in India for 22 April to 1 May 2020. Our results show that the number of cumulative cases in India will rise to 36335.63 [PI 95% (30884.56, 42918.87)], concurrently the number of deaths may increase to 1099.38 [PI 95% (959.77, 1553.76)] by 1 May 2020. Further, we have divided the country into severity zones based on the cumulative cases. According to this analysis, Maharashtra is likely to be the most affected states with around 9787.24 [PI 95% (6949.81, 13757.06)] cumulative cases by 1 May 2020. However, Kerala and Karnataka are likely to shift from the red zone (i.e. highly affected) to the lesser affected region. On the other hand, Gujarat and Madhya Pradesh will move to the red zone. These results mark the states where lockdown by 3 May 2020, can be loosened. Full Article
ter Excess registered deaths in England and Wales during the COVID-19 pandemic, March 2020 and April 2020. (arXiv:2004.11355v4 [stat.AP] UPDATED) By arxiv.org Published On :: Official counts of COVID-19 deaths have been criticized for potentially including people who did not die of COVID-19 but merely died with COVID-19. I address that critique by fitting a generalized additive model to weekly counts of all registered deaths in England and Wales during the 2010s. The model produces baseline rates of death registrations expected in the absence of the COVID-19 pandemic, and comparing those baselines to recent counts of registered deaths exposes the emergence of excess deaths late in March 2020. Among adults aged 45+, about 38,700 excess deaths were registered in the 5 weeks comprising 21 March through 24 April (612 $pm$ 416 from 21$-$27 March, 5675 $pm$ 439 from 28 March through 3 April, then 9183 $pm$ 468, 12,712 $pm$ 589, and 10,511 $pm$ 567 in April's next 3 weeks). Both the Office for National Statistics's respective count of 26,891 death certificates which mention COVID-19, and the Department of Health and Social Care's hospital-focused count of 21,222 deaths, are appreciably less, implying that their counting methods have underestimated rather than overestimated the pandemic's true death toll. If underreporting rates have held steady, about 45,900 direct and indirect COVID-19 deaths might have been registered by April's end but not yet publicly reported in full. Full Article
ter DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs. (arXiv:1909.13003v4 [cs.LG] UPDATED) By arxiv.org Published On :: A major difficulty of solving continuous POMDPs is to infer the multi-modal distribution of the unobserved true states and to make the planning algorithm dependent on the perceived uncertainty. We cast POMDP filtering and planning problems as two closely related Sequential Monte Carlo (SMC) processes, one over the real states and the other over the future optimal trajectories, and combine the merits of these two parts in a new model named the DualSMC network. In particular, we first introduce an adversarial particle filter that leverages the adversarial relationship between its internal components. Based on the filtering results, we then propose a planning algorithm that extends the previous SMC planning approach [Piche et al., 2018] to continuous POMDPs with an uncertainty-dependent policy. Crucially, not only can DualSMC handle complex observations such as image input but also it remains highly interpretable. It is shown to be effective in three continuous POMDP domains: the floor positioning domain, the 3D light-dark navigation domain, and a modified Reacher domain. Full Article
ter Estimating drift parameters in a non-ergodic Gaussian Vasicek-type model. (arXiv:1909.06155v2 [math.PR] UPDATED) By arxiv.org Published On :: We study the problem of parameter estimation for a non-ergodic Gaussian Vasicek-type model defined as $dX_t=(mu+ heta X_t)dt+dG_t, tgeq0$ with unknown parameters $ heta>0$ and $muinR$, where $G$ is a Gaussian process. We provide least square-type estimators $widetilde{ heta}_T$ and $widetilde{mu}_T$ respectively for the drift parameters $ heta$ and $mu$ based on continuous-time observations ${X_t, tin[0,T]}$ as $T ightarrowinfty$. Our aim is to derive some sufficient conditions on the driving Gaussian process $G$ in order to ensure that $widetilde{ heta}_T$ and $widetilde{mu}_T$ are strongly consistent, the limit distribution of $widetilde{ heta}_T$ is a Cauchy-type distribution and $widetilde{mu}_T$ is asymptotically normal. We apply our result to fractional Vasicek, subfractional Vasicek and bifractional Vasicek processes. In addition, this work extends the result of cite{EEO} studied in the case where $mu=0$. Full Article
ter Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes. (arXiv:1212.4137v2 [stat.ML] UPDATED) By arxiv.org Published On :: Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector; these are obtained by combining the following factors: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two different ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via a natural alternating maximization (AM) method. We show the the AM method is nontrivially equivalent to GPower (Journ'{e}e et al; JMLR 11:517--553, 2010) for all our formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at i) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), ii) obtaining solutions explaining more variance and iii) dealing with big data problems (our cluster code is able to solve a 357 GB problem in about a minute). Full Article
ter Visualisation and knowledge discovery from interpretable models. (arXiv:2005.03632v1 [cs.LG]) By arxiv.org Published On :: Increasing number of sectors which affect human lives, are using Machine Learning (ML) tools. Hence the need for understanding their working mechanism and evaluating their fairness in decision-making, are becoming paramount, ushering in the era of Explainable AI (XAI). In this contribution we introduced a few intrinsically interpretable models which are also capable of dealing with missing values, in addition to extracting knowledge from the dataset and about the problem. These models are also capable of visualisation of the classifier and decision boundaries: they are the angle based variants of Learning Vector Quantization. We have demonstrated the algorithms on a synthetic dataset and a real-world one (heart disease dataset from the UCI repository). The newly developed classifiers helped in investigating the complexities of the UCI dataset as a multiclass problem. The performance of the developed classifiers were comparable to those reported in literature for this dataset, with additional value of interpretability, when the dataset was treated as a binary class problem. Full Article
ter Know Your Clients' behaviours: a cluster analysis of financial transactions. (arXiv:2005.03625v1 [econ.EM]) By arxiv.org Published On :: In Canada, financial advisors and dealers by provincial securities commissions, and those self-regulatory organizations charged with direct regulation over investment dealers and mutual fund dealers, respectively to collect and maintain Know Your Client (KYC) information, such as their age or risk tolerance, for investor accounts. With this information, investors, under their advisor's guidance, make decisions on their investments which are presumed to be beneficial to their investment goals. Our unique dataset is provided by a financial investment dealer with over 50,000 accounts for over 23,000 clients. We use a modified behavioural finance recency, frequency, monetary model for engineering features that quantify investor behaviours, and machine learning clustering algorithms to find groups of investors that behave similarly. We show that the KYC information collected does not explain client behaviours, whereas trade and transaction frequency and volume are most informative. We believe the results shown herein encourage financial regulators and advisors to use more advanced metrics to better understand and predict investor behaviours. Full Article
ter Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach. (arXiv:2005.03582v1 [cs.LG]) By arxiv.org Published On :: Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making addressed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted in order to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed by means of different resampling methods. The results were analyzed by means of classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient in comparison with other approaches. Full Article
ter Transfer Learning for sEMG-based Hand Gesture Classification using Deep Learning in a Master-Slave Architecture. (arXiv:2005.03460v1 [eess.SP]) By arxiv.org Published On :: Recent advancements in diagnostic learning and development of gesture-based human machine interfaces have driven surface electromyography (sEMG) towards significant importance. Analysis of hand gestures requires an accurate assessment of sEMG signals. The proposed work presents a novel sequential master-slave architecture consisting of deep neural networks (DNNs) for classification of signs from the Indian sign language using signals recorded from multiple sEMG channels. The performance of the master-slave network is augmented by leveraging additional synthetic feature data generated by long short term memory networks. Performance of the proposed network is compared to that of a conventional DNN prior to and after the addition of synthetic data. Up to 14% improvement is observed in the conventional DNN and up to 9% improvement in master-slave network on addition of synthetic data with an average accuracy value of 93.5% asserting the suitability of the proposed approach. Full Article
ter Interpreting Deep Models through the Lens of Data. (arXiv:2005.03442v1 [cs.LG]) By arxiv.org Published On :: Identification of input data points relevant for the classifier (i.e. serve as the support vector) has recently spurred the interest of researchers for both interpretability as well as dataset debugging. This paper presents an in-depth analysis of the methods which attempt to identify the influence of these data points on the resulting classifier. To quantify the quality of the influence, we curated a set of experiments where we debugged and pruned the dataset based on the influence information obtained from different methods. To do so, we provided the classifier with mislabeled examples that hampered the overall performance. Since the classifier is a combination of both the data and the model, therefore, it is essential to also analyze these influences for the interpretability of deep learning models. Analysis of the results shows that some interpretability methods can detect mislabels better than using a random approach, however, contrary to the claim of these methods, the sample selection based on the training loss showed a superior performance. Full Article
ter Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion. (arXiv:2005.03419v1 [stat.ML]) By arxiv.org Published On :: In the variational relevance vector machine, the gamma distribution is representative as a hyperprior over the noise precision of automatic relevance determination prior. Instead of the gamma hyperprior, we propose to use the inverse gamma hyperprior with a shape parameter close to zero and a scale parameter not necessary close to zero. This hyperprior is associated with the concept of a weakly informative prior. The effect of this hyperprior is investigated through regression to non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged for input data. We confirm that the degrees of freedom in a model is controlled by adjusting the scale parameter and keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However the estimated model using this criterion seems to cause over-fitting. This is because the multiple kernel method makes the model a situation where the dimension of the model is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended prediction information criterion. It is confirmed that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter minimizing extended prediction information criterion. Full Article
ter A Locally Adaptive Interpretable Regression. (arXiv:2005.03350v1 [stat.ML]) By arxiv.org Published On :: Machine learning models with both good predictability and high interpretability are crucial for decision support systems. Linear regression is one of the most interpretable prediction models. However, the linearity in a simple linear regression worsens its predictability. In this work, we introduce a locally adaptive interpretable regression (LoAIR). In LoAIR, a metamodel parameterized by neural networks predicts percentile of a Gaussian distribution for the regression coefficients for a rapid adaptation. Our experimental results on public benchmark datasets show that our model not only achieves comparable or better predictive performance than the other state-of-the-art baselines but also discovers some interesting relationships between input and target variables such as a parabolic relationship between CO2 emissions and Gross National Product (GNP). Therefore, LoAIR is a step towards bridging the gap between econometrics, statistics, and machine learning by improving the predictive ability of linear regression without depreciating its interpretability. Full Article
ter Fractional ridge regression: a fast, interpretable reparameterization of ridge regression. (arXiv:2005.03220v1 [stat.ME]) By arxiv.org Published On :: Ridge regression (RR) is a regularization technique that penalizes the L2-norm of the coefficients in linear regression. One of the challenges of using RR is the need to set a hyperparameter ($alpha$) that controls the amount of regularization. Cross-validation is typically used to select the best $alpha$ from a set of candidates. However, efficient and appropriate selection of $alpha$ can be challenging, particularly where large amounts of data are analyzed. Because the selected $alpha$ depends on the scale of the data and predictors, it is not straightforwardly interpretable. Here, we propose to reparameterize RR in terms of the ratio $gamma$ between the L2-norms of the regularized and unregularized coefficients. This approach, called fractional RR (FRR), has several benefits: the solutions obtained for different $gamma$ are guaranteed to vary, guarding against wasted calculations, and automatically span the relevant range of regularization, avoiding the need for arduous manual exploration. We provide an algorithm to solve FRR, as well as open-source software implementations in Python and MATLAB (https://github.com/nrdg/fracridge). We show that the proposed method is fast and scalable for large-scale data problems, and delivers results that are straightforward to interpret and compare across models and datasets. Full Article
ter Efficient Characterization of Dynamic Response Variation Using Multi-Fidelity Data Fusion through Composite Neural Network. (arXiv:2005.03213v1 [stat.ML]) By arxiv.org Published On :: Uncertainties in a structure is inevitable, which generally lead to variation in dynamic response predictions. For a complex structure, brute force Monte Carlo simulation for response variation analysis is infeasible since one single run may already be computationally costly. Data driven meta-modeling approaches have thus been explored to facilitate efficient emulation and statistical inference. The performance of a meta-model hinges upon both the quality and quantity of training dataset. In actual practice, however, high-fidelity data acquired from high-dimensional finite element simulation or experiment are generally scarce, which poses significant challenge to meta-model establishment. In this research, we take advantage of the multi-level response prediction opportunity in structural dynamic analysis, i.e., acquiring rapidly a large amount of low-fidelity data from reduced-order modeling, and acquiring accurately a small amount of high-fidelity data from full-scale finite element analysis. Specifically, we formulate a composite neural network fusion approach that can fully utilize the multi-level, heterogeneous datasets obtained. It implicitly identifies the correlation of the low- and high-fidelity datasets, which yields improved accuracy when compared with the state-of-the-art. Comprehensive investigations using frequency response variation characterization as case example are carried out to demonstrate the performance. Full Article
ter Fair Algorithms for Hierarchical Agglomerative Clustering. (arXiv:2005.03197v1 [cs.LG]) By arxiv.org Published On :: Hierarchical Agglomerative Clustering (HAC) algorithms are extensively utilized in modern data science and machine learning, and seek to partition the dataset into clusters while generating a hierarchical relationship between the data samples themselves. HAC algorithms are employed in a number of applications, such as biology, natural language processing, and recommender systems. Thus, it is imperative to ensure that these algorithms are fair-- even if the dataset contains biases against certain protected groups, the cluster outputs generated should not be discriminatory against samples from any of these groups. However, recent work in clustering fairness has mostly focused on center-based clustering algorithms, such as k-median and k-means clustering. Therefore, in this paper, we propose fair algorithms for performing HAC that enforce fairness constraints 1) irrespective of the distance linkage criteria used, 2) generalize to any natural measures of clustering fairness for HAC, 3) work for multiple protected groups, and 4) have competitive running times to vanilla HAC. To the best of our knowledge, this is the first work that studies fairness for HAC algorithms. We also propose an algorithm with lower asymptotic time complexity than HAC algorithms that can rectify existing HAC outputs and make them subsequently fair as a result. Moreover, we carry out extensive experiments on multiple real-world UCI datasets to demonstrate the working of our algorithms. Full Article
ter Entries open for $40,000 award for female scriptwriters By feedproxy.google.com Published On :: Thu, 05 Mar 2020 23:11:18 +0000 Friday 6 March 2020 Nominations opened for the 2020 Mona Brand Award for Women Stage and Screen Writers. Full Article
ter Shortlists announced for 2020 NSW Premier’s Literary Awards By feedproxy.google.com Published On :: Thu, 19 Mar 2020 21:24:32 +0000 Friday 20 March 2020 Contemporary works by leading and emerging Australian writers have been shortlisted for the 2020 NSW Premier's Literary Awards, the State Library of NSW announced today. Full Article
ter 2020 NSW Premier’s Literary Awards announced By feedproxy.google.com Published On :: Sat, 25 Apr 2020 01:29:17 +0000 Sunday 26 April 2020 A total of $295,000 awarded across 12 prize categories. Full Article
ter ManifoldOptim: An R Interface to the ROPTLIB Library for Riemannian Manifold Optimization By www.jstatsoft.org Published On :: Sat, 18 Apr 2020 03:35:08 +0000 Manifold optimization appears in a wide variety of computational problems in the applied sciences. In recent statistical methodologies such as sufficient dimension reduction and regression envelopes, estimation relies on the optimization of likelihood functions over spaces of matrices such as the Stiefel or Grassmann manifolds. Recently, Huang, Absil, Gallivan, and Hand (2016) have introduced the library ROPTLIB, which provides a framework and state of the art algorithms to optimize real-valued objective functions over commonly used matrix-valued Riemannian manifolds. This article presents ManifoldOptim, an R package that wraps the C++ library ROPTLIB. ManifoldOptim enables users to access functionality in ROPTLIB through R so that optimization problems can easily be constructed, solved, and integrated into larger R codes. Computationally intensive problems can be programmed with Rcpp and RcppArmadillo, and otherwise accessed through R. We illustrate the practical use of ManifoldOptim through several motivating examples involving dimension reduction and envelope methods in regression. Full Article
ter Anxiety and compassion: emotions and the surgical encounter in early 19th-century Britain By blog.wellcomelibrary.org Published On :: Thu, 02 Nov 2017 12:49:06 +0000 The next seminar in the 2017–18 History of Pre-Modern Medicine seminar series takes place on Tuesday 7 November. Speaker: Dr Michael Brown (University of Roehampton), ‘Anxiety and compassion: emotions and the surgical encounter in early 19th-century Britain’ The historical study of the… Continue reading Full Article Early Medicine Events and Visits 19th century emotions seminars surgery
ter Close encounters: a manuscripts workshop By blog.wellcomelibrary.org Published On :: Mon, 23 Apr 2018 15:18:54 +0000 A free manuscripts workshop for PhD students at Wellcome Collection, 01 June 2018 Engaging with an artefact from the past is often a powerful experience, eliciting emotional and sensory, as well as analytical, responses. Researchers in the library at Wellcome… Continue reading Full Article Early Medicine Events and Visits emotions manuscripts materiality senses study visits
ter 2020 NSW Premier’s Literary Awards announced By feedproxy.google.com Published On :: Sat, 25 Apr 2020 01:12:46 +0000 A total of $295,000 awarded across 12 prize categories. Full Article
ter Water hyacinth : a potential lignocellulosic biomass for bioethanol By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Sharma, Anuja, authorCallnumber: OnlineISBN: 9783030356323 (electronic bk.) Full Article
ter The interaction of food industry and environment By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780128175156 (electronic bk.) Full Article