anc Branch and Bound for Piecewise Linear Neural Network Verification By Published On :: 2020 The success of Deep Learning and its potential use in many safety-critical applications has motivated research on formal verification of Neural Network (NN) models. In this context, verification involves proving or disproving that an NN model satisfies certain input-output properties. Despite the reputation of learned NN models as black boxes, and the theoretical hardness of proving useful properties about them, researchers have been successful in verifying some classes of models by exploiting their piecewise linear structure and taking insights from formal methods such as Satisfiability Modulo Theory. However, these methods are still far from scaling to realistic neural networks. To facilitate progress in this crucial area, we exploit the Mixed Integer Linear Programming (MIP) formulation of verification to propose a family of algorithms based on Branch-and-Bound (BaB). We show that our family contains previous verification methods as special cases. With the help of the BaB framework, we make three key contributions. Firstly, we identify new methods that combine the strengths of multiple existing approaches, accomplishing significant performance improvements over previous state of the art. Secondly, we introduce an effective branching strategy on ReLU non-linearities. This branching strategy allows us to efficiently and successfully deal with high input dimensional problems with convolutional network architecture, on which previous methods fail frequently. Finally, we propose comprehensive test data sets and benchmarks which include a collection of previously released test cases. We use the data sets to conduct a thorough experimental comparison of existing and new algorithms and to provide an inclusive analysis of the factors impacting the hardness of verification problems. Full Article
anc Ancestral Gumbel-Top-k Sampling for Sampling Without Replacement By Published On :: 2020 We develop ancestral Gumbel-Top-$k$ sampling: a generic and efficient method for sampling without replacement from discrete-valued Bayesian networks, which includes multivariate discrete distributions, Markov chains and sequence models. The method uses an extension of the Gumbel-Max trick to sample without replacement by finding the top $k$ of perturbed log-probabilities among all possible configurations of a Bayesian network. Despite the exponentially large domain, the algorithm has a complexity linear in the number of variables and sample size $k$. Our algorithm allows setting the number of parallel processors $m$, to trade off the number of iterations versus the total cost (iterations times $m$) of running the algorithm. For $m = 1$ the algorithm has minimum total cost, whereas for $m = k$ the number of iterations is minimized, and the resulting algorithm is known as Stochastic Beam Search. We provide extensions of the algorithm and discuss a number of related algorithms. We analyze the properties of ancestral Gumbel-Top-$k$ sampling and compare against alternatives on randomly generated Bayesian networks with different levels of connectivity. In the context of (deep) sequence models, we show its use as a method to generate diverse but high-quality translations and statistical estimates of translation quality and entropy. Full Article
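The Gumbel-Top-$k$ idea mentioned in the entry above can be illustrated on a single flat categorical distribution. The sketch below is only that basic building block, not the ancestral variant for Bayesian networks that the paper develops; the function name `gumbel_top_k` and its parameters are hypothetical and chosen for illustration.

```python
import math
import random


def gumbel_top_k(log_probs, k, rng=random):
    """Sample k distinct indices without replacement (illustrative helper).

    Each log-probability is perturbed with independent Gumbel(0, 1) noise
    and the indices of the k largest perturbed values are returned.
    """
    if not 0 <= k <= len(log_probs):
        raise ValueError("k must be between 0 and len(log_probs)")
    perturbed = []
    for index, log_p in enumerate(log_probs):
        u = rng.random()
        while u == 0.0:  # guard against log(0)
            u = rng.random()
        gumbel = -math.log(-math.log(u))
        perturbed.append((log_p + gumbel, index))
    perturbed.sort(reverse=True)  # largest perturbed log-probability first
    return [index for _, index in perturbed[:k]]
```

With `k = 1` this reduces to the ordinary Gumbel-Max trick; with `k = len(log_probs)` it returns a random ordering of all indices.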
anc Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions By Published On :: 2020 We consider the standard model of distributed optimization of a sum of functions $F(\mathbf z) = \sum_{i=1}^n f_i(\mathbf z)$, where node $i$ in a network holds the function $f_i(\mathbf z)$. We allow for a harsh network model characterized by asynchronous updates, message delays, unpredictable message losses, and directed communication among nodes. In this setting, we analyze a modification of the Gradient-Push method for distributed optimization, assuming that (i) node $i$ is capable of generating gradients of its function $f_i(\mathbf z)$ corrupted by zero-mean bounded-support additive noise at each step, (ii) $F(\mathbf z)$ is strongly convex, and (iii) each $f_i(\mathbf z)$ has Lipschitz gradients. We show that our proposed method asymptotically performs as well as the best bounds on centralized gradient descent that takes steps in the direction of the sum of the noisy gradients of all the functions $f_1(\mathbf z), \ldots, f_n(\mathbf z)$ at each step. Full Article
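Gradient-Push builds on push-sum averaging over a directed graph. The sketch below shows only that synchronous, loss-free averaging step, without gradients, delays, or asynchrony, so it is far simpler than the robust method analysed in the entry above. The function name `push_sum_average` and the example graph are hypothetical.

```python
def push_sum_average(values, out_neighbors, iterations=200):
    """Synchronous push-sum on a directed graph (illustrative sketch).

    values[i] is the initial value held by node i and out_neighbors[i]
    lists the nodes that i can send to. Each node splits its numerator x
    and weight w equally among itself and its out-neighbours; on a
    strongly connected graph the ratio x / w at every node approaches the
    average of the initial values.
    """
    n = len(values)
    x = [float(v) for v in values]  # push-sum numerators
    w = [1.0] * n                   # push-sum weights
    for _ in range(iterations):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            targets = [i] + list(out_neighbors[i])  # node keeps one share
            share = 1.0 / len(targets)
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    return [xi / wi for xi, wi in zip(x, w)]
```

For example, `push_sum_average([1.0, 2.0, 6.0], [[1, 2], [2], [0]])` runs on a three-node directed graph whose initial values average to 3.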
anc (1 + epsilon)-class Classification: an Anomaly Detection Method for Highly Imbalanced or Incomplete Data Sets By Published On :: 2020 Anomaly detection is not an easy problem since distribution of anomalous samples is unknown a priori. We explore a novel method that gives a trade-off possibility between one-class and two-class approaches, and leads to a better performance on anomaly detection problems with small or non-representative anomalous samples. The method is evaluated using several data sets and compared to a set of conventional one-class and two-class approaches. Full Article
anc Identifiability of Additive Noise Models Using Conditional Variances By Published On :: 2020 This paper considers a new identifiability condition for additive noise models (ANMs) in which each variable is determined by an arbitrary Borel measurable function of its parents plus an independent error. It has been shown that ANMs are fully recoverable under some identifiability conditions, such as when all error variances are equal. However, this identifiable condition could be restrictive, and hence, this paper focuses on a relaxed identifiability condition that involves not only error variances, but also the influence of parents. This new class of identifiable ANMs does not put any constraints on the form of dependencies, or distributions of errors, and allows different error variances. It further provides a statistically consistent and computationally feasible structure learning algorithm for the identifiable ANMs based on the new identifiability condition. The proposed algorithm assumes that all relevant variables are observed, while it does not assume faithfulness or a sparse graph. Demonstrated through extensive simulated and real multivariate data is that the proposed algorithm successfully recovers directed acyclic graphs. Full Article
anc Branching random walks with uncountably many extinction probability vectors By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Daniela Bertacchi, Fabio Zucca. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 426--438. Abstract: Given a branching random walk on a set $X$, we study its extinction probability vectors $\mathbf{q}(\cdot,A)$. Their components are the probability that the process goes extinct in a fixed $A\subseteq X$, when starting from a vertex $x\in X$. The set of extinction probability vectors (obtained letting $A$ vary among all subsets of $X$) is a subset of the set of the fixed points of the generating function of the branching random walk. In particular here we are interested in the cardinality of the set of extinction probability vectors. We prove results which allow us to understand whether the probability of extinction in a set $A$ is different from the one of extinction in another set $B$. In many cases there are only two possible extinction probability vectors and so far, in more complicated examples, only a finite number of distinct extinction probability vectors had been explicitly found. Whether a branching random walk could have an infinite number of distinct extinction probability vectors was not known. We apply our results to construct examples of branching random walks with uncountably many distinct extinction probability vectors. Full Article
anc Keeping the balance—Bridge sampling for marginal likelihood estimation in finite mixture, mixture of experts and Markov mixture models By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Sylvia Frühwirth-Schnatter. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 706--733.Abstract: Finite mixture models and their extensions to Markov mixture and mixture of experts models are very popular in analysing data of various kind. A challenge for these models is choosing the number of components based on marginal likelihoods. The present paper suggests two innovative, generic bridge sampling estimators of the marginal likelihood that are based on constructing balanced importance densities from the conditional densities arising during Gibbs sampling. The full permutation bridge sampling estimator is derived from considering all possible permutations of the mixture labels for a subset of these densities. For the double random permutation bridge sampling estimator, two levels of random permutations are applied, first to permute the labels of the MCMC draws and second to randomly permute the labels of the conditional densities arising during Gibbs sampling. Various applications show very good performance of these estimators in comparison to importance and to reciprocal importance sampling estimators derived from the same importance densities. Full Article
anc Necessary and sufficient conditions for the convergence of the consistent maximal displacement of the branching random walk By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Bastien Mallein. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 356--373. Abstract: Consider a supercritical branching random walk on the real line. The consistent maximal displacement is the smallest of the distances between the trajectories followed by individuals at the $n$th generation and the boundary of the process. Fang and Zeitouni, and Faraud, Hu and Shi proved that under some integrability conditions, the consistent maximal displacement grows almost surely at rate $\lambda^{*}n^{1/3}$ for some explicit constant $\lambda^{*}$. We obtain here a necessary and sufficient condition for this asymptotic behaviour to hold. Full Article
anc Reclaiming indigenous governance : reflections and insights from Australia, Canada, New Zealand, and the United States By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Callnumber: K 3247 R43 2019 ISBN: 9780816539970 (paperback) Full Article
anc Pitfalls of significance testing and $p$-value variability: An econometrics perspective By projecteuclid.org Published On :: Wed, 03 Oct 2018 22:00 EDT Norbert Hirschauer, Sven Grüner, Oliver Mußhoff, Claudia Becker. Source: Statistics Surveys, Volume 12, 136--172.Abstract: Data on how many scientific findings are reproducible are generally bleak and a wealth of papers have warned against misuses of the $p$-value and resulting false findings in recent years. This paper discusses the question of what we can(not) learn from the $p$-value, which is still widely considered as the gold standard of statistical validity. We aim to provide a non-technical and easily accessible resource for statistical practitioners who wish to spot and avoid misinterpretations and misuses of statistical significance tests. For this purpose, we first classify and describe the most widely discussed (“classical”) pitfalls of significance testing, and review published work on these misuses with a focus on regression-based “confirmatory” study. This includes a description of the single-study bias and a simulation-based illustration of how proper meta-analysis compares to misleading significance counts (“vote counting”). Going beyond the classical pitfalls, we also use simulation to provide intuition that relying on the statistical estimate “$p$-value” as a measure of evidence without considering its sample-to-sample variability falls short of the mark even within an otherwise appropriate interpretation. We conclude with a discussion of the exigencies of informed approaches to statistical inference and corresponding institutional reforms. Full Article
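The sample-to-sample variability of the $p$-value discussed in the entry above can be made concrete with a small simulation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: data are drawn from a normal distribution with known unit variance, and a two-sided z-test of a zero mean is used. The function name `simulate_p_values` and its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_p_values(effect, n, replications, seed=0):
    """Two-sided z-test p-values for repeated samples from N(effect, 1)."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    p_values = []
    for _ in range(replications):
        sample_mean = sum(rng.gauss(effect, 1.0) for _ in range(n)) / n
        z = sample_mean * math.sqrt(n)  # variance is assumed known and equal to 1
        p_values.append(2.0 * (1.0 - standard_normal.cdf(abs(z))))
    return p_values
```

With `effect=0.5` and `n=20` the analytical power of this test at the 5% level is about 0.61, so identical replications of the same study land on either side of the significance threshold in a sizeable share of cases.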
anc Adaptive clinical trial designs for phase I cancer studies By projecteuclid.org Published On :: Thu, 29 May 2014 09:11 EDT Oleksandr Sverdlov, Weng Kee Wong, Yevgen Ryeznik. Source: Statistics Surveys, Volume 8, 2--44.Abstract: Adaptive clinical trials are becoming increasingly popular research designs for clinical investigation. Adaptive designs are particularly useful in phase I cancer studies where clinical data are scant and the goals are to assess the drug dose-toxicity profile and to determine the maximum tolerated dose while minimizing the number of study patients treated at suboptimal dose levels. In the current work we give an overview of adaptive design methods for phase I cancer trials. We find that modern statistical literature is replete with novel adaptive designs that have clearly defined objectives and established statistical properties, and are shown to outperform conventional dose finding methods such as the 3+3 design, both in terms of statistical efficiency and in terms of minimizing the number of patients treated at highly toxic or nonefficacious doses. We discuss statistical, logistical, and regulatory aspects of these designs and present some links to non-commercial statistical software for implementing these methods in practice. Full Article
anc Was one of your ancestors a whaler? By feedproxy.google.com Published On :: Mon, 31 Jul 2017 06:25:29 +0000 Whaling – along with wool production – was one of the first primary industries after the establishment of New South Wa Full Article
anc Was your ancestor a doctor? By feedproxy.google.com Published On :: Mon, 31 Jul 2017 22:58:54 +0000 A register of medical practitioners was first required to be kept in 1838 in New South Wales and was published in the G Full Article
anc A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging. (arXiv:2004.12314v3 [cs.CV] UPDATED) By arxiv.org Published On :: Segmentation of cardiac images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) widely used for visualizing diseased cardiac structures, is a crucial first step for clinical diagnosis and treatment. However, direct segmentation of LGE-MRIs is challenging due to their attenuated contrast. Since most clinical studies have relied on manual and labor-intensive approaches, automatic methods are of high interest, particularly optimized machine learning approaches. To address this, we organized the "2018 Left Atrium Segmentation Challenge" using 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset, and associated labels of the left atrium segmented by three medical experts, ultimately attracting the participation of 27 international teams. In this paper, extensive analysis of the submitted algorithms using technical and biological metrics was performed through subgroup analysis and hyper-parameter analysis, offering an overall picture of the major design choices of convolutional neural networks (CNNs) and practical considerations for achieving state-of-the-art left atrium segmentation. Results show the top method achieved a Dice score of 93.2% and a mean surface-to-surface distance of 0.7 mm, significantly outperforming prior state-of-the-art. Particularly, our analysis demonstrated that double, sequentially used CNNs, in which a first CNN is used for automatic region-of-interest localization and a subsequent CNN is used for refined regional segmentation, achieved far better results than traditional methods and pipelines containing single CNNs. This large-scale benchmarking study makes a significant step towards much-improved segmentation methods for cardiac LGE-MRIs, and will serve as an important benchmark for evaluating and comparing the future works in the field. Full Article
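The Dice score reported in the entry above measures the overlap between a predicted and a reference segmentation. The sketch below computes it for two collections of voxel identifiers; the function name `dice_score` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the source.

```python
def dice_score(predicted, reference):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two sets of voxel ids."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # assumed convention: two empty masks agree perfectly
    overlap = len(predicted & reference)
    return 2.0 * overlap / (len(predicted) + len(reference))
```

For instance, masks `{1, 2, 3}` and `{2, 3, 4}` share two of their three voxels each and give a score of 2·2/6, roughly 0.667.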
anc On the impact of selected modern deep-learning techniques to the performance and celerity of classification models in an experimental high-energy physics use case. (arXiv:2002.01427v3 [physics.data-an] UPDATED) By arxiv.org Published On :: Beginning from a basic neural-network architecture, we test the potential benefits offered by a range of advanced techniques for machine learning, in particular deep learning, in the context of a typical classification problem encountered in the domain of high-energy physics, using a well-studied dataset: the 2014 Higgs ML Kaggle dataset. The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models. Techniques examined include domain-specific data-augmentation, learning rate and momentum scheduling, (advanced) ensembling in both model-space and weight-space, and alternative architectures and connection methods. Following the investigation, we arrive at a model which achieves equal performance to the winning solution of the original Kaggle challenge, whilst being significantly quicker to train and apply, and being suitable for use with both GPU and CPU hardware setups. These reductions in timing and hardware requirements potentially allow the use of more powerful algorithms in HEP analyses, where models must be retrained frequently, sometimes at short notice, by small groups of researchers with limited hardware resources. Additionally, a new wrapper library for PyTorch called LUMIN is presented, which incorporates all of the techniques studied. Full Article
anc Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space. (arXiv:1912.02400v2 [cs.LG] UPDATED) By arxiv.org Published On :: We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diversity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains. Full Article
anc $V$-statistics and Variance Estimation. (arXiv:1912.01089v2 [stat.ML] UPDATED) By arxiv.org Published On :: This paper develops a general framework for analyzing asymptotics of $V$-statistics. Previous literature on limiting distribution mainly focuses on the cases when $n \to \infty$ with fixed kernel size $k$. Under some regularity conditions, we demonstrate asymptotic normality when $k$ grows with $n$ by utilizing existing results for $U$-statistics. The key in our approach lies in a mathematical reduction to $U$-statistics by designing an equivalent kernel for $V$-statistics. We also provide a unified treatment on variance estimation for both $U$- and $V$-statistics by observing connections to existing methods and proposing an empirically more accurate estimator. Ensemble methods such as random forests, where multiple base learners are trained and aggregated for prediction purposes, serve as a running example throughout the paper because they are a natural and flexible application of $V$-statistics. Full Article
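The difference between $U$- and $V$-statistics can be seen with a fixed kernel of size $k = 2$. This textbook case is much simpler than the growing-$k$ setting of the entry above. The sketch uses the kernel $h(x, y) = (x - y)^2 / 2$, for which the $U$-statistic is the unbiased sample variance and the $V$-statistic is the variance with divisor $n$; all function names are hypothetical.

```python
from itertools import permutations, product


def kernel(x, y):
    """Symmetric kernel whose expectation is the variance."""
    return 0.5 * (x - y) ** 2


def v_statistic(data):
    """Average the kernel over all n**2 ordered pairs, repeats included."""
    n = len(data)
    return sum(kernel(x, y) for x, y in product(data, repeat=2)) / n ** 2


def u_statistic(data):
    """Average the kernel over the n * (n - 1) ordered pairs of distinct positions."""
    n = len(data)
    return sum(kernel(x, y) for x, y in permutations(data, 2)) / (n * (n - 1))
```

For `data = [1.0, 2.0, 3.0, 4.0]` the squared deviations from the mean sum to 5, so the $V$-statistic is 5/4 and the $U$-statistic is 5/3.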
anc Convergence rates for optimised adaptive importance samplers. (arXiv:1903.12044v4 [stat.CO] UPDATED) By arxiv.org Published On :: Adaptive importance samplers are adaptive Monte Carlo algorithms to estimate expectations with respect to some target distribution which \textit{adapt} themselves to obtain better estimators over a sequence of iterations. Although it is straightforward to show that they have the same $\mathcal{O}(1/\sqrt{N})$ convergence rate as standard importance samplers, where $N$ is the number of Monte Carlo samples, the behaviour of adaptive importance samplers over the number of iterations has been left relatively unexplored. In this work, we investigate an adaptation strategy based on convex optimisation which leads to a class of adaptive importance samplers termed \textit{optimised adaptive importance samplers} (OAIS). These samplers rely on the iterative minimisation of the $\chi^2$-divergence between an exponential-family proposal and the target. The analysed algorithms are closely related to the class of adaptive importance samplers which minimise the variance of the weight function. We first prove non-asymptotic error bounds for the mean squared errors (MSEs) of these algorithms, which explicitly depend on the number of iterations and the number of samples together. The non-asymptotic bounds derived in this paper imply that when the target belongs to the exponential family, the $L_2$ errors of the optimised samplers converge to the optimal rate of $\mathcal{O}(1/\sqrt{N})$ and the rate of convergence in the number of iterations is explicitly provided. When the target does not belong to the exponential family, the rate of convergence is the same but the asymptotic $L_2$ error increases by a factor $\sqrt{\rho^\star} > 1$, where $\rho^\star - 1$ is the minimum $\chi^2$-divergence between the target and an exponential-family proposal. Full Article
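The link between the $\chi^2$-divergence and the variance of the importance weights, which the entry above relies on, follows from a standard identity. The notation below ($\pi$ for a normalised target density, $q_\theta$ for the proposal, $w_\theta$ for the weight, $\rho(\theta)$ for its second moment) is assumed here for illustration and may differ from the paper's own.

```latex
\begin{align*}
  w_\theta(x) &= \frac{\pi(x)}{q_\theta(x)}, \qquad
  \mathbb{E}_{q_\theta}[w_\theta] = \int \pi(x)\,\mathrm{d}x = 1, \\
  \rho(\theta) &= \mathbb{E}_{q_\theta}\!\left[w_\theta^2\right]
               = \int \frac{\pi(x)^2}{q_\theta(x)}\,\mathrm{d}x, \\
  \chi^2(\pi \,\|\, q_\theta) &= \int \frac{\bigl(\pi(x) - q_\theta(x)\bigr)^2}{q_\theta(x)}\,\mathrm{d}x
               = \rho(\theta) - 1
               = \operatorname{Var}_{q_\theta}(w_\theta).
\end{align*}
```

Under this notation, minimising the $\chi^2$-divergence over $\theta$ is the same as minimising the weight variance, and $\rho^\star = \min_\theta \rho(\theta)$ satisfies $\rho^\star \ge 1$, with equality only when some proposal in the family matches the target.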
anc Multi-scale analysis of lead-lag relationships in high-frequency financial markets. (arXiv:1708.03992v3 [stat.ME] UPDATED) By arxiv.org Published On :: We propose a novel estimation procedure for scale-by-scale lead-lag relationships of financial assets observed at high-frequency in a non-synchronous manner. The proposed estimation procedure does not require any interpolation processing of original datasets and is applicable to those with highest time resolution available. Consistency of the proposed estimators is shown under the continuous-time framework that has been developed in our previous work Hayashi and Koike (2018). An empirical application to a quote dataset of the NASDAQ-100 assets identifies two types of lead-lag relationships at different time scales. Full Article
anc Know Your Clients' behaviours: a cluster analysis of financial transactions. (arXiv:2005.03625v1 [econ.EM]) By arxiv.org Published On :: In Canada, financial advisors and dealers are required by provincial securities commissions, and by the self-regulatory organizations charged with direct regulation over investment dealers and mutual fund dealers respectively, to collect and maintain Know Your Client (KYC) information, such as age or risk tolerance, for investor accounts. With this information, investors, under their advisor's guidance, make decisions on their investments which are presumed to be beneficial to their investment goals. Our unique dataset is provided by a financial investment dealer with over 50,000 accounts for over 23,000 clients. We use a modified behavioural finance recency, frequency, monetary model for engineering features that quantify investor behaviours, and machine learning clustering algorithms to find groups of investors that behave similarly. We show that the KYC information collected does not explain client behaviours, whereas trade and transaction frequency and volume are most informative. We believe the results shown herein encourage financial regulators and advisors to use more advanced metrics to better understand and predict investor behaviours. Full Article
anc Domain Adaptation in Highly Imbalanced and Overlapping Datasets. (arXiv:2005.03585v1 [cs.LG]) By arxiv.org Published On :: In many Machine Learning domains, datasets are characterized by highly imbalanced and overlapping classes. Particularly in the medical domain, a specific list of symptoms can be labeled as one of various different conditions. Some of these conditions may be more prevalent than others by several orders of magnitude. Here we present a novel unsupervised Domain Adaptation scheme for such datasets. The scheme, based on a specific type of Quantification, is designed to work under both label and conditional shifts. It is demonstrated on datasets generated from Electronic Health Records and provides high quality results for both Quantification and Domain Adaptation in very challenging scenarios. Potential benefits of using this scheme in the current COVID-19 outbreak, for estimation of prevalence and probability of infection, are discussed. Full Article
anc Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach. (arXiv:2005.03582v1 [cs.LG]) By arxiv.org Published On :: Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making addressed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted in order to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed by means of different resampling methods. The results were analyzed by means of classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient in comparison with other approaches. Full Article
anc On unbalanced data and common shock models in stochastic loss reserving. (arXiv:2005.03500v1 [q-fin.RM]) By arxiv.org Published On :: Introducing common shocks is a popular dependence modelling approach, with some recent applications in loss reserving. The main advantage of this approach is the ability to capture structural dependence coming from known relationships. In addition, it helps with the parsimonious construction of correlation matrices of large dimensions. However, complications arise in the presence of "unbalanced data", that is, when (expected) magnitude of observations over a single triangle, or between triangles, can vary substantially. Specifically, if a single common shock is applied to all of these cells, it can contribute insignificantly to the larger values and/or swamp the smaller ones, unless careful adjustments are made. This problem is further complicated in applications involving negative claim amounts. In this paper, we address this problem in the loss reserving context using a common shock Tweedie approach for unbalanced data. We show that the solution not only provides a much better balance of the common shock proportions relative to the unbalanced data, but it is also parsimonious. Finally, the common shock Tweedie model also provides distributional tractability. Full Article
anc Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion. (arXiv:2005.03419v1 [stat.ML]) By arxiv.org Published On :: In the variational relevance vector machine, the gamma distribution is the typical hyperprior over the noise precision of the automatic relevance determination prior. Instead of the gamma hyperprior, we propose to use the inverse gamma hyperprior with a shape parameter close to zero and a scale parameter not necessarily close to zero. This hyperprior is associated with the concept of a weakly informative prior. The effect of this hyperprior is investigated through regression to non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged for input data. We confirm that the degrees of freedom in a model are controlled by adjusting the scale parameter and keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However, the model estimated using this criterion seems to over-fit. This is because the multiple kernel method puts the model in a situation where the dimension of the model is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended predictive information criterion. It is confirmed that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter minimizing the extended predictive information criterion. Full Article
anc Multi-Label Sampling based on Local Label Imbalance. (arXiv:2005.03240v1 [cs.LG]) By arxiv.org Published On :: Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods. One efficient and flexible strategy to deal with this problem is to employ sampling techniques before training a multi-label learning model. Although existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets, it is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation. To address this issue, we propose a novel measure to assess the local label imbalance of multi-label datasets, as well as two multi-label sampling approaches based on the local label imbalance, namely MLSOL and MLUL. By considering all informative labels, MLSOL creates more diverse and better labeled synthetic instances for difficult examples, while MLUL eliminates instances that are harmful to their local region. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data. Full Article
anc Subdomain Adaptation with Manifolds Discrepancy Alignment. (arXiv:2005.03229v1 [cs.LG]) By arxiv.org Published On :: Reducing domain divergence is a key step in transfer learning problems. Existing works focus on the minimization of global domain divergence. However, two domains may consist of several shared subdomains, and differ from each other in each subdomain. In this paper, we take the local divergence of subdomains into account in transfer. Specifically, we propose to use a low-dimensional manifold to represent each subdomain, and align the local data distribution discrepancy in each manifold across domains. A Manifold Maximum Mean Discrepancy (M3D) is developed to measure the local distribution discrepancy in each manifold. We then propose a general framework, called Transfer with Manifolds Discrepancy Alignment (TMDA), to couple the discovery of data manifolds with the minimization of M3D. We instantiate TMDA in the subspace learning case considering both the linear and nonlinear mappings. We also instantiate TMDA in the deep learning framework. Extensive experimental studies demonstrate that TMDA is a promising method for various transfer learning tasks. Full Article
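The Maximum Mean Discrepancy that M3D builds on can be estimated directly from two samples. The sketch below is the plain, global, biased MMD estimate with a Gaussian kernel, not the per-manifold M3D of the entry above; the function names and the `bandwidth` parameter are hypothetical.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length tuples."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples of points."""

    def mean_kernel(left, right):
        total = sum(rbf_kernel(a, b, bandwidth) for a in left for b in right)
        return total / (len(left) * len(right))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))
```

The estimate is zero when the two samples coincide and grows as the samples move apart, which is what makes it usable as an alignment objective between a source and a target domain.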
anc On the Optimality of Randomization in Experimental Design: How to Randomize for Minimax Variance and Design-Based Inference. (arXiv:2005.03151v1 [stat.ME]) By arxiv.org Published On :: I study the minimax-optimal design for a two-arm controlled experiment where conditional mean outcomes may vary in a given set. When this set is permutation symmetric, the optimal design is complete randomization, and using a single partition (i.e., the design that only randomizes the treatment labels for each side of the partition) has minimax risk larger by a factor of $n-1$. More generally, the optimal design is shown to be the mixed-strategy optimal design (MSOD) of Kallus (2018). Notably, even when the set of conditional mean outcomes has structure (i.e., is not permutation symmetric), being minimax-optimal for variance still requires randomization beyond a single partition. Nonetheless, since this targets precision, it may still not ensure sufficient uniformity in randomization to enable randomization (i.e., design-based) inference by Fisher's exact test to appropriately detect violations of null. I therefore propose the inference-constrained MSOD, which is minimax-optimal among all designs subject to such uniformity constraints. On the way, I discuss Johansson et al. (2020) who recently compared rerandomization of Morgan and Rubin (2012) and the pure-strategy optimal design (PSOD) of Kallus (2018). I point out some errors therein and set straight that randomization is minimax-optimal and that the "no free lunch" theorem and example in Kallus (2018) are correct. Full Article
anc Adaptive Invariance for Molecule Property Prediction. (arXiv:2005.03004v1 [q-bio.QM]) By arxiv.org Published On :: Effective property prediction methods can help accelerate the search for COVID-19 antivirals, either through accurate in-silico screens or by effectively guiding on-going at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate the scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends the recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method, we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to the SARS-CoV-2 main protease, and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by a significant margin. We also report the top 20 predictions of our model on the Broad Drug Repurposing Hub. Full Article
anc Theranostics approaches to gastric and colon cancer By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9789811520174 (electronic bk.) Full Article
anc Sustainable agriculture : advances in plant metabolome and microbiome By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Parray, Javid Ahmad, author. Callnumber: Online ISBN: 9780128173749 (electronic bk.) Full Article
anc Sowing legume seeds, reaping cash : a renaissance within communities in Sub-Saharan Africa By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Akpo, Essegbemon, author. Callnumber: Online ISBN: 9789811508455 (electronic bk.) Full Article
anc Regulation of cancer immune checkpoints : molecular and cellular mechanisms and therapy By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9789811532665 Full Article
anc Racing for the surface : pathogenesis of implant infection and advanced antimicrobial strategies By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030344757 (electronic bk.) Full Article
anc Priming-mediated stress and cross-stress tolerance in crop plants By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780128178935 (electronic bk.) Full Article
anc Microbiological advancements for higher altitude agro-ecosystems and sustainability By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9789811519024 (electronic bk.) Full Article
anc Medical pharmacology at a glance By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Neal, M. J., author. Callnumber: Online ISBN: 9781119548096 (epub) Full Article
anc Management of Hereditary Colorectal Cancer By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030262341 Full Article
anc Intelligent wavelet based techniques for advanced multimedia applications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Singh, Rajiv, author. Callnumber: Online ISBN: 9783030318734 (electronic bk.) Full Article
anc Governance of offshore freshwater resources By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Martin-Nagle, Renee, author. Callnumber: Online ISBN: 9004421041 (electronic book) Full Article
anc Gapenski's understanding healthcare financial management By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Pink, George H., author. Callnumber: Online ISBN: 9781640551145 (electronic bk.) Full Article
anc Functional foods in cancer prevention and therapy By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780128165386 (electronic bk.) Full Article
anc Encyclopedia of cancer By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783642278419 (electronic bk.) Full Article
anc Diabetes & obesity in women : adolescence, pregnancy, and menopause By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Diabetes in women. Callnumber: Online ISBN: 9781496390547 (paperback) Full Article
anc Chickpea : crop wild relatives for enhancing genetic gains By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780128183007 (electronic bk.) Full Article
anc Botulinum toxins, fillers and related substances By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319168029 (electronic bk.) Full Article
anc Beyond our genes : pathophysiology of gene and environment interaction and epigenetic inheritance By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030352134 (electronic bk.) Full Article
anc Atlas of Lymphatic System in Cancer By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Gantsev, Shamil, author. Callnumber: Online ISBN: 9783030409678 Full Article
anc Aquatic biopolymers : understanding their industrial significance and environmental implications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Olatunji, Ololade. Callnumber: Online ISBN: 9783030347093 (electronic bk.) Full Article
anc Advances in virus research. By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780123850348 (electronic bk.) Full Article
anc Advances in protein chemistry and structural biology. By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780123819635 (electronic bk.) Full Article