ev On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics By Published On :: 2020 Stochastic gradient Langevin dynamics (SGLD) is a fundamental algorithm in stochastic optimization. Recent work by Zhang et al. (2017) presents an analysis of the hitting time of SGLD to first- and second-order stationary points. The proof in Zhang et al. (2017) is a two-stage procedure based on bounding Cheeger's constant, which is rather complicated and leads to loose bounds. In this paper, using intuitions from stochastic differential equations, we provide a direct analysis of the hitting times of SGLD to first- and second-order stationary points. Our analysis is straightforward: it relies only on basic tools from linear algebra and probability theory. Our direct analysis also leads to tighter bounds compared to Zhang et al. (2017) and shows the explicit dependence of the hitting time on different factors, including dimensionality, smoothness, noise strength, and step size effects. Under suitable conditions, we show that the hitting time of SGLD to first-order stationary points can be dimension-independent. Moreover, we apply our analysis to study several important online estimation problems in machine learning, including linear regression, matrix factorization, and online PCA. Full Article
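To make the notion of a hitting time concrete, the following is a minimal sketch and not the paper's construction. It assumes the commonly used SGLD update x_{k+1} = x_k - step * g_k + sqrt(2 * step / beta) * N(0, I), where g_k is a stochastic gradient and beta is an inverse temperature; the abstract does not state the update, so this form, the toy quadratic objective, and every function and parameter name below are illustrative assumptions. The hitting time to a first-order stationary point is taken to be the first iteration at which the full gradient norm drops to eps or below.

```python
import math
import random


def sgld_hitting_time(full_grad, stoch_grad, x0, step, beta, eps, max_iters, seed=0):
    """Run SGLD and return (k, x_k) for the first k with ||full_grad(x_k)|| <= eps.

    Returns (None, x) if no such iterate is reached within max_iters steps.
    All names here are hypothetical; the update rule is the usual SGLD form,
    assumed rather than taken from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    noise_scale = math.sqrt(2.0 * step / beta)  # Langevin noise scaling (assumed)
    for k in range(max_iters + 1):
        # Stationarity is checked with the full (noise-free) gradient.
        g_norm = math.sqrt(sum(g * g for g in full_grad(x)))
        if g_norm <= eps:
            return k, x
        # The update itself only sees a stochastic gradient plus injected noise.
        g = stoch_grad(x, rng)
        x = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
    return None, x


def quad_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * ||x||^2."""
    return list(x)


def noisy_quad_grad(x, rng):
    """Stochastic gradient: the true gradient plus small Gaussian noise."""
    return [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]


k, x_hit = sgld_hitting_time(
    quad_grad, noisy_quad_grad,
    x0=[5.0, -3.0], step=0.05, beta=1e4, eps=0.5, max_iters=10_000,
)
print("hitting time:", k)
```

In this toy setting the step size, the noise level (through beta), and eps can be varied to observe how the hitting time changes, which is the kind of dependence the abstract says the analysis makes explicit.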
ev Crime Prevention at Home By www.eastgwillimbury.ca Published On :: Wed, 11 Dec 2019 17:46:59 GMT Full Article
ev Recent developments in complex and spatially correlated functional data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Israel Martínez-Hernández, Marc G. Genton. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 204--229. Abstract: As high-dimensional and high-frequency data are being collected on a large scale, the development of new statistical models is being pushed forward. Functional data analysis provides the required statistical methods to deal with large-scale and complex data by assuming that data are continuous functions, for example, realizations of a continuous process (curves) or continuous random field (surfaces), and that each curve or surface is considered as a single observation. Here, we provide an overview of functional data analysis when data are complex and spatially correlated. We provide definitions and estimators of the first and second moments of the corresponding functional random variable. We present two main approaches: The first assumes that data are realizations of a functional random field, that is, each observation is a curve with a spatial component. We call them spatial functional data. The second approach assumes that data are continuous deterministic fields observed over time. In this case, one observation is a surface or manifold, and we call them surface time series. For these two approaches, we describe software available for the statistical analysis. We also present a data illustration, using a high-resolution wind speed simulated dataset, as an example of the two approaches. The functional data approach offers a new paradigm of data analysis, where the continuous processes or random fields are considered as a single entity. We consider this approach to be very valuable in the context of big data. Full Article
ev Time series of count data: A review, empirical comparisons and data analysis By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Glaura C. Franco, Helio S. Migon, Marcos O. Prates. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 756--781. Abstract: Observation and parameter driven models are commonly used in the literature to analyse time series of counts. In this paper, we study the characteristics of a variety of models and point out the main differences and similarities among these procedures, concerning parameter estimation, model fitting and forecasting. In contrast to the literature, all inference was performed under the Bayesian paradigm. The models are fitted with a latent AR($p$) process in the mean, which accounts for autocorrelation in the data. An extensive simulation study shows that the estimates for the covariate parameters are remarkably similar across the different models. However, estimates for autoregressive coefficients and forecasts of future values depend heavily on the underlying process which generates the data. A real data set of bankruptcy in the United States is also analysed. Full Article
ev A brief review of optimal scaling of the main MCMC approaches and optimal scaling of additive TMCMC under non-regular cases By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Kushal K. Dey, Sourabh Bhattacharya. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 222--266. Abstract: Transformation based Markov Chain Monte Carlo (TMCMC) was proposed by Dutta and Bhattacharya (Statistical Methodology 16 (2014) 100–116) as an efficient alternative to the Metropolis–Hastings algorithm, especially in high dimensions. The main advantage of this algorithm is that it simultaneously updates all components of a high dimensional parameter using appropriate move types defined by deterministic transformation of a single random variable. This results in a reduction in time complexity at each step of the chain and enhances the acceptance rate. In this paper, we first provide a brief review of the optimal scaling theory for various existing MCMC approaches, comparing and contrasting them with the corresponding TMCMC approaches. The optimal scaling of the simplest form of TMCMC, namely additive TMCMC, has been studied extensively for the Gaussian proposal density in Dey and Bhattacharya (2017a). Here, we discuss diffusion-based optimal scaling behavior of additive TMCMC for non-Gaussian proposal densities, in particular uniform, Student’s $t$ and Cauchy proposals. Although we could not formally prove our diffusion result for the Cauchy proposal, simulation based results lead us to conjecture that at least the recipe for obtaining general optimal scaling and optimal acceptance rate holds for the Cauchy case as well. We also consider diffusion based optimal scaling of TMCMC when the target density is discontinuous. Such non-regular situations have been studied in the case of the Random Walk Metropolis Hastings (RWMH) algorithm by Neal and Roberts (Methodology and Computing in Applied Probability 13 (2011) 583–601) using expected squared jumping distance (ESJD), but the diffusion theory based scaling has not been considered. We compare our diffusion based optimally scaled TMCMC approach with the ESJD based optimally scaled RWMH with simulation studies involving several target distributions and proposal distributions including the challenging Cauchy proposal case, showing that additive TMCMC outperforms RWMH in almost all cases considered. Full Article
ev Figuring racism in medieval Christianity By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Kaplan, M. Lindsay, author. Callnumber: BT 734.2 K354 2019. ISBN: 9780190678241 hardcover alkaline paper Full Article
ev A review of dynamic network models with latent variables By projecteuclid.org Published On :: Mon, 03 Sep 2018 04:01 EDT Bomin Kim, Kevin H. Lee, Lingzhou Xue, Xiaoyue Niu. Source: Statistics Surveys, Volume 12, 105--135. Abstract: We present a selective review of statistical modeling of dynamic networks. We focus on models with latent variables, specifically, the latent space models and the latent class models (or stochastic blockmodels), which investigate both the observed features and the unobserved structure of networks. We begin with an overview of the static models, and then we introduce the dynamic extensions. For each dynamic model, we also discuss its applications that have been studied in the literature, with the data source listed in the Appendix. Based on the review, we summarize a list of open problems and challenges in dynamic network modeling with latent variables. Full Article
ev Statistical inference for dynamical systems: A review By projecteuclid.org Published On :: Tue, 10 Nov 2015 09:20 EST Kevin McGoff, Sayan Mukherjee, Natesh Pillai. Source: Statistics Surveys, Volume 9, 209--252. Abstract: The topic of statistical inference for dynamical systems has been studied widely across several fields. In this survey we focus on methods related to parameter estimation for nonlinear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research. Full Article
ev Log-concavity and strong log-concavity: A review By projecteuclid.org Published On :: Tue, 09 Dec 2014 09:09 EST Adrien Saumard, Jon A. Wellner. Source: Statistics Surveys, Volume 8, 45--114. Abstract: We review and formulate results concerning log-concavity and strong log-concavity in both discrete and continuous settings. We show how preservation of log-concavity and strong log-concavity on $\mathbb{R}$ under convolution follows from a fundamental monotonicity result of Efron (1965). We provide a new proof of Efron’s theorem using the recent asymmetric Brascamp-Lieb inequality due to Otto and Menz (2013). Along the way we review connections between log-concavity and other areas of mathematics and statistics, including concentration of measure, log-Sobolev inequalities, convex geometry, MCMC algorithms, Laplace approximations, and machine learning. Full Article
ev Prediction in several conventional contexts By projecteuclid.org Published On :: Tue, 08 May 2012 08:50 EDT Bertrand Clarke, Jennifer Clarke. Source: Statist. Surv., Volume 6, 1--73. Abstract: We review predictive techniques from several traditional branches of statistics. Starting with prediction based on the normal model and on the empirical distribution function, we proceed to techniques for various forms of regression and classification. Then, we turn to time series, longitudinal data, and survival analysis. Our focus throughout is on the mechanics of prediction more than on the properties of predictors. Full Article
ev A review of survival trees By projecteuclid.org Published On :: Mon, 12 Sep 2011 09:13 EDT Imad Bou-Hamad, Denis Larocque, Hatem Ben-Ameur. Source: Statist. Surv., Volume 5, 44--71. Abstract: This paper presents a non-technical account of the developments in tree-based methods for the analysis of survival data with censoring. This review describes the initial developments, which mainly extended the existing basic tree methodologies to censored data, as well as more recent work. We also cover more complex models, more specialized methods, and more specific problems such as multivariate data, the use of time-varying covariates, discrete-scale survival data, and ensemble methods applied to survival trees. A data example is used to illustrate some methods that are implemented in R. Full Article
ev Data confidentiality: A review of methods for statistical disclosure limitation and methods for assessing privacy By projecteuclid.org Published On :: Fri, 04 Feb 2011 09:16 EST Gregory J. Matthews, Ofer Harel. Source: Statist. Surv., Volume 5, 1--29. Abstract: There is an ever increasing demand from researchers for access to useful microdata files. However, there are also growing concerns regarding the privacy of the individuals contained in the microdata. Ideally, microdata could be released in such a way that a balance between usefulness of the data and privacy is struck. This paper presents a review of proposed methods of statistical disclosure control and techniques for assessing the privacy of such methods under different definitions of disclosure. Full Article
ev FNNC: Achieving Fairness through Neural Networks. (arXiv:1811.00247v3 [cs.LG] UPDATED) By arxiv.org Published On :: In classification models fairness can be ensured by solving a constrained optimization problem. We focus on fairness constraints like Disparate Impact, Demographic Parity, and Equalized Odds, which are non-decomposable and non-convex. Researchers define convex surrogates of the constraints and then apply convex optimization frameworks to obtain fair classifiers. Surrogates serve only as an upper bound to the actual constraints, and convexifying fairness constraints might be challenging. We propose a neural network-based framework, FNNC, to achieve fairness while maintaining high accuracy in classification. The above fairness constraints are included in the loss using Lagrangian multipliers. We prove bounds on generalization errors for the constrained losses which asymptotically go to zero. The network is optimized using two-step mini-batch stochastic gradient descent. Our experiments show that FNNC performs as well as the state of the art, if not better. The experimental evidence supplements our theoretical guarantees. In summary, we have an automated solution to achieve fairness in classification, which is easily extendable to many fairness constraints. Full Article
ev Reference and Document Aware Semantic Evaluation Methods for Korean Language Summarization. (arXiv:2005.03510v1 [cs.CL]) By arxiv.org Published On :: Text summarization refers to the process that generates a shorter form of text from the source document while preserving salient information. Recently, many models for text summarization have been proposed. Most of those models were evaluated using recall-oriented understudy for gisting evaluation (ROUGE) scores. However, as ROUGE scores are computed based on n-gram overlap, they do not reflect semantic meaning correspondences between generated and reference summaries. Because Korean is an agglutinative language that combines various morphemes into a word that expresses several meanings, ROUGE is not suitable for Korean summarization. In this paper, we propose evaluation metrics that reflect semantic meanings of a reference summary and the original document, Reference and Document Aware Semantic Score (RDASS). We then propose a method for improving the correlation of the metrics with human judgment. Evaluation results show that the correlation with human judgment is significantly higher for our evaluation metrics than for ROUGE scores. Full Article
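The point about n-gram overlap can be illustrated with a small sketch of ROUGE-N recall, assuming whitespace tokenization and the usual clipped-count definition (overlapping n-grams divided by the number of n-grams in the reference). This is a simplified illustration, not the official ROUGE toolkit and not the RDASS metric; the function names are hypothetical, and Korean text would in practice need morpheme-level tokenization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall: clipped n-gram overlap / reference n-gram count.

    Tokenization is a plain whitespace split (an assumption made for brevity).
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
scrambled = "the mat sat on the cat"    # same words, different meaning
paraphrase = "a feline rested on a rug"  # similar meaning, different words

print(rouge_n_recall(scrambled, reference, n=1))
print(rouge_n_recall(paraphrase, reference, n=1))
```

The scrambled sentence shares every unigram with the reference even though its meaning differs, while the paraphrase shares only one; a purely surface-level overlap score cannot tell these cases apart in the way a semantic metric is meant to.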
ev A stochastic user-operator assignment game for microtransit service evaluation: A case study of Kussbus in Luxembourg. (arXiv:2005.03465v1 [physics.soc-ph]) By arxiv.org Published On :: This paper proposes a stochastic variant of the stable matching model from Rasulkhani and Chow [1] which allows microtransit operators to evaluate their operation policy and resource allocations. The proposed model takes into account the stochastic nature of users' travel utility perception, resulting in a probabilistic stable operation cost allocation outcome to design ticket price and ridership forecasting. We applied the model for the operation policy evaluation of a microtransit service in Luxembourg and its border area. The methodology for the model parameters estimation and calibration is developed. The results provide useful insights for the operator and the government to improve the ridership of the service. Full Article
ev Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion. (arXiv:2005.03419v1 [stat.ML]) By arxiv.org Published On :: In the variational relevance vector machine, the gamma distribution is the typical hyperprior over the noise precision of the automatic relevance determination prior. Instead of the gamma hyperprior, we propose to use the inverse gamma hyperprior with a shape parameter close to zero and a scale parameter not necessarily close to zero. This hyperprior is associated with the concept of a weakly informative prior. The effect of this hyperprior is investigated through regression on non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged for input data. We confirm that the degrees of freedom in a model are controlled by adjusting the scale parameter and keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However, the model estimated using this criterion seems to over-fit. This is because the multiple kernel method puts the model in a situation where the dimension of the model is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended predictive information criterion. It is confirmed that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter that minimizes the extended predictive information criterion. Full Article
ev Medieval Ideas about Infertility and Old Age By blog.wellcomelibrary.org Published On :: Fri, 12 Jan 2018 11:01:15 +0000 The next seminar in the 2017–18 History of Pre-Modern Medicine seminar series takes place on Tuesday 16 January. Speaker: Dr Catherine Rider (University of Exeter) Medieval Ideas about Infertility and Old Age Abstract: When they discussed fertility and reproductive disorders it was common… Full Article Early Medicine Events and Visits Early Health and Well-being
ev Wood microbiology : decay and its prevention By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Zabel, R. A. (Robert A.), author Callnumber: Online ISBN: 9780128205730 (electronic bk.) Full Article
ev The science of grapevines By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Keller, Markus, (horticulturist) author Callnumber: Online ISBN: 9780128167021 (electronic bk.) Full Article
ev The evolution of feathers : from their origin to the present By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030272234 electronic book Full Article
ev Structured object-oriented formal language and method : 9th International Workshop, SOFL+MSVL 2019, Shenzhen, China, November 5, 2019, Revised selected papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: SOFL+MSVL (Workshop) (9th : 2019 : Shenzhen, China) Callnumber: Online ISBN: 9783030414184 (electronic bk.) Full Article
ev Space information networks : 4th International Conference, SINC 2019, Wuzhen, China, September 19-20, 2019, Revised Selected Papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: SINC (Conference) (4th : 2019 : Wuzhen, China) Callnumber: Online ISBN: 9789811534423 (electronic bk.) Full Article
ev Semantic technology : 9th Joint International Conference, JIST 2019, Hangzhou, China, November 25-27, 2019, Revised selected papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Joint International Semantic Technology Conference (9th : 2019 : Hangzhou, China) Callnumber: Online ISBN: 9789811534126 (electronic bk.) Full Article
ev Salt, fat and sugar reduction : sensory approaches for nutritional reformulation of foods and beverages By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: O'Sullivan, Maurice G., author Callnumber: Online ISBN: 9780128226124 (electronic bk.) Full Article
ev Recent developments on genus Chaetomium By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030316129 (electronic bk.) Full Article
ev Prevention of chronic diseases and age-related disability By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319965291 (electronic bk.) Full Article
ev Plastic waste and recycling : environmental impact, societal issues, prevention, and solutions By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780128178812 (electronic bk.) Full Article
ev Plant microRNAs : shaping development and environmental responses By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030357726 (electronic bk.) Full Article
ev Mobilities facing hydrometeorological extreme events. By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780081028827 (electronic bk.) Full Article
ev Insect metamorphosis : from natural history to regulation of development and evolution By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Bellés, X., author Callnumber: Online ISBN: 9780128130216 Full Article
ev Information retrieval technology : 15th Asia Information Retrieval Societies Conference, AIRS 2019, Hong Kong, China, November 7-9, 2019, proceedings By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Asia Information Retrieval Societies Conference (15th : 2019 : Hong Kong, China) Callnumber: Online ISBN: 9783030428358 Full Article
ev In China's wake : how the commodity boom transformed development strategies in the global south By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Jepson, Nicholas, author. Callnumber: Online ISBN: 9780231547598 electronic book Full Article
ev Healthcare-associated infections in children : a guide to prevention and management By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319981222 (electronic bk.) Full Article
ev Functional foods in cancer prevention and therapy By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9780128165386 (electronic bk.) Full Article
ev Evolutionary developmental biology : a reference guide By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319330389 (electronic bk.) Full Article
ev Enterprise information systems : 21st International Conference, ICEIS 2019, Heraklion, Crete, Greece, May 3-5, 2019, Revised Selected Papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: International Conference on Enterprise Information Systems (21st : 2019 : Ērakleion, Greece) Callnumber: Online ISBN: 9783030407834 (electronic bk.) Full Article
ev Development of biopharmaceutical drug-device products By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030314156 (electronic bk.) Full Article
ev Current developments in biotechnology and bioengineering : resource recovery from wastes By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 0444643222 Full Article
ev Computer security : ESORICS 2019 International Workshops, IOSec, MSTEC, and FINSEC, Luxembourg City, Luxembourg, September 26-27, 2019, Revised Selected Papers By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: European Symposium on Research in Computer Security (24th : 2019 : Luxembourg, Luxembourg) Callnumber: Online ISBN: 9783030420512 (electronic bk.) Full Article
ev Computational processing of the Portuguese language : 14th International Conference, PROPOR 2020, Evora, Portugal, March 2-4, 2020, Proceedings By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: PROPOR (Conference) (14th : 2020 : Evora, Portugal) Callnumber: Online ISBN: 9783030415051 (electronic bk.) Full Article
ev Common problems in the newborn nursery : an evidence and case-based guide By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319956725 (electronic bk.) Full Article
ev Clinical manual of fever in children By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: El-Radhi, A. Sahib, author. Callnumber: Online ISBN: 9783319923369 (electronic book) Full Article
ev Biomedical product development : bench to bedside By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030356262 (electronic bk.) Full Article
ev Anomalies of the Developing Dentition : a Clinical Guide to Diagnosis and Management By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Soxman, Jane A., author. Callnumber: Online ISBN: 9783030031640 (electronic bk.) Full Article
ev Jamboree Begins Construction on Capstone Development to Change... By www.prweb.com Published On :: In a public-private partnership to develop housing, resident services and hope for 102 working families in Haster Orangewood community, Jamboree Housing Corporation and the City of Anaheim announce... (PRWeb April 27, 2020) Read the full story at https://www.prweb.com/releases/jamboree_begins_construction_on_capstone_development_to_change_trajectory_of_neighborhood_in_anaheim_ca/prweb17073166.htm Full Article
ev PMA Reveals New Logo and Brand Identity By www.prweb.com Published On :: PMA, a premier full-service provider of comprehensive financial and investment advisory services to municipalities, school districts, local government pools, insurance companies and other... (PRWeb May 04, 2020) Read the full story at https://www.prweb.com/releases/pma_reveals_new_logo_and_brand_identity/prweb17090459.htm Full Article
ev Colorado Court Rules STRmix Is “Relevant and Reliable” Practice for... By www.prweb.com Published On :: Defendant’s Motion to Exclude Expert Testimony regarding evidence generated by STRmix denied. (PRWeb May 08, 2020) Read the full story at https://www.prweb.com/releases/colorado_court_rules_strmix_is_relevant_and_reliable_practice_for_interpreting_likelihood_ratios/prweb17101548.htm Full Article
ev Detecting relevant changes in the mean of nonstationary processes—A mass excess approach By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Holger Dette, Weichi Wu. Source: The Annals of Statistics, Volume 47, Number 6, 3578--3608. Abstract: This paper considers the problem of testing if a sequence of means $(\mu_{t})_{t=1,\ldots ,n}$ of a nonstationary time series $(X_{t})_{t=1,\ldots ,n}$ is stable in the sense that the difference of the means $\mu_{1}$ and $\mu_{t}$ between the initial time $t=1$ and any other time is smaller than a given threshold, that is $|\mu_{1}-\mu_{t}|\leq c$ for all $t=1,\ldots ,n$. A test for hypotheses of this type is developed using a bias corrected monotone rearranged local linear estimator and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location of the roots of the equation $|\mu_{1}-\mu_{t}|=c$ a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a nonstationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples. Full Article
ev A hierarchical Bayesian model for predicting ecological interactions using scaled evolutionary relationships By projecteuclid.org Published On :: Wed, 15 Apr 2020 22:05 EDT Mohamad Elmasri, Maxwell J. Farrell, T. Jonathan Davies, David A. Stephens. Source: The Annals of Applied Statistics, Volume 14, Number 1, 221--240. Abstract: Identifying undocumented or potential future interactions among species is a challenge facing modern ecologists. Recent link prediction methods rely on trait data; however, large species interaction databases are typically sparse and covariates are limited to only a fraction of species. On the other hand, evolutionary relationships, encoded as phylogenetic trees, can act as proxies for underlying traits and historical patterns of parasite sharing among hosts. We show that, using a network-based conditional model, phylogenetic information provides strong predictive power in a recently published global database of host-parasite interactions. By scaling the phylogeny using an evolutionary model, our method allows for biological interpretation often missing from latent variable models. To further improve on the phylogeny-only model, we combine a hierarchical Bayesian latent score framework for bipartite graphs that accounts for the number of interactions per species with host dependence informed by phylogeny. Combining the two information sources yields significant improvement in predictive accuracy over each of the submodels alone. As many interaction networks are constructed from presence-only data, we extend the model by integrating a correction mechanism for missing interactions which proves valuable in reducing uncertainty in unobserved interactions. Full Article
ev Integrative survival analysis with uncertain event times in application to a suicide risk study By projecteuclid.org Published On :: Wed, 15 Apr 2020 22:05 EDT Wenjie Wang, Robert Aseltine, Kun Chen, Jun Yan. Source: The Annals of Applied Statistics, Volume 14, Number 1, 51--73. Abstract: The concept of integrating data from disparate sources to accelerate scientific discovery has generated tremendous excitement in many fields. The potential benefits from data integration, however, may be compromised by the uncertainty due to incomplete/imperfect record linkage. Motivated by a suicide risk study, we propose an approach for analyzing survival data with uncertain event times arising from data integration. Specifically, in our problem deaths identified from the hospital discharge records together with reported suicidal deaths determined by the Office of Medical Examiner may still not include all the death events of patients, and the missing deaths can be recovered from a complete database of death records. Since the hospital discharge data can only be linked to the death record data by matching basic patient characteristics, a patient with a censored death time from the first dataset could be linked to multiple potential event records in the second dataset. We develop an integrative Cox proportional hazards regression in which the uncertainty in the matched event times is modeled probabilistically. The estimation procedure combines the ideas of profile likelihood and the expectation conditional maximization algorithm (ECM). Simulation studies demonstrate that under realistic settings of imperfect data linkage the proposed method outperforms several competing approaches including multiple imputation. A marginal screening analysis using the proposed integrative Cox model is performed to identify risk factors associated with death following suicide-related hospitalization in Connecticut. The identified diagnostics codes are consistent with existing literature and provide several new insights on suicide risk, prediction and prevention. Full Article