Latest ng news

Modifying the Chi-square and the CMH test for population genetic inference: Adapting to overdispersion

By projecteuclid.org
Published On :: Wed, 15 Apr 2020 22:05 EDT

Kerstin Spitzer, Marta Pelizzola, Andreas Futschik.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 202--220.

Abstract:
Evolve and resequence studies provide a popular approach to simulate evolution in the lab and explore its genetic basis. In this context, Pearson’s chi-square test, Fisher’s exact test as well as the Cochran–Mantel–Haenszel test are commonly used to infer genomic positions affected by selection from temporal changes in allele frequency. However, the null model associated with these tests does not match the null hypothesis of actual interest. Indeed, due to genetic drift and possibly other additional noise components such as pool sequencing, the null variance in the data can be substantially larger than accounted for by these common test statistics. This leads to $p$-values that are systematically too small and, therefore, a huge number of false positive results. Even, if the ranking rather than the actual $p$-values is of interest, a naive application of the mentioned tests will give misleading results, as the amount of overdispersion varies from locus to locus. We therefore propose adjusted statistics that take the overdispersion into account while keeping the formulas simple. This is particularly useful in genome-wide applications, where millions of SNPs can be handled with little computational effort. We then apply the adapted test statistics to real data from Drosophila and investigate how information from intermediate generations can be included when available. We also discuss further applications such as genome-wide association studies based on pool sequencing data and tests for local adaptation.

Modifying the Chi-square and the CMH test for population genetic inference: Adapting to overdispersion

TFisher: A powerful truncation and weighting procedure for combining &#36;p&#36;-values

Assessing wage status transition and stagnation using quantile transition regression

Surface temperature monitoring in liver procurement via functional variance change-point analysis

Modeling microbial abundances and dysbiosis with beta-binomial regression

Efficient real-time monitoring of an emerging influenza pandemic: How feasible?

BART with targeted smoothing: An analysis of patient-specific stillbirth risk

A general theory for preferential sampling in environmental networks

Hierarchical infinite factor models for improving the prediction of surgical complications for geriatric patients

Bayesian indicator variable selection to incorporate hierarchical overlapping group structure in multi-omics applications

Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: A winning solution to the NIJ “Real-Time Crime Forecasting Challenge”

New formulation of the logistic-Gaussian process to analyze trajectory tracking data

Empirical Bayes analysis of RNA sequencing experiments with auxiliary information

Propensity score weighting for causal inference with multiple treatments

Predicting paleoclimate from compositional data using multivariate Gaussian process inverse prediction

A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts

Objective Bayes model selection of Gaussian interventional essential graphs for the identification of signaling pathways

Fitting a deeply nested hierarchical model to a large book review dataset using a moment-based estimator

Spatial modeling of trends in crime over time in Philadelphia

Microsimulation model calibration using incremental mixture approximate Bayesian computation

Prediction of small area quantiles for the conservation effects assessment project using a mixed effects quantile regression model

Statistical inference for partially observed branching processes with application to cell lineage tracking of in vivo hematopoiesis

Estimating abundance from multiple sampling capture-recapture data via a multi-state multi-period stopover model

Estimating the rate constant from biosensor data via an adaptive variational Bayesian approach

A semiparametric modeling approach using Bayesian Additive Regression Trees with an application to evaluate heterogeneous treatment effects

Bayesian methods for multiple mediators: Relating principal stratification and causal mediation in the analysis of power plant emission controls

Approximate inference for constructing astronomical catalogs from images

Wavelet spectral testing: Application to nonstationary circadian rhythms

Bayesian modeling of the structural connectome for studying Alzheimer’s disease

Incorporating conditional dependence in latent class models for probabilistic record linkage: Does it matter?

A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data

Sequential decision model for inference and prediction on nonuniform hypergraphs with application to knot matching from computational forestry

RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

Modeling seasonality and serial dependence of electricity price curves with warping functional autoregressive dynamics

Distributional regression forests for probabilistic precipitation forecasting in complex terrain

Fast dynamic nonparametric distribution tracking in electron microscopic data

Network modelling of topological domains using Hi-C data

Spatio-temporal short-term wind forecast: A calibrated regime-switching method

The classification permutation test: A flexible approach to testing for covariate imbalance in observational studies

Identifying multiple changes for a functional data sequence with application to freeway traffic segmentation

A hidden Markov model approach to characterizing the photo-switching behavior of fluorophores

Imputation and post-selection inference in models with missing data: An application to colorectal cancer surveillance guidelines

Introduction to papers on the modeling and analysis of network data—II

Weighted Lépingle inequality

Scaling limits for super-replication with transient price impact

Perfect sampling for Gibbs point processes using partial rejection sampling

Matching strings in encoded sequences

On sampling from a log-concave density using kinetic Langevin diffusions

On the best constant in the martingale version of Fefferman’s inequality

Estimating the number of connected components in a graph via subgraph sampling

The Finish Line: Changing Stucco to EIFS

The Finish Line: A Case Study: What is Causing This?

The Finish Line: Backwrapping vs. Edgewrapping

The Finish Line: Cleaning EIFS

The Finish Line: Firestopping

The Finish Line: Inspecting Eifs

The Finish Line: Beefing Up EIFS

The Finish Line: Building Walls in the Land Down Under

Building Product Transparency— Be Careful What You Ask For

LED Lighting is the Future

An Energy Label for Buildings

Hydronic Floor Heating

Meeting Codes with Wall Assemblies

New Gadget Analyzes Everything Including Building Industry

Green Building Mistakes

Subscribe To Our Newsletter

TFisher: A powerful truncation and weighting procedure for combining $p$-values