and Over-the-Air Computation Systems: Optimization, Analysis and Scaling Laws. (arXiv:1909.00329v2 [cs.IT] UPDATED) By arxiv.org Published On :: For future Internet of Things (IoT)-based Big Data applications (e.g., smart cities/transportation), wireless data collection from ubiquitous massive smart sensors with limited spectrum bandwidth is very challenging. On the other hand, to interpret the meaning behind the collected data, it is also challenging for edge fusion centers running computing tasks over large data sets with limited computation capacity. To tackle these challenges, by exploiting the superposition property of a multiple-access channel and the functional decomposition properties, the recently proposed technique, over-the-air computation (AirComp), enables an effective joint data collection and computation from concurrent sensor transmissions. In this paper, we focus on a single-antenna AirComp system consisting of $K$ sensors and one receiver (i.e., the fusion center). We consider an optimization problem to minimize the computation mean-squared error (MSE) of the $K$ sensors' signals at the receiver by optimizing the transmitting-receiving (Tx-Rx) policy, under the peak power constraint of each sensor. Although the problem is not convex, we derive the computation-optimal policy in closed form. Also, we comprehensively investigate the ergodic performance of AirComp systems in terms of the average computation MSE and the average power consumption under Rayleigh fading channels with different Tx-Rx policies. For the computation-optimal policy, we prove that its average computation MSE has a decay rate of $O(1/\sqrt{K})$, and our numerical results illustrate that the policy also has a vanishing average power consumption with the increasing $K$, which jointly show the computation effectiveness and the energy efficiency of the policy with a large number of sensors. Full Article
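The computation MSE that the AirComp abstract optimizes can be illustrated with a small sketch. It assumes real-valued channel gains `h`, transmit coefficients `b` and receive scaling `a`, independent zero-mean unit-variance sensor signals, and independent receiver noise; under those assumptions the error of the estimate `a * (sum_k h_k b_k x_k + n)` against the target `sum_k x_k` has the MSE computed below. The `channel_inversion` helper is a hypothetical baseline policy chosen for illustration, not the closed-form computation-optimal policy derived in the paper.

```python
import math


def computation_mse(h, b, a, noise_var):
    """MSE of a*(sum_k h_k*b_k*x_k + n) as an estimate of sum_k x_k.

    Assumes real coefficients, independent zero-mean unit-variance x_k,
    and independent noise n with variance noise_var.
    """
    mismatch = sum((a * hk * bk - 1.0) ** 2 for hk, bk in zip(h, b))
    return mismatch + noise_var * a ** 2


def channel_inversion(h, a, peak_power):
    """Hypothetical baseline: invert each (positive) channel gain as far as
    the per-sensor peak power constraint b_k**2 <= peak_power allows."""
    b_max = math.sqrt(peak_power)
    return [min(1.0 / (a * hk), b_max) for hk in h]


if __name__ == "__main__":
    h = [1.0, 0.5, 0.25]
    b = channel_inversion(h, a=1.0, peak_power=4.0)
    print(b, computation_mse(h, b, a=1.0, noise_var=0.1))
```

In this toy setting the weakest sensor cannot fully invert its channel, so it contributes a residual mismatch term on top of the scaled noise; trading those two terms off through `a` is the kind of Tx-Rx design question the abstract describes.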
and A Fast and Accurate Algorithm for Spherical Harmonic Analysis on HEALPix Grids with Applications to the Cosmic Microwave Background Radiation. (arXiv:1904.10514v4 [math.NA] UPDATED) By arxiv.org Published On :: The Hierarchical Equal Area isoLatitude Pixelation (HEALPix) scheme is used extensively in astrophysics for data collection and analysis on the sphere. The scheme was originally designed for studying the Cosmic Microwave Background (CMB) radiation, which represents the first light to travel during the early stages of the universe's development and gives the strongest evidence for the Big Bang theory to date. Refined analysis of the CMB angular power spectrum can lead to revolutionary developments in understanding the nature of dark matter and dark energy. In this paper, we present a new method for performing spherical harmonic analysis for HEALPix data, which is a central component to computing and analyzing the angular power spectrum of the massive CMB data sets. The method uses a novel combination of a non-uniform fast Fourier transform, the double Fourier sphere method, and Slevinsky's fast spherical harmonic transform (Slevinsky, 2019). For a HEALPix grid with $N$ pixels (points), the computational complexity of the method is $\mathcal{O}(N\log^2 N)$, with an initial set-up cost of $\mathcal{O}(N^{3/2}\log N)$. This compares favorably with $\mathcal{O}(N^{3/2})$ runtime complexity of the current methods available in the HEALPix software when multiple maps need to be analyzed at the same time. Using numerical experiments, we demonstrate that the new method also appears to provide better accuracy over the entire angular power spectrum of synthetic data when compared to the current methods, with a convergence rate at least two times higher. Full Article
and Constrained Restless Bandits for Dynamic Scheduling in Cyber-Physical Systems. (arXiv:1904.08962v3 [cs.SY] UPDATED) By arxiv.org Published On :: Restless multi-armed bandits are a class of discrete-time stochastic control problems which involve sequential decision making with a finite set of actions (set of arms). This paper studies a class of constrained restless multi-armed bandits (CRMAB). The constraints are in the form of time varying set of actions (set of available arms). This variation can be either stochastic or semi-deterministic. Given a set of arms, a fixed number of them can be chosen to be played in each decision interval. The play of each arm yields a state dependent reward. The current states of arms are partially observable through binary feedback signals from arms that are played. The current availability of arms is fully observable. The objective is to maximize long term cumulative reward. The uncertainty about future availability of arms along with partial state information makes this objective challenging. Applications for CRMAB abound in the domain of cyber-physical systems. This optimization problem is analyzed using Whittle's index policy. To this end, a constrained restless single-armed bandit is studied. It is shown to admit a threshold-type optimal policy, and is also indexable. An algorithm to compute Whittle's index is presented. Further, upper bounds on the value function are derived in order to estimate the degree of sub-optimality of various solutions. The simulation study compares the performance of Whittle's index, modified Whittle's index and myopic policies. Full Article
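Of the three policies compared in the restless-bandit abstract, the myopic one is the simplest to sketch: in each decision interval, play the available arms with the highest expected immediate reward under the current belief. The sketch below assumes, purely for illustration, two-state arms where `beliefs[i]` is the probability that arm `i` is in state 1 and `rewards[i] = (r0, r1)` gives the state-dependent rewards; these names and the two-state model are assumptions, and Whittle's index computation from the paper is not reproduced.

```python
def myopic_choice(beliefs, rewards, available, m):
    """Pick up to m available arms with the largest expected immediate reward.

    beliefs[i]  : assumed probability that arm i is in state 1
    rewards[i]  : (reward in state 0, reward in state 1) for arm i
    available   : indices of arms that can be played in this interval
    Ties are broken by the smaller arm index.
    """
    expected = {
        i: (1.0 - beliefs[i]) * rewards[i][0] + beliefs[i] * rewards[i][1]
        for i in available
    }
    ranked = sorted(expected, key=lambda i: (-expected[i], i))
    return ranked[:m]


if __name__ == "__main__":
    beliefs = [0.9, 0.2, 0.6, 0.8]
    rewards = [(0.0, 1.0)] * 4
    # Arm 0 has the highest belief but is unavailable in this interval.
    print(myopic_choice(beliefs, rewards, available=[1, 2, 3], m=2))
```

The availability set enters only as a filter here, which is what makes the policy myopic: it ignores how playing an arm now changes future beliefs and future availability.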
and Keeping out the Masses: Understanding the Popularity and Implications of Internet Paywalls. (arXiv:1903.01406v4 [cs.CY] UPDATED) By arxiv.org Published On :: Funding the production of quality online content is a pressing problem for content producers. The most common funding method, online advertising, is rife with well-known performance and privacy harms, and an intractable subject-agent conflict: many users do not want to see advertisements, depriving the site of needed funding. Because of these negative aspects of advertisement-based funding, paywalls are an increasingly popular alternative for websites. This shift to a "pay-for-access" web is one that has potentially huge implications for the web and society. Instead of a system where information (nominally) flows freely, paywalls create a web where high quality information is available to fewer and fewer people, leaving the rest of the web users with less information, which might also be less accurate and of lower quality. Despite the potential significance of a move from an "advertising-but-open" web to a "paywalled" web, we find this issue understudied. This work addresses this gap in our understanding by measuring how widely paywalls have been adopted, what kinds of sites use paywalls, and the distribution of policies enforced by paywalls. A partial list of our findings includes that (i) paywall use is accelerating (2x more paywalls every 6 months), (ii) paywall adoption differs by country (e.g. 18.75% in US, 12.69% in Australia), (iii) paywalls change how users interact with sites (e.g. higher bounce rates, fewer incoming links), (iv) the median cost of an annual paywall access is $108 per site, and (v) paywalls are in general trivial to circumvent. Finally, we present the design of a novel, automated system for detecting whether a site uses a paywall, through the combination of runtime browser instrumentation and repeated programmatic interactions with the site. We intend this classifier to augment future, longitudinal measurements of paywall use and behavior. Full Article
and Asymptotic expansions of eigenvalues by both the Crouzeix-Raviart and enriched Crouzeix-Raviart elements. (arXiv:1902.09524v2 [math.NA] UPDATED) By arxiv.org Published On :: Asymptotic expansions are derived for eigenvalues produced by both the Crouzeix-Raviart element and the enriched Crouzeix-Raviart element. The expansions are optimal in the sense that extrapolation eigenvalues based on them admit a fourth order convergence provided that exact eigenfunctions are smooth enough. The major challenge in establishing the expansions comes from the fact that the canonical interpolation of both nonconforming elements lacks a crucial superclose property, and the nonconformity of both elements. The main idea is to employ the relation between the lowest-order mixed Raviart-Thomas element and the two nonconforming elements, and consequently make use of the superclose property of the canonical interpolation of the lowest-order mixed Raviart-Thomas element. To overcome the difficulty caused by the nonconformity, the commuting property of the canonical interpolation operators of both nonconforming elements is further used, which turns the consistency error problem into an interpolation error problem. Then, a series of new results are obtained to show the final expansions. Full Article
and Learning Direct Optimization for Scene Understanding. (arXiv:1812.07524v2 [cs.CV] UPDATED) By arxiv.org Published On :: We develop a Learning Direct Optimization (LiDO) method for the refinement of a latent variable model that describes input image x. Our goal is to explain a single image x with an interpretable 3D computer graphics model having scene graph latent variables z (such as object appearance, camera position). Given a current estimate of z we can render a prediction of the image g(z), which can be compared to the image x. The standard way to proceed is then to measure the error E(x, g(z)) between the two, and use an optimizer to minimize the error. However, it is unknown which error measure E would be most effective for simultaneously addressing issues such as misaligned objects, occlusions, textures, etc. In contrast, the LiDO approach trains a Prediction Network to predict an update directly to correct z, rather than minimizing the error with respect to z. Experiments show that our LiDO method converges rapidly as it does not need to perform a search on the error landscape, produces better solutions than error-based competitors, and is able to handle the mismatch between the data and the fitted scene model. We apply LiDO to a realistic synthetic dataset, and show that the method also transfers to work well with real images. Full Article
and An improved exact algorithm and an NP-completeness proof for sparse matrix bipartitioning. (arXiv:1811.02043v2 [cs.DS] UPDATED) By arxiv.org Published On :: We investigate sparse matrix bipartitioning -- a problem where we minimize the communication volume in parallel sparse matrix-vector multiplication. We prove, by reduction from graph bisection, that this problem is $\mathcal{NP}$-complete in the case where each side of the bipartitioning must contain a linear fraction of the nonzeros. We present an improved exact branch-and-bound algorithm which finds the minimum communication volume for a given matrix and maximum allowed imbalance. The algorithm is based on a maximum-flow bound and a packing bound, which extend previous matching and packing bounds. We implemented the algorithm in a new program called MP (Matrix Partitioner), which solved 839 matrices from the SuiteSparse collection to optimality, each within 24 hours of CPU-time. Furthermore, MP solved the difficult problem of the matrix cage6 in about 3 days. The new program is on average more than ten times faster than the previous program MondriaanOpt. Benchmark results using the set of 839 optimally solved matrices show that combining the medium-grain/iterative refinement methods of the Mondriaan package with the hypergraph bipartitioner of the PaToH package produces sparse matrix bipartitionings on average within 10% of the optimal solution. Full Article
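The objective in the bipartitioning abstract can be made concrete with a tiny exhaustive search. The sketch below counts the communication volume as the number of rows and columns whose nonzeros end up in both parts, and enumerates all assignments of nonzeros to two parts. The balance rule used here (each part holds at most `floor((1 + epsilon) * ceil(nnz / 2))` nonzeros) is an assumption made for illustration, and the enumeration is exponential; it is not the branch-and-bound algorithm or the flow and packing bounds described in the paper.

```python
import math
from itertools import product


def communication_volume(nonzeros, parts):
    """Sum over rows and columns of (number of parts present - 1)."""
    rows, cols = {}, {}
    for (i, j), p in zip(nonzeros, parts):
        rows.setdefault(i, set()).add(p)
        cols.setdefault(j, set()).add(p)
    return (sum(len(s) - 1 for s in rows.values())
            + sum(len(s) - 1 for s in cols.values()))


def brute_force_bipartition(nonzeros, epsilon):
    """Return (volume, parts) minimizing the volume under the assumed
    balance constraint. Only usable for a handful of nonzeros."""
    limit = math.floor((1 + epsilon) * math.ceil(len(nonzeros) / 2))
    best = None
    for parts in product((0, 1), repeat=len(nonzeros)):
        ones = sum(parts)
        if max(ones, len(parts) - ones) > limit:
            continue  # violates the allowed imbalance
        vol = communication_volume(nonzeros, parts)
        if best is None or vol < best[0]:
            best = (vol, parts)
    return best


if __name__ == "__main__":
    dense_2x2 = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(brute_force_bipartition(dense_2x2, epsilon=0.0))
```

For the dense 2x2 example, splitting by rows keeps every row in one part but cuts both columns, so the search cannot do better than volume 2 under perfect balance.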
and Erdős-Pósa property of chordless cycles and its applications. (arXiv:1711.00667v3 [math.CO] UPDATED) By arxiv.org Published On :: A chordless cycle, or equivalently a hole, in a graph $G$ is an induced subgraph of $G$ which is a cycle of length at least $4$. We prove that the Erdős-Pósa property holds for chordless cycles, which resolves the major open question concerning the Erdős-Pósa property. Our proof for chordless cycles is constructive: in polynomial time, one can find either $k+1$ vertex-disjoint chordless cycles, or $c_1k^2 \log k+c_2$ vertices hitting every chordless cycle for some constants $c_1$ and $c_2$. It immediately implies an approximation algorithm of factor $\mathcal{O}(\mathsf{opt}\log \mathsf{opt})$ for Chordal Vertex Deletion. We complement our main result by showing that chordless cycles of length at least $\ell$ for any fixed $\ell\ge 5$ do not have the Erdős-Pósa property. Full Article
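The definition of a hole in the chordless-cycle abstract translates directly into a check on a vertex set: the induced subgraph must have at least four vertices, every vertex must have exactly two neighbours inside the set, and the set must be connected. The sketch below implements only this membership test on an adjacency-set representation (the `adj` dictionary is an assumed input format); it does not attempt the packing or hitting-set construction from the paper.

```python
def is_hole(adj, vertices):
    """True if `vertices` induces a chordless cycle of length >= 4.

    adj maps each vertex to the set of its neighbours (undirected graph).
    """
    vs = set(vertices)
    if len(vs) < 4:
        return False
    induced = {v: adj[v] & vs for v in vs}
    # A chord or a missing cycle edge changes some induced degree from 2.
    if any(len(nbrs) != 2 for nbrs in induced.values()):
        return False
    # 2-regular and connected means a single cycle.
    start = next(iter(vs))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in induced[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vs


if __name__ == "__main__":
    c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    print(is_hole(c4, [0, 1, 2, 3]))
```

The connectivity pass matters: two disjoint triangles also have all induced degrees equal to 2, but they do not form one cycle.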
and Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity. (arXiv:1706.02205v4 [math.NA] UPDATED) By arxiv.org Published On :: Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{ x_{i} \}_{1 \leq i \leq N} \subset \mathbb{R}^{d}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to identify a subset $S \subset \{ 1 , \dots , N \}^2$, with $\# S = O ( N \log (N) \log^{d} ( N /\epsilon ) )$, such that the zero fill-in incomplete Cholesky factorisation of the sparse matrix $\Theta_{ij} 1_{( i, j ) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorisation can provably be obtained in complexity $O ( N \log( N ) \log^{d}( N /\epsilon) )$ in space and $O ( N \log^{2}( N ) \log^{2d}( N /\epsilon) )$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_{i}$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity $O ( N \log^{d}( N /\epsilon) )$ in space and $O ( N \log^{2d}( N /\epsilon) )$ in time, improving upon the state of the art for general elliptic operators. Full Article
and On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation. (arXiv:2005.03642v1 [cs.CL]) By arxiv.org Published On :: The standard training algorithm in neural machine translation (NMT) suffers from exposure bias, and alternative algorithms have been proposed to mitigate this. However, the practical impact of exposure bias is under debate. In this paper, we link exposure bias to another well-known problem in NMT, namely the tendency to generate hallucinations under domain shift. In experiments on three datasets with multiple test domains, we show that exposure bias is partially to blame for hallucinations, and that training with Minimum Risk Training, which avoids exposure bias, can mitigate this. Our analysis explains why exposure bias is more problematic under domain shift, and also links exposure bias to the beam search problem, i.e. performance deterioration with increasing beam size. Our results provide a new justification for methods that reduce exposure bias: even if they do not increase performance on in-domain test sets, they can increase model robustness to domain shift. Full Article
and Universal Coding and Prediction on Martin-Löf Random Points. (arXiv:2005.03627v1 [math.PR]) By arxiv.org Published On :: We perform an effectivization of classical results concerning universal coding and prediction for stationary ergodic processes over an arbitrary finite alphabet. That is, we lift the well-known almost sure statements to statements about Martin-Löf random sequences. Most of this work is quite mechanical but, by the way, we complete a result of Ryabko from 2008 by showing that each universal probability measure in the sense of universal coding induces a universal predictor in the prequential sense. Surprisingly, the effectivization of this implication holds true provided the universal measure does not ascribe too low conditional probabilities to individual symbols. As an example, we show that the Prediction by Partial Matching (PPM) measure satisfies this requirement. In the almost sure setting, the requirement is superfluous. Full Article
and COVID-19 Contact-tracing Apps: A Survey on the Global Deployment and Challenges. (arXiv:2005.03599v1 [cs.CR]) By arxiv.org Published On :: In response to the coronavirus disease (COVID-19) outbreak, there is an ever-increasing number of national governments that are rolling out contact-tracing Apps to aid the containment of the virus. The first hugely contentious issue facing the Apps is the deployment framework, i.e. centralised or decentralised. Based on this, the debate branches out to the corresponding technologies that underpin these architectures, i.e. GPS, QR codes, and Bluetooth. This work conducts a pioneering review of the above scenarios and contributes a geolocation mapping of the current deployment. The vulnerabilities and the directions of research are identified, with a special focus on the Bluetooth-based decentralised scheme. Full Article
and A Local Spectral Exterior Calculus for the Sphere and Application to the Shallow Water Equations. (arXiv:2005.03598v1 [math.NA]) By arxiv.org Published On :: We introduce $\Psi\mathrm{ec}$, a local spectral exterior calculus for the two-sphere $S^2$. $\Psi\mathrm{ec}$ provides a discretization of Cartan's exterior calculus on $S^2$ formed by spherical differential $r$-form wavelets. These are well localized in space and frequency and provide (Stevenson) frames for the homogeneous Sobolev spaces $\dot{H}^{-r+1}( \Omega_{\nu}^{r} , S^2 )$ of differential $r$-forms. At the same time, they satisfy important properties of the exterior calculus, such as the de Rham complex and the Hodge-Helmholtz decomposition. Through this, $\Psi\mathrm{ec}$ is tailored towards structure preserving discretizations that can adapt to solutions with varying regularity. The construction of $\Psi\mathrm{ec}$ is based on a novel spherical wavelet frame for $L_2(S^2)$ that we obtain by introducing scalable reproducing kernel frames. These extend scalable frames to weighted sampling expansions and provide an alternative to quadrature rules for the discretization of needlet-like scale-discrete wavelets. We verify the practicality of $\Psi\mathrm{ec}$ for numerical computations using the rotating shallow water equations. Our numerical results demonstrate that a $\Psi\mathrm{ec}$-based discretization of the equations attains accuracy comparable to those of spectral methods while using a representation that is well localized in space and frequency. Full Article
and Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation. (arXiv:2005.03572v1 [cs.CV]) By arxiv.org Published On :: Deep learning-based object detection and instance segmentation have achieved unprecedented progress. In this paper, we propose Complete-IoU (CIoU) loss and Cluster-NMS for enhancing geometric factors in both bounding box regression and Non-Maximum Suppression (NMS), leading to notable gains of average precision (AP) and average recall (AR), without the sacrifice of inference efficiency. In particular, we consider three geometric factors, i.e., overlap area, normalized central point distance and aspect ratio, which are crucial for measuring bounding box regression in object detection and instance segmentation. The three geometric factors are then incorporated into CIoU loss for better distinguishing difficult regression cases. The training of deep models using CIoU loss results in consistent AP and AR improvements in comparison to widely adopted $\ell_n$-norm loss and IoU-based loss. Furthermore, we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires fewer iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT), and object detection (e.g., YOLO v3, SSD and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains of +1.7 AP and +6.2 AR$_{100}$ for object detection, and +0.9 AP and +3.5 AR$_{100}$ for instance segmentation, with 27.1 FPS on one NVIDIA GTX 1080Ti GPU. All the source code and trained models are available at https://github.com/Zzh-tju/CIoU Full Article
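The three geometric factors named in the CIoU abstract (overlap area, normalized central point distance, aspect ratio) can be combined as in the following sketch. The abstract does not spell out the formula; the code follows the commonly cited CIoU formulation, `1 - IoU + rho^2 / c^2 + alpha * v`, with `v` measuring aspect-ratio disagreement and `alpha = v / ((1 - IoU) + v)`. The `(x1, y1, x2, y2)` box format and the guard for identical boxes are assumptions of this sketch, and the authors' repository linked in the entry remains the reference implementation.

```python
import math


def ciou_loss(box, gt):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Factor 1: overlap area, via intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    iou = inter / (w * h + gw * gh - inter)

    # Factor 2: squared centre distance, normalized by the squared diagonal
    # of the smallest box enclosing both.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Factor 3: aspect-ratio consistency.
    v = (4.0 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    denom = (1.0 - iou) + v
    alpha = v / denom if denom > 0 else 0.0  # identical boxes: no penalty

    return 1.0 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Unlike a plain IoU loss, the centre-distance term still provides a signal when the boxes do not overlap at all, which is one reason the extra geometric factors help with difficult regression cases.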
and NH-HAZE: An Image Dehazing Benchmark with Non-Homogeneous Hazy and Haze-Free Images. (arXiv:2005.03560v1 [cs.CV]) By arxiv.org Published On :: Image dehazing is an ill-posed problem that has been extensively studied in recent years. The objective performance evaluation of dehazing methods is one of the major obstacles, due to the lack of a reference dataset. While synthetic datasets have shown important limitations, the few realistic datasets introduced recently assume homogeneous haze over the entire scene. Since in many real cases haze is not uniformly distributed, we introduce NH-HAZE, a non-homogeneous realistic dataset with pairs of real hazy and corresponding haze-free images. This is the first non-homogeneous image dehazing dataset and contains 55 outdoor scenes. The non-homogeneous haze has been introduced in the scene using a professional haze generator that imitates the real conditions of hazy scenes. Additionally, this work presents an objective assessment of several state-of-the-art single image dehazing methods that were evaluated using the NH-HAZE dataset. Full Article
and Credulous Users and Fake News: a Real Case Study on the Propagation in Twitter. (arXiv:2005.03550v1 [cs.SI]) By arxiv.org Published On :: Recent studies have confirmed a growing trend, especially among youngsters, of using Online Social Media as their favourite information platform at the expense of traditional mass media. Indeed, they can easily reach a wide audience at a high speed; but exactly because of this they are the preferred medium for influencing public opinion via so-called fake news. Moreover, there is general agreement that the main vehicles of fake news are malicious software robots (bots) that automatically interact with human users. In previous work we have considered the problem of tagging human users in Online Social Networks as credulous users. Specifically, we have considered credulous those users with a relatively high number of bot friends compared to the total number of their social friends. We consider this group of users worthy of attention because they might have a higher exposure to malicious activities and they may contribute to the spreading of fake information by sharing dubious content. In this work, starting from a dataset of fake news, we investigate the behaviour and the degree of involvement of credulous users in fake news diffusion. The study aims to: (i) fight fake news by considering the content diffused by credulous users; (ii) highlight the relationship between credulous users and fake news spreading; (iii) target fake news detection by focusing on the analysis of specific accounts more exposed to malicious activities of bots. Our first results demonstrate a strong involvement of credulous users in fake news diffusion. These findings call for tools that, by performing data streaming on credulous users' actions, enable us to perform targeted fact-checking. Full Article
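The notion of a credulous user in the entry above (a relatively high share of bot accounts among a user's friends) can be sketched as a simple ratio test. The abstract does not state how "relatively high" is decided, so the `threshold` parameter, the function names, and the input format (a mapping from user to friend list plus a set of known bot accounts) are all assumptions made for illustration.

```python
def bot_friend_ratio(friends, bots):
    """Fraction of a user's friends that are bots (0.0 if no friends)."""
    if not friends:
        return 0.0
    return sum(1 for f in friends if f in bots) / len(friends)


def credulous_users(friend_lists, bots, threshold):
    """Users whose bot-friend ratio is at least `threshold`, sorted by name.

    `threshold` is a hypothetical cut-off; the source gives no value.
    """
    return sorted(
        user for user, friends in friend_lists.items()
        if bot_friend_ratio(friends, bots) >= threshold
    )


if __name__ == "__main__":
    friend_lists = {
        "alice": ["b1", "b2", "carol"],
        "dave": ["carol", "erin", "b1", "frank"],
    }
    print(credulous_users(friend_lists, bots={"b1", "b2"}, threshold=0.5))
```

Using a ratio rather than an absolute bot count keeps highly connected accounts from being flagged merely because they follow many accounts of every kind.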
and MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. (arXiv:2005.03545v1 [cs.CL]) By arxiv.org Published On :: Multimodal Sentiment Analysis is an active area of research that leverages multimodal signals for affective understanding of user-generated videos. The predominant approach, addressing this task, has been to develop sophisticated fusion techniques. However, the heterogeneous nature of the signals creates distributional modality gaps that pose significant challenges. In this paper, we aim to learn effective modality representations to aid the process of fusion. We propose a novel framework, MISA, which projects each modality to two distinct subspaces. The first subspace is modality invariant, where the representations across modalities learn their commonalities and reduce the modality gap. The second subspace is modality-specific, which is private to each modality and captures their characteristic features. These representations provide a holistic view of the multimodal data, which is used for fusion that leads to task predictions. Our experiments on popular sentiment analysis benchmarks, MOSI and MOSEI, demonstrate significant gains over state-of-the-art models. We also consider the task of Multimodal Humor Detection and experiment on the recently proposed UR_FUNNY dataset. Here too, our model fares better than strong baselines, establishing MISA as a useful multimodal framework. Full Article
and CounQER: A System for Discovering and Linking Count Information in Knowledge Bases. (arXiv:2005.03529v1 [cs.IR]) By arxiv.org Published On :: Predicate constraints of general-purpose knowledge bases (KBs) like Wikidata, DBpedia and Freebase are often limited to subproperty, domain and range constraints. In this demo we showcase CounQER, a system that illustrates the alignment of counting predicates, like staffSize, and enumerating predicates, like workInstitution^{-1} . In the demonstration session, attendees can inspect these alignments, and will learn about the importance of these alignments for KB question answering and curation. CounQER is available at https://counqer.mpi-inf.mpg.de/spo. Full Article
and An asynchronous distributed and scalable generalized Nash equilibrium seeking algorithm for strongly monotone games. (arXiv:2005.03507v1 [cs.GT]) By arxiv.org Published On :: In this paper, we present three distributed algorithms to solve a class of generalized Nash equilibrium (GNE) seeking problems in strongly monotone games. The first one (SD-GENO) is based on synchronous updates of the agents, while the second and the third (AD-GEED and AD-GENO) represent asynchronous solutions that are robust to communication delays. AD-GENO can be seen as a refinement of AD-GEED, since it only requires node auxiliary variables, enhancing the scalability of the algorithm. Our main contribution is to prove convergence to a variational GNE of the game via an operator-theoretic approach. Finally, we apply the algorithms to network Cournot games and show how different activation sequences and delays affect convergence. We also compare the proposed algorithms to the only other in the literature (ADAGNES), and observe that AD-GENO outperforms the alternative. Full Article
and Computing with bricks and mortar: Classification of waveforms with a doped concrete blocks. (arXiv:2005.03498v1 [cs.ET]) By arxiv.org Published On :: We present results showing the capability of a concrete-based information processing substrate in the signal classification task, in accordance with the in materio computing paradigm. As Reservoir Computing is a suitable model for describing embedded in materio computation, we propose that the presented basic construction unit can be used as a source for the "reservoir of states" necessary for simple tuning of the readout layer. In that perspective, buildings constructed from computing concrete could function as a highly parallel information processor for smart architecture. We present an electrical characterization of the set of samples with different additive concentrations followed by a dynamical analysis of selected specimens showing fingerprints of memfractive properties. Moreover, on the basis of obtained parameters, classification of the signal waveform shapes can be performed in scenarios explicitly tuned for a given device terminal. Full Article
and Predictions and algorithmic statistics for infinite sequence. (arXiv:2005.03467v1 [cs.IT]) By arxiv.org Published On :: Consider the following prediction problem. Assume that there is a black box that produces bits according to some unknown computable distribution on the binary tree. We know the first $n$ bits $x_1 x_2 \ldots x_n$. We want to know the probability of the event that the next bit is equal to $1$. Solomonoff suggested using the universal semimeasure $m$ for solving this task. He proved that for every computable distribution $P$ and for every $b \in \{0,1\}$ the following holds: $$\sum_{n=1}^{\infty}\sum_{x: l(x)=n} P(x) (P(b | x) - m(b | x))^2 < \infty .$$ However, Solomonoff's method has a negative aspect: Hutter and Muchnik proved that there exist a universal semimeasure $m$, a computable distribution $P$ and a random (in the Martin-Löf sense) sequence $x_1 x_2\ldots$ such that $P(x_{n+1} | x_1\ldots x_n) - m(x_{n+1} | x_1\ldots x_n) \nrightarrow 0$ as $n \to \infty$. We suggest a new way of prediction. For every finite string $x$ we predict the new bit according to the best (in some sense) distribution for $x$. We prove a result similar to Solomonoff's theorem for our method of prediction. We also show that our method of prediction does not share this negative aspect of Solomonoff's method. Full Article
and Detection and Feeder Identification of the High Impedance Fault at Distribution Networks Based on Synchronous Waveform Distortions. (arXiv:2005.03411v1 [eess.SY]) By arxiv.org Published On :: Diagnosis of high impedance faults (HIFs) is a challenge for present-day distribution network protection. The fault current of a HIF is much lower than that of a normal load, and the fault features are significantly affected by fault scenarios. A detection and feeder identification algorithm for HIFs is proposed in this paper, based on high-resolution and synchronous waveform data. In the algorithm, an interval slope is defined to describe the waveform distortions, which guarantees a uniform feature description under various HIF nonlinearities and noise interferences. For three typical types of network neutrals, i.e., isolated neutral, resonant neutral, and low-resistor-earthed neutral, differences of the distorted components between the zero-sequence currents of healthy and faulty feeders are mathematically deduced, respectively. As a result, the proposed criterion, which is based on the distortion relationships between zero-sequence currents of feeders and the zero-sequence voltage at the substation, is theoretically supported. 28 HIFs grounded to various materials are tested in a 10 kV distribution network with three neutral types, and are utilized to verify the effectiveness of the proposed algorithm. Full Article
and AutoSOS: Towards Multi-UAV Systems Supporting Maritime Search and Rescue with Lightweight AI and Edge Computing. (arXiv:2005.03409v1 [cs.RO]) By arxiv.org Published On :: Rescue vessels are the main actors in maritime safety and rescue operations. At the same time, aerial drones bring a significant advantage into this scenario. This paper presents the research directions of the AutoSOS project, where we work on the development of an autonomous multi-robot search and rescue assistance platform capable of sensor fusion and object detection in embedded devices using novel lightweight AI models. The platform is meant to perform reconnaissance missions for initial assessment of the environment using novel adaptive deep learning algorithms that efficiently use the available sensors and computational resources on the drones and the rescue vessel. When drones find potential objects, they will send their sensor data to the vessel to verify the findings with increased accuracy. The actual rescue and treatment operations are left as the responsibility of the rescue personnel. The drones will autonomously reconfigure their spatial distribution to enable multi-hop communication when a direct connection between a drone transmitting information and the vessel is unavailable. Full Article
and Joint Prediction and Time Estimation of COVID-19 Developing Severe Symptoms using Chest CT Scan. (arXiv:2005.03405v1 [eess.IV]) By arxiv.org Published On :: With the rapid worldwide spread of Coronavirus disease (COVID-19), it is of great importance to conduct early diagnosis of COVID-19 and predict the time at which patients might convert to the severe stage, in order to design effective treatment plans and reduce clinicians' workloads. In this study, we propose a joint classification and regression method to determine whether a patient will develop severe symptoms at a later time and, if so, to predict the time the patient will take to convert to the severe stage. To do this, the proposed method takes into account 1) the weight for each sample, to reduce the influence of outliers and address the problem of imbalanced classification, and 2) the weight for each feature via a sparsity regularization term, to remove the redundant features of high-dimensional data and learn the information shared across the classification task and the regression task. To our knowledge, this study is the first work to predict both the disease progression and the conversion time, which could help clinicians deal with potential severe cases in time or even save patients' lives. Experimental analysis was conducted on a real data set from two hospitals with 422 chest computed tomography (CT) scans, where 52 cases converted to severe after 5.64 days on average and 34 cases were severe at admission. Results show that our method achieves the best classification (e.g., 85.91% accuracy) and regression (e.g., a correlation coefficient of 0.462) performance, compared to all comparison methods. Moreover, our proposed method yields 76.97% accuracy for predicting the severe cases, a correlation coefficient of 0.524, and a 0.55-day difference for the conversion time. Full Article
and Simultaneous topology and fastener layout optimization of assemblies considering joint failure. (arXiv:2005.03398v1 [cs.CE]) By arxiv.org Published On :: This paper provides a method for the simultaneous topology optimization of parts and their corresponding joint locations in an assembly. Therein, the joint locations are not discrete and predefined, but continuously movable. The underlying coupling equations allow for connecting dissimilar meshes and avoid the need for remeshing when joint locations change. The presented method models the force transfer at a joint location not only by using single spring elements but accounts for the size and type of the joints. When considering riveted or bolted joints, the local part geometry at the joint location consists of holes that are surrounded by material. For spot welds, the joint locations are filled with material and may be smaller than for bolts. The presented method incorporates these material and clearance zones into the simultaneously running topology optimization of the parts. Furthermore, failure of joints may be taken into account at the optimization stage, yielding assemblies connected in a fail-safe manner. Full Article
and WSMN: An optimized multipurpose blind watermarking in Shearlet domain using MLP and NSGA-II. (arXiv:2005.03382v1 [cs.CR]) By arxiv.org Published On :: Digital watermarking is an important topic in the field of information security for preventing the misuse of images in multimedia networks. Although access by unauthorized persons can be prevented through cryptography, cryptography cannot simultaneously be used for copyright protection or content authentication while preserving image integrity. Hence, this paper presents an optimized multipurpose blind watermarking scheme in the Shearlet domain with the help of smart algorithms including MLP and NSGA-II. In this method, four copies of the robust copyright logo are embedded in the approximate coefficients of Shearlet by using an effective quantization technique. Furthermore, an embedded random sequence serving as a semi-fragile authentication mark is effectively extracted from the details by the neural network. Because an effective optimization algorithm is used to select optimum embedding thresholds and to distinguish the texture of blocks, imperceptibility and robustness are preserved. The experimental results reveal the superiority of the scheme over other state-of-the-art schemes with regard to the quality of watermarked images and robustness against hybrid attacks. The average PSNR and SSIM of the dual watermarked images are 38 dB and 0.95, respectively. In addition, the scheme can effectively extract the copyright logo and locate forgery regions under severe attacks with satisfactory accuracy. Full Article
and Vid2Curve: Simultaneously Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video. (arXiv:2005.03372v1 [cs.GR]) By arxiv.org Published On :: Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world. It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion. We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera. Specifically, we present a new curve-based approach to estimate accurate camera poses by establishing correspondences between featureless thin objects in the foreground in consecutive video frames, without requiring visual texture in the background scene to lock on. Enabled by this effective curve-based camera pose estimation strategy, we develop an iterative optimization method with tailored measures on geometry, topology as well as self-occlusion handling for reconstructing 3D thin structures. Extensive validations on a variety of thin structures show that our method achieves accurate camera pose estimation and faithful reconstruction of 3D thin structures with complex shape and topology at a level that has not been attained by other existing reconstruction methods. Full Article
and Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT]) By arxiv.org Published On :: We utilize recent results on the exact block error probability of Gaussian random codes in additive white Gaussian noise to analyze Gaussian random coding for massive multiple-access at finite message length. Soft iterative interference cancellation is found to closely approach the performance bounds recently found in [1]. The existence of two fundamentally different regimes in the trade-off between power and bandwidth efficiency reported in [2] is related to much older results in [3] on power optimization by linear programming. Furthermore, we tighten the achievability bounds of [1] in the low power regime and show that orthogonal constellations are very close to the theoretical limits for message lengths around 100 and above. Full Article
and Estimating Blood Pressure from Photoplethysmogram Signal and Demographic Features using Machine Learning Techniques. (arXiv:2005.03357v1 [eess.SP]) By arxiv.org Published On :: Hypertension is a potentially unsafe health ailment, which is indicated directly by blood pressure (BP). Hypertension always leads to other health complications. Continuous monitoring of BP is very important; however, cuff-based BP measurements are discrete and uncomfortable for the user. To address this need, a cuff-less, continuous, and non-invasive BP measurement system is proposed that uses photoplethysmogram (PPG) signals and demographic features with machine learning (ML) algorithms. PPG signals were acquired from 219 subjects and underwent pre-processing and feature extraction steps. Time, frequency, and time-frequency domain features were extracted from the PPG and their derivative signals. Feature selection techniques were used to reduce the computational complexity and to decrease the chance of over-fitting the ML algorithms. The features were then used to train and evaluate ML algorithms. The best regression models were selected for systolic BP (SBP) and diastolic BP (DBP) estimation individually. Gaussian Process Regression (GPR) along with the ReliefF feature selection algorithm outperforms other algorithms in estimating SBP and DBP, with a root-mean-square error (RMSE) of 6.74 and 3.59, respectively. This ML model can be implemented in hardware systems to continuously monitor BP and avoid any critical health conditions due to sudden changes. Full Article
and DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL]) By arxiv.org Published On :: Despite recent progress in computer vision and natural language processing, video understanding intelligence is still hard to achieve due to the intrinsic difficulty of story in video. Moreover, there is no theoretical metric for evaluating the degree of video understanding. In this paper, we propose a novel video question answering (Video QA) task, DramaQA, for a comprehensive understanding of the video story. DramaQA focuses on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations to model local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and it contains 16,191 QA pairs from 23,928 video clips of various lengths, with each QA pair belonging to one of four difficulty levels. We provide 217,308 annotated images with rich character-centered annotations, including visual bounding boxes, behaviors, and emotions of main characters, and coreference-resolved scripts. Additionally, we provide analyses of the dataset as well as a Dual Matching Multistream model which effectively learns character-centered representations of video to answer questions about the video. We are planning to release our dataset and model publicly for research purposes and expect that our work will provide a new perspective on video story understanding research. Full Article
and Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV]) By arxiv.org Published On :: This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal well with the spatial variations commonly found in the pancreas. Also, shape variations are not represented by an averaged atlas. We propose a fully automated pancreas segmentation method that deals with the two types of variations mentioned above. The position and size of the pancreas are estimated using a regression forest technique. After localization, a patient-specific probabilistic atlas is generated based on a new image similarity that reflects the blood vessel position and direction information around the pancreas. We segment the pancreas using the EM algorithm with the atlas as prior, followed by graph cut. In evaluation results using 147 CT volumes, the Jaccard index and the Dice overlap of the proposed method were 62.1% and 75.1%, respectively. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the Dice overlap. Full Article
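The two overlap scores quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes a segmentation is represented as a set of voxel coordinates; the function names and the toy masks are hypothetical and are not taken from the paper.

```python
def jaccard_index(pred, truth):
    """|A ∩ B| / |A ∪ B| for two sets of voxel coordinates."""
    union = pred | truth
    return len(pred & truth) / len(union) if union else 1.0


def dice_overlap(pred, truth):
    """2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    total = len(pred) + len(truth)
    return 2 * len(pred & truth) / total if total else 1.0


# Toy 2D "masks": 4 voxels each, 2 of them shared (hypothetical data).
pred = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(1, 0), (1, 1), (2, 0), (2, 1)}
j = jaccard_index(pred, truth)  # 2 shared voxels out of 6 in the union
d = dice_overlap(pred, truth)   # 2 * 2 shared voxels out of 4 + 4
```

For a single pair of masks the two scores are tied by D = 2J / (1 + J); figures that are averaged over many volumes need not satisfy this identity exactly.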
and Arranging Test Tubes in Racks Using Combined Task and Motion Planning. (arXiv:2005.03342v1 [cs.RO]) By arxiv.org Published On :: The paper develops a robotic manipulation system to address the pressing need to handle a large number of test tubes in clinical examination and to replace or reduce human labor. It presents the technical details of the system, which separates and arranges test tubes in racks with the help of 3D vision and artificial intelligence (AI) reasoning/planning. The developed system only requires a person to put a rack with mixed and non-arranged tubes in front of a robot. The robot autonomously performs recognition, reasoning, planning, manipulation, etc., and returns a rack with separated and arranged tubes. The system is simple to use and requires no expert knowledge in robotics. We expect such a system to play an important role in helping manage public health and hope similar systems can be extended to other clinical manipulation tasks, such as handling mixers and pipettes, in the future. Full Article
and Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR]) By arxiv.org Published On :: XSS is a security vulnerability that permits injecting malicious code into the client side of a web application. In the simplest situations, XSS vulnerabilities arise when a web application includes the user input in the web output without due sanitization. Such simple XSS vulnerabilities can be detected fairly reliably with blackbox scanners, which inject malicious payload into sensitive parts of HTTP requests and look for the reflected values in the web output. Contemporary blackbox scanners are not effective against stored XSS vulnerabilities, where the malicious payload in an HTTP response originates from the database storage of the web application, rather than from the associated HTTP request. Similarly, many blackbox scanners do not systematically handle context-sensitive XSS vulnerabilities, where the user input is included in the web output after a transformation that prevents the scanner from recognizing the original value, but does not sanitize the value sufficiently. Among the combination of two basic data sources (stored vs reflected) and two basic vulnerability patterns (context sensitive vs not so), only one is therefore tested systematically by state-of-the-art blackbox scanners. Our work focuses on systematic coverage of the three remaining combinations. We present a graybox mechanism that extends a general purpose database to cooperate with our XSS scanner, reporting and injecting the test inputs at the boundary between the database and the web application. Furthermore, we design a mechanism for identifying the injected inputs in the web output even after encoding by the web application, and check whether the encoding sanitizes the injected inputs correctly in the respective browser context. We evaluate our approach on eight mature and technologically diverse web applications, discovering previously unknown and exploitable XSS flaws in each of those applications. Full Article
and Specification and Automated Analysis of Inter-Parameter Dependencies in Web APIs. (arXiv:2005.03320v1 [cs.SE]) By arxiv.org Published On :: Web services often impose inter-parameter dependencies that restrict the way in which two or more input parameters can be combined to form valid calls to the service. Unfortunately, current specification languages for web services like the OpenAPI Specification (OAS) provide no support for the formal description of such dependencies, which makes it hardly possible to automatically discover and interact with services without human intervention. In this article, we present an approach for the specification and automated analysis of inter-parameter dependencies in web APIs. We first present a domain-specific language, called Inter-parameter Dependency Language (IDL), for the specification of dependencies among input parameters in web services. Then, we propose a mapping to translate an IDL document into a constraint satisfaction problem (CSP), enabling the automated analysis of IDL specifications using standard CSP-based reasoning operations. Specifically, we present a catalogue of nine analysis operations on IDL documents allowing to compute, for example, whether a given request satisfies all the dependencies of the service. Finally, we present a tool suite including an editor, a parser, an OAS extension, a constraint programming-aided library, and a test suite supporting IDL specifications and their analyses. Together, these contributions pave the way for a new range of specification-driven applications in areas such as code generation and testing. Full Article
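One of the analysis operations mentioned in the abstract above, checking whether a given request satisfies all the dependencies of a service, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dependencies are written as plain Python predicates rather than in IDL syntax, no constraint solver is involved, and the endpoint, parameter names, and rules are all hypothetical.

```python
def satisfies_all(request, dependencies):
    """Return True if the request (a dict of parameters) meets every dependency."""
    return all(check(request) for _, check in dependencies)


def violated(request, dependencies):
    """Names of the dependencies that the request breaks, in declaration order."""
    return [name for name, check in dependencies if not check(request)]


# Hypothetical inter-parameter dependencies for an imaginary search endpoint.
DEPENDENCIES = [
    ("radius requires location",
     lambda r: "radius" not in r or "location" in r),
    ("exactly one of query / category",
     lambda r: ("query" in r) != ("category" in r)),
    ("min_price <= max_price when both are given",
     lambda r: not (r.keys() >= {"min_price", "max_price"})
     or r["min_price"] <= r["max_price"]),
]

ok = {"query": "pizza", "location": "52.5,13.4", "radius": 500}
bad = {"query": "pizza", "category": "food", "radius": 500}
```

Here `satisfies_all(ok, DEPENDENCIES)` accepts the first request, while `violated(bad, DEPENDENCIES)` names the two rules the second one breaks. A CSP-based encoding, as the abstract describes, would additionally support operations that plain predicate evaluation cannot, such as reasoning over all possible requests.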
and Encoding in the Dark Grand Challenge: An Overview. (arXiv:2005.03315v1 [eess.IV]) By arxiv.org Published On :: A big part of the video content we consume from video providers consists of genres featuring low-light aesthetics. Low light sequences have special characteristics, such as spatio-temporal varying acquisition noise and light flickering, that make the encoding process challenging. To deal with the spatio-temporal incoherent noise, higher bitrates are used to achieve high objective quality. Additionally, the quality assessment metrics and methods have not been designed, trained or tested for this type of content. This has inspired us to trigger research in that area and propose a Grand Challenge on encoding low-light video sequences. In this paper, we present an overview of the proposed challenge, and test state-of-the-art methods that will be part of the benchmark methods at the stage of the participants' deliverable assessment. From this exploration, our results show that VVC already achieves a high performance compared to simply denoising the video source prior to encoding. Moreover, the quality of the video streams can be further improved by employing a post-processing image enhancement method. Full Article
and Adaptive Dialog Policy Learning with Hindsight and User Modeling. (arXiv:2005.03299v1 [cs.AI]) By arxiv.org Published On :: Reinforcement learning methods have been used to compute dialog policies from language-based interaction experiences. Efficiency is of particular importance in dialog policy learning, because of the considerable cost of interacting with people, and the very poor user experience from low-quality conversations. Aiming at improving the efficiency of dialog policy learning, we develop algorithm LHUA (Learning with Hindsight, User modeling, and Adaptation) that, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation and hindsight provide the dialog agent with more experience and more (positive) reinforcements respectively. Experimental results suggest that, in success rate and policy quality, LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts. Full Article
and YANG2UML: Bijective Transformation and Simplification of YANG to UML. (arXiv:2005.03292v1 [cs.SE]) By arxiv.org Published On :: Software Defined Networking is currently revolutionizing computer networking by decoupling the network control (control plane) from the forwarding functions (data plane), enabling the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services. Next to the well-known OpenFlow protocol, the XML-based NETCONF protocol is also an important means for exchanging configuration information from a management platform and is nowadays even part of OpenFlow. In combination with NETCONF, YANG is the corresponding data modeling language that defines the associated data structures, supporting virtually all network configuration protocols. YANG itself is a semantically rich language, which -- in order to facilitate familiarization with the relevant subject -- is often visualized to involve other experts or developers and to support them in their daily work (writing applications which make use of YANG). In order to support this process, this paper presents a novel approach to optimize and simplify YANG data models, to assist further discussions with the management and implementations (especially of interfaces), and to reduce complexity. To this end, we have defined a bidirectional mapping of YANG to UML and developed a tool that renders the created UML diagrams. This combines the benefits of using the formal language YANG with automatically maintained UML diagrams to involve other experts or developers, closing the gap between technically improved data models and their human readability. Full Article
and RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. (arXiv:2005.03271v1 [eess.AS]) By arxiv.org Published On :: In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched domains: e.g., end-to-end models trained on short segments perform poorly when evaluated on longer utterances. In this work, we analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models in order to identify model components that negatively affect generalization performance. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference. On a long-form YouTube test set, when the non-streaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%. Finally, when trained on Librispeech, we find that dynamic overlapping inference improves WER on YouTube from 99.8% to 33.0%. Full Article
and Structured inversion of the Bernstein-Vandermonde Matrix. (arXiv:2005.03251v1 [math.NA]) By arxiv.org Published On :: Bernstein polynomials, long a staple of approximation theory and computational geometry, have also increasingly become of interest in finite element methods. Many fundamental problems in interpolation and approximation give rise to interesting linear algebra questions. When attempting to find a polynomial approximation of boundary or initial data, one encounters the Bernstein-Vandermonde matrix, which is found to be highly ill-conditioned. Previously, we used the relationship between monomial Bezout matrices and the inverse of Hankel matrices to obtain a decomposition of the inverse of the Bernstein mass matrix in terms of Hankel, Toeplitz, and diagonal matrices. In this paper, we use properties of the Bernstein-Bezout matrix to factor the inverse of the Bernstein-Vandermonde matrix into a difference of products of Hankel, Toeplitz, and diagonal matrices. We also use a nonstandard matrix norm to study the conditioning of the Bernstein-Vandermonde matrix, showing that the conditioning in this case is better than in the standard 2-norm. Additionally, we use properties of multivariate Bernstein polynomials to derive a block $LU$ decomposition of the Bernstein-Vandermonde matrix corresponding to equispaced nodes on the $d$-simplex. Full Article
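For readers unfamiliar with the matrix named in the abstract above, here is a minimal Python sketch of the univariate Bernstein-Vandermonde matrix, whose entry (i, j) is the j-th degree-n Bernstein basis polynomial evaluated at node x_i. The function name and the choice of equispaced nodes on [0, 1] are assumptions made for illustration; the sketch does not implement the structured inversion or the simplex case the paper studies.

```python
from math import comb


def bernstein_vandermonde(nodes, degree):
    """V[i][j] = B_j^n(x_i) = C(n, j) * x_i**j * (1 - x_i)**(n - j)."""
    n = degree
    return [[comb(n, j) * x**j * (1 - x)**(n - j) for j in range(n + 1)]
            for x in nodes]


degree = 4
nodes = [i / degree for i in range(degree + 1)]  # equispaced nodes on [0, 1]
V = bernstein_vandermonde(nodes, degree)

# Each row sums to 1 because the Bernstein basis is a partition of unity.
row_sums = [sum(row) for row in V]
```

The partition-of-unity property gives a cheap sanity check on any implementation, which is useful given how ill-conditioned the matrix becomes as the degree grows.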
and DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC]) By arxiv.org Published On :: Selecting an appropriate model to forecast product demand is critical to the manufacturing industry. However, due to the data complexity, market uncertainty and users' demanding requirements for the model, it is challenging for demand analysts to select a proper model. Although existing model selection methods can reduce the manual burden to some extent, they often fail to present model performance details on individual products and reveal the potential risk of the selected model. This paper presents DFSeer, an interactive visualization system to conduct reliable model selection for demand forecasting based on the products with similar historical demand. It supports model comparison and selection with different levels of details. Besides, it shows the difference in model performance on similar products to reveal the risk of model selection and increase users' confidence in choosing a forecasting model. Two case studies and interviews with domain experts demonstrate the effectiveness and usability of DFSeer. Full Article
and Phase retrieval of complex-valued objects via a randomized Kaczmarz method. (arXiv:2005.03238v1 [cs.IT]) By arxiv.org Published On :: This paper investigates the convergence of the randomized Kaczmarz algorithm for the problem of phase retrieval of complex-valued objects. While this algorithm has been studied for the real-valued case, its generalization to the complex-valued case is nontrivial and has been left as a conjecture. This paper establishes the connection between the convergence of the algorithm and the convexity of an objective function. Based on the connection, it demonstrates that when the sensing vectors are sampled uniformly from a unit sphere and the number of sensing vectors $m$ satisfies $m > O(n \log n)$ as $n, m \rightarrow \infty$, then this algorithm with a good initialization achieves linear convergence to the solution with high probability. Full Article
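The iteration discussed in the abstract above can be sketched in pure Python. This is a minimal illustration under stated assumptions, not the paper's code: measurements are noise-free magnitudes b_i = |<a_i, x>|, the sensing vectors have unit norm, the "good initialization" is simulated by perturbing the true signal, and all function names and problem sizes are hypothetical.

```python
import math
import random


def inner(a, x):
    """Hermitian inner product <a, x> = sum(conj(a_k) * x_k)."""
    return sum(ak.conjugate() * xk for ak, xk in zip(a, x))


def random_unit_vector(n, rng):
    """Draw a vector uniformly from the complex unit sphere in C^n."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(vk) ** 2 for vk in v))
    return [vk / norm for vk in v]


def kaczmarz_phase_retrieval(A, b, x0, iters, rng):
    """Randomized Kaczmarz for b_i = |<a_i, x>| with unit-norm rows a_i."""
    x = list(x0)
    for _ in range(iters):
        i = rng.randrange(len(A))
        r = inner(A[i], x)
        if abs(r) == 0:
            continue  # phase of r is undefined; skip this draw
        # Move x along a_i so that <a_i, x> keeps its phase but has modulus b_i.
        step = r - b[i] * r / abs(r)
        x = [xk - step * ak for xk, ak in zip(x, A[i])]
    return x


def distance_up_to_phase(x, x_true):
    """min over theta of ||x - exp(i*theta) * x_true||."""
    c = inner(x_true, x)
    phase = c / abs(c) if abs(c) > 0 else 1.0
    return math.sqrt(sum(abs(xk - phase * tk) ** 2 for xk, tk in zip(x, x_true)))


rng = random.Random(0)
n, m = 6, 60  # hypothetical sizes with m = 10 n
x_true = random_unit_vector(n, rng)
A = [random_unit_vector(n, rng) for _ in range(m)]
b = [abs(inner(a, x_true)) for a in A]
perturbation = random_unit_vector(n, rng)
x0 = [t + 0.1 * e for t, e in zip(x_true, perturbation)]  # simulated good start
x_hat = kaczmarz_phase_retrieval(A, b, x0, iters=5000, rng=rng)
```

Because magnitudes do not determine a global phase, the error is measured with `distance_up_to_phase` rather than a plain norm; comparing its value at `x0` and at `x_hat` shows how far the iteration has moved toward the solution.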
and Mortar-based entropy-stable discontinuous Galerkin methods on non-conforming quadrilateral and hexahedral meshes. (arXiv:2005.03237v1 [math.NA]) By arxiv.org Published On :: High-order entropy-stable discontinuous Galerkin (DG) methods for nonlinear conservation laws reproduce a discrete entropy inequality by combining entropy conservative finite volume fluxes with summation-by-parts (SBP) discretization matrices. In the DG context, on tensor product (quadrilateral and hexahedral) elements, SBP matrices are typically constructed by collocating at Lobatto quadrature points. Recent work has extended the construction of entropy-stable DG schemes to collocation at more accurate Gauss quadrature points. In this work, we extend entropy-stable Gauss collocation schemes to non-conforming meshes. Entropy-stable DG schemes require computing entropy conservative numerical fluxes between volume and surface quadrature nodes. On conforming tensor product meshes where volume and surface nodes are aligned, flux evaluations are required only between "lines" of nodes. However, on non-conforming meshes, volume and surface nodes are no longer aligned, resulting in a larger number of flux evaluations. We reduce this expense by introducing an entropy-stable mortar-based treatment of non-conforming interfaces via a face-local correction term, and provide necessary conditions for high-order accuracy. Numerical experiments in both two and three dimensions confirm the stability and accuracy of this approach. Full Article
and Multi-Target Deep Learning for Algal Detection and Classification. (arXiv:2005.03232v1 [cs.CV]) By arxiv.org Published On :: Water quality has a direct impact on industry, agriculture, and public health. Algae species are common indicators of water quality. This is because algal communities are sensitive to changes in their habitats, giving valuable knowledge on variations in water quality. However, water quality analysis requires professional inspection of algal detection and classification under microscopes, which is very time-consuming and tedious. In this paper, we propose a novel multi-target deep learning framework for algal detection and classification. Extensive experiments were carried out on a large-scale colored microscopic algal dataset. Experimental results demonstrate that the proposed method achieves promising performance on algal detection, class identification, and genus identification. Full Article
and Constructing Accurate and Efficient Deep Spiking Neural Networks with Double-threshold and Augmented Schemes. (arXiv:2005.03231v1 [cs.NE]) By arxiv.org Published On :: Spiking neural networks (SNNs) are considered a potential candidate to overcome current challenges such as the high power consumption encountered by artificial neural networks (ANNs); however, there is still a gap between them with respect to the recognition accuracy on practical tasks. A conversion strategy was thus introduced recently to bridge this gap by mapping a trained ANN to an SNN. However, it is still unclear to what extent the resulting SNN can benefit from both the accuracy advantage of the ANN and the high efficiency of the spike-based paradigm of computation. In this paper, we propose two new conversion methods, namely TerMapping and AugMapping. TerMapping is a straightforward extension of a typical threshold-balancing method with a double-threshold scheme, while AugMapping additionally incorporates a new scheme of augmented spikes that employs a spike coefficient to carry the number of typical all-or-nothing spikes occurring at a time step. We examine the performance of our methods on the MNIST, Fashion-MNIST, and CIFAR10 datasets. The results show that the proposed double-threshold scheme can effectively improve the accuracies of the converted SNNs. More importantly, the proposed AugMapping is more advantageous for constructing accurate, fast, and efficient deep SNNs as compared to other state-of-the-art approaches. Our study therefore provides new approaches for further integration of advanced techniques in ANNs to improve the performance of SNNs, which could be of great merit to applied developments with spike-based neuromorphic computing. Full Article
and What comprises a good talking-head video generation?: A Survey and Benchmark. (arXiv:2005.03201v1 [cs.CV]) By arxiv.org Published On :: Over the years, performance evaluation has become essential in computer vision, enabling tangible progress in many sub-fields. While talking-head video generation has become an emerging research topic, existing evaluations on this topic present many limitations. For example, most approaches use human subjects (e.g., via Amazon MTurk) to evaluate their research claims directly. This subjective evaluation is cumbersome, unreproducible, and may impede the evolution of new research. In this work, we present a carefully designed benchmark for evaluating talking-head video generation with standardized dataset pre-processing strategies. As for evaluation, we either propose new metrics or select the most appropriate ones to evaluate results against what we consider the desired properties of a good talking-head video, namely, identity preservation, lip synchronization, high video quality, and natural, spontaneous motion. By conducting a thoughtful analysis across several state-of-the-art talking-head generation approaches, we aim to uncover the merits and drawbacks of current methods and point out promising directions for future work. All the evaluation code is available at: https://github.com/lelechen63/talking-head-generation-survey. Full Article
and Recognizing Exercises and Counting Repetitions in Real Time. (arXiv:2005.03194v1 [cs.CV]) By arxiv.org Published On :: Artificial intelligence technology has become essential in a variety of industries, including the fitness industry. Human pose estimation has been one of the important research areas in computer vision over the last few years. In this project, pose estimation and deep machine learning techniques are combined to analyze performance and report feedback on the repetitions of performed exercises in real time. Involving machine learning technology in the fitness industry could help judges count repetitions of any exercise during weightlifting or CrossFit competitions. Full Article
and Trains, Games, and Complexity: 0/1/2-Player Motion Planning through Input/Output Gadgets. (arXiv:2005.03192v1 [cs.CC]) By arxiv.org Published On :: We analyze the computational complexity of motion planning through local "input/output" gadgets with separate entrances and exits, and a subset of allowed traversals from entrances to exits, each of which changes the state of the gadget and thereby the allowed traversals. We study such gadgets in the 0-, 1-, and 2-player settings, in particular extending past motion-planning-through-gadgets work to 0-player games for the first time, by considering "branchless" connections between gadgets that route every gadget's exit to a unique gadget's entrance. Our complexity results include containment in L, NL, P, NP, and PSPACE; as well as hardness for NL, P, NP, and PSPACE. We apply these results to show PSPACE-completeness for certain mechanics in Factorio, [the Sequence], and a restricted version of Trainyard, improving prior results. This work strengthens prior results on switching graphs and reachability switching games. Full Article
and An Optimal Control Theory for the Traveling Salesman Problem and Its Variants. (arXiv:2005.03186v1 [math.OC]) By arxiv.org Published On :: We show that the traveling salesman problem (TSP) and its many variants may be modeled as functional optimization problems over a graph. In this formulation, all vertices and arcs of the graph are functionals; i.e., mappings from a space of measurable functions to the field of real numbers. Many variants of the TSP, such as those with neighborhoods, with forbidden neighborhoods, with time-windows and with profits, can all be framed under this construct. In sharp contrast to their discrete-optimization counterparts, the modeling constructs presented in this paper represent a fundamentally new domain of analysis and computation for TSPs and their variants. Beyond its apparent mathematical unification of a class of problems in graph theory, the main advantage of the new approach is that it facilitates the modeling of certain application-specific problems in their home space of measurable functions. Consequently, certain elements of economic system theory such as dynamical models and continuous-time cost/profit functionals can be directly incorporated in the new optimization problem formulation. Furthermore, subtour elimination constraints, prevalent in discrete optimization formulations, are naturally enforced through continuity requirements. The price for the new modeling framework is nonsmooth functionals. Although a number of theoretical issues remain open in the proposed mathematical framework, we demonstrate the computational viability of the new modeling constructs over a sample set of problems to illustrate the rapid production of end-to-end TSP solutions to extensively-constrained practical problems. Full Article
and Determinantal Point Processes in Randomized Numerical Linear Algebra. (arXiv:2005.03185v1 [cs.DS]) By arxiv.org Published On :: Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop improved algorithms for matrix problems that arise in scientific computing, data science, machine learning, etc. Determinantal Point Processes (DPPs), a seemingly unrelated topic in pure and applied mathematics, are a class of stochastic point processes with probability distribution characterized by sub-determinants of a kernel matrix. Recent work has uncovered deep and fruitful connections between DPPs and RandNLA which lead to new guarantees and improved algorithms that are of interest to both areas. We provide an overview of this exciting new line of research, including brief introductions to RandNLA and DPPs, as well as applications of DPPs to classical linear algebra tasks such as least squares regression, low-rank approximation and the Nyström method. For example, random sampling with a DPP leads to new kinds of unbiased estimators for least squares, enabling more refined statistical and inferential understanding of these algorithms; a DPP is, in some sense, an optimal randomized algorithm for the Nyström method; and a RandNLA technique called leverage score sampling can be derived as the marginal distribution of a DPP. We also discuss recent algorithmic developments, illustrating that, while not quite as efficient as standard RandNLA techniques, DPP-based algorithms are only moderately more expensive. Full Article
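To make "characterized by sub-determinants of a kernel matrix" concrete, here is a minimal sketch using the standard L-ensemble form of a DPP, in which a subset S has probability det(L_S) / det(L + I) and the marginal kernel is K = L (L + I)^{-1}. The abstract does not say which parameterization the survey uses, so the L-ensemble form, the function names, and the random 4-item kernel are assumptions made for illustration.

```python
import itertools

import numpy as np


def l_ensemble_probability(L, subset):
    """Probability that an L-ensemble DPP selects exactly `subset`."""
    n = L.shape[0]
    idx = list(subset)
    # The determinant of the empty submatrix is 1 by convention.
    numerator = np.linalg.det(L[np.ix_(idx, idx)]) if idx else 1.0
    return numerator / np.linalg.det(L + np.eye(n))


def marginal_kernel(L):
    """K = L (L + I)^{-1}; K[i, i] is the probability that item i is selected."""
    n = L.shape[0]
    return L @ np.linalg.inv(L + np.eye(n))


# Hypothetical kernel: a random positive semidefinite 4 x 4 matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 3))
L = B @ B.T

# Enumerate all 2^4 subsets and check the distribution by brute force.
subsets = [s for r in range(5) for s in itertools.combinations(range(4), r)]
probs = {s: l_ensemble_probability(L, s) for s in subsets}
total = sum(probs.values())
K = marginal_kernel(L)
marginal_of_item_0 = sum(p for s, p in probs.items() if 0 in s)
```

Brute-force enumeration is only feasible for tiny ground sets; it is used here to check that the subset probabilities sum to one and that the diagonal of K matches the inclusion probabilities.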
and Lattice-based public key encryption with equality test in standard model, revisited. (arXiv:2005.03178v1 [cs.CR]) By arxiv.org Published On :: Public key encryption with equality test (PKEET) allows testing whether two ciphertexts are generated from the same message or not. PKEET is a potential candidate for many practical applications like efficient data management on encrypted databases. The potential applicability of PKEET has led to intensive research since its first instantiation by Yang et al. (CT-RSA 2010). Most of the follow-up constructions are secure in the random oracle model. Moreover, the security of all the concrete constructions is based on number-theoretic hardness assumptions which are vulnerable in the post-quantum era. Recently, Lee et al. (ePrint 2016) proposed a generic construction of PKEET schemes in the standard model and hence it is possible to yield the first instantiation of PKEET schemes based on lattices. Their method is to use a $2$-level hierarchical identity-based encryption (HIBE) scheme together with a one-time signature scheme. In this paper, we propose, for the first time, a direct construction of a PKEET scheme based on the hardness assumption of lattices in the standard model. More specifically, the security of the proposed scheme reduces to the hardness of the Learning With Errors problem. Full Article