
Subquadratic-Time Algorithms for Normal Bases. (arXiv:2005.03497v1 [cs.SC])

For any finite Galois field extension $\mathsf{K}/\mathsf{F}$, with Galois group $G = \mathrm{Gal}(\mathsf{K}/\mathsf{F})$, there exists an element $\alpha \in \mathsf{K}$ whose orbit $G\cdot\alpha$ forms an $\mathsf{F}$-basis of $\mathsf{K}$. Such an $\alpha$ is called a normal element and $G\cdot\alpha$ is a normal basis. We introduce a probabilistic algorithm for testing whether a given $\alpha \in \mathsf{K}$ is normal, when $G$ is either a finite abelian or a metacyclic group. The algorithm is based on the fact that deciding whether $\alpha$ is normal can be reduced to deciding whether $\sum_{g \in G} g(\alpha)g \in \mathsf{K}[G]$ is invertible; it requires a slightly subquadratic number of operations. Once we know that $\alpha$ is normal, we show how to perform conversions between the working basis of $\mathsf{K}/\mathsf{F}$ and the normal basis with the same asymptotic cost.
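To make the definition of a normal element concrete, the following minimal sketch tests normality in the special case $\mathsf{K} = \mathrm{GF}(2^n)$, $\mathsf{F} = \mathrm{GF}(2)$, where the Galois group is generated by the Frobenius map $x \mapsto x^2$. It is a naive rank test on the conjugates, not the subquadratic algorithm of the paper; the bit-mask encoding of field elements and all function names are assumptions made for illustration.

```python
def gf2_mulmod(a: int, b: int, modulus: int, n: int) -> int:
    """Multiply two elements of GF(2^n) given as bit masks of polynomial
    coefficients; `modulus` is the degree-n irreducible field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:  # degree reached n: reduce by the field polynomial
            a ^= modulus
    return result


def gf2_rank(rows: list[int]) -> int:
    """Rank over GF(2) of vectors encoded as integers."""
    pivots: dict[int, int] = {}  # highest set bit -> basis vector
    for v in rows:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]  # eliminate the leading bit and continue
    return len(pivots)


def is_normal(alpha: int, modulus: int, n: int) -> bool:
    """True if alpha, alpha^2, alpha^4, ... (n conjugates) span GF(2^n) over GF(2)."""
    conjugates = []
    x = alpha
    for _ in range(n):
        conjugates.append(x)
        x = gf2_mulmod(x, x, modulus, n)  # apply the Frobenius map
    return gf2_rank(conjugates) == n


# GF(8) built with x^3 + x^2 + 1: the class of x is normal.
print(is_normal(0b010, 0b1101, 3))
# GF(8) built with x^3 + x + 1: the class of x is not normal, but x + 1 is.
print(is_normal(0b010, 0b1011, 3), is_normal(0b011, 0b1011, 3))
```

This test costs roughly cubic time in $n$ because of the elimination step, which is the kind of cost the abstract's subquadratic approach is meant to avoid.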





High Performance Interference Suppression in Multi-User Massive MIMO Detector. (arXiv:2005.03466v1 [cs.OH])

In this paper, we propose a new nonlinear detector with improved interference suppression for Multi-User Multiple-Input Multiple-Output (MU-MIMO) systems. The proposed detector combines the following parts: QR decomposition (QRD), low-complexity user sorting before QRD, the sorting-reduced (SR) K-best method, and minimum mean square error (MMSE) pre-processing. Our method significantly outperforms the linear interference rejection combining (IRC, i.e., MMSE naturally) method in both strong-interference and additive-white-noise scenarios, with both ideal and real channel estimation. This result is of wide practical importance for scenarios with strong interference, e.g., when co-located users access the internet in a stadium, on a highway, or in a shopping center. Simulation results are presented for the non-line-of-sight 3D-UMa model of the 5G QuaDRiGa 2.0 channel, for 16 highly correlated single-antenna users with QAM16 modulation and a 64-antenna Massive MIMO system. The performance is compared with MMSE and other detection approaches.





An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.





Detection and Feeder Identification of the High Impedance Fault at Distribution Networks Based on Synchronous Waveform Distortions. (arXiv:2005.03411v1 [eess.SY])

Diagnosis of high impedance faults (HIFs) is a challenge for present-day distribution network protection. The fault current of a HIF is much lower than that of a normal load, and the fault features are significantly affected by fault scenarios. A detection and feeder identification algorithm for HIFs is proposed in this paper, based on high-resolution and synchronous waveform data. In the algorithm, an interval slope is defined to describe the waveform distortions, which guarantees a uniform feature description under various HIF nonlinearities and noise interferences. For three typical types of network neutrals, i.e., isolated neutral, resonant neutral, and low-resistor-earthed neutral, differences of the distorted components between the zero-sequence currents of healthy and faulty feeders are mathematically deduced, respectively. As a result, the proposed criterion, which is based on the distortion relationships between zero-sequence currents of feeders and the zero-sequence voltage at the substation, is theoretically supported. 28 HIFs grounded to various materials are tested in a 10kV distribution network with three neutral types, and are utilized to verify the effectiveness of the proposed algorithm.





A LiDAR-based real-time capable 3D Perception System for Automated Driving in Urban Domains. (arXiv:2005.03404v1 [cs.RO])

We present a LiDAR-based and real-time capable 3D perception system for automated driving in urban domains. The hierarchical system design is able to model stationary and movable parts of the environment simultaneously and under real-time conditions. Our approach extends the state of the art by innovative in-detail enhancements for perceiving road users and drivable corridors even in case of non-flat ground surfaces and overhanging or protruding elements. We describe a runtime-efficient pointcloud processing pipeline, consisting of adaptive ground surface estimation, 3D clustering and motion classification stages. Based on the pipeline's output, the stationary environment is represented in a multi-feature mapping and fusion approach. Movable elements are represented in an object tracking system capable of using multiple reference points to account for viewpoint changes. We further enhance the tracking system by explicit consideration of occlusion and ambiguity cases. Our system is evaluated using a subset of the TUBS Road User Dataset. We enhance common performance metrics by considering application-driven aspects of real-world traffic scenarios. The perception system shows impressive results and is able to cope with the addressed scenarios while still preserving real-time capability.





Simultaneous topology and fastener layout optimization of assemblies considering joint failure. (arXiv:2005.03398v1 [cs.CE])

This paper provides a method for the simultaneous topology optimization of parts and their corresponding joint locations in an assembly. Therein, the joint locations are not discrete and predefined, but continuously movable. The underlying coupling equations allow for connecting dissimilar meshes and avoid the need for remeshing when joint locations change. The presented method models the force transfer at a joint location not only by using single spring elements but accounts for the size and type of the joints. When considering riveted or bolted joints, the local part geometry at the joint location consists of holes that are surrounded by material. For spot welds, the joint locations are filled with material and may be smaller than for bolts. The presented method incorporates these material and clearance zones into the simultaneously running topology optimization of the parts. Furthermore, failure of joints may be taken into account at the optimization stage, yielding assemblies connected in a fail-safe manner.





Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation. (arXiv:2005.03393v1 [cs.CL])

In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder does not only encode the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder in context-aware translation - some of the improvements come from robust training. We compare several methods that introduce noise and/or a well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. Also, we establish a new state of the art on the IWSLT Fr-En task by careful use of noise generation and dropout methods.





Scoring Root Necrosis in Cassava Using Semantic Segmentation. (arXiv:2005.03367v1 [eess.IV])

Cassava, a major food crop in many parts of Africa, has been severely affected by Cassava Brown Streak Disease (CBSD). The disease affects tuberous roots and presents symptoms that include a yellow/brown, dry, corky necrosis within the starch-bearing tissues. Cassava breeders currently depend on visual inspection to score necrosis in roots based on a qualitative score, which is quite subjective. In this paper we present an approach to automate root necrosis scoring using deep convolutional neural networks with semantic segmentation. Our experiments show that the UNet model performs this task with high accuracy, achieving a mean Intersection over Union (IoU) of 0.90 on the test set. This method provides a means to use a quantitative measure for necrosis scoring on root cross-sections. This is done by segmenting and classifying the necrotized and non-necrotized pixels of cassava root cross-sections without any additional feature engineering.
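As a reminder of what the reported metric measures, here is a minimal sketch of a mean IoU computation on label masks. It averages the per-class IoU over the classes present; whether the paper averages over classes, over images, or both is not stated in the abstract, so that choice and the function names are assumptions.

```python
def iou(pred, target, cls):
    """Intersection over Union for one class on two equally sized label masks."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += int(p == cls and t == cls)
            union += int(p == cls or t == cls)
    return inter / union if union else float("nan")


def mean_iou(pred, target, classes):
    """Average the per-class IoU, skipping classes absent from both masks."""
    scores = [iou(pred, target, c) for c in classes]
    scores = [s for s in scores if s == s]  # NaN is the only value not equal to itself
    return sum(scores) / len(scores)


# Toy 2x2 masks: 1 = necrotized pixel, 0 = non-necrotized pixel.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(iou(pred, target, 1), iou(pred, target, 0), mean_iou(pred, target, [0, 1]))
```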





Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT])

We utilize recent results on the exact block error probability of Gaussian random codes in additive white Gaussian noise to analyze Gaussian random coding for massive multiple-access at finite message length. Soft iterative interference cancellation is found to closely approach the performance bounds recently found in [1]. The existence of two fundamentally different regimes in the trade-off between power and bandwidth efficiency reported in [2] is related to much older results in [3] on power optimization by linear programming. Furthermore, we tighten the achievability bounds of [1] in the low power regime and show that orthogonal constellations are very close to the theoretical limits for message lengths around 100 and above.





JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese--English and News Commentary Japanese--Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks.





Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal well with the spatial variations that are commonly found in the pancreas. Also, shape variations are not represented by an averaged atlas. We propose a fully automated pancreas segmentation method that deals with the two types of variations mentioned above. The position and size of the pancreas are estimated using a regression forest technique. After localization, a patient-specific probabilistic atlas is generated based on a new image similarity that reflects the blood vessel position and direction information around the pancreas. We segment the pancreas using the EM algorithm with the atlas as prior, followed by graph-cut. In evaluation results using 147 CT volumes, the Jaccard index and the Dice overlap of the proposed method were 62.1% and 75.1%, respectively. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the Dice overlap.
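The two reported overlap measures are closely related. The sketch below computes both on sets of voxel coordinates and checks the standard identity $D = 2J/(1+J)$, which holds for a single pair of segmentations; the set-based representation and the function names are assumptions for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two voxel sets."""
    return len(a & b) / len(a | b)


def dice(a: set, b: set) -> float:
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two voxel sets."""
    return 2 * len(a & b) / (len(a) + len(b))


def dice_from_jaccard(j: float) -> float:
    """Convert a single-case Jaccard index to the corresponding Dice overlap."""
    return 2 * j / (1 + j)


segmented = {1, 2, 3, 4}      # toy voxel indices labelled as pancreas
ground_truth = {3, 4, 5, 6}   # toy reference voxel indices
j = jaccard(segmented, ground_truth)
d = dice(segmented, ground_truth)
print(j, d, dice_from_jaccard(j))
```

For a single volume, a Jaccard index of 0.621 would correspond to a Dice overlap of about 0.766. The reported 62.1% and 75.1% are presumably averages over the 147 volumes, for which the identity need not hold exactly.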





Arranging Test Tubes in Racks Using Combined Task and Motion Planning. (arXiv:2005.03342v1 [cs.RO])

The paper develops a robotic manipulation system to address the pressing need for handling large numbers of test tubes in clinical examination and to replace or reduce human labor. It presents the technical details of the system, which separates and arranges test tubes in racks with the help of 3D vision and artificial intelligence (AI) reasoning/planning. The developed system only requires a person to put a rack with mixed and non-arranged tubes in front of a robot. The robot autonomously performs recognition, reasoning, planning, manipulation, etc., and returns a rack with separated and arranged tubes. The system is simple to use and requires no expert knowledge in robotics. We expect such a system to play an important role in helping to manage public health, and hope similar systems can be extended to other clinical manipulation tasks, such as handling mixers and pipettes, in the future.





Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV])

Convolutional Neural Networks (CNNs) are generally prone to noise interruptions, i.e., small image noise can cause drastic changes in the output. To suppress the effect of noise on the final prediction, we enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with the Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets such as Haar, Daubechies, and Cohen, and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into low-frequency and high-frequency components during down-sampling. The low-frequency component stores the main information, including the basic object structures, and is transmitted to the subsequent layers to extract robust high-level features. The high-frequency components, containing most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions.
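The down-sampling idea can be illustrated with one level of the orthonormal 2D Haar transform on a plain nested list. This is a minimal sketch, not the paper's DWT layer: it assumes even height and width, the sub-band names follow one common convention, and the function names are hypothetical.

```python
def haar_dwt2(x):
    """One level of the orthonormal 2D Haar DWT on a 2D list with even sides.
    Returns (LL, LH, HL, HH), each half the size of the input per dimension."""
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2)  # low-pass in both directions
            lh_row.append((a - b + c - d) / 2)  # high-pass along the columns axis
            hl_row.append((a + b - c - d) / 2)  # high-pass along the rows axis
            hh_row.append((a - b - c + d) / 2)  # high-pass in both directions
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def wavelet_downsample(x):
    """Keep only the low-frequency band, dropping the three detail bands."""
    return haar_dwt2(x)[0]


feature_map = [[1, 2], [3, 4]]
print(haar_dwt2(feature_map))
print(wavelet_downsample(feature_map))
```

Because the transform is orthonormal, the four bands together preserve the energy of the input; dropping the three detail bands is what removes the high-frequency content.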





Crop Aggregating for short utterances speaker verification using raw waveforms. (arXiv:2005.03329v1 [eess.AS])

Most studies on speaker verification systems focus on long-duration utterances, which contain sufficient phonetic information. However, the performance of these systems is known to degrade when short-duration utterances are used as input, because they carry less phonetic information than long utterances. In this paper, we propose a method, referred to as "crop aggregating", that compensates for the performance degradation of speaker verification for short utterances. The proposed method adopts an ensemble-based design to improve the stability and accuracy of speaker verification systems. It segments an input utterance into several short utterances and then aggregates the segment embeddings extracted from the segmented inputs to compose a speaker embedding. The method then trains the segment embeddings and the aggregated speaker embedding simultaneously. In addition, we modify the teacher-student learning method for the proposed method. Experimental results for different input durations on the VoxCeleb1 test set demonstrate that the proposed technique improves speaker verification performance by about 45.37% relative to the baseline system under the 1-second test utterance condition.
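The segment-then-aggregate step can be sketched as follows. The abstract does not say how segment embeddings are aggregated, so the mean used here is an assumption, and `embed` stands in for a trained embedding network; both it and the function names are hypothetical.

```python
from typing import Callable, Sequence


def crop(waveform: Sequence[float], crop_len: int, hop: int) -> list:
    """Split a waveform into fixed-length crops taken every `hop` samples."""
    return [waveform[s:s + crop_len]
            for s in range(0, len(waveform) - crop_len + 1, hop)]


def aggregate(embeddings: list) -> list:
    """Aggregate segment embeddings by element-wise mean (assumed aggregation)."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]


def speaker_embedding(waveform: Sequence[float],
                      embed: Callable[[Sequence[float]], list],
                      crop_len: int, hop: int) -> list:
    """Embed every crop with `embed` and aggregate into one speaker embedding."""
    segments = crop(waveform, crop_len, hop)
    if not segments:
        raise ValueError("waveform is shorter than one crop")
    return aggregate([embed(seg) for seg in segments])


# Toy stand-in for a neural embedding extractor (hypothetical).
toy_embed = lambda seg: [sum(seg), max(seg)]
print(speaker_embedding([1, 2, 3, 4, 5, 6], toy_embed, crop_len=2, hop=2))
```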





Global Distribution of Google Scholar Citations: A Size-independent Institution-based Analysis. (arXiv:2005.03324v1 [cs.DL])

Most currently available schemes for performance-based ranking of universities or research organizations, such as Quacquarelli Symonds (QS), Times Higher Education (THE), and the Shanghai-based Academic Ranking of World Universities (ARWU), use a variety of criteria that include productivity, citations, awards, reputation, etc., while Leiden and Scimago use only bibliometric indicators. The research performance evaluation in the aforesaid cases is based on bibliometric data from Web of Science or Scopus, which are commercially available, priced databases. The coverage includes peer-reviewed journals and conference proceedings. Google Scholar (GS), on the other hand, provides a free and open alternative for obtaining citations of papers available on the net (though it is not clear exactly which journals are covered). Citations are collected automatically from the net and also added to self-created individual author profiles under Google Scholar Citations (GSC). This data was used by Webometrics Lab, Spain to create a ranked list of 4000+ institutions in 2016, based on citations from only the top 10 individual GSC profiles in each organization. (GSC excludes the top paper for reasons explained in the text; the simple selection procedure makes the ranked list size-independent as claimed by the Cybermetrics Lab). Using this data (Transparent Ranking TR, 2016), we find the regional and country-wise distribution of GS-TR Citations. The size-independent ranked list is subdivided into deciles of 400 institutions each, and the number of institutions and citations of each country is obtained for each decile. We test for correlation of institutional ranks between GS TR and the other ranking schemes for the top 20 institutions.





Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR])

XSS is a security vulnerability that permits injecting malicious code into the client side of a web application. In the simplest situations, XSS vulnerabilities arise when a web application includes the user input in the web output without due sanitization. Such simple XSS vulnerabilities can be detected fairly reliably with blackbox scanners, which inject malicious payload into sensitive parts of HTTP requests and look for the reflected values in the web output.

Contemporary blackbox scanners are not effective against stored XSS vulnerabilities, where the malicious payload in an HTTP response originates from the database storage of the web application, rather than from the associated HTTP request. Similarly, many blackbox scanners do not systematically handle context-sensitive XSS vulnerabilities, where the user input is included in the web output after a transformation that prevents the scanner from recognizing the original value, but does not sanitize the value sufficiently. Among the combination of two basic data sources (stored vs reflected) and two basic vulnerability patterns (context sensitive vs not so), only one is therefore tested systematically by state-of-the-art blackbox scanners.

Our work focuses on systematic coverage of the three remaining combinations. We present a graybox mechanism that extends a general purpose database to cooperate with our XSS scanner, reporting and injecting the test inputs at the boundary between the database and the web application. Furthermore, we design a mechanism for identifying the injected inputs in the web output even after encoding by the web application, and check whether the encoding sanitizes the injected inputs correctly in the respective browser context. We evaluate our approach on eight mature and technologically diverse web applications, discovering previously unknown and exploitable XSS flaws in each of those applications.
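A small example may help show why an encoding must be checked against the browser context it lands in. In this minimal sketch, `html.escape` from the Python standard library neutralizes a payload in an HTML body context but leaves a `javascript:` URL untouched when the value ends up in an `href` attribute. The `render_link` and `has_safe_scheme` helpers are hypothetical and are not part of the scanner described above.

```python
import html
from urllib.parse import urlsplit


def render_link(user_url: str) -> str:
    """Hypothetical template helper: HTML-escapes the value, then places it
    in a URL attribute context."""
    return '<a href="' + html.escape(user_url, quote=True) + '">profile</a>'


def has_safe_scheme(user_url: str) -> bool:
    """Context-specific check for URL attributes: allow only http(s) URLs.
    Relative URLs are rejected too, which is stricter than necessary."""
    return urlsplit(user_url).scheme in ("http", "https")


# In an HTML body context, escaping removes the markup.
print(html.escape("<script>alert(1)</script>"))

# In a URL attribute context, the same escaping changes nothing.
payload = "javascript:alert(1)"
print(render_link(payload))
print(has_safe_scheme(payload), has_safe_scheme("https://example.org/u/1"))
```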





Interval type-2 fuzzy logic system based similarity evaluation for image steganography. (arXiv:2005.03310v1 [cs.MM])

Similarity measure, also called information measure, is a concept used to distinguish different objects. It has been studied from different contexts by employing mathematical, psychological, and fuzzy approaches. Image steganography is the art of hiding secret data into an image in such a way that it cannot be detected by an intruder. In image steganography, hiding secret data in the plain or non-edge regions of the image is significant due to the high similarity and redundancy of the pixels in their neighborhood. However, the similarity measure of the neighboring pixels, i.e., their proximity in color space, is perceptual rather than mathematical. This paper proposes an interval type 2 fuzzy logic system (IT2 FLS) to determine the similarity between the neighboring pixels by involving an instinctive human perception through a rule-based approach. The pixels of the image having high similarity values, calculated using the proposed IT2 FLS similarity measure, are selected for embedding via the least significant bit (LSB) method. We term the proposed procedure of steganography as IT2 FLS LSB method. Moreover, we have developed two more methods, namely, type 1 fuzzy logic system based least significant bits (T1FLS LSB) and Euclidean distance based similarity measures for least significant bit (SM LSB) steganographic methods. Experimental simulations were conducted for a collection of images and quality index metrics, such as PSNR, UQI, and SSIM are used. All the three steganographic methods are applied on datasets and the quality metrics are calculated. The obtained stego images and results are shown and thoroughly compared to determine the efficacy of the IT2 FLS LSB method. Finally, we have done a comparative analysis of the proposed approach with the existing well-known steganographic methods to show the effectiveness of our proposed steganographic method.
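The embedding step shared by all three methods is plain least significant bit substitution. The sketch below shows it on a flat list of 8-bit pixel values. In the paper the embedding positions come from the similarity measure; here `positions` is simply given as input, which is an assumption, as are the function names.

```python
def embed_lsb(pixels: list, bits: list, positions: list) -> list:
    """Write one secret bit into the least significant bit of each selected pixel."""
    out = list(pixels)
    for pos, bit in zip(positions, bits):
        out[pos] = (out[pos] & ~1) | bit  # clear the LSB, then set it to the bit
    return out


def extract_lsb(pixels: list, positions: list, n_bits: int) -> list:
    """Read the secret bits back from the same pixel positions."""
    return [pixels[pos] & 1 for pos in positions[:n_bits]]


cover = [200, 201, 54, 55, 120]  # toy 8-bit grayscale pixel values
positions = [0, 2, 4]            # hypothetical high-similarity pixels
secret = [1, 0, 1]

stego = embed_lsb(cover, secret, positions)
print(stego)
print(extract_lsb(stego, positions, len(secret)))
```

Each pixel changes by at most one intensity level, which is why the choice of embedding positions, rather than the substitution itself, is where the methods differ.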





Knowledge Enhanced Neural Fashion Trend Forecasting. (arXiv:2005.03297v1 [cs.IR])

Fashion trend forecasting is a crucial task for both academia and industry. Although some efforts have been devoted to tackling this challenging task, they only studied limited fashion elements with highly seasonal or simple patterns, which could hardly reveal the real fashion trends. Towards insightful fashion trend forecasting, this work focuses on investigating fine-grained fashion element trends for specific user groups. We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram with extracted time series fashion element records and user information. Furthermore, to effectively model the time series data of fashion elements with rather complex patterns, we propose a Knowledge Enhanced Recurrent Network model (KERN), which takes advantage of the capability of deep recurrent neural networks in modeling time-series data. Moreover, it leverages internal and external knowledge in the fashion domain that affects the time-series patterns of fashion element trends. Such incorporation of domain knowledge further enhances the deep learning model in capturing the patterns of specific fashion elements and predicting future trends. Extensive experiments demonstrate that the proposed KERN model can effectively capture the complicated patterns of objective fashion elements, and therefore makes better fashion trend forecasts.





Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

We propose Cotatron, a transcription-guided speech encoder for speaker-independent linguistic representation. Cotatron is based on the multispeaker TTS architecture and can be trained with conventional TTS datasets. We train a voice conversion system to reconstruct speech with Cotatron features, which is similar to the previous methods based on Phonetic Posteriorgram (PPG). By training and evaluating our system with 108 speakers from the VCTK dataset, we outperform the previous method in terms of both naturalness and speaker similarity. Our system can also convert speech from speakers that are unseen during training, and utilize ASR to automate the transcription with minimal reduction of the performance. Audio samples are available at https://mindslab-ai.github.io/cotatron, and the code with a pre-trained model will be made available soon.





Deep Learning based Person Re-identification. (arXiv:2005.03293v1 [cs.CV])

Automated person re-identification in a multi-camera surveillance setup is very important for effective tracking and monitoring of crowd movement. In recent years, a few deep learning based re-identification approaches have been developed which are quite accurate but time-intensive, and hence not very suitable for practical purposes. In this paper, we propose an efficient hierarchical re-identification approach in which color histogram based comparison is first employed to find the closest matches in the gallery set, and next deep feature based comparison is carried out using a Siamese network. Reduction in search space after the first level of matching helps in achieving a fast response time as well as improving the accuracy of prediction by the Siamese network by eliminating vastly dissimilar elements. A silhouette part-based feature extraction scheme is adopted in each level of the hierarchy to preserve the relative locations of the different body structures and make the appearance descriptors more discriminating in nature. The proposed approach has been evaluated on five public data sets and also a new data set captured by our team in our laboratory. Results reveal that it outperforms most state-of-the-art approaches in terms of overall accuracy.





RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. (arXiv:2005.03271v1 [eess.AS])

In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched domains: e.g., end-to-end models trained on short segments perform poorly when evaluated on longer utterances. In this work, we analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models in order to identify model components that negatively affect generalization performance. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference. On a long-form YouTube test set, when the non-streaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%. Finally, when trained on Librispeech, we find that dynamic overlapping inference improves WER on YouTube from 99.8% to 33.0%.
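For readers less familiar with the metric, WER is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The following is a minimal sketch using whitespace tokenization, which is an assumption; real evaluations usually apply text normalization first, and the function name is illustrative.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


print(wer("the cat sat on the mat", "the cat sat on mat"))
print(wer("a", "x y z"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, as the second call shows.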





Data selection for multi-task learning under dynamic constraints. (arXiv:2005.03270v1 [eess.SY])

Learning-based techniques are increasingly effective at controlling complex systems using data-driven models. However, most work done so far has focused on learning individual tasks or control laws. Hence, it is still a largely unaddressed research question how multiple tasks can be learned efficiently and simultaneously on the same system. In particular, no efficient state space exploration schemes have been designed for multi-task control settings. Using this research gap as our main motivation, we present an algorithm that approximates the smallest data set that needs to be collected in order to achieve high control performance for multiple learning-based control laws. We describe system uncertainty using a probabilistic Gaussian process model, which allows us to quantify the impact of potentially collected data on each learning-based controller. We then determine the optimal measurement locations by solving a stochastic optimization problem approximately. We show that, under reasonable assumptions, the approximate solution converges towards that of the exact problem. Additionally, we provide a numerical illustration of the proposed algorithm.





Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV])

Chest computed tomography (CT) has become an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using computer-aided diagnosis techniques for COVID-19 classification based on CT images could largely alleviate the burden on clinicians. In this paper, we propose an Adaptive Feature Selection guided Deep Forest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with relatively small-scale data, we leverage a deep forest model to learn a high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection can be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on a COVID-19 dataset with 1495 patients with COVID-19 and 1027 patients with community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE) and AUC achieved by our method are 91.79%, 93.05%, 89.95% and 96.35%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.
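The three threshold-based metrics follow directly from a confusion matrix. The sketch below assumes COVID-19 is treated as the positive class and CAP as the negative class; the counts are made up for illustration and are not taken from the paper.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "acc": (tp + tn) / (tp + fn + tn + fp),  # fraction of all cases classified correctly
        "sen": tp / (tp + fn),                   # fraction of positive cases detected
        "spe": tn / (tn + fp),                   # fraction of negative cases correctly rejected
    }


# Hypothetical counts, for illustration only.
print(binary_metrics(tp=90, fn=10, tn=80, fp=20))
```

AUC is not derivable from a single confusion matrix, since it summarizes performance over all decision thresholds.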





DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC])

Selecting an appropriate model to forecast product demand is critical to the manufacturing industry. However, due to the data complexity, market uncertainty and users' demanding requirements for the model, it is challenging for demand analysts to select a proper model. Although existing model selection methods can reduce the manual burden to some extent, they often fail to present model performance details on individual products and reveal the potential risk of the selected model. This paper presents DFSeer, an interactive visualization system to conduct reliable model selection for demand forecasting based on the products with similar historical demand. It supports model comparison and selection with different levels of details. Besides, it shows the difference in model performance on similar products to reveal the risk of model selection and increase users' confidence in choosing a forecasting model. Two case studies and interviews with domain experts demonstrate the effectiveness and usability of DFSeer.




as

Phase retrieval of complex-valued objects via a randomized Kaczmarz method. (arXiv:2005.03238v1 [cs.IT])

This paper investigates the convergence of the randomized Kaczmarz algorithm for the problem of phase retrieval of complex-valued objects. While this algorithm has been studied for the real-valued case, its generalization to the complex-valued case is nontrivial and has been left as a conjecture. This paper establishes the connection between the convergence of the algorithm and the convexity of an objective function. Based on the connection, it demonstrates that when the sensing vectors are sampled uniformly from a unit sphere and the number of sensing vectors $m$ satisfies $m>O(n\log n)$ as $n, m \rightarrow\infty$, then this algorithm with a good initialization achieves linear convergence to the solution with high probability.
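As a rough illustration of the iteration being analyzed, here is a minimal sketch of a commonly used form of the randomized Kaczmarz update for phase retrieval, assuming magnitude measurements $b_i = |a_i^H x|$. The function names are hypothetical, and the abstract does not specify the exact update or initialization used in the paper, so this should be read as one plausible instance only.

```python
import numpy as np


def kaczmarz_step(x, a, b):
    """Move x along a so that the new point satisfies |a^H x| = b."""
    inner = np.vdot(a, x)                       # a^H x (np.vdot conjugates a)
    magnitude = abs(inner)
    # Keep the current phase of a^H x; fall back to 1 if it is undefined.
    phase = inner / magnitude if magnitude > 0 else 1.0
    return x - ((inner - b * phase) / np.vdot(a, a).real) * a


def randomized_kaczmarz(A, b, x0, n_iters, seed=0):
    """Rows of A are the sensing vectors a_i; b holds the magnitudes |a_i^H x|."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=complex).copy()
    for _ in range(n_iters):
        i = rng.integers(A.shape[0])            # pick one measurement uniformly at random
        x = kaczmarz_step(x, A[i], b[i])
    return x
```

Each step enforces a single measurement exactly, which is why the quality of the starting point `x0` matters: the phase used in the projection is taken from the current iterate.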




as

Mortar-based entropy-stable discontinuous Galerkin methods on non-conforming quadrilateral and hexahedral meshes. (arXiv:2005.03237v1 [math.NA])

High-order entropy-stable discontinuous Galerkin (DG) methods for nonlinear conservation laws reproduce a discrete entropy inequality by combining entropy conservative finite volume fluxes with summation-by-parts (SBP) discretization matrices. In the DG context, on tensor product (quadrilateral and hexahedral) elements, SBP matrices are typically constructed by collocating at Lobatto quadrature points. Recent work has extended the construction of entropy-stable DG schemes to collocation at more accurate Gauss quadrature points.

In this work, we extend entropy-stable Gauss collocation schemes to non-conforming meshes. Entropy-stable DG schemes require computing entropy conservative numerical fluxes between volume and surface quadrature nodes. On conforming tensor product meshes where volume and surface nodes are aligned, flux evaluations are required only between "lines" of nodes. However, on non-conforming meshes, volume and surface nodes are no longer aligned, resulting in a larger number of flux evaluations. We reduce this expense by introducing an entropy-stable mortar-based treatment of non-conforming interfaces via a face-local correction term, and provide necessary conditions for high-order accuracy. Numerical experiments in both two and three dimensions confirm the stability and accuracy of this approach.




as

Multi-Target Deep Learning for Algal Detection and Classification. (arXiv:2005.03232v1 [cs.CV])

Water quality has a direct impact on industry, agriculture, and public health. Algae species are common indicators of water quality. This is because algal communities are sensitive to changes in their habitats, giving valuable knowledge on variations in water quality. However, water quality analysis requires professional inspection of algal detection and classification under microscopes, which is very time-consuming and tedious. In this paper, we propose a novel multi-target deep learning framework for algal detection and classification. Extensive experiments were carried out on a large-scale colored microscopic algal dataset. Experimental results demonstrate that the proposed method achieves promising performance on algal detection, class identification and genus identification.




as

Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent Multi-View Representation Learning. (arXiv:2005.03227v1 [eess.IV])

Recently, the outbreak of Coronavirus Disease 2019 (COVID-19) has spread rapidly across the world. Due to the large number of affected patients and heavy labor for doctors, computer-aided diagnosis with machine learning algorithms is urgently needed, and could largely reduce the efforts of clinicians and accelerate the diagnosis process. Chest computed tomography (CT) has been recognized as an informative tool for diagnosis of the disease. In this study, we propose to conduct the diagnosis of COVID-19 with a series of features extracted from CT images. To fully explore multiple features describing CT images from different views, a unified latent representation is learned which can completely encode information from different aspects of features and is endowed with promising class structure for separability. Specifically, the completeness is guaranteed with a group of backward neural networks (each for one type of features), while by using class labels the representation is enforced to be compact within COVID-19/community-acquired pneumonia (CAP) and also a large margin is guaranteed between different types of pneumonia. In this way, our model can well avoid overfitting compared to the case of directly projecting high-dimensional features into classes. Extensive experimental results show that the proposed method outperforms all comparison methods, and rather stable performances are observed when varying the numbers of training data.




as

Conley's fundamental theorem for a class of hybrid systems. (arXiv:2005.03217v1 [math.DS])

We establish versions of Conley's (i) fundamental theorem and (ii) decomposition theorem for a broad class of hybrid dynamical systems. The hybrid version of (i) asserts that a globally-defined "hybrid complete Lyapunov function" exists for every hybrid system in this class. Motivated by mechanics and control settings where physical or engineered events cause abrupt changes in a system's governing dynamics, our results apply to a large class of Lagrangian hybrid systems (with impacts) studied extensively in the robotics literature. Viewed formally, these results generalize those of Conley and Franks for continuous-time and discrete-time dynamical systems, respectively, on metric spaces. However, we furnish specific examples illustrating how our statement of sufficient conditions represents merely an early step in the longer project of establishing what formal assumptions can and cannot endow hybrid systems models with the topologically well characterized partitions of limit behavior that make Conley's theory so valuable in those classical settings.




as

OTFS-NOMA based on SCMA. (arXiv:2005.03216v1 [cs.IT])

Orthogonal Time Frequency Space (OTFS) is a 2-D modulation technique that has the potential to overcome the challenges faced by orthogonal frequency division multiplexing (OFDM) in high Doppler environments. The performance of OTFS in a multi-user scenario with orthogonal multiple access (OMA) techniques has been impressive. Due to the requirement of massive connectivity in 5G and beyond, it is essential to devise and examine the OTFS system with the existing Non-orthogonal Multiple Access (NOMA) techniques.

In this paper, we propose a multi-user OTFS system based on a code-domain NOMA technique called Sparse Code Multiple Access (SCMA). This system is referred to as the OTFS-SCMA model. The framework for OTFS-SCMA is designed for both downlink and uplink. First, the sparse SCMA codewords are strategically placed on the delay-Doppler plane such that the overall overloading factor of the OTFS-SCMA system is equal to that of the underlying basic SCMA system. The receiver in downlink performs the detection in two sequential phases: first, the conventional OTFS detection using the method of linear minimum mean square error (LMMSE), and then the conventional SCMA detection. For uplink, we propose a single-phase detector based on a message-passing algorithm (MPA) to detect the multiple users' symbols. The performance of the proposed OTFS-SCMA system is validated through extensive simulations both in downlink and uplink. We consider delay-Doppler planes of different parameters and various SCMA systems of overloading factor up to 200%. The performance of OTFS-SCMA is compared with those of existing OTFS-OMA techniques. The comprehensive investigation demonstrates the usefulness of OTFS-SCMA in future wireless communication standards.




as

A Stochastic Geometry Approach to Doppler Characterization in a LEO Satellite Network. (arXiv:2005.03205v1 [cs.IT])

A Non-terrestrial Network (NTN) comprising Low Earth Orbit (LEO) satellites can enable connectivity to underserved areas, thus complementing existing telecom networks. The high-speed satellite motion poses several challenges at the physical layer such as large Doppler frequency shifts. In this paper, an analytical framework is developed for statistical characterization of Doppler shift in an NTN where LEO satellites provide communication services to terrestrial users. Using tools from stochastic geometry, the users within a cell are grouped into disjoint clusters to limit the differential Doppler across users. Under some simplifying assumptions, the cumulative distribution function (CDF) and the probability density function are derived for the Doppler shift magnitude at a random user within a cluster. The CDFs are also provided for the minimum and the maximum Doppler shift magnitude within a cluster. Leveraging the analytical results, the interplay between key system parameters such as the cluster size and satellite altitude is examined. Numerical results validate the insights obtained from the analysis.




as

Distributed Stabilization by Probability Control for Deterministic-Stochastic Large Scale Systems : Dissipativity Approach. (arXiv:2005.03193v1 [eess.SY])

Using a dissipativity approach, we establish the stability condition for the feedback connection of a deterministic dynamical system $\Sigma$ and a stochastic memoryless map $\Psi$. We then extend the result to the class of large scale systems in which $\Sigma$ consists of many sub-systems and $\Psi$ consists of many "stochastic actuators" and "probability controllers" that control the actuator's output events. We demonstrate the proposed approach by showing the design procedures to globally stabilize the manufacturing systems while locally balancing the stock levels in any production process.




as

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. (arXiv:2005.03191v1 [eess.AS])

Convolutional neural networks (CNN) have shown promising results for end-to-end speech recognition, albeit still behind other state-of-the-art methods in performance. In this paper, we study how to bridge this gap and go beyond with a novel CNN-RNN-transducer architecture, which we call ContextNet. ContextNet features a fully convolutional encoder that incorporates global context information into convolution layers by adding squeeze-and-excitation modules. In addition, we propose a simple scaling method that scales the widths of ContextNet and achieves a good trade-off between computation and accuracy. We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without an external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets. This compares to the previous best published system of 2.0%/4.6% with LM and 3.9%/11.3% with 20M parameters. The superiority of the proposed ContextNet model is also verified on a much larger internal dataset.
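To make the squeeze-and-excitation idea concrete, the sketch below applies a generic squeeze-and-excitation gate to a sequence of feature vectors. It is a minimal NumPy illustration under stated assumptions: the function name, the ReLU bottleneck and the weight shapes are hypothetical choices, and the abstract does not say which activations or reduction ratio ContextNet uses.

```python
import numpy as np


def squeeze_excite_1d(x, w1, b1, w2, b2):
    """Rescale each channel of x (shape: time x channels) by a global-context gate.

    w1: (channels, bottleneck), w2: (bottleneck, channels); both are assumed shapes.
    """
    context = x.mean(axis=0)                           # squeeze: average over all time steps
    hidden = np.maximum(0.0, context @ w1 + b1)        # bottleneck layer with ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))   # sigmoid gate in (0, 1) per channel
    return x * gate                                    # excite: broadcast the gate over time
```

Because the gate is computed from an average over the whole sequence, every time step is rescaled using information from the full utterance, which a convolution with a limited kernel size cannot see on its own.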




as

Lattice-based public key encryption with equality test in standard model, revisited. (arXiv:2005.03178v1 [cs.CR])

Public key encryption with equality test (PKEET) allows testing whether two ciphertexts are generated by the same message or not. PKEET is a potential candidate for many practical applications like efficient data management on encrypted databases. The potential applicability of PKEET has led to intensive research since its first instantiation by Yang et al. (CT-RSA 2010). Most of the follow-up constructions are secure in the random oracle model. Moreover, the security of all the concrete constructions is based on number-theoretic hardness assumptions which are vulnerable in the post-quantum era. Recently, Lee et al. (ePrint 2016) proposed a generic construction of PKEET schemes in the standard model and hence it is possible to yield the first instantiation of PKEET schemes based on lattices. Their method is to use a $2$-level hierarchical identity-based encryption (HIBE) scheme together with a one-time signature scheme. In this paper, we propose, for the first time, a direct construction of a PKEET scheme based on the hardness assumption of lattices in the standard model. More specifically, the security of the proposed scheme reduces to the hardness of the Learning With Errors problem.




as

Fact-based Dialogue Generation with Convergent and Divergent Decoding. (arXiv:2005.03174v1 [cs.CL])

Fact-based dialogue generation is the task of generating a human-like response based on both dialogue context and factual texts. Various methods have been proposed that focus on effectively generating informative words that contain facts. However, previous works implicitly assume that a dialogue stays on one topic and usually converse passively; the systems therefore have difficulty generating diverse responses that proactively provide meaningful information. This paper proposes an end-to-end fact-based dialogue system augmented with the ability of convergent and divergent thinking over both context and facts, which can converse about the current topic or introduce a new topic. Specifically, our model incorporates a novel convergent and divergent decoding that can generate informative and diverse responses considering not only given inputs (context and facts) but also inputs-related topics. Both automatic and human evaluation results on the DSTC7 dataset show that our model significantly outperforms state-of-the-art baselines, indicating that our model can generate more appropriate, informative, and diverse responses.




as

Fast Mapping onto Census Blocks. (arXiv:2005.03156v1 [cs.DC])

Pandemic measures such as social distancing and contact tracing can be enhanced by rapidly integrating dynamic location data and demographic data. Projecting billions of longitude and latitude locations onto hundreds of thousands of highly irregular demographic census block polygons is computationally challenging in both research and deployment contexts. This paper describes two approaches labeled "simple" and "fast". The simple approach can be implemented in any scripting language (Matlab/Octave, Python, Julia, R) and is easily integrated and customized to a variety of research goals. This simple approach uses a novel combination of hierarchy, sparse bounding boxes, polygon crossing-number, vectorization, and parallel processing to achieve 100,000,000+ projections per second on 100 servers. The simple approach is compact, does not increase data storage requirements, and is applicable to any country or region. The fast approach exploits the thread, vector, and memory optimizations that are possible using a low-level language (C++) and achieves similar performance on a single server. This paper details these approaches with the goal of enabling the broader community to quickly integrate location and demographic data.
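The combination of bounding boxes and the polygon crossing-number test mentioned above can be sketched for a single point and a single polygon as follows. This is a plain, non-vectorized illustration with hypothetical function names; it does not reproduce the paper's hierarchy, sparse data structures or parallelization.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(px, py, polygon):
    """Crossing-number test with a cheap bounding-box rejection first."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    if not (min_x <= px <= max_x and min_y <= py <= max_y):
        return False                       # outside the box, so outside the polygon
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that a horizontal ray from the point towards +x crosses.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside        # an odd number of crossings means inside
    return inside


# Hypothetical L-shaped block: its bounding box covers points the polygon does not.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
```

In practice the bounding boxes would be computed once per polygon and used to discard most candidates before the more expensive crossing-number loop runs; points exactly on an edge need a convention that this sketch does not define.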




as

Optimally Convergent Mixed Finite Element Methods for the Stochastic Stokes Equations. (arXiv:2005.03148v1 [math.NA])

We propose some new mixed finite element methods for the time dependent stochastic Stokes equations with multiplicative noise, which use the Helmholtz decomposition of the driving multiplicative noise. It is known [16] that the pressure solution has a low regularity, which manifests in sub-optimal convergence rates for well-known inf-sup stable mixed finite element methods in numerical simulations, see [10]. We show that eliminating this gradient part from the noise in the numerical scheme leads to optimally convergent mixed finite element methods, and that this conceptual idea may be used to retool numerical methods that are well-known in the deterministic setting, including pressure stabilization methods, so that their optimal convergence properties can still be maintained in the stochastic setting. Computational experiments are also provided to validate the theoretical results and to illustrate the conceptional usefulness of the proposed numerical approach.




as

Deep Learning for Image-based Automatic Dial Meter Reading: Dataset and Baselines. (arXiv:2005.03106v1 [cs.CV])

Smart meters enable remote and automatic electricity, water and gas consumption reading and are being widely deployed in developed countries. Nonetheless, there is still a huge number of non-smart meters in operation. Image-based Automatic Meter Reading (AMR) focuses on dealing with this type of meter reading. We estimate that the Energy Company of Paraná (Copel), in Brazil, performs more than 850,000 readings of dial meters per month. Those meters are the focus of this work. Our main contributions are: (i) a public real-world dial meter dataset (shared upon request) called UFPR-ADMR; (ii) a deep learning-based recognition baseline on the proposed dataset; and (iii) a detailed error analysis of the main issues present in AMR for dial meters. To the best of our knowledge, this is the first work to introduce deep learning approaches to multi-dial meter reading, and perform experiments on unconstrained images. We achieved a 100.0% F1-score on the dial detection stage with both Faster R-CNN and YOLO, while the recognition rates reached 93.6% for dials and 75.25% for meters using Faster R-CNN (ResNext-101).




as

Optimal Location of Cellular Base Station via Convex Optimization. (arXiv:2005.03099v1 [cs.IT])

An optimal base station (BS) location depends on the traffic (user) distribution, propagation pathloss and many system parameters, which renders its analytical study difficult so that numerical algorithms are widely used instead. In this paper, the problem is studied analytically. First, it is formulated as a convex optimization problem to minimize the total BS transmit power subject to quality-of-service (QoS) constraints, which also account for fairness among users. Due to its convex nature, Karush-Kuhn-Tucker (KKT) conditions are used to characterize a globally-optimum location as a convex combination of user locations, where convex weights depend on user parameters, pathloss exponent and overall geometry of the problem. Based on this characterization, a number of closed-form solutions are obtained. In particular, the optimum BS location is the mean of user locations in the case of free-space propagation and identical user parameters. If the user set is symmetric (as defined in the paper), the optimal BS location is independent of pathloss exponent, which is not the case in general. The analytical results show the impact of propagation conditions as well as system and user parameters on optimal BS location and can be used to develop design guidelines.
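The mean-of-user-locations result can be illustrated with a simplified model, which is an assumption made here and not the paper's full formulation: suppose the transmit power required for user $k$ at location $u_k$ is proportional to $\|c - u_k\|^{\nu}$ with the same constant for every user, and take $\nu = 2$ for free-space propagation. Minimizing the total power over the BS location $c$ then reduces to a least-squares problem:

```latex
\min_{c} \; \sum_{k=1}^{N} \| c - u_k \|^2
\quad\Longrightarrow\quad
\nabla_c \sum_{k=1}^{N} \| c - u_k \|^2 = 2 \sum_{k=1}^{N} (c - u_k) = 0
\quad\Longrightarrow\quad
c^\star = \frac{1}{N} \sum_{k=1}^{N} u_k .
```

The objective is strictly convex, so this stationary point is the unique global minimizer, and it is a convex combination of the user locations with equal weights $1/N$. With non-identical user parameters or a different pathloss exponent, the weights would in general no longer be equal.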




as

Eliminating NB-IoT Interference to LTE System: a Sparse Machine Learning Based Approach. (arXiv:2005.03092v1 [cs.IT])

Narrowband internet-of-things (NB-IoT) is a competitive 5G technology for massive machine-type communication scenarios, but it also introduces narrowband interference (NBI) to existing broadband transmission such as the long term evolution (LTE) systems in enhanced mobile broadband (eMBB) scenarios. In order to facilitate the harmonic and fair coexistence in wireless heterogeneous networks, it is important to eliminate NB-IoT interference to LTE systems. In this paper, a novel sparse machine learning based framework and a sparse combinatorial optimization problem are formulated for accurate NBI recovery, which can be efficiently solved using the proposed iterative sparse learning algorithm called sparse cross-entropy minimization (SCEM). To further improve the recovery accuracy and convergence rate, regularization is introduced to the loss function in the enhanced algorithm called regularized SCEM. Moreover, exploiting the spatial correlation of NBI, the framework is extended to multiple-input multiple-output systems. Simulation results demonstrate that the proposed methods are effective in eliminating NB-IoT interference to LTE systems, and significantly outperform the state-of-the-art methods.




as

Experiences from Exporting Major Proof Assistant Libraries. (arXiv:2005.03089v1 [cs.SE])

The interoperability of proof assistants and the integration of their libraries is a highly valued but elusive goal in the field of theorem proving. As a preparatory step, in previous work, we translated the libraries of multiple proof assistants, specifically the ones of Coq, HOL Light, IMPS, Isabelle, Mizar, and PVS into a universal format: OMDoc/MMT.

Each translation presented tremendous theoretical, technical, and social challenges, some universal and some system-specific, some solvable and some still open. In this paper, we survey these challenges and compare and evaluate the solutions we chose.

We believe similar library translations will be an essential part of any future system interoperability solution and our experiences will prove valuable to others undertaking such efforts.




as

Diagnosing the Environment Bias in Vision-and-Language Navigation. (arXiv:2005.03086v1 [cs.CL])

Vision-and-Language Navigation (VLN) requires an agent to follow natural-language instructions, explore the given environments, and reach the desired target locations. These step-by-step navigational instructions are crucial when the agent is navigating new environments about which it has no prior knowledge. Most recent works that study VLN observe a significant performance drop when tested on unseen environments (i.e., environments not used in training), indicating that the neural agent models are highly biased towards training environments. Although this issue is considered as one of the major challenges in VLN research, it is still under-studied and needs a clearer explanation. In this work, we design novel diagnosis experiments via environment re-splitting and feature replacement, looking into possible reasons for this environment bias. We observe that neither the language nor the underlying navigational graph, but the low-level visual appearance conveyed by ResNet features directly affects the agent model and contributes to this environment bias in results. According to this observation, we explore several kinds of semantic representations that contain less low-level visual information, hence the agent learned with these features could be better generalized to unseen testing environments. Without modifying the baseline agent model and its training method, our explored semantic features significantly decrease the performance gaps between seen and unseen on multiple datasets (i.e. R2R, R4R, and CVDN) and achieve competitive unseen results to previous state-of-the-art models. Our code and features are available at: https://github.com/zhangybzbo/EnvBiasVLN




as

Line Artefact Quantification in Lung Ultrasound Images of COVID-19 Patients via Non-Convex Regularisation. (arXiv:2005.03080v1 [eess.IV])

In this paper, we present a novel method for line artefact quantification in lung ultrasound (LUS) images of COVID-19 patients. We formulate this as a non-convex regularisation problem involving a sparsity-enforcing, Cauchy-based penalty function, and the inverse Radon transform. We employ a simple local maxima detection technique in the Radon transform domain, associated with known clinical definitions of line artefacts. Despite being non-convex, the proposed method has guaranteed convergence via a proximal splitting algorithm and accurately identifies both horizontal and vertical line artefacts in LUS images. In order to reduce the number of false and missed detections, our method includes a two-stage validation mechanism, which is performed in both Radon and image domains. We evaluate the performance of the proposed method in comparison to the current state-of-the-art B-line identification method and show a considerable performance gain with 87% correctly detected B-lines in LUS images of nine COVID-19 patients. In addition, owing to its fast convergence, which takes around 12 seconds for a given frame, our proposed method is readily applicable for processing LUS image sequences.




as

AVAC: A Machine Learning based Adaptive RRAM Variability-Aware Controller for Edge Devices. (arXiv:2005.03077v1 [eess.SY])

Recently, the Edge Computing paradigm has gained significant popularity both in industry and academia. Researchers now increasingly aim to improve performance and reduce energy consumption of such devices. Some recent efforts focus on using emerging RRAM technologies for improving energy efficiency, thanks to their no-leakage property and high integration density. As the complexity and dynamism of applications supported by such devices escalate, it has become difficult to maintain ideal performance with static RRAM controllers. Machine Learning provides a promising solution for this, and hence, this work focuses on extending such controllers to allow dynamic parameter updates. In this work we propose an Adaptive RRAM Variability-Aware Controller, AVAC, which periodically updates Wait Buffer and batch sizes using on-the-fly learning models and gradient ascent. AVAC allows Edge devices to adapt to different applications and their stages, to improve computation performance and reduce energy consumption. Simulations demonstrate that the proposed model can provide up to 29% increase in performance and 19% decrease in energy, compared to static controllers, using traces of real-life healthcare applications on a Raspberry-Pi based Edge deployment.




as

Guided Policy Search Model-based Reinforcement Learning for Urban Autonomous Driving. (arXiv:2005.03076v1 [cs.RO])

In this paper, we continue our prior work on using imitation learning (IL) and model free reinforcement learning (RL) to learn driving policies for autonomous driving in urban scenarios, by introducing a model based RL method to drive the autonomous vehicle in the Carla urban driving simulator. Although IL and model free RL methods have been proved to be capable of solving lots of challenging tasks, including playing video games, robots, and, in our prior work, urban driving, the low sample efficiency of such methods greatly limits their applications on actual autonomous driving. In this work, we developed a model based RL algorithm of guided policy search (GPS) for urban driving tasks. The algorithm iteratively learns a parameterized dynamic model to approximate the complex and interactive driving task, and optimizes the driving policy under the nonlinear approximate dynamic model. As a model based RL approach, when applied in urban autonomous driving, the GPS has the advantages of higher sample efficiency, better interpretability, and greater stability. We provide extensive experiments validating the effectiveness of the proposed method to learn robust driving policy for urban driving in Carla. We also compare the proposed method with other policy search and model free RL baselines, showing 100x better sample efficiency of the GPS based RL method, and also that the GPS based method can learn policies for harder tasks that the baseline methods can hardly learn.




as

Weakly-Supervised Neural Response Selection from an Ensemble of Task-Specialised Dialogue Agents. (arXiv:2005.03066v1 [cs.CL])

Dialogue engines that incorporate different types of agents to converse with humans are popular.

However, conversations are dynamic in the sense that a selected response will change the conversation on-the-fly, influencing the subsequent utterances in the conversation, which makes the response selection a challenging problem.

We model the problem of selecting the best response from a set of responses generated by a heterogeneous set of dialogue agents by taking into account the conversational history, and propose a Neural Response Selection method.

The proposed method is trained to predict a coherent set of responses within a single conversation, considering its own predictions via a curriculum training mechanism.

Our experimental results show that the proposed method can accurately select the most appropriate responses, thereby significantly improving the user experience in dialogue systems.




as

Evaluating text coherence based on the graph of the consistency of phrases to identify symptoms of schizophrenia. (arXiv:2005.03008v1 [cs.CL])

Different state-of-the-art methods of the detection of schizophrenia symptoms based on the estimation of text coherence have been analyzed. The analysis of a text at the level of phrases has been suggested. The method based on the graph of the consistency of phrases has been proposed to evaluate the semantic coherence and the cohesion of a text. The semantic coherence, cohesion, and other linguistic features (lexical diversity, lexical density) have been taken into account to form feature vectors for the training of a model-classifier. The training of the classifier has been performed on the set of English-language interviews. According to the retrieved results, the impact of each feature on the output of the model has been analyzed. The results obtained can indicate that the proposed method based on the graph of the consistency of phrases may be used in the different tasks of the detection of mental illness.




as

Football High: Owen Thomas' Story

The issues of sports-related concussions and chronic traumatic encephalopathy were intensified when the brain of a deceased 21-year-old football player was examined.




as

What Soccer Was Like When Retired Soccer Star Briana Scurry First Started Playing

Soccer great Briana Scurry started playing soccer at 12 on an all boys team and in the goal — the "safest" position for a girl ...




as

Retired Soccer Star Briana Scurry: "This Has Been the Most Difficult Thing"

"The penalty kicks, the final goals in the Olympics, playing in front of the president, in front of 90,000 people ... that is what I was born to do ... and my brain is what I used to get myself there."