ic

Joint Prediction and Time Estimation of COVID-19 Developing Severe Symptoms using Chest CT Scan. (arXiv:2005.03405v1 [eess.IV])

With the rapidly worldwide spread of Coronavirus disease (COVID-19), it is of great importance to conduct early diagnosis of COVID-19 and predict the time that patients might convert to the severe stage, for designing effective treatment plan and reducing the clinicians' workloads. In this study, we propose a joint classification and regression method to determine whether the patient would develop severe symptoms in the later time, and if yes, predict the possible conversion time that the patient would spend to convert to the severe stage. To do this, the proposed method takes into account 1) the weight for each sample to reduce the outliers' influence and explore the problem of imbalance classification, and 2) the weight for each feature via a sparsity regularization term to remove the redundant features of high-dimensional data and learn the shared information across the classification task and the regression task. To our knowledge, this study is the first work to predict the disease progression and the conversion time, which could help clinicians to deal with the potential severe cases in time or even save the patients' lives. Experimental analysis was conducted on a real data set from two hospitals with 422 chest computed tomography (CT) scans, where 52 cases were converted to severe on average 5.64 days and 34 cases were severe at admission. Results show that our method achieves the best classification (e.g., 85.91% of accuracy) and regression (e.g., 0.462 of the correlation coefficient) performance, compared to all comparison methods. Moreover, our proposed method yields 76.97% of accuracy for predicting the severe cases, 0.524 of the correlation coefficient, and 0.55 days difference for the converted time.




ic

Semantic Signatures for Large-scale Visual Localization. (arXiv:2005.03388v1 [cs.CV])

Visual localization is a useful alternative to standard localization techniques. It works by utilizing cameras. In a typical scenario, features are extracted from captured images and compared with geo-referenced databases. Location information is then inferred from the matching results. Conventional schemes mainly use low-level visual features. These approaches offer good accuracy but suffer from scalability issues. In order to assist localization in large urban areas, this work explores a different path by utilizing high-level semantic information. It is found that object information in a street view can facilitate localization. A novel descriptor scheme called "semantic signature" is proposed to summarize this information. A semantic signature consists of type and angle information of visible objects at a spatial location. Several metrics and protocols are proposed for signature comparison and retrieval. They illustrate different trade-offs between accuracy and complexity. Extensive simulation results confirm the potential of the proposed scheme in large-scale applications. This paper is an extended version of a conference paper in CBMI'18. A more efficient retrieval protocol is presented with additional experiment results.




ic

Energy-efficient topology to enhance the wireless sensor network lifetime using connectivity control. (arXiv:2005.03370v1 [cs.NI])

Wireless sensor networks have attracted much attention because of many applications in the fields of industry, military, medicine, agriculture, and education. In addition, the vast majority of researches has been done to expand its applications and improve its efficiency. However, there are still many challenges for increasing the efficiency in different parts of this network. One of the most important parts is to improve the network lifetime in the wireless sensor network. Since the sensor nodes are generally powered by batteries, the most important issue to consider in these types of networks is to reduce the power consumption of the nodes in such a way as to increase the network lifetime to an acceptable level. The contribution of this paper is using topology control, the threshold for the remaining energy in nodes, and two of the meta-algorithms include SA (Simulated annealing) and VNS (Variable Neighbourhood Search) to increase the energy remaining in the sensors. Moreover, using a low-cost spanning tree, an appropriate connectivity control among nodes is created in the network in order to increase the network lifetime. The results of simulations show that the proposed method improves the sensor lifetime and reduces the energy consumed.




ic

Scoring Root Necrosis in Cassava Using Semantic Segmentation. (arXiv:2005.03367v1 [eess.IV])

Cassava a major food crop in many parts of Africa, has majorly been affected by Cassava Brown Streak Disease (CBSD). The disease affects tuberous roots and presents symptoms that include a yellow/brown, dry, corky necrosis within the starch-bearing tissues. Cassava breeders currently depend on visual inspection to score necrosis in roots based on a qualitative score which is quite subjective. In this paper we present an approach to automate root necrosis scoring using deep convolutional neural networks with semantic segmentation. Our experiments show that the UNet model performs this task with high accuracy achieving a mean Intersection over Union (IoU) of 0.90 on the test set. This method provides a means to use a quantitative measure for necrosis scoring on root cross-sections. This is done by segmentation and classifying the necrotized and non-necrotized pixels of cassava root cross-sections without any additional feature engineering.




ic

Probabilistic Hyperproperties of Markov Decision Processes. (arXiv:2005.03362v1 [cs.LO])

We study the specification and verification of hyperproperties for probabilistic systems represented as Markov decision processes (MDPs). Hyperproperties are system properties that describe the correctness of a system as a relation between multiple executions. Hyperproperties generalize trace properties and include information-flow security requirements, like noninterference, as well as requirements like symmetry, partial observation, robustness, and fault tolerance. We introduce the temporal logic PHL, which extends classic probabilistic logics with quantification over schedulers and traces. PHL can express a wide range of hyperproperties for probabilistic systems, including both classical applications, such as differential privacy, and novel applications in areas such as robotics and planning. While the model checking problem for PHL is in general undecidable, we provide methods both for proving and for refuting a class of probabilistic hyperproperties for MDPs.




ic

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese--English and News Commentary Japanese--Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks.




ic

Estimating Blood Pressure from Photoplethysmogram Signal and Demographic Features using Machine Learning Techniques. (arXiv:2005.03357v1 [eess.SP])

Hypertension is a potentially unsafe health ailment, which can be indicated directly from the Blood pressure (BP). Hypertension always leads to other health complications. Continuous monitoring of BP is very important; however, cuff-based BP measurements are discrete and uncomfortable to the user. To address this need, a cuff-less, continuous and a non-invasive BP measurement system is proposed using Photoplethysmogram (PPG) signal and demographic features using machine learning (ML) algorithms. PPG signals were acquired from 219 subjects, which undergo pre-processing and feature extraction steps. Time, frequency and time-frequency domain features were extracted from the PPG and their derivative signals. Feature selection techniques were used to reduce the computational complexity and to decrease the chance of over-fitting the ML algorithms. The features were then used to train and evaluate ML algorithms. The best regression models were selected for Systolic BP (SBP) and Diastolic BP (DBP) estimation individually. Gaussian Process Regression (GPR) along with ReliefF feature selection algorithm outperforms other algorithms in estimating SBP and DBP with a root-mean-square error (RMSE) of 6.74 and 3.59 respectively. This ML model can be implemented in hardware systems to continuously monitor BP and avoid any critical health conditions due to sudden changes.




ic

DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL])

Despite recent progress on computer vision and natural language processing, developing video understanding intelligence is still hard to achieve due to the intrinsic difficulty of story in video. Moreover, there is not a theoretical metric for evaluating the degree of video understanding. In this paper, we propose a novel video question answering (Video QA) task, DramaQA, for a comprehensive understanding of the video story. The DramaQA focused on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence. 2) character-centered video annotations to model local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and it contains 16,191 QA pairs from 23,928 various length video clips, with each QA pair belonging to one of four difficulty levels. We provide 217,308 annotated images with rich character-centered annotations, including visual bounding boxes, behaviors, and emotions of main characters, and coreference resolved scripts. Additionally, we provide analyses of the dataset as well as Dual Matching Multistream model which effectively learns character-centered representations of video to answer questions about the video. We are planning to release our dataset and model publicly for research purposes and expect that our work will provide a new perspective on video story understanding research.




ic

Pricing under a multinomial logit model with non linear network effects. (arXiv:2005.03352v1 [cs.GT])

We study the problem of pricing under a Multinomial Logit model where we incorporate network effects over the consumer's decisions. We analyse both cases, when sellers compete or collaborate. In particular, we pay special attention to the overall expected revenue and how the behaviour of the no purchase option is affected under variations of a network effect parameter. Where for example we prove that the market share for the no purchase option, is decreasing in terms of the value of the network effect, meaning that stronger communication among costumers increases the expected amount of sales. We also analyse how the customer's utility is altered when network effects are incorporated into the market, comparing the cases where both competitive and monopolistic prices are displayed. We use tools from stochastic approximation algorithms to prove that the probability of purchasing the available products converges to a unique stationary distribution. We model that the sellers can use this stationary distribution to establish their strategies. Finding that under those settings, a pure Nash Equilibrium represents the pricing strategies in the case of competition, and an optimal (that maximises the total revenue) fixed price characterise the case of collaboration.




ic

Error estimates for the Cahn--Hilliard equation with dynamic boundary conditions. (arXiv:2005.03349v1 [math.NA])

A proof of convergence is given for bulk--surface finite element semi-discretisation of the Cahn--Hilliard equation with Cahn--Hilliard-type dynamic boundary conditions in a smooth domain. The semi-discretisation is studied in the weak formulation as a second order system. Optimal-order uniform-in-time error estimates are shown in the $L^2$ and $H^1$ norms. The error estimates are based on a consistency and stability analysis. The proof of stability is performed in an abstract framework, based on energy estimates exploiting the anti-symmetric structure of the second order system. Numerical experiments illustrate the theoretical results.




ic

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal with spatial variations that are commonly found in the pancreas well. Also, shape variations are not represented by an averaged atlas. We propose a fully automated pancreas segmentation method that deals with two types of variations mentioned above. The position and size of the pancreas is estimated using a regression forest technique. After localization, a patient-specific probabilistic atlas is generated based on a new image similarity that reflects the blood vessel position and direction information around the pancreas. We segment it using the EM algorithm with the atlas as prior followed by the graph-cut. In evaluation results using 147 CT volumes, the Jaccard index and the Dice overlap of the proposed method were 62.1% and 75.1%, respectively. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the Dice overlap.




ic

Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV])

Convolutional Neural Networks (CNNs) are generally prone to noise interruptions, i.e., small image noise can cause drastic changes in the output. To suppress the noise effect to the final predication, we enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets like Haar, Daubechies, and Cohen, etc., and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into the low-frequency and high-frequency components during the down-sampling. The low-frequency component stores main information including the basic object structures, which is transmitted into the subsequent layers to extract robust high-level features. The high-frequency components, containing most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions.




ic

Crop Aggregating for short utterances speaker verification using raw waveforms. (arXiv:2005.03329v1 [eess.AS])

Most studies on speaker verification systems focus on long-duration utterances, which are composed of sufficient phonetic information. However, the performances of these systems are known to degrade when short-duration utterances are inputted due to the lack of phonetic information as compared to the long utterances. In this paper, we propose a method that compensates for the performance degradation of speaker verification for short utterances, referred to as "crop aggregating". The proposed method adopts an ensemble-based design to improve the stability and accuracy of speaker verification systems. The proposed method segments an input utterance into several short utterances and then aggregates the segment embeddings extracted from the segmented inputs to compose a speaker embedding. Then, this method simultaneously trains the segment embeddings and the aggregated speaker embedding. In addition, we also modified the teacher-student learning method for the proposed method. Experimental results on different input duration using the VoxCeleb1 test set demonstrate that the proposed technique improves speaker verification performance by about 45.37% relatively compared to the baseline system with 1-second test utterance condition.




ic

Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR])

XSS is a security vulnerability that permits injecting malicious code into the client side of a web application. In the simplest situations, XSS vulnerabilities arise when a web application includes the user input in the web output without due sanitization. Such simple XSS vulnerabilities can be detected fairly reliably with blackbox scanners, which inject malicious payload into sensitive parts of HTTP requests and look for the reflected values in the web output.

Contemporary blackbox scanners are not effective against stored XSS vulnerabilities, where the malicious payload in an HTTP response originates from the database storage of the web application, rather than from the associated HTTP request. Similarly, many blackbox scanners do not systematically handle context-sensitive XSS vulnerabilities, where the user input is included in the web output after a transformation that prevents the scanner from recognizing the original value, but does not sanitize the value sufficiently. Among the combination of two basic data sources (stored vs reflected) and two basic vulnerability patterns (context sensitive vs not so), only one is therefore tested systematically by state-of-the-art blackbox scanners.

Our work focuses on systematic coverage of the three remaining combinations. We present a graybox mechanism that extends a general purpose database to cooperate with our XSS scanner, reporting and injecting the test inputs at the boundary between the database and the web application. Furthermore, we design a mechanism for identifying the injected inputs in the web output even after encoding by the web application, and check whether the encoding sanitizes the injected inputs correctly in the respective browser context. We evaluate our approach on eight mature and technologically diverse web applications, discovering previously unknown and exploitable XSS flaws in each of those applications.




ic

Specification and Automated Analysis of Inter-Parameter Dependencies in Web APIs. (arXiv:2005.03320v1 [cs.SE])

Web services often impose inter-parameter dependencies that restrict the way in which two or more input parameters can be combined to form valid calls to the service. Unfortunately, current specification languages for web services like the OpenAPI Specification (OAS) provide no support for the formal description of such dependencies, which makes it hardly possible to automatically discover and interact with services without human intervention. In this article, we present an approach for the specification and automated analysis of inter-parameter dependencies in web APIs. We first present a domain-specific language, called Inter-parameter Dependency Language (IDL), for the specification of dependencies among input parameters in web services. Then, we propose a mapping to translate an IDL document into a constraint satisfaction problem (CSP), enabling the automated analysis of IDL specifications using standard CSP-based reasoning operations. Specifically, we present a catalogue of nine analysis operations on IDL documents allowing to compute, for example, whether a given request satisfies all the dependencies of the service. Finally, we present a tool suite including an editor, a parser, an OAS extension, a constraint programming-aided library, and a test suite supporting IDL specifications and their analyses. Together, these contributions pave the way for a new range of specification-driven applications in areas such as code generation and testing.




ic

Boosting Cloud Data Analytics using Multi-Objective Optimization. (arXiv:2005.03314v1 [cs.DB])

Data analytics in the cloud has become an integral part of enterprise businesses. Big data analytics systems, however, still lack the ability to take user performance goals and budgetary constraints for a task, collectively referred to as task objectives, and automatically configure an analytic job to achieve these objectives. This paper presents a data analytics optimizer that can automatically determine a cluster configuration with a suitable number of cores as well as other system parameters that best meet the task objectives. At a core of our work is a principled multi-objective optimization (MOO) approach that computes a Pareto optimal set of job configurations to reveal tradeoffs between different user objectives, recommends a new job configuration that best explores such tradeoffs, and employs novel optimizations to enable such recommendations within a few seconds. We present efficient incremental algorithms based on the notion of a Progressive Frontier for realizing our MOO approach and implement them into a Spark-based prototype. Detailed experiments using benchmark workloads show that our MOO techniques provide a 2-50x speedup over existing MOO methods, while offering good coverage of the Pareto frontier. When compared to Ottertune, a state-of-the-art performance tuning system, our approach recommends configurations that yield 26\%-49\% reduction of running time of the TPCx-BB benchmark while adapting to different application preferences on multiple objectives.




ic

Interval type-2 fuzzy logic system based similarity evaluation for image steganography. (arXiv:2005.03310v1 [cs.MM])

Similarity measure, also called information measure, is a concept used to distinguish different objects. It has been studied from different contexts by employing mathematical, psychological, and fuzzy approaches. Image steganography is the art of hiding secret data into an image in such a way that it cannot be detected by an intruder. In image steganography, hiding secret data in the plain or non-edge regions of the image is significant due to the high similarity and redundancy of the pixels in their neighborhood. However, the similarity measure of the neighboring pixels, i.e., their proximity in color space, is perceptual rather than mathematical. This paper proposes an interval type 2 fuzzy logic system (IT2 FLS) to determine the similarity between the neighboring pixels by involving an instinctive human perception through a rule-based approach. The pixels of the image having high similarity values, calculated using the proposed IT2 FLS similarity measure, are selected for embedding via the least significant bit (LSB) method. We term the proposed procedure of steganography as IT2 FLS LSB method. Moreover, we have developed two more methods, namely, type 1 fuzzy logic system based least significant bits (T1FLS LSB) and Euclidean distance based similarity measures for least significant bit (SM LSB) steganographic methods. Experimental simulations were conducted for a collection of images and quality index metrics, such as PSNR, UQI, and SSIM are used. All the three steganographic methods are applied on datasets and the quality metrics are calculated. The obtained stego images and results are shown and thoroughly compared to determine the efficacy of the IT2 FLS LSB method. Finally, we have done a comparative analysis of the proposed approach with the existing well-known steganographic methods to show the effectiveness of our proposed steganographic method.




ic

Safe Data-Driven Distributed Coordination of Intersection Traffic. (arXiv:2005.03304v1 [math.OC])

This work addresses the problem of traffic management at and near an isolated un-signalized intersection for autonomous and networked vehicles through coordinated optimization of their trajectories. We decompose the trajectory of each vehicle into two phases: the provisional phase and the coordinated phase. A vehicle, upon entering the region of interest, initially operates in the provisional phase, in which the vehicle is allowed to optimize its trajectory but is constrained to guarantee in-lane safety and to not enter the intersection. Periodically, all the vehicles in their provisional phase switch to their coordinated phase, which is obtained by coordinated optimization of the schedule of the vehicles' intersection usage as well as their trajectories. For the coordinated phase, we propose a data-driven solution, in which the intersection usage order is obtained through a data-driven online "classification" and the trajectories are computed sequentially. This approach is computationally very efficient and does not compromise much on optimality. Moreover, it also allows for incorporation of "macro" information such as traffic arrival rates into the solution. We also discuss a distributed implementation of this proposed data-driven sequential algorithm. Finally, we compare the proposed algorithm and its two variants against traditional methods of intersection management and against some existing results in the literature by micro-simulations.




ic

Adaptive Dialog Policy Learning with Hindsight and User Modeling. (arXiv:2005.03299v1 [cs.AI])

Reinforcement learning methods have been used to compute dialog policies from language-based interaction experiences. Efficiency is of particular importance in dialog policy learning, because of the considerable cost of interacting with people, and the very poor user experience from low-quality conversations. Aiming at improving the efficiency of dialog policy learning, we develop algorithm LHUA (Learning with Hindsight, User modeling, and Adaptation) that, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation and hindsight provide the dialog agent with more experience and more (positive) reinforcements respectively. Experimental results suggest that, in success rate and policy quality, LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts.




ic

Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

We propose Cotatron, a transcription-guided speech encoder for speaker-independent linguistic representation. Cotatron is based on the multispeaker TTS architecture and can be trained with conventional TTS datasets. We train a voice conversion system to reconstruct speech with Cotatron features, which is similar to the previous methods based on Phonetic Posteriorgram (PPG). By training and evaluating our system with 108 speakers from the VCTK dataset, we outperform the previous method in terms of both naturalness and speaker similarity. Our system can also convert speech from speakers that are unseen during training, and utilize ASR to automate the transcription with minimal reduction of the performance. Audio samples are available at https://mindslab-ai.github.io/cotatron, and the code with a pre-trained model will be made available soon.




ic

Deep Learning based Person Re-identification. (arXiv:2005.03293v1 [cs.CV])

Automated person re-identification in a multi-camera surveillance setup is very important for effective tracking and monitoring crowd movement. In the recent years, few deep learning based re-identification approaches have been developed which are quite accurate but time-intensive, and hence not very suitable for practical purposes. In this paper, we propose an efficient hierarchical re-identification approach in which color histogram based comparison is first employed to find the closest matches in the gallery set, and next deep feature based comparison is carried out using Siamese network. Reduction in search space after the first level of matching helps in achieving a fast response time as well as improving the accuracy of prediction by the Siamese network by eliminating vastly dissimilar elements. A silhouette part-based feature extraction scheme is adopted in each level of hierarchy to preserve the relative locations of the different body structures and make the appearance descriptors more discriminating in nature. The proposed approach has been evaluated on five public data sets and also a new data set captured by our team in our laboratory. Results reveal that it outperforms most state-of-the-art approaches in terms of overall accuracy.




ic

YANG2UML: Bijective Transformation and Simplification of YANG to UML. (arXiv:2005.03292v1 [cs.SE])

Software Defined Networking is currently revolutionizing computer networking by decoupling the network control (control plane) from the forwarding functions (data plane) enabling the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services. Next to the well-known OpenFlow protocol, the XML-based NETCONF protocol is also an important means for exchanging configuration information from a management platform and is nowadays even part of OpenFlow. In combination with NETCONF, YANG is the corresponding protocol that defines the associated data structures supporting virtually all network configuration protocols. YANG itself is a semantically rich language, which -- in order to facilitate familiarization with the relevant subject -- is often visualized to involve other experts or developers and to support them by their daily work (writing applications which make use of YANG). In order to support this process, this paper presents an novel approach to optimize and simplify YANG data models to assist further discussions with the management and implementations (especially of interfaces) to reduce complexity. Therefore, we have defined a bidirectional mapping of YANG to UML and developed a tool that renders the created UML diagrams. This combines the benefits to use the formal language YANG with automatically maintained UML diagrams to involve other experts or developers, closing the gap between technically improved data models and their human readability.




ic

Data selection for multi-task learning under dynamic constraints. (arXiv:2005.03270v1 [eess.SY])

Learning-based techniques are increasingly effective at controlling complex systems using data-driven models. However, most work done so far has focused on learning individual tasks or control laws. Hence, it is still a largely unaddressed research question how multiple tasks can be learned efficiently and simultaneously on the same system. In particular, no efficient state space exploration schemes have been designed for multi-task control settings. Using this research gap as our main motivation, we present an algorithm that approximates the smallest data set that needs to be collected in order to achieve high control performance for multiple learning-based control laws. We describe system uncertainty using a probabilistic Gaussian process model, which allows us to quantify the impact of potentially collected data on each learning-based controller. We then determine the optimal measurement locations by solving a stochastic optimization problem approximately. We show that, under reasonable assumptions, the approximate solution converges towards that of the exact problem. Additionally, we provide a numerical illustration of the proposed algorithm.




ic

Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV])

Chest computed tomography (CT) becomes an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using the computed-aided diagnosis technique for COVID-19 classification based on CT images could largely alleviate the burden of clinicians. In this paper, we propose an Adaptive Feature Selection guided Deep Forest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with the relatively small-scale data, we leverage a deep forest model to learn high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection could be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on COVID-19 dataset with 1495 patients of COVID-19 and 1027 patients of community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE) and AUC achieved by our method are 91.79%, 93.05%, 89.95% and 96.35%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.




ic

Quda: Natural Language Queries for Visual Data Analytics. (arXiv:2005.03257v1 [cs.CL])

Visualization-oriented natural language interfaces (V-NLIs) have been explored and developed in recent years. One challenge faced by V-NLIs is in the formation of effective design decisions that usually requires a deep understanding of user queries. Learning-based approaches have shown potential in V-NLIs and reached state-of-the-art performance in various NLP tasks. However, because of the lack of sufficient training samples that cater to visual data analytics, cutting-edge techniques have rarely been employed to facilitate the development of V-NLIs. We present a new dataset, called Quda, to help V-NLIs understand free-form natural language. Our dataset contains 14;035 diverse user queries annotated with 10 low-level analytic tasks that assist in the deployment of state-of-the-art techniques for parsing complex human language. We achieve this goal by first gathering seed queries with data analysts who are target users of V-NLIs. Then we employ extensive crowd force for paraphrase generation and validation. We demonstrate the usefulness of Quda in building V-NLIs by creating a prototype that makes effective design decisions for free-form user queries. We also show that Quda can be beneficial for a wide range of applications in the visualization community by analyzing the design tasks described in academic publications.




ic

DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC])

Selecting an appropriate model to forecast product demand is critical to the manufacturing industry. However, due to the data complexity, market uncertainty and users' demanding requirements for the model, it is challenging for demand analysts to select a proper model. Although existing model selection methods can reduce the manual burden to some extent, they often fail to present model performance details on individual products and reveal the potential risk of the selected model. This paper presents DFSeer, an interactive visualization system to conduct reliable model selection for demand forecasting based on the products with similar historical demand. It supports model comparison and selection with different levels of details. Besides, it shows the difference in model performance on similar products to reveal the risk of model selection and increase users' confidence in choosing a forecasting model. Two case studies and interviews with domain experts demonstrate the effectiveness and usability of DFSeer.




ic

Multi-Target Deep Learning for Algal Detection and Classification. (arXiv:2005.03232v1 [cs.CV])

Water quality has a direct impact on industry, agriculture, and public health. Algae species are common indicators of water quality. It is because algal communities are sensitive to changes in their habitats, giving valuable knowledge on variations in water quality. However, water quality analysis requires professional inspection of algal detection and classification under microscopes, which is very time-consuming and tedious. In this paper, we propose a novel multi-target deep learning framework for algal detection and classification. Extensive experiments were carried out on a large-scale colored microscopic algal dataset. Experimental results demonstrate that the proposed method leads to the promising performance on algal detection, class identification and genus identification.




ic

Constructing Accurate and Efficient Deep Spiking Neural Networks with Double-threshold and Augmented Schemes. (arXiv:2005.03231v1 [cs.NE])

Spiking neural networks (SNNs) are considered as a potential candidate to overcome current challenges such as the high-power consumption encountered by artificial neural networks (ANNs), however there is still a gap between them with respect to the recognition accuracy on practical tasks. A conversion strategy was thus introduced recently to bridge this gap by mapping a trained ANN to an SNN. However, it is still unclear that to what extent this obtained SNN can benefit both the accuracy advantage from ANN and high efficiency from the spike-based paradigm of computation. In this paper, we propose two new conversion methods, namely TerMapping and AugMapping. The TerMapping is a straightforward extension of a typical threshold-balancing method with a double-threshold scheme, while the AugMapping additionally incorporates a new scheme of augmented spike that employs a spike coefficient to carry the number of typical all-or-nothing spikes occurring at a time step. We examine the performance of our methods based on MNIST, Fashion-MNIST and CIFAR10 datasets. The results show that the proposed double-threshold scheme can effectively improve accuracies of the converted SNNs. More importantly, the proposed AugMapping is more advantageous for constructing accurate, fast and efficient deep SNNs as compared to other state-of-the-art approaches. Our study therefore provides new approaches for further integration of advanced techniques in ANNs to improve the performance of SNNs, which could be of great merit to applied developments with spike-based neuromorphic computing.




ic

Hierarchical Predictive Coding Models in a Deep-Learning Framework. (arXiv:2005.03230v1 [cs.CV])

Bayesian predictive coding is a putative neuromorphic method for acquiring higher-level neural representations to account for sensory input. Although originating in the neuroscience community, there are also efforts in the machine learning community to study these models. This paper reviews some of the more well known models. Our review analyzes module connectivity and patterns of information transfer, seeking to find general principles used across the models. We also survey some recent attempts to cast these models within a deep learning framework. A defining feature of Bayesian predictive coding is that it uses top-down, reconstructive mechanisms to predict incoming sensory inputs or their lower-level representations. Discrepancies between the predicted and the actual inputs, known as prediction errors, then give rise to future learning that refines and improves the predictive accuracy of learned higher-level representations. Predictive coding models intended to describe computations in the neocortex emerged prior to the development of deep learning and used a communication structure between modules that we name the Rao-Ballard protocol. This protocol was derived from a Bayesian generative model with some rather strong statistical assumptions. The RB protocol provides a rubric to assess the fidelity of deep learning models that claim to implement predictive coding.




ic

End-to-End Domain Adaptive Attention Network for Cross-Domain Person Re-Identification. (arXiv:2005.03222v1 [cs.CV])

Person re-identification (re-ID) remains challenging in a real-world scenario, as it requires a trained network to generalise to totally unseen target data in the presence of variations across domains. Recently, generative adversarial models have been widely adopted to enhance the diversity of training data. These approaches, however, often fail to generalise to other domains, as existing generative person re-identification models have a disconnect between the generative component and the discriminative feature learning stage. To address the on-going challenges regarding model generalisation, we propose an end-to-end domain adaptive attention network to jointly translate images between domains and learn discriminative re-id features in a single framework. To address the domain gap challenge, we introduce an attention module for image translation from source to target domains without affecting the identity of a person. More specifically, attention is directed to the background instead of the entire image of the person, ensuring identifying characteristics of the subject are preserved. The proposed joint learning network results in a significant performance improvement over state-of-the-art methods on several benchmark datasets.




ic

Hierarchical Attention Network for Action Segmentation. (arXiv:2005.03209v1 [cs.CV])

The temporal segmentation of events is an essential task and a precursor for the automatic recognition of human actions in the video. Several attempts have been made to capture frame-level salient aspects through attention but they lack the capacity to effectively map the temporal relationships in between the frames as they only capture a limited span of temporal dependencies. To this end we propose a complete end-to-end supervised learning approach that can better learn relationships between actions over time, thus improving the overall segmentation performance. The proposed hierarchical recurrent attention framework analyses the input video at multiple temporal scales, to form embeddings at frame level and segment level, and perform fine-grained action segmentation. This generates a simple, lightweight, yet extremely effective architecture for segmenting continuous video streams and has multiple application domains. We evaluate our system on multiple challenging public benchmark datasets, including MERL Shopping, 50 salads, and Georgia Tech Egocentric datasets, and achieves state-of-the-art performance. The evaluated datasets encompass numerous video capture settings which are inclusive of static overhead camera views and dynamic, ego-centric head-mounted camera views, demonstrating the direct applicability of the proposed framework in a variety of settings.




ic

A Stochastic Geometry Approach to Doppler Characterization in a LEO Satellite Network. (arXiv:2005.03205v1 [cs.IT])

A Non-terrestrial Network (NTN) comprising Low Earth Orbit (LEO) satellites can enable connectivity to underserved areas, thus complementing existing telecom networks. The high-speed satellite motion poses several challenges at the physical layer such as large Doppler frequency shifts. In this paper, an analytical framework is developed for statistical characterization of Doppler shift in an NTN where LEO satellites provide communication services to terrestrial users. Using tools from stochastic geometry, the users within a cell are grouped into disjoint clusters to limit the differential Doppler across users. Under some simplifying assumptions, the cumulative distribution function (CDF) and the probability density function are derived for the Doppler shift magnitude at a random user within a cluster. The CDFs are also provided for the minimum and the maximum Doppler shift magnitude within a cluster. Leveraging the analytical results, the interplay between key system parameters such as the cluster size and satellite altitude is examined. Numerical results validate the insights obtained from the analysis.




ic

Distributed Stabilization by Probability Control for Deterministic-Stochastic Large Scale Systems : Dissipativity Approach. (arXiv:2005.03193v1 [eess.SY])

By using dissipativity approach, we establish the stability condition for the feedback connection of a deterministic dynamical system $Sigma$ and a stochastic memoryless map $Psi$. After that, we extend the result to the class of large scale systems in which: $Sigma$ consists of many sub-systems; and $Psi$ consists of many "stochastic actuators" and "probability controllers" that control the actuator's output events. We will demonstrate the proposed approach by showing the design procedures to globally stabilize the manufacturing systems while locally balance the stock levels in any production process.




ic

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. (arXiv:2005.03191v1 [eess.AS])

Convolutional neural networks (CNN) have shown promising results for end-to-end speech recognition, albeit still behind other state-of-the-art methods in performance. In this paper, we study how to bridge this gap and go beyond with a novel CNN-RNN-transducer architecture, which we call ContextNet. ContextNet features a fully convolutional encoder that incorporates global context information into convolution layers by adding squeeze-and-excitation modules. In addition, we propose a simple scaling method that scales the widths of ContextNet that achieves good trade-off between computation and accuracy. We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1\%/4.6\% without external language model (LM), 1.9\%/4.1\% with LM and 2.9\%/7.0\% with only 10M parameters on the clean/noisy LibriSpeech test sets. This compares to the previous best published system of 2.0\%/4.6\% with LM and 3.9\%/11.3\% with 20M parameters. The superiority of the proposed ContextNet model is also verified on a much larger internal dataset.




ic

A Dynamical Perspective on Point Cloud Registration. (arXiv:2005.03190v1 [cs.CV])

We provide a dynamical perspective on the classical problem of 3D point cloud registration with correspondences. A point cloud is considered as a rigid body consisting of particles. The problem of registering two point clouds is formulated as a dynamical system, where the dynamic model point cloud translates and rotates in a viscous environment towards the static scene point cloud, under forces and torques induced by virtual springs placed between each pair of corresponding points. We first show that the potential energy of the system recovers the objective function of the maximum likelihood estimation. We then adopt Lyapunov analysis, particularly the invariant set theorem, to analyze the rigid body dynamics and show that the system globally asymptotically tends towards the set of equilibrium points, where the globally optimal registration solution lies in. We conjecture that, besides the globally optimal equilibrium point, the system has either three or infinite "spurious" equilibrium points, and these spurious equilibria are all locally unstable. The case of three spurious equilibria corresponds to generic shape of the point cloud, while the case of infinite spurious equilibria happens when the point cloud exhibits symmetry. Therefore, simulating the dynamics with random perturbations guarantees to obtain the globally optimal registration solution. Numerical experiments support our analysis and conjecture.




ic

Determinantal Point Processes in Randomized Numerical Linear Algebra. (arXiv:2005.03185v1 [cs.DS])

Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop improved algorithms for matrix problems that arise in scientific computing, data science, machine learning, etc. Determinantal Point Processes (DPPs), a seemingly unrelated topic in pure and applied mathematics, is a class of stochastic point processes with probability distribution characterized by sub-determinants of a kernel matrix. Recent work has uncovered deep and fruitful connections between DPPs and RandNLA which lead to new guarantees and improved algorithms that are of interest to both areas. We provide an overview of this exciting new line of research, including brief introductions to RandNLA and DPPs, as well as applications of DPPs to classical linear algebra tasks such as least squares regression, low-rank approximation and the Nystr"om method. For example, random sampling with a DPP leads to new kinds of unbiased estimators for least squares, enabling more refined statistical and inferential understanding of these algorithms; a DPP is, in some sense, an optimal randomized algorithm for the Nystr"om method; and a RandNLA technique called leverage score sampling can be derived as the marginal distribution of a DPP. We also discuss recent algorithmic developments, illustrating that, while not quite as efficient as standard RandNLA techniques, DPP-based algorithms are only moderately more expensive.




ic

A Proposal for Intelligent Agents with Episodic Memory. (arXiv:2005.03182v1 [cs.AI])

In the future we can expect that artificial intelligent agents, once deployed, will be required to learn continually from their experience during their operational lifetime. Such agents will also need to communicate with humans and other agents regarding the content of their experience, in the context of passing along their learnings, for the purpose of explaining their actions in specific circumstances or simply to relate more naturally to humans concerning experiences the agent acquires that are not necessarily related to their assigned tasks. We argue that to support these goals, an agent would benefit from an episodic memory; that is, a memory that encodes the agent's experience in such a way that the agent can relive the experience, communicate about it and use its past experience, inclusive of the agents own past actions, to learn more effective models and policies. In this short paper, we propose one potential approach to provide an AI agent with such capabilities. We draw upon the ever-growing body of work examining the function and operation of the Medial Temporal Lobe (MTL) in mammals to guide us in adding an episodic memory capability to an AI agent composed of artificial neural networks (ANNs). Based on that, we highlight important aspects to be considered in the memory organization and we propose an architecture combining ANNs and standard Computer Science techniques for supporting storage and retrieval of episodic memories. Despite being initial work, we hope this short paper can spark discussions around the creation of intelligent agents with memory or, at least, provide a different point of view on the subject.




ic

Lattice-based public key encryption with equality test in standard model, revisited. (arXiv:2005.03178v1 [cs.CR])

Public key encryption with equality test (PKEET) allows testing whether two ciphertexts are generated by the same message or not. PKEET is a potential candidate for many practical applications like efficient data management on encrypted databases. Potential applicability of PKEET leads to intensive research from its first instantiation by Yang et al. (CT-RSA 2010). Most of the followup constructions are secure in the random oracle model. Moreover, the security of all the concrete constructions is based on number-theoretic hardness assumptions which are vulnerable in the post-quantum era. Recently, Lee et al. (ePrint 2016) proposed a generic construction of PKEET schemes in the standard model and hence it is possible to yield the first instantiation of PKEET schemes based on lattices. Their method is to use a $2$-level hierarchical identity-based encryption (HIBE) scheme together with a one-time signature scheme. In this paper, we propose, for the first time, a direct construction of a PKEET scheme based on the hardness assumption of lattices in the standard model. More specifically, the security of the proposed scheme is reduces to the hardness of the Learning With Errors problem.




ic

Nonlinear model reduction: a comparison between POD-Galerkin and POD-DEIM methods. (arXiv:2005.03173v1 [physics.comp-ph])

Several nonlinear model reduction techniques are compared for the three cases of the non-parallel version of the Kuramoto-Sivashinsky equation, the transient regime of flow past a cylinder at $Re=100$ and fully developed flow past a cylinder at the same Reynolds number. The linear terms of the governing equations are reduced by Galerkin projection onto a POD basis of the flow state, while the reduced nonlinear convection terms are obtained either by a Galerkin projection onto the same state basis, by a Galerkin projection onto a POD basis representing the nonlinearities or by applying the Discrete Empirical Interpolation Method (DEIM) to a POD basis of the nonlinearities. The quality of the reduced order models is assessed as to their stability, accuracy and robustness, and appropriate quantitative measures are introduced and compared. In particular, the properties of the reduced linear terms are compared to those of the full-scale terms, and the structure of the nonlinear quadratic terms is analyzed as to the conservation of kinetic energy. It is shown that all three reduction techniques provide excellent and similar results for the cases of the Kuramoto-Sivashinsky equation and the limit-cycle cylinder flow. For the case of the transient regime of flow past a cylinder, only the pure Galerkin techniques are successful, while the DEIM technique produces reduced-order models that diverge in finite time.




ic

On the Learnability of Possibilistic Theories. (arXiv:2005.03157v1 [cs.LO])

We investigate learnability of possibilistic theories from entailments in light of Angluin's exact learning model. We consider cases in which only membership, only equivalence, and both kinds of queries can be posed by the learner. We then show that, for a large class of problems, polynomial time learnability results for classical logic can be transferred to the respective possibilistic extension. In particular, it follows from our results that the possibilistic extension of propositional Horn theories is exactly learnable in polynomial time. As polynomial time learnability in the exact model is transferable to the classical probably approximately correct model extended with membership queries, our work also establishes such results in this model.




ic

An augmented Lagrangian preconditioner for implicitly-constituted non-Newtonian incompressible flow. (arXiv:2005.03150v1 [math.NA])

We propose an augmented Lagrangian preconditioner for a three-field stress-velocity-pressure discretization of stationary non-Newtonian incompressible flow with an implicit constitutive relation of power-law type. The discretization employed makes use of the divergence-free Scott-Vogelius pair for the velocity and pressure. The preconditioner builds on the work [P. E. Farrell, L. Mitchell, and F. Wechsung, SIAM J. Sci. Comput., 41 (2019), pp. A3073-A3096], where a Reynolds-robust preconditioner for the three-dimensional Newtonian system was introduced. The preconditioner employs a specialized multigrid method for the stress-velocity block that involves a divergence-capturing space decomposition and a custom prolongation operator. The solver exhibits excellent robustness with respect to the parameters arising in the constitutive relation, allowing for the simulation of a wide range of materials.




ic

Optimally Convergent Mixed Finite Element Methods for the Stochastic Stokes Equations. (arXiv:2005.03148v1 [math.NA])

We propose some new mixed finite element methods for the time dependent stochastic Stokes equations with multiplicative noise, which use the Helmholtz decomposition of the driving multiplicative noise. It is known [16] that the pressure solution has a low regularity, which manifests in sub-optimal convergence rates for well-known inf-sup stable mixed finite element methods in numerical simulations, see [10]. We show that eliminating this gradient part from the noise in the numerical scheme leads to optimally convergent mixed finite element methods, and that this conceptual idea may be used to retool numerical methods that are well-known in the deterministic setting, including pressure stabilization methods, so that their optimal convergence properties can still be maintained in the stochastic setting. Computational experiments are also provided to validate the theoretical results and to illustrate the conceptional usefulness of the proposed numerical approach.




ic

A Gentle Introduction to Quantum Computing Algorithms with Applications to Universal Prediction. (arXiv:2005.03137v1 [quant-ph])

In this technical report we give an elementary introduction to Quantum Computing for non-physicists. In this introduction we describe in detail some of the foundational Quantum Algorithms including: the Deutsch-Jozsa Algorithm, Shor's Algorithm, Grocer Search, and Quantum Counting Algorithm and briefly the Harrow-Lloyd Algorithm. Additionally we give an introduction to Solomonoff Induction, a theoretically optimal method for prediction. We then attempt to use Quantum computing to find better algorithms for the approximation of Solomonoff Induction. This is done by using techniques from other Quantum computing algorithms to achieve a speedup in computing the speed prior, which is an approximation of Solomonoff's prior, a key part of Solomonoff Induction. The major limiting factors are that the probabilities being computed are often so small that without a sufficient (often large) amount of trials, the error may be larger than the result. If a substantial speedup in the computation of an approximation of Solomonoff Induction can be achieved through quantum computing, then this can be applied to the field of intelligent agents as a key part of an approximation of the agent AIXI.




ic

Evaluation, Tuning and Interpretation of Neural Networks for Meteorological Applications. (arXiv:2005.03126v1 [physics.ao-ph])

Neural networks have opened up many new opportunities to utilize remotely sensed images in meteorology. Common applications include image classification, e.g., to determine whether an image contains a tropical cyclone, and image translation, e.g., to emulate radar imagery for satellites that only have passive channels. However, there are yet many open questions regarding the use of neural networks in meteorology, such as best practices for evaluation, tuning and interpretation. This article highlights several strategies and practical considerations for neural network development that have not yet received much attention in the meteorological community, such as the concept of effective receptive fields, underutilized meteorological performance measures, and methods for NN interpretation, such as synthetic experiments and layer-wise relevance propagation. We also consider the process of neural network interpretation as a whole, recognizing it as an iterative scientist-driven discovery process, and breaking it down into individual steps that researchers can take. Finally, while most work on neural network interpretation in meteorology has so far focused on networks for image classification tasks, we expand the focus to also include networks for image translation.




ic

Rigid Matrices From Rectangular PCPs. (arXiv:2005.03123v1 [cs.CC])

We introduce a variant of PCPs, that we refer to as rectangular PCPs, wherein proofs are thought of as square matrices, and the random coins used by the verifier can be partitioned into two disjoint sets, one determining the row of each query and the other determining the *column*.

We construct PCPs that are efficient, short, smooth and (almost-)rectangular. As a key application, we show that proofs for hard languages in NTIME$(2^n)$, when viewed as matrices, are rigid infinitely often. This strengthens and considerably simplifies a recent result of Alman and Chen [FOCS, 2019] constructing explicit rigid matrices in FNP. Namely, we prove the following theorem: - There is a constant $delta in (0,1)$ such that there is an FNP-machine that, for infinitely many $N$, on input $1^N$ outputs $N imes N$ matrices with entries in $mathbb{F}_2$ that are $delta N^2$-far (in Hamming distance) from matrices of rank at most $2^{log N/Omega(log log N)}$.

Our construction of rectangular PCPs starts with an analysis of how randomness yields queries in the Reed--Muller-based outer PCP of Ben-Sasson, Goldreich, Harsha, Sudan and Vadhan [SICOMP, 2006; CCC, 2005]. We then show how to preserve rectangularity under PCP composition and a smoothness-inducing transformation. This warrants refined and stronger notions of rectangularity, which we prove for the outer PCP and its transforms.




ic

Electricity-Aware Heat Unit Commitment: A Bid-Validity Approach. (arXiv:2005.03120v1 [eess.SY])

Coordinating the operation of combined heat and power plants (CHPs) and heat pumps (HPs) at the interface between heat and power systems is essential to achieve a cost-effective and efficient operation of the overall energy system. Indeed, in the current sequential market practice, the heat market has no insight into the impacts of heat dispatch on the electricity market. While preserving this sequential practice, this paper introduces an electricity-aware heat unit commitment model. Coordination is achieved through bid validity constraints, which embed the techno-economic linkage between heat and electricity outputs and costs of CHPs and HPs. This approach constitutes a novel market mechanism for the coordination of heat and power systems, defining heat bids conditionally on electricity market prices. The resulting model is a trilevel optimization problem, which we recast as a mixed-integer linear program using a lexicographic function. We use a realistic case study based on the Danish power and heat system, and show that the proposed model yields a 4.5% reduction in total operating cost of heat and power systems compared to a traditional decoupled unit commitment model, while reducing the financial losses of each CHP and HP due to invalid bids by up-to 20.3 million euros.




ic

Strong replica symmetry in high-dimensional optimal Bayesian inference. (arXiv:2005.03115v1 [math.PR])

We consider generic optimal Bayesian inference, namely, models of signal reconstruction where the posterior distribution and all hyperparameters are known. Under a standard assumption on the concentration of the free energy, we show how replica symmetry in the strong sense of concentration of all multioverlaps can be established as a consequence of the Franz-de Sanctis identities; the identities themselves in the current setting are obtained via a novel perturbation of the prior distribution of the signal. Concentration of multioverlaps means that asymptotically the posterior distribution has a particularly simple structure encoded by a random probability measure (or, in the case of binary signal, a non-random probability measure). We believe that such strong control of the model should be key in the study of inference problems with underlying sparse graphical structure (error correcting codes, block models, etc) and, in particular, in the derivation of replica symmetric formulas for the free energy and mutual information in this context.




ic

Deep Learning for Image-based Automatic Dial Meter Reading: Dataset and Baselines. (arXiv:2005.03106v1 [cs.CV])

Smart meters enable remote and automatic electricity, water and gas consumption reading and are being widely deployed in developed countries. Nonetheless, there is still a huge number of non-smart meters in operation. Image-based Automatic Meter Reading (AMR) focuses on dealing with this type of meter readings. We estimate that the Energy Company of Paran'a (Copel), in Brazil, performs more than 850,000 readings of dial meters per month. Those meters are the focus of this work. Our main contributions are: (i) a public real-world dial meter dataset (shared upon request) called UFPR-ADMR; (ii) a deep learning-based recognition baseline on the proposed dataset; and (iii) a detailed error analysis of the main issues present in AMR for dial meters. To the best of our knowledge, this is the first work to introduce deep learning approaches to multi-dial meter reading, and perform experiments on unconstrained images. We achieved a 100.0% F1-score on the dial detection stage with both Faster R-CNN and YOLO, while the recognition rates reached 93.6% for dials and 75.25% for meters using Faster R-CNN (ResNext-101).




ic

Constrained de Bruijn Codes: Properties, Enumeration, Constructions, and Applications. (arXiv:2005.03102v1 [cs.IT])

The de Bruijn graph, its sequences, and their various generalizations, have found many applications in information theory, including many new ones in the last decade. In this paper, motivated by a coding problem for emerging memory technologies, a set of sequences which generalize sequences in the de Bruijn graph are defined. These sequences can be also defined and viewed as constrained sequences. Hence, they will be called constrained de Bruijn sequences and a set of such sequences will be called a constrained de Bruijn code. Several properties and alternative definitions for such codes are examined and they are analyzed as generalized sequences in the de Bruijn graph (and its generalization) and as constrained sequences. Various enumeration techniques are used to compute the total number of sequences for any given set of parameters. A construction method of such codes from the theory of shift-register sequences is proposed. Finally, we show how these constrained de Bruijn sequences and codes can be applied in constructions of codes for correcting synchronization errors in the $ell$-symbol read channel and in the racetrack memory channel. For this purpose, these codes are superior in their size on previously known codes.




ic

Inference with Choice Functions Made Practical. (arXiv:2005.03098v1 [cs.AI])

We study how to infer new choices from previous choices in a conservative manner. To make such inferences, we use the theory of choice functions: a unifying mathematical framework for conservative decision making that allows one to impose axioms directly on the represented decisions. We here adopt the coherence axioms of De Bock and De Cooman (2019). We show how to naturally extend any given choice assessment to such a coherent choice function, whenever possible, and use this natural extension to make new choices. We present a practical algorithm to compute this natural extension and provide several methods that can be used to improve its scalability.