
A LiDAR-based real-time capable 3D Perception System for Automated Driving in Urban Domains. (arXiv:2005.03404v1 [cs.RO])

We present a LiDAR-based and real-time capable 3D perception system for automated driving in urban domains. The hierarchical system design is able to model stationary and movable parts of the environment simultaneously and under real-time conditions. Our approach extends the state of the art by innovative in-detail enhancements for perceiving road users and drivable corridors even in case of non-flat ground surfaces and overhanging or protruding elements. We describe a runtime-efficient pointcloud processing pipeline, consisting of adaptive ground surface estimation, 3D clustering and motion classification stages. Based on the pipeline's output, the stationary environment is represented in a multi-feature mapping and fusion approach. Movable elements are represented in an object tracking system capable of using multiple reference points to account for viewpoint changes. We further enhance the tracking system by explicit consideration of occlusion and ambiguity cases. Our system is evaluated using a subset of the TUBS Road User Dataset. We enhance common performance metrics by considering application-driven aspects of real-world traffic scenarios. The perception system shows impressive results and is able to cope with the addressed scenarios while still preserving real-time capability.





Datom: A Deformable modular robot for building self-reconfigurable programmable matter. (arXiv:2005.03402v1 [cs.RO])

Moving a module in a modular robot is a very complex and error-prone process. Unlike in swarms, in the modular robots we are targeting, the moving module must stay connected to at least one other module. In order to miniaturize each module to a few millimeters, we have proposed a design that uses electrostatic actuators. However, this movement is composed of several attachment and detachment steps, and each small step can fail, causing a module to break the connection. The idea developed in this paper consists in creating a new kind of deformable module allowing a movement that keeps the connection between the moving and the fixed modules. We detail the geometry and the practical constraints considered during the design of this new module. We then validate the possibility of movement for a module in an existing configuration. This requires the cooperation of some of the modules placed along the path, and we show in simulation that there exists a motion process to reach every free position on the surface of a given configuration.





Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation. (arXiv:2005.03393v1 [cs.CL])

In encoder-decoder neural models, multiple encoders are generally used to represent contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder not only encodes the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multiple encoders in context-aware translation: some of the improvements come from robust training. We compare several methods that introduce noise and/or a well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. We also establish a new state of the art on the IWSLT Fr-En task by careful use of noise generation and dropout methods.





Semantic Signatures for Large-scale Visual Localization. (arXiv:2005.03388v1 [cs.CV])

Visual localization is a useful alternative to standard localization techniques. It works by utilizing cameras. In a typical scenario, features are extracted from captured images and compared with geo-referenced databases. Location information is then inferred from the matching results. Conventional schemes mainly use low-level visual features. These approaches offer good accuracy but suffer from scalability issues. In order to assist localization in large urban areas, this work explores a different path by utilizing high-level semantic information. It is found that object information in a street view can facilitate localization. A novel descriptor scheme called "semantic signature" is proposed to summarize this information. A semantic signature consists of type and angle information of visible objects at a spatial location. Several metrics and protocols are proposed for signature comparison and retrieval. They illustrate different trade-offs between accuracy and complexity. Extensive simulation results confirm the potential of the proposed scheme in large-scale applications. This paper is an extended version of a conference paper in CBMI'18. A more efficient retrieval protocol is presented with additional experiment results.





WSMN: An optimized multipurpose blind watermarking in Shearlet domain using MLP and NSGA-II. (arXiv:2005.03382v1 [cs.CR])

Digital watermarking is an important topic in the field of information security for preventing the misuse of images in multimedia networks. Although unauthorized access can be prevented through cryptography, cryptography cannot simultaneously be used for copyright protection or content authentication while preserving image integrity. Hence, this paper presents an optimized multipurpose blind watermarking scheme in the Shearlet domain with the help of smart algorithms including MLP and NSGA-II. In this method, four copies of the robust copyright logo are embedded in the approximation coefficients of the Shearlet transform by using an effective quantization technique. Furthermore, an embedded random sequence serving as a semi-fragile authentication mark is effectively extracted from the detail coefficients by the neural network. Because an effective optimization algorithm selects the optimum embedding thresholds and the texture of blocks is distinguished, imperceptibility and robustness are preserved. The experimental results reveal the superiority of the scheme over other state-of-the-art schemes with regard to the quality of watermarked images and robustness against hybrid attacks. The average PSNR and SSIM of the dual watermarked images are 38 dB and 0.95, respectively. Moreover, the scheme can effectively extract the copyright logo and locate forged regions under severe attacks with satisfactory accuracy.





Vid2Curve: Simultaneously Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video. (arXiv:2005.03372v1 [cs.GR])

Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world.

It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion.

We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera.

Specifically, we present a new curve-based approach to estimate accurate camera poses by establishing correspondences between featureless thin objects in the foreground in consecutive video frames, without requiring visual texture in the background scene to lock on.

Enabled by this effective curve-based camera pose estimation strategy, we develop an iterative optimization method with tailored measures on geometry, topology as well as self-occlusion handling for reconstructing 3D thin structures.

Extensive validations on a variety of thin structures show that our method achieves accurate camera pose estimation and faithful reconstruction of 3D thin structures with complex shape and topology at a level that has not been attained by other existing reconstruction methods.





Scoring Root Necrosis in Cassava Using Semantic Segmentation. (arXiv:2005.03367v1 [eess.IV])

Cassava, a major food crop in many parts of Africa, has been severely affected by Cassava Brown Streak Disease (CBSD). The disease affects tuberous roots and presents symptoms that include a yellow/brown, dry, corky necrosis within the starch-bearing tissues. Cassava breeders currently depend on visual inspection to score necrosis in roots based on a qualitative score, which is quite subjective. In this paper we present an approach to automate root necrosis scoring using deep convolutional neural networks with semantic segmentation. Our experiments show that the UNet model performs this task with high accuracy, achieving a mean Intersection over Union (IoU) of 0.90 on the test set. This method provides a means to use a quantitative measure for necrosis scoring on root cross-sections. This is done by segmenting and classifying the necrotized and non-necrotized pixels of cassava root cross-sections without any additional feature engineering.
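
To make the reported metric concrete, the following is a minimal sketch of how a mean IoU and a pixel-based necrosis score could be computed from a predicted label mask. It is not the authors' code: the function names, the two-class labelling (0 = non-necrotized, 1 = necrotized), and the assumption that the mask covers only root pixels are illustrative choices.

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over Union of two boolean masks of equal shape."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred_mask, true_mask).sum()
    if union == 0:
        # Both masks are empty: counted as perfect agreement (a convention).
        return 1.0
    return float(np.logical_and(pred_mask, true_mask).sum() / union)

def mean_iou(pred_labels, true_labels, num_classes=2):
    """Average the per-class IoU over all classes."""
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    scores = [iou(pred_labels == c, true_labels == c) for c in range(num_classes)]
    return float(np.mean(scores))

def necrosis_fraction(labels, necrotic_class=1):
    """Share of pixels labelled necrotized (assumes the mask covers only the root)."""
    return float((np.asarray(labels) == necrotic_class).mean())

pred = np.array([[1, 1], [0, 0]])
true = np.array([[1, 0], [0, 0]])
print(mean_iou(pred, true), necrosis_fraction(pred))
```

A fraction of this kind is one way to turn a segmentation into a quantitative score; how the paper maps pixels to a final score is not specified in the abstract.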





Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT])

We utilize recent results on the exact block error probability of Gaussian random codes in additive white Gaussian noise to analyze Gaussian random coding for massive multiple-access at finite message length. Soft iterative interference cancellation is found to closely approach the performance bounds recently found in [1]. The existence of two fundamentally different regimes in the trade-off between power and bandwidth efficiency reported in [2] is related to much older results in [3] on power optimization by linear programming. Furthermore, we tighten the achievability bounds of [1] in the low power regime and show that orthogonal constellations are very close to the theoretical limits for message lengths around 100 and above.





Probabilistic Hyperproperties of Markov Decision Processes. (arXiv:2005.03362v1 [cs.LO])

We study the specification and verification of hyperproperties for probabilistic systems represented as Markov decision processes (MDPs). Hyperproperties are system properties that describe the correctness of a system as a relation between multiple executions. Hyperproperties generalize trace properties and include information-flow security requirements, like noninterference, as well as requirements like symmetry, partial observation, robustness, and fault tolerance. We introduce the temporal logic PHL, which extends classic probabilistic logics with quantification over schedulers and traces. PHL can express a wide range of hyperproperties for probabilistic systems, including both classical applications, such as differential privacy, and novel applications in areas such as robotics and planning. While the model checking problem for PHL is in general undecidable, we provide methods both for proving and for refuting a class of probabilistic hyperproperties for MDPs.





JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese--English and News Commentary Japanese--Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks.





Self-Supervised Human Depth Estimation from Monocular Videos. (arXiv:2005.03358v1 [cs.CV])

Previous methods on estimating detailed human depth often require supervised training with `ground truth' depth data. This paper presents a self-supervised method that can be trained on YouTube videos without known depth, which makes training data collection simple and improves the generalization of the learned network. The self-supervised learning is achieved by minimizing a photo-consistency loss, which is evaluated between a video frame and its neighboring frames warped according to the estimated depth and the 3D non-rigid motion of the human body. To solve this non-rigid motion, we first estimate a rough SMPL model at each video frame and compute the non-rigid body motion accordingly, which enables self-supervised learning on estimating the shape details. Experiments demonstrate that our method enjoys better generalization and performs much better on data in the wild.





Estimating Blood Pressure from Photoplethysmogram Signal and Demographic Features using Machine Learning Techniques. (arXiv:2005.03357v1 [eess.SP])

Hypertension is a potentially dangerous health condition that can be indicated directly by blood pressure (BP). Hypertension always leads to other health complications. Continuous monitoring of BP is very important; however, cuff-based BP measurements are discrete and uncomfortable for the user. To address this need, a cuff-less, continuous, and non-invasive BP measurement system is proposed that uses the photoplethysmogram (PPG) signal and demographic features with machine learning (ML) algorithms. PPG signals were acquired from 219 subjects and underwent pre-processing and feature extraction steps. Time, frequency, and time-frequency domain features were extracted from the PPG signals and their derivatives. Feature selection techniques were used to reduce the computational complexity and to decrease the chance of over-fitting the ML algorithms. The features were then used to train and evaluate ML algorithms. The best regression models were selected for systolic BP (SBP) and diastolic BP (DBP) estimation individually. Gaussian Process Regression (GPR) combined with the ReliefF feature selection algorithm outperforms other algorithms in estimating SBP and DBP, with a root-mean-square error (RMSE) of 6.74 and 3.59, respectively. This ML model can be implemented in hardware systems to continuously monitor BP and avoid critical health conditions due to sudden changes.





DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL])

Despite recent progress in computer vision and natural language processing, video understanding intelligence is still hard to achieve due to the intrinsic difficulty of stories in video. Moreover, there is no theoretical metric for evaluating the degree of video understanding. In this paper, we propose a novel video question answering (Video QA) task, DramaQA, for a comprehensive understanding of the video story. DramaQA focuses on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations to model local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and contains 16,191 QA pairs from 23,928 video clips of various lengths, with each QA pair belonging to one of four difficulty levels. We provide 217,308 annotated images with rich character-centered annotations, including visual bounding boxes, behaviors, and emotions of main characters, and coreference-resolved scripts. Additionally, we provide analyses of the dataset as well as a Dual Matching Multistream model which effectively learns character-centered representations of video to answer questions about the video. We are planning to release our dataset and model publicly for research purposes and expect that our work will provide a new perspective on video story understanding research.





Quantum correlation alignment for unsupervised domain adaptation. (arXiv:2005.03355v1 [quant-ph])

Correlation alignment (CORAL), a representative domain adaptation (DA) algorithm, decorrelates and aligns a labelled source domain dataset to an unlabelled target domain dataset to minimize the domain shift such that a classifier can be applied to predict the target domain labels. In this paper, we implement the CORAL on quantum devices by two different methods. One method utilizes quantum basic linear algebra subroutines (QBLAS) to implement the CORAL with exponential speedup in the number and dimension of the given data samples. The other method is achieved through a variational hybrid quantum-classical procedure. In addition, the numerical experiments of the CORAL with three different types of data sets, namely the synthetic data, the synthetic-Iris data, the handwritten digit data, are presented to evaluate the performance of our work. The simulation results prove that the variational quantum correlation alignment algorithm (VQCORAL) can achieve competitive performance compared with the classical CORAL.





DMCP: Differentiable Markov Channel Pruning for Neural Networks. (arXiv:2005.03354v1 [cs.CV])

Recent works imply that channel pruning can be regarded as searching for an optimal sub-structure in unpruned networks.

However, existing works based on this observation require training and evaluating a large number of structures, which limits their application.

In this paper, we propose a novel differentiable method for channel pruning, named Differentiable Markov Channel Pruning (DMCP), to efficiently search the optimal sub-structure.

Our method is differentiable and can be directly optimized by gradient descent with respect to standard task loss and budget regularization (e.g. FLOPs constraint).

In DMCP, we model channel pruning as a Markov process, in which each state represents retaining the corresponding channel during pruning, and transitions between states denote the pruning process.

In the end, our method is able to implicitly select the proper number of channels in each layer by the Markov process with optimized transitions. To validate the effectiveness of our method, we perform extensive experiments on Imagenet with ResNet and MobilenetV2.

Results show that our method achieves consistent improvements over state-of-the-art pruning methods in various FLOPs settings. The code is available at https://github.com/zx55/dmcp
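
One plausible reading of the Markov formulation in this abstract can be sketched numerically. Assume, purely for illustration, that channels in a layer are ordered and that `transitions[k]` is the probability of retaining channel k given that channel k-1 is retained. The marginal probability of keeping a channel is then a cumulative product, and the expected channel count is a smooth function of the transitions, which is what would make a budget term usable with gradient descent. The parametrisation and function names below are assumptions, not taken from the paper or its repository.

```python
import numpy as np

def keep_probabilities(transitions):
    """Marginal probability of retaining each channel under the assumed chain."""
    # Channel k is kept only if all transitions up to k succeeded.
    return np.cumprod(np.asarray(transitions, dtype=float))

def expected_channels(transitions):
    """Expected number of retained channels in the layer."""
    return float(keep_probabilities(transitions).sum())

transitions = [1.0, 0.5, 0.5]   # hypothetical learned transition probabilities
print(keep_probabilities(transitions), expected_channels(transitions))
```

A budget regulariser of the kind the abstract mentions could then penalise the gap between a cost derived from these expectations and a target FLOPs value.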





Error estimates for the Cahn--Hilliard equation with dynamic boundary conditions. (arXiv:2005.03349v1 [math.NA])

A proof of convergence is given for bulk--surface finite element semi-discretisation of the Cahn--Hilliard equation with Cahn--Hilliard-type dynamic boundary conditions in a smooth domain. The semi-discretisation is studied in the weak formulation as a second order system. Optimal-order uniform-in-time error estimates are shown in the $L^2$ and $H^1$ norms. The error estimates are based on a consistency and stability analysis. The proof of stability is performed in an abstract framework, based on energy estimates exploiting the anti-symmetric structure of the second order system. Numerical experiments illustrate the theoretical results.





Scene Text Image Super-Resolution in the Wild. (arXiv:2005.03341v1 [cs.CV])

Low-resolution text images are often seen in natural scenes such as documents captured by mobile phones. Recognizing low-resolution text images is challenging because they lose detailed content information, leading to poor recognition accuracy. An intuitive solution is to introduce super-resolution (SR) techniques as pre-processing. However, previous single image super-resolution (SISR) methods are trained on synthetic low-resolution images (e.g., bicubic down-sampling), which is simple and not suitable for real low-resolution text recognition. To this end, we propose a real scene text SR dataset, termed TextZoom. It contains paired real low-resolution and high-resolution images which are captured by cameras with different focal lengths in the wild. It is more authentic and challenging than synthetic data, as shown in Fig. 1. We argue that improving the recognition accuracy is the ultimate goal for scene text SR. For this purpose, a new Text Super-Resolution Network, termed TSRN, with three novel modules is developed. (1) A sequential residual block is proposed to extract the sequential information of the text images. (2) A boundary-aware loss is designed to sharpen the character boundaries. (3) A central alignment module is proposed to relieve the misalignment problem in TextZoom. Extensive experiments on TextZoom demonstrate that our TSRN improves the recognition accuracy by over 13% for CRNN, and by nearly 9.0% for ASTER and MORAN, compared to synthetic SR data. Furthermore, our TSRN clearly outperforms 7 state-of-the-art SR methods in boosting the recognition accuracy of LR images in TextZoom. For example, it outperforms LapSRN by over 5% and 8% on the recognition accuracy of ASTER and CRNN, respectively. Our results suggest that low-resolution text recognition in the wild is far from being solved, and thus more research effort is needed.





Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV])

Convolutional Neural Networks (CNNs) are generally prone to noise interruptions, i.e., small image noise can cause drastic changes in the output. To suppress the effect of noise on the final prediction, we enhance CNNs by replacing max-pooling, strided convolution, and average-pooling with the Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets such as Haar, Daubechies, and Cohen, and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into low-frequency and high-frequency components during down-sampling. The low-frequency component stores the main information, including the basic object structures, and is transmitted to the subsequent layers to extract robust high-level features. The high-frequency components, containing most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions.
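
The down-sampling idea described here can be illustrated with a one-level 2D Haar transform on a single feature map. The sketch below is a NumPy illustration under the orthonormal Haar convention, not the paper's layer implementation; the function names are hypothetical, and the feature map is assumed to have even height and width.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar DWT. Returns the low-frequency band and three detail bands."""
    x = np.asarray(x, dtype=float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    low = (a + b + c + d) / 2
    det1 = (a + b - c - d) / 2
    det2 = (a - b + c - d) / 2
    det3 = (a - b - c + d) / 2
    return low, det1, det2, det3

def haar_idwt2(low, det1, det2, det3):
    """Inverse of haar_dwt2; reconstructs the original array exactly."""
    h, w = low.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = (low + det1 + det2 + det3) / 2
    out[0::2, 1::2] = (low + det1 - det2 - det3) / 2
    out[1::2, 0::2] = (low - det1 + det2 - det3) / 2
    out[1::2, 1::2] = (low - det1 - det2 + det3) / 2
    return out

def wavelet_downsample(x):
    """Halve the resolution by keeping only the low-frequency band."""
    return haar_dwt2(x)[0]

feature_map = np.arange(16, dtype=float).reshape(4, 4)
print(wavelet_downsample(feature_map).shape)
```

Keeping only the low-frequency band plays the role that pooling or strided convolution would otherwise play, while the detail bands, where most of the noise is expected to sit, are discarded.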





Causal Paths in Temporal Networks of Face-to-Face Human Interactions. (arXiv:2005.03333v1 [cs.SI])

In a temporal network, causal paths are characterized by the fact that links from a source to a target must respect the chronological order. In this article we study the structure of causal paths in temporal networks of human face-to-face interactions in different social contexts. In a static network paths are transitive, i.e., the existence of a link from $a$ to $b$ and from $b$ to $c$ implies the existence of a path from $a$ to $c$ via $b$. In a temporal network the chronological constraint introduces time correlations that affect transitivity. A probabilistic model based on higher-order Markov chains shows that correlations that can invalidate transitivity are present only when the time gap between consecutive events is larger than the average value and are negligible below such a value. The comparison between the densities of the temporal and static accessibility matrices shows that the static representation can be used with good approximation. Moreover, we quantify the extent of the causally connected region of the networks over time.
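
The difference between static and causal reachability can be shown with a small earliest-arrival computation. This is an illustrative sketch, not the paper's model: events are assumed to be directed `(time, source, target)` triples, and a path is taken to be causal when its link times strictly increase.

```python
def causal_reachable(events, source):
    """Earliest arrival time at every node reachable from `source`
    along paths whose link times strictly increase."""
    arrival = {source: float("-inf")}
    # Scanning events in time order means the first arrival found is the earliest.
    for t, u, v in sorted(events):
        if u in arrival and arrival[u] < t < arrival.get(v, float("inf")):
            arrival[v] = t
    return arrival

# Links a->b and b->c exist in both cases, so a static view sees a path a->c.
in_order = [(1, "a", "b"), (2, "b", "c")]
out_of_order = [(2, "a", "b"), (1, "b", "c")]
print(causal_reachable(in_order, "a"))
print(causal_reachable(out_of_order, "a"))
```

In the second event list the link from b to c occurs before a reaches b, so c is not causally reachable from a even though the static network would report a path.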





Specification and Automated Analysis of Inter-Parameter Dependencies in Web APIs. (arXiv:2005.03320v1 [cs.SE])

Web services often impose inter-parameter dependencies that restrict the way in which two or more input parameters can be combined to form valid calls to the service. Unfortunately, current specification languages for web services like the OpenAPI Specification (OAS) provide no support for the formal description of such dependencies, which makes it hardly possible to automatically discover and interact with services without human intervention. In this article, we present an approach for the specification and automated analysis of inter-parameter dependencies in web APIs. We first present a domain-specific language, called Inter-parameter Dependency Language (IDL), for the specification of dependencies among input parameters in web services. Then, we propose a mapping to translate an IDL document into a constraint satisfaction problem (CSP), enabling the automated analysis of IDL specifications using standard CSP-based reasoning operations. Specifically, we present a catalogue of nine analysis operations on IDL documents allowing to compute, for example, whether a given request satisfies all the dependencies of the service. Finally, we present a tool suite including an editor, a parser, an OAS extension, a constraint programming-aided library, and a test suite supporting IDL specifications and their analyses. Together, these contributions pave the way for a new range of specification-driven applications in areas such as code generation and testing.
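
As an illustration of the kind of analysis operation mentioned above (checking whether a request satisfies all dependencies), the following sketch encodes two common dependency patterns as plain Python predicates. It does not use IDL syntax or the authors' tool suite; the helper names and the example parameters are hypothetical.

```python
def requires(param, required):
    """If `param` is present in the request, `required` must be present too."""
    return lambda request: param not in request or required in request

def only_one(*params):
    """Exactly one of the listed parameters must be present."""
    return lambda request: sum(p in request for p in params) == 1

def is_valid(request, dependencies):
    """A request is valid when it satisfies every dependency."""
    return all(dep(request) for dep in dependencies)

# Hypothetical dependencies for an imaginary search operation.
dependencies = [
    requires("radius", "location"),
    only_one("query", "category"),
]

print(is_valid({"query": "pizza", "location": "here", "radius": 5}, dependencies))
print(is_valid({"query": "pizza", "radius": 5}, dependencies))
```

A constraint solver, as proposed in the paper, goes further than this kind of direct evaluation: it can also answer questions such as whether any valid request exists at all.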





Interval type-2 fuzzy logic system based similarity evaluation for image steganography. (arXiv:2005.03310v1 [cs.MM])

Similarity measure, also called information measure, is a concept used to distinguish different objects. It has been studied from different contexts by employing mathematical, psychological, and fuzzy approaches. Image steganography is the art of hiding secret data into an image in such a way that it cannot be detected by an intruder. In image steganography, hiding secret data in the plain or non-edge regions of the image is significant due to the high similarity and redundancy of the pixels in their neighborhood. However, the similarity measure of the neighboring pixels, i.e., their proximity in color space, is perceptual rather than mathematical. This paper proposes an interval type 2 fuzzy logic system (IT2 FLS) to determine the similarity between the neighboring pixels by involving an instinctive human perception through a rule-based approach. The pixels of the image having high similarity values, calculated using the proposed IT2 FLS similarity measure, are selected for embedding via the least significant bit (LSB) method. We term the proposed procedure of steganography as IT2 FLS LSB method. Moreover, we have developed two more methods, namely, type 1 fuzzy logic system based least significant bits (T1FLS LSB) and Euclidean distance based similarity measures for least significant bit (SM LSB) steganographic methods. Experimental simulations were conducted for a collection of images and quality index metrics, such as PSNR, UQI, and SSIM are used. All the three steganographic methods are applied on datasets and the quality metrics are calculated. The obtained stego images and results are shown and thoroughly compared to determine the efficacy of the IT2 FLS LSB method. Finally, we have done a comparative analysis of the proposed approach with the existing well-known steganographic methods to show the effectiveness of our proposed steganographic method.
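
The embedding step shared by the three methods, least significant bit (LSB) substitution, can be sketched as follows. For simplicity this example writes the secret bits into the first pixels of an 8-bit image; the proposed methods instead select pixels by a similarity measure, which is not reproduced here. The function names are illustrative.

```python
import numpy as np

def embed_lsb(pixels, bits):
    """Return a copy of an 8-bit image with `bits` written into the
    least significant bit of its first len(bits) pixels."""
    stego = np.asarray(pixels, dtype=np.uint8).copy()
    bits = np.asarray(bits, dtype=np.uint8)
    flat = stego.reshape(-1)          # view onto the copied array
    if bits.size > flat.size:
        raise ValueError("message is longer than the cover image")
    # Clear the lowest bit, then set it to the message bit.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return stego

def extract_lsb(stego, n_bits):
    """Read back the first n_bits least significant bits."""
    return np.asarray(stego, dtype=np.uint8).reshape(-1)[:n_bits] & 1

cover = np.array([[10, 11], [12, 13]], dtype=np.uint8)
stego = embed_lsb(cover, [1, 0, 1])
print(stego, extract_lsb(stego, 3))
```

Each modified pixel changes by at most one intensity level, which is why the choice of where to embed matters mainly for detectability rather than for visible distortion.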





Safe Data-Driven Distributed Coordination of Intersection Traffic. (arXiv:2005.03304v1 [math.OC])

This work addresses the problem of traffic management at and near an isolated un-signalized intersection for autonomous and networked vehicles through coordinated optimization of their trajectories. We decompose the trajectory of each vehicle into two phases: the provisional phase and the coordinated phase. A vehicle, upon entering the region of interest, initially operates in the provisional phase, in which the vehicle is allowed to optimize its trajectory but is constrained to guarantee in-lane safety and to not enter the intersection. Periodically, all the vehicles in their provisional phase switch to their coordinated phase, which is obtained by coordinated optimization of the schedule of the vehicles' intersection usage as well as their trajectories. For the coordinated phase, we propose a data-driven solution, in which the intersection usage order is obtained through a data-driven online "classification" and the trajectories are computed sequentially. This approach is computationally very efficient and does not compromise much on optimality. Moreover, it also allows for incorporation of "macro" information such as traffic arrival rates into the solution. We also discuss a distributed implementation of this proposed data-driven sequential algorithm. Finally, we compare the proposed algorithm and its two variants against traditional methods of intersection management and against some existing results in the literature by micro-simulations.





Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

We propose Cotatron, a transcription-guided speech encoder for speaker-independent linguistic representation. Cotatron is based on the multispeaker TTS architecture and can be trained with conventional TTS datasets. We train a voice conversion system to reconstruct speech with Cotatron features, which is similar to the previous methods based on Phonetic Posteriorgram (PPG). By training and evaluating our system with 108 speakers from the VCTK dataset, we outperform the previous method in terms of both naturalness and speaker similarity. Our system can also convert speech from speakers that are unseen during training, and utilize ASR to automate the transcription with minimal reduction of the performance. Audio samples are available at https://mindslab-ai.github.io/cotatron, and the code with a pre-trained model will be made available soon.





YANG2UML: Bijective Transformation and Simplification of YANG to UML. (arXiv:2005.03292v1 [cs.SE])

Software Defined Networking is currently revolutionizing computer networking by decoupling the network control (control plane) from the forwarding functions (data plane), enabling the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services. Next to the well-known OpenFlow protocol, the XML-based NETCONF protocol is also an important means for exchanging configuration information from a management platform and is nowadays even part of OpenFlow. In combination with NETCONF, YANG is the corresponding data modeling language that defines the associated data structures, supporting virtually all network configuration protocols. YANG itself is a semantically rich language which, in order to facilitate familiarization with the relevant subject, is often visualized to involve other experts or developers and to support them in their daily work (writing applications which make use of YANG). In order to support this process, this paper presents a novel approach to optimize and simplify YANG data models to assist further discussions with management and implementations (especially of interfaces) and to reduce complexity. To this end, we have defined a bidirectional mapping of YANG to UML and developed a tool that renders the created UML diagrams. This combines the benefits of using the formal language YANG with automatically maintained UML diagrams to involve other experts or developers, closing the gap between technically improved data models and their human readability.





On the unique solution of the generalized absolute value equation. (arXiv:2005.03287v1 [math.NA])

In this paper, some useful necessary and sufficient conditions for the unique solution of the generalized absolute value equation (GAVE) $Ax-B|x|=b$ with $A, B\in \mathbb{R}^{n\times n}$ from the optimization field are first presented, which cover the fundamental theorem for the unique solution of the linear system $Ax=b$ with $A\in \mathbb{R}^{n\times n}$. In addition, some new sufficient conditions for the unique solution of the GAVE are obtained, which are weaker than those in previously published works.
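
To make the GAVE concrete, the following is a minimal sketch that solves a 2x2 instance by the fixed-point iteration $x \leftarrow A^{-1}(B|x|+b)$. It assumes the classical sufficient condition $\sigma_{\max}(B) < \sigma_{\min}(A)$, under which this map is a contraction and the solution is therefore unique; this condition is used only for illustration and is not claimed to be one of the conditions derived in the paper. The matrices, the helper names, and the iteration count are all hypothetical.

```python
def matvec(M, v):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve_2x2(A, r):
    """Solve the 2x2 linear system A y = r by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (r[0] * A[1][1] - A[0][1] * r[1]) / det,
        (A[0][0] * r[1] - A[1][0] * r[0]) / det,
    ]

def picard_gave(A, B, b, iters=100):
    """Fixed-point iteration x <- A^{-1}(B|x| + b) for A x - B|x| = b.

    Assumption: sigma_max(B) < sigma_min(A), so the map is a contraction.
    """
    x = [0.0, 0.0]
    for _ in range(iters):
        Bx = matvec(B, [abs(t) for t in x])
        x = solve_2x2(A, [Bx[i] + b[i] for i in range(2)])
    return x

# Illustrative data: sigma_min(A) = sqrt(8) ~ 2.83, sigma_max(B) ~ 1.28.
A = [[4.0, 1.0], [0.0, 3.0]]
B = [[1.0, 0.0], [0.5, 1.0]]
x_true = [1.0, -2.0]

# Build b so that x_true solves the equation exactly.
Ax = matvec(A, x_true)
Bx = matvec(B, [abs(t) for t in x_true])
b = [Ax[i] - Bx[i] for i in range(2)]

x = picard_gave(A, B, b)
residual = max(abs(x[i] - x_true[i]) for i in range(2))
```

Setting $B=0$ reduces the iteration to a single linear solve of $Ax=b$, which mirrors how the GAVE conditions cover the linear case.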





Continuous maximal covering location problems with interconnected facilities. (arXiv:2005.03274v1 [math.OC])

In this paper we analyze a continuous version of the maximal covering location problem, in which the facilities are required to be interconnected by means of a graph structure in which two facilities are allowed to be linked if a given distance is not exceeded. We provide a mathematical programming framework for the problem and different resolution strategies. First, we propose a Mixed Integer Non Linear Programming formulation, and derive properties of the problem that allow us to project the continuous variables out avoiding the nonlinear constraints, resulting in an equivalent pure integer programming formulation. Since the number of constraints in the integer programming formulation is large and the constraints are, in general, difficult to handle, we propose two branch-&-cut approaches that avoid the complete enumeration of the constraints resulting in more efficient procedures. We report the results of an extensive battery of computational experiments comparing the performance of the different approaches.





RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. (arXiv:2005.03271v1 [eess.AS])

In recent years, all-neural end-to-end approaches have obtained state-of-the-art results on several challenging automatic speech recognition (ASR) tasks. However, most existing works focus on building ASR models where train and test data are drawn from the same domain. This results in poor generalization characteristics on mismatched domains: e.g., end-to-end models trained on short segments perform poorly when evaluated on longer utterances. In this work, we analyze the generalization properties of streaming and non-streaming recurrent neural network transducer (RNN-T) based end-to-end models in order to identify model components that negatively affect generalization performance. We propose two solutions: combining multiple regularization techniques during training, and using dynamic overlapping inference. On a long-form YouTube test set, when the non-streaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%. Finally, when trained on Librispeech, we find that dynamic overlapping inference improves WER on YouTube from 99.8% to 33.0%.
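
Since the results above are all reported as WER, a short sketch of how that metric is usually computed may help: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is the standard definition rather than the authors' evaluation code; the function names and the example sentences are illustrative, and the reference is assumed to be non-empty.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(before, after):
    """Relative WER reduction, e.g. for comparing 22.3% against 14.8%."""
    return (before - after) / before

# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
gain = relative_reduction(22.3, 14.8)
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why values close to 100% indicate a nearly unusable transcript.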





Online Proximal-ADMM For Time-varying Constrained Convex Optimization. (arXiv:2005.03267v1 [eess.SY])

This paper considers a convex optimization problem with cost and constraints that evolve over time. The function to be minimized is strongly convex and possibly non-differentiable, and variables are coupled through linear constraints. In this setting, the paper proposes an online algorithm based on the alternating direction method of multipliers (ADMM), to track the optimal solution trajectory of the time-varying problem; in particular, the proposed algorithm consists of a primal proximal gradient descent step and an appropriately perturbed dual ascent step. The paper derives tracking results, asymptotic bounds, and linear convergence results. The proposed algorithm is then specialized to a multi-area power grid optimization problem, and our numerical results verify the desired properties.





Critique of Boyu Sima's Proof that ${\rm P}\neq{\rm NP}$. (arXiv:2005.03256v1 [cs.CC])

We review and critique Boyu Sima's paper, "A solution of the P versus NP problem based on specific property of clique function," (arXiv:1911.00722) which claims to prove that ${\rm P}\neq{\rm NP}$ by way of removing the gap between the nonmonotone circuit complexity and the monotone circuit complexity of the clique function. We first describe Sima's argument, and then we describe where and why it fails. Finally, we present a simple example that clearly demonstrates the failure.





Structured inversion of the Bernstein-Vandermonde Matrix. (arXiv:2005.03251v1 [math.NA])

Bernstein polynomials, long a staple of approximation theory and computational geometry, have also increasingly become of interest in finite element methods. Many fundamental problems in interpolation and approximation give rise to interesting linear algebra questions. When attempting to find a polynomial approximation of boundary or initial data, one encounters the Bernstein-Vandermonde matrix, which is found to be highly ill-conditioned. Previously, we used the relationship between monomial Bezout matrices and the inverse of Hankel matrices to obtain a decomposition of the inverse of the Bernstein mass matrix in terms of Hankel, Toeplitz, and diagonal matrices. In this paper, we use properties of the Bernstein-Bezout matrix to factor the inverse of the Bernstein-Vandermonde matrix into a difference of products of Hankel, Toeplitz, and diagonal matrices. We also use a nonstandard matrix norm to study the conditioning of the Bernstein-Vandermonde matrix, showing that the conditioning in this case is better than in the standard 2-norm. Additionally, we use properties of multivariate Bernstein polynomials to derive a block $LU$ decomposition of the Bernstein-Vandermonde matrix corresponding to equispaced nodes on the $d$-simplex.
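
For readers unfamiliar with the object, here is a minimal sketch of the univariate Bernstein-Vandermonde matrix, whose entry $(i,j)$ is the $j$-th Bernstein basis polynomial of degree $n$ evaluated at node $x_i$, that is $\binom{n}{j} x_i^j (1-x_i)^{n-j}$. The equispaced nodes on $[0,1]$ and the function name are illustrative assumptions; the structured inverse and the block $LU$ decomposition discussed above are not reproduced here.

```python
from math import comb

def bernstein_vandermonde(nodes, degree):
    """Return V with V[i][j] = C(degree, j) * x_i**j * (1 - x_i)**(degree - j)."""
    return [
        [comb(degree, j) * x**j * (1 - x) ** (degree - j)
         for j in range(degree + 1)]
        for x in nodes
    ]

n = 4
nodes = [i / n for i in range(n + 1)]  # equispaced nodes on [0, 1] (assumption)
V = bernstein_vandermonde(nodes, n)

# Bernstein polynomials form a partition of unity, so every row sums to 1.
row_sums = [sum(row) for row in V]
```

The row-sum property is a quick sanity check on any implementation, and it also shows that the vector of all ones is mapped to itself by the matrix.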





DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC])

Selecting an appropriate model to forecast product demand is critical to the manufacturing industry. However, due to the data complexity, market uncertainty and users' demanding requirements for the model, it is challenging for demand analysts to select a proper model. Although existing model selection methods can reduce the manual burden to some extent, they often fail to present model performance details on individual products and reveal the potential risk of the selected model. This paper presents DFSeer, an interactive visualization system to conduct reliable model selection for demand forecasting based on the products with similar historical demand. It supports model comparison and selection with different levels of details. Besides, it shows the difference in model performance on similar products to reveal the risk of model selection and increase users' confidence in choosing a forecasting model. Two case studies and interviews with domain experts demonstrate the effectiveness and usability of DFSeer.





Enhancing Software Development Process Using Automated Adaptation of Object Ensembles. (arXiv:2005.03241v1 [cs.SE])

Software development has been changing rapidly, and the development process can be improved through developer-friendly approaches. We can save time and accelerate development if we can automatically guide programmers while they write software. Some existing approaches recommend relevant code snippets and API items to the developer: some apply general code-search techniques, and others use online repository-mining strategies. However, it becomes quite difficult to help programmers with particular type-conversion problems, specifically when they want to adapt existing interfaces to their expectations. One familiar way to guide developers in such situations is to adapt collections and arrays through automated adaptation of object ensembles. But how this helps a novice developer in real-time software development has not been explicitly specified. In this paper, we develop a system that works as a plugin tool integrated with a particular Data Mining Integrated Environment (DMIE) to recommend relevant interfaces when developers face a type-conversion situation. We have a mined repository of adapter classes and related APIs, which developers can search with their queries to obtain results using the relevant transformer classes. The system that makes these recommendations is titled automated objective ensembles (AOE plugin). Our investigation so far indicates that our approach performs much better than some of the existing approaches.





Phase retrieval of complex-valued objects via a randomized Kaczmarz method. (arXiv:2005.03238v1 [cs.IT])

This paper investigates the convergence of the randomized Kaczmarz algorithm for the problem of phase retrieval of complex-valued objects. While this algorithm has been studied for the real-valued case, its generalization to the complex-valued case is nontrivial and has been left as a conjecture. This paper establishes the connection between the convergence of the algorithm and the convexity of an objective function. Based on the connection, it demonstrates that when the sensing vectors are sampled uniformly from a unit sphere and the number of sensing vectors $m$ satisfies $m>O(n\log n)$ as $n, m\rightarrow\infty$, then this algorithm with a good initialization achieves linear convergence to the solution with high probability.
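
The update analyzed above can be sketched as follows, assuming the common form of the randomized Kaczmarz step for phase retrieval: pick a measurement $b_r = |a_r^* x|$ at random, borrow the phase of $a_r^* x_k$ from the current iterate, and project onto the resulting hyperplane. The problem size, the perturbation used as a "good initialization", the seed, and all function names are illustrative assumptions rather than the paper's experimental setup.

```python
import random

def inner(a, x):
    """Complex inner product a^* x."""
    return sum(ai.conjugate() * xi for ai, xi in zip(a, x))

def random_unit_vector(n, rng):
    """Normalized complex Gaussian vector, uniform on the unit sphere."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = abs(inner(v, v)) ** 0.5
    return [vi / norm for vi in v]

def kaczmarz_phase_retrieval(vectors, b, x0, iters, rng):
    """Randomized Kaczmarz for b_r = |a_r^* x| with unit-norm a_r."""
    x = list(x0)
    for _ in range(iters):
        r = rng.randrange(len(vectors))
        a = vectors[r]
        z = inner(a, x)
        if abs(z) == 0:
            continue  # phase undefined; skip this sample
        # Project onto {y : a^* y = b_r * z / |z|}.
        step = z - b[r] * z / abs(z)
        x = [xi - step * ai for xi, ai in zip(x, a)]
    return x

def distance_up_to_phase(x, y):
    """min over theta of ||x - exp(i*theta) y||."""
    c = inner(y, x)
    phase = c / abs(c) if abs(c) > 0 else 1.0
    return sum(abs(xi - phase * yi) ** 2 for xi, yi in zip(x, y)) ** 0.5

rng = random.Random(0)
n, m = 4, 40
x_true = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
vectors = [random_unit_vector(n, rng) for _ in range(m)]
b = [abs(inner(a, x_true)) for a in vectors]

# "Good initialization": the true signal plus a small perturbation.
x0 = [xi + 0.05 * complex(rng.gauss(0, 1), rng.gauss(0, 1)) for xi in x_true]
err0 = distance_up_to_phase(x0, x_true)
x_hat = kaczmarz_phase_retrieval(vectors, b, x0, 3000, rng)
err = distance_up_to_phase(x_hat, x_true)
```

The error is measured up to a global phase because the magnitudes $|a_r^* x|$ cannot distinguish $x$ from $e^{i\theta}x$.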





Mortar-based entropy-stable discontinuous Galerkin methods on non-conforming quadrilateral and hexahedral meshes. (arXiv:2005.03237v1 [math.NA])

High-order entropy-stable discontinuous Galerkin (DG) methods for nonlinear conservation laws reproduce a discrete entropy inequality by combining entropy conservative finite volume fluxes with summation-by-parts (SBP) discretization matrices. In the DG context, on tensor product (quadrilateral and hexahedral) elements, SBP matrices are typically constructed by collocating at Lobatto quadrature points. Recent work has extended the construction of entropy-stable DG schemes to collocation at more accurate Gauss quadrature points.

In this work, we extend entropy-stable Gauss collocation schemes to non-conforming meshes. Entropy-stable DG schemes require computing entropy conservative numerical fluxes between volume and surface quadrature nodes. On conforming tensor product meshes where volume and surface nodes are aligned, flux evaluations are required only between "lines" of nodes. However, on non-conforming meshes, volume and surface nodes are no longer aligned, resulting in a larger number of flux evaluations. We reduce this expense by introducing an entropy-stable mortar-based treatment of non-conforming interfaces via a face-local correction term, and provide necessary conditions for high-order accuracy. Numerical experiments in both two and three dimensions confirm the stability and accuracy of this approach.





End-to-End Domain Adaptive Attention Network for Cross-Domain Person Re-Identification. (arXiv:2005.03222v1 [cs.CV])

Person re-identification (re-ID) remains challenging in a real-world scenario, as it requires a trained network to generalise to totally unseen target data in the presence of variations across domains. Recently, generative adversarial models have been widely adopted to enhance the diversity of training data. These approaches, however, often fail to generalise to other domains, as existing generative person re-identification models have a disconnect between the generative component and the discriminative feature learning stage. To address the on-going challenges regarding model generalisation, we propose an end-to-end domain adaptive attention network to jointly translate images between domains and learn discriminative re-id features in a single framework. To address the domain gap challenge, we introduce an attention module for image translation from source to target domains without affecting the identity of a person. More specifically, attention is directed to the background instead of the entire image of the person, ensuring identifying characteristics of the subject are preserved. The proposed joint learning network results in a significant performance improvement over state-of-the-art methods on several benchmark datasets.





Multi-dimensional Avikainen's estimates. (arXiv:2005.03219v1 [math.PR])

Avikainen proved the estimate $\mathbb{E}[|f(X)-f(\widehat{X})|^{q}] \leq C(p,q) \mathbb{E}[|X-\widehat{X}|^{p}]^{\frac{1}{p+1}}$ for $p,q \in [1,\infty)$, one-dimensional random variables $X$ with a bounded density function and $\widehat{X}$, and a function $f$ of bounded variation in $\mathbb{R}$. In this article, we provide multi-dimensional analogues of this estimate for functions of bounded variation in $\mathbb{R}^{d}$, Orlicz-Sobolev spaces, Sobolev spaces with variable exponents and fractional Sobolev spaces. The main idea of our arguments is to use Hardy-Littlewood maximal estimates and pointwise characterizations of these function spaces. We apply our main statements to numerical analysis on irregular functionals of a solution to stochastic differential equations based on the Euler-Maruyama scheme and the multilevel Monte Carlo method, and to estimates of the $L^{2}$-time regularity of decoupled forward-backward stochastic differential equations with irregular terminal conditions.





Conley's fundamental theorem for a class of hybrid systems. (arXiv:2005.03217v1 [math.DS])

We establish versions of Conley's (i) fundamental theorem and (ii) decomposition theorem for a broad class of hybrid dynamical systems. The hybrid version of (i) asserts that a globally-defined "hybrid complete Lyapunov function" exists for every hybrid system in this class. Motivated by mechanics and control settings where physical or engineered events cause abrupt changes in a system's governing dynamics, our results apply to a large class of Lagrangian hybrid systems (with impacts) studied extensively in the robotics literature. Viewed formally, these results generalize those of Conley and Franks for continuous-time and discrete-time dynamical systems, respectively, on metric spaces. However, we furnish specific examples illustrating how our statement of sufficient conditions represents merely an early step in the longer project of establishing what formal assumptions can and cannot endow hybrid systems models with the topologically well characterized partitions of limit behavior that make Conley's theory so valuable in those classical settings.





OTFS-NOMA based on SCMA. (arXiv:2005.03216v1 [cs.IT])

Orthogonal Time Frequency Space (OTFS) is a $\text{2-D}$ modulation technique that has the potential to overcome the challenges faced by orthogonal frequency division multiplexing (OFDM) in high Doppler environments. The performance of OTFS in a multi-user scenario with orthogonal multiple access (OMA) techniques has been impressive. Due to the requirement of massive connectivity in 5G and beyond, it is essential to devise and examine the OTFS system with the existing Non-orthogonal Multiple Access (NOMA) techniques.

In this paper, we propose a multi-user OTFS system based on a code-domain NOMA technique called Sparse Code Multiple Access (SCMA). This system is referred to as the OTFS-SCMA model. The framework for OTFS-SCMA is designed for both downlink and uplink. First, the sparse SCMA codewords are strategically placed on the delay-Doppler plane such that the overall overloading factor of the OTFS-SCMA system is equal to that of the underlying basic SCMA system. The receiver in downlink performs the detection in two sequential phases: first, the conventional OTFS detection using the method of linear minimum mean square error (LMMSE), and then the conventional SCMA detection. For uplink, we propose a single-phase detector based on message-passing algorithm (MPA) to detect the multiple users' symbols. The performance of the proposed OTFS-SCMA system is validated through extensive simulations both in downlink and uplink. We consider delay-Doppler planes of different parameters and various SCMA systems of overloading factor up to 200$\%$. The performance of OTFS-SCMA is compared with those of existing OTFS-OMA techniques. The comprehensive investigation demonstrates the usefulness of OTFS-SCMA in future wireless communication standards.





What comprises a good talking-head video generation?: A Survey and Benchmark. (arXiv:2005.03201v1 [cs.CV])

Over the years, performance evaluation has become essential in computer vision, enabling tangible progress in many sub-fields. While talking-head video generation has become an emerging research topic, existing evaluations on this topic present many limitations. For example, most approaches use human subjects (e.g., via Amazon MTurk) to evaluate their research claims directly. This subjective evaluation is cumbersome, unreproducible, and may impede the evolution of new research. In this work, we present a carefully-designed benchmark for evaluating talking-head video generation with standardized dataset pre-processing strategies. As for evaluation, we either propose new metrics or select the most appropriate ones to evaluate results in what we consider as desired properties for a good talking-head video, namely, identity preserving, lip synchronization, high video quality, and natural-spontaneous motion. By conducting a thoughtful analysis across several state-of-the-art talking-head generation approaches, we aim to uncover the merits and drawbacks of current methods and point out promising directions for future work. All the evaluation code is available at: https://github.com/lelechen63/talking-head-generation-survey.





ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. (arXiv:2005.03191v1 [eess.AS])

Convolutional neural networks (CNN) have shown promising results for end-to-end speech recognition, albeit still behind other state-of-the-art methods in performance. In this paper, we study how to bridge this gap and go beyond with a novel CNN-RNN-transducer architecture, which we call ContextNet. ContextNet features a fully convolutional encoder that incorporates global context information into convolution layers by adding squeeze-and-excitation modules. In addition, we propose a simple scaling method that scales the widths of ContextNet and achieves a good trade-off between computation and accuracy. We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without an external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets. This compares to the previous best published system of 2.0%/4.6% with LM and 3.9%/11.3% with 20M parameters. The superiority of the proposed ContextNet model is also verified on a much larger internal dataset.
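
The way a squeeze-and-excitation module injects global context can be sketched in a few lines: average each channel over the whole sequence ("squeeze"), pass the resulting vector through a small bottleneck network ending in a sigmoid ("excitation"), and rescale every frame of each channel by its gate. The sketch below follows the generic squeeze-and-excitation recipe with a ReLU bottleneck; the activation, layer sizes, and weight layout are assumptions and are not taken from the ContextNet paper.

```python
import math

def squeeze_excite(features, w1, b1, w2, b2):
    """Apply a squeeze-and-excitation gate to a channels x time feature map.

    features: list of C channels, each a list of T values.
    w1, b1:   bottleneck layer (H rows of C weights, H biases).
    w2, b2:   expansion layer (C rows of H weights, C biases).
    """
    # Squeeze: global average over time gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, part 1: bottleneck with ReLU (assumed activation).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + bias)
              for row, bias in zip(w1, b1)]
    # Excitation, part 2: per-channel gate in (0, 1) via a sigmoid.
    gates = [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + bias)))
             for row, bias in zip(w2, b2)]
    # Scale: every time step of a channel is multiplied by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, features)]

# Two channels, three frames, one hidden unit; zero weights give gates of 0.5.
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out = squeeze_excite(features, w1=[[0.0, 0.0]], b1=[0.0],
                     w2=[[0.0], [0.0]], b2=[0.0, 0.0])
```

Because the gate depends on an average over all frames, each output frame is influenced by the entire utterance, which a convolution with a finite kernel cannot achieve on its own.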





An Optimal Control Theory for the Traveling Salesman Problem and Its Variants. (arXiv:2005.03186v1 [math.OC])

We show that the traveling salesman problem (TSP) and its many variants may be modeled as functional optimization problems over a graph. In this formulation, all vertices and arcs of the graph are functionals; i.e., a mapping from a space of measurable functions to the field of real numbers. Many variants of the TSP, such as those with neighborhoods, with forbidden neighborhoods, with time-windows and with profits, can all be framed under this construct. In sharp contrast to their discrete-optimization counterparts, the modeling constructs presented in this paper represent a fundamentally new domain of analysis and computation for TSPs and their variants. Beyond its apparent mathematical unification of a class of problems in graph theory, the main advantage of the new approach is that it facilitates the modeling of certain application-specific problems in their home space of measurable functions. Consequently, certain elements of economic system theory such as dynamical models and continuous-time cost/profit functionals can be directly incorporated in the new optimization problem formulation. Furthermore, subtour elimination constraints, prevalent in discrete optimization formulations, are naturally enforced through continuity requirements. The price for the new modeling framework is nonsmooth functionals. Although a number of theoretical issues remain open in the proposed mathematical framework, we demonstrate the computational viability of the new modeling constructs over a sample set of problems to illustrate the rapid production of end-to-end TSP solutions to extensively-constrained practical problems.





On Optimal Control of Discounted Cost Infinite-Horizon Markov Decision Processes Under Local State Information Structures. (arXiv:2005.03169v1 [eess.SY])

This paper investigates a class of optimal control problems associated with Markov processes with local state information. The decision-maker has only local access to a subset of the state vector information, as often encountered in decentralized control problems in multi-agent systems. Under this information structure, part of the state vector cannot be observed. We leverage ab initio principles and find a new form of Bellman equations to characterize the optimal policies of the control problem under local information structures. The dynamic programming solutions feature a mixture of dynamics associated with unobservable state components and the local state-feedback policy based on the observable local information. We further characterize the optimal local-state feedback policy using linear programming methods. To reduce the computational complexity of the optimal policy, we propose an approximate algorithm based on virtual beliefs to find a sub-optimal policy. We show the performance bounds on the sub-optimal solution and corroborate the results with numerical case studies.





Avoiding 5/4-powers on the alphabet of nonnegative integers. (arXiv:2005.03158v1 [math.CO])

We identify the structure of the lexicographically least word avoiding 5/4-powers on the alphabet of nonnegative integers. Specifically, we show that this word has the form $p \tau(\varphi(z) \varphi^2(z) \cdots)$ where $p, z$ are finite words, $\varphi$ is a 6-uniform morphism, and $\tau$ is a coding. This description yields a recurrence for the $i$th letter, which we use to prove that the sequence of letters is 6-regular with rank 188. More generally, we prove $k$-regularity for a sequence satisfying a recurrence of the same type.
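
A short sketch can make the object of study tangible. It assumes the usual definition in this line of work, namely that a 5/4-power is a word $xyx$ with $|x| = \ell$ and $|y| = 3\ell$ for some $\ell \geq 1$ (so its length is 5/4 of the period $|xy|$), and it builds a prefix of the word greedily by always appending the smallest letter that does not complete such a factor. Greedy extension never gets stuck on an infinite alphabet because a letter not used before cannot complete a repetition. The function names are illustrative, and the code does not reproduce the morphic description proved in the paper.

```python
def ends_with_5_4_power(word):
    """True if some suffix of `word` has the form x y x with |y| = 3|x|."""
    n = len(word)
    length = 1  # candidate |x|
    while 5 * length <= n:
        first_x = word[n - 5 * length : n - 4 * length]
        last_x = word[n - length :]
        if first_x == last_x:
            return True
        length += 1
    return False

def least_word_prefix(size):
    """Greedy prefix: append the smallest letter creating no 5/4-power."""
    word = []
    for _ in range(size):
        letter = 0
        while True:
            word.append(letter)
            if not ends_with_5_4_power(word):
                break
            word.pop()
            letter += 1
    return word

prefix = least_word_prefix(10)
# prefix == [0, 0, 0, 0, 1, 1, 1, 1, 0, 2]
```

Only suffixes need to be checked at each step, since every shorter prefix was already verified to be free of 5/4-powers when it was built.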





Fast Mapping onto Census Blocks. (arXiv:2005.03156v1 [cs.DC])

Pandemic measures such as social distancing and contact tracing can be enhanced by rapidly integrating dynamic location data and demographic data. Projecting billions of longitude and latitude locations onto hundreds of thousands of highly irregular demographic census block polygons is computationally challenging in both research and deployment contexts. This paper describes two approaches labeled "simple" and "fast". The simple approach can be implemented in any scripting language (Matlab/Octave, Python, Julia, R) and is easily integrated and customized to a variety of research goals. This simple approach uses a novel combination of hierarchy, sparse bounding boxes, polygon crossing-number, vectorization, and parallel processing to achieve 100,000,000+ projections per second on 100 servers. The simple approach is compact, does not increase data storage requirements, and is applicable to any country or region. The fast approach exploits the thread, vector, and memory optimizations that are possible using a low-level language (C++) and achieves similar performance on a single server. This paper details these approaches with the goal of enabling the broader community to quickly integrate location and demographic data.
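
Two of the ingredients named above, bounding boxes and the polygon crossing-number test, can be sketched in a few lines. A point is first checked against a polygon's bounding box, and only on a hit is a horizontal ray cast from the point to count edge crossings (an odd count means inside). The block identifier, the polygon, and the function names are hypothetical; the hierarchy, vectorization, and parallel processing described in the abstract are omitted from this sketch.

```python
def crossing_number_contains(point, polygon):
    """Crossing-number test: is `point` inside `polygon` (list of (x, y))?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # ray to the right crosses this edge
    return inside

def bounding_box(polygon):
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def locate(point, blocks):
    """Return the id of the first block containing `point`, else None."""
    x, y = point
    for block_id, polygon in blocks.items():
        xmin, ymin, xmax, ymax = bounding_box(polygon)
        # Cheap rejection before the more expensive polygon test.
        if xmin <= x <= xmax and ymin <= y <= ymax:
            if crossing_number_contains(point, polygon):
                return block_id
    return None

# A hypothetical L-shaped block: inside its box is not the same as inside it.
blocks = {"block-A": [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]}
hit = locate((1, 3), blocks)
miss_in_box = locate((3, 3), blocks)
miss_outside = locate((5, 5), blocks)
```

In practice the bounding boxes would be precomputed once per polygon instead of on every query; they are recomputed here only to keep the sketch short.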





NTIRE 2020 Challenge on Image Demoireing: Methods and Results. (arXiv:2005.03155v1 [cs.CV])

This paper reviews the Challenge on Image Demoireing that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2020. Demoireing is a difficult task of removing moire patterns from an image to reveal an underlying clean image. The challenge was divided into two tracks. Track 1 targeted the single image demoireing problem, which seeks to remove moire patterns from a single image. Track 2 focused on the burst demoireing problem, where a set of degraded moire images of the same scene were provided as input, with the goal of producing a single demoired image as output. The methods were ranked in terms of their fidelity, measured using the peak signal-to-noise ratio (PSNR) between the ground truth clean images and the restored images produced by the participants' methods. The tracks had 142 and 99 registered participants, respectively, with a total of 14 and 6 submissions in the final testing stage. The entries span the current state-of-the-art in image and burst image demoireing problems.
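
The ranking metric mentioned above follows the standard definition $\mathrm{PSNR} = 10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$, where MSE is the mean squared error between the ground truth and the restored image. The sketch below assumes 8-bit grayscale images stored as nested lists with a peak value of 255; the function name and the tiny example images are illustrative and do not come from the challenge's evaluation scripts.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [
        (r - s) ** 2
        for row_r, row_s in zip(reference, restored)
        for r, s in zip(row_r, row_s)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

clean = [[10, 20], [30, 40]]
restored = [[11, 19], [31, 39]]  # every pixel is off by one, so MSE = 1
score = psnr(clean, restored)    # equals 20 * log10(255), about 48.13 dB
```

Higher values indicate a restoration closer to the ground truth.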





Decentralized Adaptive Control for Collaborative Manipulation of Rigid Bodies. (arXiv:2005.03153v1 [cs.RO])

In this work, we consider a group of robots working together to manipulate a rigid object to track a desired trajectory in $SE(3)$. The robots have no explicit communication network among them, and they do not know the mass or friction properties of the object, or where they are attached to the object. However, we assume they share data from a common IMU placed arbitrarily on the object. To solve this problem, we propose a decentralized adaptive control scheme wherein each agent maintains and adapts its own estimate of the object parameters in order to track a reference trajectory. We present an analysis of the controller's behavior, and show that all closed-loop signals remain bounded, and that the system trajectory will almost always (except for initial conditions on a set of measure zero) converge to the desired trajectory. We study the proposed controller's performance using numerical simulations of a manipulation task in 3D, and with hardware experiments which demonstrate our algorithm on a planar manipulation task. These studies, taken together, demonstrate the effectiveness of the proposed controller even in the presence of numerous unmodelled effects, such as discretization errors and complex frictional interactions.





An augmented Lagrangian preconditioner for implicitly-constituted non-Newtonian incompressible flow. (arXiv:2005.03150v1 [math.NA])

We propose an augmented Lagrangian preconditioner for a three-field stress-velocity-pressure discretization of stationary non-Newtonian incompressible flow with an implicit constitutive relation of power-law type. The discretization employed makes use of the divergence-free Scott-Vogelius pair for the velocity and pressure. The preconditioner builds on the work [P. E. Farrell, L. Mitchell, and F. Wechsung, SIAM J. Sci. Comput., 41 (2019), pp. A3073-A3096], where a Reynolds-robust preconditioner for the three-dimensional Newtonian system was introduced. The preconditioner employs a specialized multigrid method for the stress-velocity block that involves a divergence-capturing space decomposition and a custom prolongation operator. The solver exhibits excellent robustness with respect to the parameters arising in the constitutive relation, allowing for the simulation of a wide range of materials.





Optimally Convergent Mixed Finite Element Methods for the Stochastic Stokes Equations. (arXiv:2005.03148v1 [math.NA])

We propose some new mixed finite element methods for the time-dependent stochastic Stokes equations with multiplicative noise, which use the Helmholtz decomposition of the driving multiplicative noise. It is known [16] that the pressure solution has a low regularity, which manifests in sub-optimal convergence rates for well-known inf-sup stable mixed finite element methods in numerical simulations, see [10]. We show that eliminating this gradient part from the noise in the numerical scheme leads to optimally convergent mixed finite element methods, and that this conceptual idea may be used to retool numerical methods that are well-known in the deterministic setting, including pressure stabilization methods, so that their optimal convergence properties can still be maintained in the stochastic setting. Computational experiments are also provided to validate the theoretical results and to illustrate the conceptual usefulness of the proposed numerical approach.





A Separation Theorem for Joint Sensor and Actuator Scheduling with Guaranteed Performance Bounds. (arXiv:2005.03143v1 [eess.SY])

We study the problem of jointly designing a sparse sensor and actuator schedule for linear dynamical systems while guaranteeing a control/estimation performance that approximates the fully sensed/actuated setting. We further prove a separation principle, showing that the problem can be decomposed into finding sensor and actuator schedules separately. However, it is shown that this problem cannot be efficiently solved or approximated in polynomial, or even quasi-polynomial time for time-invariant sensor/actuator schedules; instead, we develop deterministic polynomial-time algorithms for a time-varying sensor/actuator schedule with guaranteed approximation bounds. Our main result is to provide a polynomial-time joint actuator and sensor schedule that on average selects only a constant number of sensors and actuators at each time step, irrespective of the dimension of the system. The key idea is to sparsify the controllability and observability Gramians while providing approximation guarantees for Hankel singular values. This idea is inspired by recent results in theoretical computer science literature on sparsification.





Rigid Matrices From Rectangular PCPs. (arXiv:2005.03123v1 [cs.CC])

We introduce a variant of PCPs, that we refer to as rectangular PCPs, wherein proofs are thought of as square matrices, and the random coins used by the verifier can be partitioned into two disjoint sets, one determining the row of each query and the other determining the *column*.

We construct PCPs that are efficient, short, smooth and (almost-)rectangular. As a key application, we show that proofs for hard languages in NTIME$(2^n)$, when viewed as matrices, are rigid infinitely often. This strengthens and considerably simplifies a recent result of Alman and Chen [FOCS, 2019] constructing explicit rigid matrices in FNP. Namely, we prove the following theorem: - There is a constant $\delta \in (0,1)$ such that there is an FNP-machine that, for infinitely many $N$, on input $1^N$ outputs $N \times N$ matrices with entries in $\mathbb{F}_2$ that are $\delta N^2$-far (in Hamming distance) from matrices of rank at most $2^{\log N/\Omega(\log \log N)}$.

Our construction of rectangular PCPs starts with an analysis of how randomness yields queries in the Reed--Muller-based outer PCP of Ben-Sasson, Goldreich, Harsha, Sudan and Vadhan [SICOMP, 2006; CCC, 2005]. We then show how to preserve rectangularity under PCP composition and a smoothness-inducing transformation. This warrants refined and stronger notions of rectangularity, which we prove for the outer PCP and its transforms.