Latest en news

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

By arxiv.org
Published On ::

Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese--English and News Commentary Japanese--Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks.

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL])

Quantum correlation alignment for unsupervised domain adaptation. (arXiv:2005.03355v1 [quant-ph])

DMCP: Differentiable Markov Channel Pruning for Neural Networks. (arXiv:2005.03354v1 [cs.CV])

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

Scene Text Image Super-Resolution in the Wild. (arXiv:2005.03341v1 [cs.CV])

Bitvector-aware Query Optimization for Decision Support Queries (extended version). (arXiv:2005.03328v1 [cs.DB])

Global Distribution of Google Scholar Citations: A Size-independent Institution-based Analysis. (arXiv:2005.03324v1 [cs.DL])

Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR])

Specification and Automated Analysis of Inter-Parameter Dependencies in Web APIs. (arXiv:2005.03320v1 [cs.SE])

Encoding in the Dark Grand Challenge: An Overview. (arXiv:2005.03315v1 [eess.IV])

Safe Data-Driven Distributed Coordination of Intersection Traffic. (arXiv:2005.03304v1 [math.OC])

Knowledge Enhanced Neural Fashion Trend Forecasting. (arXiv:2005.03297v1 [cs.IR])

Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

Deep Learning based Person Re-identification. (arXiv:2005.03293v1 [cs.CV])

On the unique solution of the generalized absolute value equation. (arXiv:2005.03287v1 [math.NA])

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. (arXiv:2005.03271v1 [eess.AS])

Enhancing Software Development Process Using Automated Adaptation of Object Ensembles. (arXiv:2005.03241v1 [cs.SE])

Mortar-based entropy-stable discontinuous Galerkin methods on non-conforming quadrilateral and hexahedral meshes. (arXiv:2005.03237v1 [math.NA])

Safe Reinforcement Learning through Meta-learned Instincts. (arXiv:2005.03233v1 [cs.LG])

Constructing Accurate and Efficient Deep Spiking Neural Networks with Double-threshold and Augmented Schemes. (arXiv:2005.03231v1 [cs.NE])

Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent Multi-View Representation Learning. (arXiv:2005.03227v1 [eess.IV])

Deeply Supervised Active Learning for Finger Bones Segmentation. (arXiv:2005.03225v1 [cs.CV])

End-to-End Domain Adaptive Attention Network for Cross-Domain Person Re-Identification. (arXiv:2005.03222v1 [cs.CV])

Multi-dimensional Avikainen's estimates. (arXiv:2005.03219v1 [math.PR])

Conley's fundamental theorem for a class of hybrid systems. (arXiv:2005.03217v1 [math.DS])

Shared Autonomy with Learned Latent Actions. (arXiv:2005.03210v1 [cs.RO])

Hierarchical Attention Network for Action Segmentation. (arXiv:2005.03209v1 [cs.CV])

What comprises a good talking-head video generation?: A Survey and Benchmark. (arXiv:2005.03201v1 [cs.CV])

Enabling Cross-chain Transactions: A Decentralized Cryptocurrency Exchange Protocol. (arXiv:2005.03199v1 [cs.CR])

A Proposal for Intelligent Agents with Episodic Memory. (arXiv:2005.03182v1 [cs.AI])

Lattice-based public key encryption with equality test in standard model, revisited. (arXiv:2005.03178v1 [cs.CR])

A Parameterized Perspective on Attacking and Defending Elections. (arXiv:2005.03176v1 [cs.GT])

Fact-based Dialogue Generation with Convergent and Divergent Decoding. (arXiv:2005.03174v1 [cs.CL])

Nonlinear model reduction: a comparison between POD-Galerkin and POD-DEIM methods. (arXiv:2005.03173v1 [physics.comp-ph])

Fast Mapping onto Census Blocks. (arXiv:2005.03156v1 [cs.DC])

NTIRE 2020 Challenge on Image Demoireing: Methods and Results. (arXiv:2005.03155v1 [cs.CV])

Decentralized Adaptive Control for Collaborative Manipulation of Rigid Bodies. (arXiv:2005.03153v1 [cs.RO])

An augmented Lagrangian preconditioner for implicitly-constituted non-Newtonian incompressible flow. (arXiv:2005.03150v1 [math.NA])

Optimally Convergent Mixed Finite Element Methods for the Stochastic Stokes Equations. (arXiv:2005.03148v1 [math.NA])

A Separation Theorem for Joint Sensor and Actuator Scheduling with Guaranteed Performance Bounds. (arXiv:2005.03143v1 [eess.SY])

A Gentle Introduction to Quantum Computing Algorithms with Applications to Universal Prediction. (arXiv:2005.03137v1 [quant-ph])

Catch Me If You Can: Using Power Analysis to Identify HPC Activity. (arXiv:2005.03135v1 [cs.CR])

Electricity-Aware Heat Unit Commitment: A Bid-Validity Approach. (arXiv:2005.03120v1 [eess.SY])

Strong replica symmetry in high-dimensional optimal Bayesian inference. (arXiv:2005.03115v1 [math.PR])

Constrained de Bruijn Codes: Properties, Enumeration, Constructions, and Applications. (arXiv:2005.03102v1 [cs.IT])

Inference with Choice Functions Made Practical. (arXiv:2005.03098v1 [cs.AI])

Near-optimal Detector for SWIPT-enabled Differential DF Relay Networks with SER Analysis. (arXiv:2005.03096v1 [cs.IT])

Heterogeneous Facility Location Games. (arXiv:2005.03095v1 [cs.GT])

Eliminating NB-IoT Interference to LTE System: a Sparse Machine Learning Based Approach. (arXiv:2005.03092v1 [cs.IT])

The finish line: Attachment of Signs

The Finish Line: Drainage Efficiency

The Finish Line: Eco-Friendliness of EIFS

The Finish Line: Adhesives vs. Mechanical Fasteners

The Finish Line: A (Faux) Monument for the Ages

Green Globes vs. LEED

Cloaked in Green?

Building Product Transparency— Be Careful What You Ask For

An Energy Label for Buildings

A Green Screw?

Benefits of the Variable Refrigerant Flow

Green Advocacy vs. Informed Consent

Green Building Mistakes

The Greenest Low Slope Roofing Solution

ANSI Green Globes 2015

Subscribe To Our Newsletter