Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture. (arXiv:2005.03454v1 [cs.LG])

By arxiv.org

Sparse models require less memory for storage and enable faster inference by reducing the number of FLOPs required. This is relevant both for time-critical and for on-device computations using neural networks. The stabilized lottery ticket hypothesis states that networks can be pruned after no or only a few training iterations, using a mask computed from the unpruned converged model. On the transformer architecture and the WMT 2014 English-to-German and English-to-French tasks, we show that stabilized lottery ticket pruning performs similarly to magnitude pruning for sparsity levels of up to 85%, and propose a new combination of pruning techniques that outperforms all other techniques for even higher levels of sparsity. Furthermore, we confirm that a parameter's initial sign, not its specific value, is the primary factor for successful training, and show that magnitude pruning cannot be used to find winning lottery tickets.
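
The masking step the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single flat list of weights, a one-shot magnitude mask, and hypothetical helper names (magnitude_mask, stabilized_ticket, constant_sign_ticket). The constant magnitude used in the sign-only variant is likewise an assumption made for illustration.

```python
def magnitude_mask(converged, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude converged weights."""
    n_prune = int(len(converged) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(converged)), key=lambda i: abs(converged[i]))
    mask = [1] * len(converged)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def stabilized_ticket(early, mask):
    """Apply the mask to weights saved after no or only a few iterations."""
    return [w * m for w, m in zip(early, mask)]


def constant_sign_ticket(early, mask, magnitude):
    """Keep only the sign of each surviving early weight (assumed variant)."""
    return [m * magnitude * (1 if w >= 0 else -1) for w, m in zip(early, mask)]


# Toy values, invented for illustration only.
converged = [0.9, -0.05, 0.4, -0.7, 0.01, 0.2]   # weights after full training
early = [0.1, -0.2, 0.3, -0.1, 0.05, -0.4]       # weights from early training

mask = magnitude_mask(converged, sparsity=0.5)
ticket = stabilized_ticket(early, mask)
signs = constant_sign_ticket(early, mask, magnitude=0.1)
print(mask)
```

In this sketch the mask comes from the converged weights, while the values that survive come from the early checkpoint; the pruned network would then be retrained from that starting point.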

A combination of 'pooling' with a prediction model can reduce by 73% the number of COVID-19 (Corona-virus) tests. (arXiv:2005.03453v1 [cs.LG])

Lifted Regression/Reconstruction Networks. (arXiv:2005.03452v1 [cs.LG])

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])

Parametrized Universality Problems for One-Counter Nets. (arXiv:2005.03435v1 [cs.FL])

Dirichlet spectral-Galerkin approximation method for the simply supported vibrating plate eigenvalues. (arXiv:2005.03433v1 [math.NA])

The Perceptimatic English Benchmark for Speech Perception Models. (arXiv:2005.03418v1 [cs.CL])

Detection and Feeder Identification of the High Impedance Fault at Distribution Networks Based on Synchronous Waveform Distortions. (arXiv:2005.03411v1 [eess.SY])

AutoSOS: Towards Multi-UAV Systems Supporting Maritime Search and Rescue with Lightweight AI and Edge Computing. (arXiv:2005.03409v1 [cs.RO])

A LiDAR-based real-time capable 3D Perception System for Automated Driving in Urban Domains. (arXiv:2005.03404v1 [cs.RO])

Datom: A Deformable modular robot for building self-reconfigurable programmable matter. (arXiv:2005.03402v1 [cs.RO])

Scheduling with a processing time oracle. (arXiv:2005.03394v1 [cs.DS])

Semantic Signatures for Large-scale Visual Localization. (arXiv:2005.03388v1 [cs.CV])

2kenize: Tying Subword Sequences for Chinese Script Conversion. (arXiv:2005.03375v1 [cs.CL])

Energy-efficient topology to enhance the wireless sensor network lifetime using connectivity control. (arXiv:2005.03370v1 [cs.NI])

Scoring Root Necrosis in Cassava Using Semantic Segmentation. (arXiv:2005.03367v1 [eess.IV])

Soft Interference Cancellation for Random Coding in Massive Gaussian Multiple-Access. (arXiv:2005.03364v1 [cs.IT])

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation. (arXiv:2005.03361v1 [cs.CL])

DramaQA: Character-Centered Video Story Understanding with Hierarchical QA. (arXiv:2005.03356v1 [cs.CL])

Quantum correlation alignment for unsupervised domain adaptation. (arXiv:2005.03355v1 [quant-ph])

DMCP: Differentiable Markov Channel Pruning for Neural Networks. (arXiv:2005.03354v1 [cs.CV])

Pricing under a multinomial logit model with non linear network effects. (arXiv:2005.03352v1 [cs.GT])

Error estimates for the Cahn--Hilliard equation with dynamic boundary conditions. (arXiv:2005.03349v1 [math.NA])

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV])

Wavelet Integrated CNNs for Noise-Robust Image Classification. (arXiv:2005.03337v1 [cs.CV])

Causal Paths in Temporal Networks of Face-to-Face Human Interactions. (arXiv:2005.03333v1 [cs.SI])

Crop Aggregating for short utterances speaker verification using raw waveforms. (arXiv:2005.03329v1 [eess.AS])

Bitvector-aware Query Optimization for Decision Support Queries (extended version). (arXiv:2005.03328v1 [cs.DB])

Database Traffic Interception for Graybox Detection of Stored and Context-Sensitive XSS. (arXiv:2005.03322v1 [cs.CR])

A Review of Computer Vision Methods in Network Security. (arXiv:2005.03318v1 [cs.NI])

Interval type-2 fuzzy logic system based similarity evaluation for image steganography. (arXiv:2005.03310v1 [cs.MM])

Safe Data-Driven Distributed Coordination of Intersection Traffic. (arXiv:2005.03304v1 [math.OC])

Knowledge Enhanced Neural Fashion Trend Forecasting. (arXiv:2005.03297v1 [cs.IR])

Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data. (arXiv:2005.03295v1 [eess.AS])

YANG2UML: Bijective Transformation and Simplification of YANG to UML. (arXiv:2005.03292v1 [cs.SE])

Data selection for multi-task learning under dynamic constraints. (arXiv:2005.03270v1 [eess.SY])

Online Proximal-ADMM For Time-varying Constrained Convex Optimization. (arXiv:2005.03267v1 [eess.SY])

Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT. (arXiv:2005.03264v1 [eess.IV])

Quda: Natural Language Queries for Visual Data Analytics. (arXiv:2005.03257v1 [cs.CL])

Coding for Optimized Writing Rate in DNA Storage. (arXiv:2005.03248v1 [cs.IT])

DFSeer: A Visual Analytics Approach to Facilitate Model Selection for Demand Forecasting. (arXiv:2005.03244v1 [cs.HC])

Mortar-based entropy-stable discontinuous Galerkin methods on non-conforming quadrilateral and hexahedral meshes. (arXiv:2005.03237v1 [math.NA])

Safe Reinforcement Learning through Meta-learned Instincts. (arXiv:2005.03233v1 [cs.LG])

Multi-Target Deep Learning for Algal Detection and Classification. (arXiv:2005.03232v1 [cs.CV])

Constructing Accurate and Efficient Deep Spiking Neural Networks with Double-threshold and Augmented Schemes. (arXiv:2005.03231v1 [cs.NE])

Hierarchical Predictive Coding Models in a Deep-Learning Framework. (arXiv:2005.03230v1 [cs.CV])

Diagnosis of Coronavirus Disease 2019 (COVID-19) with Structured Latent Multi-View Representation Learning. (arXiv:2005.03227v1 [eess.IV])

Deeply Supervised Active Learning for Finger Bones Segmentation. (arXiv:2005.03225v1 [cs.CV])

End-to-End Domain Adaptive Attention Network for Cross-Domain Person Re-Identification. (arXiv:2005.03222v1 [cs.CV])

Conley's fundamental theorem for a class of hybrid systems. (arXiv:2005.03217v1 [math.DS])

