ICLR 2022 Outstanding Paper Awards - A Quick Look at the Winning Papers

Contents

Outstanding Paper Award Papers

Paper 1: Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models

Paper 2: Hyperparameter Tuning with Renyi Differential Privacy

Paper 3: Learning Strides in Convolutional Neural Networks

Paper 4: Expressiveness and Approximation Properties of Graph Neural Networks

Paper 5: Comparing Distributions by Measuring Differences that Affect Decision Making

Paper 6: Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path

Paper 7: Bootstrapped Meta-Learning

Honorable Mentions

Honorable Mention 1: Understanding over-squashing and bottlenecks on graphs via curvature

Honorable Mention 2: Efficiently Modeling Long Sequences with Structured State Spaces

Honorable Mention 3: PiCO: Contrastive Label Disambiguation for Partial Label Learning


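The 2022 International Conference on Learning Representations (ICLR), the premier gathering of professionals dedicated to advancing representation learning (commonly called deep learning) in artificial intelligence, has announced the program for its tenth conference, with a diverse lineup of invited speakers. The conference selected seven outstanding papers and three honorable mentions.

The program chairs reviewed 3,391 submissions over a four-month review process and accepted 1,095 papers, an acceptance rate of 32.3%. Of these, 54 are Oral papers and 176 are Spotlight papers. All papers are listed at: ICLR 2022 Conference | OpenReview.

The award committee looks forward to the winning papers being presented in the oral sessions on April 25 at 5:00 PM, April 26 at 1:00 PM, April 27 at 9:00 AM, and April 28 at 1:00 PM. The seven Outstanding Paper Award winners are: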

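Outstanding Paper Award Papers

Paper 1: Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models

https://arxiv.org/pdf/2201.06503.pdf

https://openreview.net/pdf?id=0xiJLKH-ufZ

Award citation: Diffusion probabilistic models (DPMs) are a powerful class of generative models and a fast-growing topic in machine learning. This paper addresses an inherent limitation of DPMs: computing the optimal reverse variance is slow and expensive. The authors first present a surprising result, namely that both the optimal reverse variance of a DPM and the corresponding optimal KL divergence have analytic forms in terms of its score function. They then propose Analytic-DPM, a novel and elegant training-free inference framework that uses the Monte Carlo method and a pretrained score-based model to estimate the analytic forms of the variance and the KL divergence.

The paper matters both for its theoretical contribution (showing that the optimal reverse variance and KL divergence of a DPM have analytic forms) and for its practical benefit (training-free inference applicable to a wide range of DPMs), and it is likely to influence future research on DPMs.

[Reposted from: https://mp.weixin.qq.com/s/XrJ47E94XEKbkcFYkBMPTw]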

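Paper 2: Hyperparameter Tuning with Renyi Differential Privacy

https://arxiv.org/pdf/2110.03620v1.pdf

https://openreview.net/pdf?id=-70L8lpp9DF

Award citation: This paper offers new insight into an important blind spot in the differential privacy analysis of learning algorithms: a learning algorithm is run on the data many times in order to tune hyperparameters. The authors show that in some cases part of the data can skew the optimal hyperparameters and thereby leak private information. They also provide privacy guarantees for the hyperparameter search procedure within the framework of Rényi differential privacy.

This is an excellent paper that considers the everyday use of learning algorithms and its implications for privacy in society, and proposes a solution. The work will provide a foundation for follow-up work on differentially private machine learning algorithms.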
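To make the "many runs" concern concrete, here is a minimal sketch of standard Rényi differential privacy accounting. It is not the paper's analysis. It assumes each tuning run is a Gaussian mechanism with known sensitivity and noise scale, and it uses plain additive composition at a fixed order as a naive baseline. The function names and the numbers are illustrative assumptions.

```python
import math

def gaussian_rdp(alpha, sensitivity, sigma):
    """Renyi DP epsilon of order alpha for the Gaussian mechanism."""
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)

def compose(eps_per_run, num_runs):
    """Naive RDP composition: epsilons at a fixed order add up across runs."""
    return eps_per_run * num_runs

def to_approx_dp(alpha, rdp_eps, delta):
    """Standard conversion from (alpha, eps)-RDP to (eps', delta)-DP."""
    return rdp_eps + math.log(1.0 / delta) / (alpha - 1.0)

# Hypothetical settings: one training run versus ten hyperparameter trials.
one_run = gaussian_rdp(alpha=8.0, sensitivity=1.0, sigma=4.0)
ten_runs = compose(one_run, num_runs=10)
print(one_run, ten_runs, to_approx_dp(8.0, ten_runs, delta=1e-5))
```

Under this naive accounting the privacy cost grows linearly with the number of trials, which is the kind of overhead that a dedicated analysis of the tuning procedure aims to avoid.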

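Paper 3: Learning Strides in Convolutional Neural Networks

https://arxiv.org/pdf/2202.01653.pdf

https://openreview.net/pdf?id=M752z9FKJP

Award citation: This paper addresses an important problem faced by anyone who uses convolutional networks, namely setting strides in a principled way rather than through experimentation and trial and error. The authors propose a novel and very clever mathematical formulation for learning strides and demonstrate a practical method that achieves SOTA results on comprehensive benchmarks. The main idea is DiffStride, the first downsampling layer with learnable strides. It learns the size of a cropping mask in the Fourier domain, which effectively performs resizing in a way that is amenable to differentiable programming.

This is an excellent paper that proposes a method likely to become part of commonly used toolboxes as well as deep learning courses.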
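The idea of resizing by cropping in the Fourier domain can be sketched with NumPy. This is only an illustration under simplifying assumptions: the crop is a fixed centred box on a single 2-D array, whereas DiffStride learns the mask size with a differentiable mask. The function name `fourier_crop_downsample` is made up for this example.

```python
import numpy as np

def fourier_crop_downsample(x, out_h, out_w):
    """Downsample a 2-D array by keeping only its lowest frequencies."""
    h, w = x.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    # Crop a centred out_h x out_w window (a fixed box, not a learned mask).
    top = h // 2 - out_h // 2
    left = w // 2 - out_w // 2
    cropped = spectrum[top:top + out_h, left:left + out_w]
    # Undo the shift and return to the spatial domain.
    y = np.fft.ifft2(np.fft.ifftshift(cropped))
    # Rescale to preserve the mean intensity; drop the residual imaginary part.
    return y.real * (out_h * out_w) / (h * w)

image = np.full((8, 8), 3.0)          # constant test image
small = fourier_crop_downsample(image, 4, 4)
print(small.shape)
```

Because the output size is set by the crop window rather than by an integer stride, making that window size a trainable quantity is what allows the amount of downsampling to be learned.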

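Paper 4: Expressiveness and Approximation Properties of Graph Neural Networks

https://arxiv.org/pdf/2204.04661v1.pdf

https://openreview.net/pdf?id=wIzUeM3TAU

Award citation: This strongly theoretical paper shows how questions about the expressiveness and separation power of different graph neural network (GNN) architectures can be reduced to (and sometimes greatly simplified by) examining their computations in a tensor language, where these questions relate to common combinatorial notions such as treewidth. In particular, the paper shows how bounds on the separation power of GNNs can be obtained easily in terms of the Weisfeiler-Leman (WL) tests, which have become the yardstick for measuring the separation power of GNNs. The framework also has implications for studying the approximation of functions by GNNs.

By providing a general framework for describing, comparing, and analyzing GNN architectures, the paper has the potential to significantly influence future research. It also provides a toolbox that GNN architecture designers can use to analyze the separation power of their GNNs without needing to understand the intricacies of the WL tests.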

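Paper 5: Comparing Distributions by Measuring Differences that Affect Decision Making

https://openreview.net/pdf?id=KB5onONJIAU

Award citation: This work proposes a new class of discrepancies that compare two probability distributions based on the optimal loss for a decision task. With a suitable choice of decision task, the method generalizes the Jensen-Shannon divergence and the maximum mean discrepancy family. Compared with competitive baselines on a variety of benchmarks, the method achieves superior test performance, and it has promising applications in understanding the effects of climate change on different social and economic activities, evaluating sample quality, and selecting features for different decision tasks. The award committee considers the paper to be of exceptional empirical significance because the method lets users directly specify their preferences when comparing distributions through the decision loss, which means greater interpretability in practical applications.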
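As a small illustration of comparing distributions through an optimal decision loss, the sketch below assumes discrete distributions and takes log loss as the decision task. Under log loss the best achievable expected loss is the Shannon entropy, and the Jensen-Shannon divergence is the extra optimal loss incurred on the mixture compared with the average over the two distributions. This is only that special case, not the paper's general discrepancy, and the function names are chosen for this example.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats: the best achievable expected log loss under p."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]          # treat 0 * log(0) as 0
    return float(-(nonzero * np.log(nonzero)).sum())

def js_divergence(p, q):
    """Jensen-Shannon divergence written as a gap between optimal losses."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mixture = 0.5 * (p + q)
    return entropy(mixture) - 0.5 * (entropy(p) + entropy(q))

same = js_divergence([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])  # no overlap at all
print(same, disjoint)
```

Swapping the entropy function for the optimal loss of a different decision task is the kind of choice that lets a user encode which differences between distributions matter.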

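Paper 6: Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path

https://arxiv.org/pdf/2106.02073.pdf

https://openreview.net/pdf?id=w1UbdvWH_R3

Award citation: This work offers new theoretical insight into the "neural collapse" phenomenon that is prevalent in today's deep network training paradigm. During neural collapse, last-layer features collapse to their class means, both the classifiers and the class means collapse to the same Simplex Equiangular Tight Frame, and classifier behavior collapses to the nearest-class-mean decision rule.

Instead of the cross-entropy loss, which is hard to analyze mathematically, the work proposes a new decomposition of the mean squared error (MSE) loss so that each component of the loss can be analyzed under neural collapse. This in turn leads to a new theoretical construct, the "central path", on which the linear classifier stays MSE-optimal for the feature activations throughout the dynamics. Finally, by studying the renormalized gradient flow along the central path, the authors derive exact dynamics that predict neural collapse. The work provides novel and highly inspiring theoretical insight into the empirical training dynamics of deep networks.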
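The geometry named in the citation can be checked numerically. The sketch below builds a simplex equiangular tight frame with a standard construction (a rescaled centering matrix) and applies a nearest-class-mean rule to slightly perturbed features. The number of classes, the noise level, and the function names are assumptions for illustration, not values from the paper.

```python
import numpy as np

def simplex_etf(num_classes):
    """Return a matrix whose rows are unit vectors forming a simplex ETF."""
    c = num_classes
    centering = np.eye(c) - np.ones((c, c)) / c
    return np.sqrt(c / (c - 1)) * centering

def nearest_class_mean(features, class_means):
    """Assign each feature vector to the class whose mean is closest."""
    diffs = features[:, None, :] - class_means[None, :, :]
    return np.linalg.norm(diffs, axis=-1).argmin(axis=1)

means = simplex_etf(4)
gram = means @ means.T   # pairwise inner products between class means

# Features that have almost collapsed onto their class means.
rng = np.random.default_rng(0)
features = means + 0.01 * rng.normal(size=means.shape)
labels = nearest_class_mean(features, means)
print(np.round(gram, 3), labels)
```

Every class mean has unit norm and every pair has the same inner product, -1/(C-1) for C classes, which is the maximally separated configuration the collapse converges to.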

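Paper 7: Bootstrapped Meta-Learning

https://arxiv.org/pdf/2109.04504.pdf

https://openreview.net/pdf?id=b-ny3x071E5

Award citation: Meta-learning has the potential to empower artificial intelligence, but meta-optimization has been a major obstacle to unlocking that potential; this work opens a new direction for meta-learning. Inspired by TD learning, the authors propose a method that bootstraps the meta-learner from itself or from another update rule. The work provides a thorough theoretical analysis and several experiments, achieving a new SOTA for model-free agents on the Atari ALE benchmark and improving performance and efficiency in multi-task meta-learning.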

Honorable Mentions

Three further papers received an Outstanding Paper honorable mention:

Honorable Mention 1: Understanding over-squashing and bottlenecks on graphs via curvature

https://openreview.net/pdf?id=7UmjRGzp-A

Honorable Mention 2: Efficiently Modeling Long Sequences with Structured State Spaces

https://openreview.net/pdf?id=uYLFoz1vlAC

Honorable Mention 3: PiCO: Contrastive Label Disambiguation for Partial Label Learning

https://openreview.net/pdf?id=EhYjZy6e1gJ

Other Recommended ICLR Papers

1. ICLR 2022 — A Selection of 10 Papers You Shouldn’t Miss – Towards AI

2. Most Influential ICLR Papers (2022-02) – Paper Digest

TABLE 1: Most Influential ICLR Papers (2022-02)

Each entry gives the year, rank, paper title, and IF score, followed by the highlight and the author(s).

2021, rank 1: An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale (IF: 8)
Highlight: Transformers applied directly to image patches and pre-trained on large datasets work really well on image classification.
Author(s): Alexey Dosovitskiy et al.

2021, rank 2: Deformable DETR: Deformable Transformers for End-to-End Object Detection (IF: 6)
Highlight: Deformable DETR is an efficient and fast-converging end-to-end object detector. It mitigates the high complexity and slow convergence issues of DETR via a novel sampling-based efficient attention mechanism.
Author(s): Xizhou Zhu et al.

2021, rank 3: Rethinking Attention with Performers (IF: 5)
Highlight: We introduce Performers, linear full-rank-attention Transformers via provable random feature approximation methods, without relying on sparsity or low-rankness.
Author(s): Krzysztof Marcin Choromanski et al.

2021, rank 4: DeBERTa: Decoding-enhanced BERT with Disentangled Attention (IF: 5)
Highlight: A new model architecture DeBERTa is proposed that improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder.
Author(s): Pengcheng He; Xiaodong Liu; Jianfeng Gao; Weizhu Chen

2021, rank 5: Adaptive Federated Optimization (IF: 5)
Highlight: We propose adaptive federated optimization techniques, and highlight their improved performance over popular methods such as FedAvg.
Author(s): Sashank J. Reddi et al.

2021, rank 6: FastSpeech 2: Fast and High-Quality End-to-End Text to Speech (IF: 4)
Highlight: We propose a non-autoregressive TTS model named FastSpeech 2 to better solve the one-to-many mapping problem in TTS and surpass autoregressive models in voice quality.
Author(s): Yi Ren et al.

2021, rank 7: Prototypical Contrastive Learning of Unsupervised Representations (IF: 4)
Highlight: We propose an unsupervised representation learning method that bridges contrastive learning with clustering in an EM framework.
Author(s): Junnan Li; Pan Zhou; Caiming Xiong; Steven Hoi

2021, rank 8: Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels (IF: 4)
Highlight: The first successful demonstration that image augmentation can be applied to image-based Deep RL to achieve SOTA performance.
Author(s): Denis Yarats; Ilya Kostrikov; Rob Fergus

2021, rank 9: Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval (IF: 4)
Highlight: This paper improves the learning of dense text retrieval using ANCE, which selects global negatives with bigger gradient norms using an asynchronously updated ANN index.
Author(s): Lee Xiong et al.

2021, rank 10: Fourier Neural Operator for Parametric Partial Differential Equations (IF: 4)
Highlight: A novel neural operator based on Fourier transformation for learning partial differential equations.
Author(s): Zongyi Li et al.

2021, rank 11: In Search of Lost Domain Generalization (IF: 4)
Highlight: Our ERM baseline achieves state-of-the-art performance across many domain generalization benchmarks.
Author(s): Ishaan Gulrajani; David Lopez-Paz

2021, rank 12: GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (IF: 4)
Highlight: In this paper we demonstrate conditional computation as a remedy to the above mentioned impediments, and demonstrate its efficacy and utility.
Author(s): Dmitry Lepikhin et al.

2021, rank 13: Score-Based Generative Modeling Through Stochastic Differential Equations (IF: 4)
Highlight: A general framework for training and sampling from score-based models that unifies and generalizes previous methods, allows likelihood computation, and enables controllable generation.
Author(s): Yang Song et al.

2021, rank 14: Sharpness-aware Minimization for Efficiently Improving Generalization (IF: 4)
Highlight: Motivated by the connection between geometry of the loss landscape and generalization, we introduce a procedure for simultaneously minimizing loss value and loss sharpness.
Author(s): Pierre Foret; Ariel Kleiner; Hossein Mobahi; Behnam Neyshabur

2021, rank 15: Recurrent Independent Mechanisms (IF: 4)
Highlight: Learning recurrent mechanisms which operate independently, and sparingly interact can lead to better generalization to out of distribution samples.
Author(s): Anirudh Goyal et al.

2020, rank 1: ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations (IF: 8)
Highlight: A new pretraining method that establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.
Author(s): Zhenzhong Lan et al.

2020, rank 2: ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators (IF: 8)
Highlight: A text encoder trained to distinguish real input tokens from plausible fakes efficiently learns effective language representations.
Author(s): Kevin Clark; Minh-Thang Luong; Quoc V. Le; Christopher D. Manning

2020, rank 3: BERTScore: Evaluating Text Generation With BERT (IF: 7)
Highlight: We propose BERTScore, an automatic evaluation metric for text generation, which correlates better with human judgments and provides stronger model selection performance than existing metrics.
Author(s): Tianyi Zhang*; Varsha Kishore*; Felix Wu*; Kilian Q. Weinberger; Yoav Artzi

2020, rank 4: On The Variance Of The Adaptive Learning Rate And Beyond (IF: 7)
Highlight: If warmup is the answer, what is the question?
Author(s): Liyuan Liu et al.

2020, rank 5: The Curious Case Of Neural Text Degeneration (IF: 7)
Highlight: Current language generation systems either aim for high likelihood and devolve into generic repetition or miscalibrate their stochasticity; we provide evidence of both and propose a solution: Nucleus Sampling.
Author(s): Ari Holtzman; Jan Buys; Leo Du; Maxwell Forbes; Yejin Choi

2020, rank 6: Reformer: The Efficient Transformer (IF: 7)
Highlight: Efficient Transformer with locality-sensitive hashing and reversible layers.
Author(s): Nikita Kitaev; Lukasz Kaiser; Anselm Levskaya

2020, rank 7: VL-BERT: Pre-training Of Generic Visual-Linguistic Representations (IF: 7)
Highlight: VL-BERT is a simple yet powerful pre-trainable generic representation for visual-linguistic tasks. It is pre-trained on the massive-scale caption dataset and text-only corpus, and can be finetuned for various downstream visual-linguistic tasks.
Author(s): Weijie Su et al.

2020, rank 8: On The Convergence Of FedAvg On Non-IID Data (IF: 6)
Highlight: In this paper, we analyze the convergence of FedAvg on non-iid data and establish a convergence rate of O(1/T) for strongly convex and smooth problems, where T is the number of SGDs.
Author(s): Xiang Li; Kaixuan Huang; Wenhao Yang; Shusen Wang; Zhihua Zhang

2020, rank 9: Once For All: Train One Network And Specialize It For Efficient Deployment (IF: 6)
Highlight: We introduce techniques to train a single once-for-all network that fits many hardware platforms.
Author(s): Han Cai; Chuang Gan; Tianzhe Wang; Zhekai Zhang; Song Han

2020, rank 10: Fast Is Better Than Free: Revisiting Adversarial Training (IF: 6)
Highlight: FGSM-based adversarial training, with randomization, works just as well as PGD-based adversarial training: we can use this to train a robust classifier in 6 minutes on CIFAR10, and 12 hours on ImageNet, on a single machine.
Author(s): Eric Wong; Leslie Rice; J. Zico Kolter

2020, rank 11: AugMix: A Simple Data Processing Method To Improve Robustness And Uncertainty (IF: 6)
Highlight: We obtain state-of-the-art on robustness to data shifts, and we maintain calibration under data shift even when accuracy drops.
Author(s): Dan Hendrycks* et al.

2020, rank 12: DropEdge: Towards Deep Graph Convolutional Networks On Node Classification (IF: 6)
Highlight: This paper proposes DropEdge, a novel and flexible technique to alleviate over-smoothing and overfitting issue in deep Graph Convolutional Networks.
Author(s): Yu Rong; Wenbing Huang; Tingyang Xu; Junzhou Huang

2020, rank 13: Large Batch Optimization For Deep Learning: Training BERT In 76 Minutes (IF: 6)
Highlight: A fast optimizer for general applications and large-batch training.
Author(s): Yang You et al.

2020, rank 14: Dream To Control: Learning Behaviors By Latent Imagination (IF: 6)
Highlight: We present Dreamer, an agent that learns long-horizon behaviors purely by latent imagination using analytic value gradients.
Author(s): Danijar Hafner; Timothy Lillicrap; Jimmy Ba; Mohammad Norouzi

2020, rank 15: Deep Double Descent: Where Bigger Models And More Data Hurt (IF: 5)
Highlight: We demonstrate, and characterize, realistic settings where bigger models are worse, and more data hurts.
Author(s): Preetum Nakkiran et al.

2019, rank 1: Large Scale GAN Training For High Fidelity Natural Image Synthesis (IF: 8)
Highlight: GANs benefit from scaling up.
Author(s): Andrew Brock; Jeff Donahue; Karen Simonyan

2019, rank 2: Decoupled Weight Decay Regularization (IF: 8)
Highlight: Novel variants of optimization methods that combine the benefits of both adaptive and non-adaptive methods.
Author(s): Ilya Loshchilov; Frank Hutter

2019, rank 3: GLUE: A Multi-Task Benchmark And Analysis Platform For Natural Language Understanding (IF: 8)
Highlight: We present a multi-task benchmark and analysis platform for evaluating generalization in natural language understanding systems.
Author(s): Alex Wang et al.

2019, rank 4: How Powerful Are Graph Neural Networks? (IF: 8)
Highlight: We develop theoretical foundations for the expressive power of GNNs and design a provably most powerful GNN.
Author(s): Keyulu Xu*; Weihua Hu*; Jure Leskovec; Stefanie Jegelka

2019, rank 5: DARTS: Differentiable Architecture Search (IF: 8)
Highlight: We propose a differentiable architecture search algorithm for both convolutional and recurrent networks, achieving competitive performance with the state of the art using orders of magnitude less computation resources.
Author(s): Hanxiao Liu; Karen Simonyan; Yiming Yang

2019, rank 6: The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks (IF: 8)
Highlight: Feedforward neural networks that can have weights pruned after training could have had the same weights pruned before training.
Author(s): Jonathan Frankle; Michael Carbin

2019, rank 7: Learning Deep Representations By Mutual Information Estimation And Maximization (IF: 8)
Highlight: We learn deep representation by maximizing mutual information, leveraging structure in the objective, and are able to compete with fully supervised classifiers with comparable architectures.
Author(s): R Devon Hjelm et al.

2019, rank 8: ImageNet-trained CNNs Are Biased Towards Texture; Increasing Shape Bias Improves Accuracy And Robustness (IF: 8)
Highlight: ImageNet-trained CNNs are biased towards object texture (instead of shape like humans). Overcoming this major difference between human and machine vision yields improved detection performance and previously unseen robustness to image distortions.
Author(s): Robert Geirhos et al.

2019, rank 9: ProxylessNAS: Direct Neural Architecture Search On Target Task And Hardware (IF: 8)
Highlight: Proxy-less neural architecture search for directly learning architectures on large-scale target task (ImageNet) while reducing the cost to the same level of normal training.
Author(s): Han Cai; Ligeng Zhu; Song Han

2019, rank 10: Benchmarking Neural Network Robustness To Common Corruptions And Perturbations (IF: 7)
Highlight: We propose ImageNet-C to measure classifier corruption robustness and ImageNet-P to measure perturbation robustness.
Author(s): Dan Hendrycks; Thomas Dietterich

2019, rank 11: Robustness May Be At Odds With Accuracy (IF: 7)
Highlight: We show that adversarial robustness might come at the cost of standard classification performance, but also yields unexpected benefits.
Author(s): Dimitris Tsipras; Shibani Santurkar; Logan Engstrom; Alexander Turner; Aleksander Madry

2019, rank 12: Gradient Descent Provably Optimizes Over-parameterized Neural Networks (IF: 7)
Highlight: We prove gradient descent achieves zero training loss with a linear rate on over-parameterized neural networks.
Author(s): Simon S. Du; Xiyu Zhai; Barnabas Poczos; Aarti Singh

2019, rank 13: A Closer Look At Few-shot Classification (IF: 7)
Highlight: A detailed empirical study in few-shot classification that reveals challenges in the standard evaluation setting and shows a new direction.
Author(s): Wei-Yu Chen; Yen-Cheng Liu; Zsolt Kira; Yu-Chiang Frank Wang; Jia-Bin Huang

2019, rank 14: Rethinking The Value Of Network Pruning (IF: 7)
Highlight: In structured network pruning, fine-tuning a pruned model only gives comparable performance with training it from scratch.
Author(s): Zhuang Liu; Mingjie Sun; Tinghui Zhou; Gao Huang; Trevor Darrell

2019, rank 15: Meta-Learning With Latent Embedding Optimization (IF: 7)
Highlight: Latent Embedding Optimization (LEO) is a novel gradient-based meta-learner with state-of-the-art performance on the challenging 5-way 1-shot and 5-shot miniImageNet and tieredImageNet classification tasks.
Author(s): Andrei A. Rusu et al.

2018, rank 1: Towards Deep Learning Models Resistant To Adversarial Attacks (IF: 9)
Highlight: We provide a principled, optimization-based re-look at the notion of adversarial examples, and develop methods that produce models that are adversarially robust against a wide range of adversaries.
Author(s): Aleksander Madry; Aleksandar Makelov; Ludwig Schmidt; Dimitris Tsipras; Adrian Vladu

2018, rank 2: Graph Attention Networks (IF: 9)
Highlight: A novel approach to processing graph-structured data by neural networks, leveraging attention over a node’s neighborhood. Achieves state-of-the-art results on transductive citation network tasks and an inductive protein-protein interaction task.
Author(s): Petar Velickovic et al.

2018, rank 3: Progressive Growing Of GANs For Improved Quality, Stability, And Variation (IF: 9)
Highlight: We train generative adversarial networks in a progressive fashion, enabling us to generate high-resolution images with high quality.
Author(s): Tero Karras; Timo Aila; Samuli Laine; Jaakko Lehtinen

2018, rank 4: Mixup: Beyond Empirical Risk Minimization (IF: 8)
Highlight: Training on convex combinations between random training examples and their labels improves generalization in deep neural networks.
Author(s): Hongyi Zhang; Moustapha Cisse; Yann N. Dauphin; David Lopez-Paz

2018, rank 5: Spectral Normalization For Generative Adversarial Networks (IF: 8)
Highlight: We propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator of GANs.
Author(s): Takeru Miyato; Toshiki Kataoka; Masanori Koyama; Yuichi Yoshida

2018, rank 6: Ensemble Adversarial Training: Attacks And Defenses (IF: 9)
Highlight: Adversarial training with single-step methods overfits, and remains vulnerable to simple black-box and white-box attacks. We show that including adversarial examples from multiple sources helps defend against black-box attacks.
Author(s): Florian Tramèr et al.

2018, rank 7: Unsupervised Representation Learning By Predicting Image Rotations (IF: 8)
Highlight: In our work we propose to learn image features by training ConvNets to recognize the 2d rotation that is applied to the image that it gets as input.
Author(s): Spyros Gidaris; Praveer Singh; Nikos Komodakis

2018, rank 8: On The Convergence Of Adam And Beyond (IF: 8)
Highlight: We investigate the convergence of popular optimization algorithms like Adam and RMSProp and propose new variants of these methods which provably converge to the optimal solution in convex settings.
Author(s): Sashank J. Reddi; Satyen Kale; Sanjiv Kumar

2018, rank 9: Word Translation Without Parallel Data (IF: 8)
Highlight: Aligning languages without the Rosetta Stone: with no parallel data, we construct bilingual dictionaries using adversarial training, cross-domain local scaling, and an accurate proxy criterion for cross-validation.
Author(s): Guillaume Lample; Alexis Conneau; Marc’Aurelio Ranzato; Ludovic Denoyer; Hervé Jégou

2018, rank 10: A Deep Reinforced Model For Abstractive Summarization (IF: 8)
Highlight: A summarization model combining a new intra-attention and reinforcement learning method to increase summary ROUGE scores and quality for long sequences.
Author(s): Romain Paulus; Caiming Xiong; Richard Socher

2018, rank 11: Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting (IF: 8)
Highlight: A neural sequence model that learns to forecast on a directed graph.
Author(s): Yaguang Li; Rose Yu; Cyrus Shahabi; Yan Liu

2018, rank 12: Regularizing And Optimizing LSTM Language Models (IF: 8)
Highlight: Effective regularization and optimization strategies for LSTM-based language models achieve SOTA on PTB and WT2.
Author(s): Stephen Merity; Nitish Shirish Keskar; Richard Socher

2018, rank 13: Countering Adversarial Images Using Input Transformations (IF: 8)
Highlight: We apply a model-agnostic defense strategy against adversarial examples and achieve 60% white-box accuracy and 90% black-box accuracy against major attack algorithms.
Author(s): Chuan Guo; Mayank Rana; Moustapha Cisse; Laurens van der Maaten

2018, rank 14: A Simple Neural Attentive Meta-Learner (IF: 7)
Highlight: A simple RNN-based meta-learner that achieves SOTA performance on popular benchmarks.
Author(s): Nikhil Mishra; Mostafa Rohaninejad; Xi Chen; Pieter Abbeel

2018, rank 15: Unsupervised Machine Translation Using Monolingual Corpora Only (IF: 7)
Highlight: We propose a new unsupervised machine translation model that can learn without using parallel corpora; experimental results show impressive performance on multiple corpora and pairs of languages.
Author(s): Guillaume Lample; Alexis Conneau; Ludovic Denoyer; Marc’Aurelio Ranzato

Feel free to share. When reposting, please credit the source: 内存溢出

Original article: http://outofmemory.cn/langs/760125.html
