
Introduction

This collection organizes all carefully read papers, sorted by subject and time.

RNN

GRU

| No. | Date | Paper | Authors | Field | Notes | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 1412 | 201412 | Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling | Junyoung Chung; Yoshua Bengio | | | |
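For reference, one common formulation of the GRU step evaluated in the 1412 paper, as a minimal NumPy sketch (gate conventions vary slightly between papers; weight names here are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step for input vector x and previous hidden state h_prev."""
    z = sigmoid(Wz @ x + Uz @ h_prev)                # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)                # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))    # candidate state
    return (1.0 - z) * h_prev + z * h_tilde          # interpolate old and candidate state
```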

Encoder–Decoder architecture

| No. | Date | Paper | Authors | Field | Notes | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 1406 | 201406 | Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation | Kyunghyun Cho; Yoshua Bengio | NLP - machine translation | First to propose the RNN Encoder–Decoder. See also: Neural Machine Translation by Jointly Learning to Align and Translate (employs attention in machine translation; soft alignment) | |
| 1409 | 201409 | Sequence to Sequence Learning with Neural Networks | Google | NLP - machine translation | Builds on Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, but uses LSTM units instead | blog |
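A minimal PyTorch sketch of the shared Encoder–Decoder idea (the 1406 paper uses GRU-like units, the 1409 paper LSTM; dimensions and class names here are illustrative, not from either paper):

```python
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder compresses the source sequence into a fixed-size state;
    the decoder generates the target sequence conditioned on that state."""
    def __init__(self, src_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)  # swap in nn.LSTM for the 1409 variant
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, h = self.encoder(self.src_emb(src_ids))           # h summarizes the source sentence
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)  # decode conditioned on h
        return self.out(dec_out)                             # logits over the target vocabulary
```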

attention

| No. | Date | Paper | Authors | Field | Notes | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 1409 | 201409 | Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau; Kyunghyun Cho; Yoshua Bengio | NLP - machine translation | First to propose the attention mechanism; employs attention in machine translation; soft alignment | text_classification/a06_Seq2seqWithAttention/ |
| 1412 | 201412 | Multiple Object Recognition with Visual Attention | Google DeepMind; University of Toronto | | Employs attention in object recognition | |
| 1502 | 201502 | Show, Attend and Tell: Neural Image Caption Generation with Visual Attention | Kelvin Xu; Jimmy Ba; Ryan Kiros; Kyunghyun Cho; Aaron Courville; Ruslan Salakhutdinov; Richard Zemel; Yoshua Bengio | image caption | Fairly easy to read and understand; a good introduction to attention. Inspired by 1409 and 1412, applies the attention mechanism to caption generation; proposes soft attention and hard attention | kelvinxu/arctic-captions |
| 1601 | 201601 | Long Short-Term Memory-Networks for Machine Reading | Jianpeng Cheng; University of Edinburgh | Self-Attention | | |
| 1706 | 201706 | Attention Is All You Need | Google Brain | | First to propose the Transformer, a new model architecture. See also: an explanatory write-up of "Attention Is All You Need" (《attention is all you need》解读) | huggingface/transformers; models/official/transformer/ |
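A minimal NumPy sketch of the scaled dot-product attention at the core of the Transformer (1706); the 1409 paper instead scores each query–key pair with a small feed-forward network (additive attention):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) matrices; returns the attended values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted sum of values
```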

NLP

| No. | Date | Paper | Authors | Notes |
| --- | --- | --- | --- | --- |
| 1810 | 201810 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Google AI Language | See also: TensorFlow code and pre-trained models for BERT |
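For quick experiments, the pre-trained weights can also be loaded through the huggingface/transformers library listed above (the checkpoint name `bert-base-uncased` is the public English base model, not something specific to this collection):

```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Distant supervision produces noisy labels.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, num_tokens, 768) contextual embeddings
```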

distant supervision for relation extraction

| No. | Year | Paper | Strengths | Weaknesses |
| --- | --- | --- | --- | --- |
| 0911 | 200911 | Distant supervision for relation extraction without labeled data | Many, e.g. no manual annotation, low cost, large-scale datasets, and it sidesteps some problems that plague supervised learning | 1. Noisy labels, limited performance; 2. Requires hand-designed features |
| ... | ... | ... | ... | ... |
| 15 | 2015 | Distant supervision for relation extraction via piecewise convolutional neural networks | 1. Uses a **PCNNs** network and, in each bag, selects the sentence with the highest probability of being a valid instance to extract features from, without relying on traditional NLP tools (piecewise pooling sketched below) | 1. Only one sentence per bag (the most probable one) is taken as the valid instance, so the information in the bag is not fully exploited |
| 17 | 2017 | Distant Supervision for Relation Extraction with Sentence-level Attention and Entity Descriptions | 1. Considers multiple valid instances per bag; 2. Features are extracted by a neural network; 3. Proposes the **entity descriptions** idea | |
| 1904 | 201904 | Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions | 1. Adds inter-bag attentions on top of intra-bag attentions | |
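A minimal NumPy sketch of the piecewise max pooling that gives PCNNs their name: the convolutional feature map of a sentence is split into three segments by the two entity positions and each segment is pooled separately (function and variable names are illustrative, not from the papers):

```python
import numpy as np

def piecewise_max_pool(conv_out, e1_pos, e2_pos):
    """conv_out: (seq_len, n_filters) feature map; e1_pos < e2_pos are entity token indices."""
    n_filters = conv_out.shape[1]
    segments = (conv_out[:e1_pos + 1],            # up to and including entity 1
                conv_out[e1_pos + 1:e2_pos + 1],  # between the two entities
                conv_out[e2_pos + 1:])            # after entity 2
    pooled = [seg.max(axis=0) if seg.shape[0] else np.zeros(n_filters) for seg in segments]
    return np.concatenate(pooled)                 # (3 * n_filters,) sentence representation
```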

memory network

| No. | Date | Paper | Authors | Notes |
| --- | --- | --- | --- | --- |
| 1410 | 201410 | Memory Networks | Jason Weston; Sumit Chopra; Antoine Bordes | First to propose the memory network model |
| 1503 | 201503 | End-To-End Memory Networks | Sainbayar Sukhbaatar; Arthur Szlam; Jason Weston; Rob Fergus | |
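A single memory hop from End-To-End Memory Networks, sketched in NumPy (embeddings are assumed precomputed here; the full model learns the embedding matrices and stacks several hops):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_hop(query, mem_in, mem_out):
    """query: (d,) embedded question; mem_in / mem_out: (n_memories, d) input and output memories."""
    p = softmax(mem_in @ query)    # attention weights over the stored memories
    o = p @ mem_out                # weighted sum of output memory embeddings
    return query + o               # updated query for the next hop / answer prediction
```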