id: work_eopg7ki7vjcg7lxczmnt7oaeqq
author: Wenpeng Yin
title: Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
date: 2018
pages: 16
extension: .pdf
mime: application/pdf
words: 9744
sentences: 1274
flesch: 68
summary: the attention-weighted sum of hidden states corresponding to nonlocal context (e.g., the hidden … convolution filters derive a higher-level representation for word, denoted as word_new, by integrating word with three pieces of context: left context, … We apply ATTCONV to three sentence modeling tasks with variable-size context: a large-scale Yelp sentiment classification task (Lin et al., … ATTCONV shows its flexibility and effectiveness in sentence modeling with variable-size context. Figure 4: ATTCONV models sentence tx with context ty. Recall that ATTCONV aims to compute a representation for tx in a way that convolution filters encode not only local context, but also … Attentive convolution then generates the higher-level hidden state at position i … source of attention is hidden states in sentence tx … by function fmgran(Hy), feature map Hy of context ty acting as input; and (iii) attention beneficiary is learned by function fbene(Hx), Hx acting … instance of generating source of attention by function fmgran(H), … learning word representations … words, our intra-context attentive convolution is …
cache: ./cache/work_eopg7ki7vjcg7lxczmnt7oaeqq.pdf
txt: ./txt/work_eopg7ki7vjcg7lxczmnt7oaeqq.txt
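The summary above describes ATTCONV building a higher-level hidden state for each word of sentence tx from its local neighbors plus an attention-weighted sum over the hidden states of a context ty. The following is only a minimal NumPy sketch of that idea, with toy dimensions and random weights standing in for learned parameters; the names W_bene, W_mgran, W_l, W_0, W_r, W_c are illustrative assumptions, and the paper's gating and multi-granularity variants of fmgran/fbene are not reproduced.

# Minimal sketch of attentive convolution (assumed simplification, not the
# paper's exact formulation).
import numpy as np

rng = np.random.default_rng(0)

d = 8                 # hidden size (toy value)
len_x, len_y = 5, 7   # lengths of sentence tx and context ty

# Hidden states of sentence tx (attention beneficiary) and context ty (attention source).
Hx = rng.normal(size=(len_x, d))
Hy = rng.normal(size=(len_y, d))

# Stand-ins for fbene(Hx) and fmgran(Hy): plain linear maps here.
W_bene = rng.normal(size=(d, d))
W_mgran = rng.normal(size=(d, d))
bene = Hx @ W_bene      # what each tx position asks for
source = Hy @ W_mgran   # what each ty position offers

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Each position i of tx attends over all positions of ty; the attentive
# context c_i is the attention-weighted sum of Hy's hidden states.
scores = bene @ source.T            # (len_x, len_y)
alpha = softmax(scores, axis=1)
C = alpha @ Hy                      # (len_x, d)

# Attentive convolution: combine left neighbor, the word itself, right
# neighbor, and the attentive context into a higher-level hidden state.
W_l, W_0, W_r, W_c = (rng.normal(size=(d, d)) for _ in range(4))
b = np.zeros(d)

Hx_pad = np.vstack([np.zeros(d), Hx, np.zeros(d)])   # zero-pad both ends
H_new = np.tanh(
    Hx_pad[:-2] @ W_l + Hx_pad[1:-1] @ W_0 + Hx_pad[2:] @ W_r + C @ W_c + b
)
print(H_new.shape)   # (5, 8): one higher-level state per position of tx

For the intra-context case mentioned at the end of the summary, the same sketch applies with ty set to tx itself, so each word attends over the rest of its own sentence.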