If you want attention along the time dimension, then this part of your code looks correct to me:
activations = LSTM(units, return_sequences=True)(embedded)

# compute importance for each step
attention = Dense(1, activation='tanh')(activations)
attention = Flatten()(attention)
attention = Activation('softmax')(attention)
attention = RepeatVector(units)(attention)
attention = Permute([2, 1])(attention)

sent_representation = merge([activations, attention], mode='mul')
At this point you have already worked out the attention vector of shape (batch_size, max_length):
attention = Activation('softmax')(attention)
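To make the shape claim concrete, here is a minimal pure-Python sketch of what that softmax step does to a (batch_size, max_length) array of scores. The `softmax` helper and the `scores` values are made up for illustration and are not part of your code; the real layer works on tensors, but the arithmetic per sample should be the same.

```python
import math

def softmax(row):
    """Numerically stable softmax over one sequence of per-step scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores after Dense(1) + Flatten: batch_size=2, max_length=3
scores = [[0.1, 0.9, -0.3],
          [0.5, 0.5, 0.5]]

# Softmax runs per sample, across the time steps (the last axis)
attention = [softmax(row) for row in scores]
```

Each row of `attention` sums to 1, so it can be read as one weight per time step for that sample.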
As for the following line, I have never seen it before, so I can't say whether it is actually correct:
K.sum(xin, axis=-2)
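For what it is worth, if `xin` is the weighted activations with shape (batch_size, max_length, units), which is my assumption since the surrounding code is not shown, then axis=-2 is the time axis and the sum collapses it to (batch_size, units). A small pure-Python sketch of that reduction, with a hypothetical `sum_over_time` helper and made-up numbers:

```python
def sum_over_time(xin):
    """Sum a (batch_size, max_length, units) nested list over axis -2."""
    return [[sum(step[u] for step in sample) for u in range(len(sample[0]))]
            for sample in xin]

# Hypothetical weighted activations: batch_size=1, max_length=3, units=2
weighted = [[[0.25, 0.5],
             [1.0, 2.0],
             [0.5, 1.0]]]

# One vector of length `units` per sample
sent_representation = sum_over_time(weighted)
```

If `xin` has a different layout, axis=-2 would point at a different dimension, so it is worth checking the shape before relying on this.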