tensorflow2.0 study notes: LSTM (Long Short-Term Memory)

  1. In a plain RNN, information cannot actually propagate over long distances (long-range memory exists only in theory), so LSTM introduces a selectivity mechanism:

    selective output
    selective input
    selective forgetting
    
  2. Selectivity is implemented through gates, a gating mechanism:

    Vector A is the output of a Sigmoid activation, so its entries are gate weights between 0 and 1:
     A = [0.1, 0.9, 0.4, 0, 0.6]  <- Sigmoid: f(x) = 1/(1+exp(-x))
    Vector B is the incoming information vector:
     B = [13.8, 14, -7, -4, 30]
    A acts as the gate, B carries the information.
    A and B are multiplied elementwise (Hadamard product), not reduced to a scalar dot product:
    A * B = [1.38, 12.6, -2.8, 0, 18.0]
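
    A quick NumPy check of the gating arithmetic above (illustrative values only):

    import numpy as np

    A = np.array([0.1, 0.9, 0.4, 0.0, 0.6])       # gate weights (sigmoid outputs)
    B = np.array([13.8, 14.0, -7.0, -4.0, 30.0])  # information vector
    print(A * B)                                  # -> [1.38, 12.6, -2.8, -0., 18.]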
    
  3. LSTM

    Forget gate: e.g. when a new sentence introduces a new subject, the previous subject should be forgotten
    Input gate: decides whether to add new information to the state, e.g. the new subject's gender
    Output gate: decides what to expose, e.g. whether the verb should be singular or plural (the information passed on to the next step)
    State update: new state = previous state after the forget gate + candidate input after the input gate (see the sketch below)
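
    A single LSTM timestep in NumPy, to make the state update concrete (a minimal sketch; the weight layout and names are my own, not from the original notes):

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W, b):
        # W: (4, hidden, hidden+input), b: (4, hidden);
        # the four slices are forget, input, candidate, output
        z = np.concatenate([h_prev, x_t])     # previous output joined with current input
        f = sigmoid(W[0] @ z + b[0])          # forget gate
        i = sigmoid(W[1] @ z + b[1])          # input gate
        c_tilde = np.tanh(W[2] @ z + b[2])    # candidate new information
        o = sigmoid(W[3] @ z + b[3])          # output gate
        c_t = f * c_prev + i * c_tilde        # state update: keep part of old, admit part of new
        h_t = o * np.tanh(c_t)                # selective output
        return h_t, c_t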
    

1. LSTM

embedding_dim = 16
batch_size = 128
single_model = keras.models.Sequential([
    # 1. define the embedding matrix: [vocab_size, embedding_dim]
    # 2. each sample [1,2,3,4...] is mapped to max_length * embedding_dim
    # 3. a batch becomes batch_size * max_length * embedding_dim
    keras.layers.Embedding(vocab_size,embedding_dim,input_length = max_length),
    # batch_size * max_length * embedding_dim -> batch_size * 64 (units)
    keras.layers.LSTM(units = 64, return_sequences = False), # False: return only the last timestep's output
    
    keras.layers.Dense(64,activation='relu'),
    keras.layers.Dense(1,activation='sigmoid')
])

single_model.summary()
single_model.compile(optimizer = 'adam',loss = 'binary_crossentropy',metrics = ['accuracy'])
epochs = 30
history_single = single_model.fit(train_data,train_labels,epochs = epochs,
                    batch_size = batch_size,
                    validation_split = 0.2)
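
With return_sequences = False the LSTM returns only the final timestep's output. To stack recurrent layers, every layer except the last needs return_sequences = True so the next layer receives a full sequence; a hypothetical variant for illustration (not part of the original notes):

stacked_model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size, embedding_dim, input_length = max_length),
    # emit the output at every timestep so the next LSTM sees a sequence
    keras.layers.LSTM(units = 64, return_sequences = True),
    # final recurrent layer: only the last timestep's output is needed
    keras.layers.LSTM(units = 64, return_sequences = False),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])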

2. Bidirectional LSTM

embedding_dim = 16
batch_size = 128
bi_LSTM_model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size,embedding_dim,input_length = max_length),
    keras.layers.Bidirectional(
        keras.layers.LSTM(units = 32, return_sequences = False)),
    
    keras.layers.Dense(32,activation='relu'),
    keras.layers.Dense(1,activation='sigmoid')
])

bi_LSTM_model.summary()
bi_LSTM_model.compile(optimizer = 'adam',loss = 'binary_crossentropy',metrics = ['accuracy'])
epochs = 30
history = bi_LSTM_model.fit(train_data,train_labels,epochs = epochs,
                    batch_size = batch_size,
                    validation_split = 0.2)
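
Note that Bidirectional concatenates the forward and backward outputs, so LSTM(units = 32) here yields a 64-dimensional feature vector. To compare the two models, the learning curves can be plotted from the history objects (a sketch; plot_learning_curves is my own helper, not from the original post):

import matplotlib.pyplot as plt
import pandas as pd

def plot_learning_curves(history, title):
    # history.history holds per-epoch loss/accuracy for training and validation
    pd.DataFrame(history.history).plot(figsize = (8, 5))
    plt.title(title)
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()

plot_learning_curves(history_single, 'single LSTM')
plot_learning_curves(history, 'bidirectional LSTM')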

The data preprocessing is identical to that in tensorflow2.0学习笔记: RNN 循环神经网络, so it is omitted here. The only difference in the whole program is replacing keras.layers.SimpleRNN with keras.layers.LSTM, which is all it takes to obtain the LSTM models above.
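
For completeness, the omitted preprocessing looks roughly like this (IMDB sentiment data; the exact vocab_size and max_length values are assumptions chosen to match the referenced RNN notes):

import tensorflow as tf
from tensorflow import keras

vocab_size = 10000
max_length = 500

imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words = vocab_size)

# pad/truncate every review to the same length so batches can be formed
train_data = keras.preprocessing.sequence.pad_sequences(
    train_data, value = 0, padding = 'post', maxlen = max_length)
test_data = keras.preprocessing.sequence.pad_sequences(
    test_data, value = 0, padding = 'post', maxlen = max_length)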
