Recurrent Neural Network (2)
독학자78
2021. 1. 4. 08:36
2020/11/16 - [Machine Learning/Deep Learning] - Recurrent Neural Network (1)
Implementation
We implement a many-to-one recurrent neural network model.
First, load the IMDB dataset.
import numpy as np
from tensorflow.keras.datasets import imdb
(x_train, y_train), (x_test, y_test) = imdb.load_data(skip_top=20, num_words=100)
skip_top drops the 20 most frequent words, and num_words keeps only the num_words most frequent words. Words that are skipped or fall outside this vocabulary are replaced with the out-of-vocabulary index 2, which is why the sample below is mostly 2s.
The integer-encoded values that make up the data represent words.
x_train[0]
[2, 2, 22, 2, 43, 2, 2, 2, 2, 65, 2, 2, 66, 2, 2, 2, 36, 2, 2, 25, 2, 43, 2, 2, 50, 2, 2, 2, 35, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 39, 2, 2, 2, 2, 2, 2, 38, 2, 2, 2, 2, 50, 2, 2, 2, 2, 2, 2, 22, 2, 2, 2, 2, 2, 22, 71, 87, 2, 2, 43, 2, 38, 76, 2, 2, 2, 2, 22, 2, 2, 2, 2, 2, 2, 2, 2, 2, 62, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 66, 2, 33, 2, 2, 2, 2, 38, 2, 2, 25, 2, 51, 36, 2, 48, 25, 2, 33, 2, 22, 2, 2, 28, 77, 52, 2, 2, 2, 2, 82, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 36, 71, 43, 2, 2, 26, 2, 2, 46, 2, 2, 2, 2, 2, 2, 88, 2, 2, 2, 2, 98, 32, 2, 56, 26, 2, 2, 2, 2, 2, 2, 2, 22, 21, 2, 2, 26, 2, 2, 2, 30, 2, 2, 51, 36, 28, 2, 92, 25, 2, 2, 2, 65, 2, 38, 2, 88, 2, 2, 2, 2, 2, 2, 2, 2, 32, 2, 2, 2, 2, 2, 32]
Load the word index (vocabulary).
word_to_index = imdb.get_word_index()
index_to_word = { v: k for k, v in word_to_index.items() }
index_to_word
{34701: 'fawn',
52006: 'tsukino',
52007: 'nunnery',
16816: 'sonja',
63951: 'vani',
1408: 'woods',
16115: 'spiders',
2345: 'hanging',
2289: 'woody',
52008: 'trawling',
...
}
The index 2 marks words that are not in the vocabulary, so we drop it. (Indices 0, 1, and 2 are reserved for padding, the start-of-sequence marker, and out-of-vocabulary words, respectively.)
for i in range(len(x_train)):
    x_train[i] = [w for w in x_train[i] if w > 2]
for i in range(len(x_test)):
    x_test[i] = [w for w in x_test[i] if w > 2]
x_train[0]
[22, 43, 65, 66, 36, 25, 43, 50, 35, 39, 38, 50, 22, 22, 71, 87, 43, 38, 76, 22, 62, 66, 33, 38, 25, 51, 36, 48, 25, 33, 22, 28, 77, 52, 82, 36, 71, 43, 26, 46, 88, 98, 32, 56, 26, 22, 21, 26, 30, 51, 36, 28, 92, 25, 65, 38, 88, 32, 32]
Because the first three indices are reserved, you have to subtract 3 from each integer value to map it back to the word index.
for w in x_train[0]:
    print(index_to_word[w - 3], end=' ')
film just story really they you just there an from so there film film were great just so much film would really at so you what they if you at film have been good also they were just are out because them all up are film but are be what they have don't you story so because all all
Since the samples have different lengths, preprocessing is required.
print(len(x_train[0]))
print(len(x_train[1]))
59
32
We pad the sequences with zeros. (tf.keras.preprocessing.sequence.pad_sequences)
from tensorflow.keras.preprocessing import sequence
x_train = sequence.pad_sequences(x_train, 100)
x_test = sequence.pad_sequences(x_test, 100)
x_train.shape
(25000, 100)
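For reference, a tiny sketch of how pad_sequences behaves, assuming the Keras defaults padding='pre' and truncating='pre': short sequences are padded with zeros on the left, and long ones are truncated from the front.
# A small illustration of pad_sequences with the default 'pre' padding/truncating.
demo = [[7, 8, 9], [1, 2, 3, 4, 5, 6]]
print(sequence.pad_sequences(demo, maxlen=5))
# [[0 0 7 8 9]
#  [2 3 4 5 6]]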
One-hot encode the integer values. (tf.keras.utils.to_categorical)
from tensorflow.keras.utils import to_categorical
x_train = to_categorical(x_train)
x_test = to_categorical(x_test)
x_train.shape
(25000, 100, 100)
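As a quick sanity check (a sketch, not part of the original pipeline), to_categorical turns each integer into a one-hot vector, which is why the (25000, 100) array becomes (25000, 100, 100).
# A tiny to_categorical example: each integer becomes a one-hot row.
print(to_categorical([0, 2, 1], num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]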
With preprocessing done, split off 20% of the training data as validation data.
from sklearn.model_selection import train_test_split
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.2)
print(x_train.shape)
print(x_val.shape)
(20000, 100, 100)
(5000, 100, 100)
Reshape the target data to match the model's output shape.
y_train = y_train.reshape(-1, 1)
y_val = y_val.reshape(-1, 1)
y_test = y_test.reshape(-1, 1)
Define the neural network (using the TensorFlow 1.x graph API).
import numpy as np
import tensorflow as tf

class Model:
    def __init__(self, lr=1e-3):
        tf.reset_default_graph()
        with tf.name_scope('input'):
            self.x = tf.placeholder(tf.float32, [None, 100, 100])
            self.y = tf.placeholder(tf.float32, [None, 1])
        with tf.name_scope('layer'):
            # Many-to-one RNN: only the final hidden state is passed to the output layer.
            cell = tf.nn.rnn_cell.BasicRNNCell(32)
            outputs, state = tf.nn.dynamic_rnn(cell, self.x, dtype=tf.float32)
        with tf.name_scope('output'):
            self.output = tf.layers.dense(state, 1, tf.nn.sigmoid)
        with tf.name_scope('accuracy'):
            self.predict = tf.cast(tf.greater(self.output, tf.constant([0.5])), dtype=tf.float32)
            self.accuracy = tf.reduce_mean(tf.cast(tf.equal(self.y, self.predict), dtype=tf.float32))
        with tf.name_scope('loss'):
            cross_entropy = tf.keras.losses.binary_crossentropy(y_true=self.y, y_pred=self.output)
            self.loss = tf.reduce_mean(cross_entropy)
        with tf.name_scope('optimizer'):
            self.train_op = tf.train.AdamOptimizer(lr).minimize(self.loss)
        with tf.name_scope('summary'):
            # Epoch-level metrics are fed through placeholders and logged for TensorBoard.
            self.summary_loss = tf.placeholder(tf.float32)
            self.summary_accuracy = tf.placeholder(tf.float32)
            tf.summary.scalar('loss', self.summary_loss)
            tf.summary.scalar('accuracy', self.summary_accuracy)
            self.merge = tf.summary.merge_all()
            self.train_writer = tf.summary.FileWriter('./tmp/rnn_imdb/train', tf.get_default_graph())
            self.val_writer = tf.summary.FileWriter('./tmp/rnn_imdb/val', tf.get_default_graph())
        self.sess = tf.Session()
        self.sess.run(tf.global_variables_initializer())

    def write_summary(self, tl, ta, vl, va, epoch):
        train_summary = self.sess.run(self.merge, {self.summary_loss: tl, self.summary_accuracy: ta})
        val_summary = self.sess.run(self.merge, {self.summary_loss: vl, self.summary_accuracy: va})
        self.train_writer.add_summary(train_summary, epoch)
        self.val_writer.add_summary(val_summary, epoch)

    def train(self, x_train, y_train, x_val, y_val, epochs, batch_size=32):
        data_size = len(x_train)
        for e in range(epochs):
            t_l, t_a = [], []
            # Shuffle the training data at the start of every epoch.
            idx = np.random.permutation(np.arange(data_size))
            _x_train, _y_train = x_train[idx], y_train[idx]
            for i in range(0, data_size, batch_size):
                si, ei = i, i + batch_size
                if ei > data_size:
                    ei = data_size
                x_batch, y_batch = _x_train[si:ei, :, :], _y_train[si:ei]
                tl, ta, _ = self.sess.run([self.loss, self.accuracy, self.train_op], {self.x: x_batch, self.y: y_batch})
                t_l.append(tl)
                t_a.append(ta)
            vl, va = self.sess.run([self.loss, self.accuracy], {self.x: x_val, self.y: y_val})
            self.write_summary(np.mean(t_l), np.mean(t_a), vl, va, e)
            print('epoch:', e + 1, ' / loss:', np.mean(t_l), '/ acc:', np.mean(t_a), ' / val_loss:', vl, '/ val_acc:', va)

    def score(self, x, y):
        return self.sess.run(self.accuracy, {self.x: x, self.y: y})
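For comparison only (not the code used in this post), a minimal sketch of the same many-to-one architecture with the Keras API might look like this, assuming a SimpleRNN layer with 32 units in place of BasicRNNCell:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

keras_model = Sequential([
    SimpleRNN(32, input_shape=(100, 100)),  # returns only the last hidden state (many-to-one)
    Dense(1, activation='sigmoid'),         # binary sentiment output
])
keras_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])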
Train and test the model.
model = Model()
model.train(x_train, y_train, x_val, y_val, epochs=10)
model.score(x_test, y_test)
epoch: 1 / loss: 0.6221659 / acc: 0.6369 / val_loss: 0.5833103 / val_acc: 0.6894
epoch: 2 / loss: 0.57779557 / acc: 0.69685 / val_loss: 0.58450437 / val_acc: 0.6992
epoch: 3 / loss: 0.56940305 / acc: 0.7049 / val_loss: 0.577118 / val_acc: 0.6984
epoch: 4 / loss: 0.56505644 / acc: 0.70675 / val_loss: 0.5694209 / val_acc: 0.7142
epoch: 5 / loss: 0.5640247 / acc: 0.7107 / val_loss: 0.5725659 / val_acc: 0.698
epoch: 6 / loss: 0.56029844 / acc: 0.7155 / val_loss: 0.55980057 / val_acc: 0.7198
epoch: 7 / loss: 0.55862373 / acc: 0.7167 / val_loss: 0.563976 / val_acc: 0.7166
epoch: 8 / loss: 0.56085664 / acc: 0.71315 / val_loss: 0.559803 / val_acc: 0.7206
epoch: 9 / loss: 0.5551869 / acc: 0.71845 / val_loss: 0.56608874 / val_acc: 0.7114
epoch: 10 / loss: 0.55442244 / acc: 0.7194 / val_loss: 0.58167785 / val_acc: 0.7012
0.69952
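As a usage sketch (hypothetical, reusing the model object defined above), a single padded and one-hot encoded review can be classified through the predict tensor:
# Classify the first test review; model.predict is the thresholded sigmoid output.
sample = x_test[:1]  # shape (1, 100, 100)
pred = model.sess.run(model.predict, {model.x: sample})
print('positive' if pred[0, 0] == 1.0 else 'negative')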
The accuracy and loss curves over the epochs are shown below. (They can also be viewed in TensorBoard, since the summaries were written to ./tmp/rnn_imdb/train and ./tmp/rnn_imdb/val.)