Logistic Regression

2020. 11. 6. 16:57, Machine Learning/Deep Learning

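Logistic regression is an algorithm for binary classification, where each sample belongs to one of two categories such as 0 or 1. It predicts the probability that a sample belongs to a given category.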

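Sigmoid Function

The sigmoid function is an activation function whose output lies in the range $0<y<1$.

The sigmoid function is derived through the following steps.

odds > logit function > sigmoid function

The odds are the ratio of the probability of success $p$ to the probability of failure $1-p$. This quantity is often loosely called the odds ratio, the name used in the code here; strictly, an odds ratio is a ratio of two odds.

$\text{odds} = {p \over 1-p}$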

import numpy as np
import matplotlib.pyplot as plt

def odds_ratio(x):
    return x / (1-x)

x = np.arange(0, 1, 0.01)
y = odds_ratio(x)

plt.plot(x, y)
plt.title('Odds ratio')
plt.xlabel('p')
plt.show()

The odds increase sharply as $p$ approaches 1.

The logit function is the logarithm of the odds.

$\text{logit}(p)=\log\left({p \over 1-p}\right)$

import numpy as np
import matplotlib.pyplot as plt

def odds_ratio(x):
    return x / (1-x)

def logit(x):
    return np.log(odds_ratio(x))

x = np.arange(0.001, 1, 0.001)
y = logit(x)

plt.plot(x, y)
plt.title('Logit function')
plt.xlabel('p')
plt.show()

The logit is 0 when $p$ is 0.5, and it tends to negative infinity as $p$ approaches 0 and to positive infinity as $p$ approaches 1.

Setting the output of the logit function to $z$ and solving for $p$ gives the logistic function.

$\log\left({p \over 1-p}\right)=z$

$p=\text{sigmoid}(z)=\frac{1}{1+e^{-z}}$
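
The intermediate algebra between the two equations above can be filled in as follows: exponentiate both sides, then collect the terms in $p$.

$\log\left({p \over 1-p}\right)=z \;\Rightarrow\; {p \over 1-p}=e^{z}$

$p=e^{z}(1-p) \;\Rightarrow\; p\,(1+e^{z})=e^{z}$

$p={e^{z} \over 1+e^{z}}={1 \over 1+e^{-z}}$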

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1+np.exp(-x))

x = np.arange(-10, 10, 0.1)
y = sigmoid(x)

plt.plot(x, y)
plt.title('Logistic function')
plt.xlabel('z')
plt.show()

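The curve is the logit function with its axes swapped, that is, its inverse. Because it is S-shaped, the logistic function is also called the sigmoid function.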
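
As a quick numerical check of this inverse relationship, the sketch below maps a few probabilities through the logit and back through the sigmoid. The sample probabilities are arbitrary values chosen for illustration.

```python
import numpy as np

def logit(p):
    # Log-odds of a probability p in the open interval (0, 1)
    return np.log(p / (1 - p))

def sigmoid(z):
    # Maps any real z back into (0, 1)
    return 1 / (1 + np.exp(-z))

p = np.array([0.1, 0.25, 0.5, 0.75, 0.9])  # illustrative probabilities
z = logit(p)
p_back = sigmoid(z)

print('logit(p):', z)
print('round trip matches:', np.allclose(p, p_back))
```

The value at $p=0.5$ is 0, and the round trip recovers the original probabilities up to floating-point error.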

Splitting the sigmoid output at a threshold such as 0.5 turns the predicted probability into a binary classification.
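
A minimal sketch of this thresholding step is shown below. The helper name predict_label and the sample inputs are hypothetical; the strict comparison mirrors the greater-than test used in the implementation later in this post.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_label(z, threshold=0.5):
    # Class 1 when the predicted probability is strictly above the threshold
    return (sigmoid(z) > threshold).astype(int)

z = np.array([-2.0, -0.1, 0.0, 0.3, 4.0])  # illustrative pre-activation values
labels = predict_label(z)
print(labels)
```

With a threshold of 0.5 this is the same as testing whether $z$ is positive, since the sigmoid of 0 is exactly 0.5.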

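Loss Function

How can the loss function be defined for a binary classification problem?

In binary classification the labels take the categories 0 and 1, while the sigmoid output, which is the network's prediction, lies between 0 and 1. The loss function can be defined from these by using the properties of the logarithm.

The absolute value of the logarithm grows as its input approaches 0 and shrinks as its input approaches 1.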

Using this property, the loss for a single sample can be written as follows.

$\hat y = \text{sigmoid}(z)$

$\text{if } y=1 \rightarrow -\log(\hat y)$

$\text{if } y=0 \rightarrow -\log(1 - \hat y)$

When the label $y$ is 1, the error grows as the prediction $\hat y$ approaches 0 and shrinks as it approaches 1. When $y$ is 0, the error shrinks as $\hat y$ approaches 0 and grows as it approaches 1.
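
The behavior described above can be checked with a few numbers. The function name sample_loss and the probabilities 0.9 and 0.1 are illustrative choices, not part of the original post.

```python
import numpy as np

def sample_loss(y, y_hat):
    # -log(y_hat) when the label is 1, -log(1 - y_hat) when the label is 0
    return -np.log(y_hat) if y == 1 else -np.log(1 - y_hat)

for y, y_hat in [(1, 0.9), (1, 0.1), (0, 0.1), (0, 0.9)]:
    print('y =', y, ' y_hat =', y_hat, ' loss =', round(float(sample_loss(y, y_hat)), 4))
```

A confident correct prediction costs about 0.105, while a confident wrong prediction costs about 2.303.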

Combining the two cases and averaging over $n$ samples gives the loss function.

$L=-{1 \over n}\displaystyle \sum_{i=1}^n \Big[ y_i \log(\hat y_i)+(1-y_i) \log(1-\hat y_i) \Big]$
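
A minimal NumPy sketch of this loss follows. The clipping constant eps is an assumption added here to keep the logarithm finite when a prediction is exactly 0 or 1; it is not part of the formula above.

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    y = np.asarray(y, dtype=float)
    # Assumption: clip predictions away from 0 and 1 so log() stays finite
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]  # predictions close to the labels
bad = [0.1, 0.9, 0.2, 0.8]   # predictions far from the labels

print('good:', binary_cross_entropy(y, good))
print('bad :', binary_cross_entropy(y, bad))
```

Predictions close to the labels give a small loss, and predictions on the wrong side give a much larger one.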

Implementation

This section implements a logistic regression model.

Load the breast cancer dataset.

from sklearn.datasets import load_breast_cancer

breast_cancer = load_breast_cancer()

data = breast_cancer.data
target = breast_cancer.target
feature_names = breast_cancer.feature_names

print('data shape:', data.shape)
print('data sample:', data[0])
print('feature_names:', feature_names)

print('target shape:', target.shape)
print('target sample:', target[0], target[1], target[2])

data shape: (569, 30)
data sample: [1.799e+01 1.038e+01 1.228e+02 1.001e+03 1.184e-01 2.776e-01 3.001e-01
 1.471e-01 2.419e-01 7.871e-02 1.095e+00 9.053e-01 8.589e+00 1.534e+02
 6.399e-03 4.904e-02 5.373e-02 1.587e-02 3.003e-02 6.193e-03 2.538e+01
 1.733e+01 1.846e+02 2.019e+03 1.622e-01 6.656e-01 7.119e-01 2.654e-01
 4.601e-01 1.189e-01]
feature_names: ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
 'mean smoothness' 'mean compactness' 'mean concavity'
 'mean concave points' 'mean symmetry' 'mean fractal dimension'
 'radius error' 'texture error' 'perimeter error' 'area error'
 'smoothness error' 'compactness error' 'concavity error'
 'concave points error' 'symmetry error' 'fractal dimension error'
 'worst radius' 'worst texture' 'worst perimeter' 'worst area'
 'worst smoothness' 'worst compactness' 'worst concavity'
 'worst concave points' 'worst symmetry' 'worst fractal dimension']
target shape: (569,)
target sample: 0 0 0

Split the data into training and test sets at an 8:2 ratio.

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.2)
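
Because the split above is random, the resulting sets differ from run to run unless a seed is fixed (train_test_split accepts a random_state argument for this). As a rough illustration of what an 8:2 shuffled split does, here is a NumPy-only sketch; the function shuffle_split and the toy arrays are hypothetical, and this is not scikit-learn's implementation.

```python
import numpy as np

def shuffle_split(data, target, test_size=0.2, seed=0):
    # Shuffle the row indices with a fixed seed, then cut off the test portion
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_test = int(round(len(data) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return data[train_idx], data[test_idx], target[train_idx], target[test_idx]

# Toy data: 10 samples with 2 features each
data = np.arange(20).reshape(10, 2)
target = np.arange(10) % 2

x_tr, x_te, y_tr, y_te = shuffle_split(data, target)
print(x_tr.shape, x_te.shape, y_tr.shape, y_te.shape)
```

Every sample ends up in exactly one of the two sets, and features stay aligned with their labels because the same indices are used for both arrays.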

Check the scale of the data.

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 6))
ax = fig.add_subplot(111)

for i in range(x_train.shape[1]): # all features
    ax.scatter(x_train[:, i], y_train, s=10)
    
plt.title('Raw')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

Standardize the training and test data, using the mean and standard deviation of the training data for both.

train_mean = x_train.mean(axis=0)
train_std = x_train.std(axis=0)

x_train = (x_train - train_mean) / train_std
x_test = (x_test - train_mean) / train_std
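
The sketch below repeats the same standardization on synthetic data to show its effect. The random arrays are stand-ins for the real features, and the guard against a zero standard deviation is an added assumption for constant features.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for x_train / x_test: two features on very different scales
x_train = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(200, 2))
x_test = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(50, 2))

train_mean = x_train.mean(axis=0)
train_std = x_train.std(axis=0)
train_std[train_std == 0] = 1.0  # assumption: avoid division by zero for constant features

# The test set is scaled with the training statistics, never its own
x_train_s = (x_train - train_mean) / train_std
x_test_s = (x_test - train_mean) / train_std

print('train mean/std:', x_train_s.mean(axis=0), x_train_s.std(axis=0))
print('test  mean/std:', x_test_s.mean(axis=0), x_test_s.std(axis=0))
```

The training features come out with mean 0 and standard deviation 1. The test features are only approximately so, which is expected because they are scaled with the training statistics.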

The scale of the x-axis in the scatter plot has been adjusted.

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 6))
ax = fig.add_subplot(111)

for i in range(x_train.shape[1]): # all features
    ax.scatter(x_train[:, i], y_train, s=10)
    
plt.title('Standardization')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

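Define the neural network. The code uses the TensorFlow 1.x graph API (placeholders and sessions). The model stacks two 128-unit ReLU hidden layers before a single sigmoid output unit, so it is a small multilayer perceptron trained with the logistic loss rather than a plain logistic regression.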

import numpy as np
import tensorflow as tf

class Model:
    def __init__(self, lr=1e-3):
        tf.reset_default_graph()
        
        with tf.name_scope('input'):
            self.x = tf.placeholder(tf.float32, [None, 30])
            self.y = tf.placeholder(tf.float32, [None, 1])

        with tf.name_scope('layer'):
            fc = tf.layers.dense(self.x, 128, tf.nn.relu)
            fc2 = tf.layers.dense(fc, 128, tf.nn.relu)
            fc3 = tf.layers.dense(fc2, 1)
            
        with tf.name_scope('output'):
            self.output = tf.nn.sigmoid(fc3)

        with tf.name_scope('accuracy'):
            self.predict = tf.cast(tf.greater(self.output, tf.constant([0.5])), dtype=tf.float32)
            self.accuracy = tf.reduce_mean(tf.cast(tf.equal(self.y, self.predict), dtype=tf.float32))    
        
        with tf.name_scope('loss'):
            self.loss = -tf.reduce_mean(self.y * tf.log(self.output) + (1-self.y) * tf.log(1-self.output))

        with tf.name_scope('optimizer'):
            self.train_op = tf.train.GradientDescentOptimizer(lr).minimize(self.loss)

        with tf.name_scope('summary'):
            self.summary_loss = tf.placeholder(tf.float32)
            self.summary_accuracy = tf.placeholder(tf.float32)
            
            tf.summary.scalar('loss', self.summary_loss)
            tf.summary.scalar('accuracy', self.summary_accuracy)
            
            self.merge = tf.summary.merge_all()

        self.writer = tf.summary.FileWriter('./tmp/logistic-regression_breast_cancer_1', tf.get_default_graph())
        
        self.sess = tf.Session()
        
        self.sess.run(tf.global_variables_initializer())
                               
    def train(self, x, y, epochs, batch_size=32):
        data_size = len(x)
        for e in range(epochs):
            epoch_loss, epoch_acc = [], []
            
            idx = np.random.permutation(np.arange(data_size))
            x_train, y_train = x[idx], y[idx]
            for i in range(0, data_size, batch_size):
                si, ei = i, i + batch_size
                if ei > data_size:
                    ei = data_size

                x_batch, y_batch = x_train[si:ei, :], y_train[si:ei]
                
                loss, acc, _ = self.sess.run([self.loss, self.accuracy, self.train_op], {self.x: x_batch, self.y: y_batch})
                
                epoch_loss.append(loss)
                epoch_acc.append(acc)
            
            summary = self.sess.run(self.merge, {self.summary_loss: np.mean(epoch_loss), self.summary_accuracy: np.mean(epoch_acc)})
            self.writer.add_summary(summary, e)
            
            print('epoch:', e + 1, ' / loss:', np.mean(epoch_loss), '/ accuracy:', np.mean(epoch_acc))
    
    def score(self, x, y):
        return self.sess.run(self.accuracy, {self.x: x, self.y: y})
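
The loss in the class above takes the logarithm of the sigmoid output directly, which can become infinite or NaN if the output saturates to exactly 0 or 1. One common alternative is to compute the same loss from the pre-activation value $z$. The NumPy sketch below shows that form next to the direct one; the function names and sample values are illustrative only.

```python
import numpy as np

def stable_bce_from_logits(z, y):
    # Algebraically equal to the cross-entropy on sigmoid(z),
    # written so that exp() never receives a large positive argument
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z))))

def naive_bce(z, y):
    # Direct form: sigmoid first, then logarithms
    y_hat = 1 / (1 + np.exp(-z))
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

z = np.array([-3.0, -0.5, 0.0, 2.0])
y = np.array([0.0, 1.0, 1.0, 1.0])
print(stable_bce_from_logits(z, y), naive_bce(z, y))

# An extreme logit with the wrong label still gives a finite loss
print(stable_bce_from_logits(np.array([800.0]), np.array([0.0])))
```

For moderate values the two forms agree. For the extreme logit the stable form stays finite, whereas the direct form would evaluate the logarithm of 0.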

Training is performed with mini-batch gradient descent. The train method of the class above is shown again here.

def train(self, x, y, epochs, batch_size=32):
    data_size = len(x)
    for e in range(epochs):
        epoch_loss, epoch_acc = [], []

        idx = np.random.permutation(np.arange(data_size))
        x_train, y_train = x[idx], y[idx]
        for i in range(0, data_size, batch_size):
            si, ei = i, i + batch_size
            if ei > data_size:
                ei = data_size

            x_batch, y_batch = x_train[si:ei, :], y_train[si:ei]

            loss, acc, _ = self.sess.run([self.loss, self.accuracy, self.train_op], {self.x: x_batch, self.y: y_batch})

            epoch_loss.append(loss)
            epoch_acc.append(acc)

        summary = self.sess.run(self.merge, {self.summary_loss: np.mean(epoch_loss), self.summary_accuracy: np.mean(epoch_acc)})
        self.writer.add_summary(summary, e)

        print('epoch:', e + 1, ' / loss:', np.mean(epoch_loss), '/ accuracy:', np.mean(epoch_acc))
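
For comparison, the same mini-batch loop can be sketched in plain NumPy for a logistic regression without hidden layers. This is an illustration on synthetic data under assumed hyperparameters (lr, epochs, batch_size, seed), not the model trained in this post.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def fit_logistic(x, y, lr=0.1, epochs=100, batch_size=32, seed=0):
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        idx = rng.permutation(n)  # reshuffle the data every epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]  # slicing handles the last short batch
            xb, yb = x[batch], y[batch]
            y_hat = sigmoid(xb @ w + b)
            # Gradient of the mean cross-entropy loss with respect to w and b
            grad_w = xb.T @ (y_hat - yb) / len(batch)
            grad_b = np.mean(y_hat - yb)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Synthetic, linearly separable data (illustrative only)
rng = np.random.default_rng(1)
x = rng.normal(size=(400, 2))
y = (x[:, 0] + 2 * x[:, 1] > 0).astype(float)

w, b = fit_logistic(x, y)
accuracy = np.mean((sigmoid(x @ w + b) > 0.5) == y)
print('w:', w, 'b:', b, 'accuracy:', accuracy)
```

The update uses the fact that the gradient of the cross-entropy loss with respect to $z$ is $\hat y - y$, so no explicit derivative of the sigmoid is needed.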

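Train and test the model.

model = Model()
# The placeholder self.y has shape [None, 1], so the 1-D targets are reshaped to column vectors
model.train(x_train, y_train.reshape(-1, 1), epochs=200)
model.score(x_test, y_test.reshape(-1, 1))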

...

epoch: 190  / loss: 0.1256168 / accuracy: 0.96875
epoch: 191  / loss: 0.12216289 / accuracy: 0.96875
epoch: 192  / loss: 0.1224106 / accuracy: 0.96875
epoch: 193  / loss: 0.12211617 / accuracy: 0.96875
epoch: 194  / loss: 0.12102139 / accuracy: 0.96875
epoch: 195  / loss: 0.12139161 / accuracy: 0.96875
epoch: 196  / loss: 0.12274469 / accuracy: 0.96875
epoch: 197  / loss: 0.12596376 / accuracy: 0.96339285
epoch: 198  / loss: 0.12571438 / accuracy: 0.97083336
epoch: 199  / loss: 0.1223996 / accuracy: 0.97291666
epoch: 200  / loss: 0.12401686 / accuracy: 0.97291666

0.98245615

The accuracy and loss per epoch are shown in the following graphs.
