OpenCV Tutorial: A Simple Guide to Using a Mask R-CNN Model

Mask R-CNN is one of the most widely used approaches to object detection and instance segmentation in computer vision.
It builds on Faster R-CNN by adding a branch that predicts a pixel-level mask for each detected object.
This tutorial shows how to use a Mask R-CNN model with Python and OpenCV.

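1. Mask R-CNN Overview

Mask R-CNN goes beyond plain object detection: it describes the extent of each object at pixel resolution.
For every candidate region, the model predicts a class, a bounding box, and a binary mask, so each object instance receives its own mask.
Publicly available Mask R-CNN weights are commonly trained on the COCO (Common Objects in Context) dataset, which covers 80 object categories.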
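
To make the idea of one mask per object instance concrete, the following sketch builds two toy boolean masks by hand and derives each instance's pixel area and tight bounding box from them. The (height, width, instance) array layout and the (y1, x1, y2, x2) box order are assumptions chosen to match the way 'rois' is indexed later in this tutorial, and the helper name boxes_from_masks is hypothetical.

```python
import numpy as np

def boxes_from_masks(masks):
    """Return (N, 4) boxes as (y1, x1, y2, x2) for an (H, W, N) boolean mask stack."""
    n = masks.shape[-1]
    boxes = np.zeros((n, 4), dtype=np.int32)
    for i in range(n):
        ys, xs = np.where(masks[:, :, i])
        if ys.size == 0:
            continue  # empty mask: leave the box as zeros
        # y2 and x2 are exclusive, so add 1 to the last occupied row and column
        boxes[i] = [ys.min(), xs.min(), ys.max() + 1, xs.max() + 1]
    return boxes

# Toy data (not model output): two instances on a 6x8 canvas
masks = np.zeros((6, 8, 2), dtype=bool)
masks[1:4, 2:5, 0] = True   # instance 0 covers rows 1-3, columns 2-4
masks[4:6, 5:8, 1] = True   # instance 1 covers rows 4-5, columns 5-7

areas = masks.sum(axis=(0, 1))   # number of pixels per instance
boxes = boxes_from_masks(masks)
print(areas)
print(boxes)
```

Each instance keeps its own mask channel, which is what separates instance segmentation from assigning a single class label to every pixel.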

2. Downloading and Setting Up the Mask R-CNN Model

To use Mask R-CNN, you first need to download a pretrained model file.
Mask R-CNN implementations typically depend on either TensorFlow or PyTorch; this tutorial uses a TensorFlow-based setup.


# Install the required libraries
!pip install tensorflow opencv-python

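Next, download the Mask R-CNN model. Pretrained models are available from sources such as the TensorFlow Model Zoo; the command below fetches the COCO-pretrained weights published on the matterport/Mask_RCNN GitHub releases page.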


# Download the Mask R-CNN model
MODEL_URL = "https://github.com/matterport/Mask_RCNN/releases/download/v1.0/mask_rcnn_coco.h5"
!wget {MODEL_URL} -O mask_rcnn_coco.h5
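
A failed or interrupted download can leave behind an HTML error page or a truncated file under the expected name. As a quick sanity check, the sketch below tests whether a file begins with the 8-byte HDF5 signature. The helper name looks_like_hdf5 is hypothetical, and the check only looks at offset 0, so it confirms the container format and nothing about the contents.

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file

def looks_like_hdf5(path):
    """Return True if the file exists and starts with the HDF5 signature."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        return f.read(8) == HDF5_SIGNATURE

# Assumes the file name used by the download command above
print(looks_like_hdf5("mask_rcnn_coco.h5"))
```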

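3. Implementing Mask R-CNN with OpenCV

Now let's put Mask R-CNN to work with OpenCV.
OpenCV is a widely used library for image and video processing, and it makes the preprocessing and post-processing steps around Mask R-CNN inference straightforward.

3.1 Image Preprocessing

Mask R-CNN expects an input image of a fixed size, so the original image must be scaled to fit that size while preserving its aspect ratio.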


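import cv2
import numpy as np

# Image preprocessing function
def preprocess_image(image, target_size=(1024, 1024)):
    # Resize while preserving the aspect ratio
    h, w, _ = image.shape
    ratio = min(target_size[0] / h, target_size[1] / w)
    new_size = (int(w * ratio), int(h * ratio))  # cv2.resize expects (width, height)
    image_resized = cv2.resize(image, new_size)

    # Zero-pad the bottom and right edges so the output is exactly target_size
    pad_h = target_size[0] - image_resized.shape[0]
    pad_w = target_size[1] - image_resized.shape[1]
    image_padded = cv2.copyMakeBorder(image_resized, 0, pad_h, 0, pad_w,
                                      cv2.BORDER_CONSTANT, value=(0, 0, 0))

    # Normalize pixel values to the range [0, 1]
    image_normalized = image_padded / 255.0
    return image_normalized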
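
Because resizing changes the coordinate system, boxes predicted on the resized image have to be divided by the same ratio before they are drawn on the original image. The sketch below illustrates that mapping. It assumes the resized content starts at the top-left corner (so no offset is needed) and that boxes use the (y1, x1, y2, x2) order; the helper names resize_ratio and boxes_to_original are hypothetical.

```python
import numpy as np

def resize_ratio(orig_shape, target_size=(1024, 1024)):
    """Scale factor used when fitting an image of orig_shape into target_size."""
    h, w = orig_shape[:2]
    return min(target_size[0] / h, target_size[1] / w)

def boxes_to_original(boxes, ratio, orig_shape):
    """Map (y1, x1, y2, x2) boxes from resized coordinates back to the original image."""
    h, w = orig_shape[:2]
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4) / ratio
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, h)  # clamp y to the image height
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, w)  # clamp x to the image width
    return np.rint(boxes).astype(np.int32)

# Toy example: a 512x768 image and one made-up box in resized coordinates
orig_shape = (512, 768, 3)
ratio = resize_ratio(orig_shape)
boxes_resized = np.array([[100, 200, 400, 600]])
boxes_original = boxes_to_original(boxes_resized, ratio, orig_shape)
print(boxes_original)
```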

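3.2 Loading the Mask R-CNN Model

Load the pretrained Mask R-CNN model with TensorFlow. Note that load_model needs a file that stores the full model (architecture and weights); if the downloaded .h5 file contains weights only, the network must first be built in code and the weights loaded into it.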


import tensorflow as tf

# Load the Mask R-CNN model
model = tf.keras.models.load_model('mask_rcnn_coco.h5', compile=False)
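
The weights file from section 2 comes from the matterport/Mask_RCNN releases, and that project builds the network in code and then loads the weights into it. As an alternative to load_model, here is a minimal sketch under the assumption that the mrcnn package from that repository is installed. That implementation was written for TensorFlow 1.x with standalone Keras, so it may not run on a current TensorFlow install without changes. The function name load_matterport_model and the config name are hypothetical.

```python
def load_matterport_model(weights_path, model_dir="logs"):
    """Build a matterport Mask R-CNN in inference mode and load COCO weights into it."""
    # Imported lazily so this sketch can be defined even when mrcnn is not installed
    from mrcnn.config import Config
    import mrcnn.model as modellib

    class InferenceConfig(Config):
        NAME = "coco_inference"   # hypothetical label for this configuration
        NUM_CLASSES = 1 + 80      # background + 80 COCO categories
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1        # run detection on one image at a time

    model = modellib.MaskRCNN(mode="inference", config=InferenceConfig(),
                              model_dir=model_dir)
    model.load_weights(weights_path, by_name=True)
    return model

# Usage (requires the mrcnn package and the downloaded weights):
# model = load_matterport_model("mask_rcnn_coco.h5")
# result = model.detect([rgb_image], verbose=0)[0]
# result is a dict with 'rois', 'class_ids', 'scores', and 'masks'
```

A model loaded this way performs its own resizing and normalization inside detect, so it takes the raw RGB image rather than a preprocessed one.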

3.3 Running Inference

Feed the preprocessed image to the model to perform object detection and segmentation.


def detect_objects(image):
    # Preprocess the image
    preprocessed_image = preprocess_image(image)

    # Run the model on a batch containing a single image
    detections = model.predict(np.expand_dims(preprocessed_image, axis=0))

    return detections
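
Keras-style predict calls expect a leading batch axis, which is why the image is wrapped with np.expand_dims before inference. The short sketch below uses an all-zero array as a stand-in for a preprocessed image and shows how the shape changes for one image and for a batch of two.

```python
import numpy as np

image = np.zeros((1024, 1024, 3), dtype=np.float32)  # stand-in for a preprocessed image

single = np.expand_dims(image, axis=0)   # add a leading batch axis for one image
pair = np.stack([image, image], axis=0)  # stack two same-sized images into one batch

print(image.shape)
print(single.shape)
print(pair.shape)
```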

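3.4 Post-processing the Results

Interpret the model output and visualize it. The code below assumes that detections is a dictionary whose 'rois' entry holds boxes as (y1, x1, y2, x2) and whose 'scores' entry holds the matching confidence values; adapt the indexing if your model returns a different structure.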


def visualize_results(image, detections):
    for i in range(len(detections['rois'])):
        roi = detections['rois'][i]  # (y1, x1, y2, x2)
        score = detections['scores'][i]
        if score > 0.5:  # confidence threshold
            # Draw the bounding box; OpenCV points are given as (x, y)
            cv2.rectangle(image, (int(roi[1]), int(roi[0])),
                          (int(roi[3]), int(roi[2])), (255, 0, 0), 2)
    cv2.imshow('Result', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
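
The function above draws only bounding boxes, while the distinguishing output of Mask R-CNN is the per-instance mask. The sketch below shows one way to blend masks into an image with plain NumPy. It assumes an (H, W, N) boolean mask stack aligned with the image, such as a 'masks' entry in the detections; that key is not used in the code above and is an assumption here. The helper name overlay_masks is hypothetical.

```python
import numpy as np

def overlay_masks(image, masks, scores, color=(255, 0, 0), alpha=0.5, threshold=0.5):
    """Alpha-blend each sufficiently confident instance mask into a copy of the image."""
    out = image.astype(np.float32)              # astype returns a copy
    color = np.array(color, dtype=np.float32)   # channel order follows the image (BGR in OpenCV)
    for i in range(masks.shape[-1]):
        if scores[i] <= threshold:
            continue  # skip low-confidence instances
        m = masks[:, :, i].astype(bool)
        out[m] = (1.0 - alpha) * out[m] + alpha * color
    return out.astype(np.uint8)

# Toy data (not model output): a black 4x4 image with two instance masks
image = np.zeros((4, 4, 3), dtype=np.uint8)
masks = np.zeros((4, 4, 2), dtype=bool)
masks[0:2, 0:2, 0] = True   # instance 0, confident
masks[2:4, 2:4, 1] = True   # instance 1, below the threshold
scores = [0.9, 0.3]

blended = overlay_masks(image, masks, scores)
print(blended[0, 0], blended[3, 3])
```

The returned array is a regular uint8 image, so it can be passed to cv2.imshow or cv2.rectangle in the same way as the original.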

4. Complete Code

The complete code below combines all of the steps above.


import cv2
import numpy as np
import tensorflow as tf

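def preprocess_image(image, target_size=(1024, 1024)):
    h, w, _ = image.shape
    ratio = min(target_size[0] / h, target_size[1] / w)
    new_size = (int(w * ratio), int(h * ratio))
    image_resized = cv2.resize(image, new_size)
    # Zero-pad the bottom and right edges so the output is exactly target_size
    pad_h = target_size[0] - image_resized.shape[0]
    pad_w = target_size[1] - image_resized.shape[1]
    image_padded = cv2.copyMakeBorder(image_resized, 0, pad_h, 0, pad_w,
                                      cv2.BORDER_CONSTANT, value=(0, 0, 0))
    image_normalized = image_padded / 255.0
    return image_normalized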

def detect_objects(image):
    preprocessed_image = preprocess_image(image)
    detections = model.predict(np.expand_dims(preprocessed_image, axis=0))
    return detections

def visualize_results(image, detections):
    for i in range(len(detections['rois'])):
        roi = detections['rois'][i]
        score = detections['scores'][i]
        if score > 0.5:
            cv2.rectangle(image, (int(roi[1]), int(roi[0])), 
                          (int(roi[3]), int(roi[2])), (255, 0, 0), 2)
    cv2.imshow('Result', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

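# Main entry point
if __name__ == "__main__":
    model = tf.keras.models.load_model('mask_rcnn_coco.h5', compile=False)
    image = cv2.imread('input_image.jpg')  # image to test
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError('input_image.jpg could not be read')
    detections = detect_objects(image)
    visualize_results(image, detections)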

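5. Conclusion

This tutorial showed how to perform object detection and segmentation with OpenCV and Mask R-CNN.
Mask R-CNN is well suited to applications that need both the location and the pixel-level shape of each object.
Building on this code, you can extend it to other datasets or to custom-trained models to handle a wider range of detection tasks.
We hope it helps you keep pace with the ongoing advances in computer vision.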