
Understanding MMCV: Data Process

by Lizardee 2024. 6. 21.

※ MMCV provides the following functionality:

  • Data Process
    • mmcv.image
    • mmcv.video
  • Visualization
    • mmcv.visualization
  • Data Transformation
    • mmcv.transforms
  • Various CNN Architectures
    • mmcv.cnn
  • High-quality Implementation of Common CUDA Ops
    • mmcv.ops

 


mmcv.image
I/O(Input/Output)

▶ Read/Write

imread Read an image.
imwrite Write image to file.
import mmcv

img = mmcv.imread('test.jpg')  # imread: read an image from a file
mmcv.imwrite(img, 'out.jpg')   # imwrite: write an image to a file

 

imfrombytes Read an image from bytes.
# 'rb': open the file read-only in binary mode

with open('test.jpg', 'rb') as f:  # open 'test.jpg' read-only in binary mode,
    data = f.read()                # and store its raw bytes in 'data'
img = mmcv.imfrombytes(data)       # decode the bytes in 'data' into an image

 

 

 

▶ Show

imshow Show an image.
import numpy as np

# imshow accepts either a file path or an ndarray
mmcv.imshow('tests/data/color.jpg')

# show 10 random images one after another
for i in range(10):
    # generate a random 100x100 3-channel image
    img = np.random.randint(256, size=(100, 100, 3), dtype=np.uint8)
    # display it for 200 ms in a window named 'test image'
    mmcv.imshow(img, win_name='test image', wait_time=200)

 

 

Color Space
bgr2gray Convert a BGR image to grayscale image.
bgr2hls Convert a BGR image to HLS.
bgr2hsv Convert a BGR image to HSV.
bgr2rgb Convert a BGR image to RGB.
bgr2ycbcr Convert a BGR image to YCbCr image.
gray2bgr Convert a grayscale image to BGR image.
gray2rgb Convert a grayscale image to RGB image.
hls2bgr Convert an HLS image to BGR.
hsv2bgr Convert an HSV image to BGR.
imconvert Convert an image from the src colorspace to dst colorspace.
rgb2bgr Convert an RGB image to BGR.
rgb2gray Convert an RGB image to grayscale image.
rgb2ycbcr Convert an RGB image to YCbCr image.
ycbcr2bgr Convert a YCbCr image to BGR image.
ycbcr2rgb Convert a YCbCr image to RGB image.
img = mmcv.imread('tests/data/color.jpg')
img1 = mmcv.bgr2rgb(img)
img2 = mmcv.rgb2gray(img1)
img3 = mmcv.bgr2hsv(img)
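
To make the conversions above more concrete, here is a minimal NumPy-only sketch of what a BGR-to-RGB swap and an RGB-to-gray conversion compute. It assumes the BT.601 luma weights (0.299, 0.587, 0.114) that OpenCV documents for RGB-to-gray. The helper names `bgr_to_rgb` and `rgb_to_gray` are hypothetical and are not part of MMCV.

```python
import numpy as np


def bgr_to_rgb(img):
    """Reverse the channel axis: (B, G, R) -> (R, G, B)."""
    return img[..., ::-1]


def rgb_to_gray(img):
    """Weighted sum of R, G, B using BT.601 luma weights (an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = img.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)


# a single pure-blue pixel stored in BGR order
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)    # the same pixel in RGB order
gray = rgb_to_gray(rgb)  # blue contributes only 0.114 of its value
```

Blue is the darkest primary under these weights, which is why mixing up BGR and RGB before a gray conversion changes the result.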

 

 

Geometric

▶ Resize

imrescale Resize image while keeping the aspect ratio.
imresize Resize image to a given size.
imresize_like Resize image to the same size as a given image.
imresize_to_multiple Resize image according to a given size or scale factor, then round the result up to the nearest size that is divisible by the divisor.
# resize by a ratio
mmcv.imrescale(img, 0.5)

# resize so that the long edge is no longer than 1000 and the short edge
# is no longer than 800, without changing the aspect ratio
mmcv.imrescale(img, (1000, 800))

# resize to a given size
mmcv.imresize(img, (1000, 600), return_scale=True)   # True: returns (resized_img, w_scale, h_scale)

# resize to the same size as another image (dst_img is assumed to be an already-loaded image)
mmcv.imresize_like(img, dst_img, return_scale=False)  # False: returns only the resized image
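
The tuple form of imrescale is easier to follow with the arithmetic written out. The sketch below is a pure-Python approximation of how the target size could be derived. The function name `sketch_rescale_size` is hypothetical, and the rounding rule (add 0.5, then truncate) is my understanding of MMCV's behavior rather than a verified copy of it.

```python
def sketch_rescale_size(old_size, scale):
    """Return the new (width, height) that keeps the aspect ratio.

    old_size: (width, height) of the source image.
    scale: a positive number (scaling factor), or a 2-tuple giving the
           maximum long-edge and short-edge lengths in either order.
    """
    w, h = old_size
    if isinstance(scale, (int, float)):
        factor = float(scale)
    else:
        max_long_edge, max_short_edge = max(scale), min(scale)
        # pick the largest factor that satisfies both edge limits
        factor = min(max_long_edge / max(w, h), max_short_edge / min(w, h))
    # round to the nearest integer pixel count
    return int(w * factor + 0.5), int(h * factor + 0.5)


print(sketch_rescale_size((1600, 1200), 0.5))          # (800, 600)
print(sketch_rescale_size((1600, 1200), (1000, 800)))  # (1000, 750)
```

In the second call the long-edge limit is the binding one (1000 / 1600 = 0.625), so the short edge ends up at 750 rather than 800.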

 

▶ Rotate

imrotate Rotate an image.
img = mmcv.imread('tests/data/color.jpg')

# rotate the image clockwise by 30 degrees.
img_ = mmcv.imrotate(img, 30)

# rotate the image counterclockwise by 90 degrees.
img_ = mmcv.imrotate(img, -90)

# rotate the image clockwise by 30 degrees, and rescale it by 1.5x at the same time.
img_ = mmcv.imrotate(img, 30, scale=1.5)

# rotate the image clockwise by 30 degrees, with (100, 100) as the center.
img_ = mmcv.imrotate(img, 30, center=(100, 100))

# rotate the image clockwise by 30 degrees, and extend the image size.
img_ = mmcv.imrotate(img, 30, auto_bound=True)
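
To see how much the canvas has to grow when `auto_bound=True`, the following pure-Python sketch computes the smallest axis-aligned box that contains a rotated image. It assumes no additional scaling (scale of 1). The helper name `rotated_bounds` is hypothetical and not an MMCV function.

```python
import math


def rotated_bounds(width, height, angle_deg):
    """Size (width, height) of the smallest box containing the rotated image."""
    theta = math.radians(angle_deg)
    cos, sin = abs(math.cos(theta)), abs(math.sin(theta))
    new_w = height * sin + width * cos
    new_h = height * cos + width * sin
    return round(new_w), round(new_h)


print(rotated_bounds(400, 300, 30))  # (496, 460)
print(rotated_bounds(400, 300, 90))  # (300, 400)
```

Because only the absolute values of sine and cosine are used, the result is the same for clockwise and counterclockwise rotation by the same angle.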

 

▶ Flip

imflip Flip an image horizontally or vertically.
img = mmcv.imread('tests/data/color.jpg')

# flip the image horizontally
mmcv.imflip(img)

# flip the image vertically
mmcv.imflip(img, direction='vertical')
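
Since an image is just an array indexed as (row, column), flipping amounts to reversing one axis. This small NumPy sketch shows the idea on a 2x3 array. It illustrates the concept only and is not MMCV's own code.

```python
import numpy as np

img = np.arange(6).reshape(2, 3)  # rows: [0 1 2] and [3 4 5]

horizontal = np.flip(img, axis=1)     # mirror left-right (reverse columns)
vertical = np.flip(img, axis=0)       # mirror top-bottom (reverse rows)
diagonal = np.flip(img, axis=(0, 1))  # reverse both axes at once

print(horizontal.tolist())  # [[2, 1, 0], [5, 4, 3]]
print(vertical.tolist())    # [[3, 4, 5], [0, 1, 2]]
```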

 

▶ Crop

imcrop Crop image patches.
import mmcv
import numpy as np

img = mmcv.imread('tests/data/color.jpg')

# crop the region (10, 10, 100, 120)
bboxes = np.array([10, 10, 100, 120])
patch = mmcv.imcrop(img, bboxes)

# crop two regions (10, 10, 100, 120) and (0, 0, 50, 50)
bboxes = np.array([[10, 10, 100, 120], [0, 0, 50, 50]])
patches = mmcv.imcrop(img, bboxes)

# crop two regions, and rescale the patches by 1.2x
patches = mmcv.imcrop(img, bboxes, scale=1.2)
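
A box given as (x1, y1, x2, y2) maps to array slicing with rows first. The sketch below assumes both corners are inclusive and that the box is clipped to the image, which is my understanding of imcrop's convention. The helper name `crop_inclusive` is hypothetical.

```python
import numpy as np


def crop_inclusive(img, bbox):
    """Crop (x1, y1, x2, y2) with both corners inclusive, clipped to the image."""
    h, w = img.shape[:2]
    x1, y1, x2, y2 = (int(v) for v in bbox)
    x1, x2 = max(x1, 0), min(x2, w - 1)
    y1, y2 = max(y1, 0), min(y2, h - 1)
    # rows are indexed by y, columns by x
    return img[y1:y2 + 1, x1:x2 + 1]


img = np.zeros((300, 400, 3), dtype=np.uint8)  # height 300, width 400
patch = crop_inclusive(img, (10, 10, 100, 120))
print(patch.shape)  # (111, 91, 3)
```

Under the inclusive convention the patch is 91 pixels wide and 111 pixels tall, one more in each direction than a plain `x2 - x1` and `y2 - y1` would suggest.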

 

▶ Padding

impad Pad the given image to a certain shape or pad on all sides with specified padding mode and padding value.
impad_to_multiple Pad an image so that each edge is a multiple of some number.
img = mmcv.imread('tests/data/color.jpg')

# pad the image to (1000, 1200) with all zeros
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=0)

# pad the image to (1000, 1200) with different values for three channels.
img_ = mmcv.impad(img, shape=(1000, 1200), pad_val=(100, 50, 200))

# pad the image on the left, top, right, bottom borders with all zeros
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=0)

# pad the image on the left, top, right, bottom borders with different values
# for three channels.
img_ = mmcv.impad(img, padding=(10, 20, 30, 40), pad_val=(100, 50, 200))

# pad an image so that each edge is a multiple of some value.
img_ = mmcv.impad_to_multiple(img, 32)
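
Padding to a multiple comes down to rounding each edge up to the next multiple of the divisor. The pure-Python sketch below computes only the target shape. The helper name `padded_shape` is hypothetical and not an MMCV function.

```python
import math


def padded_shape(height, width, divisor):
    """Smallest (height, width) >= the input whose edges are multiples of divisor."""
    pad_h = math.ceil(height / divisor) * divisor
    pad_w = math.ceil(width / divisor) * divisor
    return pad_h, pad_w


print(padded_shape(300, 400, 32))  # (320, 416)
print(padded_shape(256, 512, 32))  # (256, 512), already aligned
```

This kind of alignment is commonly needed when a network downsamples its input several times, so that every intermediate feature map has an integer size.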

 

 

 

https://mmcv.readthedocs.io/en/latest/api/image.html

https://mmcv.readthedocs.io/en/latest/understand_mmcv/data_process.html


mmcv.video
I/O(Input/Output)
VideoReader Video class with similar usage to a list object.
video = mmcv.VideoReader('test.mp4')

# obtain basic information
print(len(video))
print(video.width, video.height, video.resolution, video.fps)

# iterate over all frames
for frame in video:
    print(frame.shape)

# read the next frame
img = video.read()

# read a frame by index
img = video[100]

# read some frames
img = video[5:10]
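
"Similar usage to a list object" means the class supports `len()`, indexing, slicing, and iteration. The sketch below shows that protocol with a hypothetical `FrameList` class backed by an in-memory list. It does not decode video and is not how VideoReader is implemented, since a real reader pulls frames from the file.

```python
class FrameList:
    """Hypothetical stand-in: a list-like wrapper over in-memory frames."""

    def __init__(self, frames, fps):
        self._frames = list(frames)
        self.fps = fps

    def __len__(self):
        return len(self._frames)

    def __getitem__(self, index):
        # an int returns one frame; a slice returns a list of frames
        return self._frames[index]

    def __iter__(self):
        return iter(self._frames)


video = FrameList([f"frame{i}" for i in range(20)], fps=30)
print(len(video))  # 20
print(video[3])    # frame3
print(video[5:8])  # ['frame5', 'frame6', 'frame7']
```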

 

frames2video Read the frame images from directory and join them as a video.
# split a video into frames and save to a folder
video = mmcv.VideoReader('test.mp4')
video.cvt2frames('out_dir')

# generate video from frames
mmcv.frames2video('out_dir', 'test.avi')

 

 

Optical Flow
dequantize_flow Recover from quantized flow.
flow_from_bytes Read dense optical flow from bytes.
flow_warp Use flow to warp img.
flowread Read the optical flow map.
flowwrite Write optical flow to file.
quantize_flow Quantize flow to [0,255].
sparse_flow_from_bytes Read the optical flow in KITTI datasets from bytes.
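
Quantizing flow to [0, 255] lets a float displacement field be stored as an 8-bit image. The sketch below shows generic uniform quantization on a single scalar. The clipping range of ±0.02 and the 255 levels are assumptions chosen for illustration, and the helper names `quantize` and `dequantize` are hypothetical, so this may differ in detail from quantize_flow and dequantize_flow.

```python
import math


def quantize(value, min_val, max_val, levels=255):
    """Map a float in [min_val, max_val] to an integer bin in [0, levels - 1]."""
    clipped = min(max(value, min_val), max_val) - min_val
    bin_index = math.floor(levels * clipped / (max_val - min_val))
    return min(bin_index, levels - 1)


def dequantize(bin_index, min_val, max_val, levels=255):
    """Map a bin index back to the centre of its bin."""
    return (bin_index + 0.5) * (max_val - min_val) / levels + min_val


max_val = 0.02  # assumed clipping range for a normalized flow component
q = quantize(0.01, -max_val, max_val)
restored = dequantize(q, -max_val, max_val)
print(q)  # 191
```

The round trip is lossy: the restored value differs from the original by at most one bin width, and anything outside the clipping range saturates at the first or last bin.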

 

 

Video Processing
cut_video Cut a clip from a video.
concat_video Concatenate multiple videos into a single one.
resize_video Resize a video.
# cut a video clip
mmcv.cut_video('test.mp4', 'clip1.mp4', start=3, end=10, vcodec='h264')

# join a list of video clips
mmcv.concat_video(['clip1.mp4', 'clip2.mp4'], 'joined.mp4', log_level='quiet')

# resize a video with the specified size
mmcv.resize_video('test.mp4', 'resized1.mp4', (360, 240))

# resize a video with a scaling ratio of 2
mmcv.resize_video('test.mp4', 'resized2.mp4', ratio=2)

 

 

https://mmcv.readthedocs.io/en/latest/api/video.html

https://mmcv.readthedocs.io/en/latest/understand_mmcv/data_process.html