Putting Learning into Practice: Implementing Image Augmentations for Computer Vision Classification

汤半泛 · Published 2021-03-09 23:32:43

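Common image augmentation methods include horizontal flipping (vertical flipping is rarely appropriate for many kinds of objects), random cropping, and color transformations (brightness, contrast, saturation, and hue). Here we implement some of these data augmentation methods with opencv-python.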

Each augmentation is implemented as a callable class with the following structure:

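class FunctionClass:
    def __init__(self, parameter):
        # Store the augmentation's configuration once, at construction time
        self.parameter = parameter

    def __call__(self, img):
        # To be completed: apply the augmentation to img and return the result
        raise NotImplementedError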

Requirements

1. Complete the code
2. Verify the augmentation results
3. Optionally implement other augmentations of your choice

Why use a class instead of a function?

Why do we implement these operations as classes? An expert put it this way:
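One common argument (an assumption here, not a quote from the source) is that a class stores its parameters once in __init__ and then behaves like a one-argument function through __call__, so differently configured transforms can be kept in a list and applied uniformly. A minimal sketch of that idea, where add_offset, AddOffset and Compose are hypothetical names used only for illustration:

```python
import numpy as np

def add_offset(img, offset):
    """Function form: the offset has to be passed on every call."""
    shifted = img.astype(np.int16) + offset
    return np.clip(shifted, 0, 255).astype(np.uint8)

class AddOffset:
    """Class form: the offset is stored once, then the object is called like a function."""
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, img):
        return add_offset(img, self.offset)

class Compose:
    """Apply a list of callables in order; each one takes only the image."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img

pipeline = Compose([AddOffset(10), AddOffset(-30)])
demo = np.full((2, 2, 3), 100, dtype=np.uint8)
out = pipeline(demo)
print(out[0, 0])  # [80 80 80]
```

Because every configured object exposes the same call signature, the code that runs the pipeline does not need to know which parameters each transform requires.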

import cv2
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

filename = '1.jpg'
## [Load an image from a file]
img = cv2.imread(filename)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)

print(img.shape)

(720, 1080, 3)

Enough admiring the young lady in the picture; time to get to work.

1. Image resizing

Let's look at the details of cv2.resize:

?cv2.resize

resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) -> dst

  cv2.resize(InputArray src, OutputArray dst, Size dsize, fx, fy, interpolation)

InputArray src	Input image
Size (dsize)	Output image size
fx, fy	Scale factors along the x and y axes
interpolation	Interpolation method

Interpolation methods available for the interpolation option (five in total):

INTER_NEAREST	Nearest-neighbor interpolation
INTER_LINEAR	Bilinear interpolation (the default)
INTER_AREA	Resampling using pixel area relation. It may be the preferred method for image decimation (shrinking); when enlarging an image, it behaves similarly to INTER_NEAREST.
INTER_CUBIC	Bicubic interpolation over a 4x4 pixel neighborhood
INTER_LANCZOS4	Lanczos interpolation over an 8x8 pixel neighborhood

Notes:

1. The output size is given as (width, height)

2. The default interpolation method is bilinear interpolation
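The (width, height) order of dsize is the opposite of the (height, width, channels) order of a NumPy image shape, which is easy to mix up. The sketch below only predicts the output shape and does not call OpenCV; resized_shape is a hypothetical helper, and it assumes that with an empty dsize the output size is obtained by rounding width*fx and height*fy.

```python
def resized_shape(shape, size=None, scale=0):
    """Predict the (H, W, C) shape that a resize call would produce.

    shape: (height, width, channels) of the input array
    size:  target dsize, given as (width, height)
    scale: used for both fx and fy when it is > 0
    """
    height, width, channels = shape
    if scale > 0:
        # Empty dsize: the output size is derived from the scale factors
        new_w, new_h = round(width * scale), round(height * scale)
    else:
        # dsize is (width, height), so it has to be swapped for the array shape
        new_w, new_h = size
    return (new_h, new_w, channels)

print(resized_shape((720, 1080, 3), size=(1024, 668)))  # (668, 1024, 3)
print(resized_shape((720, 1080, 3), scale=0.8))         # (576, 864, 3)
```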

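class Resize:
    def __init__(self, size=(600, 600), scale=0):
        # Pass either a target size as (width, height) or a positive scale factor
        self.size = size
        self.scale = scale

    def __call__(self, img):
        if self.scale > 0:
            # dsize=(0, 0) tells OpenCV to derive the output size from fx and fy
            return cv2.resize(img, (0, 0), fx=self.scale, fy=self.scale,
                              interpolation=cv2.INTER_NEAREST)
        else:
            # Resize to the fixed size with the default bilinear interpolation
            return cv2.resize(img, self.size)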

img = cv2.imread(filename)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
resize=Resize((1024, 668))
img2=resize(img)
print('Resize to a target size')
print(f'new size (HWC): {img2.shape}')
plt.figure(figsize=(8,10),dpi=100)
plt.imshow(img2)

Resize to a target size
new size (HWC): (668, 1024, 3)

resize=Resize(scale=0.8)
img2=resize(img)
print('Resize by a scale factor')
print(f'new size (HWC): {img2.shape}')
plt.figure(figsize=(8,10),dpi=100)
plt.imshow(img2)

Resize by a scale factor
new size (HWC): (576, 864, 3)

2. Image flipping

?cv2.flip

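Function: flips a 2D array around the vertical axis, the horizontal axis, or both.

src: the input array

flipCode: a flag that specifies how to flip the array.

flipCode	dst
1	Horizontal flip
0	Vertical flip
-1	Horizontal and vertical flip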
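The three flipCode cases in the table correspond to simple index reversals. The sketch below reproduces them with NumPy slicing on a tiny array so the effect is visible in the printed numbers; flip_np is a hypothetical helper written for illustration, not an OpenCV function.

```python
import numpy as np

def flip_np(img, flip_code):
    """NumPy-slicing equivalent of the three flipCode cases."""
    if flip_code > 0:
        return img[:, ::-1]      # horizontal flip: reverse the columns
    if flip_code == 0:
        return img[::-1]         # vertical flip: reverse the rows
    return img[::-1, ::-1]       # both: reverse rows and columns

a = np.array([[1, 2, 3],
              [4, 5, 6]])
print(flip_np(a, 1))   # [[3 2 1], [6 5 4]]
print(flip_np(a, 0))   # [[4 5 6], [1 2 3]]
print(flip_np(a, -1))  # [[6 5 4], [3 2 1]]
```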

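class Flip:
    '''
    Flips a 2D array around the vertical axis, the horizontal axis, or both.
    img: the input array
    mode: the flipCode flag that specifies how to flip the array
          (1: horizontal, 0: vertical, -1: both)
    '''

    def __init__(self, mode):
        self.mode = mode

    def __call__(self, img):
        return cv2.flip(img, self.mode)

    def __str__(self):
        return "mode = [-1,0,1]"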


import random
flip=Flip(mode=-1)
img2=flip(img)
plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img2)

Comparing the three flip modes

img = cv2.imread(filename)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

mode = [-1,0,1]
imgs=[]
imgs.append(img)

for i in mode:
    flip=Flip(mode=i)
    img2=flip(img)
    imgs.append(img2)
    
img2=np.concatenate(imgs,1)
print(f'new size (HWC): {img2.shape}')
plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img2)

new size (HWC): (720, 4320, 3)

Random flip (custom)

import random

class RandomFlip:
    '''
    Flips the image with a flipCode drawn at random from [-1, 0, 1].
    '''
    def __call__(self, img):
        # Draw a new mode on every call so repeated calls can give different flips
        mode = random.choice([-1, 0, 1])
        return cv2.flip(img, mode)

    def __str__(self):
        return "RandomFlip"

rflip=RandomFlip()
img2=rflip(img)
plt.axis('off')
plt.imshow(img2)

imgs=[]
for i in range(3):
    # Leave img unchanged so later examples still start from the original
    imgs.append(rflip(img))

fig, axes = plt.subplots(1,3, figsize=(12, 12))
for n, ax in enumerate(axes.flatten()):
    ax.imshow(imgs[n])
    ax.set_axis_off()

fig.tight_layout()

3. Image rotation

class Rotate:
    def __init__(self, degree, scale):
        self.degree = degree  # rotation angle in degrees (positive = counter-clockwise)
        self.scale = scale    # isotropic scale factor

    def __call__(self, img):
        height, width, _ = img.shape
        # Arguments: center as (x, y), angle, scale factor
        matRotate = cv2.getRotationMatrix2D((width*0.5, height*0.5), self.degree, self.scale)
        # dsize is (width, height), so the output keeps the input shape
        return cv2.warpAffine(img, matRotate, (width, height))


rotate=Rotate(65, 0.7)
img2=rotate(img)
plt.figure(figsize=(8,10),dpi=50)
print(f'new size (HWC): {img2.shape}')
plt.axis('off')
plt.imshow(img2)

new size (HWC): (720, 1080, 3)
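cv2.getRotationMatrix2D returns a 2x3 affine matrix. As an illustration, the sketch below rebuilds that matrix in NumPy from the formula given in the OpenCV documentation; rotation_matrix_2d is a hypothetical helper, not part of OpenCV. It shows that the center, passed as (x, y), is the one point the transform leaves in place.

```python
import math
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 affine matrix for a scaled rotation about center = (x, y)."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle_deg))
    b = scale * math.sin(math.radians(angle_deg))
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

# Center of a 1080x720 (width x height) image, in (x, y) order
M = rotation_matrix_2d((540, 360), 65, 0.7)
mapped_center = M @ np.array([540, 360, 1.0])
print(mapped_center)  # the center maps to itself, up to floating-point error
```

For an image that is not square, swapping the two center coordinates moves the fixed point away from the middle of the image, so the rotated content drifts off-center.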

Random rotation (custom)

def random_rotate(img):
    height, width, _ = img.shape
    degree = random.choice(range(0, 360, 10))  # random angle in 10-degree steps
    scale = random.uniform(0.5, 0.9)           # random scale factor
    # Arguments: center as (x, y), angle, scale factor
    matRotate = cv2.getRotationMatrix2D((width*0.5, height*0.5), degree, scale)
    return cv2.warpAffine(img, matRotate, (width, height))

imgs=[]
for i in range(9):
    imgs.append(random_rotate(img))
imgs_v=[]   
for i in range(0,9,3):
    imgs_v.append(np.concatenate(imgs[i:i+3],1))
img_h=np.concatenate(imgs_v,0)

print(f'new size (HWC): {img_h.shape}')

plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img_h)

new size (HWC): (2160, 3240, 3)

4. Brightness adjustment

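class Brightness:
    def __init__(self, brightness_factor):
        self.brightness_factor = brightness_factor

    def __call__(self, img):
        # Blend the image with an all-black image of the same shape and dtype
        dst = np.zeros(img.shape, img.dtype)
        # Result: img*factor + dst*(1-factor) + 3, saturated to the dtype range
        return cv2.addWeighted(img, self.brightness_factor, dst, 1-self.brightness_factor, 3)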

img = cv2.imread('1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

brightness=Brightness(1.2)
img2=brightness(img)
plt.axis('off')
plt.imshow(img2)

?cv2.addWeighted
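cv2.addWeighted computes dst = saturate(src1*alpha + src2*beta + gamma). To make the arithmetic concrete, the sketch below reproduces that formula in NumPy for uint8 data only; add_weighted_np is a hypothetical helper, and its rounding is an approximation of OpenCV's saturating cast rather than a guaranteed bit-exact match.

```python
import numpy as np

def add_weighted_np(src1, alpha, src2, beta, gamma):
    """NumPy sketch of dst = saturate(src1*alpha + src2*beta + gamma) for uint8."""
    out = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    # Round, then clamp to the valid uint8 range instead of wrapping around
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

pixels = np.array([0, 100, 250], dtype=np.uint8)
black = np.zeros_like(pixels)
# Same arguments as Brightness(1.2): the black image contributes nothing,
# so each pixel becomes pixel*1.2 + 3, capped at 255
result = add_weighted_np(pixels, 1.2, black, 1 - 1.2, 3)
print(result)  # [  3 123 255]
```

Since the second image is all zeros, its weight has no effect on the result; only the factor on the original image and the constant gamma change the brightness.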

Progressive brightness adjustment (custom)

import math

def bright(img,factor):
    shape =img.shape
    dst= np.zeros(shape, img.dtype)
    return cv2.addWeighted(img, factor, dst, 1-factor, 3)  

imgs=[]
for i in range(1,5):
    imgs.append(bright(img,math.sin(1/i)))
img1=np.concatenate(imgs,1)

plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img1)

5. Random erasing

import random
import math

class RandomErasing(object):
    def __init__(self, EPSILON=0.5, sl=0.02, sh=0.4, r1=0.25,
                 mean=(0., 0., 0.)):
        self.EPSILON = EPSILON  # probability of erasing
        self.mean = mean        # fill value for each channel
        self.sl = sl            # minimum erased area, as a fraction of the image
        self.sh = sh            # maximum erased area, as a fraction of the image
        self.r1 = r1            # minimum aspect ratio (the maximum is 1 / r1)

    def __call__(self, img):
        if random.uniform(0, 1) > self.EPSILON:  # erase with probability EPSILON (50% by default)
            return img

        img = img.copy()  # do not modify the caller's array in place
        img_h, img_w = img.shape[:2]
        for attempt in range(100):
            area = img_h * img_w

            target_area = random.uniform(self.sl, self.sh) * area
            aspect_ratio = random.uniform(self.r1, 1 / self.r1)  # in (0.25, 4); values above 1 are more likely

            h = int(round(math.sqrt(target_area * aspect_ratio)))  # height of the erased box
            w = int(round(math.sqrt(target_area / aspect_ratio)))  # width of the erased box
            if h < img_h and w < img_w:
                top = random.randint(0, img_h - h)
                left = random.randint(0, img_w - w)
                if img.ndim == 3 and img.shape[2] == 3:
                    img[top:top + h, left:left + w, 0] = self.mean[0]
                    img[top:top + h, left:left + w, 1] = self.mean[1]
                    img[top:top + h, left:left + w, 2] = self.mean[2]
                else:
                    img[top:top + h, left:left + w] = self.mean[0]
                return img

        return img  # no valid box found after 100 attempts
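The box size in RandomErasing follows from two random draws: an area and an aspect ratio. Taking h = sqrt(area * ratio) and w = sqrt(area / ratio) gives h * w equal to the area and h / w equal to the ratio, before rounding. The sketch below isolates that step; sample_box is a hypothetical helper that uses the same default values as the class.

```python
import math
import random

def sample_box(img_h, img_w, sl=0.02, sh=0.4, r1=0.25):
    """Draw one candidate (h, w) box the way each RandomErasing attempt does."""
    target_area = random.uniform(sl, sh) * img_h * img_w
    aspect_ratio = random.uniform(r1, 1 / r1)
    h = int(round(math.sqrt(target_area * aspect_ratio)))
    w = int(round(math.sqrt(target_area / aspect_ratio)))
    return h, w, target_area, aspect_ratio

random.seed(0)
h, w, target_area, aspect_ratio = sample_box(720, 1080)
# h * w stays close to target_area and h / w stays close to aspect_ratio
print(h, w, round(target_area), round(aspect_ratio, 2))
```

A candidate can still be taller or wider than the image, which is why the class checks the box against the image size and retries up to 100 times.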
        


img = cv2.imread('1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

erase = RandomErasing()
img2=erase(img)
plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img2)
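Random cropping is listed among the common methods at the start, but the class above erases a region rather than cropping. A minimal sketch of a crop that follows the same class structure is given below; RandomCrop is a hypothetical example that uses NumPy slicing only, and the (height, width) order of its size argument is an assumption of this sketch.

```python
import random
import numpy as np

class RandomCrop:
    """Hypothetical example: crop a random (crop_h, crop_w) window from an HWC image."""
    def __init__(self, size):
        self.crop_h, self.crop_w = size

    def __call__(self, img):
        img_h, img_w = img.shape[:2]
        if self.crop_h > img_h or self.crop_w > img_w:
            raise ValueError("crop size must not exceed the image size")
        # Pick the top-left corner so the window fits inside the image
        top = random.randint(0, img_h - self.crop_h)
        left = random.randint(0, img_w - self.crop_w)
        return img[top:top + self.crop_h, left:left + self.crop_w]

demo = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
crop = RandomCrop((4, 5))(demo)
print(crop.shape)  # (4, 5, 3)
```

The slice is a view of the input array, so call .copy() on the result if it will be modified afterwards.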

img = cv2.imread('1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

imgs=[]

transforms=[rflip,random_rotate,brightness,erase]

# Apply each transform separately to the original image
for transform in transforms:
    imgs.append(transform(img))
img1 = np.concatenate(imgs,1)
print(f'img1:{img1.shape}')

# Apply the transforms cumulatively, one after another
imgs=[]
for transform in transforms:
    img=transform(img)
    #print(img.shape)
    imgs.append(img)
img2 = np.concatenate(imgs,1)
print(f'img2:{img2.shape}')

img3=np.concatenate([img1,img2],0)

plt.figure(figsize=(8,10),dpi=100)
plt.axis('off')
plt.imshow(img3)

img1:(720, 4320, 3)
img2:(720, 4320, 3)
