YOLOv5 Instance Segmentation in One Article (Data Annotation, Label Conversion, Model Training, Model Inference)
Preface
This article walks through the complete YOLOv5 instance segmentation workflow: building your own dataset, converting the labels, training a model, and testing the results.
It is best read on a desktop~ 😄
1. Instance Segmentation: Data Annotation
Instance segmentation labels are represented as polygons, so we need a polygon annotation tool such as labelme (labelimg only supports rectangular boxes, so it is not suitable here). This article uses labelme as the example.
labelme repository: https://github.com/wkentaro/labelme
It can be installed via conda, pip, and other methods, and supports Windows, Linux, and other systems.
Below is the conda route; for other installation methods, see the repository above.
```bash
# python3
conda create --name=labelme python=3
conda activate labelme
pip install labelme
```
After installation, type labelme in the environment's terminal to launch it.
Then click Open (open a single image) or OpenDir (open all images in a folder), and you will see the main interface.
Next, right-click and choose Create Polygons.
Left-click to place a series of points that enclose the object you want to label. In this example we annotate three classes: car, bus, and motorbike.
Finally, click Save to store the annotation as a json file.
The key fields inside:
- label: the class name, e.g. the car, bus, and motorbike classes we just annotated.
- points: the annotated points, i.e. the (x, y) coordinates of the vertices you clicked to enclose the object.
- imagePath: the path and file name of the image.
- imageHeight: the height of the image.
- imageWidth: the width of the image.
In a moment we will convert these json files into the txt label format that YOLOv5 trains on.
First, here is an example of a json label generated by labelme:
{"version": "5.1.1","flags": {},"shapes": [{"label": "car","points": [[518.2777777777778,623.5],[498.83333333333337,666.5555555555555],[498.83333333333337,709.6111111111111],[501.6111111111111,737.3888888888889],[529.3888888888889,738.7777777777778],[597.4444444444445,741.5555555555555],[626.6111111111111,738.7777777777778],[628.0,672.1111111111111],[636.3333333333334,652.6666666666667],[614.1111111111111,629.0555555555555],[548.8333333333334,620.7222222222223]],"group_id": null,"shape_type": "polygon","flags": {}},{"label": "bus","points": [[90.49999999999997,561.0],[155.77777777777774,558.2222222222223],[190.49999999999997,561.0],[251.61111111111106,563.7777777777778],[300.2222222222222,569.3333333333334],[305.7777777777777,638.7777777777778],[298.83333333333326,648.5],[250.2222222222222,647.1111111111111],[222.44444444444443,656.8333333333334],[207.16666666666666,679.0555555555555],[207.16666666666666,701.2777777777778],[177.99999999999997,705.4444444444445],[157.16666666666666,698.5],[126.61111111111106,701.2777777777778],[127.99999999999997,709.6111111111111],[105.77777777777774,705.4444444444445],[100.2222222222222,692.9444444444445],[79.38888888888886,692.9444444444445],[83.55555555555551,608.2222222222223],[68.27777777777774,609.6111111111111],[68.27777777777774,572.1111111111111]],"group_id": null,"shape_type": "polygon","flags": {}},{"label": "motorbike","points": [[976.6111111111112,654.0555555555555],[962.7222222222223,674.8888888888889],[962.7222222222223,690.1666666666667],[966.888888888889,716.5555555555555],[990.5000000000001,720.7222222222223],[1004.388888888889,708.2222222222223],[1005.7777777777777,676.2777777777778],[1000.2222222222223,658.2222222222223]],"group_id": null,"shape_type": "polygon","flags": {}}],"imagePath": "traffic04.png","imageData": "iVBORw0KGgoAAAANSUhEUgAABWUAAAOBCAYAAACZFsKEAAEAAElEQVR4nOz9W8/sOpMmBgaVuXYd3HXTaBTQ/f9/StsGPD7AnosButoeow/jKg/QF9VA1fftlSn6QhnKUOiJA0UqM993r2ftd0spiWg27Kl6lzqS2Ugp+Xa/pSzDhJ8vSQ7+ZlIdCxfd36EuWqIlz+yoBWD/0TzYx65h+XMR3oqzS/X6Hj+oOmKHaBauI89ZRC2Kz3e7zW9s8dWraNwEek5+39xlH4vvdAG/B21/Xa8fP27fDsWpW1/IayTkiB0LTUtW/WKVQv7+Drxj/+v8TLQpxSFFlGAAAAABJRU5ErkJggg==","imageHeight": 897,"imageWidth": 1381
}
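As a quick way to get familiar with this structure, you can load the json and print its shapes. A minimal sketch, assuming the file is named traffic04.json as in the example above; it uses only the fields just described:

```python
import json

# Peek inside a labelme annotation file
# ("traffic04.json" is assumed to match the example above).
with open('traffic04.json', 'r') as f:
    ann = json.load(f)

print(ann['imagePath'], ann['imageWidth'], 'x', ann['imageHeight'])
for shape in ann['shapes']:
    # each shape is one polygon: a class name plus a list of [x, y] vertices
    print(shape['label'], len(shape['points']), 'points')
```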
2. Label Conversion: json to txt
Next we convert the json files into the txt label format that YOLOv5 trains on.
First, a look at that format. Each line consists of a class index followed by the polygon's point coordinates x and y, where every x is normalized by the image width and every y by the image height:

```
class x1 y1 x2 y2 ... xn yn
```
Here is an example:

```
1 0.07505175983436857 0.3322981366459628 0.16718426501035202 0.4989648033126295 0.0025879917184265366 0.556935817805383 0.0036231884057971323 0.365424430641822
1 0.662008281573499 0.1356107660455487 0.7572463768115941 0.28467908902691513 0.9983333333333333 0.16563615198920995 0.9983333333333333 0.0010351966873706593
1 0.8059006211180125 0.898550724637681 0.5833333333333334 0.9834368530020703 0.5922712215320913 0.9983333333333333 0.8628364389233955 0.9983333333333333
1 0.9839544513457558 0.8302277432712215 0.8059006211180125 0.898550724637681 0.8655644489978869 0.9983333333333333 0.9953416149068324 0.9983333333333333 0.9983333333333333 0.8603253623188408
```
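To make the normalization concrete, here is a quick check using the first point of the car polygon from the labelme json above (x = 518.28, y = 623.5, image size 1381 x 897):

```python
# Normalize one polygon point from the example json:
# x is divided by imageWidth, y by imageHeight.
x, y = 518.2777777777778, 623.5
w, h = 1381, 897
print(x / w, y / h)  # -> approximately 0.3753 0.6951
```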
Here is the conversion code, json to txt (save it as json2txt.py to match the usage below):
```python
import json
import os
import argparse

from tqdm import tqdm


def convert_label_json(json_dir, save_dir, classes):
    json_paths = os.listdir(json_dir)
    classes = classes.split(',')
    os.makedirs(save_dir, exist_ok=True)  # make sure the output dir exists

    for json_path in tqdm(json_paths):
        path = os.path.join(json_dir, json_path)
        with open(path, 'r') as load_f:
            json_dict = json.load(load_f)
        h, w = json_dict['imageHeight'], json_dict['imageWidth']

        # one txt file per json file
        txt_path = os.path.join(save_dir, json_path.replace('json', 'txt'))
        with open(txt_path, 'w') as txt_file:
            for shape_dict in json_dict['shapes']:
                label = shape_dict['label']
                label_index = classes.index(label)
                points = shape_dict['points']

                # normalize: x by image width, y by image height
                points_nor_list = []
                for point in points:
                    points_nor_list.append(point[0] / w)
                    points_nor_list.append(point[1] / h)

                points_nor_str = ' '.join(map(str, points_nor_list))
                txt_file.write(str(label_index) + ' ' + points_nor_str + '\n')


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='json convert to txt params')
    parser.add_argument('--json-dir', type=str, help='json path dir')
    parser.add_argument('--save-dir', type=str, help='txt save dir')
    parser.add_argument('--classes', type=str, help='classes')
    args = parser.parse_args()
    convert_label_json(args.json_dir, args.save_dir, args.classes)
```
Usage example:

```bash
python json2txt.py --json-dir ./label_json/ --save-dir ./label_txt/ --classes "car,bus,motorbike"
```

- --json-dir: the directory of json files annotated with labelme
- --save-dir: the directory where the txt files are saved
- --classes: the annotated class names; the order matters here: with "car,bus,motorbike", the class ids become 0, 1, 2 respectively
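To sanity-check the conversion, you can draw a txt label back onto its image. A minimal sketch (the file names traffic04.png / traffic04.txt / check.png are placeholders, not from the original article):

```python
import cv2
import numpy as np

# Draw the polygons from a converted txt label back onto the image
# to visually verify the json-to-txt conversion.
img = cv2.imread('traffic04.png')
h, w = img.shape[:2]

with open('traffic04.txt') as f:
    for line in f:
        vals = line.split()
        cls, coords = int(vals[0]), list(map(float, vals[1:]))
        # de-normalize back to pixel coordinates
        pts = np.array(coords, dtype=np.float32).reshape(-1, 2) * [w, h]
        cv2.polylines(img, [pts.astype(np.int32)], isClosed=True,
                      color=(0, 255, 0), thickness=2)

cv2.imwrite('check.png', img)
```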
3. Model Training
Instance segmentation is supported in YOLOv5 v7.0; just download the open-source code from the official repository.
https://github.com/ultralytics/yolov5/tree/v7.0
With that, we arrive at model training ( •̀ ω •́ )y
First, place the converted data in a datasets folder that sits at the same level as yolov5-master, the directory holding the YOLOv5 code itself.
Inside datasets, create a folder named autopilot_seg to hold the images and labels we just annotated. The directory structure looks like this:
```
— datasets
    — autopilot_seg
        — images
        — labels
— yolov5-master
    — classify
    — data
    — models
    — segment
    — utils
    — ............
```
Since this is just a demo, the dataset is not split into training, validation, and test sets: training and validation use the same data.
However (really, truly!), in real experiments and projects it is essential to split off separate training and validation sets; one possible split script is sketched below.
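Here is a minimal sketch of such a split (the 9:1 ratio and the use of shutil.move are my assumptions; adapt the paths to your layout). If you use it, point train: and val: in the yaml below at images/train and images/val instead of images/:

```python
import os
import random
import shutil

# Split datasets/autopilot_seg into train/ and val/ subfolders
# (a sketch; the 9:1 ratio is an arbitrary choice).
root = '../datasets/autopilot_seg'
images = [f for f in os.listdir(f'{root}/images') if f.endswith(('.png', '.jpg'))]
random.seed(0)          # reproducible shuffle
random.shuffle(images)

n_val = max(1, len(images) // 10)
splits = {'val': images[:n_val], 'train': images[n_val:]}

for split, files in splits.items():
    os.makedirs(f'{root}/images/{split}', exist_ok=True)
    os.makedirs(f'{root}/labels/{split}', exist_ok=True)
    for name in files:
        label = os.path.splitext(name)[0] + '.txt'
        # move each image together with its matching label file
        shutil.move(f'{root}/images/{name}', f'{root}/images/{split}/{name}')
        shutil.move(f'{root}/labels/{label}', f'{root}/labels/{split}/{label}')
```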
Then, in the data directory of the yolov5-master project, create a file named autopilot_seg.yaml describing where the dataset lives:
```yaml
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# parent
# ├── yolov5
# └── datasets
#     └── autopilot_seg  ← downloads here

path: ../datasets/autopilot_seg  # dataset root dir
train: images/  # train images
val: images/  # val images
test:  # test images (optional)

# Classes
names:
  0: car
  1: bus
  2: motorbike
```
Time to start training!
We fine-tune from the pretrained yolov5s-seg.pt weights on the new data, which speeds up convergence:

```bash
python segment/train.py --data autopilot_seg.yaml --weights yolov5s-seg.pt --img 640
```
Training runs for 100 epochs by default; the command-line arguments below let you change the training parameters.
File: segment/train.py, around line 463:
```python
def parse_opt(known=False):
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default=ROOT / 'yolov5s-seg.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='', help='model.yaml path')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128-seg.yaml', help='dataset.yaml path')
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')
    parser.add_argument('--epochs', type=int, default=100, help='total training epochs')
    parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs, -1 for autobatch')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--noval', action='store_true', help='only validate final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
    parser.add_argument('--noplots', action='store_true', help='save no plot files')
    parser.add_argument('--evolve', type=int, nargs='?', const=300, help='evolve hyperparameters for x generations')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache', type=str, nargs='?', const='ram', help='image --cache ram/disk')
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    parser.add_argument('--optimizer', type=str, choices=['SGD', 'Adam', 'AdamW'], default='SGD', help='optimizer')
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--project', default=ROOT / 'runs/train-seg', help='save to project/name')
    parser.add_argument('--name', default='exp', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    parser.add_argument('--cos-lr', action='store_true', help='cosine LR scheduler')
    parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
    parser.add_argument('--patience', type=int, default=100, help='EarlyStopping patience (epochs without improvement)')
    parser.add_argument('--freeze', nargs='+', type=int, default=[0], help='Freeze layers: backbone=10, first3=0 1 2')
    parser.add_argument('--save-period', type=int, default=-1, help='Save checkpoint every x epochs (disabled if < 1)')
    parser.add_argument('--seed', type=int, default=0, help='Global training seed')
    parser.add_argument('--local_rank', type=int, default=-1, help='Automatic DDP Multi-GPU argument, do not modify')

    # Instance Segmentation Args
    parser.add_argument('--mask-ratio', type=int, default=4, help='Downsample the truth masks to saving memory')
    parser.add_argument('--no-overlap', action='store_true', help='Overlap masks train faster at slightly less mAP')

    return parser.parse_known_args()[0] if known else parser.parse_args()
```
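For example, using the flags listed above, you might train longer with an explicit batch size and GPU (the specific values here are just an illustration):

```bash
python segment/train.py --data autopilot_seg.yaml --weights yolov5s-seg.pt \
    --img 640 --epochs 300 --batch-size 8 --device 0 --name autopilot_seg
```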
Below is some of the console output from training:
```
hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
TensorBoard: Start with 'tensorboard --logdir runs/train-seg', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=3

                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  2    115712  models.common.C3                        [128, 128, 2]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  3    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 24      [17, 20, 23]  1    407464  models.yolo.Segment                     [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], 32, 128, [128, 256, 512]]

Model summary: 225 layers, 7413608 parameters, 7413608 gradients, 25.9 GFLOPs
Transferred 361/367 items from yolov5s-seg.pt
AMP: checks passed ✅
optimizer: SGD(lr=0.01) with parameter groups 60 weight(decay=0.0), 63 weight(decay=0.0005), 63 bias
train: Scanning /guopu/yolov5-v7-seg/datasets/parking_seg/labels.cache... 172 images, 1 backgrounds, 0 corrupt: 100%|██████████| 173/173 [00:00<?, ?it/s]
train: Caching images (0.2GB ram): 100%|██████████| 173/173 [00:00<00:00, 744.77it/s]
val: Scanning /guopu/yolov5-v7-seg/datasets/parking_seg/labels.cache... 172 images, 1 backgrounds, 0 corrupt: 100%|██████████| 173/173 [00:00<?, ?it/s]
val: Caching images (0.2GB ram): 100%|██████████| 173/173 [00:00<00:00, 425.08it/s]
AutoAnchor: 4.52 anchors/target, 1.000 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Plotting labels to runs/train-seg/exp3/labels.jpg...
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/train-seg/exp3
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
       0/99      4.41G      0.119     0.1303    0.03598    0.04046         42        640: 100%|██████████| 11/11 [00:11<00:00, 1.04s/it]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:04<00:00, 1.21it/s]
                   all        173        416    0.00755    0.172    0.00566    0.00126    0.00114    0.163   0.000808   0.000146

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
       1/99      4.88G     0.1122    0.08991    0.03867    0.03891         66        640: 100%|██████████| 11/11 [00:04<00:00, 2.31it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:04<00:00, 1.33it/s]
                   all        173        416     0.0038     0.55    0.00454     0.0012    0.00196    0.287    0.00155   0.000278

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
       2/99      4.88G    0.09685    0.07117    0.04401    0.03686         64        640: 100%|██████████| 11/11 [00:04<00:00, 2.35it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:04<00:00, 1.46it/s]
                   all        173        416    0.00546    0.752     0.0105    0.00266    0.00407    0.586    0.00507    0.00103

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
       3/99      4.88G    0.08271    0.06062    0.04612    0.03379         52        640: 100%|██████████| 11/11 [00:04<00:00, 2.33it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:03<00:00, 1.69it/s]
                   all        173        416    0.00615    0.752     0.0121    0.00342    0.00653    0.506    0.00844    0.00184

...........

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
      96/99      4.89G    0.01945    0.01475    0.01745   0.003529         42        640: 100%|██████████| 11/11 [00:04<00:00, 2.50it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:02<00:00, 2.02it/s]
                   all        173        416      0.989    0.996      0.994      0.884      0.989    0.996      0.994       0.86

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
      97/99      4.89G    0.01732    0.01472     0.0163   0.003331         35        640: 100%|██████████| 11/11 [00:04<00:00, 2.51it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:02<00:00, 2.09it/s]
                   all        173        416      0.987    0.996      0.994      0.878      0.987    0.996      0.994      0.866

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
      98/99      4.89G    0.01833    0.01492    0.01707   0.003123         38        640: 100%|██████████| 11/11 [00:04<00:00, 2.49it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:02<00:00, 2.03it/s]
                   all        173        416      0.986    0.996      0.994      0.881      0.986    0.996      0.994      0.865

      Epoch    GPU_mem   box_loss   seg_loss   obj_loss   cls_loss  Instances       Size
      99/99      4.89G    0.01827    0.01308    0.01517    0.00358         56        640: 100%|██████████| 11/11 [00:04<00:00, 2.45it/s]
                 Class     Images  Instances      Box(P        R      mAP50  mAP50-95)     Mask(P        R      mAP50  mAP50-95): 100%|██████████| 6/6 [00:03<00:00, 1.99it/s]
                   all        173        416      0.987    0.996      0.994      0.887      0.987    0.996      0.994      0.868

100 epochs completed in 0.230 hours.
Optimizer stripped from runs/train-seg/exp3/weights/last.pt, 15.2MB
Optimizer stripped from runs/train-seg/exp3/weights/best.pt, 15.2MB
```
Only 173 images were annotated for this demo, and the training set and validation set are identical, so the metrics come out very high.
In real experiments or projects you should split the data into training and validation sets, and the dataset will usually be much larger. For now, understanding the workflow is what matters.
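Before moving to inference, you can also re-evaluate the trained weights on the validation set with segment/val.py, which ships alongside train.py and predict.py in the v7.0 repository:

```bash
python segment/val.py --weights runs/train-seg/exp3/weights/best.pt --data autopilot_seg.yaml --img 640
```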
4. Model Inference
Model inference means using the freshly trained model to predict on new images and see how it performs.
The instance segmentation inference code lives in segment/predict.py; we predict with the weights we just trained, runs/train-seg/exp3/weights/best.pt:

```bash
python segment/predict.py --weights runs/train-seg/exp3/weights/best.pt
```
You can also run inference on a live webcam feed:

```bash
python segment/predict.py --weights runs/train-seg/exp3/weights/best.pt --source 0
```
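Besides a webcam index, --source also accepts single images, folders, and videos (the full list of supported sources is in the docstring at the top of segment/predict.py); the file names below are placeholders:

```bash
python segment/predict.py --weights runs/train-seg/exp3/weights/best.pt --source test.jpg   # single image
python segment/predict.py --weights runs/train-seg/exp3/weights/best.pt --source ./images/  # folder
python segment/predict.py --weights runs/train-seg/exp3/weights/best.pt --source test.mp4   # video
```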
Below are some YOLOv5 instance segmentation results on other datasets:
That's the whole walkthrough~ ( •̀ ω •́ )✧
This article is for reference and learning only. Thanks~
In a follow-up post I'll go through the ideas and principles behind YOLOv5 instance segmentation, along with a code walkthrough~