Object detection: converting a VOC-format dataset to CSV format

东方耀 · Posted on 2020-12-4 22:53:11

import os
import glob
import csv
from xml.dom.minidom import parse

# Object detection: convert a VOC-format dataset to CSV format
# label_dict = {"0": "person", "1": "car", "2": "motorbike", "3": "bus", "4": "truck"}
label_dict = {"0": "car", "1": "truck", "2": "pedestrian", "3": "bicyclist", "4": "light"}
input_voc_dir = "/home/dfy888/DataSets/driving_datasets_voc"
annotations_dir = os.path.join(input_voc_dir, "Annotations")
JPEGImages_dir = os.path.join(input_voc_dir, "JPEGImages")

# Every image must have exactly one annotation file
xml_num = len(glob.glob(os.path.join(annotations_dir, "*.xml")))
img_num = len(glob.glob(os.path.join(JPEGImages_dir, "*.jpg")))
print("Number of VOC images={}; number of XML files={}".format(img_num, xml_num))
assert xml_num == img_num, "The image count and the XML count must be equal!"

# Annotation CSV row format: path/to/image.jpg,x1,y1,x2,y2,class_name
# Some images contain no objects at all and exist only as negative samples
# /data/imgs/img_001.jpg,837,346,981,456,cow
# /data/imgs/img_002.jpg,215,312,279,391,cat
# /data/imgs/img_002.jpg,22,5,89,84,bird
# /data/imgs/img_003.jpg,,,,,
csv_train = "./train_annots.csv"
csv_val = "./val_annots.csv"
# Class list row format: class_name,id (ids start at 0 and do not include the background)
csv_classes = "./class_list.csv"


def write_to_csv(dataset_type, csv_file):
    # ImageSets/Main/<dataset_type>.txt lists one image name (without extension) per line
    dataset_txt_file = os.path.join(input_voc_dir, "ImageSets", "Main", "%s.txt" % dataset_type)
    with open(dataset_txt_file, "r") as f:
        img_names = [line.strip() for line in f if line.strip()]
    with open(csv_file, 'w', newline='') as f_csv:
        csv_write = csv.writer(f_csv)
        for img_name in img_names:
            img_path = os.path.join(JPEGImages_dir, "%s.jpg" % img_name)
            xml_annotation_path = os.path.join(annotations_dir, "%s.xml" % img_name)
            dom = parse(xml_annotation_path)
            # Get the document's root element
            data = dom.documentElement
            objs = data.getElementsByTagName('object')
            if not objs:
                # Negative sample: keep the image with empty box fields (path,,,,,)
                csv_write.writerow([img_path, '', '', '', '', ''])
            for obj in objs:
                # Read the text content of each tag
                name = obj.getElementsByTagName('name')[0].childNodes[0].nodeValue
                bndbox = obj.getElementsByTagName('bndbox')[0]
                x1 = bndbox.getElementsByTagName('xmin')[0].childNodes[0].nodeValue
                y1 = bndbox.getElementsByTagName('ymin')[0].childNodes[0].nodeValue
                x2 = bndbox.getElementsByTagName('xmax')[0].childNodes[0].nodeValue
                y2 = bndbox.getElementsByTagName('ymax')[0].childNodes[0].nodeValue
                # path/to/image.jpg,x1,y1,x2,y2,class_name
                csv_write.writerow([img_path, x1, y1, x2, y2, name])


if __name__ == "__main__":
    write_to_csv("train", csv_train)
    write_to_csv("val", csv_val)
    # Write the class list once: class_name,id
    with open(csv_classes, 'w', newline='') as f:
        csv_write = csv.writer(f)
        for class_id, name in label_dict.items():
            csv_write.writerow([name, class_id])
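
Before training on the generated files, it may be worth checking that every row follows the `path,x1,y1,x2,y2,class_name` layout described in the script's comments. The following is only a minimal sketch: the function name `validate_annotation_rows` and the sample rows are hypothetical, and the checks (six columns, numeric coordinates, `x2 > x1`, `y2 > y1`, known class name) are assumptions about what a downstream loader would want rather than part of the original script.

```python
import csv
import io


def validate_annotation_rows(csv_text, class_names):
    """Return a list of (line_number, message) problems found in the CSV text."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=1):
        if len(row) != 6:
            problems.append((line_no, "expected 6 columns, got %d" % len(row)))
            continue
        img_path, x1, y1, x2, y2, name = row
        # A negative sample has an image path followed by five empty fields.
        if (x1, y1, x2, y2, name) == ("", "", "", "", ""):
            continue
        try:
            x1, y1, x2, y2 = (int(float(v)) for v in (x1, y1, x2, y2))
        except (ValueError, OverflowError):
            problems.append((line_no, "non-numeric coordinate"))
            continue
        if x2 <= x1 or y2 <= y1:
            problems.append((line_no, "empty or inverted box"))
        if name not in class_names:
            problems.append((line_no, "unknown class %r" % name))
    return problems


# Hypothetical sample rows; the last two are deliberately invalid.
sample = (
    "/data/imgs/img_001.jpg,837,346,981,456,car\n"
    "/data/imgs/img_002.jpg,215,312,279,391,truck\n"
    "/data/imgs/img_003.jpg,,,,,\n"
    "/data/imgs/img_004.jpg,50,60,40,90,car\n"
    "/data/imgs/img_005.jpg,1,2,3,4,cow\n"
)
classes = {"car", "truck", "pedestrian", "bicyclist", "light"}
for line_no, message in validate_annotation_rows(sample, classes):
    print(line_no, message)
```

To check a real file, the text of `train_annots.csv` or `val_annots.csv` could be read with `open(...).read()` and passed in together with the values of `label_dict`.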


leironh · Posted on 2020-12-5 23:39:07

66666666666666

leironh · Posted on 2020-12-6 09:20:53

Python advanced programming and AI data analysis course

leironh · Posted on 2020-12-6 09:25:05

The notes are well written, thumbs up

leironh · Posted on 2020-12-6 09:39:31

Object detection: converting data between COCO and VOC formats, plus a script to validate the COCO format
