东方耀 AI Technology Sharing

[Class notes] Hands-on summary of a general object detection project based on the YOLO algorithm (practical tips)

Posted 2019-9-19 10:36:55




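Overview: general object detection in natural scenes


Object detection aims to build algorithms and programs that can "observe" the world.
1. It is hard for a computer to abstract raw visual input into a high-level semantic representation that maps onto the nouns and concepts of everyday life.
2. What we usually call "observing" (seeing) already includes processing the visual information and mapping it onto the real world.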




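So how should computer vision define "observing" (seeing) so that it matches human understanding?
1. David Marr (1982): "To know what is where by looking" (recognition, detection, segmentation).
2. The 3D information of an object in the real world (SLAM).
3. What is happening in the scene: we want the computer to answer questions based on an image or a video (event understanding, question answering).


Object detection: locating multiple objects of different classes in an image, i.e. localization + classification.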


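Evaluating algorithm performance:
1. From classification: precision and recall.
2. AP (average precision): computed per class, the precision averaged over the recall levels of that class's ranked detections.
   mAP (mean average precision): the mean of the AP values over all classes.
As in multi-label image classification, an image can carry more than one label, so the metrics used for ordinary single-label classification do not apply directly.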
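
The metrics above can be sketched in a few lines of Python. This is only an illustration, not the evaluation code used in the project: it assumes the detections of one class are already sorted by descending confidence and already marked as true or false positives, and it uses the simple non-interpolated definition of AP (benchmarks such as COCO use interpolated variants). All function names are hypothetical.

```python
from typing import Sequence, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(is_true_positive: Sequence[bool], num_ground_truth: int) -> float:
    """Non-interpolated AP for one class.

    is_true_positive: one flag per detection, sorted by descending confidence.
    num_ground_truth: number of ground-truth objects of this class.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_tp in enumerate(is_true_positive, start=1):
        if is_tp:
            hits += 1
            # precision measured at the rank where recall increases
            precision_sum += hits / rank
    return precision_sum / num_ground_truth if num_ground_truth else 0.0


def mean_average_precision(ap_per_class: Sequence[float]) -> float:
    """mAP: plain mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class) if ap_per_class else 0.0


if __name__ == "__main__":
    print(precision_recall(tp=8, fp=2, fn=4))
    print(average_precision([True, False, True], num_ground_truth=2))
    print(mean_average_precision([0.5, 1.0]))
```

Deciding whether a detection counts as a true positive (usually by an IoU threshold against the ground truth) is left out of this sketch.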


Dataset: the public COCO dataset (downloaded with the script below)
#!/bin/bash

# Clone COCO API
git clone https://github.com/pdollar/coco
cd coco

mkdir images
cd images

# Download Images
wget -c https://pjreddie.com/media/files/train2014.zip
wget -c https://pjreddie.com/media/files/val2014.zip

# Unzip
unzip -q train2014.zip
unzip -q val2014.zip

cd ..

# Download COCO Metadata
wget -c https://pjreddie.com/media/files/instances_train-val2014.zip
wget -c https://pjreddie.com/media/files/coco/5k.part
wget -c https://pjreddie.com/media/files/coco/trainvalno5k.part
wget -c https://pjreddie.com/media/files/coco/labels.tgz
tar xzf labels.tgz
unzip -q instances_train-val2014.zip

# Set Up Image Lists (author's note: this step needed a Python 2 environment)
# Each line of the .part files is prefixed with the current directory to form an absolute path.
paste <(awk "{print \"$PWD\"}" <5k.part) 5k.part | tr -d '\t' > 5k.txt
paste <(awk "{print \"$PWD\"}" <trainvalno5k.part) trainvalno5k.part | tr -d '\t' > trainvalno5k.txt
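
The last two lines of the script only prepend the current directory to every entry of a `.part` file. For readers who find the `paste`/`awk`/`tr` pipeline hard to read, here is a rough Python equivalent. It assumes each `.part` line is a path that starts with `/` (for example `/images/val2014/<name>.jpg`); the function name and the sample paths are hypothetical.

```python
from typing import Iterable, List


def absolute_image_paths(part_lines: Iterable[str], root: str) -> List[str]:
    """Prefix every non-empty entry of a .part file with the dataset root.

    Mirrors the shell pipeline: the root and the entry are concatenated
    directly, with no separator inserted between them.
    """
    root = root.rstrip("/")
    return [root + line.strip() for line in part_lines if line.strip()]


if __name__ == "__main__":
    sample = ["/images/val2014/a.jpg\n", "/images/val2014/b.jpg\n", "\n"]
    for path in absolute_image_paths(sample, "/data/coco"):
        print(path)
```

To produce `5k.txt`, the returned list would be written out one path per line, with `root` set to the directory that contains `images/`.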


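●Technical challenges
1. Many object categories
2. Large variation in object appearance
3. Large variation in scenes
4. Scale, lighting, occlusion, and so on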

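YOLOv3: a detection network with one of the best trade-offs between speed and accuracy

YOLOv3's improvements:
1. A better backbone network (Darknet-53, ResNet-like)
2. Multi-scale prediction (FPN-like)
3. A better classifier (independent logistic classifiers in place of softmax)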
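
To make the multi-scale prediction concrete, the sketch below computes the shape of the three output maps. It assumes the standard YOLOv3 layout, which is not spelled out in this post: three prediction scales with strides 32, 16 and 8, three anchors per scale, and 5 + classes values per anchor (4 box coordinates, 1 objectness score, then the class scores). The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def yolo_v3_output_shapes(
    input_size: int = 416,
    num_classes: int = 80,
    anchors_per_scale: int = 3,
    strides: Sequence[int] = (32, 16, 8),
) -> List[Tuple[int, int, int]]:
    """Return (grid_h, grid_w, channels) for each prediction scale."""
    channels = anchors_per_scale * (5 + num_classes)
    shapes = []
    for stride in strides:
        grid = input_size // stride  # one grid cell per `stride` input pixels
        shapes.append((grid, grid, channels))
    return shapes


def total_predictions(shapes: Sequence[Tuple[int, int, int]], anchors_per_scale: int = 3) -> int:
    """Total number of candidate boxes before thresholding and NMS."""
    return sum(h * w * anchors_per_scale for h, w, _ in shapes)


if __name__ == "__main__":
    shapes = yolo_v3_output_shapes(416, 80)
    print(shapes)
    print(total_predictions(shapes))
```

With a 416x416 input and 80 classes this gives 13x13, 26x26 and 52x52 grids of 255 channels each; the coarse grid handles large objects and the fine grid small ones.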

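DarkNet framework: "Open Source Neural Networks in C"
1. A fairly lightweight open-source deep learning framework written entirely in C and CUDA
2. Easy to install, with no required dependencies (CUDA, cuDNN and OpenCV are optional), highly portable, and able to run on either CPU or GPU
3. Links: https://pjreddie.com/darknet/  https://github.com/pjreddie/darknet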

git clone https://github.com/pjreddie/darknet
cd darknet
make

Makefile settings (edit these before running make):
GPU=1
CUDNN=1
OPENCV=1
OPENMP=1
DEBUG=0

ARCH= -gencode arch=compute_30,code=sm_30 \
      -gencode arch=compute_35,code=sm_35 \
      -gencode arch=compute_50,code=[sm_50,compute_50] \
      -gencode arch=compute_52,code=[sm_52,compute_52] \
      -gencode arch=compute_60,code=sm_60 \
      -gencode arch=compute_61,code=[sm_61,compute_61] \
      -gencode arch=compute_75,code=sm_75

Testing the DarkNet build
./darknet imtest data/eagle.jpg


Training (starting from previously trained weights):
./darknet detector train cfg/coco.data cfg/yolov3.cfg backup/yolov3.backup
Testing: -thresh sets the confidence threshold (for example -thresh 0.4)
./darknet detector test cfg/coco.data cfg/yolov3.cfg backup/yolov3_20000.weights data/giraffe.jpg -thresh 0.4
./darknet detector test cfg/coco.data cfg/yolov3.cfg backup/yolov3_20000.weights data/person.jpg


Testing (through the Python interface):
python python/darknet.py
# Fails under Python 3.7.4: the script does not support Python 3
net = load_net("cfg/yolov3.cfg", "backup/yolov3_20000.weights", 0)
meta = load_meta("cfg/coco.data")
r = detect(net, meta, "data/person.jpg")
print(r)
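
The printed result is a list of detections. As far as I know, `detect` in `python/darknet.py` returns tuples of the form `(class_name, score, (x, y, w, h))`, where `x, y` is the box center in pixels; treat that layout as an assumption and check it against your copy of the script. Under that assumption, a small standalone helper (hypothetical names, made-up sample values, no darknet import needed) can apply a confidence threshold like `-thresh` and convert the boxes to corner coordinates:

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]
Detection = Tuple[str, float, Box]


def to_corners(box: Box) -> Box:
    """Convert (center_x, center_y, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)


def filter_detections(detections: Sequence[Detection], thresh: float = 0.4) -> List[Detection]:
    """Keep detections scoring at least `thresh`, highest score first, as corner boxes."""
    kept = [(name, score, to_corners(box)) for name, score, box in detections if score >= thresh]
    return sorted(kept, key=lambda det: det[1], reverse=True)


if __name__ == "__main__":
    # Made-up values shaped like the assumed output of detect().
    sample = [
        ("dog", 0.30, (10.0, 10.0, 4.0, 4.0)),
        ("person", 0.90, (100.0, 200.0, 50.0, 100.0)),
    ]
    for name, score, corners in filter_detections(sample, thresh=0.4):
        print(name, score, corners)
```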

yolov3.weights: a model trained for 500,000 iterations; it has to be downloaded
https://pjreddie.com/darknet/yolo/
https://pjreddie.com/media/files/yolov3.weights
Testing:
./darknet detector test cfg/coco.data cfg/yolov3.cfg backup/yolov3.weights data/person.jpg

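While the iteration count is below 1000, the weights are saved every 100 iterations; after that, they are saved every 10000 iterations.
Code location: examples/detector.c, line 138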
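
The save schedule can be restated as two small predicates. This is a Python paraphrase of the conditions in `train_detector` (`i%10000==0 || (i < 1000 && i%100 == 0)` for numbered weights and `i%100==0` for the rolling `.backup` file), written only to make the schedule easy to check; the function names are hypothetical.

```python
def saves_numbered_weights(iteration: int) -> bool:
    """True when <base>_<iteration>.weights is written."""
    return iteration % 10000 == 0 or (iteration < 1000 and iteration % 100 == 0)


def saves_rolling_backup(iteration: int) -> bool:
    """True when <base>.backup is overwritten (every 100 iterations)."""
    return iteration % 100 == 0


if __name__ == "__main__":
    snapshots = [i for i in range(1, 20001) if saves_numbered_weights(i)]
    print(snapshots)
```

Note that iteration 1000 itself produces no numbered file, so between iteration 900 and iteration 10000 only the `.backup` file is refreshed. That is why the training command above resumes from `backup/yolov3.backup`.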










Attached figures:
coco数据集.png (the COCO dataset)
coco的下载.png (downloading COCO)
yOLOv3.png
coco数据集上各算法的表现.png (performance of various algorithms on COCO)
yolov3.cfg.png
coco.data.png
detector.c.png
log01.png, log02.png, log03.png, log04.png
darknet_yolov3_coco_16188.png
darknet_yolov3_coco_20015.png
yolov3更好的主干网络.png (YOLOv3's better backbone network)
yolov3多尺度预测.png (YOLOv3 multi-scale prediction)
yolov3更好的分类器.png (YOLOv3's better classifier)
COCO1.jpg, COCO2.jpg, COCO3.jpg
Original poster, posted 2019-9-19 16:44:59
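[yolo]
# mask: indices into the anchors list below
mask = 0,1,2
# this layer uses the first three anchors (indices 0, 1, 2)
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=80
# total number of anchors in the list above (9); each [yolo] layer predicts with the 3 selected by mask
num=9
# data augmentation setting (random jitter)
jitter=.3
# controls which predictions enter the loss: a prediction whose IoU with a ground-truth box exceeds 0.7
# is excluded from the no-object part of the loss; usually set between 0.5 and 0.7
ignore_thresh = .7
truth_thresh = 1
# random=1 trains with randomly varying input sizes (multi-scale); 0 trains at a fixed size
random=1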
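
Two of these settings are easy to check by hand. The sketch below parses the `anchors` line into (width, height) pairs, picks the pairs selected by `mask`, and lists the input sizes that `random=1` can produce, based on the expression `(rand() % 10 + 10) * 32` in `train_detector` (examples/detector.c). The helper names are hypothetical and the code does not read a real cfg file.

```python
from typing import List, Sequence, Tuple


def parse_anchors(text: str) -> List[Tuple[int, int]]:
    """Turn 'w0,h0, w1,h1, ...' into a list of (width, height) pairs."""
    values = [int(v) for v in text.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("anchors must contain an even number of values")
    return list(zip(values[0::2], values[1::2]))


def anchors_for_mask(anchors: Sequence[Tuple[int, int]], mask: Sequence[int]) -> List[Tuple[int, int]]:
    """Select the anchors a single [yolo] layer actually uses."""
    return [anchors[i] for i in mask]


def random_training_sizes() -> List[int]:
    """All values of (rand() % 10 + 10) * 32, i.e. the possible multi-scale input sizes."""
    return [(k + 10) * 32 for k in range(10)]


if __name__ == "__main__":
    anchors = parse_anchors("10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326")
    print(len(anchors), anchors_for_mask(anchors, [0, 1, 2]))
    print(random_training_sizes())
```

So `num=9` is the length of the parsed list, while a layer with `mask = 0,1,2` works with the three smallest anchors, and multi-scale training moves between 320 and 608 pixels in steps of 32.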
Original poster, posted 2019-9-19 16:46:42
#include "darknet.h"

static int coco_ids[] = {1,2,3,4,5,6,7,8,9,10,11,13,14,15,16,17,18,19,20,21,22,23,24,25,27,28,31,32,33,34,35,36,37,38,39,40,41,42,43,44,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,67,70,72,73,74,75,76,77,78,79,80,81,82,84,85,86,87,88,89,90};


void train_detector(char *datacfg, char *cfgfile, char *weightfile, int *gpus, int ngpus, int clear)
{
    list *options = read_data_cfg(datacfg);
    char *train_images = option_find_str(options, "train", "data/train.list");
    char *backup_directory = option_find_str(options, "backup", "/backup/");

    srand(time(0));
    char *base = basecfg(cfgfile);
    printf("%s\n", base);
    float avg_loss = -1;
    // one network pointer per GPU
    network **nets = calloc(ngpus, sizeof(network *));

    srand(time(0));
    int seed = rand();
    int i;
    for(i = 0; i < ngpus; ++i){
        srand(seed);
#ifdef GPU
        cuda_set_device(gpus[i]);
#endif
        nets[i] = load_network(cfgfile, weightfile, clear);
        nets[i]->learning_rate *= ngpus;
    }
    srand(time(0));
    network *net = nets[0];

    int imgs = net->batch * net->subdivisions * ngpus;
    printf("Learning Rate: %g, Momentum: %g, Decay: %g\n", net->learning_rate, net->momentum, net->decay);
    data train, buffer;

    layer l = net->layers[net->n - 1];

    int classes = l.classes;
    float jitter = l.jitter;

    list *plist = get_paths(train_images);
    //int N = plist->size;
    char **paths = (char **)list_to_array(plist);

    load_args args = get_base_args(net);
    args.coords = l.coords;
    args.paths = paths;
    args.n = imgs;
    args.m = plist->size;
    args.classes = classes;
    args.jitter = jitter;
    args.num_boxes = l.max_boxes;
    args.d = &buffer;
    args.type = DETECTION_DATA;
    //args.type = INSTANCE_DATA;
    args.threads = 64;

    pthread_t load_thread = load_data(args);
    double time;
    int count = 0;
    //while(i*imgs < N*120){
    while(get_current_batch(net) < net->max_batches){
        // Multi-scale training (random=1): every 10 batches pick a new
        // input size between 320 and 608, always a multiple of 32.
        if(l.random && count++%10 == 0){
            printf("Resizing\n");
            int dim = (rand() % 10 + 10) * 32;
            // use 608 for the last 200 batches
            if (get_current_batch(net)+200 > net->max_batches) dim = 608;
            //int dim = (rand() % 4 + 16) * 32;
            printf("%d\n", dim);
            args.w = dim;
            args.h = dim;

            pthread_join(load_thread, 0);
            train = buffer;
            free_data(train);
            load_thread = load_data(args);

            #pragma omp parallel for
            for(i = 0; i < ngpus; ++i){
                resize_network(nets[i], dim, dim);
            }
            net = nets[0];
        }
        time=what_time_is_it_now();
        pthread_join(load_thread, 0);
        train = buffer;
        load_thread = load_data(args);

        /*
           int k;
           for(k = 0; k < l.max_boxes; ++k){
           box b = float_to_box(train.y.vals[10] + 1 + k*5);
           if(!b.x) break;
           printf("loaded: %f %f %f %f\n", b.x, b.y, b.w, b.h);
           }
         */
        /*
           int zz;
           for(zz = 0; zz < train.X.cols; ++zz){
           image im = float_to_image(net->w, net->h, 3, train.X.vals[zz]);
           int k;
           for(k = 0; k < l.max_boxes; ++k){
           box b = float_to_box(train.y.vals[zz] + k*5, 1);
           printf("%f %f %f %f\n", b.x, b.y, b.w, b.h);
           draw_bbox(im, b, 1, 1,0,0);
           }
           show_image(im, "truth11");
           cvWaitKey(0);
           save_image(im, "truth11");
           }
         */

        printf("Loaded: %lf seconds\n", what_time_is_it_now()-time);

        time=what_time_is_it_now();
        float loss = 0;
#ifdef GPU
        if(ngpus == 1){
            loss = train_network(net, train);
        } else {
            loss = train_networks(nets, ngpus, train, 4);
        }
#else
        loss = train_network(net, train);
#endif
        // exponential moving average of the loss
        if (avg_loss < 0) avg_loss = loss;
        avg_loss = avg_loss*.9 + loss*.1;

        i = get_current_batch(net);
        printf("%ld: %f, %f avg, %f rate, %lf seconds, %d images\n", get_current_batch(net), loss, avg_loss, get_current_rate(net), what_time_is_it_now()-time, i*imgs);
        // rolling <base>.backup checkpoint, overwritten every 100 iterations
        if(i%100==0){
#ifdef GPU
            if(ngpus != 1) sync_nets(nets, ngpus, 0);
#endif
            char buff[256];
            sprintf(buff, "%s/%s.backup", backup_directory, base);
            save_weights(net, buff);
        }
        // numbered snapshot: every 100 iterations below 1000, then every 10000
        if(i%10000==0 || (i < 1000 && i%100 == 0)){
#ifdef GPU
            if(ngpus != 1) sync_nets(nets, ngpus, 0);
#endif
            char buff[256];
            sprintf(buff, "%s/%s_%d.weights", backup_directory, base, i);
            save_weights(net, buff);
        }
        free_data(train);
    }
#ifdef GPU
    if(ngpus != 1) sync_nets(nets, ngpus, 0);
#endif
    char buff[256];
    sprintf(buff, "%s/%s_final.weights", backup_directory, base);
    save_weights(net, buff);
}


static int get_coco_image_id(char *filename)
{
    char *p = strrchr(filename, '/');
    char *c = strrchr(filename, '_');
    if(c) p = c;
    return atoi(p+1);
}

static void print_cocos(FILE *fp, char *image_path, detection *dets, int num_boxes, int classes, int w, int h)
{
    int i, j;
    int image_id = get_coco_image_id(image_path);
    for(i = 0; i < num_boxes; ++i){
        // convert center/size boxes to corners and clip to the image
        float xmin = dets[i].bbox.x - dets[i].bbox.w/2.;
        float xmax = dets[i].bbox.x + dets[i].bbox.w/2.;
        float ymin = dets[i].bbox.y - dets[i].bbox.h/2.;
        float ymax = dets[i].bbox.y + dets[i].bbox.h/2.;

        if (xmin < 0) xmin = 0;
        if (ymin < 0) ymin = 0;
        if (xmax > w) xmax = w;
        if (ymax > h) ymax = h;

        // COCO results use [x, y, width, height]
        float bx = xmin;
        float by = ymin;
        float bw = xmax - xmin;
        float bh = ymax - ymin;

        for(j = 0; j < classes; ++j){
            if (dets[i].prob[j]) fprintf(fp, "{\"image_id\":%d, \"category_id\":%d, \"bbox\":[%f, %f, %f, %f], \"score\":%f},\n", image_id, coco_ids[j], bx, by, bw, bh, dets[i].prob[j]);
        }
    }
}

void print_detector_detections(FILE **fps, char *id, detection *dets, int total, int classes, int w, int h)
{
    int i, j;
    for(i = 0; i < total; ++i){
        // the +1 makes the coordinates 1-based
        float xmin = dets[i].bbox.x - dets[i].bbox.w/2. + 1;
        float xmax = dets[i].bbox.x + dets[i].bbox.w/2. + 1;
        float ymin = dets[i].bbox.y - dets[i].bbox.h/2. + 1;
        float ymax = dets[i].bbox.y + dets[i].bbox.h/2. + 1;

        if (xmin < 1) xmin = 1;
        if (ymin < 1) ymin = 1;
        if (xmax > w) xmax = w;
        if (ymax > h) ymax = h;

        for(j = 0; j < classes; ++j){
            if (dets[i].prob[j]) fprintf(fps[j], "%s %f %f %f %f %f\n", id, dets[i].prob[j],
                    xmin, ymin, xmax, ymax);
        }
    }
}

void print_imagenet_detections(FILE *fp, int id, detection *dets, int total, int classes, int w, int h)
{
    int i, j;
    for(i = 0; i < total; ++i){
        float xmin = dets[i].bbox.x - dets[i].bbox.w/2.;
        float xmax = dets[i].bbox.x + dets[i].bbox.w/2.;
        float ymin = dets[i].bbox.y - dets[i].bbox.h/2.;
        float ymax = dets[i].bbox.y + dets[i].bbox.h/2.;

        if (xmin < 0) xmin = 0;
        if (ymin < 0) ymin = 0;
        if (xmax > w) xmax = w;
        if (ymax > h) ymax = h;

        for(j = 0; j < classes; ++j){
            int class = j;
            if (dets[i].prob[class]) fprintf(fp, "%d %d %f %f %f %f %f\n", id, j+1, dets[i].prob[class],
                    xmin, ymin, xmax, ymax);
        }
    }
}

void validate_detector_flip(char *datacfg, char *cfgfile, char *weightfile, char *outfile)
{
    int j;
    list *options = read_data_cfg(datacfg);
    char *valid_images = option_find_str(options, "valid", "data/train.list");
    char *name_list = option_find_str(options, "names", "data/names.list");
    char *prefix = option_find_str(options, "results", "results");
    char **names = get_labels(name_list);
    char *mapf = option_find_str(options, "map", 0);
    int *map = 0;
    if (mapf) map = read_map(mapf);

    network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 2);
    fprintf(stderr, "Learning Rate: %g, Momentum: %g, Decay: %g\n", net->learning_rate, net->momentum, net->decay);
    srand(time(0));

    list *plist = get_paths(valid_images);
    char **paths = (char **)list_to_array(plist);

    layer l = net->layers[net->n-1];
    int classes = l.classes;

    char buff[1024];
    char *type = option_find_str(options, "eval", "voc");
    FILE *fp = 0;
    FILE **fps = 0;
    int coco = 0;
    int imagenet = 0;
    if(0==strcmp(type, "coco")){
        if(!outfile) outfile = "coco_results";
        snprintf(buff, 1024, "%s/%s.json", prefix, outfile);
        fp = fopen(buff, "w");
        fprintf(fp, "[\n");
        coco = 1;
    } else if(0==strcmp(type, "imagenet")){
        if(!outfile) outfile = "imagenet-detection";
        snprintf(buff, 1024, "%s/%s.txt", prefix, outfile);
        fp = fopen(buff, "w");
        imagenet = 1;
        classes = 200;
    } else {
        if(!outfile) outfile = "comp4_det_test_";
        fps = calloc(classes, sizeof(FILE *));
        for(j = 0; j < classes; ++j){
            snprintf(buff, 1024, "%s/%s%s.txt", prefix, outfile, names[j]);
            fps[j] = fopen(buff, "w");
        }
    }

    int m = plist->size;
    int i=0;
    int t;

    float thresh = .005;
    float nms = .45;

    int nthreads = 4;
    image *val = calloc(nthreads, sizeof(image));
    image *val_resized = calloc(nthreads, sizeof(image));
    image *buf = calloc(nthreads, sizeof(image));
    image *buf_resized = calloc(nthreads, sizeof(image));
    pthread_t *thr = calloc(nthreads, sizeof(pthread_t));

    image input = make_image(net->w, net->h, net->c*2);

    load_args args = {0};
    args.w = net->w;
    args.h = net->h;
    //args.type = IMAGE_DATA;
    args.type = LETTERBOX_DATA;

    for(t = 0; t < nthreads; ++t){
        args.path = paths[i+t];
        args.im = &buf[t];
        args.resized = &buf_resized[t];
        thr[t] = load_data_in_thread(args);
    }
    double start = what_time_is_it_now();
    for(i = nthreads; i < m+nthreads; i += nthreads){
        fprintf(stderr, "%d\n", i);
        for(t = 0; t < nthreads && i+t-nthreads < m; ++t){
            pthread_join(thr[t], 0);
            val[t] = buf[t];
            val_resized[t] = buf_resized[t];
        }
        for(t = 0; t < nthreads && i+t < m; ++t){
            args.path = paths[i+t];
            args.im = &buf[t];
            args.resized = &buf_resized[t];
            thr[t] = load_data_in_thread(args);
        }
        for(t = 0; t < nthreads && i+t-nthreads < m; ++t){
            char *path = paths[i+t-nthreads];
            char *id = basecfg(path);
            copy_cpu(net->w*net->h*net->c, val_resized[t].data, 1, input.data, 1);
            flip_image(val_resized[t]);
            copy_cpu(net->w*net->h*net->c, val_resized[t].data, 1, input.data + net->w*net->h*net->c, 1);

            network_predict(net, input.data);
            int w = val[t].w;
            int h = val[t].h;
            int num = 0;
            detection *dets = get_network_boxes(net, w, h, thresh, .5, map, 0, &num);
            if (nms) do_nms_sort(dets, num, classes, nms);
            if (coco){
                print_cocos(fp, path, dets, num, classes, w, h);
            } else if (imagenet){
                print_imagenet_detections(fp, i+t-nthreads+1, dets, num, classes, w, h);
            } else {
                print_detector_detections(fps, id, dets, num, classes, w, h);
            }
            free_detections(dets, num);
            free(id);
            free_image(val[t]);
            free_image(val_resized[t]);
        }
    }
    for(j = 0; j < classes; ++j){
        if(fps) fclose(fps[j]);
    }
    if(coco){
        fseek(fp, -2, SEEK_CUR);
        fprintf(fp, "\n]\n");
        fclose(fp);
    }
    fprintf(stderr, "Total Detection Time: %f Seconds\n", what_time_is_it_now() - start);
}


void validate_detector(char *datacfg, char *cfgfile, char *weightfile, char *outfile)
{
    int j;
    list *options = read_data_cfg(datacfg);
    char *valid_images = option_find_str(options, "valid", "data/train.list");
    char *name_list = option_find_str(options, "names", "data/names.list");
    char *prefix = option_find_str(options, "results", "results");
    char **names = get_labels(name_list);
    char *mapf = option_find_str(options, "map", 0);
    int *map = 0;
    if (mapf) map = read_map(mapf);

    network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 1);
    fprintf(stderr, "Learning Rate: %g, Momentum: %g, Decay: %g\n", net->learning_rate, net->momentum, net->decay);
    srand(time(0));

    list *plist = get_paths(valid_images);
    char **paths = (char **)list_to_array(plist);

    layer l = net->layers[net->n-1];
    int classes = l.classes;

    char buff[1024];
    char *type = option_find_str(options, "eval", "voc");
    FILE *fp = 0;
    FILE **fps = 0;
    int coco = 0;
    int imagenet = 0;
    if(0==strcmp(type, "coco")){
        if(!outfile) outfile = "coco_results";
        snprintf(buff, 1024, "%s/%s.json", prefix, outfile);
        fp = fopen(buff, "w");
        fprintf(fp, "[\n");
        coco = 1;
    } else if(0==strcmp(type, "imagenet")){
        if(!outfile) outfile = "imagenet-detection";
        snprintf(buff, 1024, "%s/%s.txt", prefix, outfile);
        fp = fopen(buff, "w");
        imagenet = 1;
        classes = 200;
    } else {
        if(!outfile) outfile = "comp4_det_test_";
        fps = calloc(classes, sizeof(FILE *));
        for(j = 0; j < classes; ++j){
            snprintf(buff, 1024, "%s/%s%s.txt", prefix, outfile, names[j]);
            fps[j] = fopen(buff, "w");
        }
    }


    int m = plist->size;
    int i=0;
    int t;

    float thresh = .005;
    float nms = .45;

    int nthreads = 4;
    image *val = calloc(nthreads, sizeof(image));
    image *val_resized = calloc(nthreads, sizeof(image));
    image *buf = calloc(nthreads, sizeof(image));
    image *buf_resized = calloc(nthreads, sizeof(image));
    pthread_t *thr = calloc(nthreads, sizeof(pthread_t));

    load_args args = {0};
    args.w = net->w;
    args.h = net->h;
    //args.type = IMAGE_DATA;
    args.type = LETTERBOX_DATA;

    for(t = 0; t < nthreads; ++t){
        args.path = paths[i+t];
        args.im = &buf[t];
        args.resized = &buf_resized[t];
        thr[t] = load_data_in_thread(args);
    }
    double start = what_time_is_it_now();
    for(i = nthreads; i < m+nthreads; i += nthreads){
        fprintf(stderr, "%d\n", i);
        for(t = 0; t < nthreads && i+t-nthreads < m; ++t){
            pthread_join(thr[t], 0);
            val[t] = buf[t];
            val_resized[t] = buf_resized[t];
        }
        for(t = 0; t < nthreads && i+t < m; ++t){
            args.path = paths[i+t];
            args.im = &buf[t];
            args.resized = &buf_resized[t];
            thr[t] = load_data_in_thread(args);
        }
        for(t = 0; t < nthreads && i+t-nthreads < m; ++t){
            char *path = paths[i+t-nthreads];
            char *id = basecfg(path);
            float *X = val_resized[t].data;
            network_predict(net, X);
            int w = val[t].w;
            int h = val[t].h;
            int nboxes = 0;
            detection *dets = get_network_boxes(net, w, h, thresh, .5, map, 0, &nboxes);
            if (nms) do_nms_sort(dets, nboxes, classes, nms);
            if (coco){
                print_cocos(fp, path, dets, nboxes, classes, w, h);
            } else if (imagenet){
                print_imagenet_detections(fp, i+t-nthreads+1, dets, nboxes, classes, w, h);
            } else {
                print_detector_detections(fps, id, dets, nboxes, classes, w, h);
            }
            free_detections(dets, nboxes);
            free(id);
            free_image(val[t]);
            free_image(val_resized[t]);
        }
    }
    for(j = 0; j < classes; ++j){
        if(fps) fclose(fps[j]);
    }
    if(coco){
        fseek(fp, -2, SEEK_CUR);
        fprintf(fp, "\n]\n");
        fclose(fp);
    }
    fprintf(stderr, "Total Detection Time: %f Seconds\n", what_time_is_it_now() - start);
}

void validate_detector_recall(char *cfgfile, char *weightfile)
{
    network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 1);
    fprintf(stderr, "Learning Rate: %g, Momentum: %g, Decay: %g\n", net->learning_rate, net->momentum, net->decay);
    srand(time(0));

    list *plist = get_paths("data/coco_val_5k.list");
    char **paths = (char **)list_to_array(plist);

    layer l = net->layers[net->n-1];

    int j, k;

    int m = plist->size;
    int i=0;

    float thresh = .001;
    float iou_thresh = .5;
    float nms = .4;

    int total = 0;
    int correct = 0;
    int proposals = 0;
    float avg_iou = 0;

    for(i = 0; i < m; ++i){
        char *path = paths[i];
        image orig = load_image_color(path, 0, 0);
        image sized = resize_image(orig, net->w, net->h);
        char *id = basecfg(path);
        network_predict(net, sized.data);
        int nboxes = 0;
        detection *dets = get_network_boxes(net, sized.w, sized.h, thresh, .5, 0, 1, &nboxes);
        if (nms) do_nms_obj(dets, nboxes, 1, nms);

        // derive the label file path from the image path
        char labelpath[4096];
        find_replace(path, "images", "labels", labelpath);
        find_replace(labelpath, "JPEGImages", "labels", labelpath);
        find_replace(labelpath, ".jpg", ".txt", labelpath);
        find_replace(labelpath, ".JPEG", ".txt", labelpath);

        int num_labels = 0;
        box_label *truth = read_boxes(labelpath, &num_labels);
        for(k = 0; k < nboxes; ++k){
            if(dets[k].objectness > thresh){
                ++proposals;
            }
        }
        for (j = 0; j < num_labels; ++j) {
            ++total;
            box t = {truth[j].x, truth[j].y, truth[j].w, truth[j].h};
            float best_iou = 0;
            // Changed from the posted listing (k < l.w*l.h*l.n): iterate only over
            // the nboxes detections actually returned, to avoid reading past dets.
            for(k = 0; k < nboxes; ++k){
                float iou = box_iou(dets[k].bbox, t);
                if(dets[k].objectness > thresh && iou > best_iou){
                    best_iou = iou;
                }
            }
            avg_iou += best_iou;
            // a ground-truth box counts as recalled when its best IoU exceeds iou_thresh
            if(best_iou > iou_thresh){
                ++correct;
            }
        }

        fprintf(stderr, "%5d %5d %5d\tRPs/Img: %.2f\tIOU: %.2f%%\tRecall:%.2f%%\n", i, correct, total, (float)proposals/(i+1), avg_iou*100/total, 100.*correct/total);
        free(id);
        free_image(orig);
        free_image(sized);
    }
}


void test_detector(char *datacfg, char *cfgfile, char *weightfile, char *filename, float thresh, float hier_thresh, char *outfile, int fullscreen)
{
    list *options = read_data_cfg(datacfg);
    char *name_list = option_find_str(options, "names", "data/names.list");
    char **names = get_labels(name_list);

    image **alphabet = load_alphabet();
    network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 1);
    srand(2222222);
    double time;
    char buff[256];
    char *input = buff;
    float nms=.45;
    while(1){
        if(filename){
            strncpy(input, filename, 256);
        } else {
            printf("Enter Image Path: ");
            fflush(stdout);
            input = fgets(input, 256, stdin);
            if(!input) return;
            strtok(input, "\n");
        }
        image im = load_image_color(input,0,0);
        image sized = letterbox_image(im, net->w, net->h);
        //image sized = resize_image(im, net->w, net->h);
        //image sized2 = resize_max(im, net->w);
        //image sized = crop_image(sized2, -((net->w - sized2.w)/2), -((net->h - sized2.h)/2), net->w, net->h);
        //resize_network(net, sized.w, sized.h);
        layer l = net->layers[net->n-1];


        float *X = sized.data;
        time=what_time_is_it_now();
        network_predict(net, X);
        printf("%s: Predicted in %f seconds.\n", input, what_time_is_it_now()-time);
        int nboxes = 0;
        detection *dets = get_network_boxes(net, im.w, im.h, thresh, hier_thresh, 0, 1, &nboxes);
        //printf("%d\n", nboxes);
        //if (nms) do_nms_obj(boxes, probs, l.w*l.h*l.n, l.classes, nms);
        if (nms) do_nms_sort(dets, nboxes, l.classes, nms);
        draw_detections(im, dets, nboxes, thresh, names, alphabet, l.classes);
        free_detections(dets, nboxes);
        if(outfile){
            save_image(im, outfile);
        }
        else{
            save_image(im, "predictions");
#ifdef OPENCV
            make_window("predictions", 512, 512, 0);
            show_image(im, "predictions", 0);
#endif
        }

        free_image(im);
        free_image(sized);
        if (filename) break;
    }
}

  552. /*
  553. void censor_detector(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename, int class, float thresh, int skip)
  554. {
  555. #ifdef OPENCV
  556.     char *base = basecfg(cfgfile);
  557.     network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 1);

    srand(2222222);
    CvCapture * cap;

    int w = 1280;
    int h = 720;

    if(filename){
        cap = cvCaptureFromFile(filename);
    }else{
        cap = cvCaptureFromCAM(cam_index);
    }

    // Check the capture handle before setting properties on it.
    if(!cap) error("Couldn't connect to webcam.\n");

    if(w){
        cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_WIDTH, w);
    }
    if(h){
        cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_HEIGHT, h);
    }

    cvNamedWindow(base, CV_WINDOW_NORMAL);
    cvResizeWindow(base, 512, 512);
    float fps = 0;
    int i;
    float nms = .45;

    while(1){
        image in = get_image_from_stream(cap);
        //image in_s = resize_image(in, net->w, net->h);
        image in_s = letterbox_image(in, net->w, net->h);
        layer l = net->layers[net->n-1];

        float *X = in_s.data;
        network_predict(net, X);
        int nboxes = 0;
        detection *dets = get_network_boxes(net, in.w, in.h, thresh, 0, 0, 0, &nboxes);
        //if (nms) do_nms_obj(boxes, probs, l.w*l.h*l.n, l.classes, nms);
        if (nms) do_nms_sort(dets, nboxes, l.classes, nms);

        // Censor every detection of the requested class above the threshold.
        for(i = 0; i < nboxes; ++i){
            if(dets[i].prob[class] > thresh){
                box b = dets[i].bbox;
                // Convert the box center to its top-left corner.
                int left  = b.x-b.w/2.;
                int top   = b.y-b.h/2.;
                censor_image(in, left, top, b.w, b.h);
            }
        }
        show_image(in, base);
        cvWaitKey(10);
        free_detections(dets, nboxes);

        free_image(in_s);
        free_image(in);

        float curr = 0;
        fps = .9*fps + .1*curr;
        // Read and discard `skip` frames before the next processed frame.
        for(i = 0; i < skip; ++i){
            image in = get_image_from_stream(cap);
            free_image(in);
        }
    }
    #endif
}

void extract_detector(char *datacfg, char *cfgfile, char *weightfile, int cam_index, const char *filename, int class, float thresh, int skip)
{
#ifdef OPENCV
    char *base = basecfg(cfgfile);
    network *net = load_network(cfgfile, weightfile, 0);
    set_batch_network(net, 1);

    srand(2222222);
    CvCapture * cap;

    int w = 1280;
    int h = 720;

    if(filename){
        cap = cvCaptureFromFile(filename);
    }else{
        cap = cvCaptureFromCAM(cam_index);
    }

    // Check the capture handle before setting properties on it.
    if(!cap) error("Couldn't connect to webcam.\n");

    if(w){
        cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_WIDTH, w);
    }
    if(h){
        cvSetCaptureProperty(cap, CV_CAP_PROP_FRAME_HEIGHT, h);
    }

    cvNamedWindow(base, CV_WINDOW_NORMAL);
    cvResizeWindow(base, 512, 512);
    float fps = 0;
    int i;
    int count = 0;
    float nms = .45;

    while(1){
        image in = get_image_from_stream(cap);
        //image in_s = resize_image(in, net->w, net->h);
        image in_s = letterbox_image(in, net->w, net->h);
        layer l = net->layers[net->n-1];

        show_image(in, base);

        int nboxes = 0;
        float *X = in_s.data;
        network_predict(net, X);
        detection *dets = get_network_boxes(net, in.w, in.h, thresh, 0, 0, 1, &nboxes);
        //if (nms) do_nms_obj(boxes, probs, l.w*l.h*l.n, l.classes, nms);
        if (nms) do_nms_sort(dets, nboxes, l.classes, nms);

        // Save a square crop around every detection of the requested class.
        for(i = 0; i < nboxes; ++i){
            if(dets[i].prob[class] > thresh){
                box b = dets[i].bbox;
                // Box values are scaled by the frame size; the longer side sets the crop size.
                int size = b.w*in.w > b.h*in.h ? b.w*in.w : b.h*in.h;
                int dx  = b.x*in.w-size/2.;
                int dy  = b.y*in.h-size/2.;
                image bim = crop_image(in, dx, dy, size, size);
                char buff[2048];
                sprintf(buff, "results/extract/%07d", count);
                ++count;
                save_image(bim, buff);
                free_image(bim);
            }
        }
        free_detections(dets, nboxes);

        free_image(in_s);
        free_image(in);

        float curr = 0;
        fps = .9*fps + .1*curr;
        // Read and discard `skip` frames before the next processed frame.
        for(i = 0; i < skip; ++i){
            image in = get_image_from_stream(cap);
            free_image(in);
        }
    }
    #endif
}
*/

/*
void network_detect(network *net, image im, float thresh, float hier_thresh, float nms, detection *dets)
{
    network_predict_image(net, im);
    layer l = net->layers[net->n-1];
    int nboxes = num_boxes(net);
    fill_network_boxes(net, im.w, im.h, thresh, hier_thresh, 0, 0, dets);
    if (nms) do_nms_sort(dets, nboxes, l.classes, nms);
}
*/

void run_detector(int argc, char **argv)
{
    // Optional flags; the last argument of each lookup is the default value.
    char *prefix = find_char_arg(argc, argv, "-prefix", 0);
    float thresh = find_float_arg(argc, argv, "-thresh", .5);
    float hier_thresh = find_float_arg(argc, argv, "-hier", .5);
    int cam_index = find_int_arg(argc, argv, "-c", 0);
    int frame_skip = find_int_arg(argc, argv, "-s", 0);
    int avg = find_int_arg(argc, argv, "-avg", 3);
    if(argc < 4){
        fprintf(stderr, "usage: %s %s [train/test/valid] [cfg] [weights (optional)]\n", argv[0], argv[1]);
        return;
    }
    char *gpu_list = find_char_arg(argc, argv, "-gpus", 0);
    char *outfile = find_char_arg(argc, argv, "-out", 0);
    int *gpus = 0;
    int gpu = 0;
    int ngpus = 0;
    if(gpu_list){
        // Parse a comma-separated GPU list such as "0,1,2".
        printf("%s\n", gpu_list);
        int len = strlen(gpu_list);
        ngpus = 1;
        int i;
        for(i = 0; i < len; ++i){
            if (gpu_list[i] == ',') ++ngpus;
        }
        gpus = calloc(ngpus, sizeof(int));
        for(i = 0; i < ngpus; ++i){
            gpus[i] = atoi(gpu_list);
            // strchr returns NULL after the last entry, so advance only when a comma was found.
            char *comma = strchr(gpu_list, ',');
            if(comma) gpu_list = comma + 1;
        }
    } else {
        // No -gpus flag: fall back to the single default GPU.
        gpu = gpu_index;
        gpus = &gpu;
        ngpus = 1;
    }

    int clear = find_arg(argc, argv, "-clear");
    int fullscreen = find_arg(argc, argv, "-fullscreen");
    int width = find_int_arg(argc, argv, "-w", 0);
    int height = find_int_arg(argc, argv, "-h", 0);
    int fps = find_int_arg(argc, argv, "-fps", 0);
    //int class = find_int_arg(argc, argv, "-class", 0);

    // Positional arguments: argv[2] = mode, argv[3] = data file, argv[4] = cfg,
    // argv[5] = weights (optional), argv[6] = input file (optional).
    char *datacfg = argv[3];
    char *cfg = argv[4];
    char *weights = (argc > 5) ? argv[5] : 0;
    char *filename = (argc > 6) ? argv[6]: 0;

    // Dispatch on the mode string.
    if(0==strcmp(argv[2], "test")) test_detector(datacfg, cfg, weights, filename, thresh, hier_thresh, outfile, fullscreen);
    else if(0==strcmp(argv[2], "train")) train_detector(datacfg, cfg, weights, gpus, ngpus, clear);
    else if(0==strcmp(argv[2], "valid")) validate_detector(datacfg, cfg, weights, outfile);
    else if(0==strcmp(argv[2], "valid2")) validate_detector_flip(datacfg, cfg, weights, outfile);
    else if(0==strcmp(argv[2], "recall")) validate_detector_recall(cfg, weights);
    else if(0==strcmp(argv[2], "demo")) {
        list *options = read_data_cfg(datacfg);
        int classes = option_find_int(options, "classes", 20);
        char *name_list = option_find_str(options, "names", "data/names.list");
        char **names = get_labels(name_list);
        demo(cfg, weights, thresh, cam_index, filename, names, classes, frame_skip, prefix, avg, hier_thresh, width, height, fps, fullscreen);
    }
    //else if(0==strcmp(argv[2], "extract")) extract_detector(datacfg, cfg, weights, cam_index, filename, class, thresh, frame_skip);
    //else if(0==strcmp(argv[2], "censor")) censor_detector(datacfg, cfg, weights, cam_index, filename, class, thresh, frame_skip);
}
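The `-gpus` handling in `run_detector` sizes the array by counting commas and then walks the string with `atoi` and `strchr`. The following is a minimal standalone sketch of that parsing, not darknet code: the helper name `parse_gpu_list` is hypothetical and introduced only for illustration. It assumes the input is a well-formed list such as "0,1,3" and stops advancing once `strchr` finds no further comma.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of darknet): parse a comma-separated
 * list of GPU indices such as "0,1,3" into a heap-allocated int array.
 * The entry count is stored in *ngpus; the caller frees the result. */
int *parse_gpu_list(const char *gpu_list, int *ngpus)
{
    assert(gpu_list != NULL && ngpus != NULL);

    /* One entry, plus one more for every comma in the string. */
    int n = 1;
    size_t len = strlen(gpu_list);
    for (size_t i = 0; i < len; ++i) {
        if (gpu_list[i] == ',') ++n;
    }

    int *gpus = calloc((size_t)n, sizeof(int));
    if (!gpus) {
        *ngpus = 0;
        return NULL;
    }

    const char *p = gpu_list;
    for (int k = 0; k < n; ++k) {
        gpus[k] = atoi(p);                  /* atoi stops at the next comma */
        const char *comma = strchr(p, ',');
        if (!comma) break;                  /* last entry: nothing follows */
        p = comma + 1;
    }

    *ngpus = n;
    return gpus;
}
```

With this sketch, calling `parse_gpu_list("0,1,3", &n)` fills the array with 0, 1, 3 and sets `n` to 3. Returning the count through a pointer keeps the array and its length together, much as `gpus` and `ngpus` are passed as a pair to `train_detector`.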
Posted on 2020-2-3 15:51:04
Thank you, teacher, for sharing this material.