(The attached log is the same as the run output below.)

Steps to reproduce:
1. Run test.py in examples/tflite/mobilenet_v1.
2. Evaluate once with the simulator and once with a development board connected. The results are as follows:

---------- Simulator evaluation ----------
--> config model
done
--> Loading model
W The target_platform is not set in config, using default target platform rk1808.
done
--> Building model
done
--> Export RKNN model
done
--> Init runtime environment
librknn_runtime version 1.7.3 (5047ff8 build: 2022-08-13 12:11:22 base: 1131)
done
--> Running model
mobilenet_v1
-----TOP 5-----
[156]: 0.8642578125
[155]: 0.083740234375
[205]: 0.01241302490234375
[284]: 0.006565093994140625
[194]: 0.002044677734375
done
--> Evaluate model performance
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
                               Performance
========================================================================
Layer ID  Name                                Time(us)
60        openvx.tensor_transpose_3           72
1         convolution.relu.pooling.layer2_2   369
3         convolution.relu.pooling.layer2_2   211
5         convolution.relu.pooling.layer2_2   184
7         convolution.relu.pooling.layer2_2   315
9         convolution.relu.pooling.layer2_2   99
11        convolution.relu.pooling.layer2_2   137
13        convolution.relu.pooling.layer2_2   103
15        convolution.relu.pooling.layer2_2   116
17        convolution.relu.pooling.layer2_2   95
19        convolution.relu.pooling.layer2_2   102
21        convolution.relu.pooling.layer2_2   151
23        convolution.relu.pooling.layer2_2   95
25        convolution.relu.pooling.layer2_2   109
27        convolution.relu.pooling.layer2_2   106
29        convolution.relu.pooling.layer2_2   211
31        convolution.relu.pooling.layer2_2   106
33        convolution.relu.pooling.layer2_2   211
35        convolution.relu.pooling.layer2_2   106
37        convolution.relu.pooling.layer2_2   211
39        convolution.relu.pooling.layer2_2   106
41        convolution.relu.pooling.layer2_2   211
43        convolution.relu.pooling.layer2_2   106
45        convolution.relu.pooling.layer2_2   211
47        convolution.relu.pooling.layer2_2   108
49        convolution.relu.pooling.layer2_2   163
51        convolution.relu.pooling.layer2_2   206
53        convolution.relu.pooling.layer2_2   319
55        pooling.layer2                      34
56        fullyconnected.relu.layer_3         110
58        softmaxlayer2.layer                 39
Total Time(us): 4722
FPS(600MHz): 158.83
FPS(800MHz): 211.77
Note: Time of each layer is converted according to 800MHz!
========================================================================
done

---------- Development board evaluation ----------
--> config model
done
--> Loading model
done
--> Building model
done
--> Export RKNN model
done
--> Init runtime environment
W Flag perf_debug has been set, it will affect the performance of inference!
I NPUTransfer: Starting NPU Transfer Client, Transfer version 2.1.0 (b5861e7@2020-11-23T11:50:36)
D RKNNAPI: ==============================================
D RKNNAPI: RKNN VERSION:
D RKNNAPI:   API: 1.7.3 (0cfd4a1 build: 2022-08-15 17:08:57)
D RKNNAPI:   DRV: 1.7.0 (7880361 build: 2021-08-16 14:05:08)
D RKNNAPI: ==============================================
done
--> Running model
mobilenet_v1
-----TOP 5-----
[156]: 0.8515625
[155]: 0.091796875
[205]: 0.0135955810546875
[284]: 0.0064697265625
[194 260]: 0.002239227294921875
done
--> Evaluate model performance
W When performing performance evaluation, inputs can be set to None to use fake inputs.
========================================================================
                               Performance
         #### The performance result is just for debugging, ####
         #### may worse than actual performance!            ####
========================================================================
Layer ID  Name                                                          Operator         Uid  Time(us)
0         MobilenetV1/MobilenetV1/Conv2d_0/Relu6_1_RKNN_mark_perm_60_0  TENSOR_TRANS     60   361
2         MobilenetV1/MobilenetV1/Conv2d_0/Relu6_1_2                    CONVOLUTION      1    920
3         MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6_3_2          DEPTH_WISE_CONV  3    896
4         MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6_5_2          CONVOLUTION      5    1106
5         MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6_7_2          DEPTH_WISE_CONV  7    430
6         MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6_9_2          CONVOLUTION      9    5080
7         MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6_11_2         DEPTH_WISE_CONV  11   4888
8         MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6_13_2         CONVOLUTION      13   5043
9         MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6_15_2         DEPTH_WISE_CONV  15   148
10        MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6_17_2         CONVOLUTION      17   243
11        MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6_19_2         DEPTH_WISE_CONV  19   161
12        MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6_21_2         CONVOLUTION      21   196
13        MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6_23_2         DEPTH_WISE_CONV  23   74
14        MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6_25_2         CONVOLUTION      25   109
15        MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6_27_2         DEPTH_WISE_CONV  27   74
16        MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6_29_2         CONVOLUTION      29   157
17        MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6_31_2         DEPTH_WISE_CONV  31   69
18        MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6_33_2         CONVOLUTION      33   160
19        MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6_35_2         DEPTH_WISE_CONV  35   69
20        MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6_37_2         CONVOLUTION      37   157
21        MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6_39_2        DEPTH_WISE_CONV  39   67
22        MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6_41_2        CONVOLUTION      41   158
23        MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6_43_2        DEPTH_WISE_CONV  43   72
24        MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6_45_2        CONVOLUTION      45   159
25        MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6_47_2        DEPTH_WISE_CONV  47   73
26        MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6_49_2        CONVOLUTION      49   128
27        MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6_51_2        DEPTH_WISE_CONV  51   69
28        MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6_53_2        CONVOLUTION      53   209
29        MobilenetV1/Logits/AvgPool_1a/AvgPool_55_2                    DEPTH_WISE_CONV  55   128
30        MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd_56_0                 FULLYCONNECTED   56   477
1         Softmax2Layer_1                                               SOFTMAX               945
Total Time(us): 22826
FPS: 43.81
========================================================================
done

For the same model, the inference time differs by roughly 5x: 22826 us on the development board versus 4722 us in the simulator. (In a later test, a yolov5s model evaluated on the development board ran at only about 4 FPS.)

What causes this difference, and how can it be optimized?
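
As a quick check on the "roughly 5x" figure, the totals from the two performance reports can be compared directly. The sketch below is not part of test.py and does not call the RKNN toolkit; it only redoes the arithmetic on numbers copied from the logs. The constant and function names (SIM_TOTAL_US, BOARD_TOTAL_US, BOARD_SLOWEST_US, fps, slowdown, share_of_total) are made up for illustration.

```python
# Totals copied from the "Total Time(us)" lines of the two reports.
SIM_TOTAL_US = 4722      # simulator; layer times are converted to 800 MHz
BOARD_TOTAL_US = 22826   # development board; perf_debug was enabled

# The three slowest layers in the board report (Uid 9, 11 and 13).
BOARD_SLOWEST_US = {
    "Conv2d_2_pointwise": 5080,
    "Conv2d_3_depthwise": 4888,
    "Conv2d_3_pointwise": 5043,
}


def fps(total_us: int) -> float:
    """Frames per second for one inference that takes total_us microseconds."""
    return 1_000_000 / total_us


def slowdown(board_us: int, sim_us: int) -> float:
    """How many times slower the board run is than the simulator run."""
    return board_us / sim_us


def share_of_total(layer_times_us: dict, total_us: int) -> float:
    """Fraction of the total time spent in the given layers."""
    return sum(layer_times_us.values()) / total_us


if __name__ == "__main__":
    print(f"simulator FPS : {fps(SIM_TOTAL_US):.2f}")
    print(f"board FPS     : {fps(BOARD_TOTAL_US):.2f}")
    print(f"slowdown      : {slowdown(BOARD_TOTAL_US, SIM_TOTAL_US):.2f}x")
    print(f"top-3 share   : {share_of_total(BOARD_SLOWEST_US, BOARD_TOTAL_US):.1%}")
```

With these numbers the ratio works out to about 4.8x, and the three layers listed account for about two thirds of the board total, so they may be a reasonable place to start looking. The board run also had perf_debug enabled, which the log itself warns affects inference performance.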