EvalDNN

A Toolbox for Deep Neural Network Models

Accuracy

* Officially reported figures are shown in parentheses.

Model Top-1 Top-5
inception_resnet_v2 80.4%(80.4%) 95.3%(95.3%)
inception_v1 69.8%(69.8%) 89.6%(89.6%)
inception_v2 74.0%(73.9%) 91.8%(91.8%)
inception_v3 78.0%(78.0%) 93.9%(93.9%)
inception_v4 80.2%(80.2%) 95.2%(95.2%)
mobilenet_v1_0_25_128 41.4%(41.5%) 66.3%(66.3%)
mobilenet_v1_0_5_160 59.0%(59.1%) 81.9%(81.9%)
mobilenet_v1_1_0_224 71.0%(70.9%) 90.0%(89.9%)
mobilenet_v2_1_0_224 71.8%(71.9%) 90.7%(91.0%)
mobilenet_v2_1_4_224 75.0%(74.9%) 92.5%(92.5%)
nasnet_a_large_331 82.7%(82.7%) 96.2%(96.2%)
nasnet_a_mobile_224 74.0%(74.0%) 91.6%(91.6%)
pnasnet_5_large_331 82.9%(82.9%) 96.2%(96.2%)
pnasnet_5_mobile_224 74.1%(74.2%) 91.9%(91.9%)
resnet_v1_101 76.4%(76.4%) 92.9%(92.9%)
resnet_v1_152 76.8%(76.8%) 93.2%(93.2%)
resnet_v1_50 75.2%(75.2%) 92.2%(92.2%)
resnet_v2_101 77.0%(77.0%) 93.7%(93.7%)
resnet_v2_152 77.8%(77.8%) 94.1%(94.1%)
resnet_v2_50 75.6%(75.6%) 92.8%(92.8%)
vgg16 70.9%(71.5%) 89.8%(89.8%)
vgg19 71.0%(71.1%) 89.8%(89.8%)
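
For readers unfamiliar with the two metrics in the table above, here is a minimal, dependency-free sketch of how Top-1 and Top-5 style accuracy can be computed from per-class scores. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not part of EvalDNN's API.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    Hypothetical helper for illustration; not EvalDNN's actual interface.
    """
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data: 4 samples, 3 classes (made-up scores, not model output).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
]
labels = [0, 2, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.1%}, top-2: {top2:.1%}")
```

With 1000 ImageNet classes, the same function with `k=5` would give a Top-5 figure comparable in form to the column above.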

Neuron Coverage

Model Layers Neurons t=0.0 t=0.1 t=0.2 t=0.3 t=0.4 t=0.5 t=0.6 t=0.7 t=0.8 t=0.9
inception_resnet_v2 780 246336 99.2% 96.4% 85.9% 75.8% 69.4% 64.9% 60.1% 44.0% 16.5% 4.7%
inception_v1 195 35577 100.0% 98.2% 90.5% 81.2% 71.2% 61.2% 51.4% 32.1% 13.3% 6.2%
inception_v2 231 44321 100.0% 99.0% 90.4% 79.2% 68.6% 59.0% 51.7% 30.6% 13.4% 5.9%
inception_v3 312 76265 100.0% 97.4% 87.2% 76.0% 63.5% 54.0% 46.4% 28.3% 11.1% 4.5%
inception_v4 491 130944 99.9% 96.3% 84.9% 72.6% 62.2% 54.6% 48.2% 28.9% 10.8% 3.9%
mobilenet_v1_0_25_128 84 10466 99.9% 99.9% 99.1% 95.9% 91.7% 85.8% 77.6% 63.2% 45.3% 30.0%
mobilenet_v1_0_5_160 84 18930 99.9% 99.8% 98.0% 93.3% 87.6% 80.5% 71.4% 53.7% 32.1% 19.0%
mobilenet_v1_1_0_224 84 35858 99.9% 99.4% 95.5% 88.4% 81.5% 76.9% 68.2% 45.6% 23.7% 12.6%
mobilenet_v2_1_0_224 142 52946 100.0% 98.3% 91.7% 87.2% 82.4% 78.2% 73.9% 58.6% 28.8% 8.6%
mobilenet_v2_1_4_224 142 73586 99.9% 97.6% 91.0% 86.0% 81.0% 77.1% 73.4% 54.0% 24.7% 9.8%
nasnet_a_large_331 1129 537684 100.0% 90.8% 75.3% 69.5% 67.1% 66.0% 63.0% 43.1% 12.3% 1.6%
nasnet_a_mobile_224 829 100878 99.9% 98.8% 89.2% 79.4% 72.1% 68.3% 65.7% 55.1% 27.7% 7.1%
pnasnet_5_large_331 851 451392 100.0% 92.6% 77.5% 70.7% 68.5% 66.8% 61.6% 40.5% 12.5% 1.0%
pnasnet_5_mobile_224 677 86448 99.9% 99.4% 93.9% 84.5% 75.4% 70.3% 66.7% 55.3% 29.0% 6.4%
resnet_v1_101 318 159531 100.0% 99.1% 89.5% 79.9% 73.4% 69.9% 65.3% 33.7% 6.1% 2.0%
resnet_v1_152 471 228651 100.0% 98.8% 88.4% 78.8% 72.8% 69.6% 66.0% 34.3% 5.4% 1.4%
resnet_v1_50 165 81195 100.0% 98.8% 91.2% 83.0% 76.2% 71.9% 66.7% 32.9% 8.2% 3.9%
resnet_v2_101 314 155692 99.9% 88.7% 77.2% 71.9% 69.6% 68.6% 64.4% 30.3% 5.5% 1.8%
resnet_v2_152 467 224812 99.9% 88.0% 76.0% 70.9% 68.9% 67.9% 62.4% 28.8% 3.6% 1.3%
resnet_v2_50 161 77356 100.0% 92.2% 80.8% 74.4% 71.3% 69.9% 63.3% 34.5% 6.8% 3.4%
vgg16 36 27304 100.0% 98.0% 90.0% 84.4% 81.5% 80.0% 78.3% 69.5% 64.5% 63.6%
vgg19 42 29864 100.0% 97.6% 88.2% 82.1% 78.9% 77.2% 73.9% 63.9% 58.7% 58.2%
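
The README does not spell out how coverage is defined, so the following is only a sketch under a common assumption (the DeepXplore-style definition): each layer's activations are min-max scaled per input, and a neuron counts as covered if its scaled activation exceeds the threshold t for at least one input. The function names and the toy activations are hypothetical.

```python
def scale_layer(activations):
    """Min-max scale one layer's activations to [0, 1] for a single input."""
    lo, hi = min(activations), max(activations)
    if hi == lo:
        # Constant layer: no neuron stands out, so treat all as inactive.
        return [0.0] * len(activations)
    return [(a - lo) / (hi - lo) for a in activations]


def neuron_coverage(batch_activations, threshold):
    """Share of neurons activated above `threshold` by at least one input.

    batch_activations: one entry per input; each entry is a list of layers;
    each layer is a list of raw activation values (assumed layout).
    """
    covered = None
    for layers in batch_activations:
        flags = []
        for layer in layers:
            flags.extend(a > threshold for a in scale_layer(layer))
        if covered is None:
            covered = flags
        else:
            # A neuron stays covered once any input has activated it.
            covered = [c or f for c, f in zip(covered, flags)]
    return sum(covered) / len(covered)


# Toy data: 2 inputs, 2 layers with 3 and 2 neurons (made-up values).
batch = [
    [[0.0, 1.0, 2.0], [1.0, 3.0]],
    [[4.0, 0.0, 1.0], [2.0, 2.0]],
]
cov_t0 = neuron_coverage(batch, threshold=0.0)
cov_t5 = neuron_coverage(batch, threshold=0.5)
print(f"t=0.0: {cov_t0:.1%}, t=0.5: {cov_t5:.1%}")
```

Raising the threshold can only keep coverage equal or lower it, which matches the monotone decrease across the t columns in the table.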

Robustness

Model FGSM (Success Rate, Avg Time, Avg Linf Dist) BIM (Success Rate, Avg Time, Avg Linf Dist) DeepFool (Success Rate, Avg Time, Avg MSE)
inception_resnet_v2 100.0% 7.12s 0.10506264 100.0% 28.23s 0.00588022 99.6% 5.61s 0.00000899
inception_v1 100.0% 0.27s 0.00909138 100.0% 4.00s 0.00107600 99.6% 0.77s 0.00000058
inception_v2 100.0% 0.52s 0.02083008 100.0% 5.80s 0.00092522 98.8% 1.42s 0.00000044
inception_v3 99.9% 1.85s 0.04260582 100.0% 12.64s 0.00130179 98.9% 3.07s 0.00000055
inception_v4 99.9% 5.33s 0.07990478 100.0% 24.71s 0.00245782 99.0% 5.69s 0.00000200
mobilenet_v1_0_25_128 100.0% 0.06s 0.00105528 100.0% 0.99s 0.00055203 97.8% 0.25s 0.00011658
mobilenet_v1_0_5_160 100.0% 0.07s 0.00213283 100.0% 1.20s 0.00062653 99.2% 0.17s 0.00518589
mobilenet_v1_1_0_224 100.0% 0.11s 0.00232121 100.0% 2.07s 0.00052688 98.5% 0.40s 0.01588689
mobilenet_v2_1_0_224 100.0% 0.16s 0.00834787 100.0% 2.36s 0.00099513 99.3% 0.36s 0.00011537
mobilenet_v2_1_4_224 100.0% 0.20s 0.01329507 100.0% 2.88s 0.00109281 97.3% 0.85s 0.00003651
nasnet_a_large_331 100.0% 13.62s 0.14173908 100.0% 46.52s 0.00560161 98.0% 12.82s 0.00000821
nasnet_a_mobile_224 100.0% 0.84s 0.03725069 100.0% 8.73s 0.00186446 99.2% 1.98s 0.00000130
pnasnet_5_large_331 99.9% 10.09s 0.10770091 100.0% 42.92s 0.00428668 98.2% 10.43s 0.00000964
pnasnet_5_mobile_224 100.0% 0.59s 0.02707373 100.0% 7.68s 0.00142396 99.1% 2.02s 0.00000097
resnet_v1_101 100.0% 0.90s 0.01410290 100.0% 11.81s 0.00140987 99.0% 2.91s 0.00000099
resnet_v1_152 100.0% 1.23s 0.01258933 100.0% 16.91s 0.00140267 99.4% 4.10s 0.00000101
resnet_v1_50 100.0% 0.42s 0.00766031 100.0% 6.49s 0.00112461 98.5% 1.86s 0.00000081
resnet_v2_101 100.0% 1.87s 0.02834016 100.0% 17.26s 0.00268525 98.6% 4.93s 0.00000095
resnet_v2_152 100.0% 2.82s 0.03036385 100.0% 25.14s 0.00317337 98.5% 7.87s 0.00000096
resnet_v2_50 100.0% 0.86s 0.01956913 100.0% 9.45s 0.00138591 97.8% 2.94s 0.00000075
vgg16 100.0% 0.81s 0.00423501 100.0% 12.64s 0.00189962 99.7% 2.08s 0.00000226
vgg19 100.0% 1.01s 0.00518590 100.0% 15.41s 0.00203222 99.7% 2.55s 0.00000244
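
As a rough illustration of the columns above, the sketch below computes a success rate and an average perturbation size (Linf distance or MSE) from pairs of original and adversarial inputs. It assumes inputs are flattened and scaled to [0, 1], and that distances are averaged over successful attacks only; neither assumption is confirmed by this README, and the helper names are hypothetical.

```python
def linf_distance(x, x_adv):
    """Largest absolute per-element change between x and x_adv."""
    return max(abs(a - b) for a, b in zip(x, x_adv))


def mse(x, x_adv):
    """Mean squared per-element change between x and x_adv."""
    return sum((a - b) ** 2 for a, b in zip(x, x_adv)) / len(x)


def summarize_attack(originals, adversarials, distance):
    """Return (success_rate, avg_distance).

    adversarials[i] is None when the attack failed on originals[i];
    failed attacks are excluded from the distance average (an assumption).
    """
    dists = [
        distance(x, adv)
        for x, adv in zip(originals, adversarials)
        if adv is not None
    ]
    success_rate = len(dists) / len(originals)
    avg_dist = sum(dists) / len(dists) if dists else float("nan")
    return success_rate, avg_dist


# Toy data: 4 flattened "images" with 3 pixels each (made-up values).
originals = [[0.0, 0.5, 1.0], [0.25, 0.25, 0.25], [0.5, 0.0, 0.5], [1.0, 0.75, 0.5]]
adversarials = [[0.25, 0.5, 1.0], None, [0.5, 0.0, 1.0], [1.0, 0.5, 0.5]]

rate, avg_linf = summarize_attack(originals, adversarials, linf_distance)
_, avg_mse = summarize_attack(originals, adversarials, mse)
print(f"success: {rate:.1%}, avg Linf: {avg_linf:.4f}, avg MSE: {avg_mse:.4f}")
```

Passing the distance function as an argument keeps one summary routine usable for both the Linf columns (FGSM, BIM) and the MSE column (DeepFool).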

Cross verification

To cross-verify the correctness of EvalDNN's implementation, we also used this project to perform the FGSM attack on some of the selected models.

Running that tool with epsilon=1 gives the results shown in parentheses below; the values outside the parentheses are those obtained by EvalDNN.

Model Success Rate Avg Linf Dist
inception_resnet_v2 100.0%(100.0%) 0.10506264(0.00391963)
inception_v1 100.0%(99.7%) 0.00909138(0.00391789)
inception_v2 100.0%(100.0%) 0.02083008(0.00391808)
inception_v3 99.9%(100.0%) 0.04260582(0.00391977)
inception_v4 99.9%(99.7%) 0.07990478(0.00391998)
mobilenet_v1_0_25_128 100.0%(99.8%) 0.00105528(0.00391868)
mobilenet_v1_0_5_160 100.0%(100.0%) 0.00213283(0.00391672)
mobilenet_v1_1_0_224 100.0%(100.0%) 0.00232121(0.00391761)
nasnet_a_large_331 100.0%(99.8%) 0.14173908(0.00392060)
resnet_v2_101 100.0%(100.0%) 0.02834016(0.00392008)
resnet_v2_152 100.0%(100.0%) 0.03036385(0.00391987)
resnet_v2_50 100.0%(100.0%) 0.01956913(0.00392010)
vgg16 100.0%(99.9%) 0.00423501(0.00391760)
vgg19 100.0%(99.9%) 0.00518590(0.00391851)
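
One way to read the parenthesized distances: they all sit close to 1/255 ≈ 0.00392157, which is what a one-step perturbation of epsilon=1 on a 0 to 255 pixel scale would look like after normalizing to [0, 1]. That scale is an assumption, not something this README states; the short check below only confirms the arithmetic against the values in the table.

```python
# Assumption: epsilon=1 is expressed on a 0-255 pixel scale, while the
# reported Linf distances are on a [0, 1] scale.
epsilon = 1
normalized_epsilon = epsilon / 255

# Parenthesized Avg Linf Dist values copied from the table above.
reported = [
    0.00391963, 0.00391789, 0.00391808, 0.00391977, 0.00391998,
    0.00391868, 0.00391672, 0.00391761, 0.00392060, 0.00392008,
    0.00391987, 0.00392010, 0.00391760, 0.00391851,
]

# Largest deviation of any reported value from 1/255.
max_gap = max(abs(v - normalized_epsilon) for v in reported)
print(f"1/255 = {normalized_epsilon:.8f}, largest deviation = {max_gap:.2e}")
```

Under this reading, the external tool applies a fixed perturbation size, which would explain why its distances are nearly identical across models while EvalDNN's vary.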