System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 14.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: Xiaomi 8
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 1.10
- Python version:
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 9.0 / 7.1
- GPU model and memory:
- Exact command to reproduce:
Describe the problem
I tested the performance of tf-mobile, tf-lite, tf-mobile-int8, and tf-lite-int8 on Android, and found that tf-lite is much slower than tf-mobile.
- I use `freeze_graph` to generate the `A.pb` file from the `checkpoint` for testing tf-mobile performance.
- I use `toco_convert` to convert the `A.pb` file to an `A.tflite` file for testing tf-lite performance.
- I use `transform_graph` to get a quantized `AQ.pb` file from the `A.pb` file for testing tf-mobile int8 performance.
- I train a model with the same architecture by adding the line `tf.contrib.quantize.create_training_graph()` and get the `checkpoint` file. Then I replace that line with `tf.contrib.quantize.create_eval_graph()` to generate the `A.pbtxt` file, and use the `checkpoint` file and the `A.pbtxt` file to get `A8.pb` with fake quantization nodes. Finally, I use `toco_convert` to get the `A8.tflite` file.
- I test the performance with these 4 files on Android; each is run several times for inference on the same image, and the results are listed below (rough sketches of each conversion step follow the results):
- tf-mobile: 357 ms per image
- tf-mobile int8: 356 ms per image
- tf-lite: 844 ms per image
- tf-lite int8: 571 ms per image
I wonder why tf-lite is much slower than tf-mobile.
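For completeness, here is a rough sketch of the `freeze_graph` step; the graph file name, checkpoint prefix, and output node name below are placeholders rather than my exact values:

```python
from tensorflow.python.tools import freeze_graph

# Freeze the trained checkpoint into A.pb for the tf-mobile test.
# "graph.pbtxt", "model.ckpt" and the output node name "output" are placeholders.
freeze_graph.freeze_graph(
    input_graph="graph.pbtxt",        # GraphDef written with tf.train.write_graph
    input_saver="",
    input_binary=False,               # graph definition is in text format
    input_checkpoint="model.ckpt",    # checkpoint prefix
    output_node_names="output",
    restore_op_name="save/restore_all",
    filename_tensor_name="save/Const:0",
    output_graph="A.pb",
    clear_devices=True,
    initializer_nodes="")
```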
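The TOCO conversion from `A.pb` to `A.tflite` looks roughly like this; the tensor names `input:0` / `output:0` are placeholders for my actual ones:

```python
import tensorflow as tf

# Convert the frozen A.pb into A.tflite for the tf-lite test.
graph_def = tf.GraphDef()
with tf.gfile.GFile("A.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Session() as sess:
    tf.import_graph_def(graph_def, name="")
    in_tensor = sess.graph.get_tensor_by_name("input:0")    # placeholder name
    out_tensor = sess.graph.get_tensor_by_name("output:0")  # placeholder name
    tflite_model = tf.contrib.lite.toco_convert(
        sess.graph_def, [in_tensor], [out_tensor])

with open("A.tflite", "wb") as f:
    f.write(tflite_model)
```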
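The weight quantization for the tf-mobile int8 test uses `transform_graph`; a minimal sketch via the Python wrapper, assuming the `quantize_weights` / `quantize_nodes` transforms and placeholder input/output node names:

```python
import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

# Quantize the frozen graph for the tf-mobile int8 test.
# "input" / "output" are placeholder node names.
graph_def = tf.GraphDef()
with tf.gfile.GFile("A.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

quantized_def = TransformGraph(
    graph_def, ["input"], ["output"],
    ["quantize_weights", "quantize_nodes"])

with tf.gfile.GFile("AQ.pb", "wb") as f:
    f.write(quantized_def.SerializeToString())
```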
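And the quantization-aware training rewrite for the tf-lite int8 path, roughly; the model body here is only a stand-in with made-up shapes:

```python
import tensorflow as tf

def build_model(is_training):
    # Stand-in for the real model: the same kinds of ops as described below
    # (CONV+BN+RELU, RESHAPE, FULLY-CONNECTED); all shapes are made up.
    x = tf.placeholder(tf.float32, [1, 100, 40, 1], name="input")
    h = tf.layers.conv2d(x, 64, 3, padding="same")
    h = tf.layers.batch_normalization(h, training=is_training)
    h = tf.nn.relu(h)
    h = tf.reshape(h, [-1, 40 * 64])
    return tf.layers.dense(h, 10, name="output")

# Training graph: insert fake-quant nodes before building the training ops.
build_model(is_training=True)
tf.contrib.quantize.create_training_graph(input_graph=tf.get_default_graph())
# ... build loss/optimizer, train, and save the checkpoint ...

# Eval graph: rebuild the model, rewrite it for inference, export A.pbtxt.
tf.reset_default_graph()
build_model(is_training=False)
tf.contrib.quantize.create_eval_graph(input_graph=tf.get_default_graph())
tf.train.write_graph(tf.get_default_graph().as_graph_def(),
                     ".", "A.pbtxt", as_text=True)
# A.pbtxt plus the trained checkpoint then go through freeze_graph to get A8.pb,
# and A8.tflite comes from toco_convert as in the sketches above.
```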
PS: the model architecture only contains CONV+BN+RELU, RESHAPE, and FULLY-CONNECTED ops.
The feature shape from CONV+BN+RELU is [B, T, C]; I reshape it to [-1, C], pass it through the FC layer, and then reshape the output of shape [B*T, K] back to [B, T, K], which is the final result I expect.
I wonder whether it is the reshape ops that bring the worse performance? (A small sketch of this pattern follows.)
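For reference, the reshape + FC + reshape pattern described above is roughly the following (B, T, C, K are placeholder sizes):

```python
import tensorflow as tf

# Minimal sketch of the reshape pattern in question; sizes are placeholders.
B, T, C, K = 1, 100, 64, 10
features = tf.placeholder(tf.float32, [B, T, C])   # output of the CONV+BN+RELU stack
flat = tf.reshape(features, [-1, C])               # [B*T, C]
fc = tf.layers.dense(flat, K)                      # fully-connected layer -> [B*T, K]
out = tf.reshape(fc, [B, T, K], name="output")     # final result, [B, T, K]
```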
Thank you very much ...