System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 14.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: Xiaomi 8
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 1.10
- Python version:
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 9.0 / 7.1
- GPU model and memory:
- Exact command to reproduce:
Describe the problem
I tested the performance of tf-mobile, tf-lite, tf-mobile-int8, and tf-lite-int8 on Android, and found that tf-lite is much slower than tf-mobile.
- I use `freeze_graph` to generate the `A.pb` file from the `checkpoint` for testing tf-mobile performance.
- I use `toco_convert` to convert the `A.pb` file to an `A.tflite` file for testing tf-lite performance.
- I use `transform_graph` to get a quantized `AQ.pb` file from the `A.pb` file for testing tf-mobile int8 performance.
- I train a model with the same architecture by adding the line `tf.contrib.quantize.create_training_graph()` and get the `checkpoint` file. Then I replace that line with `tf.contrib.quantize.create_eval_graph()` to generate the `A.pbtxt` file, and use the `checkpoint` file and the `A.pbtxt` file to get `A8.pb` with fake quantization nodes. Finally, I use `toco_convert` to get the `A8.tflite` file.
- I test the performance with these 4 files on Android; each is run several times for inference on the same image, and the results are listed below (rough sketches of each conversion step follow the results):
- tf-mobile: 357 ms per image
- tf-mobile int8: 356 ms per image
- tf-lite: 844 ms per image
- tf-lite int8: 571 ms per image
I wonder why tf-lite is much slower than tf-mobile.
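For completeness, here is a rough sketch of the `freeze_graph` step; the graph file name, checkpoint prefix, and output node name below are placeholders rather than my exact values:

```python
from tensorflow.python.tools import freeze_graph

# Freeze the trained checkpoint into A.pb for the tf-mobile test.
# "graph.pbtxt", "model.ckpt" and the output node name "output" are placeholders.
freeze_graph.freeze_graph(
    input_graph="graph.pbtxt",        # GraphDef written with tf.train.write_graph
    input_saver="",
    input_binary=False,               # graph definition is in text format
    input_checkpoint="model.ckpt",    # checkpoint prefix
    output_node_names="output",
    restore_op_name="save/restore_all",
    filename_tensor_name="save/Const:0",
    output_graph="A.pb",
    clear_devices=True,
    initializer_nodes="")
```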
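The TOCO conversion from `A.pb` to `A.tflite` looks roughly like this; the tensor names `input:0` / `output:0` are placeholders for my actual ones:

```python
import tensorflow as tf

# Convert the frozen A.pb into A.tflite for the tf-lite test.
graph_def = tf.GraphDef()
with tf.gfile.GFile("A.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Session() as sess:
    tf.import_graph_def(graph_def, name="")
    in_tensor = sess.graph.get_tensor_by_name("input:0")    # placeholder name
    out_tensor = sess.graph.get_tensor_by_name("output:0")  # placeholder name
    tflite_model = tf.contrib.lite.toco_convert(
        sess.graph_def, [in_tensor], [out_tensor])

with open("A.tflite", "wb") as f:
    f.write(tflite_model)
```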
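The weight quantization for the tf-mobile int8 test uses `transform_graph`; a minimal sketch via the Python wrapper, assuming the `quantize_weights` / `quantize_nodes` transforms and placeholder input/output node names:

```python
import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

# Quantize the frozen graph for the tf-mobile int8 test.
# "input" / "output" are placeholder node names.
graph_def = tf.GraphDef()
with tf.gfile.GFile("A.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

quantized_def = TransformGraph(
    graph_def, ["input"], ["output"],
    ["quantize_weights", "quantize_nodes"])

with tf.gfile.GFile("AQ.pb", "wb") as f:
    f.write(quantized_def.SerializeToString())
```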
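And the quantization-aware training rewrite for the tf-lite int8 path, roughly; the model body here is only a stand-in with made-up shapes:

```python
import tensorflow as tf

def build_model(is_training):
    # Stand-in for the real model: the same kinds of ops as described below
    # (CONV+BN+RELU, RESHAPE, FULLY-CONNECTED); all shapes are made up.
    x = tf.placeholder(tf.float32, [1, 100, 40, 1], name="input")
    h = tf.layers.conv2d(x, 64, 3, padding="same")
    h = tf.layers.batch_normalization(h, training=is_training)
    h = tf.nn.relu(h)
    h = tf.reshape(h, [-1, 40 * 64])
    return tf.layers.dense(h, 10, name="output")

# Training graph: insert fake-quant nodes before building the training ops.
build_model(is_training=True)
tf.contrib.quantize.create_training_graph(input_graph=tf.get_default_graph())
# ... build loss/optimizer, train, and save the checkpoint ...

# Eval graph: rebuild the model, rewrite it for inference, export A.pbtxt.
tf.reset_default_graph()
build_model(is_training=False)
tf.contrib.quantize.create_eval_graph(input_graph=tf.get_default_graph())
tf.train.write_graph(tf.get_default_graph().as_graph_def(),
                     ".", "A.pbtxt", as_text=True)
# A.pbtxt plus the trained checkpoint then go through freeze_graph to get A8.pb,
# and A8.tflite comes from toco_convert as in the sketches above.
```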
PS: the model architecture only contains CONV+BN+RELU, RESHAPE, and FULLY-CONNECTED ops.
The feature shape from CONV+BN+RELU is [B, T, C]; I reshape it to [-1, C], pass it through the FC layer, and then reshape the output of shape [B*T, K] back to [B, T, K], which is the final result I expect.
I wonder whether it is the reshape ops that bring the worse performance? (A small sketch of this pattern follows.)
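For reference, the reshape + FC + reshape pattern described above is roughly the following (B, T, C, K are placeholder sizes):

```python
import tensorflow as tf

# Minimal sketch of the reshape pattern in question; sizes are placeholders.
B, T, C, K = 1, 100, 64, 10
features = tf.placeholder(tf.float32, [B, T, C])   # output of the CONV+BN+RELU stack
flat = tf.reshape(features, [-1, C])               # [B*T, C]
fc = tf.layers.dense(flat, K)                      # fully-connected layer -> [B*T, K]
out = tf.reshape(fc, [B, T, K], name="output")     # final result, [B, T, K]
```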
Thank you very much ...