Facelandmarker slow on Android apps but fast with TFLite model benchmark tool #5872

arrufat · 2025-02-28T14:22:46Z

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Android 15

MediaPipe Tasks SDK version

0.10.21

Task name (e.g. Image classification, Gesture recognition etc.)

Face landmark detection

Programming Language and version (e.g. C++, Python, Java)

Kotlin

Describe the actual behavior

The face landmarker model takes about 30-70 ms to run on a Pixel 9 Pro

Describe the expected behaviour

I would expect the model to run in real time (on desktop, using Wasm, it takes 15-20 ms).

Standalone code/steps you may have used to try to get what you need

I tried the sample code from https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/face_landmarker/android

Other info / Complete Logs

To check how fast the model can run, I used the TFLite Model Benchmark Tool.
I unzipped face_landmarker.task and benchmarked both the face_detector.tflite and the face_landmarks_detector.tflite models:

~ $ adb shell am start -S \
          -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
          --es args '"--graph=/data/local/tmp/face_detector.tflite \
      --use_gpu=true"'
~ $ adb shell am start -S \
          -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
          --es args '"--graph=/data/local/tmp/face_landmarks_detector.tflite \
      --use_gpu=true"'

~ $ adb shell am start -S \
          -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
          --es args '"--graph=/data/local/tmp/face_detector.tflite \
      --use_gpu=false"'
~ $ adb shell am start -S \
          -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
          --es args '"--graph=/data/local/tmp/face_landmarks_detector.tflite \
      --use_gpu=false"'

Results:

adb logcat | grep "Inference timings in us"
02-28 23:15:03.904 30233 30233 I tflite  : Inference timings in us: Init: 2770755, First inference: 19349, Warmup (avg): 4108.97, Inference (avg): 3611.28
02-28 23:15:22.704 30277 30277 I tflite  : Inference timings in us: Init: 4607458, First inference: 31031, Warmup (avg): 12077.7, Inference (avg): 11406.9

02-28 23:15:43.189 30337 30337 I tflite  : Inference timings in us: Init: 16248, First inference: 10541, Warmup (avg): 11986.1, Inference (avg): 15703.2
02-28 23:15:53.231 30379 30379 I tflite  : Inference timings in us: Init: 86527, First inference: 59826, Warmup (avg): 60634.6, Inference (avg): 72700.4

So, on GPU:

face detector: 3611.28 µs
face landmarker: 11406.9 µs
total = 15018.18 µs

And on CPU:

face detector: 15703.2 µs
face landmarker: 72700.4 µs
total = 88403.6 µs
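
Since the app-side measurements are in milliseconds, the totals above can be converted for a direct comparison. The following is only a sketch: `totalMs` is a made-up helper name, and the inputs are the "Inference (avg)" values from the logcat output above.

```kotlin
// Sums per-model average latencies given in microseconds and returns milliseconds.
// `totalMs` is an illustrative helper, not part of MediaPipe or TFLite.
fun totalMs(vararg latenciesUs: Double): Double = latenciesUs.sum() / 1000.0

fun main() {
    val gpuMs = totalMs(3611.28, 11406.9)  // face detector + face landmarker, GPU
    val cpuMs = totalMs(15703.2, 72700.4)  // face detector + face landmarker, CPU
    println("GPU total: $gpuMs ms, CPU total: $cpuMs ms")
}
```

That is roughly 15 ms on GPU and 88 ms on CPU, so the 30-70 ms seen in the app falls between the two benchmark totals.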

In my app, I am initializing like this:

        val baseOptions = BaseOptions.builder()
            .setModelAssetPath("face_landmarker.task")  // path relative to the app's assets folder
            .setDelegate(Delegate.GPU)
            .build()
        val options = FaceLandmarker.FaceLandmarkerOptions.builder()
            .setBaseOptions(baseOptions)
            .setMinFaceDetectionConfidence(0.5f)
            .setMinTrackingConfidence(0.5f)
            .setMinFacePresenceConfidence(0.5f)
            .setNumFaces(1)
            .setOutputFacialTransformationMatrixes(false)
            .setOutputFaceBlendshapes(false)
            .setRunningMode(RunningMode.VIDEO)
            .build()
        val faceLandmarker = FaceLandmarker.createFromOptions(context, options)

And then using it like this:

        val bitmap = image.toBitmap()  // where image is an `ImageProxy` from the camera
        val mpImage = BitmapImageBuilder(bitmap).build()
        val timestampMs = SystemClock.uptimeMillis()
        val result = faceLandmarker.detectForVideo(mpImage, timestampMs)
        val detectTimeMs = SystemClock.uptimeMillis() - timestampMs
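
Note that `detectTimeMs` above covers only the `detectForVideo` call; the `toBitmap()` conversion and the `MPImage` construction are outside the timed region. One way to see where the per-frame time goes is to time each stage separately. The sketch below uses a hypothetical `timed` helper and `Thread.sleep` stand-ins so that it runs without Android; in the app, the three lambdas would wrap `image.toBitmap()`, `BitmapImageBuilder(bitmap).build()`, and `faceLandmarker.detectForVideo(...)`.

```kotlin
// Hypothetical helper: runs `block`, stores its wall-clock duration in
// milliseconds under `label`, and returns the block's result.
inline fun <T> timed(
    label: String,
    timings: MutableMap<String, Double>,
    block: () -> T
): T {
    val start = System.nanoTime()
    val result = block()
    timings[label] = (System.nanoTime() - start) / 1_000_000.0
    return result
}

fun main() {
    val timings = linkedMapOf<String, Double>()

    // Stand-ins for the real stages; the sleeps are arbitrary placeholders.
    val bitmap = timed("toBitmap", timings) { Thread.sleep(5); "bitmap" }
    val mpImage = timed("buildMPImage", timings) { Thread.sleep(1); "mpImage($bitmap)" }
    timed("detectForVideo", timings) { Thread.sleep(10); "result($mpImage)" }

    // Print one line per stage, in insertion order.
    timings.forEach { (stage, ms) -> println("$stage: $ms ms") }
}
```

With a breakdown like this, it becomes clearer whether the extra time is spent inside the detection call or in the image conversion before it.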

I would expect the app to take a bit more than 15 ms, because the mpImage is 640×480 and needs to be resized to 192×192 for the detector and to 256×256 for the landmarker.
However, the gap between the TFLite model benchmark tool (15 ms) and the actual app (30-70 ms) seems too large.
Am I initializing the model properly, or is there something I am missing that's hampering the performance?
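
One detail that may matter for a fair comparison: the benchmark tool reports "First inference" and "Warmup (avg)" separately from "Inference (avg)", so its steady-state number excludes start-up cost. A rough sketch of doing the same for app-side measurements is below. `LatencyStats` is a hypothetical class, and the sample values in `main` are invented purely to exercise it.

```kotlin
// Illustrative accumulator: ignores the first `warmupFrames` samples,
// then averages the remaining ones.
class LatencyStats(private val warmupFrames: Int) {
    private var seen = 0
    private var count = 0
    private var sumMs = 0.0

    fun add(sampleMs: Double) {
        seen++
        if (seen <= warmupFrames) return  // still warming up, skip this sample
        count++
        sumMs += sampleMs
    }

    // Average over post-warm-up samples, or null if none were recorded yet.
    fun averageMs(): Double? = if (count == 0) null else sumMs / count
}

fun main() {
    val stats = LatencyStats(warmupFrames = 2)
    // Invented per-frame timings in milliseconds, not measurements.
    listOf(70.0, 45.0, 30.0, 32.0, 34.0).forEach(stats::add)
    println("steady-state average: ${stats.averageMs()} ms")
}
```

In the app, `add` would be called with `detectTimeMs.toDouble()` after each frame.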

Thanks in advance.

google-ml-butler bot assigned kuaashish Feb 28, 2025
