[pythran] Re: Use pythran to deploy tensorflow models.

From: Mehdi AMINI <joker.eph@xxxxxxxxx>
To: pythran@xxxxxxxxxxxxx
Date: Fri, 22 Mar 2019 08:14:04 -0700

On Fri, Mar 22, 2019 at 2:55 AM Serge Guelton <serge.guelton@xxxxxxxxxxxxxxxxxxx> wrote:

On Thu, Mar 21, 2019 at 08:56:23AM -0700, Mehdi AMINI wrote:

On Wed, Mar 20, 2019 at 11:29 AM Jean Laroche <ripngo@xxxxxxxxx> wrote:

    I use tensorflow a lot for training models for machine learning, that
    part works really well.
    What's more of a pain in the butt is when it comes to deploying your
    models for inference (i.e., no longer training, but using the models to
    detect/classify, etc.), especially if you're trying to get a small
    footprint and fast execution.

    Tensorflow has a few ways of doing that:
    1) Keep using the models in python using the tensorflow module.
    2) Use the serving mechanism offered by tensorflow, this creates a web
    server which you query by sending your input features and getting the
    output of the model back.
    https://www.tensorflow.org/tfx/guide/serving
    3) Use Tensorflow lite, which targets mobile deployment (android and ios)
    https://www.tensorflow.org/lite/guide
    4) Use the tensorflow C++ API
    (https://www.tensorflow.org/guide/extend/cc)

    None of these are satisfactory to me:
    in 1) you must deploy the enormous tensorflow module with your app.
    in 2) you're relying on a local server, which isn't great for me either.

    in 3) you're dead in the water if you're not android or ios
    in 4) it should work, but by all accounts it's a bit of a pain to get to
    work. In particular the C++ API relies on libraries that must be
    compiled using Bazel and the final footprint is not small at all.

Have you looked into tfcompile?
It generates a self-contained binary that does not depend on the TensorFlow
runtime.
The interface is C++, but it may be possible to improve this to get a Python
interface to the generated module.

Interesting. I wonder if they have some cross-call optimizations or not?

It is using XLA, so it depends on the XLA target, but in general yes they
do.

Either way, as showcased in this post:

http://serge-sans-paille.github.io/pythran-stories/an-incursion-into-basic-ml-gradient-descent-compiled-with-pythran.html

The only current gain of using pythran when chaining tf calls is likely to
be vectorization (which is nice), plus the removal of a few temporary arrays.
That being said, there are probably optimization opportunities there. Jean,
can you share the Python script generated by your hack?
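
To make the vectorization and temporary-removal point concrete, here is a
minimal sketch of the kind of kernel this applies to. It is not the script
generated by Jean's hack (that was not posted in this message); the function
name dense_relu and the layer shape are hypothetical, chosen only for
illustration. It runs as plain NumPy, and the export comment is what pythran
would read when compiling it.

```python
import numpy as np

# Hypothetical example: one dense layer followed by ReLU.
# The comment below is a pythran export annotation; plain Python ignores it.
#pythran export dense_relu(float64[:, :], float64[:, :], float64[:])
def dense_relu(x, w, b):
    # Under plain NumPy, "np.dot(x, w) + b" allocates a temporary array
    # before np.maximum runs. Writing the bias add and the ReLU as one
    # chained expression is the sort of code where a compiler can, in
    # principle, evaluate the elementwise part in a single vectorized pass.
    return np.maximum(np.dot(x, w) + b, 0.0)


if __name__ == "__main__":
    x = np.array([[1.0, -2.0]])
    w = np.eye(2)
    b = np.array([0.5, 0.5])
    print(dense_relu(x, w, b))
```

Assuming the file is saved as dense_relu.py, running "pythran dense_relu.py"
should produce a native module importable under the same name; adding
-DUSE_XSIMD and -march=native asks for explicit vectorization. Whether that
beats the NumPy version on a given model would have to be measured.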

Follow-Ups:
- [pythran] Re: Use pythran to deploy tensorflow models.
  - From: Jean Laroche
- [pythran] Re: Use pythran to deploy tensorflow models.
  - From: Jean Laroche

References:
- [pythran] Use pythran to deploy tensorflow models.
  - From: Jean Laroche
- [pythran] Re: Use pythran to deploy tensorflow models.
  - From: Mehdi AMINI
- [pythran] Re: Use pythran to deploy tensorflow models.
  - From: Serge Guelton
