On Fri, Mar 22, 2019 at 2:55 AM Serge Guelton <serge.guelton@xxxxxxxxxxxxxxxxxxx> wrote:
On Thu, Mar 21, 2019 at 08:56:23AM -0700, Mehdi AMINI wrote:
On Wed, Mar 20, 2019 at 11:29 AM Jean Laroche <ripngo@xxxxxxxxx> wrote:
I use TensorFlow a lot for training machine learning models, and that
part works really well.
What's more of a pain in the butt is deploying your models for
inference (i.e., no longer training, but using the models to
detect/classify, etc.), especially if you're trying to get a small
footprint and fast execution.
TensorFlow has a few ways of doing that:
1) Keep using the models in Python using the tensorflow module.
2) Use the serving mechanism offered by TensorFlow
(https://www.tensorflow.org/tfx/guide/serving); this creates a
server which you query by sending your input features and getting the
output of the model back.
3) Use TensorFlow Lite, which targets mobile deployment (Android and iOS)
(https://www.tensorflow.org/lite/guide).
4) Use the TensorFlow C++ API
(https://www.tensorflow.org/guide/extend/cc).
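For reference, querying a serving instance as in option 2) could look roughly like the sketch below. It assumes TensorFlow Serving's REST predict endpoint (`/v1/models/<name>:predict` with an `"instances"` payload and a `"predictions"` reply). The model name `my_model`, the host, and the port 8501 are placeholders, and the helper names are made up for illustration. The request is only built here, not sent.

```python
import json
import urllib.request


def build_predict_request(features, model_name="my_model",
                          host="localhost", port=8501):
    """Build (but do not send) a POST request for a serving instance.

    model_name, host and port are hypothetical placeholders.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The REST API expects the input rows under the "instances" key.
    body = json.dumps({"instances": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_predictions(response_bytes):
    """Extract the model outputs from the JSON reply."""
    return json.loads(response_bytes)["predictions"]
```

Sending it would be a matter of passing the request to `urllib.request.urlopen(...)` and feeding the reply body to `parse_predictions`, which of course only works while the server is up.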
None of these are satisfactory to me:
in 1) you must deploy the enormous tensorflow module with your app.
in 2) you're relying on a local server, which isn't great for me.
in 3) you're dead in the water if you're not on Android or iOS.
4) should work, but from all accounts it's a bit of a pain to get
working. In particular, the C++ API relies on libraries that must be
compiled using Bazel, and the final footprint is not small at all.
Have you looked into tfcompile?
It generates a self-contained binary that does not depend on the
TensorFlow runtime.
The interface is C++, but it may be possible to improve this to get a
Python interface to the generated module.
Interesting. I wonder if they have some cross-call optimizations or not?
Either way, as showcased in this post:
http://serge-sans-paille.github.io/pythran-stories/an-incursion-into-basic-ml-gradient-descent-compiled-with-pythran.html
the only current gain of using Pythran when chaining tf calls is likely to
be vectorization (which is nice), plus the removal of a few temporary arrays.
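To make the "temporary array removal" point concrete, here is a small NumPy sketch (not taken from the post). The chained version materializes an intermediate array at each step, while the hand-fused loop shows roughly the single pass that a compiler fusing the expression can produce. The function names and the `#pythran export` signature are illustrative assumptions; the file runs as plain Python too.

```python
import numpy as np


#pythran export scale_shift_relu_chained(float64[], float64[], float64[])
def scale_shift_relu_chained(x, w, b):
    # Under plain NumPy, each step allocates a full-size array.
    scaled = x * w            # temporary 1
    shifted = scaled + b      # temporary 2
    return np.maximum(shifted, 0.0)


def scale_shift_relu_fused(x, w, b):
    # Hand-written equivalent: one pass over the data, one output
    # array, no intermediates. Slow in pure Python; it is only meant
    # to show what the fused computation looks like.
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        v = x[i] * w[i] + b[i]
        out[i] = v if v > 0.0 else 0.0
    return out
```

Both functions compute the same values; the difference is only in how many arrays get allocated and how many times memory is traversed.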
That being said, there are probably optimization opportunities there. Jean,
can you share the Python script generated by your hack?