How nebullvm works
Easy-to-use library to boost AI inference by leveraging state-of-the-art optimization techniques 🚀
nebullvm is an open-source tool designed to speed up AI inference in just a few lines of code. It boosts your model to achieve the maximum acceleration that is physically possible on your hardware.
We are building a new AI inference acceleration product that leverages state-of-the-art open-source optimization tools to optimize the entire software-to-hardware stack.
If you like the idea, leave a star on GitHub to support the project ⭐
The core nebullvm workflow consists of 3 steps:
  • Select: input your model in your preferred DL framework and express your preferences regarding:
    • Accuracy loss: do you want to trade off a little accuracy for much higher performance?
    • Optimization time: stellar accelerations can be time-consuming. Can you wait, or do you need an instant answer?
  • Search: nebullvm automatically tests every combination of optimization techniques across the software-to-hardware stack (sparsity, quantization, compilers, etc.) that is compatible with your needs and local hardware.
  • Serve: finally, nebullvm chooses the best configuration of optimization techniques and returns an accelerated version of your model in the DL framework of your choice (just on steroids 🚀).

Orientation map

There are still some open questions you might want answered.
  • What is the product architecture? Read more.
  • What models, hardware and optimization techniques are supported by nebullvm? See here.
  • How to move directly to library installation? Go here.
  • How to get started with the nebullvm API? Check here.
  • Where can I find notebooks to test nebullvm? Find here.
And before we go any further, leave a ⭐ on GitHub if you enjoy the project, and join the Discord community where we chat about nebullvm and AI optimization. Happy acceleration 🚀🚀