Development philosophy and acknowledgements

We are not here to reinvent the wheel, but to build an all-in-one open-source product that brings together the available AI acceleration techniques and delivers the fastest AI possible. As a result, nebullvm leverages existing enterprise-grade open-source optimization tools. When such tools and communities already exist and are distributed under a permissive license (Apache, MIT, etc.), we integrate them and happily contribute to their communities. Where a tool does not exist yet, we implement it and open-source the code so that the community can benefit from it.
nebullvm compressor leverages the following open-source projects:
  • Intel Neural Compressor: provides unified APIs for network compression techniques, such as low-precision quantization, sparsity, pruning, and knowledge distillation, across different deep learning frameworks in pursuit of optimal inference performance.
  • SparseML: libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.
nebullvm optimizer leverages the following open-source projects:
  • Apache TVM: open deep learning compiler stack for CPUs, GPUs, and specialized accelerators.
  • BladeDISC: end-to-end Dynamic Shape Compiler project for machine learning workloads.
  • DeepSparse: neural network inference engine that delivers GPU-class performance for sparsified models on CPUs.
  • ONNX Runtime: cross-platform, high-performance ML inference and training accelerator.
  • OpenVINO: open-source toolkit for optimizing and deploying AI inference.
  • TensorRT: C++ library for high-performance inference on NVIDIA GPUs and deep learning accelerators.
  • TFLite and XLA: open-source tools to accelerate TensorFlow models.