Benchmarks
nebulgym has just been launched and has so far been tested on a limited set of use cases. Early results are remarkably good, and future releases are expected to reduce training time even further. At the same time, nebulgym may fail on untested use cases, or yield speedups that differ, for better or worse, from those shown below.
We tested nebulgym on the custom model that you can find in Examples and tutorials. The test consists of training the model for 10 epochs with a batch size of 8.
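For context, the sketch below shows the kind of timing harness such a test might use. It is a minimal illustration, not the actual benchmark code: the nebulgym decorator import path follows the project README at the time of writing and may differ between releases, and RandomDataset and CustomModel are hypothetical stand-ins for the dataset and model from Examples and tutorials.

```python
import time

import torch
from torch.utils.data import DataLoader, Dataset

# Assumption: this import path follows the nebulgym README and may
# change between releases.
from nebulgym.decorators.torch_decorators import accelerate_dataset, accelerate_model


# Hypothetical stand-in for the benchmark dataset: random images and labels.
@accelerate_dataset()
class RandomDataset(Dataset):
    def __len__(self):
        return 256

    def __getitem__(self, idx):
        return torch.randn(3, 224, 224), torch.randint(0, 10, ()).item()


# Hypothetical stand-in for the custom model used in the tests.
@accelerate_model()
class CustomModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Sequential(
            torch.nn.Conv2d(3, 16, 3),
            torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1),
        )
        self.head = torch.nn.Linear(16, 10)

    def forward(self, x):
        return self.head(self.backbone(x).flatten(1))


def timed_training(model, loader, epochs=10):
    """Run a plain training loop and return the wall-clock time in seconds."""
    optimizer = torch.optim.Adam(model.parameters())
    loss_fn = torch.nn.CrossEntropyLoss()
    start = time.perf_counter()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss_fn(model(inputs), targets).backward()
            optimizer.step()
    return time.perf_counter() - start


loader = DataLoader(RandomDataset(), batch_size=8)
accelerated_time = timed_training(CustomModel(), loader, epochs=10)
# The baseline time is obtained by running the same harness on undecorated
# copies of the dataset and model; speedup = baseline_time / accelerated_time.
print(f"Accelerated training time: {accelerated_time:.2f}s")
```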
Below are the training times in seconds before and after nebulgym acceleration, together with the speedup, calculated as the training time of the unoptimized model divided by the training time of the accelerated model (for example, 788.05 / 381.01 ≈ 2.1x on the Intel Xeon instance).

Training time in seconds

Hardware      Not optimized   Accelerated   Speedup
M1 Pro        632.05          347.52        1.8x
Intel Xeon    788.05          381.01        2.1x
AMD EPYC      1547.35         1034.37       1.5x
NVIDIA T4     258.88          127.32        2.0x

Hardware setup

  • M1 Pro: Apple M1 Pro with 16GB of RAM
  • Intel Xeon: EC2 instance on AWS - t2.large
  • AMD EPYC: EC2 instance on AWS - t3a.large
  • NVIDIA T4: EC2 instance on AWS - g4dn.xlarge
How does nebulgym perform on your training setup? What do you think of nebulgym, and how could we make it even better? Share your ideas and results with us in the community chat.