DarkNet CUDA vs OpenCL and CPU vs NVIDIA vs AMD

Hi, today I will show you some measurement results from my PhD. I am working on the first publication about DarkNet on OpenCL; you can find the source code of this project at https://github.com/sowson/darknet. The IEEE publication has to be concise and to the point, so I cannot put too many graphics and big tables in it… but wait, I have a public blog. So I can post them here. First things first: the heroes of the battle come on stage.
[Figure: Ph.D. CUDA vs OpenCL and CPU vs NVIDIA vs AMD]
My workstation uses either 2x NVIDIA Titan RTX 24 GB GDDR6 or 2x XFX AMD Radeon VII 16 GB HBM2, and I did the measurements on Ubuntu 18.04. First, I would like to show you a comparison of back propagation, which is part of the training process. Truth be told, I asked the community several times to measure the performance and compare the OpenCL versions. It did not happen, so I decided to invest in AMD GPUs and make the comparison myself. Here is that comparison of back propagation timings.
[Figure: Ph.D. CUDA vs OpenCL and CPU vs NVIDIA vs AMD BW]
Now let me show you only the last convolutional layer of back propagation, but with all of its sub-kernels, so that you can compare and choose the best GPU for DarkNet on OpenCL.
[Figure: Ph.D. CUDA vs OpenCL and CPU vs NVIDIA vs AMD BW Last]
A very nice result for AMD, right? But only with CLBlast instead of clBLAS. It looks like AMD has to fix this basic linear algebra subprograms library, otherwise it makes no sense to use it. The last thing to mention is that I am comparing top mainstream GPUs from NVIDIA and AMD, and I believe the AMD card works so nicely thanks to its HBM2 VRAM.
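If you would like to repeat this kind of comparison on your own hardware, here is a minimal sketch of a timing harness in Python. It is not the code I used for the charts above, and the names `time_stage`, `speedup` and `dummy_backward` are made up for illustration. It assumes the function you time blocks until the device has finished its work; GPU calls are usually asynchronous, so without that wait you would only measure the launch time.

```python
import statistics
import time


def time_stage(fn, warmup=3, repeats=10):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are discarded (caches, lazy initialization)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # assumption: fn() returns only after the work is done
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Hypothetical placeholder workload; replace it with a call that runs
    # one backward pass and waits for the device to finish.
    def dummy_backward():
        sum(i * i for i in range(100_000))

    print(f"median: {time_stage(dummy_backward):.3f} ms")
```

The median is used instead of the mean so that a single slow run (for example, one hit by a background process) does not distort the result.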
Regarding the IEEE publication: I am working on it. There will be a nice story about the journey of DarkNet on OpenCL, with many viewpoints, measurements, results, conclusions and more, so stay tuned. Thanks for reading!
p ;).
