@scarrazza
Contributor

No description provided.

@scarlehoff
Member

Is this able to run on the FPGA already?

Tomorrow I might try unifying the Python and C versions so that they use the same OpenCL kernels; that way the FPGA, C and Python code always stay synchronized.

@scarrazza
Contributor Author

Not yet, I still have to finalize the Vegas OpenCL loop in C++.
I am working on it, so the code will probably be ready for compilation in 1-2 h.

The problem with unification is that I am sure the FPGA kernel will look quite different from the GPU/CPU one.

@scarlehoff
Member

We should make it look similar enough, though. I haven't looked at OpenCL for Xilinx, but if it is very different, then what is the point of having OpenCL support...

@scarrazza
Contributor Author

scarrazza commented Sep 24, 2019

It will look similar, but to get fast results we have to add tons of #pragmas and specific attributes for barrier and pipeline operations (things that we can certainly include in a single .cl file with #ifdef FPGA...).
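A minimal sketch of the single-file idea, assuming a Python host that passes a preprocessor define to the OpenCL compiler. The names `KERNEL_SOURCE`, `build_options`, `PIPELINE_LOOP` and the `accumulate` kernel are hypothetical and not taken from this PR; the `xcl_pipeline_loop` attribute is an assumption about the Xilinx toolchain and should be checked against the vendor documentation.

```python
# Sketch only: every name below is hypothetical, not code from this repository.

# One kernel source shared by the CPU, GPU and FPGA builds.
# FPGA-specific attributes are hidden behind a macro that is empty elsewhere.
KERNEL_SOURCE = r"""
#ifdef FPGA
/* Assumed Xilinx OpenCL loop-pipelining attribute; verify before use. */
#define PIPELINE_LOOP __attribute__((xcl_pipeline_loop))
#else
#define PIPELINE_LOOP
#endif

__kernel void accumulate(__global const float *w,
                         __global float *out,
                         const int n)
{
    float acc = 0.0f;
    PIPELINE_LOOP
    for (int i = 0; i < n; i++) {
        acc += w[i];
    }
    out[0] = acc;
}
"""


def build_options(target):
    """Return the compiler options for a target: 'cpu', 'gpu' or 'fpga'."""
    if target not in ("cpu", "gpu", "fpga"):
        raise ValueError("unknown target: %r" % (target,))
    # "-D name" is the standard OpenCL way to predefine a preprocessor macro.
    return ["-DFPGA"] if target == "fpga" else []


if __name__ == "__main__":
    for target in ("cpu", "gpu", "fpga"):
        print(target, build_options(target))
```

With this layout the Python and C hosts could read the same source and differ only in the define they pass, so the FPGA-only attributes never reach the CPU/GPU compilers.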

@scarrazza
Contributor Author

Software emulation seems to work. I am now testing hardware emulation and real hardware; I will push the bitstreams as soon as the compilation terminates.

@scarrazza
Contributor Author

scarrazza commented Sep 24, 2019

OK, all three modes are working. The current kernel on real FPGA hardware takes almost the same amount of time as a 36-thread CPU on dom. Compiling for hardware takes 2 h.

Here is the profiling for sw_emulation: profiling_cpu.pdf

Next step: play with the kernel and the special opencl instructions described in

@scarlehoff
Member

We have to be careful with the binaries. Having one set for testing is OK, but I think it is better to share them on dom in some folder, because the repository is already over 60 MB...

@scarrazza
Contributor Author

Yeah, we can drop them from here.

@scarrazza
Contributor Author

Asking for 100 events makes hw_emu usable. Here is the report:
profiling_hw_emu.pdf

There is a nice table which helps in checking whether we are doing things properly (and we are not).

@scarlehoff
Member

I think the OpenCL kernel can be greatly improved in a way that is much lighter on memory (and also less useful for HEP, but we don't really care at this point :P).
I'll continue playing with the Python/C version until I am happy with it.
