This paper proposes an efficient parallel shooting and bouncing ray (SBR) method on a graphics processing unit (GPU) cluster for solving electromagnetic scattering problems. For each incident direction, the parallel SBR method partitions the virtual aperture into sub-apertures and distributes the computation of each sub-aperture over the GPU nodes. Because the ray tubes in the virtual aperture do not all take the same computational time, the parallel efficiency depends strongly on how the virtual aperture is partitioned. This paper addresses the issue with a dynamic partitioning scheme that partitions the aperture according to the computational time at the previous angle, which can achieve excellent load balance. Numerical examples are presented to demonstrate the accuracy, high parallel efficiency, good scalability and versatility of the proposed method.
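The idea of partitioning by the previous angle's computational time can be illustrated with a minimal sketch. The abstract does not specify the partitioning geometry or the timing granularity, so everything below is an assumption made for illustration: the aperture is treated as a one-dimensional sequence of columns, `prev_times` is a hypothetical list of per-column times measured at the previous angle, and `partition_aperture` is a hypothetical helper that places cuts where the cumulative time crosses equal fractions of the total. It is not the authors' implementation.

```python
from bisect import bisect_left
from itertools import accumulate


def partition_aperture(prev_times, num_nodes):
    """Split aperture columns into contiguous sub-apertures of similar cost.

    prev_times[i] is the (assumed) time spent on column i at the previous
    incident angle. Returns one half-open (start, end) column range per node.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    n = len(prev_times)
    prefix = list(accumulate(prev_times))  # cumulative cost per column
    total = prefix[-1] if prefix else 0

    if total <= 0:
        # No timing information yet (e.g. first angle): split uniformly.
        bounds = [n * k // num_nodes for k in range(num_nodes + 1)]
    else:
        bounds = [0]
        for k in range(1, num_nodes):
            # Cut after the first column whose cumulative cost reaches
            # the k-th equal share of the total cost.
            target = total * k / num_nodes
            cut = bisect_left(prefix, target) + 1
            # Keep the cuts ordered and inside the aperture.
            bounds.append(min(max(cut, bounds[-1]), n))
        bounds.append(n)

    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    # Columns near the edges were expensive at the previous angle.
    times = [4, 1, 1, 1, 1, 4]
    for node, (start, end) in enumerate(partition_aperture(times, 2)):
        print(node, (start, end), sum(times[start:end]))
```

In this sketch the quality of the balance is limited by the column granularity: a single very expensive column cannot be split, so a finer timing resolution would be needed for heavily skewed workloads. The sketch also relies on the assumption that the cost distribution changes little between adjacent angles.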