This paper presents an efficient technique for generating the sparse systems of linear equations that arise in computational electromagnetics when the finite element method is used with higher-order elements. The proposed approach employs a graphics processing unit (GPU) for both numerical integration and matrix assembly. Performance results obtained on a test platform consisting of a Fermi GPU (1x Tesla C2075) and a CPU (2x twelve-core Opterons) indicate that the GPU implementation of the matrix generation achieves speedups by factors of 81 and 19 over optimized single-threaded and multi-threaded CPU-only implementations, respectively.