PIER
 
Progress In Electromagnetics Research
ISSN: 1070-4698, E-ISSN: 1559-8985

IMPLEMENTATION OF FDTD-COMPATIBLE GREEN'S FUNCTION ON HETEROGENEOUS CPU-GPU PARALLEL PROCESSING SYSTEM

By T. P. Stefanski

Abstract:
This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The implementation uses the central processing unit (CPU) and the graphics processing unit (GPU) simultaneously, assigning to each the computational tasks best suited to its architecture. A closed-form expression for this discrete Green's function (DGF) was recently derived, which facilitates its application in FDTD simulations of radiation and scattering problems. Unfortunately, implementing the new DGF formula in software requires multiple-precision arithmetic and may lead to long runtimes. Therefore, the DGF computations were accelerated on a heterogeneous CPU-GPU parallel processing system using multiple-precision arithmetic together with the OpenMP and CUDA parallel programming interfaces. The method avoids the drawbacks of the CPU-only and GPU-only accelerated implementations of the DGF, namely the long runtime on the CPU for long DGF waveforms and the significant GPU initialization overhead for short ones. As a result, a sevenfold speedup was obtained relative to the reference DGF implementation on a multicore CPU, which significantly improves the applicability of the DGF in FDTD simulations.
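
The trade-off described in the abstract (the CPU is slow for long waveforms, while the GPU pays a fixed initialization overhead that dominates for short ones) can be illustrated with a toy cost model. The sketch below is not the paper's algorithm: the linear cost model, the function names, and all numeric constants are assumptions chosen only to show why a crossover waveform length exists.

```python
# Toy cost model, NOT the method from the paper.
# Assumption: runtime grows linearly with waveform length n, and the GPU
# adds a fixed start-up overhead. All constants are hypothetical.

def cpu_time(n, cpu_cost):
    """Modelled CPU runtime for a waveform of n samples."""
    return cpu_cost * n

def gpu_time(n, gpu_cost, gpu_overhead):
    """Modelled GPU runtime: fixed initialization overhead plus per-sample cost."""
    return gpu_overhead + gpu_cost * n

def crossover_length(cpu_cost=4, gpu_cost=1, gpu_overhead=300):
    """Waveform length at which both modelled runtimes are equal."""
    if cpu_cost <= gpu_cost:
        return float("inf")  # the GPU never catches up in this model
    return gpu_overhead / (cpu_cost - gpu_cost)

def pick_device(n, cpu_cost=4, gpu_cost=1, gpu_overhead=300):
    """Return 'cpu' or 'gpu', whichever the toy model predicts is faster."""
    if cpu_time(n, cpu_cost) <= gpu_time(n, gpu_cost, gpu_overhead):
        return "cpu"
    return "gpu"
```

With these hypothetical constants, short waveforms are routed to the CPU and long ones to the GPU. A heterogeneous implementation would need measured costs for the actual hardware rather than the placeholder values used here.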

Citation:
T. P. Stefanski, "Implementation of FDTD-Compatible Green's Function on Heterogeneous CPU-GPU Parallel Processing System," Progress In Electromagnetics Research, Vol. 135, 297-316, 2013.
doi:10.2528/PIER12111702
http://www.jpier.org/PIER/pier.php?paper=12111702


© Copyright 2014 EMW Publishing. All Rights Reserved