Vol. 135
2012-12-24
Implementation of FDTD-Compatible Green's Function on Heterogeneous CPU-GPU Parallel Processing System
By Tomasz P. Stefanski
Progress In Electromagnetics Research, Vol. 135, 297-316, 2013
Abstract
This paper presents an implementation of the FDTD-compatible Green's function on a heterogeneous parallel processing system. The implementation uses the central processing unit (CPU) and the graphics processing unit (GPU) simultaneously, assigning to each the computational tasks best suited to its architecture. A closed-form expression for this discrete Green's function (DGF) was derived recently, which facilitates its application in FDTD simulations of radiation and scattering problems. Unfortunately, implementing the new DGF formula in software requires multiple-precision arithmetic and may lead to long runtimes. Therefore, DGF computations were accelerated on a heterogeneous CPU-GPU parallel processing system using multiple-precision arithmetic together with the OpenMP and CUDA parallel programming interfaces. The method avoids the drawbacks of CPU-only and GPU-only accelerated implementations of the DGF, namely the long CPU runtime for long DGF waveforms and the significant GPU initialization overhead for short ones. As a result, a seven-fold speedup was obtained relative to the reference DGF implementation on a multicore CPU, which significantly improves the applicability of the DGF in FDTD simulations.
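The trade-off named in the abstract (GPU initialization overhead dominates for short waveforms, CPU runtime dominates for long ones) can be illustrated with a toy cost model. The sketch below is not the paper's scheduler, which runs both processors simultaneously; it only shows why a crossover length exists. The class `CostModel`, the function `choose_device`, the quadratic growth exponent, and all timing constants are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical runtime model: t(n) = overhead + per_sample * n**exponent."""

    overhead_s: float      # fixed start-up cost in seconds (e.g. GPU initialization)
    per_sample_s: float    # cost coefficient in seconds
    exponent: float = 2.0  # assumed growth of the work with waveform length

    def runtime(self, n_samples: int) -> float:
        """Estimated runtime in seconds for a waveform of n_samples."""
        return self.overhead_s + self.per_sample_s * n_samples ** self.exponent


def choose_device(n_samples: int, cpu: CostModel, gpu: CostModel) -> str:
    """Return the device with the lower estimated runtime for this length."""
    return "gpu" if gpu.runtime(n_samples) < cpu.runtime(n_samples) else "cpu"


# Placeholder constants chosen only to make the crossover visible.
CPU = CostModel(overhead_s=0.0, per_sample_s=1e-6)
GPU = CostModel(overhead_s=0.5, per_sample_s=1e-7)

if __name__ == "__main__":
    for n in (100, 500, 1000, 2000):
        print(n, choose_device(n, CPU, GPU))
```

With these placeholder constants the short waveforms stay on the CPU and the long ones move to the GPU. In practice the constants would have to be measured on the target hardware.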
Citation
Tomasz P. Stefanski, "Implementation of FDTD-Compatible Green's Function on Heterogeneous CPU-GPU Parallel Processing System," Progress In Electromagnetics Research, Vol. 135, 297-316, 2013.
doi:10.2528/PIER12111702