{"id":259,"date":"2022-11-26T10:55:55","date_gmt":"2022-11-26T01:55:55","guid":{"rendered":"http:\/\/mystouswp.cafe24.com\/?p=259"},"modified":"2022-11-28T07:58:40","modified_gmt":"2022-11-27T22:58:40","slug":"xpugpu-ipu-tpu-%eb%93%b1-acceleration-1","status":"publish","type":"post","link":"http:\/\/kyunam.com\/?p=259","title":{"rendered":"XPU(GPU, IPU, TPU, etc.) &#8211; Hardware Acceleration #1"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\"><strong>Introduction<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Interest in AI development and the technology behind it are advancing by the day. There is a relentless race for a little more speed and a little more accuracy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To break away from a deep learning environment in which GPGPUs are taken for granted, companies such as Google and Graphcore are developing new hardware such as the TPU and IPU and releasing the accompanying software.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" src=\"https:\/\/www.nvidia.com\/content\/dam\/en-zz\/es_em\/es_em\/Solutions\/Data-Center\/tesla-v100\/data-center-tesla-v100-pcie-625-ud@2x.jpg\" alt=\"\"\/><figcaption class=\"wp-element-caption\">NVIDIA Tesla V100 (source: https:\/\/www.nvidia.com\/ko-kr\/data-center\/v100\/)<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Most of us know that deep learning runs faster on a GPU or on one of the other XPUs,
but in practice few people know which XPU technologies actually make it faster.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This post looks at XPUs, the collective name for GPUs, TPUs, and IPUs: from what technical standpoint they speed up deep learning, and what characteristics each one has.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You might think: why not just keep buying better GPUs and swapping them in, do I really need to know this? The truth is that even older GPUs are rarely used at 100%. Once you understand what improves across the NVIDIA P, V, and A series and why you would buy them, you can judge whether they are the right investment for your own workload.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>The Role of Hardware Accelerators in Deep Learning<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Let's start by summarizing how an XPU speeds up training in deep learning (DL from here on).
XPUs (GPU, TPU, NPU, IPU) are called hardware accelerators because they handle auxiliary computation rather than the main CPU computation. As a brief aside on terminology: although it has since faded from view, Intel once offered a hardware accelerator called Intel Xeon Phi, built by integrating many small x86 cores on one chip (in its first generation, cores derived from the original Pentium design). Because those many processors compute alongside the main processor, Intel also used the term co-processor.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What an XPU speeds up in deep learning is the linear algebra that is repeated countless times inside DL. It can also raise speed by parallelizing or optimizing the DL computation itself.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Linear algebra operations account for most of the computation in DL.
The most representative operations are listed below.<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Transpose (A<sup>T<\/sup>)<\/li>\n\n\n\n<li>Matrix Product (AB)<\/li>\n\n\n\n<li>Element-wise Product or Hadamard Product (A\u2299B)<\/li>\n\n\n\n<li>Matrix Inverse (A<sup>-1<\/sup>)<\/li>\n\n\n\n<li>Norm<\/li>\n\n\n\n<li>Eigen Decomposition (Av=\u03bbv)<\/li>\n\n\n\n<li>Vector reductions<\/li>\n\n\n\n<li>Gradients and Jacobians<\/li>\n\n\n\n<li>Determinant (det(A))<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Gradients and Jacobians are not basic linear algebra, but they are used heavily for differentiation. An XPU implements in hardware its own way of computing the operations listed above quickly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">How can linear algebra operations be made faster? The details differ slightly from one XPU to another, but the techniques can be grouped as follows.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>SIMD (Single Instruction Multiple Data)<\/li>\n\n\n\n<li>MIMD (Multiple Instructions Multiple Data)<\/li>\n\n\n\n<li>Systolic Architectures (Systolic Operation)<\/li>\n\n\n\n<li>Memory Locality Optimization<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">There are many other techniques, but these four are the most widely used.
SIMD in particular is the key to the performance gains a hardware accelerator provides. It applies one instruction to many pieces of data. Think about a matrix operation: it repeats the multiply and add of a row and a column, so the instruction stays the same and only the data changes. SIMD performs that calculation for many rows and columns at once, which is naturally faster. How to handle this kind of parallel computation efficiently is what XPU hardware works on internally.<\/p>\n\n\n<div class=\"wp-block-image aligncenter size-full wp-image-271\">\n<figure class=\"is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/mystouswp.cafe24.com\/wp-content\/uploads\/2020\/08\/simd-1-1024x503.png\" alt=\"\" class=\"wp-image-271\" width=\"776\" height=\"381\" srcset=\"http:\/\/kyunam.com\/wp-content\/uploads\/2020\/08\/simd-1-1024x503.png 1024w, http:\/\/kyunam.com\/wp-content\/uploads\/2020\/08\/simd-1-300x148.png 300w, http:\/\/kyunam.com\/wp-content\/uploads\/2020\/08\/simd-1-768x378.png 768w, http:\/\/kyunam.com\/wp-content\/uploads\/2020\/08\/simd-1.png 1444w\" sizes=\"auto, (max-width: 776px) 100vw, 776px\" \/><figcaption class=\"wp-element-caption\">SIMD Operation<\/figcaption><\/figure>\n<\/div>\n\n\n<pre class=\"wp-block-code\"><code class=\"\">\u203b GPGPUs use the term SIMT (Single Instruction Multiple Threads) instead of SIMD.
It means running the same operation on many hardware threads at the same time. Because the concept is similar to SIMD, this post refers to both as SIMD.<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">A Closer Look at FMA (FFMA) and Systolic Architecture<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Besides SIMD, MIMD, and memory locality optimization, let's look at one more technique. A matrix product sums the products of the elements of a row and a column. Broken down, each step looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">ax (product of one row element and one column element) + b (running total, initially 0)<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">There are two instructions here: a multiply and an add. That means two instruction cycles. And this step is repeated an enormous number of times.
Because it is repeated so often, an operation was introduced to speed it up.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">FMA (fused multiply-add): computes ax+b as one instruction instead of a separate multiply and add<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Applied to floating point, this is called<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">FFMA (floating point fused multiply-add)<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">FFMA is used heavily on XPUs. It builds on SIMD while also reducing the number of operations. Do CPUs lack this? No. The CPU you are using also has instructions for parallel processing. The instruction set called AVX2, which arrived together with FMA instructions, has been supported since the Haswell architecture. However, a CPU has far fewer ALUs than an XPU, so the gain is limited. That said, a server CPU such as a Xeon has dozens of cores, so the effect is quite noticeable.
For that reason, servers sometimes run DL preprocessing in parallel on the CPU as well.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">FMA and FFMA are also among the important operations that hardware accelerators support.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Systolic architecture is used more in TPUs and IPUs than in GPGPUs. In parallel processing, what matters even more than doing things simultaneously is that data arrives where it is needed at exactly the right time. To achieve this you can prefetch, or you can design the pipeline carefully. However, frequent data copies from the host to the accelerator can degrade performance. Systolic architecture is a way of minimizing data copies during computation. It suits NPU-style designs whose PEs (Processing Elements) are independent, which is why it is used more in NPUs than in GPGPUs. It also improves memory locality.
It is the signature technique of Google's TPU.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The figure below illustrates a matrix multiply, and we will use it to explain the systolic approach. Suppose matrix A is multiplied by matrix B and the result is stored in matrix C. The positions in C whose computation uses A<sub>00<\/sub> are marked with red circles. In the same way, the positions that use B<sub>00<\/sub> are marked with blue circles.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"306\" src=\"http:\/\/mystouswp.cafe24.com\/wp-content\/uploads\/2021\/03\/image-1024x306.png\" alt=\"\" class=\"wp-image-468\" srcset=\"http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1024x306.png 1024w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-300x90.png 300w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-768x230.png 768w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1536x459.png 1536w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-2048x612.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">C<sub>00<\/sub> is the sum of the products of A<sub>00<\/sub>, A<sub>01<\/sub>, A<sub>02<\/sub>, A<sub>03<\/sub> with B<sub>00<\/sub>, B<sub>10<\/sub>, B<sub>20<\/sub>, B<sub>30<\/sub> respectively.
C<sub>01<\/sub> is the sum of the products of A<sub>00<\/sub>, A<sub>01<\/sub>, A<sub>02<\/sub>, A<sub>03<\/sub> with B<sub>01<\/sub>, B<sub>11<\/sub>, B<sub>21<\/sub>, B<sub>31<\/sub> respectively. Following this pattern, A<sub>00<\/sub> is used in computing C<sub>00<\/sub>, C<sub>01<\/sub>, C<sub>02<\/sub>, C<sub>03<\/sub>, and B<sub>00<\/sub> is used in computing C<sub>00<\/sub>, C<sub>10<\/sub>, C<sub>20<\/sub>, C<sub>30<\/sub>. Offloading A<sub>00<\/sub>, B<sub>00<\/sub>, and the other values every time they are needed would be very costly. In a systolic design, like blood flowing through blood vessels, A<sub>00<\/sub> and B<sub>00<\/sub> are offloaded only for the first operation and then passed from PE to PE so that each PE can use them. This naturally requires each PE to have local memory. Many NPUs have a local cache or local memory, which is how they overcome the overhead of offloading. The figure below should make this clearer.
Assume that one PE is responsible for computing each element of the C matrix.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"508\" src=\"http:\/\/mystouswp.cafe24.com\/wp-content\/uploads\/2021\/03\/image-1-1024x508.png\" alt=\"\" class=\"wp-image-469\" srcset=\"http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1-1024x508.png 1024w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1-300x149.png 300w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1-768x381.png 768w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1-1536x762.png 1536w, http:\/\/kyunam.com\/wp-content\/uploads\/2021\/03\/image-1-2048x1016.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In a systolic architecture, as described above, A<sub>00<\/sub> and B<sub>00<\/sub> are offloaded first, multiplied, and the product is stored in C<sub>00<\/sub>. A<sub>00<\/sub> and B<sub>00<\/sub> then move to C<sub>01<\/sub> and C<sub>10<\/sub> respectively for the next operation. More data is then offloaded for the next step: A<sub>01<\/sub> and B<sub>10<\/sub> for the next term of C<sub>00<\/sub>, B<sub>01<\/sub> to be multiplied with A<sub>00<\/sub> at C<sub>01<\/sub>, and A<sub>10<\/sub> to be multiplied with B<sub>00<\/sub> at C<sub>10<\/sub>.
Carried out this way, the computation spreads from the top left toward the bottom right. Newly offloaded values of the A matrix keep their row index and move to the right, while values of the B matrix keep their column index and move downward, as they are used in the computation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That is a brief summary of the strategies XPUs use to speed up DL. Now let's look at the XPUs one by one.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Hardware Accelerators for Faster Computation<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Characteristics of Each XPU<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">SIMD, MIMD, systolic architecture, FMA, and memory locality, all discussed above, are techniques that every XPU uses. If they all computed in the same way, latency (how quickly an operation completes) and throughput (how much work is done per unit of time) would be the main considerations when choosing an XPU. Price-performance would naturally matter too.
However, each XPU has its own distinct characteristics. Here is a brief summary of three hardware accelerators: GPGPU, TPU, and IPU. Let's go over the outline first and then look at each one in detail.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>NVIDIA GPGPU (General-Purpose computing on Graphics Processing Unit)<\/strong><br>The best-known hardware accelerator is NVIDIA's Tesla line. (This discussion is limited to the Tesla line of general-purpose GPUs.) Up through the Pascal generation, speed was improved by packing in more cores, improving shared memory, and raising memory bus speed. Starting with Volta, Tensor Cores were added to parallelize the tensor operations themselves. This raised the overall speed of ML training and inference.
Volta's Tensor Cores support only FP16 inputs, whereas from the Ampere generation they support FP64, TF32, bfloat16, FP16, INT8, INT4, and INT1, and this wider range of data types improves performance across many kinds of computation.<\/li>\n\n\n\n<li><strong>Google TPU (Tensor Processing Unit)<\/strong><br>A TPU is an auxiliary processing unit that handles tensors in hardware. Beyond a fast implementation of linear algebra operations, it used 8-bit integers instead of 32-bit floating point for inference and, according to Google's published comparison, ran roughly 15 to 30 times faster than NVIDIA's K-series GPUs of the time (2015). As an NPU-class device, the TPU adopts a systolic architecture to maximize the efficiency of tensor (matrix) operations, and TPU v2 and v3 show high performance in training as well as inference.<\/li>\n\n\n\n<li><strong>Graphcore IPU (Intelligence Processing Unit)<\/strong><br>There is a wide variety of hardware accelerators for matrix operations. Graphcore's IPU is used as the example here for two reasons.
The first is that plenty of information about its structure and principles is available (which is needed in order to explain it). The second is the IPU's structural characteristics. Like a GPGPU, the IPU densely packs ALUs for SIMD processing. Unlike a GPGPU, it also has control flow/prediction logic and local memory (SRAM). This makes MIMD possible and maximizes memory locality, which in turn maximizes compute speed.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/b\/be\/Tensor_Processing_Unit_3.0.jpg\" alt=\"\" width=\"631\" height=\"450\"\/><figcaption class=\"wp-element-caption\">Google TPU v3 board (source: https:\/\/commons.wikimedia.org\/wiki\/File:Tensor_Processing_Unit_3.0.jpg)<br>Zinskauf \/ CC BY-SA (https:\/\/creativecommons.org\/licenses\/by-sa\/4.0)<\/figcaption><\/figure>\n<\/div>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.graphcore.ai\/hs-fs\/hubfs\/IPU%20Machine2000%20Top_KL_compressed.jpg?width=960&amp;name=IPU%20Machine2000%20Top_KL_compressed.jpg\" alt=\"\" width=\"631\" height=\"474\"\/><figcaption class=\"wp-element-caption\">Graphcore IPU (source: Graphcore homepage &#8211; https:\/\/www.graphcore.ai\/products\/ipu)<\/figcaption><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">All three pursue different strategies to speed up linear algebra.
Every vendor claims that its own device is the best, but in my view what matters is understanding the characteristics of each and how to make use of them; ranking them is pointless. Hardware improves over time anyway, so today's number one will not be number one forever.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The next post will look in detail at the characteristics of these three hardware accelerators. While covering how GPGPUs, TPUs, and IPUs work, it will also briefly cover two more hardware accelerators: Cerebras systems and Habana Labs products.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction. Interest in AI development and the technology behind it are advancing by the day. There is a relentless race for a little more speed and a little more accuracy.
To break away from a deep learning environment in which GPGPUs are taken for granted<span class=\"more-button\"><a href=\"http:\/\/kyunam.com\/?p=259\" class=\"more-link\">Read more<span class=\"screen-reader-text\">XPU(GPU, IPU, TPU, etc.) &#8211; Hardware Acceleration #1<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":271,"comment_status":"open","ping_status":"open","sticky":true,"template":"","format":"standard","meta":{"footnotes":""},"categories":[63],"tags":[168,170,171,172,169,167],"class_list":["post-259","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-parallel-computing","tag-gpu","tag-ipu","tag-npu","tag-quantum","tag-tpu","tag-xpu"],"_links":{"self":[{"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/posts\/259","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/kyunam.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=259"}],"version-history":[{"count":3,"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/posts\/259\/revisions"}],"predecessor-version":[{"id":658,"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/posts\/259\/revisions\/658"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/kyunam.com\/index.php?rest_route=\/wp\/v2\/media\/271"}],"wp:attachment":[{"href":"http:\/\/kyunam.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=259"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/kyunam.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=259"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/kyunam.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=259"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/
{rel}","templated":true}]}}