CUDA lectures (presentation)

  • Published on
    11-Aug-2015

Transcript

<p>CUDA. GPU. CUDA C</p>
<p>2011</p>
<p>Outline: GPGPU; GPU?; GPU; CUDA</p>
<p>GPGPU</p>
<p>GPGPU (general-purpose computing on GPU) is the use of a graphics processor for general-purpose computation.</p>
<p>GPGPU. CUDA showcase: http://www.nvidia.com/object/cuda_showcase_html.html</p>
<p>Heterogeneous computing: CPU; APU (GPU); DSP [...]. CPU + GPU.</p>
<p>Programming the GPU:</p>
<p>Shading languages: HLSL, GLSL, Cg</p>
<p>Vendor-specific platforms: NVIDIA CUDA, AMD Stream</p>
<p>Cross-vendor: OpenCL, C++ AMP</p>
<p>NVIDIA CUDA</p>
<p>CUDA stands for Compute Unified Device Architecture, NVIDIA's architecture for computing on GPUs.</p>
<p>NVIDIA CUDA [figure]. Source: NVIDIA CUDA C Programming Guide v. 4.0.</p>
<p>CUDA 4.0: http://developer.nvidia.com/cuda-toolkit-40. CUDA driver; CUDA toolkit; GPU Computing SDK.</p>
<p>GPU?</p>
<p>CPU and GPU [two figures]. Source: NVIDIA CUDA C Programming Guide v. 4.0.</p>
<p>3 of the top 5 systems in the 2011 TOP500 list use GPUs. Source: top500.org.</p>
<p>3 of the top 5 systems in the 2011 Green500 list use GPUs. Source: green500.org.</p>
<p>GPU</p>
<p>CPU and GPU: the CPU is cache-oriented; the GPU is cache-miss oriented. Source: NVIDIA CUDA C Programming Guide v. 3.2.</p>
<p>CUDA. GPU.
CUDA C</p>
<p>GPU: a GPU consists of streaming multiprocessors (MP), each containing CUDA cores. Before Fermi, CUDA cores were called scalar processors (SP). The CUDA cores of a multiprocessor work in SIMD fashion.</p>
<p>Tesla 8/10 [figure]</p>
<p>Tesla 8 [figure]</p>
<p>Tesla 10 [figure]</p>
<p>Tesla 10 memory: device (global) memory; shared memory, common to the SPs of one MP; constant cache, common to the SPs of one MP; texture cache, common to the SPs of one MP; registers, private to each SP; local memory, private to each SP.</p>
<p>Fermi [figure]</p>
<p>Fermi: L2 cache; L1 cache; shared memory and L1 cache configurable as 48 kB/16 kB; SFU; C++ support; ECC.</p>
<p>Compute capability is written as major.minor, from a major and a minor revision number, for example 1.3. The compute capability of each CUDA-capable device is listed in Appendix A of the NVIDIA CUDA C Programming Guide. Fermi devices have major revision 2 (2.x); earlier devices have major revision 1 (1.x).</p>
<p>The features of each compute capability are described in Appendix G of the NVIDIA CUDA C Programming Guide: 1.0; 1.3; 2.0 (C++); CUDA 4.0 [...].</p>
<p>A thread executes a kernel. Threads are grouped into thread blocks, and the blocks form a grid.</p>
<p>Block ID: 1D, 2D, or 3D (3D with CUDA 4.0). Thread ID: 1D, 2D, or 3D.</p>
<p>[Figure. Source: NVIDIA CUDA C Programming Guide v. 4.0.]</p>
<p>[Figure. Source: NVIDIA CUDA C Programming Guide v. 4.0.]</p>
<p>CUDA C</p>
<p>CUDA C is an extension of C/C++ [...]. Terminology
: host = CPU; device = GPU; a kernel is a function executed on the GPU.</p>
<p>Function qualifiers. __host__: executed on the host, called from the host (the default). __global__: executed on the device, called from the host; marks a kernel. __device__: executed on the device, called from the device.</p>
<p>__global__ functions must return void [...].</p>
<p>__device__ can be combined with __host__; the function is then compiled in 2 versions. Some features of __device__ functions require compute capability 2.0 and are not available on 1.x devices.</p>
<p>Variable qualifiers. __device__: the variable resides in device global memory; has the lifetime of the application; is accessible from all threads of the grid and, through the runtime library, from the host. __constant__: the variable resides in constant memory.</p>
<p>__shared__: the variable resides in the shared memory of a thread block; has the lifetime of the block; is accessible only to the threads of that block.</p>
<p>Built-in vector types: [u]char[1..4], [u]int[1..4], [u]long[1..4], float[1..4], double2. Components are accessed as .x, .y, .z, .w. Values are created with the make_&lt;type name&gt; functions, for example make_int2. dim3 is uint3 with a constructor; components that are not specified are initialized to 1.</p>
<p>Built-in variables available on the GPU: gridDim (dim3), the dimensions of the grid; blockIdx (uint3), the index of the block within the grid; blockDim (dim3), the dimensions of the block; threadIdx (uint3), the index of the thread within the block.</p>
<p>Global index of a thread along x: idx = blockIdx.x * blockDim.x + threadIdx.x;</p>
<p>CUDA API</p>
<p>CUDA API functions return cudaError_t; cudaSuccess means the call succeeded. There are two APIs: the low-level CUDA driver API, with functions named cu*, and the high-level C runtime for CUDA, with functions named cuda*.</p>
<p>Device information: cudaError_t cudaGetDeviceCount(int* count) returns the number of devices; cudaError_t cudaGetDevice(int* dev) returns the device currently in use; cudaError_t cudaGetDeviceProperties(struct cudaDeviceProp* prop, int dev) returns the properties of the given device.</p>
<p>Device selection: cudaError_t cudaSetDevice(int dev) selects the device to use; cudaError_t cudaChooseDevice(int* dev, const struct cudaDeviceProp* prop) returns the device that best matches the given properties.</p>
<p>Memory management: cudaError_t cudaMalloc(void** devPtr, size_t count) allocates device memory; cudaError_t cudaFree(void* devPtr) frees it. Copying: cudaError_t cudaMemcpy(void* dst, const void* src, size_t count, enum cudaMemcpyKind kind) copies synchronously; cudaMemcpyAsync is the asynchronous variant.</p>
<p>CUDA. GPU.
CUDA C</p>
<p>Kernel launch: kernel&lt;&lt;&lt;Dg, Db, Ns, S&gt;&gt;&gt;(arguments). Dg gives the dimensions of the grid; Dg.x * Dg.y * Dg.z is the number of blocks launched. Db gives the dimensions of a block; Db.x * Db.y * Db.z is the number of threads per block. Ns (bytes of dynamically allocated shared memory per block) and S (the stream) are optional.</p>
<p>Example: __global__ void kernel_func() { } dim3 dim_grid(100, 50); dim3 dim_block(8, 8, 8); kernel_func&lt;&lt;&lt;dim_grid, dim_block&gt;&gt;&gt;();</p>
<p>void __syncthreads() is a barrier for the threads of one block. cudaThreadSynchronize() makes the host wait until the device has finished all preceding work. cudaStream_t, cudaEvent_t [...].</p>
<p>CUDA Hello, World!</p>
<p>#include &lt;cuda_runtime.h&gt;</p>
<p>__global__ void hello(int * output) { /* each thread stores its own global index */ int globalIdx = blockIdx.x * blockDim.x + threadIdx.x; output[globalIdx] = globalIdx; }</p>
<p>int main() { int buffer_size = 4 * 512; int * buffer = new int[buffer_size]; int * buffer_gpu; cudaMalloc((void **) &amp;buffer_gpu, buffer_size * sizeof(int)); /* 4 blocks of 512 threads, one thread per element */ hello&lt;&lt;&lt;4, 512&gt;&gt;&gt;(buffer_gpu); cudaMemcpy(buffer, buffer_gpu, buffer_size * sizeof(int), cudaMemcpyDeviceToHost); cudaFree(buffer_gpu); delete [] buffer; return 0; }</p>
<p>Compilation: nvcc. Build rules for Microsoft Visual Studio. CUDA 4.0: MSVS 2005 and 2008; MSVS 2010 [...].</p>
<p>[Diagram: compiling a CUDA application with NVCC. Labels: C/C++ CPU code, NVCC, CPU, CUDA, CPU-GPU.]</p>
<p>NVIDIA CUDA C Programming Guide v. 4.0: http://developer.download.nvidia.com/compute/cuda/4_0/toolkit/docs/CUDA_C_Programming_Guide.pdf</p>
<p>CUDA [...]: https://sites.google.com/site/cudacsmsusu/file-cabinet</p>
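The device-information functions from the CUDA API slides (cudaGetDeviceCount, cudaGetDeviceProperties) and the cudaError_t / cudaSuccess convention can be tried out with a short program. This is a minimal sketch, not part of the lecture: the helper name list_devices and the choice of fields to print are assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the lecture): prints one line per device.
// Returns the number of devices, or -1 if a runtime call fails.
int list_devices() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
            return -1;
        }
        // prop.major and prop.minor form the compute capability (major.minor).
        std::printf("Device %d: %s, compute capability %d.%d, %d multiprocessors\n",
                    dev, prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    }
    return count;
}

int main() {
    return list_devices() < 0 ? 1 : 0;
}
```

Compiled with nvcc, the program prints nothing and exits with 0 when no device is present but the runtime still reports success; on some installations a missing driver makes cudaGetDeviceCount return an error instead.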
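The Hello, World! slides combine the index formula, the launch configuration, and the memory functions. The sketch below extends that idea to an element count that is not a multiple of the block size, which needs a rounded-up grid and a bounds check in the kernel. The names fill_index, run_fill_index, and CHECK are hypothetical, and cudaDeviceSynchronize is used as the newer equivalent of the cudaThreadSynchronize call named in the lecture; both choices are assumptions, not the lecture's code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical macro: report a failed runtime call and return false.
// For brevity, an early return does not free device memory.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s: %s\n", #call,                   \
                         cudaGetErrorString(err_));                   \
            return false;                                             \
        }                                                             \
    } while (0)

// Each thread writes its global index; threads with idx >= n do nothing.
__global__ void fill_index(int *output, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) output[idx] = idx;
}

// Hypothetical helper: allocate, launch, copy back, verify. Requires n > 0.
bool run_fill_index(int n) {
    const int threads = 256;                         // Db: threads per block
    const int blocks = (n + threads - 1) / threads;  // Dg: rounded up
    std::vector<int> host(n, -1);
    int *dev = NULL;
    CHECK(cudaMalloc((void **)&dev, n * sizeof(int)));
    fill_index<<<blocks, threads>>>(dev, n);
    CHECK(cudaGetLastError());       // catches an invalid launch configuration
    CHECK(cudaDeviceSynchronize());  // wait for the kernel; surfaces run-time errors
    CHECK(cudaMemcpy(&host[0], dev, n * sizeof(int), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(dev));
    for (int i = 0; i < n; ++i)
        if (host[i] != i) return false;
    return true;
}

int main() {
    bool ok = run_fill_index(1000);  // 1000 is not a multiple of 256
    std::puts(ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

With n = 1000 and 256 threads per block the grid has 4 blocks, so 24 threads in the last block fall outside the array; the `idx < n` test keeps them from writing past the end of the buffer.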
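The __shared__ qualifier and __syncthreads() are described separately in the slides; a small kernel shows why they are used together. In this sketch every block reverses its own segment of an array through a shared-memory tile. The kernel and helper names (reverse_in_block, run_reverse, CHECK) and the sizes are assumptions chosen for illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

const int BLOCK = 256;  // threads per block, also the size of the shared tile

// Hypothetical macro: report a failed runtime call and return false.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s: %s\n", #call,                   \
                         cudaGetErrorString(err_));                   \
            return false;                                             \
        }                                                             \
    } while (0)

// Each block reverses its own BLOCK-element segment. Must be launched
// with exactly BLOCK threads per block.
__global__ void reverse_in_block(const int *in, int *out) {
    __shared__ int tile[BLOCK];          // one copy per block
    int t = threadIdx.x;
    int base = blockIdx.x * blockDim.x;  // first element of this block's segment
    tile[t] = in[base + t];
    __syncthreads();                     // all writes to tile finish before any read
    out[base + t] = tile[BLOCK - 1 - t];
}

// Hypothetical helper: runs the kernel on 4 blocks and verifies the result.
bool run_reverse() {
    const int blocks = 4;
    const int n = blocks * BLOCK;
    std::vector<int> host_in(n), host_out(n, -1);
    for (int i = 0; i < n; ++i) host_in[i] = i;
    int *dev_in = NULL, *dev_out = NULL;
    CHECK(cudaMalloc((void **)&dev_in, n * sizeof(int)));
    CHECK(cudaMalloc((void **)&dev_out, n * sizeof(int)));
    CHECK(cudaMemcpy(dev_in, &host_in[0], n * sizeof(int), cudaMemcpyHostToDevice));
    reverse_in_block<<<blocks, BLOCK>>>(dev_in, dev_out);
    CHECK(cudaGetLastError());
    // cudaMemcpy is synchronous, so it also waits for the kernel to finish.
    CHECK(cudaMemcpy(&host_out[0], dev_out, n * sizeof(int), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(dev_in));
    CHECK(cudaFree(dev_out));
    for (int b = 0; b < blocks; ++b)
        for (int t = 0; t < BLOCK; ++t)
            if (host_out[b * BLOCK + t] != host_in[b * BLOCK + BLOCK - 1 - t])
                return false;
    return true;
}

int main() {
    bool ok = run_reverse();
    std::puts(ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Without the barrier, a thread could read a tile element that another thread of the same block has not written yet. The barrier only covers threads of one block, which is why each block works on its own segment.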