GC 程式屋: [KaaS] NVIDIA announces CUDA 9.2

Monday, May 21, 2018

[KaaS] NVIDIA announces CUDA 9.2

Reference:
https://news.developer.nvidia.com/cuda-9-2-now-available/

Article

CUDA 9.2

CUDA 9.2 includes updates to libraries, a new library for accelerating custom linear-algebra algorithms, and lower kernel launch latency.

With CUDA 9.2, you can:

Speed up recurrent and convolutional neural networks through cuBLAS optimizations
Speed up FFT of prime size matrices through Bluestein kernels in cuFFT
Accelerate custom linear algebra algorithms with CUTLASS 1.0
Launch CUDA kernels up to 2X faster than CUDA 9 with new optimizations to the CUDA runtime
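The cuFFT item above mentions Bluestein kernels for prime sizes. As a rough illustration of the idea only, here is a minimal pure-Python sketch of Bluestein's algorithm, which rewrites a DFT of arbitrary length as a circular convolution that a power-of-two FFT can evaluate. This is not cuFFT code and says nothing about how cuFFT implements it; the function names (`bluestein_dft`, `naive_dft`, `_fft_pow2`) are hypothetical and chosen for this example.

```python
import cmath


def _fft_pow2(x, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = _fft_pow2(x[0::2], invert)
    odd = _fft_pow2(x[1::2], invert)
    sign = 1.0 if invert else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def bluestein_dft(x):
    """DFT of any length n (including primes) via a power-of-two convolution."""
    n = len(x)
    if n == 0:
        return []
    # Chirp w[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to keep angles small.
    w = [cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n)) / n) for k in range(n)]
    # Smallest power of two that holds the full linear convolution.
    m = 1
    while m < 2 * n - 1:
        m *= 2
    # a: chirp-modulated input, zero-padded to length m.
    a = [x[k] * w[k] for k in range(n)] + [0j] * (m - n)
    # b: conjugate chirp, mirrored so negative lags wrap around the end.
    b = [0j] * m
    b[0] = w[0].conjugate()
    for k in range(1, n):
        b[k] = b[m - k] = w[k].conjugate()
    fa = _fft_pow2(a)
    fb = _fft_pow2(b)
    conv = _fft_pow2([p * q for p, q in zip(fa, fb)], invert=True)
    # Divide by m to normalize the inverse FFT, then undo the chirp.
    return [w[k] * conv[k] / m for k in range(n)]


def naive_dft(x):
    """O(n^2) reference DFT used only to check the result."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7]  # length 7 is prime
    fast = bluestein_dft(signal)
    slow = naive_dft(signal)
    print(max(abs(p - q) for p, q in zip(fast, slow)))
```

The point of the rewrite is that a length-7 transform is computed with length-16 FFTs, so prime sizes no longer fall back to an O(n^2) path.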

Additionally, CUDA 9.2 includes bug fixes and supports new operating systems and popular development tools. CUDA 9.2 is freely available for download today!

“Red Hat works closely with NVIDIA to help bring the full power of NVIDIA CUDA to our users. Collaborating with NVIDIA, we’ve paired the new features and performance improvements of CUDA 9.2 with new Red Hat Enterprise Linux versions, giving our expanding community of CUDA developers an easier-to-install, more tightly integrated software stack that helps deliver greater application performance for demanding AI and HPC workloads.”

Chris Wright, Vice President and Chief Technology Officer at Red Hat, Inc.

CUDA - New Features and Beyond

Learn about new features in CUDA including updates to the programming model, computing libraries and development tools.

CUTLASS: CUDA Primitives for Dense Linear Algebra

Learn how to implement high-performance matrix-multiplication (GEMM) using open-source C++ template abstractions.
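To make the GEMM mentioned above concrete, here is a minimal sketch of a tiled matrix multiplication computing alpha * A * B + beta * C. CUTLASS itself is a C++ template library; this plain-Python version does not use or mirror its API, and the name `gemm_tiled` and the `tile` parameter are assumptions made for illustration. It only shows the tile decomposition that blocked GEMM implementations generally build on.

```python
def gemm_tiled(a, b, c, alpha=1.0, beta=0.0, tile=2):
    """Return alpha * (a @ b) + beta * c, accumulating one tile at a time.

    a is m x k, b is k x n, c is m x n, all given as lists of rows.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Start from the scaled C matrix, as in the standard GEMM definition.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    # The outer three loops walk over tiles of the output and of the k dimension.
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # The inner loops multiply one tile of A by one tile of B.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = 0.0
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        out[i][j] += alpha * acc
    return out


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    C = [[1, 1], [1, 1]]
    print(gemm_tiled(A, B, C))
    print(gemm_tiled(A, B, C, alpha=2.0, beta=1.0))
```

On a GPU, each output tile would typically be assigned to a thread block and the tiles of A and B staged through fast on-chip memory; the Python loops only show the order of the arithmetic.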

Multi-GPU Programming Techniques in CUDA

Learn techniques and pitfalls of direct multi-GPU programming in CUDA and a novel method using NVLink to scale programs with minimal effort.

Everything You Need to Know About Unified Memory

Learn fundamental principles, important use cases, performance considerations and optimization ideas using Unified Memory.

CUDA 9

CUDA 9 is the most powerful software platform for GPU-accelerated applications. It has been built for Volta GPUs and includes faster GPU-accelerated libraries, a new programming model for flexible thread management, and improvements to the compiler and developer tools. With CUDA 9 you can speed up your applications while making them more scalable and robust.

Release Highlights

2X - 5X

UP TO 5X FASTER LIBRARIES WITH OPTIMIZATIONS AND HEURISTICS

POWERFUL THREAD MANAGEMENT WITH COOPERATIVE GROUPS

UP TO 1.5X FASTER HPC APPS WITH VOLTA GPUS, NVLINK AND HBM2

Key Features

Libraries

Speed up high performance computing (HPC) and deep learning apps with new GEMM kernels in cuBLAS
Execute image and signal processing apps faster with performance optimizations across multiple GPU configurations in cuFFT and NVIDIA Performance Primitives
Solve linear and graph analytics problems common in HPC with new algorithms in cuSOLVER and nvGRAPH

Cooperative Groups

Express rich parallel algorithms with threads from sub-tiles to warps, blocks and grids
Manage and reuse threads efficiently within an application with new API and function primitives
Replace warp-synchronous programming with robust programming model on Kepler architecture and above

Volta Architecture

Execute AI applications faster with Tensor Cores performing 5X faster than Pascal GPUs
Scale multi-GPU applications with next generation NVLink delivering 2X throughput of prior generation
Increase GPU utilization with Volta Multi-Process Service (MPS)

Development Tools

Optimize and pre-fetch memory access by identifying source code causing page faults in unified memory
Profile NVLink efficiently by adding events to timeline and color coding connections
Inspect unified memory performance bottlenecks with new event filters based on virtual address, migration reason and page fault access type

See Release Notes for details.

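Thoughts and discussion:

This update claims kernel launches up to twice as fast as CUDA 9. However, the DNN library that most deep learning users rely on received no particular update; the main changes are to cuBLAS and cuFFT. Looking forward to TensorFlow supporting CUDA 9.2 soon!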
