
ANALYSIS OF CUDA EFFICIENCY IN SOLVING LINEAR TRIDIAGONAL SYSTEMS FOR THEORETICAL OPTION PRICING

Abstract:

A parallel cyclic reduction method for solving linear tridiagonal systems is implemented on a GPU. Forming the matrix directly in GPU global memory is shown to be advantageous. The approach yields a more than 20-fold speedup over a single-threaded calculation. When data transfer between RAM and the GPU is taken into account, a 5- to 8-fold speedup is attained using mapped memory.
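
For readers unfamiliar with the method named in the abstract, here is a minimal sequential Python sketch of parallel cyclic reduction. It is not the authors' GPU implementation. The function name `pcr_solve` and the array layout are assumptions made for illustration: sub-diagonal `a`, diagonal `b`, super-diagonal `c`, right-hand side `d`, with `a[0] = c[n-1] = 0`. The sketch also assumes no pivot becomes zero, as holds for diagonally dominant systems. The inner loop over `i` is the part that a GPU would plausibly run with one thread per equation.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system by parallel cyclic reduction (illustrative sketch).

    Equation i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[n-1] == 0 (assumed layout, not taken from the paper).
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Each iteration of this loop is independent of the others,
        # so on a GPU every i could be handled by its own thread.
        for i in range(n):
            lo, hi = i - stride, i + stride
            nb[i], nd[i] = b[i], d[i]
            if lo >= 0:
                # Eliminate x[i - stride] using equation i - stride.
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                # Eliminate x[i + stride] using equation i + stride.
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2  # the coupling distance doubles at every step
    # After about log2(n) steps every equation involves only its own unknown.
    return [d[i] / b[i] for i in range(n)]
```

Each pass doubles the distance between coupled unknowns, so about log2(n) passes are needed, and all n equations are updated independently within a pass. That independence is what makes the method a natural candidate for GPU execution.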

Keywords:
