rCUDA

From Wikipedia, the free encyclopedia
{{Short description|Type of middleware software framework for remote GPU virtualization}}
{{Infobox software
| name = rCUDA
| screenshot =
| caption =
| developer = [[Universitat Politecnica de Valencia]]
| latest release version = 20.07
| latest release date = {{Start date and age|2020|07|26}}
| operating system = [[Linux]]
| genre = [[GPGPU]]
| website = [http://www.rcuda.net www.rcuda.net]
}}


'''rCUDA''', which stands for '''Remote CUDA''', is a type of [[middleware]] software framework for remote [[GPU]] virtualization. Fully compatible with the [[CUDA]] application programming interface ([[API]]), it allows one or more CUDA-enabled GPUs to be allocated to a single application. Each GPU can be part of a [[Cluster (computing)|cluster]] or run inside a [[virtual machine]]. The approach targets GPU clusters whose accelerators are not fully utilized: virtualizing the GPUs reduces the number of GPUs a cluster needs, which in turn lowers energy, acquisition, and maintenance costs.


The recommended distributed acceleration architecture is a [[Supercomputer|high performance computing]] cluster with GPUs attached to only a few of the cluster nodes. When a node without a local [[GPU]] executes an application needing GPU resources, remote execution of the [[GPGPU#Kernels|kernel]] is supported by data and code transfers between local system memory and remote GPU memory. rCUDA is designed to accommodate this [[client-server]] architecture: on one end, clients employ a library of wrappers to the high-level CUDA Runtime API; on the other end, a network service listens for requests on a [[TCP port]]. Several nodes running different GPU-accelerated applications can concurrently make use of the whole set of accelerators installed in the cluster. The client forwards each request to one of the servers, which accesses the GPU installed in that computer and executes the request on it. [[Time-multiplexing]] the GPU, or in other words ''sharing'' it, is accomplished by spawning a different server process for each remote GPU execution request.<ref>{{cite journal |author1=J. Prades |author2=F. Silla | title = GPU-Job Migration: the rCUDA Case | location = Transactions on Parallel and Distributed Systems, vol. 30, no. 12 | date = December 2019}}</ref><ref>{{cite journal |author1=J. Prades |author2=C. Reaño |author3=F. Silla | title = On the Effect of using rCUDA to Provide CUDA Acceleration to Xen Virtual Machines | location = Cluster Computing, vol. 22, no. 1 | date = March 2019 }}</ref><ref>{{cite journal |author1=F. Silla |author2=S. Iserte |author3=C. Reaño |author4=J. Prades | title = On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case | location = Concurrency and Computation: Practice and Experience, vol. 29, no. 13 | date = July 2017}}</ref><ref>{{cite journal |author1=J. Prades |author2=B. Varghese |author3=C. Reaño |author4=F. Silla | title = Multi-Tenant Virtual GPUs for Optimising Performance of a Financial Risk Application | location = Journal of Parallel and Distributed Computing, vol. 108 | date = October 2017 |arxiv=1606.04473 }}</ref><ref>{{cite journal |author1=F. Pérez |author2=C. Reaño |author3=F. Silla | title = Providing CUDA Acceleration to KVM Virtual Machines in InfiniBand Clusters with rCUDA | location = 16th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS 2016), Heraklion, Crete, Greece | date = June 6–9, 2016}}</ref><ref>{{cite journal |author1=S. Iserte |author2=J. Prades |author3=C. Reaño |author4=F. Silla | title = Increasing the Performance of Data Centers by Combining Remote GPU Virtualization with Slurm | location = 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID 2016), Cartagena, Colombia | date = May 16–19, 2016}}</ref>
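The request-forwarding path described above can be sketched as a toy client/server. This is an illustrative, language-agnostic sketch only: the function names (`remote_call`, `handle_request`), the JSON framing, and the simulated reply are invented for the example and are not the actual rCUDA wire protocol or API.

```python
# Toy sketch of forwarding an API call to a remote server, in the spirit
# of the rCUDA client/server split. All names and the framing are invented.
import json
import socket
import threading

def handle_request(conn):
    """Server side: decode one forwarded call and answer it.
    A real server would invoke the native CUDA runtime here; this
    sketch just acknowledges the call."""
    with conn:
        request = json.loads(conn.makefile("r").readline())
        reply = {"call": request["call"], "status": "ok"}
        conn.sendall((json.dumps(reply) + "\n").encode())

def serve_once(sock):
    """Accept one connection and hand it to its own thread, mimicking
    the spawn-a-server-process-per-request time-multiplexing idea."""
    conn, _ = sock.accept()
    threading.Thread(target=handle_request, args=(conn,)).start()

def remote_call(port, call, **args):
    """Client side: the wrapper an application would hit instead of the
    local runtime. It serializes the call and waits for the reply."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall((json.dumps({"call": call, "args": args}) + "\n").encode())
        return json.loads(s.makefile("r").readline())

if __name__ == "__main__":
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    threading.Thread(target=serve_once, args=(listener,), daemon=True).start()
    result = remote_call(listener.getsockname()[1], "cudaMalloc", size=1 << 20)
    print(result["status"])
```

The point of the sketch is only the shape of the interaction: the caller never touches a GPU directly, and the server is free to multiplex many such callers over one device.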


==rCUDA v20.07==
[[File:RCUDA.gif|thumb|400px|right|'''Figure 1''']]


The rCUDA middleware enables the concurrent usage of CUDA-compatible devices remotely.


rCUDA employs either the InfiniBand network or the socket API for the communication between clients and servers. rCUDA can be useful in three different environments:



* Clusters. To reduce the number of GPUs installed in High Performance Clusters. This leads to energy savings, as well as other related savings like acquisition costs, maintenance, space, cooling, etc.
* Academia. In commodity networks, to offer access to a few high performance GPUs concurrently to many students.
* Virtual Machines. To enable the access to the CUDA facilities on the physical machine.


The current version of rCUDA (v20.07) supports CUDA version 9.0, excluding graphics interoperability. rCUDA v20.07 targets the Linux OS (for 64-bit architectures) on both client and server sides.


CUDA applications do not need any change in their source code in order to be executed with rCUDA.
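The reason no source changes are needed is that the application is built against the standard CUDA Runtime entry points, and the rCUDA client library exposes the same signatures, so which runtime answers a call is a library-path decision rather than a code change. The sketch below imitates that substitution with stand-in classes; `NativeRuntime`, `RemoteRuntime`, and `cuda_malloc` are invented names for illustration, not rCUDA code.

```python
# Sketch of the wrapper-library idea behind "no source changes":
# the application sees one CUDA-style API; swapping the backend is
# invisible to it. Class and method names are invented stand-ins.

class NativeRuntime:
    """Stand-in for the vendor runtime: the call executes locally."""
    def cuda_malloc(self, size):
        return {"where": "local", "size": size}

class RemoteRuntime:
    """Stand-in for the rCUDA wrapper library: same signature, but each
    call would be forwarded to a remote GPU server over the network."""
    def cuda_malloc(self, size):
        return {"where": "remote", "size": size}

def application(runtime):
    # Unmodified "application" code: it only sees the CUDA-style API
    # and cannot tell which backend serviced the allocation.
    buf = runtime.cuda_malloc(1 << 20)
    return buf["where"]
```

In the real system the substitution happens at link/load time (pointing the dynamic linker at the rCUDA library instead of the native one), which is why recompilation of the application logic is unnecessary.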


==References==
{{Reflist}}


==External links==
*[http://www.nvidia.com/object/cuda_home_new.html Nvidia CUDA Official site]
*[http://www.hpca.uji.es/rCUDA rCUDA Co-Official site]


{{lowercase}}


[[Category:Software]]
[[Category:Software frameworks]]
[[Category:GPGPU]]
[[Category:Computer libraries]]
[[Category:Middleware]]
[[Category:System software]]
[[Category:Cloud platforms]]
[[Category:Cloud computing]]
[[Category:Distributed computing architecture]]
[[Category:Distributed computing]]
[[Category:Parallel computing]]

Latest revision as of 12:33, 1 June 2024
