Summary of the invention
In order to solve these problems, it is an object of the invention to overcome the above-mentioned deficiencies of the prior art by providing a cloud rendering system, server, and method that allow a virtual machine to directly access a hardware GPU, at low cost and with high rendering performance.
In order to achieve the above object, the technical solution adopted by the present invention is:
A cloud rendering system, comprising a host and a plurality of GPUs, wherein the host is provided with a plurality of virtual machines, and each virtual machine is equipped with a corresponding GPU driver;
The cloud rendering system further comprises: an MMU, coupled to each GPU driver and each GPU, and coupling the virtual machine memory and the host memory. The MMU is configured to, when any virtual machine requests access to a GPU, allocate to the GPU driver of that virtual machine a GPU address used for accessing that GPU; and, when any virtual machine requests access to virtual machine memory, allocate to that virtual machine the corresponding host memory address;
an IOMMU, coupled to each GPU and to the virtual machine memory, configured to, when any GPU requests access to virtual machine memory, allocate to that GPU the corresponding host memory address.
Further, a memory address space is provided, and the host memory is mapped into the memory address space. The memory address space stores the addresses corresponding to the host memory, and each virtual machine accesses its corresponding host memory by accessing addresses in the memory address space.
Further, the IOMMU is also used to map discontinuous memory segments stored in the memory address space into continuous segments, so that the GPU can read and write data using DMA.
Further, a GPU address space is provided, and the GPUs are mapped into the GPU address space. The GPU address space stores the addresses corresponding to the GPU control registers, and each GPU driver accesses its corresponding GPU control registers through the GPU address space.
Further, when any virtual machine starts, the MMU and/or the IOMMU bind that virtual machine to a GPU; while bound, that GPU cannot be bound to any other virtual machine.
The present invention also provides a cloud rendering server, comprising the cloud rendering system of the present invention and a cloud service management platform for monitoring and managing the running state of the system and for managing the users of the system.
The present invention further provides a cloud rendering method, comprising the following steps:
S1: a virtual machine receives a render request, sends the render request to the GPU driver of the virtual machine, receives the rendering data, and writes the rendering data into memory;
S2: the GPU driver of the virtual machine accesses the GPU control registers and writes the render request information into the GPU control registers;
S3: the GPU accesses the corresponding virtual machine memory according to the render request information and processes the rendering data in that memory.
Further, there are a plurality of virtual machines and a plurality of GPUs in one-to-one correspondence; when there are multiple render requests, each virtual machine independently processes its corresponding render request.
Further, writing the rendering data into memory comprises: the MMU couples the virtual machine memory and the host memory; when a virtual machine requests access to virtual machine memory, the MMU allocates the corresponding host memory address to that virtual machine, and the rendering data is stored at that host memory address.
Further, step S2 comprises:
S201: mapping the GPU physical address space segment of the host to a virtual address space segment of the host;
S202: mapping the virtual address space segment into the GPU address space in the virtual machine using a GPA-HVA translation table;
S203: the GPU driver of the virtual machine accesses the GPU address space segment and accesses the corresponding GPU control registers according to the address information in that segment.
Further, step S3 comprises:
S301: the IOMMU couples the virtual machine memory and the GPU, and maps the discontinuous memory address space used by the virtual machine into a continuous address space segment;
S302: the GPU performs DMA reads and writes on the continuous address space segment according to the render request information, obtains the data required for rendering, and completes the rendering.
Compared with the prior art, the beneficial effects of the present invention are:
The cloud rendering system of the present invention, by configuring an MMU and an IOMMU, enables multiple virtual machines to each directly and independently access a GPU. Compared with the Nvidia vGPU architecture commonly adopted in the prior art, the system of the present invention is inexpensive and delivers high rendering performance.
Detailed description of the invention
The present invention is described in further detail below with reference to specific embodiments. This should not be interpreted as limiting the scope of the above subject matter of the present invention to the following examples; all techniques realized based on the content of the present invention belong to the scope of the present invention.
Fig. 1 is a module block diagram of a cloud rendering system of the present invention, comprising a host and a plurality of GPUs. The host is provided with a plurality of virtual machines, and each virtual machine is equipped with a corresponding GPU driver;
The cloud rendering system further comprises: an MMU (Memory Management Unit), coupled to each GPU driver and each GPU, and coupling the virtual machine memory and the host memory. The MMU is configured to, when any virtual machine requests access to a GPU, allocate to the GPU driver of that virtual machine a GPU address used for accessing that GPU; and, when any virtual machine requests access to virtual machine memory, allocate to that virtual machine the corresponding host memory address;
an IOMMU (Input/Output Memory Management Unit), coupled to each GPU and to the virtual machine memory, configured to, when any GPU requests access to virtual machine memory, allocate to that GPU the corresponding host memory address.
In the present invention, the host refers to the physical server entity on which the virtual machines are created; the virtual machines running on the host share the resources of the physical server. In the present invention, the physical server is provided with a plurality of virtual machines, and may therefore be called a host.
Because a virtual machine runs in the host as an ordinary process, address translation is required when the virtual machine accesses physical memory and a physical GPU. The present invention uses MMU technology so that, when a virtual machine accesses memory, its access address is translated by the MMU into the corresponding physical address; thus, when the virtual machine accesses memory or a GPU, the access mode is identical to that of a physical machine. When a GPU processes a render request, it needs to access the rendering data in memory; the IOMMU enables the GPU to access that data directly, so the data processing mode is likewise identical to the standalone processing mode of a physical machine. This is equivalent to the virtual machine performing rendering operations as an independent host, and its rendering efficiency and rendering performance are just as good as those of a real dedicated host.
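The MMU translation described above can be illustrated with a minimal sketch. This is a toy model written for this description, not the actual hardware mechanism; all names (`ToyMMU`, `map_page`, and so on) are hypothetical:

```python
# Toy model of the MMU translation described above: a virtual machine's
# guest pages are backed by host-physical pages, and each memory access
# by the VM is translated through a per-VM page table.
PAGE_SIZE = 4096

class ToyMMU:
    def __init__(self):
        # Per-VM page tables: vm_id -> {guest page number: host page number}
        self.page_tables = {}

    def map_page(self, vm_id, gpn, hpn):
        self.page_tables.setdefault(vm_id, {})[gpn] = hpn

    def translate(self, vm_id, guest_addr):
        # Split the guest address into page number and offset, then look
        # up the backing host-physical page for that VM.
        gpn, offset = divmod(guest_addr, PAGE_SIZE)
        hpn = self.page_tables[vm_id][gpn]
        return hpn * PAGE_SIZE + offset

mmu = ToyMMU()
mmu.map_page("vm1", gpn=0, hpn=42)   # VM1's page 0 lives in host page 42
print(mmu.translate("vm1", 0x10))    # -> 172048 (42 * 4096 + 0x10)
```

The point of the sketch is only that the VM's access address and the host physical address differ, and the MMU bridges the two transparently.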
When cloud rendering is performed according to the solution of the present invention, one host runs multiple virtual machine processes; when there are multiple different rendering tasks, each virtual machine and its corresponding GPU can process these tasks independently and simultaneously. For example, while virtual machine 1 and GPU 1 cooperatively process one rendering task, the host may receive another render request, which virtual machine 2 and GPU 2 then process according to the solution of the present invention, and so on: the multiple virtual machines running on one host and their corresponding GPUs can process multiple render requests at the same time. Compared with prior-art schemes, the rendering capability and rendering efficiency of the present invention are higher.
A memory address space is provided, and the host memory is mapped into the memory address space. The memory address space stores the addresses corresponding to the host memory, and each virtual machine accesses its corresponding host memory by accessing addresses in the memory address space.
In the present invention, address mapping via the memory address space makes it convenient for the MMU to manage memory: when multiple virtual machines read memory at the same time, memory accesses proceed in an orderly fashion according to the corresponding arrangement of the memory address space, improving rendering efficiency.
The IOMMU is also used to map discontinuous memory segments stored in the memory address space into continuous segments, so that the GPU can read and write data using DMA (Direct Memory Access).
Because the memory regions used by a virtual machine are usually discontinuous, the traditional way of reading memory data from such regions is to have the CPU assist in controlling the GPU to read the contents of the corresponding address segments. This requires additional CPU scheduling, which not only wastes resources but also reduces GPU processing efficiency. With IOMMU technology, these discrete memory regions can be mapped into a continuous address space segment, so the GPU can read rendering data directly from memory and render, further improving rendering efficiency.
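The remapping of scattered host pages into one continuous DMA range can be sketched as follows. Again a toy model with illustrative names, not the actual IOMMU interface:

```python
# Toy IOMMU: scattered host page numbers are mapped to a single continuous
# I/O-virtual range, so the GPU can issue one linear DMA over all of them.
PAGE_SIZE = 4096

class ToyIOMMU:
    def __init__(self):
        self.iova_table = {}   # I/O-virtual page number -> host page number

    def map_continuous(self, host_pages, iova_base=0):
        # Assign consecutive I/O-virtual pages to the scattered host pages.
        for i, hpn in enumerate(host_pages):
            self.iova_table[iova_base + i] = hpn
        return iova_base * PAGE_SIZE   # start of the continuous DMA range

    def translate(self, iova):
        ipn, offset = divmod(iova, PAGE_SIZE)
        return self.iova_table[ipn] * PAGE_SIZE + offset

iommu = ToyIOMMU()
base = iommu.map_continuous([7, 3, 19])   # discontinuous host pages
# The GPU now sees one continuous range backed by pages 7, 3, 19:
print([iommu.translate(base + i * PAGE_SIZE) // PAGE_SIZE for i in range(3)])
# -> [7, 3, 19]
```

The CPU is involved only in setting up `iova_table`; the translation during the DMA itself happens in the IOMMU, which is the efficiency gain the paragraph describes.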
A GPU address space is provided, and the GPUs are mapped into the GPU address space. The GPU address space stores the addresses corresponding to the GPU control registers, and each GPU driver accesses its corresponding GPU control registers through the GPU address space.
Specifically, Fig. 2 is an internal module block diagram of the cloud rendering system of one specific embodiment of the present invention. Virtual machine V1 has the same architecture as virtual machine VN. When accessing memory, the virtual machine memory first accesses the memory address space located in the address space; the memory address space provides the host memory corresponding to the virtual machine memory address, and the data read/write operation is then performed. When accessing a GPU, the GPU driver of the virtual machine first obtains, through the GPU address space, the GPU control register addresses mapped into the GPU address space, and then controls the GPU to render via the GPU control registers.
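How a GPU driver reaches its control registers through the GPU address space can be sketched as below. The class and register layout are invented purely for illustration:

```python
# Toy model: the GPU address space records, per GPU, where that GPU's
# control registers are mapped; the VM's GPU driver reads and writes
# through that mapping rather than through raw physical addresses.
class ToyGPUAddressSpace:
    def __init__(self):
        self.reg_base = {}    # gpu_id -> base address of its control registers
        self.registers = {}   # flat register file keyed by absolute address

    def map_gpu(self, gpu_id, base):
        self.reg_base[gpu_id] = base

    def write_reg(self, gpu_id, reg_offset, value):
        # The driver addresses a register relative to its GPU's base address.
        self.registers[self.reg_base[gpu_id] + reg_offset] = value

    def read_reg(self, gpu_id, reg_offset):
        return self.registers[self.reg_base[gpu_id] + reg_offset]

space = ToyGPUAddressSpace()
space.map_gpu("gpu1", base=0xF000)
space.write_reg("gpu1", 0x04, 0xBEEF)     # driver writes render-request info
print(hex(space.read_reg("gpu1", 0x04)))  # -> 0xbeef
```

Each GPU gets its own base address, which is what lets several drivers address "their" registers through one shared address space without colliding.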
When any virtual machine starts, the MMU and/or the IOMMU bind that virtual machine to a GPU; while bound, that GPU cannot be bound to any other virtual machine.
As mentioned above, a virtual machine runs in the host as an ordinary process. When a virtual machine starts, a GPU not used by any other virtual machine is allocated to that virtual machine and bound to it, and the address mapping relationship between the virtual machine and the GPU is established at binding time. When multiple virtual machines start and run at the same moment, each virtual machine is independently bound to one GPU, the memory of each virtual machine is mapped to a different region of the address space, and each GPU is likewise mapped into the GPU address space. Therefore, during subsequent information exchange, each virtual machine can process render requests simultaneously and independently, transferring and storing information directly according to its own mapping relationships.
The present invention performs the binding operation when the virtual machine starts. In practical applications, binding may instead be performed when the virtual machine is created, or actively, or passively upon receiving a certain command; the binding time may be determined according to the actual situation, or the binding may be established at other times.
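The exclusive bind-and-release discipline described above can be sketched as a small allocator. This is an illustrative model only, with hypothetical names:

```python
# Toy binder: at VM start, a free GPU is bound exclusively to that VM and
# cannot be bound again until it is released (VM destroyed / GPU freed).
class GPUBinder:
    def __init__(self, gpus):
        self.free = set(gpus)
        self.bound = {}        # vm_id -> gpu_id

    def bind(self, vm_id):
        if not self.free:
            raise RuntimeError("no free GPU")
        gpu = self.free.pop()  # take any GPU not bound to another VM
        self.bound[vm_id] = gpu
        return gpu

    def release(self, vm_id):
        # Only after release can the GPU be bound to another VM.
        self.free.add(self.bound.pop(vm_id))

binder = GPUBinder(["gpu1", "gpu2"])
g1 = binder.bind("vm1")
g2 = binder.bind("vm2")
print(g1 != g2)                    # each VM gets its own GPU -> True
binder.release("vm1")
print(binder.bind("vm3") == g1)    # the released GPU is rebindable -> True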
The present invention also provides a cloud rendering server, comprising the cloud rendering system of the present invention and a cloud service management platform for monitoring and managing the running state of the system and for managing the users of the system.
In a specific embodiment of the present invention, the cloud service management platform binds a virtual machine to a GPU when the virtual machine is created, and the MMU and IOMMU are also configured with the corresponding mapping relationships, so that the virtual machine can directly access physical memory and physical devices. After a virtual machine is bound to a GPU, that GPU is used exclusively by the virtual machine bound to it; only when the virtual machine is destroyed or the GPU resource is released can the GPU be bound again.
The cloud service management platform uses an existing platform system, such as OpenStack, Amazon Web Services, or Alibaba Cloud, which is not described further here.
Fig. 3 is a flow chart of a cloud rendering method of the present invention, comprising the following steps:
S1: a virtual machine receives a render request, sends the render request to the GPU driver of the virtual machine, receives the rendering data, and writes the rendering data into memory;
When processing information, a virtual machine follows the same processing flow as a physical machine. Each virtual machine is provided with a GPU driver for driving the GPU to work. Writing the rendering data into memory is in fact accomplished through the memory address space mapping performed by the MMU: the data is written into the physical memory of the host, and all resources used by the virtual machine are host physical resources accessed through the MMU and IOMMU mappings.
S2: the GPU driver of the virtual machine accesses the GPU control registers and writes the render request information into the GPU control registers;
S3: the GPU accesses the corresponding virtual machine memory according to the render request information and processes the rendering data in that memory.
In a specific embodiment, there are a plurality of virtual machines and a plurality of GPUs in one-to-one correspondence; when there are multiple render requests, each virtual machine independently processes its corresponding render request.
Writing the rendering data into memory comprises: the MMU couples the virtual machine memory and the host memory; when a virtual machine requests access to virtual machine memory, the MMU allocates the corresponding host memory address to that virtual machine, and the rendering data is stored at that host memory address.
In a specific embodiment, this coupling relationship of the MMU is established when the virtual machine is created. When the rendering data is written to memory, the MMU directly translates the host virtual address (HVA, Host Virtual Address) into the host physical address (HPA), so that the rendering data is written into host memory.
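The HVA-to-HPA write path just described can be sketched as follows, as a toy page-granular model with hypothetical names:

```python
# Toy write path: the VM writes rendering data at a host-virtual address;
# the MMU's HVA -> HPA table decides where in host physical memory it lands.
PAGE_SIZE = 4096

class ToyHostMemory:
    def __init__(self, hva_to_hpa):
        self.hva_to_hpa = hva_to_hpa   # page-granular HVA -> HPA table
        self.physical = {}             # sparse host physical memory

    def write(self, hva, data):
        vpn, offset = divmod(hva, PAGE_SIZE)
        hpa = self.hva_to_hpa[vpn] * PAGE_SIZE + offset
        self.physical[hpa] = data      # the data lands in host memory
        return hpa

mem = ToyHostMemory({5: 9})            # HVA page 5 is backed by host page 9
hpa = mem.write(5 * PAGE_SIZE + 8, b"render-data")
print(hpa == 9 * PAGE_SIZE + 8)        # -> True
```

Because the table is installed at VM creation time, the translation on each write needs no further setup, which matches the embodiment above.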
Step S2 comprises:
S201: mapping the GPU physical address space segment of the host to a virtual address space segment of the host;
S202: mapping the virtual address space segment into the GPU address space in the virtual machine using a GPA-HVA translation table;
S203: the GPU driver of the virtual machine accesses the GPU address space segment and accesses the corresponding GPU control registers according to the address information in that segment.
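Steps S201-S203 chain two mappings: host GPU physical addresses into host virtual addresses, and then, via the GPA-HVA translation table, into the guest's GPU address space. A toy sketch, with all addresses and table names invented for illustration:

```python
# Toy sketch of S201-S203: the GPU register region is mapped from host
# physical to host virtual addresses (S201), then exposed in the guest's
# GPU address space through the GPA<->HVA translation table (S202), so the
# guest driver can reach the control registers via a guest address (S203).
hpa_to_hva = {0xD000_0000: 0x7F00_0000}   # S201: host phys -> host virt
gpa_to_hva = {0x4000_0000: 0x7F00_0000}   # the hypervisor's GPA-HVA table
hva_to_gpa = {v: k for k, v in gpa_to_hva.items()}   # S202: invert the table

def guest_register_address(gpu_reg_hpa):
    # Follow both mappings to get the guest address the driver uses (S203).
    return hva_to_gpa[hpa_to_hva[gpu_reg_hpa]]

print(hex(guest_register_address(0xD000_0000)))   # -> 0x40000000
```

A write by the guest driver at `0x40000000` thus reaches the register region at host physical `0xD0000000` without the guest ever handling host addresses directly.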
Step S3 comprises:
S301: the IOMMU couples the virtual machine memory and the GPU, and maps the discontinuous memory address space used by the virtual machine into a continuous address space segment;
S302: the GPU performs DMA reads and writes on the continuous address space segment according to the render request information, obtains the data required for rendering, and completes the rendering.
When performing DMA, the GPU accesses a continuous address bus, but the host physical addresses (HPA) corresponding to continuous guest physical addresses (GPA, Guest Physical Address) in the virtual machine are in fact not continuous. The IOMMU must therefore map the continuous address bus addresses used by the GPU to the corresponding HPAs, after which the DMA is performed.
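The situation can be illustrated with a final toy sketch: contiguous GPA pages backed by scattered HPA pages, gathered by a single linear DMA through the IOMMU translation. Page numbers and names are illustrative:

```python
# Toy DMA: pages that are continuous in guest-physical address space are
# scattered in host physical memory; the IOMMU presents them to the GPU as
# one linear range, so a single sequential DMA gathers the right host data.
host_memory = {hpn: f"host-page-{hpn}" for hpn in range(32)}

# Contiguous GPA pages 0..2 are backed by scattered host pages:
gpa_to_hpa = [11, 4, 27]

def dma_read(n_pages):
    # The IOMMU translates each linear DMA page to its backing host page.
    return [host_memory[gpa_to_hpa[i]] for i in range(n_pages)]

print(dma_read(3))   # -> ['host-page-11', 'host-page-4', 'host-page-27']
```

Without this translation the GPU's linear read would fetch host pages 0..2, which hold unrelated data; with it, the scattered rendering data arrives in order.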