Too many times creating ComputeBuffer

Hi @harujoh

I found that allocating a new ComputeBuffer takes a lot of CPU time (about 70% of the running time). I changed the weight variable so that it is copied only when GpuEnable changes. This makes the GPU calculation 34x faster than before, and 272x faster than the CPU calculation.

So I think we should allocate GPU memory only when GpuEnable changes.
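
A minimal sketch of this idea, not KelpNet's actual code: the device buffer is created once in the GpuEnable setter and reused by every forward call. `FakeDeviceBuffer`, `CachedLinear`, and the allocation counter are hypothetical stand-ins so the example runs without an OpenCL device; a real version would create a ComputeBuffer where the fake buffer is constructed.

```csharp
using System;

// Hypothetical stand-in for a GPU buffer; counts allocations so the effect is visible.
public sealed class FakeDeviceBuffer : IDisposable
{
    public static int AllocationCount;
    public float[] Storage { get; }

    public FakeDeviceBuffer(float[] host)
    {
        AllocationCount++;
        Storage = (float[])host.Clone(); // simulates the host-to-device copy
    }

    public void Dispose() { }
}

// Hypothetical layer that allocates its weight buffer only when GpuEnable changes.
public sealed class CachedLinear
{
    private readonly float[] weight;
    private FakeDeviceBuffer gpuWeight;
    private bool gpuEnable;

    public CachedLinear(float[] weight) { this.weight = weight; }

    public bool GpuEnable
    {
        get { return gpuEnable; }
        set
        {
            if (value == gpuEnable) return; // no change, so no reallocation
            gpuEnable = value;
            if (gpuWeight != null) { gpuWeight.Dispose(); gpuWeight = null; }
            if (value) gpuWeight = new FakeDeviceBuffer(weight);
        }
    }

    // Dot product of the input with the weights; the GPU path reuses the cached buffer.
    public float Forward(float[] x)
    {
        float[] w = gpuEnable ? gpuWeight.Storage : weight;
        float sum = 0f;
        for (int i = 0; i < x.Length; i++) sum += x[i] * w[i];
        return sum;
    }
}
```

With this shape, calling `Forward` any number of times after `GpuEnable = true` leaves `FakeDeviceBuffer.AllocationCount` at 1, whereas allocating inside `Forward` would add one allocation per call.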

I added a virtual OnGpuEnable() to the Function class, but so far I have changed only the GPU forward code of the Linear layer. Is there an easy way to change the code so that GPU memory is allocated when GpuEnable changes?

My current idea is to add OnGpuEnable() to NdArray and copy the values depending on which device is in use. Internal code would then use Weight.GpuData or Weight.GpuGrad to access the GPU data. For public code, Weight.Data would be redirected to Weight.GpuData when GpuEnable is set.
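
One way this proposal could look, as a rough sketch only: `SketchNdArray` is a hypothetical class, not KelpNet's NdArray, and a plain `float[]` stands in for the device-side buffer so the example runs anywhere. Only Data and GpuData are shown; GpuGrad would presumably follow the same pattern.

```csharp
using System;

// Hypothetical sketch of the proposed NdArray; not KelpNet's actual class.
public class SketchNdArray
{
    private readonly float[] cpuData;
    private float[] gpuData; // stands in for a device-side buffer
    private bool gpuEnable;

    public SketchNdArray(float[] data) { cpuData = data; }

    public bool GpuEnable
    {
        get { return gpuEnable; }
        set
        {
            if (value == gpuEnable) return; // copy only when the flag actually changes
            gpuEnable = value;
            OnGpuEnable();
        }
    }

    // Called only when GpuEnable changes: copy toward whichever device is now active.
    protected virtual void OnGpuEnable()
    {
        if (gpuEnable)
        {
            gpuData = (float[])cpuData.Clone();            // host -> device
        }
        else
        {
            Array.Copy(gpuData, cpuData, cpuData.Length);  // device -> host
            gpuData = null;
        }
    }

    // GPU code would read and write this directly.
    public float[] GpuData { get { return gpuData; } }

    // Public accessor: redirected to the GPU copy while GpuEnable is true.
    public float[] Data { get { return gpuEnable ? gpuData : cpuData; } }
}
```

Callers that only touch `Data` need no special rules in this sketch: writes made while GpuEnable is true land in the GPU copy and are copied back to the host array when GpuEnable is turned off.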

Apologies in advance for my poor English ;-;

Issue Analytics

State:
Created 6 years ago
Comments: 8 (5 by maintainers)

Top GitHub Comments

harujoh commented, Dec 6, 2017

Sorry for the late reply.

I confirmed that the proposed method is very effective.

However, handling this data would require extra steps that follow special rules, which is not desirable for programming beginners.

I have been looking for a better way to incorporate it into master, but I have not settled on a concrete approach yet.

Unfortunately, I will not have much time for programming for a while. So, as a first step, I would like to adopt the idea of consolidating ComputeBuffer creation.

harujoh commented, Aug 2, 2019

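I implemented the method you proposed, but unfortunately I could not get the performance I had hoped for. https://github.com/harujoh/KelpNet/tree/TryGenNdArrayNativeBase

After investigating, I found that the main cause of the problem you pointed out was the use of delegates; removing them resolved the slowdown.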
