AudioGraph API FrameOutputNode
Hey there!
I was missing something in the examples for the Audio Graph API that is really important for me and I can’t figure out how to do this.
When I connect an AudioFileInputNode with an AudioFrameOutputNode, I am able to read the frame data with this code:
AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Read);
IMemoryBufferReference reference = buffer.CreateReference();
((IMemoryBufferByteAccess)reference).GetBuffer(out byte* dataInBytes, out uint capacityInBytes);
float* dataInFloats = (float*)dataInBytes;
As I learned from this Stack Overflow answer, the bytes that are read into the buffer have to be interpreted as floats. But how do I get the actual byte data of a file that I opened with the graph from these floats?
For example, I read a WAV file with 16000 Hz, 16-bit mono PCM, and the raw bytes of that WAV are shorts:
0 0 | 255 255 | 254 255 | 255 255 | 255 255 | 254 255 | 253 255 | 252 255 ...
But the buffer of AudioFrameOutputNode reads the following bytes:
0 0 0 0 | 0 0 0 184 | 0 0 128 184 | 0 0 0 184 ...
which I assume are floats.
But the question is how do I get the actual raw shorts from these floats?
Is there some type of conversion I can use?
As you can see in the example, the sample 255 255 in shorts seems to map to the sample 0 0 0 184 in floats. What’s the magic behind that?
Top GitHub Comments
The sample does show that it’s an array of floats. If the original format wasn’t an array of floats, then there is naturally a format conversion. The nature of the format conversion depends on the format. In your specific example, the conversion was linear. Other formats can be more complicated (such as u-law). It’s really out of the scope of the documentation to show how to convert every possible format. You need to apply your understanding of the format. I guess the documentation could say “The buffer is in the form of an array of samples. Each sample is a series of IEEE single-precision floating point values in a linear range from -1.0 to +1.0, one value per channel.” It’s then up to you to understand how your format converts to that format.
I have limited experience here, but every audio graph I’ve ever set up seems to pass float samples around regardless of how I configure the encoding properties. Sorry I can’t be of more help here!
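For the 16-bit PCM example in the question, the linear conversion mentioned above can be worked out from the quoted bytes. Read as a little-endian IEEE single, 0 0 0 184 is -2^-15 (that is, -1/32768), and 0 0 128 184 is -2^-14 (-2/32768). Those line up with the shorts 255 255 (-1) and 254 255 (-2), so the float appears to be the short divided by 32768. Below is a minimal sketch of the reverse mapping under that assumption. The class and method names (PcmConversion, FloatToInt16, and so on) are made up for illustration and are not part of the AudioGraph API. It also assumes the graph delivers frames at the file's own sample rate and channel count; if the graph resamples or upmixes, the floats will not map one-to-one back to the file's samples.

```csharp
using System;

// Hypothetical helper, not part of Windows.Media.Audio.
public static class PcmConversion
{
    // 32768 = 2^15. This scale is inferred from the bytes quoted in the question,
    // not taken from the AudioGraph documentation.
    private const double Scale = 32768.0;

    // Converts one normalized float sample (nominally -1.0 to +1.0) to 16-bit PCM.
    public static short FloatToInt16(float sample)
    {
        if (float.IsNaN(sample))
        {
            return 0; // Arbitrary choice for invalid input.
        }

        double scaled = Math.Round(sample * Scale);

        // Clamp: +1.0 would scale to 32768, which does not fit in a short.
        if (scaled > short.MaxValue) return short.MaxValue;
        if (scaled < short.MinValue) return short.MinValue;
        return (short)scaled;
    }

    // Converts one 16-bit PCM sample to a normalized float.
    public static float Int16ToFloat(short sample)
    {
        return (float)(sample / Scale);
    }

    // Converts a whole block of float samples, e.g. values copied out of the frame buffer.
    public static short[] FloatsToInt16(float[] samples)
    {
        var result = new short[samples.Length];
        for (int i = 0; i < samples.Length; i++)
        {
            result[i] = FloatToInt16(samples[i]);
        }
        return result;
    }
}
```

With the question's code, each dataInFloats[i] could be passed through FloatToInt16; the number of floats in the frame would be the number of valid bytes in the buffer divided by sizeof(float). The clamp matters because the float range is symmetric while the 16-bit range is not.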