Encoding BFLOAT16 Constant to ONNX Fails
Bug Report
Is the issue related to model conversion?
No
Describe the bug
BFLOAT16 constants are encoded incorrectly when creating tensor initialization data via ONNX Python support.
This feature was added in v1.11.0, so you need at least that version to reproduce the problem.
I’ve been able to trace the problem to the following code in helper.py. In make_tensor there is this code:
# float16/bfloat16 are stored as uint16
elif (data_type == TensorProto.FLOAT16
or data_type == TensorProto.BFLOAT16):
vals = np.array(vals).astype(np.float16).view(dtype=np.uint16).flatten().tolist()
Notice that the code converts the values to np.float16 before writing them out; this is wrong for BFLOAT16, which does not share FLOAT16’s encoding (bfloat16 is the upper 16 bits of an IEEE float32: 1 sign bit, 8 exponent bits, 7 mantissa bits, versus float16’s 5 exponent and 10 mantissa bits). The result is that this code writes out data in FLOAT16 format when the type is BFLOAT16.
System information
- OS Platform and Distribution (e.g. Linux Ubuntu 16.04): N/A
- ONNX version (e.g. 1.7): 1.11.0
- Python version: N/A
- GCC/Compiler version (if compiling from source): N/A
- CMake version: N/A
- Protobuf version: N/A
- Visual Studio version (if applicable): N/A
Reproduction instructions
The following Python program reproduces the error when run with ONNX v1.11.0. The program prints the incorrect initialization data and writes the bad graph to disk (as test.onnx).
import onnx
from onnx import helper, TensorProto
type = TensorProto.BFLOAT16
dims = [2,2]
input = helper.make_tensor_value_info('input', type, dims, False)
c0 = helper.make_tensor_value_info('c0', type, dims, False)
initializer = helper.make_tensor(c0.name, type, dims, [1.0, 2.0, 3.0, 4.0])
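# Observed (incorrect) output of print(initializer) below; the
# CORRECT_VALUE annotations show what BFLOAT16 encoding should produce: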
# dims: 2
# dims: 2
# data_type: 16
# int32_data: 15360 | CORRECT_VALUE: 16256 (0x3F80)
# int32_data: 16384 | CORRECT_VALUE: 16384 (0x4000)
# int32_data: 16896 | CORRECT_VALUE: 16448 (0x4040)
# int32_data: 17408 | CORRECT_VALUE: 16512 (0x4080)
# name: "c0"
print(initializer)
output = helper.make_tensor_value_info('output', type, dims)
add = helper.make_node('Add', [input.name, c0.name], [output.name], 'Add')
graph = helper.make_graph([add], 'test', [input], [output], initializer=[initializer])
model = helper.make_model(graph)
onnx.save(model, 'test.onnx')
Expected behavior
BFLOAT16 should be encoded correctly and the program should print:
dims: 2
dims: 2
data_type: 16
int32_data: 16256
int32_data: 16384
int32_data: 16448
int32_data: 16512
name: "c0"
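As a quick sanity check (not part of the original report), these bit patterns decode back to the original values when shifted into the upper half of a float32:

import numpy as np

bits = np.array([16256, 16384, 16448, 16512], dtype=np.uint16)
# Place each 16-bit bfloat16 pattern in the upper half of a float32 word.
print((bits.astype(np.uint32) << 16).view(np.float32))  # [1. 2. 3. 4.]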
Notes
There was no FLOAT16 or BFLOAT16 support prior to v1.11.0. Support for both was added in v1.11.0, but the BFLOAT16 encoding has a bug.
Comments
Back after disappearing last week while I closed the issue on my end that exposed this ONNX issue.
@jcwchen, I think it’s possible to create a BFLOAT16 constant here; I’m optimistic that I will have time this week to verify and code it up. The tricky part will probably be getting the rounding correct.
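A sketch of the usual round-to-nearest-even approach (bias, then truncate); this ignores NaN and near-overflow edge cases and is not presented as the final fix:

import numpy as np

def float_to_bfloat16_bits_rne(vals):
    u = np.asarray(vals, dtype=np.float32).view(np.uint32)
    # Round to nearest even: bias by 0x7FFF plus the LSB of the
    # surviving upper half, then truncate to the high 16 bits.
    bias = ((u >> 16) & 1) + np.uint32(0x7FFF)
    return ((u + bias) >> 16).astype(np.uint16)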
If this drags on, if others hit this issue, or if you have a new release imminent that could take a change here, I suggest perhaps issuing an error on BFLOAT16 and recommending folks use the raw encoding feature to write out BFLOAT16 (that was the workaround I used on my side).
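A minimal sketch of that workaround, assuming a little-endian host (ONNX raw_data is little-endian) and make_tensor’s raw=True path:

import numpy as np
from onnx import helper, TensorProto

# Encode the bfloat16 bit patterns ourselves (truncation shown here)
# and hand make_tensor the raw bytes so it performs no float16 conversion.
vals = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
bf16 = (vals.view(np.uint32) >> 16).astype(np.uint16)
initializer = helper.make_tensor('c0', TensorProto.BFLOAT16, [2, 2],
                                 bf16.tobytes(), raw=True)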
A possible fix has been published here: https://github.com/onnx/onnx/pull/4193