New DL4J version breaks Keras model importing

Issue Description

I’ve had a few issues with using some Keras models in DL4J after updating to the most recent version.

These are using .h5 files that can be found here

First Issue: Keras models now load with incorrect channel ordering.

Loading a model (e.g., VGG16.h5) worked fine in 1.0.0-beta6, and the model could be used for training and inference:

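// Import the Keras model (architecture and weights) as a ComputationGraph
ComputationGraph kerasModel = KerasModelImport.importKerasModelAndWeights("VGG16.h5");
// Dummy input in NCHW order: [minibatch, channels, height, width]
INDArray testVals = Nd4j.zeros(1, 3, 224, 224);
kerasModel.feedForward(testVals, false);   // false = inference mode, not training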

Now when they are loaded and run (exact same code, but with 1.0.0-beta7), the ordering is incorrect:

Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidInputException: Cannot do forward pass in Convolution layer (layer name = block1_conv1, layer index = 1): input array channels does not match CNN layer configuration (data format = NHWC, data input channels = 224, [minibatch, height, width, channels]=[1, 3, 224, 224]; expected input channels = 3) (layer name: block1_conv1, layer index: 1, layer type: ConvolutionLayer)
Note: Convolution layers can be configured for either NCHW (channels first) or NHWC (channels last) format for input images and activations.
Layers can be configured using .dataFormat(CNN2DFormat.NCHW/NHWC) when constructing the layer, or for the entire net using .setInputType(InputType.convolutional(height, width, depth, CNN2DForman.NCHW/NHWC)).
ImageRecordReader and NativeImageLoader can also be configured to load image data in either NCHW or NHWC format which must match the network
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.validateInputDepth(ConvolutionLayer.java:327)
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.preOutput(ConvolutionLayer.java:357)
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:489)
	at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:111)
	at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsDetached(ComputationGraph.java:1976)
	at org.deeplearning4j.nn.graph.ComputationGraph.feedForward(ComputationGraph.java:1581)
	at org.deeplearning4j.nn.graph.ComputationGraph.feedForward(ComputationGraph.java:1524)

I can fix that issue by changing the order (e.g., INDArray testVals = Nd4j.zeros(1, 224, 224, 3);) but this seems like a band-aid fix and probably shouldn’t be necessary; the DL4J version of VGG still works fine with the original channel order.
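The "data input channels = 224" in the exception follows from which axis of the 4D shape is read as the channel axis. The following is a minimal sketch in plain Java, not DL4J code; the class ChannelAxisDemo, the Format enum and the channels method are hypothetical names used only for illustration.

```java
public class ChannelAxisDemo {
    // Hypothetical stand-in for the two layouts named in the error message.
    enum Format { NCHW, NHWC }

    // Returns the size of the axis that would be treated as "channels"
    // for a 4D activation shape under the given layout.
    static long channels(long[] shape, Format format) {
        if (shape.length != 4) {
            throw new IllegalArgumentException("expected a 4D shape");
        }
        // NCHW: [minibatch, channels, height, width] -> axis 1
        // NHWC: [minibatch, height, width, channels] -> axis 3
        return format == Format.NCHW ? shape[1] : shape[3];
    }

    public static void main(String[] args) {
        long[] shape = {1, 3, 224, 224};
        // Read as NCHW, the input has 3 channels, which is what the layer expects.
        System.out.println("NCHW channels = " + channels(shape, Format.NCHW));
        // Read as NHWC, the last axis (224) is taken as the channel count.
        System.out.println("NHWC channels = " + channels(shape, Format.NHWC));
    }
}
```

Under this reading, the same array [1, 3, 224, 224] is valid for a layer configured as NCHW and invalid for one configured as NHWC, which matches the message above.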

The question is: Has the default channel order changed when importing Keras models? If so, how does one return it to the default from the last release? I couldn’t find anywhere in the docs that mentioned how to set this (KerasModelBuilder doesn’t have a .setInputType() method).

Second Issue: SIGSEGV when running some models

I can fix the above error by changing the channel order in the ImageRecordReader:

reader.setNchw_channels_first(false);

However, running the VGG model causes a SIGSEGV to crash the JVM - error log.

Unfortunately, the code snippet above doesn’t reproduce this issue. For some reason, other Keras models loaded in a similar way work fine (e.g. ResNet50.h5).

The log says the problematic frame is C 0x00007f1f180a20b3 - is this an issue in the underlying C code that causes running these Keras models to throw an error?

Version Information

Please indicate relevant versions, including, if relevant:

Deeplearning4j version - 1.0.0-beta7
Platform information (OS, etc) - Ubuntu 18.04
CUDA version, if used
NVIDIA driver version, if in use

Top GitHub Comments

treo commented, May 24, 2020

As the model is fixed in your case, you have to change the data input, i.e. create your image record reader with nchw_channels_first = false.

Or, if you can’t change that, you can permute the channels yourself, just like the image record reader would have done: https://github.com/eclipse/deeplearning4j/blob/master/datavec/datavec-data/datavec-data-image/src/main/java/org/datavec/image/recordreader/BaseImageRecordReader.java#L250

array = array.permute(0,2,3,1);     //NCHW to NHWC

As this is literally the only difference between nchw_channels_first = false and nchw_channels_first = true, the crash you’ve seen shouldn’t be caused by this change.
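To make the index mapping behind permute(0,2,3,1) concrete, here is a minimal sketch using plain Java arrays rather than ND4J. The class PermuteDemo and the method nchwToNhwc are hypothetical names for illustration; the sketch assumes a non-empty, rectangular input and copies the data, whereas a permute on an INDArray would typically return a reordered view.

```java
public class PermuteDemo {
    // Copies an NCHW array in[n][c][h][w] into NHWC layout out[n][h][w][c].
    // Output axes (0,1,2,3) take input axes (0,2,3,1), as in permute(0,2,3,1).
    static float[][][][] nchwToNhwc(float[][][][] in) {
        int n = in.length;
        int c = in[0].length;
        int h = in[0][0].length;
        int w = in[0][0][0].length;
        float[][][][] out = new float[n][h][w][c];
        for (int i = 0; i < n; i++) {
            for (int ch = 0; ch < c; ch++) {
                for (int y = 0; y < h; y++) {
                    for (int x = 0; x < w; x++) {
                        out[i][y][x][ch] = in[i][ch][y][x];
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A tiny 1x3x2x2 "image": 1 example, 3 channels, 2x2 pixels.
        float[][][][] nchw = new float[1][3][2][2];
        nchw[0][1][0][1] = 5f; // channel 1, row 0, column 1

        float[][][][] nhwc = nchwToNhwc(nchw);
        // The same value is now addressed as row 0, column 1, channel 1.
        System.out.println(nhwc[0][0][1][1]);
    }
}
```

Only the order of the axes changes; every value keeps its (example, channel, row, column) identity.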

kwatters commented, Mar 31, 2022

I believe I’ve found my answer… NativeImageLoader doesn’t support the flag for all of the asMatrix permutations. I was trying to load a BufferedImage from memory into an NDArray …

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write(buffImg, "png", baos);
    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    NativeImageLoader loader = new NativeImageLoader(64, 64, 1, new ColorConversionTransform(COLOR_BGR2GRAY));
    INDArray image = loader.asMatrix(bais, false);

But after reviewing the code of NativeImageLoader, I see that it calls ndarray.permute to convert the channel order. So I can achieve the old behavior with NativeImageLoader if I manually call permute after loading the array, as follows:

    NativeImageLoader loader = new NativeImageLoader(64, 64, 1, new ColorConversionTransform(COLOR_BGR2GRAY));
    INDArray image = loader.asMatrix(buffImg);
    image = image.permute(0,2,3,1);     //NCHW to NHWC
