New DL4J version breaks Keras model importing

Issue Description

I’ve had a few issues with using some Keras models in DL4J after updating to the most recent version.

These are using .h5 files that can be found here

First Issue: Keras models now load with incorrect channel ordering.

Loading a model (e.g., VGG16.h5) previously worked fine (in 1.0.0-beta6), and the model could be used for training and inference:

ComputationGraph kerasModel = KerasModelImport.importKerasModelAndWeights("VGG16.h5");
INDArray testVals = Nd4j.zeros(1, 3, 224, 224);
kerasModel.feedForward(testVals, false);

Now when they are loaded and run (exact same code, but with 1.0.0-beta7), the ordering is incorrect:

Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidInputException: Cannot do forward pass in Convolution layer (layer name = block1_conv1, layer index = 1): input array channels does not match CNN layer configuration (data format = NHWC, data input channels = 224, [minibatch, height, width, channels]=[1, 3, 224, 224]; expected input channels = 3) (layer name: block1_conv1, layer index: 1, layer type: ConvolutionLayer)
Note: Convolution layers can be configured for either NCHW (channels first) or NHWC (channels last) format for input images and activations.
Layers can be configured using .dataFormat(CNN2DFormat.NCHW/NHWC) when constructing the layer, or for the entire net using .setInputType(InputType.convolutional(height, width, depth, CNN2DForman.NCHW/NHWC)).
ImageRecordReader and NativeImageLoader can also be configured to load image data in either NCHW or NHWC format which must match the network
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.validateInputDepth(ConvolutionLayer.java:327)
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.preOutput(ConvolutionLayer.java:357)
	at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:489)
	at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:111)
	at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsDetached(ComputationGraph.java:1976)
	at org.deeplearning4j.nn.graph.ComputationGraph.feedForward(ComputationGraph.java:1581)
	at org.deeplearning4j.nn.graph.ComputationGraph.feedForward(ComputationGraph.java:1524)

I can fix that issue by changing the order (e.g., INDArray testVals = Nd4j.zeros(1, 224, 224, 3);) but this seems like a band-aid fix and probably shouldn’t be necessary; the DL4J version of VGG still works fine with the original channel order.
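
To make the mismatch in the exception concrete, here is a minimal plain-Java sketch of how the channel count is read from a rank-4 shape under each data format. It is an illustration only, not DL4J's validation code; the class `ChannelAxisCheck`, the enum `Format`, and the method `channels` are hypothetical names introduced for this example.

```java
// Illustration only: hypothetical names, not part of DL4J.
public class ChannelAxisCheck {
    enum Format { NCHW, NHWC }

    // Returns the size of the axis that holds channels for the given format.
    static long channels(long[] shape, Format format) {
        if (shape.length != 4) {
            throw new IllegalArgumentException("expected a rank-4 shape");
        }
        // NCHW: [minibatch, channels, height, width]
        // NHWC: [minibatch, height, width, channels]
        return format == Format.NCHW ? shape[1] : shape[3];
    }

    public static void main(String[] args) {
        long[] input = {1, 3, 224, 224};
        System.out.println(channels(input, Format.NCHW)); // prints 3
        System.out.println(channels(input, Format.NHWC)); // prints 224
    }
}
```

Read as NHWC, the array `[1, 3, 224, 224]` appears to have 224 channels, which matches the "data input channels = 224" part of the message above.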

The question is: Has the default channel order changed when importing Keras models? If so, how does one return it to the default from the last release? I couldn’t find anywhere in the docs that mentioned how to set this (KerasModelBuilder doesn’t have a .setInputType() method).

Second Issue: SIGSEGV when running some models

I can fix the above error by changing the channel order on the ImageRecordReader:

reader.setNchw_channels_first(false);

However, running the VGG model then crashes the JVM with a SIGSEGV (see the error log).

Unfortunately, the code snippet above doesn’t reproduce this issue. For some reason, other Keras models loaded in a similar way work fine (e.g., ResNet50.h5).

The log says the problematic frame is C 0x00007f1f180a20b3 - is this an issue in the underlying C code that causes running these Keras models to throw an error?

Version Information

Please indicate relevant versions, including, if relevant:

  • Deeplearning4j version - 1.0.0-beta7
  • Platform information (OS, etc) - Ubuntu 18.04
  • CUDA version, if used
  • NVIDIA driver version, if in use

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
treo commented, May 24, 2020

As the model is fixed in your case, you have to change the data input, i.e. create your image record reader with nchw_channels_first = false.

Or, if you can’t change that, you can permute the channels yourself, just like the image record reader would have done: https://github.com/eclipse/deeplearning4j/blob/master/datavec/datavec-data/datavec-data-image/src/main/java/org/datavec/image/recordreader/BaseImageRecordReader.java#L250

array = array.permute(0,2,3,1);     //NCHW to NHWC

As this is literally the only difference between nchw_channels_first = false and nchw_channels_first = true, the crash you’ve seen shouldn’t be caused by this change.
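
For readers who want to see what the axis reordering does to the data, the following is a plain-Java sketch on a flat array, independent of ND4J. It assumes a dense, row-major NCHW buffer; the class `NchwToNhwc` and the method `toNhwc` are hypothetical names for this example. Unlike `permute`, which as far as I know returns a reordered view, this sketch copies the values into a new buffer.

```java
import java.util.Arrays;

// Illustration only: hypothetical names, not part of ND4J or DataVec.
public class NchwToNhwc {
    // Copies a dense row-major NCHW buffer into a new NHWC buffer.
    static float[] toNhwc(float[] src, int n, int c, int h, int w) {
        if (src.length != n * c * h * w) {
            throw new IllegalArgumentException("buffer length does not match shape");
        }
        float[] dst = new float[src.length];
        for (int b = 0; b < n; b++) {
            for (int ch = 0; ch < c; ch++) {
                for (int y = 0; y < h; y++) {
                    for (int x = 0; x < w; x++) {
                        int from = ((b * c + ch) * h + y) * w + x; // NCHW offset
                        int to = ((b * h + y) * w + x) * c + ch;   // NHWC offset
                        dst[to] = src[from];
                    }
                }
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        // One image, two channels, 1x2 pixels: channel 0 = {1, 2}, channel 1 = {10, 20}.
        float[] nchw = {1f, 2f, 10f, 20f};
        float[] nhwc = toNhwc(nchw, 1, 2, 1, 2);
        System.out.println(Arrays.toString(nhwc)); // prints [1.0, 10.0, 2.0, 20.0]
    }
}
```

After the conversion, the values of each pixel sit next to each other (channels last), which is the layout an NHWC network expects.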

0 reactions
kwatters commented, Mar 31, 2022

I believe I’ve found my answer… NativeImageLoader doesn’t support the flag for all of the asMatrix permutations. I was trying to load a BufferedImage from memory into an NDArray …

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ImageIO.write(buffImg, "png", baos);
    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    NativeImageLoader loader = new NativeImageLoader(64, 64, 1, new ColorConversionTransform(COLOR_BGR2GRAY));
    INDArray image = loader.asMatrix(bais, false);

But reviewing the code of NativeImageLoader, I also see that it calls ndarray.permute on the array to convert the channel order. So I can achieve the old behavior with the NativeImageLoader if I manually call permute after I load the array, as follows:

    NativeImageLoader loader = new NativeImageLoader(64, 64, 1, new ColorConversionTransform(COLOR_BGR2GRAY));
    INDArray image = loader.asMatrix(buffImg);
    image = image.permute(0,2,3,1);     //NCHW to NHWC