The meaning of `isTrainable` property of a layer
I am a bit confused about the meaning of the isTrainable property of layers, so I decided to create this issue to ask for clarification.
In Keras, each layer has a boolean trainable attribute which indicates whether the weights of the layer (if it has any) should be updated during training (i.e. whether the layer is frozen during training). It is always set to True upon creation of the layer (even if the layer has no weights), unless the user explicitly sets it to False. Now, if I am correct, its equivalent in KotlinDL is the isTrainable property. From its docs (although there is a typo there; the second sentence should read "… and could not be changed …"):

> True, if layer's weights could be changed during training. If false, layer's weights are frozen and could be changed during the training.
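To make the intended semantics concrete, here is a toy sketch in plain Python. This is not real Keras or KotlinDL code; the ToyLayer class and train_step function are invented purely to illustrate how a trainable flag gates weight updates:

```python
# Toy illustration of trainable/frozen semantics (not real Keras/KotlinDL code):
# a layer's weights are only updated by the optimizer when `trainable` is True.

class ToyLayer:
    def __init__(self, weights, trainable=True):
        self.weights = list(weights)
        self.trainable = trainable  # True by default, as in Keras

def train_step(layers, gradients, lr=0.1):
    """Apply one gradient-descent update, skipping frozen layers."""
    for layer, grads in zip(layers, gradients):
        if not layer.trainable:
            continue  # frozen: weights stay unchanged
        layer.weights = [w - lr * g for w, g in zip(layer.weights, grads)]

frozen = ToyLayer([1.0, 2.0], trainable=False)
active = ToyLayer([1.0, 2.0])
train_step([frozen, active], gradients=[[1.0, 1.0], [1.0, 1.0]])
print(frozen.weights)  # [1.0, 2.0] -- unchanged, the layer is frozen
print(active.weights)  # [0.9, 1.9] -- updated
```

Under this reading, a layer with trainable weights that permanently reports isTrainable = false would never be updated, which is what makes the cases below surprising.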
But for the following layers, this property has been set to false:

- AbstractActivationLayer: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/activation/AbstractActivationLayer.kt#L29
- BatchNorm: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/normalization/BatchNorm.kt#L63 and https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/normalization/BatchNorm.kt#L77
- Concatenate: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/merge/Concatenate.kt#L26
- Conv3D: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/convolutional/Conv3D.kt#L96
- Dropout: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/regularization/Dropout.kt#L36
- DepthwiseConv2D: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/convolutional/DepthwiseConv2D.kt#L92
- SeparableConv2D: https://github.com/JetBrains/KotlinDL/blob/03c18901af9dc86edbf0e0d1d199e787bdc2a9dd/api/src/main/kotlin/org/jetbrains/kotlinx/dl/api/core/layer/convolutional/SeparableConv2D.kt#L102
This is a bit confusing to me, especially for layers with trainable weights (e.g. Conv3D). Am I missing something, or is this a bug? @zaleslaw
Issue Analytics
- Created: 2 years ago
- Comments: 6 (3 by maintainers)
The Java API for TF uses auto-generation to produce Java classes for the exposed C API, and it is regenerated from time to time (in the case of the Java API for TF 2.x). This corresponds to the ops generated in Python, gen_nn_ops for example.
The next task is to implement, in this same package, the low-level logic that is written in Python; at this level a lot of things are still missing.
gen_nn_ops, for example, is generated in the same way in Java and in Python, via api-defs, so in both cases you get access to the underlying C++ code, I hope.
I could not say whether the coverage of TF ops for TF 2.6 differs between the Python and Java APIs (of course, KotlinDL uses 1.15, which contains slightly fewer ops than 2.6). But if you mean not only the generated ops, then yes: at each higher level we need to repeat the Python logic or write the same level of abstraction, for GradientTape, for layers, for losses, for control flow, and so on.
If we talk about probable obstacles/limitations, the honest answer is "I don't know". It looks like there are no obstacles, but I cannot guarantee that.
@zaleslaw Thank you for your reply and explanations. I appreciate it.
But I still don't understand why the coverage of the TF Java API is limited compared to the TF Python API. Is there any specific obstacle/limitation related to Java or the underlying C API of TF, or is it only because no one has invested enough time and effort to extend the TF Java API?