Large `MapValues` elements are loaded very slowly in newer library versions
Hi, Villu. We are trying to upgrade our Java service, which uses the JPMML library, to Java 11, but we ran into an issue with evaluation time after the upgrade.
The application Docker image: openjdk:11-jre. The JPMML library version: 1.5.14 (we’ve also tried 1.5.16; we haven’t tried 1.6.0 because upgrading to it would require refactoring our code due to changes in that version).
The evaluation code:
long startHandle = System.currentTimeMillis();
byte[] pmmlFileContent = pmmlFile.getContent();
String uncompressedValue = new String(pmmlFileContent, StandardCharsets.UTF_8);
StringInputStream source = new StringInputStream(uncompressedValue);
Evaluator evaluator = new LoadingModelEvaluatorBuilder().load(source).build();
optimizePmml(((ModelEvaluator<?>) evaluator).getPMML());
try {
evaluator.verify();
} catch (Exception e) {
errorLogger.reportError(INIT, LOAD, CRITICAL, String.format("PMML file %s is incorrect", pmmlFile.getName()));
throw e;
}
EvaluatorMetadata evaluatorMetadata = extractMetadataFields(evaluator, pmmlFile);
PMMLEvaluatorContainer pmmlEvaluatorContainer = new PMMLEvaluatorContainer(evaluator, evaluatorMetadata);
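To narrow down which step dominates the measured time (decoding, unmarshalling, optimization, or verification), each phase could be timed separately. The following is only a minimal sketch: `PhaseTimer`, `time` and `report` are hypothetical names that are not part of JPMML, and the commented-out lines merely indicate where the calls from the snippet above might be wrapped.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of JPMML): accumulates wall-clock time per named phase.
class PhaseTimer {

    private final Map<String, Long> elapsedMillis = new LinkedHashMap<>();

    // Runs the given work, records its duration under the phase name, and returns its result.
    <T> T time(String phase, Callable<T> work) {
        long start = System.nanoTime();
        try {
            return work.call();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Checked exceptions (e.g. from XML parsing) are wrapped to keep call sites short.
            throw new IllegalStateException("Phase '" + phase + "' failed", e);
        } finally {
            long millis = (System.nanoTime() - start) / 1_000_000L;
            elapsedMillis.merge(phase, millis, Long::sum);
        }
    }

    // Phase name -> accumulated milliseconds, in the order the phases first ran.
    Map<String, Long> report() {
        return elapsedMillis;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();

        // Stand-in for the real PMML bytes; in the service this would be pmmlFile.getContent().
        byte[] content = "<PMML/>".getBytes(StandardCharsets.UTF_8);
        String xml = timer.time("decode", () -> new String(content, StandardCharsets.UTF_8));

        // In the real service, the other phases might be wrapped the same way, for example:
        // Evaluator evaluator = timer.time("load",
        //     () -> new LoadingModelEvaluatorBuilder().load(source).build());
        // timer.time("verify", () -> { evaluator.verify(); return null; });

        System.out.println("decoded " + xml.length() + " chars, timings (ms): " + timer.report());
    }
}
```

If the "load" phase turns out to account for almost all of the elapsed time, that would point at XML unmarshalling rather than at model evaluation itself.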
A possible solution that you suggested earlier, as far as we’ve found: https://openscoring.io/blog/2019/02/28/jpmml_model_api_configuring_jaxb_dependency/ https://groups.google.com/g/jpmml/c/bLvhf3MIQp0/m/j9DHDRo4AQAJ
If the … Java/JVM does not provide built in JAXB runtime,
then perhaps it’s possible to include an independent 3rd party JAXB
runtime into the project? The JPMML-Model project contains two modules
org.jpmml:pmml-model-metro and org.jpmml:pmml-model-moxy, which should
include a full runtime dependency of Glassfish Metro and EclipseLink
MOXy JAXB runtimes, respectively. I’d personally experiment with
depending on the org.jpmml:pmml-model-moxy module (instead of the
standard org.jpmml:pmml-model module), as EclipseLink as a third party
vendor might be more flexible/less intrusive in their approach.
Also, it should be remembered that newer Java/JVM SE versions also do
not include JAXB runtime. However, using the suggested modules has got
things working for me on all Java SE 9, 10, 11 and 12 so far.
Unfortunately, switching to the new -metro dependency and the latest version requires code changes that we currently cannot afford to make.
Another solution was to keep the current pmml-evaluator version and add the JAXB dependencies explicitly:
<dependency>
  <groupId>org.jpmml</groupId>
  <artifactId>pmml-evaluator</artifactId>
  <version>1.5.14</version>
</dependency>
<dependency>
  <groupId>org.jpmml</groupId>
  <artifactId>pmml-evaluator-extension</artifactId>
  <version>1.5.14</version>
</dependency>
<dependency>
  <groupId>org.glassfish.jaxb</groupId>
  <artifactId>jaxb-runtime</artifactId>
  <version>2.3.2</version>
</dependency>
<dependency>
  <groupId>javax.activation</groupId>
  <artifactId>javax.activation-api</artifactId>
  <version>1.2.0</version>
</dependency>
<dependency>
  <groupId>jakarta.xml.bind</groupId>
  <artifactId>jakarta.xml.bind-api</artifactId>
  <version>2.3.2</version>
</dependency>
This solution has led to an increased evaluation time of PMML files (with Java 8 we have up to 5 ms average evaluation for each file):
2021-12-28 13:57:13,685 [pool-2-thread-5] INFO [PriorityManagerImpl] Finished to handle the pmml [3100] file [25649] ms
2021-12-28 13:57:13,721 [pool-2-thread-8] INFO [PriorityManagerImpl] Finished to handle the pmml [3300] file [25588] ms
2021-12-28 13:57:14,297 [pool-2-thread-11] INFO [PriorityManagerImpl] Finished to handle the pmml [3400] file [15163] ms
2021-12-28 13:57:14,596 [pool-2-thread-7] INFO [PriorityManagerImpl] Finished to handle the pmml [3200] file [25095] ms
2021-12-28 14:02:02,839 [pool-2-thread-4] INFO [PriorityManagerImpl] Finished to handle the pmml [3102] file [312919] ms
2021-12-28 14:02:03,502 [pool-2-thread-12] INFO [PriorityManagerImpl] Finished to handle the pmml [3302] file [315177] ms
2021-12-28 14:02:04,817 [pool-2-thread-6] INFO [PriorityManagerImpl] Finished to handle the pmml [3402] file [313270] ms
2021-12-28 14:02:09,896 [pool-2-thread-8] INFO [PriorityManagerImpl] Finished to handle the pmml [3202] file [295978] ms
Upgrading the dependency as described on the GitHub page (https://github.com/jpmml/jpmml-evaluator) leads to the same evaluation time:
2021-12-28 14:43:59,059 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 1794766 size to pmml file 3300
2021-12-28 14:43:59,100 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 1198828 size to pmml file 3100
2021-12-28 14:43:59,102 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 1737901 size to pmml file 3102
2021-12-28 14:43:59,332 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 2085889 size to pmml file 3302
2021-12-28 14:43:59,656 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 10188345 size to pmml file 3400
2021-12-28 14:43:59,734 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 6968065 size to pmml file 21191331
2021-12-28 14:43:59,845 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 6280020 size to pmml file 61331
2021-12-28 14:43:59,932 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 11274487 size to pmml file 20691331
2021-12-28 14:44:00,031 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 10969208 size to pmml file 21391331
2021-12-28 14:44:00,080 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 5387857 size to pmml file 21291331
2021-12-28 14:44:00,147 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 15703073 size to pmml file 3200
2021-12-28 14:44:00,297 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 16025233 size to pmml file 3402
2021-12-28 14:44:00,499 [async-channel-group-0-handler-executor] INFO [PmmlGridFsDao] Loaded 21234942 size to pmml file 3202
2021-12-28 14:44:20,477 [pool-2-thread-2] INFO [PriorityManagerImpl] Finished to handle the pmml [3100] file [21374] ms
2021-12-28 14:44:20,514 [pool-2-thread-4] INFO [PriorityManagerImpl] Finished to handle the pmml [3300] file [21431] ms
2021-12-28 14:44:21,143 [pool-2-thread-10] INFO [PriorityManagerImpl] Finished to handle the pmml [3400] file [21486] ms
2021-12-28 14:44:21,440 [pool-2-thread-1] INFO [PriorityManagerImpl] Finished to handle the pmml [3200] file [21292] ms
2021-12-28 14:49:23,238 [pool-2-thread-11] INFO [PriorityManagerImpl] Finished to handle the pmml [3302] file [323904] ms
2021-12-28 14:49:24,054 [pool-2-thread-3] INFO [PriorityManagerImpl] Finished to handle the pmml [3102] file [324951] ms
2021-12-28 14:49:26,909 [pool-2-thread-7] INFO [PriorityManagerImpl] Finished to handle the pmml [3402] file [326611] ms
2021-12-28 14:49:30,200 [pool-2-thread-2] INFO [PriorityManagerImpl] Finished to handle the pmml [3202] file [309723] ms
Upgrading to version 1.5.16 gives the same result as described previously:
<dependency>
  <groupId>org.jpmml</groupId>
  <artifactId>pmml-evaluator-metro</artifactId>
  <version>1.5.16</version>
</dependency>
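The dependency setups tried here differ mainly in which JAXB API and runtime end up on the classpath, so it may help to check what is actually visible at runtime. The sketch below only probes for class names via reflection. `JaxbClasspathProbe` is a hypothetical helper, and the listed class names are assumptions about the usual entry points of the JAXB API, the Glassfish runtime and EclipseLink MOXy, so they should be adjusted to the versions actually in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper: reports which JAXB-related classes can be found on the classpath.
class JaxbClasspathProbe {

    // Returns true if the class can be located without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, JaxbClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Probes each class name and keeps the results in the order given.
    static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String className : classNames) {
            result.put(className, isPresent(className));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed entry-point class names; verify them against the artifacts you depend on.
        Map<String, Boolean> found = probe(
            "javax.xml.bind.JAXBContext",                      // Java XML Binding API
            "jakarta.xml.bind.JAXBContext",                    // Jakarta XML Binding API
            "com.sun.xml.bind.v2.ContextFactory",              // Glassfish JAXB runtime
            "org.eclipse.persistence.jaxb.JAXBContextFactory"  // EclipseLink MOXy
        );
        found.forEach((name, present) ->
            System.out.println((present ? "found    " : "missing  ") + name));
    }
}
```

Running this inside the same image (openjdk:11-jre) with the application classpath would show whether more than one API or runtime is present at once.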
We’d appreciate it if you could advise a solution for this issue. Regards, Pavel
Top GitHub Comments
Maybe this `MapValues` loading issue affects the (late end of the) 1.5.X development branch only? I haven’t tested my small demo application with the 1.5.16 version.
This issue definitely doesn’t exist in the 1.6.X development branch (at the time of testing, the latest version being the 1.6.2 version). Maybe it was solved automagically when switching from Java XML binding (`javax.xml.bind`) to Jakarta XML Binding (`jakarta.xml.bind`).
It’s probably time to close this issue as “invalid” and/or “not reproducible”. Can be re-opened anytime when there’s a clear well-isolated example (targeting the 1.6.X development branch!) available.
Be sure to prepare a solid example case first.
In my understanding, it affects the unmarshalling of elements whose child elements belong to two or more XML namespaces.
In the JPMML-Model case it’s the `org.dmg.pmml.Row` element (not `MapValues`), which contains elements that belong to the JPMML-InlineTable namespaces.
I wonder if the performance regression is resolved if the `data:` prefix is simply deleted (it makes the PMML document incorrect, but would give the first data point).
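To illustrate the namespace point, here is a minimal sketch using only the JDK’s DOM parser. The XML fragment is an assumption made for illustration: the column names `input` and `output` are hypothetical, and the two namespace URIs are the ones commonly seen in PMML 4.4 documents produced with JPMML. It shows that a `row` element and its `data:`-prefixed children resolve to different namespaces, and that deleting the prefix puts everything into one namespace.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustration only: inspects the namespaces of a row element and its child elements.
class RowNamespaceDemo {

    // Assumed fragment; element names under <row> are hypothetical.
    static final String PREFIXED =
        "<InlineTable xmlns=\"http://www.dmg.org/PMML-4_4\""
        + " xmlns:data=\"http://jpmml.org/jpmml-model/InlineTable\">"
        + "<row><data:input>a</data:input><data:output>1</data:output></row>"
        + "</InlineTable>";

    // The experiment suggested above: delete the "data:" prefix from the cell elements.
    static final String UNPREFIXED = PREFIXED.replace("data:", "");

    // Collects the namespace URIs of the first <row> element and of its child elements.
    static Set<String> namespacesUnderRow(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element row = (Element) doc.getElementsByTagNameNS("*", "row").item(0);
            Set<String> uris = new LinkedHashSet<>();
            uris.add(row.getNamespaceURI());
            NodeList children = row.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    uris.add(child.getNamespaceURI());
                }
            }
            return uris;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with prefix:    " + namespacesUnderRow(PREFIXED));
        System.out.println("without prefix: " + namespacesUnderRow(UNPREFIXED));
    }
}
```

With the prefix, the row spans two namespaces. Without it, the cells fall into the default PMML namespace, which is what makes the document formally incorrect but useful as a first data point for the timing comparison.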