Train job stuck executing
I’ve followed the quick start guide: https://actionml.com/docs/h_ur_quickstart
My `config.json` is as follows:

```json
{
  "engineId": "2",
  "engineFactory": "com.actionml.engines.ur.UREngine",
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "3g",
    "spark.driver.memory": "3g",
    "spark.es.index.auto.create": "true",
    "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
    "spark.es.nodes.wan.only": "true"
  },
  "algorithm": {
    "indicators": [
      { "name": "buy" },
      { "name": "view" }
    ]
  }
}
```
I run `harness-cli train 2`, and when I check `harness-cli status engines 2` I see:
```
/harness-cli/harness-cli/harness-status: line 10: /harness-cli/harness-cli/RELEASE: No such file or directory
Harness CLI v settings
==================================================================
HARNESS_CLI_HOME ........................ /harness-cli/harness-cli
HARNESS_CLI_SSL_ENABLED .................................... false
HARNESS_CLI_AUTH_ENABLED ................................... false
HARNESS_SERVER_ADDRESS ................................... harness
HARNESS_SERVER_PORT ......................................... 9090
==================================================================
Harness Server status: OK
Status for engine-id: 2
{
  "engineParams": {
    "algorithm": {
      "indicators": [
        { "name": "buy" },
        { "name": "view" }
      ]
    },
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "engineId": "2",
    "sparkConf": {
      "spark.driver.memory": "3g",
      "spark.es.index.auto.create": "true",
      "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
      "spark.es.nodes.wan.only": "true",
      "spark.executor.memory": "3g",
      "spark.kryo.referenceTracking": "false",
      "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
      "spark.kryoserializer.buffer": "300m",
      "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
    }
  },
  "jobStatuses": {
    "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
      "comment": "Spark job",
      "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
      "status": {
        "name": "executing"
      }
    }
  }
}
```
The job never moves out of the `executing` state. Is there any way I can debug why this is happening?

I set up Harness by following https://actionml.com/docs/harness_container_guide
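As a side note, the status payload above can be checked programmatically while waiting on a job. A minimal Python sketch (the payload shape is copied verbatim from the `harness-cli status engines 2` output above; the helper name is mine, not part of the Harness CLI):

```python
import json

# Status payload shape copied from the `harness-cli status engines 2` output.
STATUS = """
{
  "jobStatuses": {
    "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
      "comment": "Spark job",
      "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
      "status": {"name": "executing"}
    }
  }
}
"""

def executing_jobs(payload):
    """Return the ids of jobs still reported as 'executing'."""
    data = json.loads(payload)
    return [
        job["jobId"]
        for job in data.get("jobStatuses", {}).values()
        if job.get("status", {}).get("name") == "executing"
    ]

print(executing_jobs(STATUS))  # -> ['ed0becbf-ce7d-4e62-822d-2e3f2138e235']
```

Running this in a loop against fresh status output would show whether the job ever leaves `executing`.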
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 2
- Comments: 7
Top GitHub Comments
The ERROR is solved by adding `"master": "local"` according to the documentation:
https://actionml.com/docs/h_ur_config#spark-parameters-codesparkconfcode

Thank you!
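For reference, the fix amounts to adding a `master` entry to the `sparkConf` block of the `config.json` shown above (all other keys unchanged; this snippet just restates the original config with the one addition):

```json
"sparkConf": {
  "master": "local",
  "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
  "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
  "spark.kryo.referenceTracking": "false",
  "spark.kryoserializer.buffer": "300m",
  "spark.executor.memory": "3g",
  "spark.driver.memory": "3g",
  "spark.es.index.auto.create": "true",
  "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
  "spark.es.nodes.wan.only": "true"
}
```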