Train job stuck executing

I’ve followed the quick start guide: https://actionml.com/docs/h_ur_quickstart

My config.json is as follows:

{
    "engineId": "2",
    "engineFactory": "com.actionml.engines.ur.UREngine",
    "sparkConf": {
        "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
        "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
        "spark.kryo.referenceTracking": "false",
        "spark.kryoserializer.buffer": "300m",
        "spark.executor.memory": "3g",
        "spark.driver.memory": "3g",
        "spark.es.index.auto.create": "true",
        "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
        "spark.es.nodes.wan.only": "true"
    },
    "algorithm":{
        "indicators": [
            {
                "name": "buy"
            },{
                "name": "view"
            }
        ]
    }
}

I run harness-cli train 2 and when I check harness-cli status engines 2 I see:

/harness-cli/harness-cli/harness-status: line 10: /harness-cli/harness-cli/RELEASE: No such file or directory
Harness CLI v settings
==================================================================
HARNESS_CLI_HOME ........................ /harness-cli/harness-cli
HARNESS_CLI_SSL_ENABLED .................................... false
HARNESS_CLI_AUTH_ENABLED ................................... false
HARNESS_SERVER_ADDRESS ................................... harness
HARNESS_SERVER_PORT ......................................... 9090
==================================================================
Harness Server status: OK
Status for engine-id: 2
{
    "engineParams": {
        "algorithm": {
            "indicators": [
                {
                    "name": "buy"
                },
                {
                    "name": "view"
                }
            ]
        },
        "engineFactory": "com.actionml.engines.ur.UREngine",
        "engineId": "2",
        "sparkConf": {
            "spark.driver.memory": "3g",
            "spark.es.index.auto.create": "true",
            "spark.es.nodes": "harness-docker-compose_elasticsearch_1",
            "spark.es.nodes.wan.only": "true",
            "spark.executor.memory": "3g",
            "spark.kryo.referenceTracking": "false",
            "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
            "spark.kryoserializer.buffer": "300m",
            "spark.serializer": "org.apache.spark.serializer.KryoSerializer"
        }
    },
    "jobStatuses": {
        "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
            "comment": "Spark job",
            "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
            "status": {
                "name": "executing"
            }
        }
    }
}

The job never moves from executing. Is there any way I can debug why this is happening?

I’ve set harness up by following https://actionml.com/docs/harness_container_guide
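
When checking a job repeatedly, it can help to pull just the job states out of the status output shown above. The following is a minimal sketch, not part of Harness: the function name job_statuses is hypothetical, and it assumes the JSON document begins at the first output line that consists only of "{", as it does in the output pasted in this issue.

```python
import json


def job_statuses(cli_output: str) -> dict:
    """Map each jobId to its status name, given captured status output.

    Assumption: everything before the first line that is exactly "{" is the
    CLI banner, and everything from that line onward is one JSON document.
    """
    lines = cli_output.splitlines()
    start = next((i for i, line in enumerate(lines) if line.strip() == "{"), None)
    if start is None:
        raise ValueError("no JSON document found in the status output")
    status = json.loads("\n".join(lines[start:]))
    return {
        job_id: job["status"]["name"]
        for job_id, job in status.get("jobStatuses", {}).items()
    }


# Trimmed copy of the output from this issue, used as sample input.
sample = """Harness Server status: OK
Status for engine-id: 2
{
    "jobStatuses": {
        "ed0becbf-ce7d-4e62-822d-2e3f2138e235": {
            "comment": "Spark job",
            "jobId": "ed0becbf-ce7d-4e62-822d-2e3f2138e235",
            "status": {
                "name": "executing"
            }
        }
    }
}"""

print(job_statuses(sample))
```

This only reports the state the server returns; it does not explain why a job stays in executing.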

Top GitHub Comments

dataedgehungary commented, Dec 17, 2019

The error

org.apache.spark.SparkException: A master URL must be set in your configuration

is solved by adding "master": "local" to sparkConf, as described in the documentation:

https://actionml.com/docs/h_ur_config#spark-parameters-codesparkconfcode

{
   "engineId": "ecom_ur",
   "engineFactory": "com.actionml.engines.ur.UREngine",
   "sparkConf": {
       "master": "local",
       "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
       "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
       "spark.kryo.referenceTracking": "false",
       "spark.kryoserializer.buffer": "300m",
       "spark.executor.memory": "20g",
       "spark.driver.memory": "10g",
       "spark.es.index.auto.create": "true",
       "spark.es.nodes": "localhost",
       "spark.es.nodes.wan.only": "true"
   },
   "algorithm":{
       "indicators": [
           {
               "name": "buy"
           }
       ]
   }
}
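
To catch this before starting a train job, the config can be checked for the master setting. This is a minimal sketch under stated assumptions: has_spark_master is a hypothetical helper, not a Harness API, and the only thing it checks is the one key this thread confirms, "master" inside "sparkConf".

```python
import json


def has_spark_master(config_text: str) -> bool:
    """Return True if sparkConf contains a non-empty "master" string."""
    config = json.loads(config_text)
    master = config.get("sparkConf", {}).get("master")
    return isinstance(master, str) and master.strip() != ""


# Shortened versions of the two configs from this thread.
original = '{"engineId": "2", "sparkConf": {"spark.executor.memory": "3g"}}'
fixed = '{"engineId": "ecom_ur", "sparkConf": {"master": "local", "spark.executor.memory": "20g"}}'

print(has_spark_master(original))  # False
print(has_spark_master(fixed))     # True
```

To check a real file, pass in its contents, for example the text read from config.json.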

mick912 commented, Dec 18, 2019

Thank you!