question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Error during training when using remote ElasticSearch

See original GitHub issue

I was trying to set up a PIO server with a remote ES and remote HBASE/Zookeeper via Docker.

versions used:

  • SCALA_VERSION 2.11.8
  • PIO_VERSION 0.12.1 (“from source” downloaded from apache mirror)
  • SPARK_VERSION 2.1.2
  • ELASTICSEARCH_VERSION 5.5.2
  • HBASE_VERSION 1.3.1

Here is my config:

pio-env.sh:

#!/usr/bin/env bash

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=${HOME}/.pio_store
PIO_FS_ENGINESDIR=${PIO_FS_BASEDIR}/engines
PIO_FS_TMPDIR=${PIO_FS_BASEDIR}/tmp

SPARK_HOME=${PIO_HOME}/vendors/spark-${SPARK_VERSION}-bin-hadoop2.7

HBASE_CONF_DIR=${PIO_HOME}/vendors/hbase-${HBASE_VERSION}/conf

# Storage Repositories
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_APPDATA_NAME=pio_appdata
PIO_STORAGE_REPOSITORIES_APPDATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS

# ES config
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=predictionio
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=${PIO_HOME}/vendors/elasticsearch-${ELASTICSEARCH_VERSION}

PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
PIO_STORAGE_SOURCES_LOCALFS_PATH=${PIO_FS_BASEDIR}/models

PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=${PIO_HOME}/vendors/hbase-${HBASE_VERSION}
# http://actionml.com/docs/small_ha_cluster
HBASE_MANAGES_ZK=true # when you want HBase to manage zookeeper

PIO itself seems to be running fine, here is the output of pio status:

pio status
[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.12.1 is installed at /PredictionIO-0.12.1
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at /PredictionIO-0.12.1/vendors/spark-2.1.2-bin-hadoop2.7
[INFO] [Management$] Apache Spark 2.1.2 detected (meets minimum requirement of 1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
[INFO] [Storage$] Test writing to Event Store (App Id 0)...
[INFO] [HBLEvents] The table pio_event:events_0 doesn't exist yet. Creating now...
[INFO] [HBLEvents] Removing table pio_event:events_0...
[INFO] [Management$] Your system is all ready to go.

It seems as if the universal recommender does not pick up the PIO storage settings; but keeps his own settings. Running the integration tests it is using the template examples/handmade-engine.json where I added two lines within the sparkConf object (es.nodes and es.nodes.wan.only):

  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "es.index.auto.create": "true",
    "es.nodes.wan.only":"true",
    "es.nodes":"es"
  }

It seems to be talking to the right ES server, but I always get the following exception during the training (pio train -- --driver-memory 4g --executor-memory 4g) phase:

2018-09-04 07:49:35,562 ERROR org.apache.predictionio.data.storage.elasticsearch.ESEngineInstances [main] - Failed to update pio_meta/engine_instances/AWWjjner32JscvS-r-c9
org.apache.predictionio.shaded.org.elasticsearch.client.ResponseException: POST http://es:9200/pio_meta/engine_instances/AWWjjner32JscvS-r-c9?refresh=true: HTTP/1.1 400 Bad Request
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"mapper [sparkConf.es.nodes] of different type, current_type [text], merged_type [ObjectMapper]"}],"type":"illegal_argument_exception","reason":"mapper [sparkConf.es.nodes] of different type, current_type [text], merged_type [ObjectMapper]"},"status":400}
	at org.apache.predictionio.shaded.org.elasticsearch.client.RestClient$1.completed(RestClient.java:354)
	at org.apache.predictionio.shaded.org.elasticsearch.client.RestClient$1.completed(RestClient.java:343)
	at org.apache.predictionio.shaded.org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
	at org.apache.predictionio.shaded.org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
	at org.apache.predictionio.shaded.org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.predictionio.shaded.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
	at java.lang.Thread.run(Thread.java:748)

This does not happen when I start a local ES on the same machine where PIO is located (using the original engine.json)

Issue Analytics

  • State:open
  • Created 5 years ago
  • Reactions:3
  • Comments:20

github_iconTop GitHub Comments

2reactions
pshoukrycommented, Apr 1, 2019

@thisismana to fix this issue you need to set wan.only on the hadoop client in PIO and in UR please check those 2 pull requests:

PredictionIO FIX UR FIX

specially: here and here

1reaction
lucacanellacommented, Feb 20, 2019

Same problem here. Elasticsearch docker container is up and running; it responds correctly when queried with curl or node, or wget.

My spark conf:

"sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "1024",
    "es.nodes": "elasticsearch",
    "es.port": "9200",
    "es.index.auto.create": "true",
    "es.nodes.wan.only": "true"
  }

When i run pio train i get this error twice:

[ERROR] [ESEngineInstances] Failed to update pio_meta/engine_instances/AWkLF89GASGnnVc278Ds

And some of these warnings:

[WARN] [EsInputFormat] Cannot determine task id...

In the submission command I see this env vars:

PIO_ENV_LOADED=1
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=elasticsearch
PIO_HOME=/usr/share/predictionio
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://postgres/pio
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=ELASTICSEARCH
PIO_CONF_DIR=/etc/predictionio
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
Read more comments on GitHub >

github_iconTop Results From Across the Web

Fix common cluster issues | Elasticsearch Guide [8.5]
Fix common cluster issuesedit. This guide describes how to fix common errors and problems with Elasticsearch clusters. Error: disk usage exceeded flood- ...
Read more >
3 best practices for using and troubleshooting the Reindex ...
We will review potential errors and give you best practice tips and ... Transferring data between clusters (Reindex from a remote cluster) ...
Read more >
How do I resolve configuration change failures?
Configuration change errors can occur when there is insufficient RAM configured for a data tier. In this case, the cluster typically also shows...
Read more >
Troubleshoot common problems | Fleet and ...
This error occurs when you use self-signed certificates with Elasticsearch using IP as a Common Name (CN). With IP as a CN, Fleet...
Read more >
Connect to remote clusters | Elasticsearch Guide [8.5]
When the compression or ping schedule settings change, all existing node connections must close and re-open, which can cause in-flight requests to fail....
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found