
Getting an error running spark-submit job

See original GitHub issue

Hello,

I am trying to follow the instructions here: https://github.com/facebookresearch/Horizon/blob/master/docs/usage.md

When I run this script:

    /usr/local/spark/bin/spark-submit \
      --class com.facebook.spark.rl.Preprocessor preprocessing/target/rl-preprocessing-1.1.jar \
      "`cat ml/rl/workflow/sample_configs/discrete_action/timeline.json`"
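
A note on the last argument: the backticks make the shell substitute the file's contents in place, so the whole timeline.json document reaches the job as a single program argument. Below is a minimal sketch of a job consuming its config that way; it is illustrative only (ConfigFromArg is a made-up name, not Horizon's actual Preprocessor):

    import org.apache.spark.sql.SparkSession

    // Sketch of a Spark job that takes an entire JSON config as its first
    // command-line argument, the way the backtick-cat invocation above passes
    // timeline.json. Illustrative only, not the real Preprocessor.
    object ConfigFromArg {
      def main(args: Array[String]): Unit = {
        require(args.nonEmpty, "expected the JSON config as the first argument")
        val configJson = args(0) // the shell substituted the file contents here
        val spark = SparkSession.builder().appName("config-from-arg").getOrCreate()
        println(s"Received ${configJson.length} characters of config")
        spark.stop()
      }
    }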

I am getting the following output:

    2019-02-27 00:57:03 INFO HiveMetaStore:746 - 0: get_database: global_temp
    2019-02-27 00:57:03 INFO audit:371 - ugi=root ip=unknown-ip-addr cmd=get_database: global_temp
    2019-02-27 00:57:03 WARN ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
    Exception in thread "main" org.apache.spark.sql.AnalysisException: grouping expressions sequence is empty, and 'source_table.mdp_id' is not an aggregate function. Wrap '()' in windowing function(s) or wrap 'source_table.mdp_id' in first() (or first_value) if you don't care which value you get.;;
    'Sort ['HASH('mdp_id, 'sequence_number) ASC NULLS FIRST], false
    +- 'RepartitionByExpression ['HASH('mdp_id, 'sequence_number)], 200
       +- 'Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, next_state_features#24, next_action#25, sequence_number#2, sequence_number_ordinal#26, time_diff#27, possible_actions#7, possible_next_actions#28, metrics#8]
          +- 'Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8, next_state_features#24, next_action#25, sequence_number_ordinal#26, _we3#30, possible_next_actions#28, next_state_features#24, next_action#25, sequence_number_ordinal#26, (coalesce(_we3#30, sequence_number#2) - sequence_number#2) AS time_diff#27, possible_next_actions#28]
             +- 'Window [lead(state_features#4, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS next_state_features#24, lead(action#5, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS next_action#25, row_number() windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS sequence_number_ordinal#26, lead(sequence_number#2, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS _we3#30, lead(possible_actions#7, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS possible_next_actions#28], [mdp_id#1], [mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST]
                +- 'Filter isnotnull('next_state_features)
                   +- Aggregate [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8]
                      +- SubqueryAlias source_table
                         +- Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8]
                            +- Filter ((ds#0 >= 2019-01-01) && (ds#0 <= 2019-01-01))
                               +- SubqueryAlias cartpole_discrete
                                  +- Relation[ds#0,mdp_id#1,sequence_number#2,action_probability#3,state_features#4,action#5,reward#6,possible_actions#7,metrics#8] json
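
The decisive line is the AnalysisException: Spark considers the aggregate's grouping-expression list empty while a bare column (source_table.mdp_id) is still being selected. The same class of error can be reproduced outside Horizon in a few lines; this is a minimal sketch with made-up data and names, not the Preprocessor's actual query:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Minimal repro of the error class, not Horizon's actual query.
    object GroupingErrorRepro {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("grouping-error-repro")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        val df = Seq(
          ("mdp-1", 1, 0.5),
          ("mdp-1", 2, 1.0)
        ).toDF("mdp_id", "sequence_number", "reward")

        // Mixing an aggregate with a bare column and no groupBy() raises the
        // same AnalysisException ("grouping expressions sequence is empty, and
        // 'mdp_id' is not an aggregate function"):
        // df.select($"mdp_id", max($"sequence_number")).show()

        // Fix 1: group by the bare column.
        df.groupBy($"mdp_id").agg(max($"sequence_number")).show()

        // Fix 2: wrap the bare column in first(), as the message suggests.
        df.select(first($"mdp_id"), max($"sequence_number")).show()

        spark.stop()
      }
    }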

I followed the steps after manually installing HBase (this step is missing from the documentation; please let me know if you want me to add it).

I am using the Docker-on-Mac instructions (https://github.com/facebookresearch/Horizon/blob/master/docs/installation.md) to get going. Can anyone please help me figure out how to move forward?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

1 reaction
sureshakella commented, Mar 1, 2019

Awesome, thank you. The job ran successfully after applying #104.

1 reaction
MisterTea commented, Mar 1, 2019

With the changes in https://github.com/facebookresearch/Horizon/pull/104 and using Spark 2.3.3, I was able to go through the whole usage doc.
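
For anyone landing here later: the working combination above was PR #104 plus Spark 2.3.3, so it is worth confirming which Spark runtime actually executes the job when several are installed. A small sketch of such a check (the object name and version assertion are mine, not from the issue):

    import org.apache.spark.sql.SparkSession

    // Print and assert the Spark runtime version a job actually runs under;
    // useful when more than one Spark installation is on the machine.
    object SparkVersionCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("version-check")
          .master("local[*]")
          .getOrCreate()
        println(s"Running under Spark ${spark.version}")
        require(spark.version.startsWith("2.3"), s"expected Spark 2.3.x, got ${spark.version}")
        spark.stop()
      }
    }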


