Debugging pyspark applications no longer works after August update

Environment data

  • VS Code version: 1.27
  • Extension version (available under the Extensions sidebar): 2018.8.0
  • OS and version: Linux Mint 18.1 x64
  • Python version (& distribution if applicable, e.g. Anaconda): 2.7.12
  • Type of virtual environment used (N/A | venv | virtualenv | conda | …): N/A
  • Relevant/affected Python packages and their versions: PySpark 2.2.1

Actual behavior

Debugging a PySpark application no longer works after updating to 2018.8.0. After starting the debugger, the terminal shows the following command and error:

cd /home/user/etl ; env "PYSPARK_PYTHON=python" "PYTHONPATH=/home/user/etl:/home/user/.vscode/extensions/ms-python.python-2018.8.0/pythonFiles/experimental/ptvsd" "PYTHONIOENCODING=UTF-8" "PYTHONUNBUFFERED=1" /home/user/Spark/spark-2.2.1-bin-hadoop2.7/bin/spark-submit -m ptvsd --host localhost --port 39763 /home/user/etl/etl/jobs/process_data.py 

Error: Unrecognized option: -m

Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Expected behavior

With version 2018.7.0, PySpark debugging works fine. The following command is displayed in the terminal after starting the debugger:

cd /home/user/etl ; env "PYSPARK_PYTHON=python" "PYTHONPATH=/home/user/etl" "PYTHONIOENCODING=UTF-8" "PYTHONUNBUFFERED=1" /home/user/Spark/spark-2.2.1-bin-hadoop2.7/bin/spark-submit /home/user/.vscode/extensions/ms-python.python-2018.7.0/pythonFiles/PythonTools/visualstudio_py_launcher.py /home/user/etl 46508 34806ad9-833a-4524-8cd6-18ca4aa74f14 RedirectOutput,RedirectOutput /home/user/etl/etl/jobs/process_data.py
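
The two commands differ in what directly follows the spark-submit executable: 2018.7.0 passes a Python launcher file, while 2018.8.0 passes `-m ptvsd`, a Python interpreter option that spark-submit rejects. As a rough diagnostic sketch (the function name `first_spark_submit_argument` is hypothetical and is not part of the extension or of Spark), the following picks out that token from a command string:

```python
import os
import shlex


def first_spark_submit_argument(command):
    """Return the token that directly follows the spark-submit executable.

    Returns None if the command does not invoke spark-submit or if
    nothing follows it. Hypothetical helper, for illustration only.
    """
    tokens = shlex.split(command)
    for index, token in enumerate(tokens):
        if os.path.basename(token) == "spark-submit":
            following = tokens[index + 1:]
            return following[0] if following else None
    return None


# Abbreviated tails of the two commands quoted in this report.
SPARK_SUBMIT = "/home/user/Spark/spark-2.2.1-bin-hadoop2.7/bin/spark-submit"
SCRIPT = "/home/user/etl/etl/jobs/process_data.py"
LAUNCHER = (
    "/home/user/.vscode/extensions/ms-python.python-2018.7.0/"
    "pythonFiles/PythonTools/visualstudio_py_launcher.py"
)

failing = SPARK_SUBMIT + " -m ptvsd --host localhost --port 39763 " + SCRIPT
working = SPARK_SUBMIT + " " + LAUNCHER + " " + SCRIPT

print(first_spark_submit_argument(failing))  # -m
print(first_spark_submit_argument(working))  # the launcher .py path
```

A check like this only shows where the command line changed; it does not reproduce how the extension builds the command.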

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 20 (3 by maintainers)

Top GitHub Comments

DonJayamanne commented, Sep 27, 2018 (2 reactions)

I’ll have a fix today.

simondmorias commented, Sep 28, 2018 (1 reaction)

I ran into this issue today. I can confirm that the fix worked for me. Thanks
