End to end SBT project in Scala

Description

Feature Request: Not a bug

Given that this library requires out-of-date versions of Scala and Spark, it would significantly help newcomers if there were a complete example project that included sample pom.xml and/or build.sbt files and the proper configuration needed.

Of course, this is not just an issue with this library; it is a core Spark issue. However, it would save a lot of time for newcomers to Spark and this library if there were a ready-to-go build.sbt file somewhere.

I’m currently sorting through this mess myself (downgrading from Scala 2.13 to 2.11 in order to use Spark), and would be happy to submit my build.sbt file and associated docker-compose.yml file once I figure it out, if we can find a good place to put it on this repo (or another one).

Please let me know what you guys think, and thanks for this awesome toolset!

Steps to Reproduce

Not applicable

Your Environment

Not applicable

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

1 reaction
baughmann commented, Feb 2, 2021

I’ve temporarily given up on using the MinIO Java client directly, in favor of using Spark’s S3 capabilities (MinIO is S3-compatible). I’ve gotten it to work on my local dev cluster (local[*]) but not on my Docker cluster, even when assembling a fat JAR and submitting it with the job.

It might be a few days or a week or so before I have something. It seems I have even more to learn about dependency management on Spark clusters than I had realized…
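
For readers following along, here is a minimal sketch of what pointing Spark's S3A connector at a MinIO server can look like. It is not the commenter's actual setup: the endpoint `http://minio:9000`, the environment variable names, and the bucket path are all hypothetical, and it assumes the `hadoop-aws` module (and its matching AWS SDK) is on the classpath of both the driver and the executors.

```scala
import org.apache.spark.sql.SparkSession

object MinioS3aSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("minio-s3a-sketch")
      // Hypothetical endpoint: replace with the address of your MinIO service.
      .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
      // Hypothetical environment variable names for the credentials.
      .config("spark.hadoop.fs.s3a.access.key", sys.env.getOrElse("MINIO_ACCESS_KEY", ""))
      .config("spark.hadoop.fs.s3a.secret.key", sys.env.getOrElse("MINIO_SECRET_KEY", ""))
      // MinIO is usually addressed path-style (host/bucket) rather than bucket.host.
      .config("spark.hadoop.fs.s3a.path.style.access", "true")
      // Plain HTTP is assumed here because the endpoint above is not TLS.
      .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
      .getOrCreate()

    // Hypothetical bucket and object; any s3a:// path readable with the credentials works.
    val lines = spark.read.text("s3a://example-bucket/input.txt")
    lines.show(5, truncate = false)

    spark.stop()
  }
}
```

The master URL is deliberately left unset so that `spark-submit` can supply it. A job that works with `local[*]` but fails on a cluster is often missing these settings or the `hadoop-aws` JAR on the executors, although that is only one possible cause.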

1 reaction
maziyarpanahi commented, Feb 1, 2021

You are very welcome. We do have lots of users submitting their jobs as fat JARs on EMR, GCP, etc. It’s true that we don’t have many examples for them apart from the starter project I gave you. Most of the examples show how to use the library, especially in PySpark/Jupyter, since it’s easier to just run it immediately.

Also, you can easily change the Scala version to 2.12.13 and the Apache Spark version to 3.0.1, as we are doing ourselves to start supporting Apache Spark 3.0.1. There are a few small issues in some models, but apart from that everything else is compatible. (Apache Spark 2.3.x and 2.4.x are heavily used by our users, including everyone running in production, so we had to make sure we support all of 2.3, 2.4, and 3.0 on Scala 2.11 and Scala 2.12, since the two older versions won’t go anywhere anytime soon.)
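
As an illustration of the version change described above, a build.sbt could look roughly like the following. This is a sketch, not the maintainers' starter project: the project name is a placeholder, and the library's own dependency line is left commented out with hypothetical coordinates because the thread does not state them.

```scala
// build.sbt: a minimal sketch under the assumptions stated above.
ThisBuild / scalaVersion := "2.12.13" // Scala version mentioned in the comment

val sparkVersion = "3.0.1" // Apache Spark version mentioned in the comment

lazy val root = (project in file("."))
  .settings(
    name := "starter-sketch", // placeholder project name
    libraryDependencies ++= Seq(
      // Marked Provided because a Spark cluster supplies these JARs at runtime.
      "org.apache.spark" %% "spark-core"  % sparkVersion % Provided,
      "org.apache.spark" %% "spark-sql"   % sparkVersion % Provided,
      "org.apache.spark" %% "spark-mllib" % sparkVersion % Provided
      // Add the library itself here. The coordinates below are hypothetical:
      // , "org.example" %% "the-library" % "x.y.z"
    )
  )
```

The `%%` operator appends the Scala binary version (here `_2.12`) to the artifact name, which is why the Scala and Spark versions have to be changed together.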

While you are making a PoC against Apache Spark alone to match your dependencies and see if you can shade some and whether there is an issue, I’ll start adding more examples in that starter project with instructions as to how to package it and use it with spark-submit.
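
For the packaging and shading step mentioned above, one common route is the sbt-assembly plugin. The sketch below assumes that plugin; the plugin version, the choice of Guava as the package to shade, and the merge rules are illustrative assumptions rather than settings taken from the starter project.

```scala
// project/plugins.sbt
// The plugin version is an assumption; pick the release that matches your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")
```

```scala
// build.sbt (additional settings)

// Rename a conflicting package inside the fat JAR. Guava is only an example
// of a dependency that often clashes with the version shipped by Spark.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "shaded.com.google.common.@1").inAll
)

// Resolve duplicate files that appear when many JARs are merged into one.
assembly / assemblyMergeStrategy := {
  // Keep service registrations by concatenating them.
  case PathList("META-INF", "services", _*) => MergeStrategy.concat
  // Drop remaining metadata such as manifests and signature files.
  case PathList("META-INF", _*)             => MergeStrategy.discard
  // For anything else, take the first copy found.
  case _                                    => MergeStrategy.first
}
```

Running `sbt assembly` then produces a single JAR under `target/` that can be passed to `spark-submit`. Spark dependencies marked `Provided` are left out of that JAR, which keeps it smaller and avoids clashing with the cluster's own Spark classes.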
