
Spark can't find specified DLLs

See original GitHub issue

I am new to .NET for Spark and am facing some issues with passing DLLs. Basically, I have some DLL files (from another C# project) that I want to reuse here in my Spark project UDF.

Error:

[Warn] [AssemblyLoader] Assembly 'Classes, Version=3.0.142.0, Culture=neutral, PublicKeyToken=910ab64095116ac0' file not found 'Classes[.dll,.ni.dll]' in '/tmp/spark-e2e6444a-99fc-42c6-ae15-8a5b328e3038/userFiles-aafb5491-4485-46d9-8e17-0849aed7c57a,/home/ubuntu/project/mySparkApp/bin/Debug/net5.0,/opt/Microsoft.Spark.Worker-1.0.0/'
[2021-04-13T11:16:15.1691280Z] [ubuntu-Vostro] [Error] [TaskRunner] [1] ProcessStream() failed with exception: System.IO.FileNotFoundException: Could not load file or assembly 'Classes, Version=3.0.142.0, Culture=neutral, PublicKeyToken=910ab64095116ac0'. The system cannot find the file specified.

Here I have copied Classes.dll (an external DLL) into my /home/ubuntu/project/mySparkApp directory. Initially, I was facing the same error with mySparkApp.dll, and I resolved that by copying it into my current directory, which worked. But in the case of this third-party DLL, it still fails to be found.
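The usage inside the app is roughly the following (a simplified sketch; Classes.SomeHelper.Transform is a placeholder for the real API exposed by Classes.dll):

using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

class Program
{
    static void Main(string[] args)
    {
        SparkSession spark = SparkSession.Builder().AppName("mySparkApp").GetOrCreate();
        DataFrame df = spark.Read().Text("input.txt");

        // Wrap a method from the external Classes.dll in a Spark UDF.
        // Classes.SomeHelper.Transform is a placeholder for the real API.
        Func<Column, Column> transform = Udf<string, string>(s => Classes.SomeHelper.Transform(s));

        df.Select(transform(df["value"])).Show();
    }
}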

Here is my .csproj file where I have referenced Classes.dll:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Spark" Version="1.0.0" />
  </ItemGroup>

  <ItemGroup>
    <Reference Include="Classes">
      <HintPath>/home/incs83/project/mySparkApp/Classes.dll</HintPath>
    </Reference>
    <Reference Include="CSharpZip">
      <HintPath>/home/incs83/project/mySparkApp/CSharpZip.dll</HintPath>
    </Reference>
  </ItemGroup>

</Project>

Here is my spark-submit command:

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master local bin/Debug/net5.0/microsoft-spark-3-0_2.12-1.0.0.jar dotnet bin/Debug/net5.0/mySparkApp.dll

I have spent a lot of time digging into this, but still no luck.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 15 (4 by maintainers)

Top GitHub Comments

1 reaction
clegendre commented, Apr 13, 2021

.NET for Apache Spark will look for your custom DLLs using the DOTNET_ASSEMBLY_SEARCH_PATHS environment variable. So, just before running spark-submit, set that variable to point at the folder containing your DLLs:

set DOTNET_ASSEMBLY_SEARCH_PATHS=absolute_path_to_folder_containing_dlls
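On Linux (the paths in the question suggest Ubuntu), the equivalent is export rather than set. Combined with the spark-submit command from the question, the full invocation might look like this (the folder path is the one mentioned above and is assumed to contain the DLLs):

export DOTNET_ASSEMBLY_SEARCH_PATHS=/home/ubuntu/project/mySparkApp

spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  bin/Debug/net5.0/microsoft-spark-3-0_2.12-1.0.0.jar \
  dotnet bin/Debug/net5.0/mySparkApp.dll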

You can also copy these DLLs into the Microsoft.Spark.Worker installation folder. (This is what is done in the Databricks environment:)

# Resolve the real Microsoft.Spark.Worker install directory through its symlink,
# then copy the app's dependency DLLs next to the worker binary.
APP_DEPENDENCIES=/dbfs/apps/dependencies
WORKER_PATH=`readlink $DOTNET_SPARK_WORKER_INSTALLATION_PATH/Microsoft.Spark.Worker`
if [ -f $WORKER_PATH ] && [ -d $APP_DEPENDENCIES ]; then
   sudo cp -fR $APP_DEPENDENCIES/. `dirname $WORKER_PATH`
fi
0 reactions
harishukla93 commented, Apr 20, 2021

@suhsteve I have used the sources directly to get rid of the Classes.dll dependency. But I am deserializing some data in a UDF using System.Runtime.Serialization.Formatters' BinaryFormatter and a MemoryStream, and it is giving me the error below:

[Warn] [AssemblyLoader] Assembly 'System.Runtime.Serialization.Formatters.resources, Version=4.0.4.0, Culture=en-IN, PublicKeyToken=b03f5f7f11d50a3a' file not found 'System.Runtime.Serialization.Formatters.resources[.dll,.ni.dll]' in '/tmp/spark-024dfc93-f0fc-4c04-8737-ba0dbc8370bf/userFiles-599198e1-61d3-43f7-b810-c6d5376c2d65,/home/incs83/project/rs-etl-test/bin/Debug/netcoreapp3.1,/opt/Microsoft.Spark.Worker-1.0.0/'
[2021-04-20T06:51:51.5112399Z] [incs83-Vostro-3490] [Warn] [AssemblyLoader] Assembly 'System.Runtime.Serialization.Formatters.resources, Version=4.0.4.0, Culture=en, PublicKeyToken=b03f5f7f11d50a3a' file not found 'System.Runtime.Serialization.Formatters.resources[.dll,.ni.dll]' in '/tmp/spark-024dfc93-f0fc-4c04-8737-ba0dbc8370bf/userFiles-599198e1-61d3-43f7-b810-c6d5376c2d65,/home/incs83/project/rs-etl-test/bin/Debug/netcoreapp3.1,/opt/Microsoft.Spark.Worker-1.0.0/'

@clegendre I got to know that BinaryFormatter is obsolete in .NET 5, so I am using .NET Core 3.1, but I am still facing this issue.
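The deserialization in the UDF is roughly the following (a minimal sketch; MyPayload and its field stand in for the actual serializable type, which isn't shown here):

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

[Serializable]
public class MyPayload   // placeholder for the real serialized type
{
    public string Value;
}

// Inside the Spark driver: a UDF that deserializes a binary column.
Func<Column, Column> deserialize = Udf<byte[], string>(bytes =>
{
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream(bytes))
    {
        var payload = (MyPayload)formatter.Deserialize(stream);
        return payload.Value;
    }
});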

Please help!!
