SocketException thrown by nunit3-console.exe --explore option
See original GitHub issue
@JackUkleja commented on Wed Oct 12 2016
I am running NUnit.ConsoleRunner.3.5.0\tools\nunit3-console.exe with the --explore option against one of my test assemblies and I am getting the following stack trace:
System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host
Server stack trace:
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Runtime.Remoting.Channels.SocketStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Runtime.Remoting.Channels.SocketHandler.ReadFromSocket(Byte[] buffer, Int32 offset, Int32 count)
at System.Runtime.Remoting.Channels.SocketHandler.Read(Byte[] buffer, Int32 offset, Int32 count)
at System.Runtime.Remoting.Channels.SocketHandler.ReadAndMatchFourBytes(Byte[] buffer)
at System.Runtime.Remoting.Channels.Tcp.TcpSocketHandler.ReadAndMatchPreamble()
at System.Runtime.Remoting.Channels.Tcp.TcpSocketHandler.ReadVersionAndOperation(UInt16& operation)
at System.Runtime.Remoting.Channels.Tcp.TcpClientSocketHandler.ReadHeaders()
at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)
at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at NUnit.Engine.ITestAgent.Stop()
at NUnit.Engine.Runners.ProcessRunner.Dispose(Boolean disposing)
at NUnit.Engine.Runners.AbstractTestRunner.Dispose()
at NUnit.Engine.Runners.MasterTestRunner.Dispose(Boolean disposing)
at NUnit.Engine.Runners.MasterTestRunner.Dispose()
at NUnit.ConsoleRunner.ConsoleRunner.ExploreTests(TestPackage package, TestFilter filter)
at NUnit.ConsoleRunner.Program.Main(String[] args)
Any clues what is going on? The tests in this assembly run just fine, but the --explore option seems to kill the runner.
@ChrisMaddock commented on Wed Oct 12 2016
@JackUkleja - What error do you see if you add the --inprocess option?
@JackUkleja commented on Wed Oct 12 2016
If I try --inprocess I get:
NUnit.Engine.NUnitEngineException: Cannot run tests in process - a 32 bit process is required.
So obviously my test DLL is 32 bit and the nunit-console.exe is running as 64 bit due to it being 64 bit OS.
If I use Corflags to force nunit-console.exe to run as 32 bit then the --explore option appears to work just fine.
So it seems I am unable to replicate the original exception when doing --inprocess.
@CharliePoole commented on Wed Oct 12 2016
It’s useful info nonetheless. We should see if we can replicate this.
@ChrisMaddock commented on Thu Oct 13 2016
Hmm…I can’t reproduce this with a simple x86 dll.
@JackUkleja - What’s your entire command line, is there anything else involved? Could you try setting up a minimal example, with a 1 test assembly you’d be able to share?
@JackUkleja commented on Mon Oct 24 2016
One more clue about this…
I was just trying to replicate this issue with the following command:
“NUnit\NUnit.ConsoleRunner.3.5.0\tools\nunit3-console.exe --explore Artefacts\Debug\Tests\IntegrationTests.dll”
and much to my surprise it worked without any errors. I got a list back of the contained tests, as expected! So I tried a few of my other test assemblies - they all worked too. Finally I tried my IntegrationTests.dll once more, exactly the same command line,…but this time it failed with the aforementioned SocketException
@JackUkleja commented on Mon Oct 24 2016
Nothing I can do now seems to let me run this command successfully again. I’ve rebuilt the assemblies, restarted the console, looked for any “remoting” or “nunit” processes that maybe could be killed or restarted (nothing). Is something persisting in the remoting infrastructure that is breaking things - something that is not being cleaned up? I’ll try log off/on to see if that gets it working again…
@JackUkleja commented on Mon Oct 24 2016
Oh and this only seems to be happening with my IntegrationTests.dll so there is clearly some strange specific interaction between that dll and nunit remoting.
Update: The command just randomly succeeded without me changing anything specifically. I just ran it a bunch of times and about once in 50 attempts it worked again…
@rprouse commented on Thu Dec 01 2016
@JackUkleja do you have any updates on this? I suspect that it is something in your tests and would like to close this issue.
@tfabris commented on Wed Mar 22 2017
Our company is getting this exact same error with this exact same stack trace. It is rare and intermittent for us. We have unit tests that run several thousand test cases on several different code branches several times per day, and we’re hitting this perhaps twice per day overall. So it’s quite rare for us. But we do hit it. So rprouse, please don’t close the issue just yet.
We are running Nunit version 3.6.1 by the way.
@tfabris commented on Wed Mar 22 2017
To clarify: When it happens, it’s random as to which test case happens to be running when it hits. So we don’t think it’s related to something specific in any of our tests. 99% of the time the tests run just fine and they all pass, it’s only on the rare occasion that we hit this error, and when we do, it’s randomly somewhere in the test list, not the same place every time. (But when the crash occurs, it’s always the exact same stack trace shown at the top of this thread.)
@rwencel commented on Thu Mar 23 2017
We have the same problem, same description, though we don’t use the explore option, we just run the tests in parallel. 100% of tests succeed, but at the end of the run we encounter this error. It’s sporadic. One theory is that it’s when our build server is under stress (multiple builds running at once), but I haven’t confirmed that.
I’m trying a build with the workaround mentioned by @eberlid in the issue below to see if it fixes. I’m not sure it’s a proper fix b/c it just ignores the exception, but it seems harmless and I’m not sure what the root cause is.
NUnit Console Runner Issue #171
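For readers wondering what "just ignores the exception" could look like, here is a minimal sketch under stated assumptions. It is not the code from the linked issue and not the engine's actual shutdown path: `IStoppableAgent`, `AgentShutdown`, and `DroppedConnectionAgent` are hypothetical names invented for illustration. The only real API used is `System.Net.Sockets.SocketException`, with Winsock error 10054 (connection reset) standing in for the failure seen in the stack trace.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical minimal view of an agent; not the real NUnit.Engine.ITestAgent.
interface IStoppableAgent
{
    void Stop();
}

// Hypothetical fake that simulates the remote side dropping the connection.
sealed class DroppedConnectionAgent : IStoppableAgent
{
    public void Stop()
    {
        // 10054 is WSAECONNRESET: the connection was reset by the peer.
        throw new SocketException(10054);
    }
}

static class AgentShutdown
{
    // Returns true if Stop() completed, false if the connection was already gone.
    public static bool TryStop(IStoppableAgent agent)
    {
        try
        {
            agent.Stop();
            return true;
        }
        catch (SocketException ex)
        {
            // Swallow the failure: the agent is assumed to be exiting anyway.
            Console.Error.WriteLine("Ignoring failure while stopping agent: " + ex.Message);
            return false;
        }
    }
}
```

As noted above, this only hides the symptom. If the agent died for a different reason, the same catch block would hide that as well.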
@tfabris commented on Thu Mar 30 2017
This issue continues to occur for us intermittently when using Nunit 3.6.1 to run our unit test and integration test suites on Team City.
Because it’s so intermittent, we don’t have a way to reliably reproduce the issue locally on our local machines. Is there any forensic information we can obtain from Team City that would be helpful?
It would be nice to be rid of these “exit code -100” socket exception errors, since they block production when they occur.
@ChrisMaddock commented on Fri Mar 31 2017
I also found one of our assemblies which hits this yesterday - but couldn’t for the life of me manage to debug it. 😞
It feels like a race condition - as I can’t reproduce it when running through visual studio or in debug mode. Logging makes it harder to come by - I have some logs, but they’re not useful.
The only difference I can really think about in this assembly which is erroring is that it’s far more dependent on testcasesources than some of our other assemblies. At some point, I’ll try and test out that theory with some slow testcasesources.
@jnm2 commented on Fri Mar 31 2017
I’ve hit this in debugging a fair number of times a few months ago but none recently.
@ChrisMaddock commented on Fri Mar 31 2017
For what it’s worth - this is what the logs look like up until the crash, with --trace=Debug. I was running a single assembly, with just the --explore option.
Engine
13:50:38.052 Debug [ 1] TestAgency: Waiting for agent {7c42103a-6371-4f5f-ba31-df7e7016a6a6} to register
13:50:38.253 Debug [ 1] TestAgency: Returning new agent {7c42103a-6371-4f5f-ba31-df7e7016a6a6}
13:50:41.227 Info [ 1] ProcessRunner: Unloading
13:50:41.313 Debug [ 1] ProcessRunner: Stopping remote agent
13:50:41.321 Error [ 1] ProcessRunner: Failed to stop the remote agent. An existing connection was forcibly closed by the remote host
(log continues - stops all services as expected)
Agent
13:50:38.294 Info [ 5] NUnitFrameworkDriver: Loading Tests.exe - see separate log file
13:50:39.956 Info [ 5] NUnitFrameworkDriver: Loaded Tests.exe
13:50:39.957 Info [ 5] NUnitFrameworkDriver: Exploring Tests.exe - see separate log file
13:50:41.315 Info [ 5] RemoteTestAgent: Stopping
(End of agent log)
@tfabris commented on Mon Apr 03 2017
@rprouse and @CharliePoole , do you guys have any idea about where this is coming from? It only started occurring when we upgraded from NUnit 3.4.1 to NUnit 3.6.1, so it is definitely a new regression which occurred somewhere in between those two builds.
@CharliePoole commented on Mon Apr 03 2017
@ChrisMaddock Didn’t you confirm this already?
@ChrisMaddock commented on Mon Apr 03 2017
@CharliePoole - yes, although only as in so much as I can see it happening, I’ve not got much idea what’s wrong!
I was hoping to take some of the characteristics of the assembly I see it happen with a lot, and create a repro I can share. Haven’t managed to get back to that yet.
Update: Nope…not what I thought unfortunately…
@eberlid commented on Mon Apr 03 2017
We got the same symptoms when we began to start multiple Gallio test runners in parallel using MbUnit test framework. Before that we did not start multiple runners in parallel and we did not get this issue.
Probably both runners have something in common?
See https://github.com/nunit/nunit-console/issues/31#issuecomment-267059306
@CharliePoole commented on Mon Apr 03 2017
@eberlid Insofar as I understand this issue is about running a single copy of the console runner using --explore option.
I can envision all sorts of problems that may come up running multiple copies of the runner at the same time and we should actually have another issue for that. Each instance of the engine starts up a server to which the agents launched by that instance report. So with n consoles and n agents running, there’s lots of room for unexpected stuff to happen, especially as this was not a use case we originally had in mind.
@tfabris commented on Mon Apr 03 2017
@CharliePoole, our repros are all running a single copy of the console runner on any given computer as far as I know. We’re launching these tests from within a single build step in Team City, and we’re not yet taking advantage of the parallelization features built into Nunit 3 yet.
However, we do have a fleet of multiple test agent computers on the network. So more than one testrun might be executing simultaneously on completely different computers. I’m assuming that by “single copy of the console runner”, you mean, a single copy on that particular computer.
@CharliePoole commented on Mon Apr 03 2017
@tfabris Yes, that’s what I meant - i misunderstood what you were saying.
@tfabris commented on Mon Apr 03 2017
Well, maybe our situation is different than @eberlid. I don’t know how MbUnit runs things.
@CharliePoole commented on Mon Apr 03 2017
@tfabris Sorry to confuse… my comment was meant for @eberlid
@JackUkleja commented on Mon Apr 03 2017
@CharliePoole OP here. Just FYI my gut feeling is that --explore option is probably not salient to this bug, but I have not experienced the issue since originally posting and have not personally seen it during normal test runs.
It is interesting this thread had a spate of repros in the last couple of weeks. Does this coincide with a new release? Or perhaps this issue is showing up more in organic search? I’m just surprised at the sudden influx of repro after 3 or 4 months of silence.
@tfabris commented on Mon Apr 03 2017
@JackUkleja, that’s probably my fault. The spate probably started when I came a-posting a couple of weeks ago. In my particular case, it was because we had been on NUnit 3.4.1 until then. We upgraded to NUnit 3.6.1 on our Team City server, and that’s when suddenly we started seeing our Team City builds getting all these “the process exited with code -100” errors. Then I did some searching and came across this thread showing the exact same stack trace, and posted about it, which woke things up again for this thread.
We run thousands of unit tests and integration tests every day, and we encounter this error a few times each day. So that should give you an idea of the frequency of the problem, though I don’t have exact numbers.
Even though that seems like a rare frequency for this crash, it has an interesting effect on us because the tests are a big part of our production chain. When the tests fail with this crash, it brings our production grinding to a halt and we have to re-run everything from the top of the chain again. So we’re keen on finding either a solution or at least a work-around.
@kditrj2d commented on Mon Apr 17 2017
Hello, I am another user who is encountering this exact same exception and stack-trace. For me the issue has been happening for a while. It was rare (once a day or so) when we were running NUnit 2.6 tests with the NUnit 3.6.1 console runner. We have just upgraded our tests to NUnit 3.6.1 and the problem has become more serious, so much so that it is blocking our CI build. We are currently running CI from within Team City. The last test assembly reported before the crash runs perfectly happily on the command-line and from within VS2015 via the R# test runner. I haven’t tried running with --inprocess yet. I don’t believe that any of our tests are currently using any of the parallelism features available in NUnit 3.
@baluMallisetty commented on Tue Apr 18 2017
Hello, Please check if there is any TCP connection being established along a NAT channel. .net has remoting issue with TCP over NAT. Use WCF if possible.
@tfabris commented on Mon May 08 2017
@CharliePoole any word on this issue? We continue to encounter it randomly among our thousands of unit and integration tests, when running Nunit 3.6.1.
It only seems to occur when we have the tests running on Team City agent servers. I don’t think any of our developers has reported it happening when they run tests on their local computers. I’m not sure if that’s due to the special command line parameters that Team City uses to launch the tests, or if it’s merely a quantity thing (our Team City servers are where the tests get run most frequently).
@tom-dudley commented on Mon May 08 2017
We’ve also been seeing the same exact message when running via Teamcity, using 3.6.0. The arguments being used are:
nunit3-console.exe foo.nunit --result=bar.nunit.xml --noheader --framework=net-4.0 --where "cat != SomeTests" --explore=baz_all.tests
We’re currently avoiding the issue by using --inprocess
@tfabris commented on Mon May 08 2017
@tom-dudley - Thanks for the tip, I will experiment with adding “--inprocess” as a temporary workaround and see if it solves the problem for us.
If I understand it correctly, this prevents parallelism for test fixtures which support parallelism? So it would only be a useful workaround for folks who don’t use the parallel features. Am I correct in that understanding?
@tfabris commented on Mon May 08 2017
@tom-dudley - I take that back. I’m not able to use “--inprocess” as a workaround because some of our test cases must be run with the --x86 parameter, and if I try to add “--inprocess” I get an error message saying that the two parameters are incompatible.
@CharliePoole - Note: Possible help to narrow this down… At least one of the places where we were getting this socketException error is specifically in our 32-bit tests which were running with the --x86 parameter. I don’t have data on whether all instances of the socketException that we encountered were “--x86” cases or not.
@tom-dudley commented on Mon May 08 2017
I managed to strike lucky and find an assembly where this is mostly reproducible. Everything in the Dispose looks to be working fine up to calling _agent.Stop() in ProcessRunner. Sticking in a call to the agent in the line before works, so the connection seems fine at this point. We then step over to the RemoteTestAgent.Stop code and execute stopSignal.Set(). After that line executes we immediately hit error handling back in ProcessRunner and bubble it up. So I wonder if this ManualResetEvent is causing us to exit prematurely before we’re actually done with the object. The termination logic for the agent is here.
I can’t see any obvious reason why this might fail, but someone else might spot something. If anyone has any other ideas for things for me to try with this assembly I’m all ears. I’ll see about putting together a minimal one if the issue continues.
One thing of note however was that if I run nunit3-console without the framework assembly in the same directory, my test assembly fails 99% of the time. But if I drop the framework assembly in the folder then everything works. Not sure what’s going on here! (I’m checked out at 3.6.1 for this work)
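The suspicion about the ManualResetEvent can be pictured with a small sketch. This is not the engine's code and it does not reproduce the SocketException: `SketchAgent` and `RaceSketch` are hypothetical names. It only shows the general shape being described: a `Stop()` call that signals an event on one thread, and a main thread that is released by that signal and is then free to leave `Main` while the caller may still be waiting for its reply.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the agent; this is not the real RemoteTestAgent.
sealed class SketchAgent
{
    private readonly ManualResetEvent stopSignal = new ManualResetEvent(false);

    // Runs on a request-handling thread. In the real system this would be a
    // remoting call whose reply still has to travel back to the caller.
    public void Stop()
    {
        stopSignal.Set();
        // From here on, this thread races with the main thread leaving Main.
    }

    // Runs on the agent's main thread; returns true once Stop() has signalled.
    public bool WaitForStop(int timeoutMs)
    {
        return stopSignal.WaitOne(timeoutMs);
    }
}

static class RaceSketch
{
    // Returns true if the stop signal was observed within the timeout.
    public static bool Run()
    {
        var agent = new SketchAgent();
        // Background threads do not keep the process alive once Main returns.
        var handler = new Thread(agent.Stop) { IsBackground = true };
        handler.Start();
        return agent.WaitForStop(5000);
    }
}
```

If the real agent behaves like this, nothing guarantees that the reply to `Stop()` is flushed before the process exits, which would be consistent with the caller seeing a forcibly closed connection. That remains a hypothesis at this point in the thread.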
@tfabris commented on Mon May 08 2017
@tom-dudley and @CharliePoole - If it helps, I did some history digging and found that the socketException has occurred randomly on both our regular 64-bit tests as well as our 32-bit tests that were run with the --x86 command line parameter. So it’s not related to that.
@rprouse commented on Mon May 08 2017
@tom-dudley I also don’t see anything obvious. Since you are able to fairly reliably reproduce this, what about wrapping the stopSignal.Set() and the stopSignal.WaitOne(timeout) in try/catch to see if it is one of these that is throwing and if so, see if you can get more info out of the exception. Maybe even do the same for the while loop in WaitForStop()?
I am hoping that we can start narrowing down exactly what is throwing the exception that seems to be killing the agent.
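A minimal sketch of that diagnostic wrapping, assuming the stop signal is a `ManualResetEvent` as described earlier in the thread. `StopDiagnostics` and its method names are hypothetical helpers, and writing to `Console.Error` is only a placeholder for whatever logging the agent really uses.

```csharp
using System;
using System.Threading;

// Hypothetical helpers for narrowing down which call throws, if any.
static class StopDiagnostics
{
    // Wraps Set() so that any exception is logged before being rethrown.
    public static void SetWithLogging(ManualResetEvent stopSignal)
    {
        try
        {
            stopSignal.Set();
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("stopSignal.Set() threw: " + ex);
            throw;
        }
    }

    // Wraps WaitOne(timeout); returns true if the event was signalled in time.
    public static bool WaitWithLogging(ManualResetEvent stopSignal, int timeoutMs)
    {
        try
        {
            return stopSignal.WaitOne(timeoutMs);
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("stopSignal.WaitOne() threw: " + ex);
            throw;
        }
    }
}
```

Both helpers rethrow with `throw;` so the original stack trace is preserved and the observable behavior stays the same apart from the extra log line.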
@rprouse commented on Mon May 08 2017
@CharliePoole and @ChrisMaddock is there any reason we haven’t moved this to the NUnit Console repo? Is it only because it has the confirm label?
@rprouse commented on Mon May 08 2017
@CharliePoole, just curious why we use a TcpChannel rather than an IpcChannel. I would think that an IpcChannel would be more reliable and faster since we never run agents on other machines. I don’t think it would fix this issue, but I am curious about the history. Was it because we intended to run agents on other machines? If so, should we rethink that until it is needed?
@tom-dudley commented on Tue May 09 2017
Turns out the RemoteTestAgent is exiting cleanly as far as it is concerned. It always hits return 0 in Main.
I believe the following code may reliably reproduce the error. I still don’t understand why this causes the exception to be thrown though!
public class Tests
{
    public static readonly string[] TestDirectory = { TestContext.CurrentContext.TestDirectory };

    [Test]
    public void MyTest([ValueSource(nameof(TestDirectory))] string directory)
    {
    }
}
@ChrisMaddock commented on Tue May 09 2017
I also have an assembly which produces this fairly reliably with --explore - but I haven’t been able to create anything reproducible. 🙁 Anything I’ve done to add logs, or try and catch the exception, seems to suppress the error - I guess by slowing the process down.
is there any reason we haven’t moved this to the NUnit Console repo?
I only haven’t moved it as we haven’t tracked down anything to actually fix yet. It’s also likely a duplicate of some of the below - but I didn’t think we could really confirm that, till we had something to test… 🙁
https://github.com/nunit/nunit-console/issues/171 https://github.com/nunit/nunit/issues/2027 https://github.com/nunit/nunit-console/issues/219
@jnm2 commented on Tue May 09 2017
I can manage to get a socket exception once in a while by repeatedly running .\nunit3-console.exe mock-assembly.dll mock-assembly.dll mock-assembly.dll [...] --process=multiple against 3.6.1:
Unhandled Exception: System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
Server stack trace:
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Runtime.Remoting.Channels.SocketStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Runtime.Remoting.Channels.SocketHandler.ReadFromSocket(Byte[] buffer, Int32 offset, Int32 count)
at System.Runtime.Remoting.Channels.SocketHandler.Read(Byte[] buffer, Int32 offset, Int32 count)
at System.Runtime.Remoting.Channels.Tcp.TcpFixedLengthReadingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.BinaryReader.ReadBytes(Int32 count)
at System.Runtime.Serialization.Formatters.Binary.SerializationHeaderRecord.Read(__BinaryParser input)
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.ReadSerializationHeaderRecord()
at System.Runtime.Serialization.Formatters.Binary.__BinaryParser.Run()
at System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage)
at System.Runtime.Remoting.Channels.CoreChannel.DeserializeBinaryResponseMessage(Stream inputStream, IMethodCallMessage reqMsg, Boolean bStrictBinding)
at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at NUnit.Engine.ITestAgent.Stop()
at NUnit.Engine.Runners.ProcessRunner.Dispose(Boolean disposing) in C:\Users\Joseph\Source\Repos\nunit-console\src\NUnitEngine\nunit.engine\Runners\ProcessRunner.cs:line 258
at NUnit.Engine.Runners.AbstractTestRunner.Dispose() in C:\Users\Joseph\Source\Repos\nunit-console\src\NUnitEngine\nunit.engine\Runners\AbstractTestRunner.cs:line 225
at NUnit.Engine.Runners.TestExecutionTask.Execute() in C:\Users\Joseph\Source\Repos\nunit-console\src\NUnitEngine\nunit.engine\Runners\TestExecutionTask.cs:line 47
at NUnit.Engine.Runners.ParallelTaskWorkerPool.ProcessTasksProc() in C:\Users\Joseph\Source\Repos\nunit-console\src\NUnitEngine\nunit.engine\Runners\ParallelTaskWorkerPool.cs:line 81
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
@jnm2 commented on Tue May 09 2017
If someone has a consistent repro, would be fun to play with that.
@tom-dudley commented on Tue May 09 2017
@jnm2 The snippet I posted above is consistent for me. I just realised when I said it didn’t show the error any more I was running the wrong assembly. Doh!
@jnm2 commented on Tue May 09 2017
@tom-dudley And is this without TeamCity?
@tom-dudley commented on Tue May 09 2017
Yep, this is creating a new project with that snippet, using nunit-console tagged at 3.6.1 and running it locally with nunit3-console.exe .\NUnitFailure\bin\Debug\NUnitFailure.dll --explore
@jnm2 commented on Tue May 09 2017
My debug and release builds at 3.6.1 don’t repro this.
Ah, repro’d with https://github.com/nunit/nunit-console/releases/download/3.6.1/NUnit.ConsoleRunner.3.6.1.nupkg. 🎉
@tom-dudley commented on Tue May 09 2017
Yes I repro’d with both that nuget package and with building nunit-console from source at the 3.6.1 tag. I’ve not tried the zip or msi on the Releases page of nunit-console.
@tom-dudley commented on Tue May 09 2017
I have an assembly which contains a solitary reference to nunit.framework and it still gives the error.
@ChrisMaddock commented on Tue May 09 2017
This repros for me too. Awesome @tom-dudley, I haven’t ever managed to break something out that did! 😄
My assembly I regularly see this with doesn’t use TestContext (It mostly pre-dates it!). It does however make extensive use of ValueSources.
@jnm2 commented on Tue May 09 2017
Also repros with:
public class Tests
{
    public static readonly string[] TestDirectory = { "" };

    static Tests()
    {
        _ = TestContext.CurrentContext;
    }

    [Test]
    public void MyTest([ValueSource(nameof(TestDirectory))] string directory)
    {
    }
}
@baluMallisetty commented on Tue May 09 2017
Hello guys,
I don’t think I should be in this conversation. I believe my email was cc’ed by accident, could you please loop me out?
I am not aware of what NUnit is either! I think I am not the right person to be included in your email! 😂
Thanks Balu
@jnm2 commented on Tue May 09 2017
@baluMallisetty We are not in control of that. You can visit this page (https://github.com/nunit/nunit/issues/1834) and click Unsubscribe in the sidebar for this thread.
@jnm2 commented on Tue May 09 2017
Also repros with:
public class Tests
{
    public static readonly string[] TestDirectory = { "" };

    static Tests()
    {
        _ = TestContext.CurrentContext;
    }

    [TestCaseSource(nameof(TestDirectory))]
    public void MyTest(string directory)
    {
    }
}
@jnm2 commented on Tue May 09 2017
My guess is that the trigger is when NUnit needs to invoke a member in order to enumerate the tests and the class containing the member has a static constructor that accesses TestContext.CurrentContext.
@jnm2 commented on Tue May 09 2017
It’s not specifically the constructor, it’s just caused by the TestContext.CurrentContext being accessed when NUnit invokes a member in order to enumerate test cases.
public class Tests
{
    public static string[] TestDirectory()
    {
        _ = TestContext.CurrentContext;
        return new[] { "" };
    }

    [TestCaseSource(nameof(TestDirectory))]
    public void MyTest(string directory)
    {
    }
}
Gotta take a break for a while.
@tom-dudley commented on Tue May 09 2017
Just to remove all usages of strings:
public class Tests
{
    public static readonly TestContext context;

    static Tests()
    {
        context = TestContext.CurrentContext;
    }

    [TestCaseSource(nameof(context))]
    public void MyTest(TestContext context)
    {
    }
}
@tom-dudley commented on Tue May 09 2017
After a bit of cowboy-debugging:
<properties>
<property name="_SKIPREASON" value="System.InvalidCastException : Unable to cast object of type 'NUnit.Framework.TestContext' to type 'System.Collections.IEnumerable'."/>
<property name="_PROVIDERSTACKTRACE" value=" at NUnit.Framework.TestCaseSourceAttribute.GetTestCaseSource(IMethodInfo method) at NUnit.Framework.TestCaseSourceAttribute.GetTestCasesFor(IMethodInfo method)"/>
</properties>
@jnm2 commented on Tue May 09 2017
@tom-dudley That’s because you did public static readonly TestContext context; instead of TestContext[]. It should be an enumerable containing one item per test case.
@tom-dudley commented on Tue May 09 2017
@jnm2 Good point! After stepping into the TestCaseSourceAttribute everything looked fine and I got some sensible XML back. Then I hit the _agent.Stop(), the agent gets to exiting its Main and we get the exception.
@jnm2 commented on Wed May 10 2017
Ah, it’s more complex.
Right now I can consistently repro if and only if the test assembly has nunit.framework.dll in the same folder and nunit3-console.exe does not have nunit.framework.dll in the same folder.
@jnm2 commented on Wed May 10 2017
If this is accurate, we will not be able to do tests on this scenario without build.cake running nunit3-console.exe from a separate folder. I do think tests on this are important.
@jnm2 commented on Wed May 10 2017
Hey, I’m getting the repro to happen intermittently in a modified build.cake which copies nunit3-console.exe to its own folder before running mock-assembly tests. This is on a fresh debug build of nunit-console.
@tom-dudley commented on Wed May 10 2017
@jnm2 Regarding the non-presence of nunit.framework.dll, I got exactly the same results as you - i.e. nunit3-console.exe not having nunit.framework.dll in the same folder causes it to repro.
@tfabris commented on Wed May 10 2017
Hm. Is it possible that the issue could be caused by a mismatched version of nunit-framework.dll?
I notice that our tests, when run on Team City, which repro the issue only randomly/intermittently, seem to have the 3.4.1 version of nunit-framework.dll on the hard disk in the folder along with the test assembly. This must be a leftover in the build system that we haven’t corrected yet. The tests get run with the 3.6.1 console runner, but there is a 3.4.1 version of nunit-framework in the folder with the tests.
I’m going to try getting our build system to update to the 3.6.1 version of the framework.
@jnm2 commented on Wed May 10 2017
I’m pretty sure the console runner acts differently depending whether the framework DLL is present, and perhaps it’s version related, but it should not cause this exception no matter what.
@jnm2 commented on Wed May 10 2017
It turns out to be pretty tricky getting this under automated testing, but it’s so worth it.
The test can’t be done by a process that references or loads nunit.framework due to the Heisenberg quality of this bug, so the test has to be written in an assembly that does not use nunit.framework. Meaning, no testing framework.
Or, if we’re willing to forgo easy debugging, it can be done out of process. That’s probably the least disruptive option.
@jnm2 commented on Wed May 10 2017
Hey guys, guess what? I have a reliable test at https://github.com/nunit/nunit-console/pull/223.
@tom-dudley commented on Wed May 10 2017
- nunit3-console with no framework in the folder. Test linked against 3.6 - Exception
- nunit3-console with framework 3.6 in the folder. Test linked against 3.6 - No Exception
- nunit3-console with framework 3.7 in the folder. Test linked against 3.6 - Exception
- nunit3-console with no framework in the folder. Test linked against 3.7 - Exception
- nunit3-console with framework 3.6 in the folder. Test linked against 3.7 - Exception
- nunit3-console with assembly 3.7 in the folder. Test linked against 3.7 - No Exception
And by ‘3.7 in the folder’, I just mean changing the assembly version number. All exceptions were repros of the original issue.
@tom-dudley commented on Wed May 10 2017
@jnm2 The tests do a pretty good job at crashing my VS test runner, so looks like they’re definitely reliable!
@jnm2 commented on Wed May 10 2017
Have to take a break again, anyone is welcome to help find a fix if they so desire. I’ll start looking at a fix when I’m back.
@jnm2 commented on Wed May 10 2017
@tom-dudley Nice! That’s valuable information that should probably be tested against too.
@tfabris commented on Wed May 10 2017
This is amazing work you are all doing. Awesome. In the meantime, my boss has updated our build system so that the framework and the console runner match at version 3.6.1, so we’ll have additional data about whether it might be related to the version mismatch.
@CharliePoole commented on Wed May 10 2017
I’m joining in late, so may be missing something. However…
It’s absolutely certain that the framework you are referencing has to be either in the same folder as your test assembly or somewhere else (like on the probing path) where it can be found. Consequently, there can’t be another version of the assembly there, since all versions have the same file name. The framework is an assembly that you - not the runner - reference, so it’s up to you to get it there.
In this context “up to you” includes any third party software you are using, like TeamCity or even VS. 😄
OTOH, there is no requirement for the version of the runner to match that of the framework. The runner doesn’t reference the framework. Of course, if you are running the nunit3-console runner, your framework had better be >= 3.0 OR you had better have the V2 framework driver extension. That doesn’t sound like your problem here, so I only mention it for completeness. There can also be a loss of function if the runner is older than the framework - that is, the framework may have features that the runner doesn’t understand - but things should still run.
@jnm2 Re Runner acting differently if the framework isn’t there… Yes… if the framework is not located with the test assembly that references it, we never find it to load and crash. That may not give the friendliest message and so should be checked.
Guys… why is this an nunit bug rather than a console/engine bug?
@tfabris commented on Mon May 15 2017
I have confirmed that the issue still occurs on our build/CI system, intermittently, even when the framework version and the console runner version are both 3.6.1, and the framework DLL is present in the folder with the test assembly DLL.
@jnm2 commented on Thu May 18 2017
Guys… why is this an nunit bug rather than a console/engine bug?
It was never moved. @rprouse asked the same above.
Issue Analytics
- Created 6 years ago
- Comments: 80 (25 by maintainers)
@asbjornu unfortunately, StackOverflowException is one of the exceptions that cannot be caught. Because of that, the agent just crashes and doesn’t get a chance to report back to the console runner why it crashed. There are several other exceptions like this.
I have been suspicious that many of these errors that users report are actually uncaught exceptions in their test code and have nothing to do with NUnit. It could be that this issue is actually fixed in NUnit.
One possible solution would be to catch the SocketException in the console and report a more friendly error message like “The test agent has crashed. This is often because the code being tested threw an exception that cannot be caught and crashed the agent. To diagnose the problem, we recommend running your tests with the --inprocess command line option”.
@rprouse - You said “I have been suspicious that many of these errors that users report are actually uncaught exceptions in their test code”.
This idea is inconsistent with the behavior that I have seen related to the socket exception bug, which is:
Catching the socket exception and throwing a more friendly error message would merely cloud the issue.