Discovery has not completed after a few weeks


If your issue relates to the Discovery Process, please first follow the steps described in the implementation guide, "Debugging the Discovery Component".


Describe the bug

The discovery process has not completed after a few weeks of being deployed.

We deployed this into our production account and enabled AWS Config at the same time (we had previously disabled it for cost reasons). We are attempting to discover the resources within a single region (not us-east-1), in the same account where Perspective is deployed.

Config has only been enabled in the region we are trying to discover. We have not created an aggregator, as we are only discovering a single account right now. The account shows 14,286 resources, but Perspective shows only 991. That number is slowly increasing, by a few per day, but the vast majority of our systems have not been mapped.
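To put the gap in perspective, here is a rough back-of-the-envelope sketch in Python. The two resource counts come from the report above. The daily rate of 5 is an assumption standing in for "a few per day", and the function names are illustrative only.

```python
# Figures reported in this issue.
TOTAL_RESOURCES = 14_286   # resources the account shows
DISCOVERED = 991           # resources Perspective shows

# ASSUMPTION: "a few per day" is taken as 5 per day, purely for illustration.
ASSUMED_DAILY_RATE = 5


def coverage(discovered: int, total: int) -> float:
    """Fraction of the account's resources that have been discovered."""
    return discovered / total


def days_remaining(discovered: int, total: int, per_day: float) -> float:
    """Days needed to discover the rest at a constant rate."""
    return (total - discovered) / per_day


if __name__ == "__main__":
    print(f"coverage: {coverage(DISCOVERED, TOTAL_RESOURCES):.1%}")
    print(f"days remaining: {days_remaining(DISCOVERED, TOTAL_RESOURCES, ASSUMED_DAILY_RATE):.0f}")
```

Under that assumed rate, roughly 7% of resources are mapped and the remainder would take years, which suggests the discovery is stalled rather than merely slow.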

I have walked through the debugging of the discovery process. The CloudWatch monitoring of the GremlinFunction shows no errors. I tried searching the logs for “400” and “500”, but our account ID contains both of those so every log line showed up. I did a search for “Exception” and can see a few errors:

{
    "detailedMessage": "The traversal has tried to use a null or non-existent value in the step: [GraphStep(vertex,[46338a5d8e02c6828364c145c3ce003c])]",
    "code": "IllegalArgumentException",
    "requestId": "eb848188-2216-4709-8792-fdc82feaa988"
}

{
    "detailedMessage": "Vertex with id already exists: ",
    "code": "ConstraintViolationException",
    "requestId": "aa027916-a3c2-45f8-9bf7-5773f3a1fc48"
}

In the past week we have seen 6 of the first error and 5 of the second, with no exceptions in the last 24 hours.
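The log search for "400" and "500" mentioned above matched every line because the account ID contains those digits. One possible workaround, sketched here in Python on exported log lines, is to match the codes only when they are not part of a longer digit run. The sample lines and the account ID `123400500789` are made up for illustration and do not come from the actual logs.

```python
import re

# Match 400 or 500 only when not embedded in a longer run of digits,
# so a 12-digit account ID that happens to contain them is ignored.
STATUS_RE = re.compile(r"(?<!\d)(?:400|500)(?!\d)")


def lines_with_status(lines):
    """Return the log lines that mention a standalone 400 or 500."""
    return [line for line in lines if STATUS_RE.search(line)]


if __name__ == "__main__":
    # Hypothetical log lines; the account ID is invented.
    sample = [
        "account 123400500789 request completed with status 200",
        "account 123400500789 request failed with status 500",
        "statusCode=400 for account 123400500789",
    ]
    for line in lines_with_status(sample):
        print(line)
```

Only the second and third sample lines are printed. The first is skipped even though its account ID contains both "400" and "500".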

I also looked at the logs of the ECS container task and can see a few errors there from the past week as well:

{
    "message": "Error Message: You have specified a resource that is either unknown or has not been discovered.",
    "level": "error"
}
{
    "message": "CanImportRun Error:",
    "level": "error"
}

There are also out-of-memory (OOM) messages from V8:

<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 0x13162b9]
Security context: 0x0367c4840911 <JSObject>
1: copy [0x1294d9f4fe21] [/code/node_modules/ramda/src/internal/_clone.js:~21] [pc=0x1ea30af92bb1](this=0x31c303c84d49 <JSGlobal Object>,0x1294d9f4fe61 <JSArray[0]>)
2: copy [0x20ee9326fba1] [/code/node_modules/ramda/src/internal/_clone.js:~21] [pc=0x1ea30af91da7](this=0x31c303c84d49 <JSGlobal Object>,0x20ee9326fbe1 <JSArray[34]>)
3: copy [0x20ee9326f...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Failed to open Node.js report file: report.20210722.083713.1.0.001.json (errno: 13)
1: 0x9aedf0 node::Abort() [node]
2: 0x9aff86 node::OnFatalError(char const*, char const*) [node]
3: 0xb078ce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xb07c49 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xce4ae5  [node]
6: 0xcf032b v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
7: 0xcf1047 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
8: 0xcf3b78 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
9: 0xcbd487 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType) [node]
10: 0xf94048 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [node]
11: 0x13162b9  [node]
 


One thing I did notice that felt a bit strange was that there were ~50 tasks running at a time. I’m not sure if this is expected or if it’s because the tasks are not completing in time.
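One way to reason about whether ~50 concurrent tasks points to tasks not finishing in time: if a new task is launched on a fixed schedule and nothing prevents overlap, the steady-state number of running tasks is roughly the task runtime divided by the launch interval. The Python sketch below inverts that relationship. The 15-minute interval is an assumption for illustration, not a value confirmed in this issue, and the function name is hypothetical.

```python
# ASSUMPTION: a new discovery task is launched every 15 minutes.
# This interval is illustrative and is not stated in the issue.
ASSUMED_INTERVAL_MINUTES = 15


def implied_runtime_minutes(concurrent_tasks: int, launch_interval_minutes: float) -> float:
    """Approximate per-task runtime implied by a steady number of overlapping tasks."""
    return concurrent_tasks * launch_interval_minutes


if __name__ == "__main__":
    minutes = implied_runtime_minutes(50, ASSUMED_INTERVAL_MINUTES)
    print(f"implied runtime: {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```

Under that assumption, 50 overlapping tasks would imply each task stays alive for about 12.5 hours, which would be consistent with tasks hanging or being restarted rather than completing normally.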

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
RyanFrench commented, Jul 30, 2021

Great, thanks! I look forward to the next release

0 reactions
EricSEkong commented, Jan 21, 2022

Here are some logs from the task image
