Telemetry: Report if command succeeded or failed

Use case description

It’s valuable to know whether a given command succeeded or failed. Currently, the payload we send does not include this data.

Proposed solution

To be able to report success or failure, we need to store the command information once the command has finished processing, rather than before that point as happens now. Switching that order may slightly prolong the command with which telemetry is sent. Still, we currently send it only with the deploy command, which already takes a while because it involves a full CloudFormation deployment, so this shouldn’t affect the Framework DX.

Add an outcome property to the payload, set to either success or failure.
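A minimal sketch of that ordering, assuming hypothetical stand-ins for the telemetry helpers (generatePayload, storeLocally and runWithTelemetry are illustrative names, not the Framework's actual API). It only shows that the payload is stored after the command has been processed, so that outcome is known by then:

```javascript
'use strict';

// Hypothetical stand-ins for the real telemetry helpers (names are assumptions)
const storedPayloads = [];
const generatePayload = async (commandName) => ({ command: commandName });
const storeLocally = (payload) => storedPayloads.push(payload);

// Runs a command and records telemetry only after processing has finished,
// so the outcome is known by the time the payload is stored
async function runWithTelemetry(commandName, runCommand) {
  let outcome = 'success';
  try {
    await runCommand();
  } catch (error) {
    outcome = 'failure';
    throw error; // The error still propagates to the regular error handling
  } finally {
    storeLocally({ ...(await generatePayload(commandName)), outcome });
  }
}
```

In the plan described further down, the failure case is reported from the error handler rather than from a single wrapper like this one; the sketch merges both paths only to keep the ordering visible.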

When a failure happens, report additional data in a failureReason object property, which should have the following properties:

  • kind - Kind of error, either user (error on the user side) or programmer (error on our or a plugin’s side)
  • code - The error.code property (should be set for all user errors, may be set for some programmer errors)
  • location - The first few lines of the stack trace, excluding the error message (the message may contain user-created data, which we don’t want to send). I think we should send it only for programmer errors; for user errors, code should give us all the information we need
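To make the shape concrete, here is a rough sketch of how such a failureReason object could be assembled. The function name buildFailureReason, its options argument and the BUCKET_NOT_FOUND code are hypothetical; how isUserError and location get resolved is left to the caller. It follows the suggestion above of sending location only for programmer errors:

```javascript
'use strict';

// Builds the proposed `failureReason` object.
// `isUserError` and `location` are supplied by the caller (assumption of this sketch)
function buildFailureReason(error, { isUserError, location }) {
  const failureReason = { kind: isUserError ? 'user' : 'programmer' };
  if (error.code) failureReason.code = error.code;
  // The error message is never included, as it may contain user-created data
  if (!isUserError && location) failureReason.location = location;
  return failureReason;
}

// Hypothetical user error; the code value is made up for illustration
const userError = Object.assign(new Error('Bucket "my-secret" not found'), {
  code: 'BUCKET_NOT_FOUND',
});
console.log(buildFailureReason(userError, { isUserError: true }));
```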

How to implement it?

Let’s address each top-level point with an individual PR:

  1. Ensure that every ServerlessError construction, both in this repository and in @serverless/enterprise-plugin, provides a meaningful error code as the second argument. Also verify that the codes currently in use are good.
  2. Move the logic that stores command telemetry, and eventually sends it, so that it runs after the command is processed and outside of the Serverless instance realm (we need this to be able to properly track, in the future, e.g. an interactive setup that may trigger multiple commands internally, such as create then deploy).
    • Add isTelemetryReportedExternally: true to the Serverless constructor options, and assign it to serverless.isTelemetryReportedExternally in the constructor logic.

    • In PluginManager.js, apply telemetry handling (as it’s done currently) only if !serverless.isTelemetryReportedExternally (this handles the case where we fall back to a local version from a global version that is older).

    • Move the telemetry handling so it runs right after we either show help or run the command in scripts/serverless.js.

    • Optimize telemetry handling: do not attempt to generate the payload or call any telemetry-related command if telemetry is disabled (currently we handle that setting in storeLocally, to which an already generated payload is sent; since payload generation involves some async operations, it would be nice to avoid that work as well).

    • Introduce a hack which ensures that an eventual locally resolved serverless instance (which could come from an older version of the Framework) does not store or send a telemetry report on its own. For that, add the following:

      require('../lib/utils/telemetry/areDisabled'); // Ensure value is resolved
      process.env.SLS_TRACKING_DISABLED = '1';

      right before serverless.init(). Also add there a comment explaining the reasoning, plus a TODO to remove the hack with the next major (as with it, the fallback to the local version will happen right at the beginning of process handling).

  3. Make the serverless argument optional in lib/utils/telemetry/generatePayload.js:
    • Resolve the command by calling lib/cli/resolve-input rather than reading serverless.processedInput (note that the interactive CLI command will now be resolved as '', and I think it’s best to leave it that way)
    • If no serverless is passed, we can assume serverless._isInvokedByGlobalInstallation is false
    • If no serverless is passed, resolve isLocallyInstalled in the same manner as in lib/cli/handle-error.js
    • Assume no service context if no serverless is passed
  4. Refactor lib/cli/handle-error.js so that, instead of taking isLocallyInstalled and isInvokedByGlobalInstallation options, it takes a serverless instance as an option
  5. Report whether the command was successful, and report additional error information:
    • In scripts/serverless.js:
      • Add an outcome: "success" property to the generated telemetry payload
      • Introduce a top-level local hasTelemetryBeenReported variable. Pass it as an option of the same name to handleError, and set it to true right before the telemetry reporting logic
    • In lib/cli/handle-error.js, at the end of execution, if telemetry is enabled and !options.hasTelemetryBeenReported, generate the payload, store it, and send it (unconditionally in that case).
      • Let’s ignore the server response, and not show any proposed notification.
      • Add an outcome: "failure" property to the payload
      • Add a failureReason object property to the payload, with the following data:
        • kind - If isUserError, report as user, otherwise as programmer
        • code - Report error.code if it’s found
        • location (only if !isUserError || !error.code) - Assign a truncated stack trace. How to resolve it (?) As a starting point, take the logic in this gist: https://gist.github.com/medikoo/a5c5223d69cf7cd80c0a5039cd4ee1ea (it’s a method I used to identify errors in an error reporter I once implemented for another project, and it seemed to work well). One caveat is that it was used for errors that happen in lambdas, so reporting full file paths from stack traces was harmless. Here we should rather truncate the location at some point. Probably the simplest approach would be to resolve the common path for all file paths in the taken lines, and strip it from each line. Note that there can be an error without any stack lines (e.g. Node.js fs tends to throw such); in that case we can safely report the full stack trace, as I don’t remember those errors reporting any user data in their messages.
      • In case of an uncaught exception, postpone the crash until the telemetry request succeeds
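The "common path" truncation suggested for location could look roughly like the sketch below. This is not the logic from the linked gist; resolveErrorLocation and its maxLines parameter are hypothetical names, and the sketch assumes V8-style stack frames with POSIX absolute paths:

```javascript
'use strict';

const path = require('path');

// Matches an absolute POSIX file path in a V8 frame such as
// "at fn (/a/b.js:1:2)" or "at /a/b.js:1:2" (assumption: no Windows paths)
const frameFileRe = /(?<=^at |\()(\/[^():]+)(?=:\d+:\d+\)?$)/;

function resolveErrorLocation(stack, maxLines = 5) {
  // Keep only frame lines; this drops the message, which may hold user data
  const frames = stack
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('at '))
    .slice(0, maxLines);
  // Per the note above, a stack with no frame lines is reported in full
  if (!frames.length) return stack;

  const filePaths = frames.map((frame) => (frame.match(frameFileRe) || [])[1]).filter(Boolean);
  if (!filePaths.length) return frames.join('\n');

  // Walk up from the first file's directory until every path shares it
  let commonDir = path.posix.dirname(filePaths[0]);
  while (commonDir !== '/' && !filePaths.every((p) => p.startsWith(`${commonDir}/`))) {
    commonDir = path.posix.dirname(commonDir);
  }
  const prefix = commonDir === '/' ? '/' : `${commonDir}/`;

  // Strip the shared prefix from each frame's file path
  return frames
    .map((frame) => frame.replace(frameFileRe, (filePath) => filePath.slice(prefix.length)))
    .join('\n');
}
```

With this approach a frame such as "at deploy (/home/jane/project/lib/plugins/deploy.js:10:5)" keeps only the part of the path that differs between frames. Frames that carry no absolute file path (for example Node.js internals) are left untouched.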

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

medikoo commented, Apr 22, 2021

If we move the logic to scripts and we have a situation where global version is older

That’s a very good point. Previously I proposed a patch to handle the opposite case (global is new, but local is older), but this one was not envisioned. I’ve just updated the spec, where I propose passing an isTelemetryReportedExternally option to the constructor to handle this case.

pgrzesik commented, Apr 22, 2021

Sounds good to me 👍
