Stack events discarding and snoozing
One of the biggest issues we currently have with Exceptionless is a rogue event going off on a user's side and eating up their entire monthly plan limit. In the past, we said that users should fix this on their side and gave them no tools to deal with it on the Exceptionless side. This results in people's accounts getting throttled and diminishes the value we provide to those users. One of the big reasons we took that stance is that we pay for the bandwidth and computing power required to receive those events and then discard them. We have decided to do the right thing for our users and eat that cost by allowing stacks of events to be marked as discarded. Any incoming events that match a discarded stack will be thrown away and will not count against your plan.
In order to do this, we are going to refactor stacks to add a new `Status` field and get rid of the `IsHidden` and `IsRegressed` flags as well. Status will be one of `open`, `discarded`, `fixed`, `regressed`, `snoozed`, or `ignored`.
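As a rough illustration of replacing the boolean flags with a single status field, here is a minimal Python sketch. The class and function names are hypothetical and are not taken from the Exceptionless codebase, and the case-insensitive parsing is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class StackStatus(str, Enum):
    """The six statuses proposed in this issue."""
    OPEN = "open"
    DISCARDED = "discarded"
    FIXED = "fixed"
    REGRESSED = "regressed"
    SNOOZED = "snoozed"
    IGNORED = "ignored"


@dataclass
class Stack:
    """Hypothetical stack model: one status field instead of several flags."""
    id: str
    status: StackStatus = StackStatus.OPEN
    snoozed_until_utc: Optional[datetime] = None


def parse_status(value: str) -> StackStatus:
    """Parse the value part of a filter term such as status:discarded.

    Case-insensitive matching is an assumption made for this sketch.
    Raises ValueError for anything that is not one of the six statuses.
    """
    return StackStatus(value.strip().lower())
```

A single enum value makes invalid combinations (for example, a stack that is both hidden and regressed) impossible to represent.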
Snoozing would be a new feature that lets you mark a stack as snoozed for a set amount of time. During that time, the stack would be filtered out of the normal stack lists and you would not receive any notifications for events on that stack. The ignored status would work the same way, except that it would be permanent.
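The snooze behavior described above could be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the actual implementation: the `Stack` class and all function names are hypothetical, and treating a snooze as expired when the timestamp is equal to the current time is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

# Statuses for which no notifications are sent, per the Server list in this issue.
QUIET_STATUSES = {"fixed", "discarded", "ignored", "snoozed"}


@dataclass
class Stack:
    id: str
    status: str = "open"
    snoozed_until_utc: Optional[datetime] = None


def snooze(stack: Stack, duration: timedelta, now: datetime) -> None:
    """Mark a stack as snoozed until now + duration."""
    stack.status = "snoozed"
    stack.snoozed_until_utc = now + duration


def should_notify(stack: Stack) -> bool:
    """Notifications are suppressed for fixed, discarded, ignored and snoozed stacks."""
    return stack.status not in QUIET_STATUSES


def reopen_expired_snoozes(stacks: Iterable[Stack], now: datetime) -> List[str]:
    """Job body: set snoozed stacks whose snooze time has passed back to open.

    Returns the ids of the stacks that were reopened.
    """
    reopened = []
    for stack in stacks:
        expired = stack.snoozed_until_utc is not None and stack.snoozed_until_utc <= now
        if stack.status == "snoozed" and expired:
            stack.status = "open"
            stack.snoozed_until_utc = None
            reopened.append(stack.id)
    return reopened
```

An ignored stack would take the same path through `should_notify`, but no job would ever reopen it.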
In addition, for a long time we have wanted to make events immutable. We currently bulk update events to mark them as hidden when their corresponding stack is marked as hidden. Bulk updating millions of events is really expensive, especially because Elasticsearch is rather slow at updating documents. We are going to take the stance that events are immutable and are basically just a log message. They can be searched and filtered, but marking a stack as fixed has no effect on its events. You would not be able to filter events based on fixed status. Making events immutable, as well as making their associated daily indexes read-only after a couple of days (events that happened more than a few days in the past could no longer be added), would allow us to optimize event index storage as well.
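To make the idea of immutable events and read-only daily indexes concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `Event` fields, the `events-YYYY.MM.DD` index naming, and the three-day cutoff are hypothetical and are not specified in this issue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical cutoff; the issue only says "a couple of days".
MAX_EVENT_AGE = timedelta(days=3)


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    id: str
    stack_id: str
    date_utc: datetime
    message: str


def daily_index_name(event: Event) -> str:
    """Name of the daily index an event belongs to (hypothetical naming scheme)."""
    return f"events-{event.date_utc:%Y.%m.%d}"


def can_accept(event: Event, now: datetime) -> bool:
    """Reject events old enough that their daily index would already be read-only."""
    return now - event.date_utc <= MAX_EVENT_AGE
```

Because an event never changes after it is written, stack-level state such as fixed or discarded lives only on the stack and is never copied onto its events.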
Server
- Add new `Status` and `SnoozedUntilUtc` properties to `Stack`, make sure that status and snoozed are indexed so that queries like `status:discarded` work.
- Remove `IsHidden`, `IsRegressed` and `DisableNotifications` from `Stack`, remove from index settings
- Remove `IsFixed`, `IsHidden` and `IsDeleted` from `PersistedEvent`, remove from index settings
- Remove all code that is updating the `PersistedEvent` `IsFixed` and `IsHidden` properties
- Make sure we do not send any notifications for stacks with status of `fixed`, `discarded`, `ignored` or `snoozed`
- Make sure that we don’t break existing webhooks by adding pseudo fields for is_fixed and is_hidden that are derived from status
- Change spam detection (user agents and massive numbers of events from a single IP) to discard the events instead of hiding them
- Make sure that events are completely immutable (only thing being updated are session start events for now)
- Discard any events coming in where the stack is marked as discarded. Do not count these against the organization’s plan limits, but increment a counter so we know how many events are being discarded
- `/change-status` should clear `snoozeUntilUtc`
- Figure out why `status:Open` is treated as a premium feature usage
- Figure out why no stacks are being returned
- Move all event deletes into a queue so that we can pause delete processing during migrations as they will no longer support soft deletes
- Add a job that looks for stacks where the snoozed until date is past and changes status back to open
- Update email templates to change disable notifications option to actions for discard and snooze
- Create migration to add new mappings to stack index and set stack status
- If stack filter has `status:open`, then negate the stack query to get a smaller list of stack ids
- If stack filter hits 10k limit, then negate the stack query to see if it returns a smaller list of stack ids
- If stack filter still hits 10k limit, then make sure a message is shown to the user telling them to narrow their filter criteria
- Consider increasing the 10k ES terms limit
- Job to delete stacks that have a last occurrence date older than the organization’s retention period
- Delete stack is just soft delete
- Delete all project data is just marking all stacks as deleted
- Job to clean up deleted stacks / events after some period of time (definitely should clean up stacks, might not be worth the processing power to delete events in the middle of indexes and cause a lot of index fragmentation)
- Verify summary emails are still working
- Add Event Visitor to remove any queries containing `status:`. Currently filters are applied across all stack and event requests. If the filter contains `status` then nothing is returned.
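The discard item in the list above (throw away events for discarded stacks, do not count them against the plan, but keep a counter) could look roughly like the following Python sketch. The `Organization` fields, the `process_event` function, and the treatment of unknown stack ids as open are assumptions made for illustration, not the actual Exceptionless pipeline.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Organization:
    id: str
    monthly_limit: int
    events_counted: int = 0    # events that count against the plan
    events_discarded: int = 0  # discarded events, tracked for visibility only


def process_event(org: Organization, stack_status_by_id: Dict[str, str], stack_id: str) -> str:
    """Decide what happens to an incoming event that belongs to stack_id.

    Returns "discarded", "throttled" or "accepted".
    """
    # Assumption: an unknown stack id means a brand new stack, which starts as open.
    status = stack_status_by_id.get(stack_id, "open")

    if status == "discarded":
        # Thrown away before plan accounting, so it never uses up the limit.
        org.events_discarded += 1
        return "discarded"

    if org.events_counted >= org.monthly_limit:
        return "throttled"

    org.events_counted += 1
    return "accepted"
```

The important design point is the ordering: the discard check runs before the plan limit check, so a rogue event on a discarded stack can no longer cause throttling.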
UI
- Update default stack filter to exclude `fixed`, `discarded`, `ignored` or `snoozed` statuses
- Update UI Stack grid bulk actions to allow Mark Discarded, Mark Not Discarded, Mark Ignored, Mark Not Ignored, Mark Fixed, Mark Not Fixed, Delete.
- Add ignore action to UI
- Update UI to remove disable notifications actions on stacks
- Update UI to change hide stack action to discard action
- Update UI to add snooze action. Modal will be shown that asks how long to snooze for. Changes status to snoozed and adds a date for when the stack should be snoozed until
- Remove Most Recent view
- Remove Most Frequent section from Timeline.
- Change Most Recent section to be called Events. Make it show 20 items instead of 10.
- Move new Timeline view to the bottom of the list of views in the nav.
- Rename Dashboard to Timeline
- Add Redirects for changed routes
- Update UI Translations
- Fix Sentinel: Failed connecting to configured master for service: exceptionless errors
- Surface error message when hitting the 10k terms limit so users know they need to narrow their filter
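The 10k terms limit handling mentioned in both the Server and UI lists could be sketched as follows: use the matching stack ids when they fit, fall back to the negated (non-matching) list when that is smaller, and surface an error when neither fits. This is a hypothetical in-memory Python sketch; the names are invented for illustration, and a real implementation would run these as stack queries rather than filtering a set.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

TERMS_LIMIT = 10_000  # the "10k limit" referred to in this issue


class TooManyStacksError(Exception):
    """Raised when neither the matching nor the negated id list fits the limit."""


@dataclass
class StackIdFilter:
    stack_ids: List[str]
    exclude: bool  # False: events IN stack_ids; True: events NOT IN stack_ids


def build_stack_id_filter(
    all_ids: Set[str],
    matches: Callable[[str], bool],
    limit: int = TERMS_LIMIT,
) -> StackIdFilter:
    """Pick an include list or an exclude list of stack ids for the event query."""
    matching = sorted(i for i in all_ids if matches(i))
    if len(matching) <= limit:
        return StackIdFilter(matching, exclude=False)

    # Negate the stack query: list the stacks that do NOT match instead.
    non_matching = sorted(all_ids - set(matching))
    if len(non_matching) <= limit:
        return StackIdFilter(non_matching, exclude=True)

    raise TooManyStacksError(
        "Too many stacks match this filter. Please narrow your filter criteria."
    )
```

The message carried by `TooManyStacksError` is the kind of text the UI would surface to the user.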
Other
- Update search documentation
Top GitHub Comments
@niemyjski UI looks good other than /dashboard not getting redirected. I think it makes sense. I think there are other things we can do to improve things, but I think this is good for now.
Just added search documentation.