sdn-controller crash with a high number of services and SCK

What happened:

We deployed SCK 1.4 into our OpenShift 4.x cluster. On an empty cluster, or one with few pods running, it appears to work fine. But as we increase the number of pods and services, SCK starts to affect overall cluster health, to the point of total cluster instability. Specifically, it overwhelms the OpenShift API, which causes the sdn-controller to crash and restart almost constantly. We have even had worker nodes go down because of this. We were able to tie the problem directly to SCK, and it occurred on several 4.4.x and 4.5.x versions of OpenShift. Since we undeployed SCK, the cluster has been entirely stable without a single hiccup.

What you expected to happen: The sdn-controller shouldn’t crash and the cluster shouldn’t have any problems running pods.

How to reproduce it (as minimally and precisely as possible): Deploy SCK in its own namespace on a cluster that has at least 70 pods/services running.
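
To check whether a test cluster is large enough to hit the reproduction condition above, the pod and service counts can be tallied from the JSON that `oc get pods,services --all-namespaces -o json` prints. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes "70 pods/services" means that either count reaches 70.

```python
import json
from collections import Counter

# Threshold taken from the report ("at least 70 pods/services").
THRESHOLD = 70


def count_kinds(list_json: str) -> Counter:
    """Count objects by kind in a Kubernetes List document.

    Expects the JSON text printed by
    `oc get pods,services --all-namespaces -o json`.
    """
    doc = json.loads(list_json)
    return Counter(item.get("kind", "Unknown") for item in doc.get("items", []))


def meets_threshold(counts: Counter, threshold: int = THRESHOLD) -> bool:
    # Assumption: the condition holds if either pods or services reach the threshold.
    return counts["Pod"] >= threshold or counts["Service"] >= threshold
```

Save the command output to a file, pass its contents to `count_kinds`, and then pass the result to `meets_threshold`. If the reporter meant a combined total instead, compare `counts["Pod"] + counts["Service"]` against the threshold.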

Anything else we need to know?:

Environment:

  • oc version: Client Version: 4.3.1 Server Version: 4.5.3 Kubernetes Version: v1.18.3+3107688

  • OS (e.g: cat /etc/os-release): Red Hat Enterprise Linux CoreOS 45.82.202007171855-0

  • Splunk version: 7.0.0

  • Others:

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

matthewmodestino commented, Jul 31, 2020 (1 reaction)

@fshadid96 It was a bug in the metadata filter. I know you had a bad first experience, but we have had no issues since patching it, and I don’t expect you to have to worry once you update.

rockb1017 commented, Aug 10, 2020 (0 reactions)

Closing. Thank you!
