[SIP-91] - Enable SSH Tunneling on Database Connections


Motivation

Users currently cannot set up SSH tunneling entirely within Superset. As a result, we are losing potential users who would otherwise adopt Superset as their analytics tool.

Proposed Change

Describe how the feature will be implemented, or the problem will be solved. If possible, include mocks, screenshots, or screencasts (even if from different tools).

  1. Wrap @contextmanager around get_sqla_engine()
    1. @contextmanager logic for blocking localhost
      1. Does the DB have SSH tunnel metadata associated with it (stored in a separate table or column)? If so, create the tunnel bound to 127.0.0.1 and a random port, then replace the host and port in the SQLAlchemy URI with 127.0.0.1 and the port that the SSH tunnel opened.
      2. If the DB has no SSH tunnel metadata, check the host of the SQLAlchemy URI. If it's localhost or some variation (127.0.0.1, ::1, 0:0:0:0:0:0:0:1, etc.), block it unless the config allows it. Also check for host names that resolve to localhost (using a library). There will be a feature flag that allows users to override this.
    2. We'll be able to grab local_bind_port from the server object that the sshtunnel package returns in the context manager
    3. We are blocking localhost due to security concerns about users who may try to access other ports within a given deployment/instance. Specifically for Preset, we want to make sure users cannot access other users' DBs without proper credentials
from contextlib import contextmanager

@contextmanager
def get_sqla_engine_with_context():
    # set up SSH
    #   check whether an SSH tunnel is enabled for this database
    #     true  -> create the tunnel with the stored credentials
    #     false -> verify that the user isn't connecting to localhost (under config flag)

    yield engine

    # tear down SSH
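The localhost check described in the list above could be sketched as follows. This is a minimal illustration, not Superset's implementation: the names `is_loopback_host`, `check_host_allowed`, and the `allow_localhost` flag are hypothetical, and only the standard-library `ipaddress` and `socket` modules are used.

```python
import ipaddress
import socket


def is_loopback_host(host: str) -> bool:
    """Return True if `host` is a loopback address or resolves to one."""
    # Strip the brackets used for IPv6 literals in URIs, e.g. "[::1]".
    candidate = host.strip("[]")
    try:
        # Literal IPs (127.0.0.1, ::1, 0:0:0:0:0:0:0:1) are checked directly.
        return ipaddress.ip_address(candidate).is_loopback
    except ValueError:
        pass  # Not an IP literal, so treat it as a host name.
    try:
        infos = socket.getaddrinfo(candidate, None)
    except socket.gaierror:
        return False  # An unresolvable name cannot point at loopback.
    # Each entry's sockaddr starts with the resolved IP; drop any "%zone" suffix.
    return any(
        ipaddress.ip_address(info[4][0].split("%")[0]).is_loopback
        for info in infos
    )


def check_host_allowed(host: str, allow_localhost: bool = False) -> None:
    """Raise ValueError for loopback hosts unless the (hypothetical) flag allows them."""
    if not allow_localhost and is_loopback_host(host):
        raise ValueError(f"Connections to {host!r} are blocked")
```

Resolving the host name, rather than comparing strings, is what catches aliases that point at the loopback interface.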
  1. Refactor every place that calls create_engine to use the new with format
with get_sqla_engine() as engine:
     # use engine with ssh tunnel created
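To make the setup, yield, teardown flow concrete, here is a small runnable sketch. It is not the proposed Superset code: `FakeTunnel` is a stand-in so the example runs without an SSH server, `connection_uri` and `point_uri_at_tunnel` are hypothetical names, and the URI is rewritten with the standard library instead of SQLAlchemy. It yields a URI where the real context manager would yield an engine.

```python
from contextlib import contextmanager
from typing import Iterator, Optional
from urllib.parse import urlsplit, urlunsplit


class FakeTunnel:
    """Stand-in for an SSH tunnel; only mimics start/stop and a local port."""

    def __init__(self, local_bind_port: int) -> None:
        self.local_bind_port = local_bind_port
        self.is_active = False

    def start(self) -> None:
        self.is_active = True

    def stop(self) -> None:
        self.is_active = False


def point_uri_at_tunnel(uri: str, port: int) -> str:
    """Replace the host and port of `uri` with 127.0.0.1 and the tunnel's port."""
    parts = urlsplit(uri)
    # Keep "user:password@" if present; only the host:port part changes.
    userinfo, at, _ = parts.netloc.rpartition("@")
    return urlunsplit(parts._replace(netloc=f"{userinfo}{at}127.0.0.1:{port}"))


@contextmanager
def connection_uri(uri: str, tunnel: Optional[FakeTunnel] = None) -> Iterator[str]:
    """Yield the URI to connect with, opening and closing the tunnel around it."""
    if tunnel is None:
        yield uri  # No tunnel metadata: connect directly.
        return
    tunnel.start()
    try:
        yield point_uri_at_tunnel(uri, tunnel.local_bind_port)
    finally:
        tunnel.stop()  # Tear down even if the caller raised.


with connection_uri("postgresql://user:pw@db.internal:5432/analytics", FakeTunnel(40001)) as uri:
    print(uri)
```

Putting the teardown in a `finally` block means the tunnel is closed even when the code inside the `with` body raises.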
  1. Define the schema that the client needs in order to open an SSH tunnel to a remote host

class SSHTunnelCredentials(Schema):
    database_id: int

    server_host: str
    server_port: int
    username: str
    password: Optional[str]
    private_key: Optional[str]
    private_key_password: Optional[str]

    bind_host: str
    bind_port: int
  1. Create a table for TunnelConfig that maps to Database (fk: database_id)
    • A migration is required, and the schema should match TunnelConfig
  2. Create the tunnel using the information stored in the TunnelConfig table for the specific database
    
with sshtunnel.open_tunnel(
    # ...
) as server:
    # connect to the database through server.local_bind_port
    ...
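As a rough sketch of how the stored credentials might be turned into arguments for `sshtunnel.open_tunnel`, consider the following. Several things here are assumptions rather than part of the proposal: the dataclass is a plain mirror of the schema fields, `tunnel_kwargs` and `open_tunnel_for` are hypothetical helpers, `bind_host`/`bind_port` are taken to be the database address the SSH server forwards to, and `private_key` is treated as a path to a key file. If the stored value is the key material itself, it would need converting before being passed as `ssh_pkey`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SSHTunnelCredentials:
    """Plain-dataclass mirror of the schema fields listed in this proposal."""
    database_id: int
    server_host: str
    server_port: int
    username: str
    bind_host: str
    bind_port: int
    password: Optional[str] = None
    private_key: Optional[str] = None
    private_key_password: Optional[str] = None


def tunnel_kwargs(creds: SSHTunnelCredentials) -> Dict[str, Any]:
    """Translate stored credentials into keyword arguments for open_tunnel."""
    kwargs: Dict[str, Any] = {
        "ssh_address_or_host": (creds.server_host, creds.server_port),
        "ssh_username": creds.username,
        # Where the SSH server forwards traffic: the real database.
        "remote_bind_address": (creds.bind_host, creds.bind_port),
        # Listen on loopback only; port 0 is assumed to mean "pick a free port".
        "local_bind_address": ("127.0.0.1", 0),
    }
    if creds.private_key is not None:
        # Assumption: private_key is a path to a key file.
        kwargs["ssh_pkey"] = creds.private_key
        if creds.private_key_password is not None:
            kwargs["ssh_private_key_password"] = creds.private_key_password
    elif creds.password is not None:
        kwargs["ssh_password"] = creds.password
    return kwargs


def open_tunnel_for(creds: SSHTunnelCredentials):
    """Open the tunnel; requires the third-party sshtunnel package."""
    import sshtunnel  # imported lazily so the helpers above work without it

    return sshtunnel.open_tunnel(**tunnel_kwargs(creds))
```

Keeping the argument mapping in a pure function makes it possible to unit test the credential handling without an SSH server.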
  1. Inside get_sqla_engine, if the current DB has encrypted_credentials.ssh_tunnel, open the tunnel and tear it down before returning
    1. if doing 2, we'd leave the connection open for this client
  2. Managing SSL with an SSH tunnel
    1. If an SSH tunnel is enabled together with SSL, we need to allow the certificates to be ignored
    2. For Postgres, if you pass sslmode=verify-ca it will ignore the names in the certificates. sslmode=verify-full would fail in this case.
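One possible way to act on the Postgres point is to relax `sslmode=verify-full` to `sslmode=verify-ca` in the connection URI whenever a tunnel is in use, since the client then connects to 127.0.0.1 and the certificate's host name no longer matches. The sketch below assumes the SSL mode is carried in the URI query string; `relax_sslmode_for_tunnel` is a hypothetical helper, not something the proposal defines.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def relax_sslmode_for_tunnel(uri: str) -> str:
    """Rewrite sslmode=verify-full to verify-ca; leave everything else as is."""
    parts = urlsplit(uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    relaxed = [
        # Only the exact pair sslmode=verify-full is changed.
        (key, "verify-ca" if (key, value) == ("sslmode", "verify-full") else value)
        for key, value in query
    ]
    return urlunsplit(parts._replace(query=urlencode(relaxed)))


print(relax_sslmode_for_tunnel("postgresql://u@127.0.0.1:40001/db?sslmode=verify-full"))
```

With verify-ca the certificate chain is still validated against the trusted CA; only the host name comparison is skipped.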

New or Changed Public Interfaces

Describe any new additions to the model, views or REST endpoints. Describe any changes to existing visualizations, dashboards and React components. Describe changes that affect the Superset CLI and how Superset is deployed.

I will be creating a new table named ssh_tunnel_config. This table will hold all the information the client needs to establish a connection to any Database living behind the proxy.

class SSHTunnelCredentials(Schema):
    database_id: int

    server_host: str
    server_port: int
    username: str
    password: Optional[str]
    private_key: Optional[str]
    private_key_password: Optional[str]

    bind_host: str
    bind_port: int
  • bind_host and bind_port will be derived from the information provided in the sqlalchemy_uri.
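The table could look roughly like the sketch below, which uses an in-memory SQLite database purely for illustration. The column names follow the schema above; the column types, the minimal `dbs` parent table, the one-tunnel-per-database `UNIQUE` constraint, and the sample values are all assumptions. A real change would go through a proper migration, and the secrets would not be stored in plain text.

```python
import sqlite3

DDL = """
CREATE TABLE dbs (
    id INTEGER PRIMARY KEY,
    sqlalchemy_uri TEXT NOT NULL
);
CREATE TABLE ssh_tunnel_config (
    id INTEGER PRIMARY KEY,
    database_id INTEGER NOT NULL UNIQUE REFERENCES dbs(id) ON DELETE CASCADE,
    server_host TEXT NOT NULL,
    server_port INTEGER NOT NULL,
    username TEXT NOT NULL,
    password TEXT,
    private_key TEXT,
    private_key_password TEXT,
    bind_host TEXT NOT NULL,
    bind_port INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(DDL)

# A database row, plus the tunnel row that points at it via database_id.
conn.execute(
    "INSERT INTO dbs (id, sqlalchemy_uri) VALUES (?, ?)",
    (1, "postgresql://db.internal:5432/analytics"),
)
conn.execute(
    "INSERT INTO ssh_tunnel_config "
    "(database_id, server_host, server_port, username, bind_host, bind_port) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (1, "bastion.example.com", 22, "tunnel_user", "db.internal", 5432),
)

# Look up the tunnel settings for a given database.
row = conn.execute(
    "SELECT server_host, bind_host, bind_port FROM ssh_tunnel_config WHERE database_id = ?",
    (1,),
).fetchone()
```

Here `bind_host` and `bind_port` repeat the host and port from `sqlalchemy_uri`, matching the note that they are built from that URI.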

New dependencies

We’ll use the sshtunnel pip package to help establish connections: https://pypi.org/project/sshtunnel/

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 4
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
michael-s-molina commented, Oct 21, 2022

@hughhhh Can we change the title to Enable SSH Tunneling on Database Connections? SSH Tunneling is a generic concept that may be applied in many parts of the application and this SIP is just one part.

+1 to @betodealmeida’s comments

0 reactions
rusackas commented, Dec 9, 2022

Closing as approved, and updating the project board! Please continue to reference this issue in related PRs whenever relevant!
