[RFC] Enable OpenSearch Dashboards to support multiple OpenSearch clusters

Problem Statement

OpenSearch Dashboards (OSD for short) was designed and implemented to work with a single OpenSearch cluster. Users who have multiple OpenSearch clusters need to navigate between Dashboards endpoints to visualize their data. This experience is not user friendly, and it adds overhead because users need to maintain multiple OpenSearch Dashboards instances, one for each OpenSearch cluster.

We want OpenSearch Dashboards users to be able to use a single Dashboards instance to visualize data in different OpenSearch clusters. In this RFC, an OpenSearch cluster that stores raw data for analysis is called a data source.

The proposal here is to enable OpenSearch Dashboards to let users dynamically manage their data sources. Users can then build visualizations and dashboards against data in those data sources, and put those visualizations into a single dashboard.

Proposed Solution

We propose adding a new data-source type to Dashboards saved objects. It includes the data source URL, capabilities (such as which plugins are available), and the credentials used to access the data source (OSD will encrypt credentials when persisting them). An index-pattern can then refer to a data-source, and based on this reference, the Dashboards server can execute queries against that specific data source.

For instance, a data-source object may look like:

{
  "type": "data-source",
  "data-source": {
    "title": "demo-data-source",
    "host": "https://my.opensearch.domain/",
    "auth_type": "basicauth",
    "credentials": {
      "username": "dashboards_user",
      "password": "password"
    },
    "capabilities": {
      "alerting": {
        "enabled": true,
        "version": "1.2"
      },
      "ism": {
        "enabled": true,
        "supported_actions": [
          "roll_over",
          "shrink"
        ]
      }
    }
  },
  ...
} 

And we will add a reference to data-source in index-pattern, so that an index-pattern object will look like:

{
  "type": "index-pattern",
  "index-pattern": {
    "title": "demo-index-pattern",
    "fields": {
      ...
    },
    "dataSource": "data-source-obj-id" 
  },
  "references" : [
    {
      "id": "data-source-obj-id",
      "name": "kibanaSavedObjectMeta.dataSource",
      "type": "data-source"
    }
  ],
  ...
}

With the new data-source model in place, visualizations can get the data source reference id from the index pattern and pass it to the OSD server along with the query. The OSD server can then fetch the data source attributes using the saved object service and query that specific data source.
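
The lookup described above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the PoC's code: the `SavedObject` shape is simplified, and `DataSourceStore`, `getDataSourceId`, `resolveDataSource`, and the demo objects are hypothetical names, with an in-memory map standing in for the saved object service.

```typescript
// Simplified saved object shapes; the real OSD types carry more fields.
interface SavedObjectReference {
  id: string;
  name: string;
  type: string;
}

interface SavedObject<T> {
  id: string;
  type: string;
  attributes: T;
  references: SavedObjectReference[];
}

interface DataSourceAttributes {
  title: string;
  host: string;
  auth_type: string;
}

interface IndexPatternAttributes {
  title: string;
}

// Hypothetical stand-in for the saved object service: id -> data-source object.
type DataSourceStore = Map<string, SavedObject<DataSourceAttributes>>;

// Returns the id of the data source an index pattern refers to, if any.
function getDataSourceId(
  indexPattern: SavedObject<IndexPatternAttributes>
): string | undefined {
  return indexPattern.references.find((ref) => ref.type === "data-source")?.id;
}

// Resolves the attributes of the data source the server would query.
// Returns undefined when the index pattern has no data source reference
// or the referenced object is missing from the store.
function resolveDataSource(
  indexPattern: SavedObject<IndexPatternAttributes>,
  store: DataSourceStore
): DataSourceAttributes | undefined {
  const id = getDataSourceId(indexPattern);
  if (id === undefined) {
    return undefined;
  }
  return store.get(id)?.attributes;
}

// Demo objects mirroring the JSON samples in this RFC.
const demoStore: DataSourceStore = new Map([
  [
    "data-source-obj-id",
    {
      id: "data-source-obj-id",
      type: "data-source",
      attributes: {
        title: "demo-data-source",
        host: "https://my.opensearch.domain/",
        auth_type: "basicauth",
      },
      references: [],
    },
  ],
]);

const demoIndexPattern: SavedObject<IndexPatternAttributes> = {
  id: "demo-index-pattern-id",
  type: "index-pattern",
  attributes: { title: "demo-index-pattern" },
  references: [
    {
      id: "data-source-obj-id",
      name: "kibanaSavedObjectMeta.dataSource",
      type: "data-source",
    },
  ],
};
```

Calling `resolveDataSource(demoIndexPattern, demoStore)` yields the demo data source attributes, whose `host` the server would then use as the query target.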

The new data-source model changes the user experience. Users need to create data sources before they can create an index pattern. Then, when creating an index pattern, users need to select the data source that the index pattern will be associated with. After that, the visualization and dashboard building experience remains the same as it is today.

A PoC that adds the data-source model and uses it in index-pattern and visualization can be found at: https://github.com/zengyan-amazon/OpenSearch-Dashboards/tree/ext-data-source-discover

One caveat is that a data-source includes user credentials, which need to be encrypted and handled carefully. That may break the general data handling in the saved object service, because data-source objects need special handling. Alternatively, we may end up letting OSD manage another secure index (or data store) to handle data-sources and credentials.
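
To make the encryption requirement concrete, here is a minimal sketch of encrypting a credential before it is persisted. The RFC does not specify an algorithm or key management scheme, so everything here is an assumption: AES-256-GCM from Node's built-in `crypto` module is chosen for illustration, `EncryptedField`, `encryptCredential`, and `decryptCredential` are hypothetical names, and `demoKey` is a throwaway key where a real deployment would load one from secure configuration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

// Hypothetical persisted form of an encrypted credential (all base64).
interface EncryptedField {
  iv: string;   // nonce used for this encryption
  tag: string;  // GCM authentication tag
  data: string; // ciphertext
}

// Encrypts a credential with AES-256-GCM. The key must be 32 bytes.
function encryptCredential(plaintext: string, key: Buffer): EncryptedField {
  const iv = randomBytes(12); // 96-bit nonce, the size recommended for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypts a credential. Throws if the key is wrong or the data was tampered with.
function decryptCredential(field: EncryptedField, key: Buffer): string {
  const decipher = createDecipheriv(
    "aes-256-gcm",
    key,
    Buffer.from(field.iv, "base64")
  );
  decipher.setAuthTag(Buffer.from(field.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(field.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}

// Assumption: a throwaway key for demonstration only.
const demoKey = randomBytes(32);
const storedPassword = encryptCredential("password", demoKey);
```

Under this sketch, only `storedPassword` would be written to the saved object, and the server would call `decryptCredential` just before connecting to the data source. Where the key lives, and whether encrypted fields belong in the regular saved object index or a separate store, are the open questions raised above.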

Scope

  • For this RFC, we focus on supporting data sources that are compatible with OpenSearch 1.x APIs. We will try to keep the design and implementation extensible enough to support other data sources, but that is not a design goal.
  • Handling credentials securely, for example through encryption, is in scope.
  • Support for non-visualization plugins, such as alerting, connecting to different OpenSearch data sources is in scope.

FAQ

Is it required to have data source defined for all index patterns? What if I don’t want this capability?

The plan is to make the multiple data source feature configurable, so that users can enable or disable it in OSD’s yml config file.

We also want to maintain backward compatibility, so that users can upgrade safely. When an index pattern doesn’t have a data source, it falls back to the same OpenSearch endpoint that hosts its saved object store.
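
The fallback rule can be sketched as a small TypeScript function. This is an illustration under assumptions, not the planned implementation: `DataSourceSettings`, `multiDataSourceEnabled`, `defaultHost`, and `selectQueryHost` are hypothetical names, and treating a disabled feature the same as a missing data source is a choice made for this sketch that the RFC does not spell out.

```typescript
// Hypothetical settings derived from OSD's yml config file.
interface DataSourceSettings {
  // Whether the multiple data source feature is turned on.
  multiDataSourceEnabled: boolean;
  // Endpoint of the OpenSearch cluster that hosts the saved object store.
  defaultHost: string;
}

// Picks the endpoint a query should be sent to.
// Falls back to the default cluster when the feature is disabled
// (an assumption of this sketch) or when the index pattern has no data source.
function selectQueryHost(
  settings: DataSourceSettings,
  dataSourceHost?: string
): string {
  if (!settings.multiDataSourceEnabled || dataSourceHost === undefined) {
    return settings.defaultHost;
  }
  return dataSourceHost;
}

const demoSettings: DataSourceSettings = {
  multiDataSourceEnabled: true,
  defaultHost: "https://localhost:9200",
};

// An index pattern with a data source is routed to that data source.
const routedHost = selectQueryHost(demoSettings, "https://my.opensearch.domain/");

// A legacy index pattern without a data source keeps using the default cluster.
const legacyHost = selectQueryHost(demoSettings);
```

Keeping the decision in one function means existing index patterns keep working after an upgrade without any migration step.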

I enabled the security plugin for both OpenSearch Dashboards and my OpenSearch clusters. Can OpenSearch Dashboards use my OSD credentials to query OpenSearch data sources?

This is more of an implementation-level detail. It can work with basic auth, but it is unlikely to work for users who log in to OSD using SSO such as OIDC or SAML. We want to provide the simplest experience to users, and will figure out more details during the design and implementation phases.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 11
  • Comments: 11 (10 by maintainers)

Top GitHub Comments

peternied commented, Apr 28, 2022 (3 reactions)

@zengyan-amazon This seems like an opportunity to reinvent the primitive data type used to power OpenSearch Dashboards queries, since index-patterns are an OpenSearch concept. What do you think about embedding index-patterns into the data-source definition?

That way there would be a common interface when adding support for other sources like SQL tables, DynamoDB, or Cosmos DB. Another way to frame this problem is: how would you write an OpenSearch data-source?

dblock commented, Aug 31, 2022 (2 reactions)

I see that the proposal has a separate UX for credentials and data sources. I think this is a bad idea.

  1. Bad user experience. How many data sources will a typical cluster have? I bet no more than 5, so why would users have to configure credentials in a separate panel, associate them with a data source, etc?
  2. It’s a security problem that implies everybody is an admin and anyone who can create a data source can see all credentials. Charlie creates credentials C, attaches them to data source D1. Now Alice attaches credentials C to data source D2, so Alice can now get a copy of C.
  3. It’s a 1-way door. Once you can associate a set of credentials with multiple data sources you cannot go back to a 1:1 relationship because users rely on the many:1 behavior.

I think that for the first cut you should simplify and not build a credentials panel, but let users configure credentials in the data source editor UX. You can still store credentials in a separate object so that you can build a credentials management panel in the future.

  1. It’s a lot simpler for users to edit credentials in the data source editor.
  2. It enables a security model where Charlie can create a set of credentials that will never be accessible by anyone other than Charlie. Charlie owns the data source they create, nobody else needs to modify/see it.
  3. It’s a 2-way door. You can build a 1:1 data-source:credentials now, and always expand it to many:1, but not the other way around.