On The Fly Encryption Feature Proposal
This document proposes an on-the-fly encryption feature that allows OpenSearch to encrypt search indices at the Directory level, using a different encryption key per index.
Why we need it
Enterprise customers require additional controls over the data they store in multi-tenanted cloud services. Data encryption with a customer-provided key is one of the features these customers are asking for. It allows customers to manage their own master key and then give a cloud service access to encrypt or decrypt their data with derived data keys. A customer can revoke the master key in the case of a security incident, making their data non-decryptable.
This feature enables better data isolation in a multi-tenanted service, allows for a better audit trail, and adds security.
OpenSearch does not provide a fine-grained multi-tenanted encryption solution yet. Encryption is either enabled for the whole cluster or for a data node, or it is fully disabled. When we use a search index per tenant, there is no way to configure encryption per index, and having a separate OpenSearch cluster per tenant is too expensive.
Proposal
The proposal is to implement a new Lucene Directory that will encrypt or decrypt shard data on the fly. We can use the existing settings.store.type configuration to enable encryption when we create an index. For example:
{
  "settings": {
    "store": {
      "type": "cryptofs"
    }
  }
}
In this case cryptofs becomes a new store type; OpenSearch will use CryptoDirectory for this specific store type.
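To make the new store type resolvable, the directory has to be registered under a directory factory. A rough sketch of what that registration could look like, assuming the IndexStorePlugin extension point is used and reusing the CryptoDirectoryFactory name introduced in the key management section below:

import java.util.Map;

import org.opensearch.plugins.IndexStorePlugin;
import org.opensearch.plugins.Plugin;

// Sketch only: maps the "cryptofs" store type to the proposed factory.
// CryptoDirectoryFactory is a name from this proposal, not existing code;
// it would implement IndexStorePlugin.DirectoryFactory.
public class CryptoDirectoryPlugin extends Plugin implements IndexStorePlugin {

    @Override
    public Map<String, DirectoryFactory> getDirectoryFactories() {
        // Selecting "index.store.type": "cryptofs" routes shard storage
        // through directories created by this factory.
        return Map.of("cryptofs", new CryptoDirectoryFactory());
    }
}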
Potentially, we could implement CryptoDirectory as a simple FilterDirectory to leverage the existing IndexInput and IndexOutput classes; however, this approach would not let us leverage buffered reads and writes. Lucene issues frequent single-byte read and write calls, so it is better to read from and write into an encrypted buffer instead of decrypting and encrypting single bytes every time.
We propose to override Lucene's IndexInput and IndexOutput with new encrypting implementations to leverage the existing IO buffer optimization. CryptoDirectory will extend FSDirectory and will instantiate the overridden versions of these inputs and outputs. The IndexInput and IndexOutput classes also provide access to the underlying IO streams, which allows us to leverage existing optimized stream encryption libraries, as sketched below.
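Here is a minimal sketch of the write path under those assumptions: the directory extends NIOFSDirectory, and every new file is wrapped in a CipherOutputStream so that Lucene's buffered writes reach the cipher in chunks rather than single bytes. The cipherForFile method is a hypothetical hook for the key management described below, and the matching decrypting, seek-aware IndexInput is omitted for brevity:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;

import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.NIOFSDirectory;
import org.apache.lucene.store.OutputStreamIndexOutput;

public class CryptoDirectory extends NIOFSDirectory {

    private static final int BUFFER_SIZE = 8192; // preserve Lucene's buffered writes

    public CryptoDirectory(Path path) throws IOException {
        super(path);
    }

    @Override
    public IndexOutput createOutput(String name, IOContext context) throws IOException {
        ensureOpen();
        Path file = getDirectory().resolve(name);
        // Writes are buffered first, so the cipher sees whole chunks instead of
        // the single bytes Lucene frequently emits through writeByte().
        return new OutputStreamIndexOutput(
                "CryptoIndexOutput(path=\"" + file + "\")",
                name,
                new CipherOutputStream(Files.newOutputStream(file), cipherForFile(name)),
                BUFFER_SIZE);
    }

    // Hypothetical hook: returns an ENCRYPT_MODE cipher keyed with the shard data key.
    protected Cipher cipherForFile(String name) throws IOException {
        throw new UnsupportedOperationException("wire up key management here");
    }
}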
Encryption
The concrete encryption algorithm can be made configurable, but it is critical to use no-padding algorithms to keep Lucene's random-access IO support.
The concrete crypto provider will also be configurable. Crypto providers like Amazon Corretto Crypto Provider, SunJCE, or Bouncy Castle come with their own trade-offs, and consumers of this on-the-fly encryption feature should be able to make a decision based on their specific performance, FIPS compliance, or runtime environment requirements.
{
  "settings": {
    ...
    "encryption": {
      "algorithm": "AES/GCM/NoPadding",
      "provider": "SunJCE",
      ...
    }
  }
}
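To illustrate why padding-free modes matter for random access, here is a small, self-contained sketch using AES/CTR/NoPadding (chosen purely as an illustration; the actual transformation and provider would come from the settings above, e.g. via Cipher.getInstance(algorithm, provider)). Because CTR turns AES into a stream cipher, a read at an arbitrary 16-byte-aligned offset only needs the IV counter advanced by the block index, with no need to decrypt the preceding bytes:

import java.math.BigInteger;
import java.security.GeneralSecurityException;

import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public final class CtrRandomAccess {

    private static final int AES_BLOCK_SIZE = 16;

    // Decrypts a ciphertext slice that starts at a block-aligned file offset.
    // Only the IV counter is adjusted; bytes before the offset are never read.
    static byte[] decryptAt(byte[] key, byte[] baseIv, byte[] cipherTextSlice, long blockAlignedOffset)
            throws GeneralSecurityException {
        byte[] iv = advanceCounter(baseIv, blockAlignedOffset / AES_BLOCK_SIZE);
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding"); // a provider name could be passed as a 2nd argument
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(cipherTextSlice);
    }

    // Treats the 16-byte IV as a big-endian counter and adds the block delta,
    // matching how CTR mode increments the counter block internally.
    private static byte[] advanceCounter(byte[] baseIv, long blocks) {
        byte[] raw = new BigInteger(1, baseIv).add(BigInteger.valueOf(blocks)).toByteArray();
        byte[] iv = new byte[AES_BLOCK_SIZE];
        int copy = Math.min(raw.length, AES_BLOCK_SIZE);
        System.arraycopy(raw, raw.length - copy, iv, AES_BLOCK_SIZE - copy, copy);
        return iv;
    }
}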
Key management
Each index shard will require one or more data keys to encrypt its data. We can start with a single data key per shard to simplify key management, but this solution can evolve; for example, OpenSearch could generate new data keys according to time-based or usage-based criteria.
All shard data keys will be derived from one master key defined at the index level. When OpenSearch creates a new index, CryptoDirectoryFactory will reach out to a Key Management Service (KMS) to generate a data key pair. The encrypted version of the data key can be persisted in a key file inside the shard data folder itself. Any encryption or decryption operation requires the plain-text version of the key, so CryptoDirectory will need to call the KMS to decrypt the encrypted data key. It will cache the plain-text key in a short-lived cache for performance reasons.
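A minimal sketch of that caching behaviour, assuming a hypothetical KmsClient abstraction over whichever KMS vendor is configured; entries expire after a short TTL so that a revoked master key eventually makes the shard unreadable:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

final class DataKeyCache {

    // Hypothetical abstraction over the configured KMS vendor.
    interface KmsClient {
        byte[] decryptDataKey(byte[] encryptedDataKey);
    }

    private static final long TTL_NANOS = TimeUnit.MINUTES.toNanos(5); // illustrative TTL

    private static final class CachedKey {
        final byte[] plaintext;
        final long loadedAtNanos;

        CachedKey(byte[] plaintext, long loadedAtNanos) {
            this.plaintext = plaintext;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final KmsClient kms;
    private final ConcurrentMap<String, CachedKey> cache = new ConcurrentHashMap<>();

    DataKeyCache(KmsClient kms) {
        this.kms = kms;
    }

    // Returns the plain-text data key for a shard. The KMS is only called when
    // the cached copy is missing or stale; once the master key is revoked, the
    // KMS call fails and the shard becomes unreadable after the entry expires.
    byte[] plaintextKey(String shardId, byte[] encryptedDataKey) {
        return cache.compute(shardId, (id, existing) -> {
            if (existing != null && System.nanoTime() - existing.loadedAtNanos < TTL_NANOS) {
                return existing;
            }
            return new CachedKey(kms.decryptDataKey(encryptedDataKey), System.nanoTime());
        }).plaintext;
    }
}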
Here is how we can configure a KMS when we create an index:
{
  "settings": {
    "store": {
      "type": "cryptofs"
    },
    "encryption": {
      "kms_type": "aws_kms",
      "master_key": "arn:aws:kms:us-west-2:111122223333:key/943842d0-f961-4322-aff5-e9581e7271b7"
    }
  }
}
This configuration can support multiple KMS vendors if required.
Key revocation and restoration
When a customer revokes access to a master key, OpenSearch can no longer decrypt the encrypted data keys. It will still be able to decrypt data with a cached plain-text version of the key until the key cache expires, but after that any requests will start failing. OpenSearch will require a special error code to convey this error to consumers.
Any background operations like merge or refresh will also start failing; they will require special handling to avoid data corruption.
Key restoration will require no specific logic. Once the customer restores key access, OpenSearch can immediately use the master key to decrypt data keys again.
Key rotation and re-encryption
This proposal does not cover managed key rotation and re-encryption. OpenSearch re-indexing satisfies both of these requirements during the initial implementation phase.
Audit trail
Customers will be interested in monitoring how OpenSearch uses their encryption keys. Any KMS requests will be logged automatically on the customer's KMS side. However, when OpenSearch uses these data keys to encrypt or decrypt data, no logs will be produced.
Performance
Encryption comes with a performance cost. The actual performance degradation will depend on the request type and on the encryption algorithm. For example, according to our initial performance benchmarking, the overhead on ingestion and simple queries is smaller than on complex queries with functions and aggregates.
Concrete acceptable performance degradation numbers are still TBD.
Shipment options
We would like this feature to be available in the managed AWS OpenSearch service. We can either ship it as a community plugin or implement it inside OpenSearch itself.
Top GitHub Comments
I agree they need to be independent; some customers could use encrypted file systems and store encrypted snapshots in clouds, using their own keys or built-in functionality provided by the cloud providers.
Thank you for your explanation, now it is clear.
Got it.
I specifically asked this question because of a problem I thought existed for the merge procedure. But I am for switching the IV every 64 GB instead of limiting it to 64 GB. Such a problem exists for the encrypted plugin, which I am going to fix soon.
Got it. Thank you for your explanation.
The idea is good, but (IMHO): 64 GB? It affects the size of the index data on disk and in memory as well, since encrypted data fares worse than non-encrypted data.
For snapshot encryption we already introduced a plugin here: https://github.com/aiven/encrypted-repository-opensearch, and it was added as a community plugin here: https://github.com/opensearch-project/project-website/pull/812.