feat(security): Add design documents for sensitive data management #10732
babumahesh wants to merge 1 commit into apache:main
1. External Secret Management (SecretProvider) - Integration with
AWS Secrets Manager, Vault, and other external secret stores to
avoid storing secrets in Gravitino's backend database
2. Database Encryption at Rest (EncryptionProvider) - AES-GCM
encryption for sensitive catalog properties stored in Gravitino's
backend database
I have design proposals for
Could you please review this further? If possible, we could also connect over a call, which would be better for discussing the details.
Thanks for your design document. I am happy to discuss this topic. I took a quick look.
For existing catalogs, migrate the sensitive properties
Should we configure one KMS for a Gravitino server, or several KMSs for a Gravitino server:
We need a migration document to help users to migrate the sensitive properties when required.
Is it necessary to have multiple KMSs for one server? Which situations require this feature?
Multiple KMS support is not strictly necessary, but provides flexibility for specific enterprise scenarios:
Though we could always start with one KMS for simplicity.
Yes, we could always start with one KMS for simplicity.
Agreed. As for "Do we need users to know which KMS they are using?": yes, it is better if users know which KMS is configured. This makes it easier for them to onboard keys or obtain the necessary access. We might need to add an admin/settings section to the Gravitino UI to display the KMS configuration (at least the service details), though no such section currently exists in the UI.
Is there anything else from the approach perspective? If this looks good, we can start on the implementation.
Yes, I need to do a deeper review. Thanks for your understanding.
Admins should know which KMS is configured, but we shouldn't expose it to normal users. Normal users don't care which database we use, and the same applies here.
   * @throws SecretResolutionException if secret resolution fails
   */
  @Nullable
  String resolveSecret(String secretReference);
Should we define a dedicated class for secretReference? A String is flexible, but it lacks constraints.
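To make the suggestion concrete, here is a minimal sketch of what a typed reference could look like. The class name `SecretReference`, the `<provider>://<path>` format, and the accessor names are assumptions for illustration only; the design document may define a different format.

```java
import java.util.Objects;

/** Hypothetical value class; the "<provider>://<path>" format is an assumption. */
final class SecretReference {
  private final String provider;
  private final String path;

  private SecretReference(String provider, String path) {
    this.provider = provider;
    this.path = path;
  }

  /** Parses and validates a raw reference, rejecting malformed input early. */
  static SecretReference parse(String raw) {
    Objects.requireNonNull(raw, "raw");
    int sep = raw.indexOf("://");
    // Both the provider and the path must be non-empty.
    if (sep <= 0 || sep + 3 >= raw.length()) {
      throw new IllegalArgumentException("expected <provider>://<path>, got: " + raw);
    }
    return new SecretReference(raw.substring(0, sep), raw.substring(sep + 3));
  }

  String provider() {
    return provider;
  }

  String path() {
    return path;
  }

  @Override
  public String toString() {
    return provider + "://" + path;
  }

  public static void main(String[] args) {
    SecretReference ref = SecretReference.parse("vault://kv/prod/db-password");
    System.out.println(ref.provider() + " -> " + ref.path());
  }
}
```

With a type like this, `resolveSecret` could accept a validated object instead of an arbitrary String, so malformed references fail when the catalog is created rather than when the secret is resolved.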
   * @throws SecretResolutionException if secret resolution fails
   */
  @Nullable
  String resolveSecret(String secretReference);
Could you give examples of how other systems, such as Databricks and Snowflake, reference secrets?
### Core Components

#### 1. SecretProvider Interface
Maybe we can refer to the similar design in Iceberg. See:
/** A minimum client interface to connect to a key management service (KMS). */
public interface KeyManagementClient extends Serializable, Closeable {

  /**
   * Wrap a secret key, using a wrapping/master key which is stored in KMS and referenced by an ID.
   * Wrapping means encryption of the secret key with the master key, and adding optional
   * KMS-specific metadata that allows the KMS to decrypt the secret key in an unwrapping call.
   *
   * @param key a secret key being wrapped
   * @param wrappingKeyId a key ID that represents a wrapping key stored in KMS
   * @return wrapped key material
   */
  ByteBuffer wrapKey(ByteBuffer key, String wrappingKeyId);

  /**
   * Some KMS systems support generation of secret keys inside the KMS server.
   *
   * @return true if KMS server supports key generation and KeyManagementClient implementation is
   *     interested to leverage this capability. Otherwise, return false - Iceberg will then
   *     generate secret keys locally (using the SecureRandom mechanism) and call {@link
   *     #wrapKey(ByteBuffer, String)} to wrap them in KMS.
   */
  default boolean supportsKeyGeneration() {
    return false;
  }

  /**
   * Generate a new secret key in the KMS server, and wrap it using a wrapping/master key which is
   * stored in KMS and referenced by an ID. This method will be called only if supportsKeyGeneration
   * returns true.
   *
   * @param wrappingKeyId a key ID that represents a wrapping key stored in KMS
   * @return key in two forms: raw, and wrapped with the given wrappingKeyId
   */
  default KeyGenerationResult generateKey(String wrappingKeyId) {
    throw new UnsupportedOperationException("Key generation is not supported in this KmsClient");
  }

  /**
   * Unwrap a secret key, using a wrapping/master key which is stored in KMS and referenced by an
   * ID.
   *
   * @param wrappedKey wrapped key material (encrypted key and optional KMS metadata, returned by
   *     the wrapKey method)
   * @param wrappingKeyId a key ID that represents a wrapping key stored in KMS
   * @return raw key bytes
   */
  ByteBuffer unwrapKey(ByteBuffer wrappedKey, String wrappingKeyId);

  /**
   * Initialize the KMS client with given properties.
   *
   * @param properties kms client properties
   */
  void initialize(Map<String, String> properties);

  /**
   * Close KMS Client to release underlying resources, this could be triggered in different threads
   * when KmsClient is shared by multiple encryption managers.
   */
  @Override
  default void close() {}

  /**
   * For KMS systems that support key generation, this class keeps the key generation result - the
   * raw secret key, and its wrap.
   */
  class KeyGenerationResult {
    private final ByteBuffer key;
    private final ByteBuffer wrappedKey;

    public KeyGenerationResult(ByteBuffer key, ByteBuffer wrappedKey) {
      this.key = key;
      this.wrappedKey = wrappedKey;
    }

    public ByteBuffer key() {
      return key;
    }

    public ByteBuffer wrappedKey() {
      return wrappedKey;
    }
  }
}
You can look into this and understand its design, background, and considerations by reading the related discussion and design documents.
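For readers unfamiliar with the wrap/unwrap pattern in the interface above, the following is a rough, self-contained sketch of how a data key is wrapped by a master key that is referenced by ID. The class `InMemoryKmsSketch` and its `createMasterKey` method are hypothetical and exist only for illustration. The sketch does not implement Iceberg's interface, it holds master keys in process memory (which a real KMS never would), and it uses the JDK's standard `AESWrap` cipher in place of a remote KMS call.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical in-memory stand-in for a KMS; for illustration only. */
final class InMemoryKmsSketch {
  private final Map<String, SecretKey> masterKeys = new HashMap<>();

  /** Registers a fresh 256-bit master key under the given ID. */
  void createMasterKey(String keyId) {
    try {
      KeyGenerator gen = KeyGenerator.getInstance("AES");
      gen.init(256);
      masterKeys.put(keyId, gen.generateKey());
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("key generation failed", e);
    }
  }

  /** Encrypts (wraps) a data key with the master key identified by wrappingKeyId. */
  ByteBuffer wrapKey(ByteBuffer key, String wrappingKeyId) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.WRAP_MODE, masterKey(wrappingKeyId));
      return ByteBuffer.wrap(cipher.wrap(new SecretKeySpec(toBytes(key), "AES")));
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("wrap failed", e);
    }
  }

  /** Decrypts (unwraps) a previously wrapped data key and returns its raw bytes. */
  ByteBuffer unwrapKey(ByteBuffer wrappedKey, String wrappingKeyId) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.UNWRAP_MODE, masterKey(wrappingKeyId));
      Key raw = cipher.unwrap(toBytes(wrappedKey), "AES", Cipher.SECRET_KEY);
      return ByteBuffer.wrap(raw.getEncoded());
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("unwrap failed", e);
    }
  }

  private SecretKey masterKey(String keyId) {
    SecretKey key = masterKeys.get(keyId);
    if (key == null) {
      throw new IllegalArgumentException("unknown wrapping key: " + keyId);
    }
    return key;
  }

  /** Copies the remaining bytes without moving the caller's buffer position. */
  private static byte[] toBytes(ByteBuffer buf) {
    byte[] out = new byte[buf.remaining()];
    buf.duplicate().get(out);
    return out;
  }

  public static void main(String[] args) {
    InMemoryKmsSketch kms = new InMemoryKmsSketch();
    kms.createMasterKey("mk-1");
    ByteBuffer dataKey = ByteBuffer.wrap(new byte[32]); // placeholder; use SecureRandom in practice
    ByteBuffer wrapped = kms.wrapKey(dataKey, "mk-1");
    System.out.println(kms.unwrapKey(wrapped, "mk-1").equals(dataKey));
  }
}
```

The point of the pattern is that only the wrapped data key needs to be stored next to the encrypted data, while the master key never leaves the KMS.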
In general, I would prefer that we first understand the user experience of other systems and then provide a clearly defined feature.
What changes were proposed in this pull request?
Add design documents about external secrets and encryption at rest for catalog properties.
External Secret Management (SecretProvider) - Integration with
AWS Secrets Manager, Vault, and other external secret stores to
avoid storing secrets in Gravitino's backend database
(babumahesh@e9a4b4a#diff-5eecc2a25fa5b2a764f787ac1dc8b9f3b117790320d634d6f6f3cf6960814996)
Catalog properties Encryption at Rest (EncryptionProvider) - AES-GCM
encryption for sensitive catalog properties stored in Gravitino's
backend database
(babumahesh@e9a4b4a#diff-be1104f868165d9d33be232eb484de1b821778c07f1d5c775919dc3122433a44)
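As a rough illustration of the second item, the sketch below shows AES-GCM encryption of a single property value using only the JDK. It is not the `EncryptionProvider` from the design document: the class name `PropertyCipherSketch`, the method names, and the storage layout (Base64 of IV, ciphertext, and tag) are assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper; not the EncryptionProvider from the design document. */
final class PropertyCipherSketch {
  private static final int IV_BYTES = 12;  // 96-bit nonce, the size recommended for GCM
  private static final int TAG_BITS = 128; // authentication tag length
  private static final SecureRandom RANDOM = new SecureRandom();

  private PropertyCipherSketch() {}

  /** Generates a random 256-bit AES key; a real deployment would obtain it from a KMS. */
  static SecretKey newKey() {
    try {
      KeyGenerator gen = KeyGenerator.getInstance("AES");
      gen.init(256);
      return gen.generateKey();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("key generation failed", e);
    }
  }

  /** Encrypts a property value and returns Base64(iv || ciphertext || tag). */
  static String encrypt(SecretKey key, String plaintext) {
    try {
      byte[] iv = new byte[IV_BYTES];
      RANDOM.nextBytes(iv); // fresh nonce per value; never reuse one with the same key
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
      byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
      byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
      return Base64.getEncoder().encodeToString(out);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("encryption failed", e);
    }
  }

  /** Reverses encrypt(); fails if the stored value was tampered with or the key is wrong. */
  static String decrypt(SecretKey key, String encoded) {
    try {
      byte[] in = Base64.getDecoder().decode(encoded);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
      byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
      return new String(pt, StandardCharsets.UTF_8);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("decryption failed", e);
    }
  }

  public static void main(String[] args) {
    SecretKey key = newKey();
    String stored = encrypt(key, "s3cr3t-password");
    System.out.println(decrypt(key, stored));
  }
}
```

Because each value gets its own random nonce, encrypting the same property twice yields different stored strings, and GCM's authentication tag causes decryption to fail if the stored value is modified.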
Why are the changes needed?
It's a feature.
Fix:
#10415
#4681
Does this PR introduce any user-facing change?
N/A
How was this patch tested?