[#10780] feat(catalog-glue): Add GlueSchema and GlueTable model classes with tests#10781
Open
diqiu50 wants to merge 9 commits intoapache:mainfrom
Open
[#10780] feat(catalog-glue): Add GlueSchema and GlueTable model classes with tests#10781diqiu50 wants to merge 9 commits intoapache:mainfrom
diqiu50 wants to merge 9 commits intoapache:mainfrom
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR introduces initial AWS Glue catalog model/conversion utilities (schema/table/column + type conversion), along with properties/capability metadata and a suite of unit tests to validate Glue → Gravitino mappings.
Changes:
- Add Glue model classes (
GlueSchema,GlueTable,GlueColumn) and supporting utilities (GlueTypeConverter,GlueClientProvider, constants). - Define Glue catalog/table properties metadata and connector capability declarations.
- Add unit tests (plus AWS-tagged integration-style tests) for conversions, capabilities, and property metadata.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTypeConverter.java | Type string ↔ Gravitino Type conversion logic |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTable.java | Glue Table → Gravitino table model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueSchema.java | Glue Database → Gravitino schema model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueColumn.java | Glue Column → Gravitino column model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueConstants.java | Shared config/property keys for the connector |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueClientProvider.java | GlueClient construction from catalog config |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueCatalogPropertiesMetadata.java | Catalog-level property metadata definitions |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTablePropertiesMetadata.java | Table-level property metadata definitions |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueCatalogCapability.java | Connector capability declarations |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueTypeConverter.java | Unit tests for type conversion |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueClientProvider.java | Unit tests for Glue client construction/validation |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueCatalogPropertiesMetadata.java | Unit tests for catalog property metadata |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueTablePropertiesMetadata.java | Unit tests for table property metadata |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueCatalogCapability.java | Unit tests for capability declarations |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AbstractGlueSchemaTest.java | Shared schema conversion test scenarios |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AbstractGlueTableTest.java | Shared table conversion test scenarios |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/SyntheticGlueSchemaTest.java | Runs schema scenarios using SDK builders |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/SyntheticGlueTableTest.java | Runs table scenarios using SDK builders |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AwsGlueSchemaIT.java | AWS-tagged schema tests against real Glue |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AwsGlueTableIT.java | AWS-tagged table tests against real Glue |
| catalogs/catalog-glue/build.gradle.kts | Adds dependency + adjusts test task behavior |
Comments suppressed due to low confidence (1)
catalogs/catalog-glue/build.gradle.kts:101
- The AWS integration tests are in
src/test/javaand will run by default, but the Gradle config only excludes thegravitino-aws-testtag when-PskipITsis set. This makes./gradlew :catalogs:catalog-glue:testlikely to fail in CI/local runs without AWS credentials (contradicts the “skipped by default” Javadoc). Consider excludinggravitino-aws-testby default, and only including it when an explicit property (e.g.-PrunAwsTests) is provided.
tasks.test {
val skipITs = project.hasProperty("skipITs")
if (skipITs) {
exclude("**/integration/test/**")
// Skip AWS integration tests (require real AWS credentials).
useJUnitPlatform {
excludeTags("gravitino-aws-test")
}
} else {
dependsOn(tasks.jar)
}
}
Contributor
Author
|
@copilot resolve the merge conflicts in this pull request |
…etadata - Add GlueConstants for all catalog and table property keys - Add GlueClientProvider: static creds / DefaultCredentialChain selection, region, and endpoint override (for VPC endpoints / LocalStack) - Implement GlueCatalogPropertiesMetadata: required aws-region + aws-glue-catalog-id, optional credentials (hidden), endpoint, default-table-format, table-type-filter - Implement GlueCatalogCapability: case-insensitive names, no NOT NULL, no DEFAULT - Implement GlueTablePropertiesMetadata: table_type, metadata_location, location - Add TestGlueClientProvider unit tests
- Fix stale JavaDoc: DEFAULT_TABLE_FORMAT "Defaults to iceberg" -> "Defaults to hive" - GlueClientProvider: fail-fast on partial credentials (one key without the other) - GlueClientProvider: wrap URI.create with property-context error message - GlueCatalogCapability: remove COLUMN from case-insensitive scope (no AWS docs backing) - GlueTablePropertiesMetadata: remove ephemeral PR-05 forward reference from comment - TestGlueClientProvider: use try-with-resources; update partial-cred test to expect exception; add tests for secret-only and invalid endpoint cases - Add TestGlueCatalogCapability: covers all capability method contracts - Add TestGlueCatalogPropertiesMetadata: covers required/hidden/immutable flags and defaults
- Use StringUtils.isNotBlank() to reject blank region and credential values - Make aws-glue-catalog-id optional (Glue defaults to caller's account ID) - Add casing note to TABLE_FORMAT_TYPE JavaDoc distinguishing Glue uppercase from filter lowercase - Clarify deferred validation for default-table-format and table-type-filter
- Rename TABLE_TYPE_FILTER_ALL to DEFAULT_TABLE_TYPE_FILTER - Add default credential chain order comment in GlueClientProvider - Remove try-catch wrapping on URI.create for endpoint validation - Make aws-glue-catalog-id optional - Add casing note to TABLE_FORMAT_TYPE JavaDoc - Clarify deferred validation for default-table-format and table-type-filter - Add StringUtils.isNotBlank() for blank value checks
…ests Add model layer for catalog-glue (PR-03): - GlueConstants: storage descriptor and table-format constants - GlueTypeConverter: Glue/Hive type string to Gravitino Type mapping - GlueSchema: maps AWS Glue Database -> Gravitino BaseSchema - GlueColumn: maps AWS Glue Column -> Gravitino BaseColumn - GlueTable: maps AWS Glue Table -> Gravitino BaseTable (columns, partitioning, distribution, sort orders, properties) Test architecture: abstract base class + two implementations: - SyntheticGlueXxxTest: SDK builder, no network, always runs - AwsGlueXxxIT: real AWS Glue API, tagged gravitino-aws-test, skipped by default
…Table - Rename test classes to follow TestXxx naming convention - Add error handling for malformed type strings in GlueTypeConverter - Fix NPE in GlueTable distribution and sort order null-checks - Add null-safe handling for GlueSchema parameters - Use GlueException for AWS cleanup errors - Add aws.glue test dependency for SdkException class - Fix TABLE_FORMAT description to use uppercase values
Code Coverage Report
Files
|
Glue/Hive timestamps are timezoneless. fromGravitino now throws IllegalArgumentException for TimestampType.withTimeZone(), consistent with HiveDataTypeConverter. Ref: https://cwiki.apache.org/confluence/display/hive/languagemanual+types
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Implement GlueSchema, GlueColumn, GlueTable model classes that convert AWS Glue API objects (Database, Table, Column) to Gravitino's internal models, along with GlueTypeConverter for type conversion and comprehensive unit tests.
Why are the changes needed?
The Glue catalog implementation requires model classes for schema and table operations. These classes provide the foundation for subsequent CRUD implementations.
Fix: #10780
Does this PR introduce any user-facing change?
No. This is internal model class implementation.
How was this patch tested?
./gradlew :catalogs:catalog-glue:test -PskipITs