
[#10780] feat(catalog-glue): Add GlueSchema and GlueTable model classes with tests #10781

Open
diqiu50 wants to merge 9 commits into apache:main from diqiu50:glue-pr03


diqiu50 commented Apr 14, 2026

What changes were proposed in this pull request?

Implement GlueSchema, GlueColumn, GlueTable model classes that convert AWS Glue API objects (Database, Table, Column) to Gravitino's internal models, along with GlueTypeConverter for type conversion and comprehensive unit tests.
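
To make the type-conversion step concrete, here is a minimal sketch of mapping Glue/Hive primitive type strings to an internal type. It is not the PR's GlueTypeConverter: `GlueTypeSketch`, `SketchType`, and `fromGlue` are hypothetical names, the enum merely stands in for Gravitino's type model, complex types (array, map, struct) are left out, and returning `UNRECOGNIZED` for unknown names is an assumption.

```java
import java.util.Locale;

/** Illustrative only: SketchType is a hypothetical stand-in, not Gravitino's Type API. */
final class GlueTypeSketch {

  enum SketchType {
    BOOLEAN, BYTE, SHORT, INTEGER, LONG, FLOAT, DOUBLE,
    STRING, DATE, TIMESTAMP, BINARY, DECIMAL, UNRECOGNIZED
  }

  /** Maps a Glue/Hive primitive type string to the stand-in enum. */
  static SketchType fromGlue(String glueType) {
    if (glueType == null || glueType.trim().isEmpty()) {
      throw new IllegalArgumentException("Glue type string must not be blank");
    }
    // Glue stores Hive-style type names; compare them case-insensitively.
    String normalized = glueType.trim().toLowerCase(Locale.ROOT);
    // Drop parameters such as the "(10,2)" in "decimal(10,2)".
    int paren = normalized.indexOf('(');
    String base = paren >= 0 ? normalized.substring(0, paren).trim() : normalized;
    switch (base) {
      case "boolean":   return SketchType.BOOLEAN;
      case "tinyint":   return SketchType.BYTE;
      case "smallint":  return SketchType.SHORT;
      case "int":
      case "integer":   return SketchType.INTEGER;
      case "bigint":    return SketchType.LONG;
      case "float":     return SketchType.FLOAT;
      case "double":    return SketchType.DOUBLE;
      case "string":    return SketchType.STRING;
      case "date":      return SketchType.DATE;
      case "timestamp": return SketchType.TIMESTAMP;
      case "binary":    return SketchType.BINARY;
      case "decimal":   return SketchType.DECIMAL;
      default:
        // Assumption: unknown names are surfaced to the caller rather than rejected.
        return SketchType.UNRECOGNIZED;
    }
  }
}
```

For example, `GlueTypeSketch.fromGlue("BIGINT")` yields `SketchType.LONG`, and a parameterized string such as `decimal(10,2)` resolves to `SketchType.DECIMAL` because the parameters are stripped before the lookup.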

Why are the changes needed?

The Glue catalog implementation requires model classes for schema and table operations. These classes provide the foundation for subsequent CRUD implementations.

Fix: #10780

Does this PR introduce any user-facing change?

No. This is an internal model class implementation.

How was this patch tested?

  • Unit tests pass: ./gradlew :catalogs:catalog-glue:test -PskipITs

Copilot AI review requested due to automatic review settings April 14, 2026 07:28
@diqiu50 diqiu50 self-assigned this Apr 14, 2026

Copilot AI left a comment


Pull request overview

This PR introduces initial AWS Glue catalog model/conversion utilities (schema/table/column + type conversion), along with properties/capability metadata and a suite of unit tests to validate Glue → Gravitino mappings.

Changes:

  • Add Glue model classes (GlueSchema, GlueTable, GlueColumn) and supporting utilities (GlueTypeConverter, GlueClientProvider, constants).
  • Define Glue catalog/table properties metadata and connector capability declarations.
  • Add unit tests (plus AWS-tagged integration-style tests) for conversions, capabilities, and property metadata.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTypeConverter.java | Type string ↔ Gravitino Type conversion logic |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTable.java | Glue Table → Gravitino table model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueSchema.java | Glue Database → Gravitino schema model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueColumn.java | Glue Column → Gravitino column model mapping |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueConstants.java | Shared config/property keys for the connector |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueClientProvider.java | GlueClient construction from catalog config |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueCatalogPropertiesMetadata.java | Catalog-level property metadata definitions |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTablePropertiesMetadata.java | Table-level property metadata definitions |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueCatalogCapability.java | Connector capability declarations |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueTypeConverter.java | Unit tests for type conversion |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueClientProvider.java | Unit tests for Glue client construction/validation |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueCatalogPropertiesMetadata.java | Unit tests for catalog property metadata |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueTablePropertiesMetadata.java | Unit tests for table property metadata |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/TestGlueCatalogCapability.java | Unit tests for capability declarations |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AbstractGlueSchemaTest.java | Shared schema conversion test scenarios |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AbstractGlueTableTest.java | Shared table conversion test scenarios |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/SyntheticGlueSchemaTest.java | Runs schema scenarios using SDK builders |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/SyntheticGlueTableTest.java | Runs table scenarios using SDK builders |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AwsGlueSchemaIT.java | AWS-tagged schema tests against real Glue |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/AwsGlueTableIT.java | AWS-tagged table tests against real Glue |
| catalogs/catalog-glue/build.gradle.kts | Adds dependency + adjusts test task behavior |

Comments suppressed due to low confidence (1)

catalogs/catalog-glue/build.gradle.kts:101

  • The AWS integration tests are in src/test/java and will run by default, but the Gradle config only excludes the gravitino-aws-test tag when -PskipITs is set. This makes ./gradlew :catalogs:catalog-glue:test likely to fail in CI/local runs without AWS credentials (contradicts the “skipped by default” Javadoc). Consider excluding gravitino-aws-test by default, and only including it when an explicit property (e.g. -PrunAwsTests) is provided.
tasks.test {
  val skipITs = project.hasProperty("skipITs")
  if (skipITs) {
    exclude("**/integration/test/**")
    // Skip AWS integration tests (require real AWS credentials).
    useJUnitPlatform {
      excludeTags("gravitino-aws-test")
    }
  } else {
    dependsOn(tasks.jar)
  }
}


diqiu50 commented Apr 14, 2026

@copilot resolve the merge conflicts in this pull request

diqiu50 added 8 commits April 16, 2026 09:55
…etadata

- Add GlueConstants for all catalog and table property keys
- Add GlueClientProvider: static creds / DefaultCredentialChain selection,
  region, and endpoint override (for VPC endpoints / LocalStack)
- Implement GlueCatalogPropertiesMetadata: required aws-region + aws-glue-catalog-id,
  optional credentials (hidden), endpoint, default-table-format, table-type-filter
- Implement GlueCatalogCapability: case-insensitive names, no NOT NULL, no DEFAULT
- Implement GlueTablePropertiesMetadata: table_type, metadata_location, location
- Add TestGlueClientProvider unit tests
- Fix stale JavaDoc: DEFAULT_TABLE_FORMAT "Defaults to iceberg" -> "Defaults to hive"
- GlueClientProvider: fail-fast on partial credentials (one key without the other)
- GlueClientProvider: wrap URI.create with property-context error message
- GlueCatalogCapability: remove COLUMN from case-insensitive scope (no AWS docs backing)
- GlueTablePropertiesMetadata: remove ephemeral PR-05 forward reference from comment
- TestGlueClientProvider: use try-with-resources; update partial-cred test to expect exception;
  add tests for secret-only and invalid endpoint cases
- Add TestGlueCatalogCapability: covers all capability method contracts
- Add TestGlueCatalogPropertiesMetadata: covers required/hidden/immutable flags and defaults
- Use StringUtils.isNotBlank() to reject blank region and credential values
- Make aws-glue-catalog-id optional (Glue defaults to caller's account ID)
- Add casing note to TABLE_FORMAT_TYPE JavaDoc distinguishing Glue uppercase from filter lowercase
- Clarify deferred validation for default-table-format and table-type-filter
- Rename TABLE_TYPE_FILTER_ALL to DEFAULT_TABLE_TYPE_FILTER
- Add default credential chain order comment in GlueClientProvider
- Remove try-catch wrapping on URI.create for endpoint validation
- Make aws-glue-catalog-id optional
- Add casing note to TABLE_FORMAT_TYPE JavaDoc
- Clarify deferred validation for default-table-format and table-type-filter
- Add StringUtils.isNotBlank() for blank value checks
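
The fail-fast rule for partial credentials described in the commit notes above can be sketched roughly as follows. This is not the PR's GlueClientProvider: the class name, the `Mode` enum, and the two property keys are hypothetical, a plain helper stands in for `StringUtils.isNotBlank()`, and treating a blank value as "not set" is an assumption about the intended behaviour.

```java
import java.util.Map;

/** Illustrative only: decides between static credentials and the default chain. */
final class CredentialCheckSketch {

  enum Mode { STATIC_CREDENTIALS, DEFAULT_CHAIN }

  // Hypothetical property keys; the PR's real key names are not shown on this page.
  static final String ACCESS_KEY_PROP = "access-key-id";
  static final String SECRET_KEY_PROP = "secret-access-key";

  static Mode resolve(Map<String, String> props) {
    boolean hasAccessKey = isNotBlank(props.get(ACCESS_KEY_PROP));
    boolean hasSecretKey = isNotBlank(props.get(SECRET_KEY_PROP));
    // Fail fast when only one half of the key pair is configured.
    if (hasAccessKey != hasSecretKey) {
      throw new IllegalArgumentException(
          "Both " + ACCESS_KEY_PROP + " and " + SECRET_KEY_PROP
              + " must be set together, or neither");
    }
    return hasAccessKey ? Mode.STATIC_CREDENTIALS : Mode.DEFAULT_CHAIN;
  }

  /** Stand-in for StringUtils.isNotBlank(): null, empty, and whitespace-only are blank. */
  private static boolean isNotBlank(String value) {
    return value != null && !value.trim().isEmpty();
  }
}
```

With both keys present the sketch selects static credentials, with neither it falls back to the default chain, and with exactly one it throws before any client is built.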
…ests

Add model layer for catalog-glue (PR-03):
- GlueConstants: storage descriptor and table-format constants
- GlueTypeConverter: Glue/Hive type string to Gravitino Type mapping
- GlueSchema: maps AWS Glue Database -> Gravitino BaseSchema
- GlueColumn: maps AWS Glue Column -> Gravitino BaseColumn
- GlueTable: maps AWS Glue Table -> Gravitino BaseTable
  (columns, partitioning, distribution, sort orders, properties)

Test architecture: abstract base class + two implementations:
- SyntheticGlueXxxTest: SDK builder, no network, always runs
- AwsGlueXxxIT: real AWS Glue API, tagged gravitino-aws-test, skipped by default
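
The "abstract base class + two implementations" layout can be illustrated with plain Java. The sketch below leaves out JUnit and the AWS SDK so that it stays self-contained; all class and method names are hypothetical, and only the in-memory variant is shown. In the PR, the second variant would fetch the same fixture from real AWS Glue and carry the gravitino-aws-test tag.

```java
import java.util.HashMap;
import java.util.Map;

/** Shared scenarios; subclasses decide where the raw database description comes from. */
abstract class AbstractSchemaScenarioSketch {

  /** Hypothetical fixture hook: returns the description stored for a database name. */
  protected abstract String describe(String databaseName);

  /** One shared scenario, written once and run against every fixture source. */
  final boolean descriptionRoundTrips() {
    return "sales data".equals(describe("sales"));
  }
}

/** Builds its fixture in memory, so it needs no network and can always run. */
final class SyntheticSchemaScenarioSketch extends AbstractSchemaScenarioSketch {

  private final Map<String, String> descriptions = new HashMap<>();

  SyntheticSchemaScenarioSketch() {
    descriptions.put("sales", "sales data");
  }

  @Override
  protected String describe(String databaseName) {
    return descriptions.get(databaseName);
  }
}
```

The point of the design is that every scenario lives in the base class, so the synthetic and the AWS-backed runs cannot drift apart: a new scenario is exercised by both as soon as it is added.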
…Table

- Rename test classes to follow TestXxx naming convention
- Add error handling for malformed type strings in GlueTypeConverter
- Fix NPE in GlueTable distribution and sort order null-checks
- Add null-safe handling for GlueSchema parameters
- Use GlueException for AWS cleanup errors
- Add aws.glue test dependency for SdkException class
- Fix TABLE_FORMAT description to use uppercase values
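
As an illustration of what error handling for malformed type strings can look like, here is a small sketch that parses a Hive-style `decimal(p,s)` string and rejects broken input with a descriptive exception. The class and method names are hypothetical and this is not the code in GlueTypeConverter; the bounds follow Hive's documented decimal rules (precision 1 to 38, bare `decimal` meaning `decimal(10,0)`).

```java
import java.util.Locale;

/** Illustrative only: parses "decimal", "decimal(p)" or "decimal(p,s)". */
final class DecimalTypeStringSketch {

  /** Returns {precision, scale}; throws IllegalArgumentException on malformed input. */
  static int[] parseDecimal(String typeString) {
    if (typeString == null) {
      throw new IllegalArgumentException("Type string must not be null");
    }
    String s = typeString.trim().toLowerCase(Locale.ROOT);
    if (s.equals("decimal")) {
      return new int[] {10, 0}; // Hive's default for an unparameterized decimal
    }
    if (!s.startsWith("decimal(") || !s.endsWith(")")) {
      throw new IllegalArgumentException("Malformed decimal type: " + typeString);
    }
    String inner = s.substring("decimal(".length(), s.length() - 1);
    // Limit -1 keeps trailing empty parts, so "decimal(10,)" is rejected below.
    String[] parts = inner.split(",", -1);
    if (parts.length > 2) {
      throw new IllegalArgumentException("Malformed decimal type: " + typeString);
    }
    int precision;
    int scale;
    try {
      precision = Integer.parseInt(parts[0].trim());
      scale = parts.length == 2 ? Integer.parseInt(parts[1].trim()) : 0;
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("Malformed decimal type: " + typeString, e);
    }
    if (precision < 1 || precision > 38 || scale < 0 || scale > precision) {
      throw new IllegalArgumentException("Decimal out of range: " + typeString);
    }
    return new int[] {precision, scale};
  }
}
```

Wrapping the `NumberFormatException` keeps the original type string in the message, which is usually more useful to a caller than a bare parse error.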

github-actions bot commented Apr 16, 2026

Code Coverage Report

Overall Project 65.21% +0.08% 🟢
Files changed 88.45% 🟢

| Module | Coverage | Status |
| --- | --- | --- |
| aliyun | 1.73% | 🔴 |
| api | 47.09% | 🟢 |
| authorization-common | 85.96% | 🟢 |
| aws | 1.1% | 🔴 |
| azure | 2.6% | 🔴 |
| catalog-common | 10.2% | 🔴 |
| catalog-fileset | 80.02% | 🟢 |
| catalog-glue | 88.66% +17.0% | 🟢 |
| catalog-hive | 81.83% | 🟢 |
| catalog-jdbc-clickhouse | 79.06% | 🟢 |
| catalog-jdbc-common | 42.89% | 🟢 |
| catalog-jdbc-doris | 80.28% | 🟢 |
| catalog-jdbc-hologres | 54.03% | 🟢 |
| catalog-jdbc-mysql | 79.23% | 🟢 |
| catalog-jdbc-oceanbase | 78.38% | 🟢 |
| catalog-jdbc-postgresql | 82.05% | 🟢 |
| catalog-jdbc-starrocks | 78.27% | 🟢 |
| catalog-kafka | 77.01% | 🟢 |
| catalog-lakehouse-generic | 45.07% | 🟢 |
| catalog-lakehouse-hudi | 79.1% | 🟢 |
| catalog-lakehouse-iceberg | 87.27% | 🟢 |
| catalog-lakehouse-paimon | 77.71% | 🟢 |
| catalog-model | 77.72% | 🟢 |
| cli | 44.51% | 🟢 |
| client-java | 77.63% | 🟢 |
| common | 48.97% | 🟢 |
| core | 81.41% | 🟢 |
| filesystem-hadoop3 | 76.97% | 🟢 |
| flink | 40.55% | 🟢 |
| flink-runtime | 0.0% | 🔴 |
| gcp | 14.2% | 🔴 |
| hadoop-common | 10.39% | 🔴 |
| hive-metastore-common | 46.14% | 🟢 |
| iceberg-common | 50.73% | 🟢 |
| iceberg-rest-server | 65.85% +0.13% | 🟢 |
| integration-test-common | 0.0% | 🔴 |
| jobs | 66.17% | 🟢 |
| lance-common | 23.88% | 🔴 |
| lance-rest-server | 57.84% | 🟢 |
| lineage | 53.02% | 🟢 |
| optimizer | 82.95% | 🟢 |
| optimizer-api | 21.95% | 🔴 |
| server | 85.75% | 🟢 |
| server-common | 69.52% | 🟢 |
| spark | 32.79% | 🔴 |
| spark-common | 39.09% | 🔴 |
| trino-connector | 33.83% | 🔴 |

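Files

| Module | File | Coverage | Status |
| --- | --- | --- | --- |
| catalog-glue | GlueTablePropertiesMetadata.java | 100.0% | 🟢 |
| catalog-glue | GlueTable.java | 96.39% | 🟢 |
| catalog-glue | GlueColumn.java | 93.75% | 🟢 |
| catalog-glue | GlueSchema.java | 93.75% | 🟢 |
| catalog-glue | GlueTypeConverter.java | 90.48% | 🟢 |
| catalog-glue | GlueConstants.java | 0.0% | 🔴 |
| iceberg-rest-server | IcebergTableHookDispatcher.java | 71.88% | 🟢 |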

Glue/Hive timestamps are timezoneless. fromGravitino now throws
IllegalArgumentException for TimestampType.withTimeZone(), consistent
with HiveDataTypeConverter.

Ref: https://cwiki.apache.org/confluence/display/hive/languagemanual+types
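
The rule in this commit message can be sketched as below. `SketchTimestamp` is a hypothetical stand-in for Gravitino's timestamp type and `toGlue` for the converter's `fromGravitino` path; only the exception type and the timezoneless behaviour come from the message itself.

```java
/** Illustrative only: rejects timestamps that carry a time zone. */
final class TimestampToGlueSketch {

  /** Hypothetical stand-in for a Gravitino timestamp type. */
  static final class SketchTimestamp {
    final boolean withTimeZone;

    SketchTimestamp(boolean withTimeZone) {
      this.withTimeZone = withTimeZone;
    }
  }

  /** Returns the Glue/Hive type name, or throws if the type cannot be represented. */
  static String toGlue(SketchTimestamp type) {
    if (type.withTimeZone) {
      // Glue/Hive timestamps have no time zone, so converting silently would lose information.
      throw new IllegalArgumentException(
          "Glue does not support timestamp with time zone");
    }
    return "timestamp";
  }
}
```

Throwing here, instead of quietly mapping to a plain `timestamp`, makes the mismatch visible at table-definition time.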
Development

Successfully merging this pull request may close these issues.

[Subtask] catalog-glue: Add GlueSchema and GlueTable model classes with tests

2 participants