
Fixes #6086: Add transparent image detection and math tag validation #6144

Open
nikhilkumarpanigrahi wants to merge 26 commits into oppia:introduce-asset-download-script from nikhilkumarpanigrahi:lesson-asset-validation-v4

Conversation

@nikhilkumarpanigrahi
Contributor

@nikhilkumarpanigrahi nikhilkumarpanigrahi commented Mar 12, 2026

Explanation

Fixes #6086

This PR extends the existing asset download pipeline to detect:

  • Transparent images - Any pixels with alpha < 255 that may cause poor visibility in dark mode (ref. #5792, "[BUG]: Dark mode makes it difficult to see some images and LaTeX").
  • Invalid math tags - oppia-noninteractive-math tags missing raw_latex content, which would force the app to use math SVGs directly.
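
For readers unfamiliar with the transparency check, here is a minimal sketch of what "any pixels with alpha < 255" could look like on the JVM. It is not the PR's actual code: the function body, the error message, and the `encodeSolidPng` helper are assumptions for illustration, and SVGs are assumed to be filtered out by the caller.

```kotlin
import java.awt.image.BufferedImage
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import javax.imageio.ImageIO

/**
 * Returns true if any pixel in the encoded raster image has alpha < 255.
 * Hypothetical stand-in for the PR's hasTransparentPixels().
 */
fun hasTransparentPixels(imageData: ByteArray): Boolean {
  // ImageIO.read returns null when no registered reader can decode the bytes.
  val image = ByteArrayInputStream(imageData).use { ImageIO.read(it) }
    ?: error("Failed to decode image data (${imageData.size} bytes).")
  // Images without an alpha channel are always fully opaque.
  if (!image.colorModel.hasAlpha()) return false
  for (y in 0 until image.height) {
    for (x in 0 until image.width) {
      // getRGB packs the pixel as ARGB; the top byte is the alpha value.
      val alpha = (image.getRGB(x, y) ushr 24) and 0xFF
      if (alpha < 255) return true
    }
  }
  return false
}

/** Illustration-only helper: encodes a 2x2 PNG filled with one ARGB color. */
fun encodeSolidPng(argb: Int): ByteArray {
  val image = BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB)
  for (y in 0 until 2) for (x in 0 until 2) image.setRGB(x, y, argb)
  return ByteArrayOutputStream().also { ImageIO.write(image, "png", it) }.toByteArray()
}
```

Returning on the first non-opaque pixel keeps the scan cheap for the common case where transparency appears near the image border.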

Specifically, this PR:

  • Adds hasTransparentPixels() to ImageRepairer.kt to scan image pixel data for non-opaque alpha values, skipping SVGs.
  • Adds checkMathTagsForLatex() to StructureCompatibilityChecker.kt with MathTagMissingRawLatex failure type, integrated into checkHasValidHtml() flow.
  • Adds post-download transparency scan in DownloadLessons.kt with count-based summary and per-image reporting.
  • Adds comprehensive test coverage (ImageRepairerTest.kt with 7 tests, StructureCompatibilityCheckerTest.kt with 6 tests).
  • Fixes LocalizationTracker.kt to make extractMathContentsFromHtml() return non-nullable values and fail fast on parse errors.
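
To make the math tag check concrete, the following sketch flags oppia-noninteractive-math tags whose math_content-with-value attribute carries no raw_latex. It is an assumption-laden illustration, not the code in StructureCompatibilityChecker.kt: the function name, the regexes, and the choice to use regexes instead of a JSON parser are all hypothetical and were picked only to keep the example dependency-free.

```kotlin
// Captures the attribute list of each math tag.
private val MATH_TAG = Regex("<oppia-noninteractive-math\\b([^>]*)>")

// Captures the (still entity-escaped) value of math_content-with-value.
private val MATH_CONTENT_ATTR = Regex("math_content-with-value=\"([^\"]*)\"")

// Captures the raw_latex string value, allowing backslash escapes inside it.
private val RAW_LATEX = Regex("\"raw_latex\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"")

// Undoes the double escaping (&amp;quot;) seen in the test data quoted in this PR.
private fun String.unescapeEntities(): String =
  replace("&amp;", "&").replace("&quot;", "\"")

/**
 * Returns the zero-based indexes of math tags in [html] that have no
 * non-blank raw_latex value. Hypothetical helper for illustration only.
 */
fun findMathTagsMissingRawLatex(html: String): List<Int> =
  MATH_TAG.findAll(html).mapIndexedNotNull { index, tag ->
    val rawLatex = MATH_CONTENT_ATTR.find(tag.groupValues[1])
      ?.groupValues?.get(1)
      ?.unescapeEntities()
      ?.let { RAW_LATEX.find(it)?.groupValues?.get(1) }
    if (rawLatex.isNullOrBlank()) index else null
  }.toList()
```

A caller could map each returned index to a failure such as MathTagMissingRawLatex; how the real checker attaches content IDs and origins is not shown here.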

Validation performed:

  • All new functionality tests passing.
  • ktlint compliance verified.
  • Static checks passing.

Essential Checklist

  • The PR title starts with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • The explanation section above starts with "Fixes #bugnum: " (If this PR fixes part of an issue, use instead: "Fixes part of #bugnum: ...".)
  • Any changes to scripts/assets files have their rationale included in the PR explanation.
  • The PR follows the style guide.
  • The PR does not contain any unnecessary code changes from Android Studio (reference).
  • The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
  • The PR is assigned to the appropriate reviewers (reference).

For UI-specific PRs only

N/A

@nikhilkumarpanigrahi nikhilkumarpanigrahi requested review from a team as code owners March 12, 2026 09:03
@nikhilkumarpanigrahi nikhilkumarpanigrahi requested review from BenHenning and removed request for a team March 12, 2026 09:03
@nikhilkumarpanigrahi
Contributor Author

nikhilkumarpanigrahi commented Mar 12, 2026

@BenHenning
This is a clean re-submission replacing the closed #6120. All your previous review feedback has been addressed. I also found and fixed two bugs during testing:
  • RAW_LATEX_REGEX never matched real Oppia data (wrong trailing escape pattern).
  • checkHasValidMathTags was internal, which is inaccessible across Bazel module boundaries.

PTAL when you get a chance. Thanks!

@github-actions

@nikhilkumarpanigrahi this PR is being marked as draft because the PR description should not use 'Fix #' or 'Fix part of #' syntax. Instead use 'Fixes' and 'Fixes part of', per referenced issue(s): #6086.

@github-actions github-actions bot marked this pull request as draft March 12, 2026 09:39
@nikhilkumarpanigrahi nikhilkumarpanigrahi changed the title from "Fix #6086: Add transparent image detection and math tag validation" to "Fixes #6086: Add transparent image detection and math tag validation" Mar 12, 2026
@nikhilkumarpanigrahi nikhilkumarpanigrahi marked this pull request as ready for review March 12, 2026 09:40
@github-actions

@nikhilkumarpanigrahi this PR is being marked as draft because the PR description should not use 'Fix #' or 'Fix part of #' syntax. Instead use 'Fixes' and 'Fixes part of', per referenced issue(s): #6086.

@github-actions github-actions bot marked this pull request as draft March 12, 2026 09:47
@nikhilkumarpanigrahi nikhilkumarpanigrahi marked this pull request as ready for review March 12, 2026 09:49
@github-actions

Coverage Report

Results

Number of files assessed: 3
Overall Coverage: 9.58%
Coverage Analysis: FAIL

Failure Cases

| File | Failure Reason | Status |
| --- | --- | --- |
| DownloadLessons.kt (scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt. | |

Failing coverage

| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| ImageRepairer.kt (scripts/src/java/org/oppia/android/scripts/assets/ImageRepairer.kt) | 16.44% | 12 / 73 | | 70% |
| StructureCompatibilityChecker.kt (scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt) | 8.08% | 27 / 334 | | 70% |

To learn more, visit the Oppia Android Code Coverage wiki page

@nikhilkumarpanigrahi
Contributor Author

Regarding coverage failures: The coverage bot flags DownloadLessons.kt, ImageRepairer.kt, and StructureCompatibilityChecker.kt due to low overall file coverage. However, the new functionality I added (hasTransparentPixels and checkMathTagsForLatex) is fully covered by the test files included in this PR (ImageRepairerTest.kt and StructureCompatibilityCheckerTest.kt). The low percentages (16% and 8%) come from pre-existing untested code in those files, not from my changes. I believe the coverage exemptions for these base-branch files would be addressed separately.

@nikhilkumarpanigrahi
Contributor Author

@BenHenning, @adhiamboperes PTAL

@BenHenning
Member

Apologies--hoping to look at this tomorrow.

Member

@BenHenning BenHenning left a comment

Thanks @nikhilkumarpanigrahi. I'm largely happy with these changes, but I do have one suggestion to make before I test them. PTAL.

@oppiabot

oppiabot bot commented Mar 27, 2026

Unassigning @nikhilkumarpanigrahi since a re-review was requested. @nikhilkumarpanigrahi, please make sure you have addressed all review comments. Thanks!

@nikhilkumarpanigrahi
Contributor Author

Hi @BenHenning, I wanted to confirm the preferred base branch for this PR. It is currently based on introduce-asset-download-script, and I see Check base branch failing because of that. Should I keep it on this branch for now, or retarget/rebase it to develop? I’m happy to update it whichever way you prefer.

@github-actions

Coverage Report

Results

Number of files assessed: 9
Overall Coverage: 36.04%
Coverage Analysis: FAIL

Failure Cases

| File | Failure Reason | Status |
| --- | --- | --- |
| DownloadLessons.kt (scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt. | |
| LocalizationTracker.kt (scripts/src/java/org/oppia/android/scripts/gae/proto/LocalizationTracker.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/proto/LocalizationTracker.kt. | |
| AndroidActivityEndpointApi.kt (scripts/src/java/org/oppia/android/scripts/gae/json/AndroidActivityEndpointApi.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/json/AndroidActivityEndpointApi.kt. | |
| GaeQuestion.kt (scripts/src/java/org/oppia/android/scripts/gae/json/GaeQuestion.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/json/GaeQuestion.kt. | |
| GaeAndroidEndpointJsonImpl.kt (scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt. | |

Failing coverage

| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| ImageRepairer.kt (scripts/src/java/org/oppia/android/scripts/assets/ImageRepairer.kt) | 16.22% | 12 / 74 | | 70% |
| StructureCompatibilityChecker.kt (scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt) | 3.76% | 12 / 319 | | 70% |

Passing coverage

Files with passing code coverage
| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| MavenDependenciesRetriever.kt (scripts/src/java/org/oppia/android/scripts/license/MavenDependenciesRetriever.kt) | 95.45% | 189 / 198 | | 70% |

Exempted coverage

Files exempted from coverage
| File | Exemption Reason |
| --- | --- |
| TopicController.kt (domain/src/main/java/org/oppia/android/domain/topic/TopicController.kt) | This file is incompatible with code coverage tooling; skipping coverage check. |

Refer to test_file_exemptions.textproto for the comprehensive list of file exemptions and their required coverage percentages.

To learn more, visit the Oppia Android Code Coverage wiki page

- Add explicit validation for missing/malformed math_content-with-value attribute
  in StructureCompatibilityChecker.checkMathTagsForLatex()
- Add two new CompatibilityFailure types:
  - MathTagMissingContent: detects missing math_content-with-value attribute
  - MathTagHasInvalidContent: detects unparseable JSON in math content
- Expand StructureCompatibilityCheckerTest with 3 new test cases covering
  missing content and malformed JSON edge cases
- Create LocalizationTrackerTest with 3 test cases for extractMathContentsFromHtml()
  covering empty HTML, valid parsed content, and malformed JSON handling
- Add ImageRepairerTest negative-path test for invalid image data exception handling
- All 18 unit tests passing (StructureCompatibilityCheckerTest: 7,
  LocalizationTrackerTest: 3, ImageRepairerTest: 8)
- Code formatted per ktlint and buildifier standards
- Fix repository fallback logic to preserve aar/jar extension when resolving
  Maven artifact URLs from maven_install.json
- Add regression test for aar fallback coordinate resolution
- Fix POM download retry bookkeeping to avoid mutable-set hash instability that
  could leave null POM content and crash with NPE
- Switch zlib archive URL in WORKSPACE from HTTP to HTTPS to avoid checksum
  mismatch during remote fetch in CI
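
The "mutable-set hash instability" item above refers to a general JVM hazard. The sketch below illustrates that hazard only; it is not the retry bookkeeping code from this branch, and the function name, variable names, and values are hypothetical. A mutable set used as a hash key changes its hashCode when mutated, so the stored entry can no longer be found.

```kotlin
/**
 * Shows that a map entry keyed by a mutable set becomes unreachable once the
 * key is mutated. Returns the two lookup results taken after the mutation.
 */
fun lookupAfterMutation(): Pair<String?, String?> {
  val pendingKey = mutableSetOf("repo-a")
  val pomByKey = hashMapOf<Set<String>, String>(pendingKey to "<project/>")

  // Adding an element changes the set's hashCode, but the map still files
  // the entry under the hash computed at insertion time.
  pendingKey.add("repo-b")

  // Lookup with the mutated key hashes to a different value, so it misses.
  val viaMutatedKey = pomByKey[pendingKey]
  // Lookup with the original contents finds the slot but fails equals().
  val viaOriginalContents = pomByKey[setOf("repo-a")]
  return viaMutatedKey to viaOriginalContents
}
```

Both lookups return null even though the map still holds one entry, which is how a later non-null assumption on the looked-up value can end in a NullPointerException. Keying by an immutable snapshot (for example, `pendingKey.toSet()`) avoids the problem.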
@github-actions

Coverage Report

Results

Number of files assessed: 9
Overall Coverage: 25.86%
Coverage Analysis: FAIL

Failure Cases

| File | Failure Reason | Status |
| --- | --- | --- |
| DownloadLessons.kt (scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt. | |
| AndroidActivityEndpointApi.kt (scripts/src/java/org/oppia/android/scripts/gae/json/AndroidActivityEndpointApi.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/json/AndroidActivityEndpointApi.kt. | |
| GaeQuestion.kt (scripts/src/java/org/oppia/android/scripts/gae/json/GaeQuestion.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/json/GaeQuestion.kt. | |
| GaeAndroidEndpointJsonImpl.kt (scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt. | |

Failing coverage

| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| ImageRepairer.kt (scripts/src/java/org/oppia/android/scripts/assets/ImageRepairer.kt) | 17.57% | 13 / 74 | | 70% |
| LocalizationTracker.kt (scripts/src/java/org/oppia/android/scripts/gae/proto/LocalizationTracker.kt) | 5.33% | 20 / 375 | | 70% |
| StructureCompatibilityChecker.kt (scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt) | 8.63% | 29 / 336 | | 70% |

Passing coverage

Files with passing code coverage
| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| MavenDependenciesRetriever.kt (scripts/src/java/org/oppia/android/scripts/license/MavenDependenciesRetriever.kt) | 96.02% | 193 / 201 | | 70% |

Exempted coverage

Files exempted from coverage
| File | Exemption Reason |
| --- | --- |
| TopicController.kt (domain/src/main/java/org/oppia/android/domain/topic/TopicController.kt) | This file is incompatible with code coverage tooling; skipping coverage check. |

Refer to test_file_exemptions.textproto for the comprehensive list of file exemptions and their required coverage percentages.

To learn more, visit the Oppia Android Code Coverage wiki page

@BenHenning
Member

> Hi @BenHenning, I wanted to confirm the preferred base branch for this PR. It is currently based on introduce-asset-download-script, and I see Check base branch failing because of that. Should I keep it on this branch for now, or retarget/rebase it to develop? I’m happy to update it whichever way you prefer.

@nikhilkumarpanigrahi you can ignore basically all of the code coverage and CI failures here since the base branch isn't yet stable for merging (having other people work on the download script is a bit unusual in and of itself, and I think this is one of the first times we've had someone open an actual PR against it). Normally we don't ignore those, but in this case it's fine since I actually need to go and update a lot (most) of this code later on, anyway.

However, I can't quite take a review pass on this PR because it seems like there are a lot of unrelated changes per the 'Files changed' tab. I'm not sure if a merge went bad or something, but please make sure to self check your code and remove anything not directly related to the objective of the PR. Once that's done please assign this back to me to take another review pass.

@nikhilkumarpanigrahi
Contributor Author

@BenHenning thank you for the guidance. I did a full self-check and removed unrelated changes so this PR is now scoped only to the objective of #6086 (transparent image detection + math tag validation). The branch has been updated with a regular push, and the Files changed tab should now reflect only relevant changes.

PTAL when you get a chance.

@oppiabot

oppiabot bot commented Apr 5, 2026

Unassigning @nikhilkumarpanigrahi since a re-review was requested. @nikhilkumarpanigrahi, please make sure you have addressed all review comments. Thanks!

@github-actions

github-actions bot commented Apr 5, 2026

Coverage Report

Results

Number of files assessed: 8
Overall Coverage: 7.90%
Coverage Analysis: FAIL

Failure Cases

| File | Failure Reason | Status |
| --- | --- | --- |
| DownloadLessonList.kt (scripts/src/java/org/oppia/android/scripts/assets/DownloadLessonList.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DownloadLessonList.kt. | |
| DownloadLessons.kt (scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DownloadLessons.kt. | |
| DtoProtoToLegacyProtoConverter.kt (scripts/src/java/org/oppia/android/scripts/assets/DtoProtoToLegacyProtoConverter.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/assets/DtoProtoToLegacyProtoConverter.kt. | |
| TopicPackRepository.kt (scripts/src/java/org/oppia/android/scripts/gae/compat/TopicPackRepository.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/compat/TopicPackRepository.kt. | |
| GaeAndroidEndpointJsonImpl.kt (scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt) | No appropriate test file found for scripts/src/java/org/oppia/android/scripts/gae/GaeAndroidEndpointJsonImpl.kt. | |

Failing coverage

| File | Coverage | Lines Hit | Status | Min Required |
| --- | --- | --- | --- | --- |
| ImageRepairer.kt (scripts/src/java/org/oppia/android/scripts/assets/ImageRepairer.kt) | 17.57% | 13 / 74 | | 70% |
| LocalizationTracker.kt (scripts/src/java/org/oppia/android/scripts/gae/proto/LocalizationTracker.kt) | 5.33% | 20 / 375 | | 70% |
| StructureCompatibilityChecker.kt (scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt) | 8.63% | 29 / 336 | | 70% |

To learn more, visit the Oppia Android Code Coverage wiki page

Member

@BenHenning BenHenning left a comment

Thanks @nikhilkumarpanigrahi. I had some follow-up thoughts, PTAL.

"<oppia-noninteractive-math math_content-with-value=\"" +
"{&amp;quot;raw_latex&amp;quot;:&amp;quot;x^2&amp;quot;\"></oppia-noninteractive-math>"

assertFailsWith<Exception> {
Member

We should be using assertThrows from the main testing package for this as it's available for use in tests. See example:

We also should always test the specific exception type (not Exception) that's thrown and its message, when populated, to ensure the correct exception is actually being thrown.
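
As a rough illustration of the pattern requested above (a specific exception type plus a check on its message), consider the sketch below. The `assertThrows` defined here is a local stand-in written so the example compiles on its own; its signature is an assumption and may differ from the helper in the project's testing package. `decodeOrFail` and its message are likewise hypothetical.

```kotlin
/**
 * Local stand-in for a project-provided assertThrows helper. Returns the
 * thrown exception so the caller can make further assertions on it.
 */
inline fun <reified T : Throwable> assertThrows(operation: () -> Unit): T {
  try {
    operation()
  } catch (t: Throwable) {
    if (t is T) return t
    throw AssertionError(
      "Expected ${T::class.java.simpleName} but got ${t::class.java.simpleName}", t
    )
  }
  throw AssertionError("Expected ${T::class.java.simpleName}, but nothing was thrown")
}

/** Hypothetical function under test: rejects byte arrays that are too short. */
fun decodeOrFail(data: ByteArray): Int {
  check(data.size >= 8) { "Failed to decode image data (${data.size} bytes)." }
  return data.size
}

fun testDecodeOrFail_invalidData_throwsIllegalStateException() {
  val error = assertThrows<IllegalStateException> { decodeOrFail(ByteArray(3)) }

  // Checking the message guards against an unrelated IllegalStateException
  // passing the test. A real test would likely use an assertion library here.
  check(error.message == "Failed to decode image data (3 bytes).")
}
```

Compared with `assertFailsWith<Exception>`, this version fails if the code throws the wrong exception type or the right type for the wrong reason.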

fun testHasTransparentPixels_invalidImageData_throwsError() {
val invalidImageData = "not-a-valid-image".toByteArray()

assertFailsWith<IllegalStateException> {
Member

Ditto here--see my later comment for correctly asserting errors being thrown.

)
converterInitializer = ConverterInitializer(
- activityService, coroutineDispatcher, topicDependencies, imageDownloader, downloadConfig
+ activityService, coroutineDispatcher, topicDependencies, imageDownloader, downloadConfig, filterInvalidTopics
Member

Why is this showing up? Did you sync your branch to the latest changes from introduce-asset-download-script? This (& the other changes in this file) should already be in the base branch.

return@mapNotNull MathTagMissingContent(contentId, origin)
}

val parsedMathContent = runCatching {
Member

Why is runCatching needed here? It's generally an antipattern to use exceptions for control flow so we really ought to avoid that.
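
One possible way to avoid exceptions as control flow is to have the parsing step return an explicit result that the caller branches on. The sketch below is only an illustration under stated assumptions: `MathContentResult`, `parseMathContent`, and the regex are hypothetical, the outcome names merely echo the failure types mentioned in this PR, and a cheap structural check stands in for real JSON parsing.

```kotlin
/** Hypothetical outcome type for classifying a math_content-with-value attribute. */
sealed class MathContentResult {
  data class Parsed(val rawLatex: String) : MathContentResult()
  object MissingContent : MathContentResult()
  object InvalidContent : MathContentResult()
}

// Captures the raw_latex string value, allowing backslash escapes inside it.
private val RAW_LATEX_FIELD = Regex("\"raw_latex\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"")

/**
 * Classifies an already-unescaped attribute value without throwing on
 * malformed input, so callers need neither try/catch nor runCatching.
 */
fun parseMathContent(attributeValue: String?): MathContentResult {
  if (attributeValue.isNullOrBlank()) return MathContentResult.MissingContent
  val trimmed = attributeValue.trim()
  // Assumption: a brace check approximates "is this a JSON object?" here.
  if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
    return MathContentResult.InvalidContent
  }
  val rawLatex = RAW_LATEX_FIELD.find(trimmed)?.groupValues?.get(1)
    ?: return MathContentResult.InvalidContent
  return MathContentResult.Parsed(rawLatex)
}
```

A caller can then use a `when` over the result to emit the matching compatibility failure, keeping the error cases visible in the type rather than hidden in a catch block.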
