Fixes #6086: Add transparent image detection and math tag validation#6144
Conversation
|
@BenHenning PTAL when you get a chance. Thanks! |
|
@nikhilkumarpanigrahi this PR is being marked as draft because the PR description should not use 'Fix #' or 'Fix part of #' syntax. Instead use 'Fixes' and 'Fixes part of', per referenced issue(s): #6086. |
|
@nikhilkumarpanigrahi this PR is being marked as draft because the PR description should not use 'Fix #' or 'Fix part of #' syntax. Instead use 'Fixes' and 'Fixes part of', per referenced issue(s): #6086. |
Coverage ReportResultsNumber of files assessed: 3 Failure Cases
Failing coverage
|
|
Regarding coverage failures: The coverage bot flags DownloadLessons.kt, ImageRepairer.kt, and StructureCompatibilityChecker.kt due to low overall file coverage. However, the new functionality I added (hasTransparentPixels and checkMathTagsForLatex) is fully covered by the test files included in this PR (ImageRepairerTest.kt and StructureCompatibilityCheckerTest.kt). The low percentages (16% and 8%) come from pre-existing untested code in those files, not from my changes. I believe the coverage exemptions for these base-branch files would be addressed separately. |
|
@BenHenning, @adhiamboperes PTAL |
|
Apologies--hoping to look at this tomorrow. |
BenHenning
left a comment
There was a problem hiding this comment.
Thanks @nikhilkumarpanigrahi. I'm largely happy with these changes, but I do have one suggestion to make before I test them. PTAL.
scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt
Outdated
Show resolved
Hide resolved
scripts/src/java/org/oppia/android/scripts/gae/compat/StructureCompatibilityChecker.kt
Outdated
Show resolved
Hide resolved
scripts/src/java/org/oppia/android/scripts/assets/ImageRepairer.kt
Outdated
Show resolved
Hide resolved
|
Unassigning @nikhilkumarpanigrahi since a re-review was requested. @nikhilkumarpanigrahi, please make sure you have addressed all review comments. Thanks! |
…t' into lesson-asset-validation-v4
- Fix ktlint import ordering and max-line-length violations - Remove deprecated worked_example fields from skills.textproto to match ConceptCard proto schema
|
Hi @BenHenning, I wanted to confirm the preferred base branch for this PR. It is currently based on introduce-asset-download-script, and I see Check base branch failing because of that. Should I keep it on this branch for now, or retarget/rebase it to develop? I’m happy to update it whichever way you prefer. |
Coverage ReportResultsNumber of files assessed: 9 Failure Cases
Failing coverage
Passing coverageFiles with passing code coverage
Exempted coverageFiles exempted from coverage
|
- Add explicit validation for missing/malformed math_content-with-value attribute in StructureCompatibilityChecker.checkMathTagsForLatex() - Add two new CompatibilityFailure types: - MathTagMissingContent: detects missing math_content-with-value attribute - MathTagHasInvalidContent: detects unparseable JSON in math content - Expand StructureCompatibilityCheckerTest with 3 new test cases covering missing content and malformed JSON edge cases - Create LocalizationTrackerTest with 3 test cases for extractMathContentsFromHtml() covering empty HTML, valid parsed content, and malformed JSON handling - Add ImageRepairerTest negative-path test for invalid image data exception handling - All 18 unit tests passing (StructureCompatibilityCheckerTest: 7, LocalizationTrackerTest: 3, ImageRepairerTest: 8) - Code formatted per ktlint and buildifier standards
- Fix repository fallback logic to preserve aar/jar extension when resolving Maven artifact URLs from maven_install.json - Add regression test for aar fallback coordinate resolution - Fix POM download retry bookkeeping to avoid mutable-set hash instability that could leave null POM content and crash with NPE - Switch zlib archive URL in WORKSPACE from HTTP to HTTPS to avoid checksum mismatch during remote fetch in CI
Coverage ReportResultsNumber of files assessed: 9 Failure Cases
Failing coverage
Passing coverageFiles with passing code coverage
Exempted coverageFiles exempted from coverage
|
@nikhilkumarpanigrahi you can ignore basically all of the code coverage and CI failures here since the base branch isn't yet stable for merging (having other people work on the download script is a bit unusual in and of itself, and I think this is one of the first times we've had someone open an actual PR against it). Normally we don't ignore those, but in this case it's fine since I actually need to go and update a lot (most) of this code later on, anyway. However, I can't quite take a review pass on this PR because it seems like there are a lot of unrelated changes per the 'Files changed' tab. I'm not sure if a merge went bad or something, but please make sure to self check your code and remove anything not directly related to the objective of the PR. Once that's done please assign this back to me to take another review pass. |
|
@BenHenning thank you for the guidance. I did a full self-check and removed unrelated changes so this PR is now scoped only to the objective of #6086 (transparent image detection + math tag validation). The branch has been updated with a regular push, and the Files changed tab should now reflect only relevant changes. PTAL when you get a chance. |
|
Unassigning @nikhilkumarpanigrahi since a re-review was requested. @nikhilkumarpanigrahi, please make sure you have addressed all review comments. Thanks! |
Coverage ReportResultsNumber of files assessed: 8 Failure Cases
Failing coverage
|
BenHenning
left a comment
There was a problem hiding this comment.
Thanks @nikhilkumarpanigrahi. I had some follow-up thoughts, PTAL.
| "<oppia-noninteractive-math math_content-with-value=\"" + | ||
| "{&quot;raw_latex&quot;:&quot;x^2&quot;\"></oppia-noninteractive-math>" | ||
|
|
||
| assertFailsWith<Exception> { |
There was a problem hiding this comment.
We should be using assertThrows from the main testing package for this as it's available for use in tests. See example:
We also should always test the specific exception type (not Exception) that's thrown and its message, when populated, to ensure the correct exception is actually being thrown.
| fun testHasTransparentPixels_invalidImageData_throwsError() { | ||
| val invalidImageData = "not-a-valid-image".toByteArray() | ||
|
|
||
| assertFailsWith<IllegalStateException> { |
There was a problem hiding this comment.
Ditto here--see my later comment for correctly asserting errors being thrown.
| ) | ||
| converterInitializer = ConverterInitializer( | ||
| activityService, coroutineDispatcher, topicDependencies, imageDownloader, downloadConfig | ||
| activityService, coroutineDispatcher, topicDependencies, imageDownloader, downloadConfig, filterInvalidTopics |
There was a problem hiding this comment.
Why is this showing up? Did you sync your branch to the latest changes from introduce-asset-download-script? This (& the other changes in this file) should already be in the base branch.
| return@mapNotNull MathTagMissingContent(contentId, origin) | ||
| } | ||
|
|
||
| val parsedMathContent = runCatching { |
There was a problem hiding this comment.
Why is runCatching needed here? It's generally an antipattern to use exceptions for control flow so we really ought to avoid that.
Explanation
Fixes #6086
This PR extends the existing asset download pipeline to detect:
Transparent images - Any pixels with alpha < 255 that may cause poor visibility in dark mode (ref [BUG]: Dark mode makes it difficult to see some images and LaTeX #5792)
Invalid math tags - oppia-noninteractive-math tags missing raw_latex content, which would force the app to use math SVGs directly.
Specifically, this PR:
hasTransparentPixels()toImageRepairer.ktto scan image pixel data for non-opaque alpha values, skipping SVGs.checkMathTagsForLatex()toStructureCompatibilityChecker.ktwithMathTagMissingRawLatexfailure type, integrated intocheckHasValidHtml()flow.DownloadLessons.ktwith count-based summary and per-image reporting.ImageRepairerTest.ktwith 7 tests,StructureCompatibilityCheckerTest.ktwith 6 tests).LocalizationTracker.ktto makeextractMathContentsFromHtml()return non-nullable values and fail fast on parse errors.Validation performed:
Essential Checklist
For UI-specific PRs only
N/A