ICU-23365 Initialize pointer members in TransliterationRule constructor to prevent crash #3936
Open
OwenSanzas wants to merge 1 commit into unicode-org:main
The main constructor's initializer list did not initialize anteContext, key, postContext, or output. If the constructor returns early (e.g., U_FAILURE(status)), these pointers remain uninitialized. The destructor then calls delete on garbage values, causing a SEGV. Fix: initialize all pointer and scalar members in the initializer list.
Description
TransliterationRule's main constructor (rbt_rule.cpp:58) does not initialize its pointer members (`anteContext`, `key`, `postContext`, `output`) in the initializer list. If the constructor returns early (e.g., when `U_FAILURE(status)` at line 72), these pointers remain uninitialized. When the `LocalPointer` destructor subsequently calls `~TransliterationRule()`, it executes `delete` on garbage pointer values, causing a SEGV.
Triggered via the public API `Transliterator::createFromRules()` with a 15-byte malformed input.
Root Cause
The initializer list only initializes `segments(nullptr)` and `data(theData)`. On early return, `anteContext`, `key`, `postContext`, and `output` contain garbage. The destructor unconditionally deletes all of them.
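
The failure mode and its remedy can be sketched in isolation. The `Rule` class below is a hypothetical stand-in, not the real `TransliterationRule`; the `fail` flag and the `allNull()` helper are assumptions made for illustration. The member names are borrowed from the description above. It shows only the corrected pattern: every owning pointer is set to `nullptr` in the initializer list, so an early return leaves nothing for the destructor to misinterpret.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration only; not the ICU class.
class Rule {
public:
    // Every pointer member is initialized to nullptr before the body runs.
    // Omitting these initializers would leave the pointers indeterminate
    // on the early-return path, which is the bug described above.
    explicit Rule(bool fail)
        : anteContext(nullptr), key(nullptr),
          postContext(nullptr), output(nullptr) {
        if (fail) {
            return;  // early exit, analogous to a U_FAILURE(status) check
        }
        anteContext = new std::string("ante");
        key = new std::string("key");
        postContext = new std::string("post");
        output = new std::string("out");
    }

    ~Rule() {
        // Unconditional deletes are safe here because each pointer is
        // either nullptr or owns an allocation; delete on nullptr is a no-op.
        delete anteContext;
        delete key;
        delete postContext;
        delete output;
    }

    // Raw owning pointers: forbid copying to avoid double deletes.
    Rule(const Rule&) = delete;
    Rule& operator=(const Rule&) = delete;

    // Illustrative helper: true when no member owns an allocation.
    bool allNull() const {
        return !anteContext && !key && !postContext && !output;
    }

private:
    std::string* anteContext;
    std::string* key;
    std::string* postContext;
    std::string* output;
};
```

Constructing `Rule(true)` and letting it go out of scope exercises the early-return path; with the initializers in place, the destructor only ever sees `nullptr` or valid allocations.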
Fix
Add `nullptr` initialization for all pointer members and zero-initialization for scalar members in the initializer list. After this fix, early returns leave all pointers as `nullptr`, and `delete nullptr` is a safe no-op per the C++ standard.
Reproduction
Crash input (15 bytes):
```bash
echo 'AAAcAAABABgYJAAYGCsA' | base64 -d > crash.bin
```
Fuzzer:
```cpp
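#include <stddef.h>
#include <stdint.h>
#include <cstring>
#include <memory>
#include "unicode/translit.h"
#include "unicode/unistr.h"
#include "unicode/parseerr.h"
#include "unicode/utypes.h"
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    if (size < 3) return 0;
    // The low bit of the first byte selects the direction.
    uint8_t dir = data[0] & 1;
    data++; size--;
    // Treat the remaining bytes as UTF-16 code units.
    size_t unistr_size = size / 2;
    std::unique_ptr<char16_t[]> fuzzbuff(new char16_t[unistr_size]);
    std::memcpy(fuzzbuff.get(), data, unistr_size * 2);
    icu::UnicodeString fuzzstr(false, fuzzbuff.get(),
                               static_cast<int32_t>(unistr_size));
    // The harness as captured stops here; the call below is assumed from
    // the standalone reproducer.
    UErrorCode status = U_ZERO_ERROR;
    UParseError pe;
    std::unique_ptr<icu::Transliterator> t(icu::Transliterator::createFromRules(
        UNICODE_STRING_SIMPLE("test"), fuzzstr,
        dir ? UTRANS_FORWARD : UTRANS_REVERSE, pe, status));
    return 0;
}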
```
Standalone (no fuzzer):
```cpp
#include "unicode/translit.h"
#include "unicode/unistr.h"
#include "unicode/parseerr.h"
#include "unicode/utypes.h"
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iterator>
#include <vector>
int main(int argc, char* argv[]) {
    if (argc < 2) return 1;  // expects the path to crash.bin
    std::ifstream file(argv[1], std::ios::binary);
    std::vector<uint8_t> data((std::istreambuf_iterator<char>(file)),
                              std::istreambuf_iterator<char>());
    if (data.size() < 3) return 1;
    // The low bit of the first byte selects the direction.
    uint8_t dir = data[0] & 1;
    // Treat the remaining bytes as UTF-16 code units.
    size_t unistr_size = (data.size() - 1) / 2;
    std::vector<char16_t> buf(unistr_size);
    std::memcpy(buf.data(), data.data() + 1, unistr_size * 2);
    icu::UnicodeString rules(false, buf.data(),
                             static_cast<int32_t>(unistr_size));
    UErrorCode status = U_ZERO_ERROR;
    UParseError pe;
    icu::Transliterator* t = icu::Transliterator::createFromRules(
        UNICODE_STRING_SIMPLE("test"), rules,
        dir ? UTRANS_FORWARD : UTRANS_REVERSE, pe, status);
    if (U_SUCCESS(status) && t) {
        icu::UnicodeString sample(u"Hello");
        t->transliterate(sample);
    } else {
        // Report the failure code by name.
        std::fprintf(stderr, "createFromRules failed: %s\n",
                     u_errorName(status));
    }
    delete t;  // safe when t is nullptr
    return 0;
}
```
Before fix: SEGV at rbt_rule.cpp:196
After fix: createFromRules failed: U_MEMORY_ALLOCATION_ERROR (graceful error return)
Contribution: Found by FuzzingBrain (O2Lab, Texas A&M University) using a custom transliterator fuzzer targeting the previously unfuzzed createFromRules() API surface.
Checklist