ICU-23365 Initialize pointer members in TransliterationRule constructor to prevent crash #3936

Open
OwenSanzas wants to merge 1 commit into unicode-org:main from OwenSanzas:ICU-23365-fix-transliteration-rule-crash

@OwenSanzas

Description

TransliterationRule's main constructor (rbt_rule.cpp:58) does not initialize
its pointer members (anteContext, key, postContext, output) in the initializer
list. If the constructor returns early (e.g., when U_FAILURE(status) at line 72),
these pointers remain uninitialized. When the LocalPointer destructor subsequently
calls ~TransliterationRule(), it executes delete on garbage pointer values,
causing a SEGV.

Triggered via the public API Transliterator::createFromRules() with a 15-byte
malformed input.

Root Cause

The initializer list only initializes segments(nullptr) and data(theData).
On early return, anteContext, key, postContext, and output contain garbage.
The destructor unconditionally deletes all of them.
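
The key point is that a plain `return` inside a constructor ends construction normally, so the object counts as fully constructed and its destructor will run later. The sketch below illustrates only that behavior. `EarlyReturnProbe` and `SketchStatus` are hypothetical stand-ins and not ICU types, and the probe holds no raw pointers, so it is safe to run.

```cpp
#include <cassert>

// Hypothetical stand-in for a status code; not ICU's UErrorCode.
enum SketchStatus { SKETCH_OK, SKETCH_FAILED };

struct EarlyReturnProbe {
    static int destroyed;  // counts destructor runs

    explicit EarlyReturnProbe(SketchStatus status) {
        if (status != SKETCH_OK) {
            return;  // normal exit: the object is still fully constructed
        }
        initialized = true;
    }
    ~EarlyReturnProbe() { ++destroyed; }

    bool initialized = false;
};
int EarlyReturnProbe::destroyed = 0;

// Builds a probe with a failed status and reports how many destructors ran.
int destructorRunsAfterEarlyReturn() {
    EarlyReturnProbe::destroyed = 0;
    {
        EarlyReturnProbe probe(SKETCH_FAILED);
        assert(!probe.initialized);  // the constructor body was skipped
    }  // the destructor runs here even though the constructor returned early
    return EarlyReturnProbe::destroyed;
}
```

Calling `destructorRunsAfterEarlyReturn()` returns 1. Any member the skipped constructor body was supposed to set must therefore already hold a value the destructor can handle.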

Fix

Add nullptr initialization for all pointer members and zero-initialization for
scalar members in the initializer list. After this fix, early returns leave all
pointers as nullptr, and delete nullptr is a safe no-op per the C++ standard.
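
A minimal sketch of the fix pattern, assuming a simplified class: `SketchRule` is hypothetical and only mirrors the member names mentioned above (the `std::string` element type, `keyLength`, and the `simulateFailure` flag are illustrative assumptions, not the ICU source).

```cpp
#include <cassert>
#include <string>

// Hypothetical class that mirrors the shape of the fix; not the ICU source.
class SketchRule {
public:
    explicit SketchRule(bool simulateFailure)
        : anteContext(nullptr), key(nullptr), postContext(nullptr),
          output(nullptr), keyLength(0) {
        if (simulateFailure) {
            return;  // every pointer is still nullptr at this point
        }
        key = new std::string("key");
        output = new std::string("out");
        keyLength = static_cast<int>(key->size());
    }
    ~SketchRule() {
        // Unconditional deletes are safe because delete on nullptr is a no-op.
        delete anteContext;
        delete key;
        delete postContext;
        delete output;
    }
    SketchRule(const SketchRule&) = delete;
    SketchRule& operator=(const SketchRule&) = delete;

    bool allNull() const {
        return !anteContext && !key && !postContext && !output;
    }
    int length() const { return keyLength; }

private:
    std::string* anteContext;
    std::string* key;
    std::string* postContext;
    std::string* output;
    int keyLength;
};

// Exercises the early-return path and destroys the object afterwards.
bool earlyReturnLeavesNullPointers() {
    SketchRule failed(/*simulateFailure=*/true);
    assert(failed.length() == 0);
    return failed.allNull();
}
```

The initializer list follows the member declaration order, which avoids reorder warnings and makes it easy to see that no member is skipped.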

Reproduction

Crash input (15 bytes):
```bash
echo 'AAAcAAABABgYJAAYGCsA' | base64 -d > crash.bin
```

Fuzzer:
```cpp
#include <stddef.h>
#include <stdint.h>
#include <cstring>
#include <memory>
#include "unicode/translit.h"
#include "unicode/unistr.h"
#include "unicode/parseerr.h"
#include "unicode/utypes.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    if (size < 3) return 0;
    // The first byte selects the direction; the rest becomes the rule text.
    uint8_t dir = data[0] & 1;
    data++; size--;
    // Copy the remaining bytes into UTF-16 code units (an odd trailing byte is dropped).
    size_t unistr_size = size / 2;
    std::unique_ptr<char16_t[]> fuzzbuff(new char16_t[unistr_size]);
    std::memcpy(fuzzbuff.get(), data, unistr_size * 2);
    icu::UnicodeString fuzzstr(false, fuzzbuff.get(), unistr_size);

    UErrorCode status = U_ZERO_ERROR;
    UParseError pe;
    std::unique_ptr<icu::Transliterator> t(
        icu::Transliterator::createFromRules(
            UNICODE_STRING_SIMPLE("fuzz"), fuzzstr,
            dir ? UTRANS_FORWARD : UTRANS_REVERSE, pe, status));
    if (U_SUCCESS(status) && t) {
        icu::UnicodeString sample(u"Hello World 123");
        t->transliterate(sample);
    }
    return 0;
}
```
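
To make the input layout concrete, here is a small sketch of how the harness above splits its input: byte 0 supplies the direction bit and the remaining bytes are copied into UTF-16 code units. `SplitInput`, `splitFuzzInput`, and `kCrashBytes` are hypothetical names introduced for illustration, and the byte values were decoded by hand from the base64 string above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper type; not part of the harness or of ICU.
struct SplitInput {
    bool forward;                 // true when the low bit of byte 0 is set
    std::vector<char16_t> units;  // rule text as UTF-16 code units
};

// Mirrors the harness: one direction byte, then pairs of bytes as code units.
SplitInput splitFuzzInput(const uint8_t* data, size_t size) {
    assert(data != nullptr || size == 0);
    SplitInput result{false, {}};
    if (size < 3) return result;        // same minimum size as the harness
    result.forward = (data[0] & 1) != 0;
    size_t unitCount = (size - 1) / 2;  // an odd trailing byte is ignored
    result.units.resize(unitCount);
    // Byte order follows the host, exactly as with memcpy in the harness.
    std::memcpy(result.units.data(), data + 1, unitCount * sizeof(char16_t));
    return result;
}

// The 15 crash bytes, decoded by hand from the base64 string (an assumption
// worth re-checking against crash.bin).
const uint8_t kCrashBytes[15] = {
    0x00, 0x00, 0x1C, 0x00, 0x00, 0x01, 0x00, 0x18,
    0x18, 0x24, 0x00, 0x18, 0x18, 0x2B, 0x00};
```

For `kCrashBytes`, `splitFuzzInput` yields 7 code units and `forward == false`, so the harness takes the `UTRANS_REVERSE` branch for this input.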

Standalone (no fuzzer):
```cpp
#include "unicode/translit.h"
#include "unicode/unistr.h"
#include "unicode/parseerr.h"
#include "unicode/utypes.h"
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iterator>
#include <vector>

int main(int argc, char* argv[]) {
    if (argc < 2) return 1;  // expects the path to crash.bin
    std::ifstream file(argv[1], std::ios::binary);
    std::vector<uint8_t> data((std::istreambuf_iterator<char>(file)),
                              std::istreambuf_iterator<char>());
    if (data.size() < 3) return 1;
    // Same layout as the fuzzer: direction byte, then UTF-16 code units.
    uint8_t dir = data[0] & 1;
    size_t unistr_size = (data.size() - 1) / 2;
    std::vector<char16_t> buf(unistr_size);
    memcpy(buf.data(), data.data() + 1, unistr_size * 2);
    icu::UnicodeString rules(false, buf.data(), unistr_size);
    UErrorCode status = U_ZERO_ERROR;
    UParseError pe;
    icu::Transliterator* t = icu::Transliterator::createFromRules(
        UNICODE_STRING_SIMPLE("test"), rules,
        dir ? UTRANS_FORWARD : UTRANS_REVERSE, pe, status);
    if (U_SUCCESS(status) && t) {
        icu::UnicodeString sample(u"Hello");
        t->transliterate(sample);
    } else {
        // Report the failure; u_errorName() is declared in unicode/utypes.h.
        std::fprintf(stderr, "createFromRules failed: %s\n", u_errorName(status));
    }
    delete t;  // deleting nullptr is a no-op
    return 0;
}
```

Before fix: SEGV at rbt_rule.cpp:196
After fix: createFromRules failed: U_MEMORY_ALLOCATION_ERROR (graceful error return)

Contribution: Found by FuzzingBrain (O2Lab, Texas A&M University) using a custom transliterator fuzzer targeting the previously unfuzzed createFromRules() API surface.

Checklist

  • Required: Issue filed: ICU-23365
  • Required: The PR title must be prefixed with a JIRA Issue number
  • Required: Each commit message must be prefixed with a JIRA Issue number
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable

The main constructor's initializer list did not initialize anteContext,
key, postContext, or output. If the constructor returns early (e.g.,
U_FAILURE(status)), these pointers remain uninitialized. The destructor
then calls delete on garbage values, causing a SEGV.

Fix: initialize all pointer and scalar members in the initializer list.