Enable race detector for CI #1441
Conversation
|
I was thinking more of a flag like ENABLE_RACE or something like that. |
|
LGTM |
|
I have checked 'Allow edits from maintainers'. You know 😄 Edit: maybe you guys with the maintainer key could upload a new PR with the updated signature. This change is only a one-liner, so that would probably be easier for you. |
|
LGTM |
|
It is the coverage step that fails, due to: […] Is that what the signature is for? |
Force-pushed from ab4224e to 57a07cd
|
The mysql check fails on drone with: […] Some linker flags missing, @typeless? |
Force-pushed from 57a07cd to e9915f9
|
@lunny Need to re-sign the drone config. |
Force-pushed from e9915f9 to 34a00ff
|
Build failed. What's this? |
|
Is the drone.sig updated? Otherwise put […] |
|
@bkcsoft yes, updated |
|
@typeless any news? |
|
@lunny It looks like glibc is absent. Alpine Linux uses musl libc, which is probably incompatible (or it only supports static linking?). Also, I found this: golang/go#9918. |
|
So let's move it to v1.x.x |
|
Something is logging after or in-between tests again. |
|
The |
It's not that. This is a race forced by Go because […]
If you look carefully at the stack trace you will see that the race is detected deep within Go's internal code. This is a deliberate race added by Go in order to detect logging to a test that has already completed. Logging to […]

I have three suspicions for what is happening:

1. Queue Asynchronicity

Let's look at a previous incarnation of this problem - (and perhaps one that has returned):
I previously attempted to handle this by flushing the queues - but this failed - and ultimately @lunny worked around it by adding the immediate queue, abandoning asynchronicity in queues during tests. Now, the above goroutine trace mentions a leveldb, which makes me suspicious that there is still a non-immediate disk/level queue in the tests, and that may be the cause of this. That could be because of my changes to the default queue settings, but it might have always been there, waiting for some timing change to appear.

2. Logger Asynchronicity

Fundamentally the logger has to be asynchronous to be performant, and it uses log-event channels as its method of asynchronicity. The current test logger - which we absolutely need in order to make interpreting our test logs at all helpful (see above) - is hooked in using such a channel. There could be a number of log events still waiting to be processed when the test ends. An additional […]

3. Other asynchronous behaviour

Gitea has other uncontrolled asynchronous activity that is external to Queues and could be the cause of this "race". Go's race detector here is deeply unhelpful, as it's not telling us what was trying to be logged, and the indirection used in the logger to make it asynchronous means that the stack trace provided by Go's race detection isn't telling us who actually called log. This makes debugging the issue difficult...

Summary

Ultimately we're going to need to add complete lifecycle (start/stop) control to every test (and by extension the whole of Gitea) to prevent random races like this in the future. That is - each test will shut down everything between tests - including logging. There's the other option of just throwing away inter-test logging or emitting it to the console, but that's just ignoring the issue. |
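To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above, placed in a _test.go file. It is not Gitea's actual testlogger, and all of the names (asyncLogger, tlog, TestSomething, and so on) are hypothetical: a channel-based logger holds a reference to the "current" *testing.T and delivers events from a background goroutine, so with nothing flushing it before a test returns, delivery can land after the test has completed, and swapping the current test between tests is an unsynchronized access.

```go
package logger

import "testing"

// asyncLogger is a hypothetical stand-in for a channel-based test logger.
// A background goroutine drains the channel and writes to whichever
// *testing.T is currently registered.
type asyncLogger struct {
	events chan string
	t      *testing.T // current test; swapped between tests without synchronization
}

func newAsyncLogger() *asyncLogger {
	l := &asyncLogger{events: make(chan string, 64)}
	go func() {
		for msg := range l.events {
			// If the test has already returned, this is exactly the
			// "logging to a completed test" that Go deliberately flags:
			// depending on the Go version it panics and/or shows up in
			// the -race report.
			l.t.Log(msg)
		}
	}()
	return l
}

func (l *asyncLogger) Log(msg string) { l.events <- msg }

var tlog = newAsyncLogger()

func TestSomething(t *testing.T) {
	tlog.t = t
	tlog.Log("doing some work")
	// Nothing flushes the logger before returning, so the buffered event
	// may be delivered only after this test has finished.
}

func TestAnother(t *testing.T) {
	// Overwriting tlog.t while the consumer goroutine may still be
	// delivering the previous test's message is an unsynchronized
	// write/read pair that the race detector can catch.
	tlog.t = t
	tlog.Log("more work")
}
```

A pattern like this only fails when the timing lines up, which matches the intermittent CI failures described above; flushing or shutting the logger down between tests (the lifecycle control mentioned in the Summary) closes that window.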
|
EDIT: nope, at least it had passed once: https://drone.gitea.io/go-gitea/gitea/42750 🙈 |
Set RACE_ENABLED=0 to disable it when releasing.
|
CI failed, but it seems unrelated. |
|
make L-G-T-M work |
close #1430