start by moving the broken tests out of `neleval/test.py` to somewhere they actually get run