30 commits
63b1727
update s3 contrib
mdragilev Jun 28, 2024
b852f88
remove b1
tempanyman Jul 16, 2024
0bdbfcf
Merge pull request #26 from Affirm/michaeld/s3-contrib-bump-boto
tempanyman Jul 16, 2024
149a0a0
Luigi scheduler running with Py 3.12 and HelloWorldTask completed suc…
mrafayaleem Jan 13, 2026
1569533
Rebase branch
mrafayaleem Jan 13, 2026
95510ea
Add more Python 3.12 compatibility fixes
mrafayaleem Jan 16, 2026
6b905e9
Test fixes
mrafayaleem Jan 16, 2026
0d4f01a
More fixes
mrafayaleem Jan 16, 2026
c81d487
More fixes
mrafayaleem Jan 16, 2026
2b8479c
Add more changes
mrafayaleem Jan 16, 2026
ac961da
Update claude.md
mrafayaleem Jan 16, 2026
daaf37c
Add removed tests
mrafayaleem Jan 16, 2026
af685eb
Rename
mrafayaleem Jan 16, 2026
811cea4
RC1
mrafayaleem Jan 26, 2026
9c66156
RC2
mrafayaleem Jan 26, 2026
72da954
RC2
mrafayaleem Jan 26, 2026
2aa26cc
Update requests
mrafayaleem Jan 27, 2026
4ea003c
Publish rc3
mrafayaleem Jan 27, 2026
bef79d4
More fixes
mrafayaleem Jan 27, 2026
424b5b9
RC4
mrafayaleem Jan 27, 2026
3be6fda
Update versioning to bypass uv pre-release verification
mrafayaleem Jan 30, 2026
1dcda89
Add Py39 compatibility
mrafayaleem Mar 23, 2026
be80135
Fix run_tests.sh script
mrafayaleem Mar 23, 2026
0006a90
Rebase s3 from 1.4.7
mrafayaleem Mar 23, 2026
0aadf9c
Update s3
mrafayaleem Mar 24, 2026
42b1d07
Update gitignore
mrafayaleem Mar 24, 2026
2aa8f2c
S3 compatibility
mrafayaleem Mar 24, 2026
05c7113
Reorder S3 classes and lift boto3 imports to top level
mrafayaleem Apr 13, 2026
9cf1c1a
Fail fast in S3ClientBoto3.__init__ if boto3 is not installed
mrafayaleem Apr 13, 2026
f2893bd
Bump the version
mrafayaleem Apr 13, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -172,3 +172,6 @@ Icon
Network Trash Folder
Temporary Items
.apdisk
.venv*
.pypirc

51 changes: 51 additions & 0 deletions .mcp.json
@@ -0,0 +1,51 @@
{
  "mcpServers": {
    "microsoft/markitdown": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "markitdown-mcp==0.0.1a4"
      ],
      "gallery": "https://api.mcp.github.com",
      "version": "1.0.0"
    },
    "Context7": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "-y",
        "@upstash/context7-mcp@latest"
      ]
    },
    "spark-history-server": {
      "type": "http",
      "url": "http://localhost:18888/mcp",
      "env": {
        "SHS_MCP_TRANSPORT": "stdio"
      }
    },
    "memory-bank": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "--from",
        "git+ssh://git@github.com/Affirm/ai-memory-bank-mcp",
        "mcp_memory_bank_setup"
      ]
    },
    "notion": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.notion.com/mcp"
      ]
    },
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/"
    }
  },
  "inputs": []
}
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.12.7
168 changes: 168 additions & 0 deletions CLAUDE.md
Nicely written instruction file here. Did you use it along with the MCP setup JSON from above for this upgrade?

Author
I did, yes. I relied heavily on both the MCP setup and the claude.md file. Whenever I discovered a new command or process that was relevant, I asked Claude to update this .md file.

@@ -0,0 +1,168 @@
# Luigi Development Guide

## Overview
Luigi is a Python package for building complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and more.

## Development Setup

### Virtual Environment
```bash
source .venv/bin/activate
```

### Running Tests
```bash
# Run all tests and log results (default: Py312)
./run_tests.sh

# Run against Py39 (.venv39) or Py312 (.venv) explicitly
./run_tests.sh --py39
./run_tests.sh --py312

# Run specific test file
python -m pytest test/some_test.py -v

# Run specific test
python -m pytest test/some_test.py::TestClass::test_method -v

# Run with config (required for some tests)
LUIGI_CONFIG_PATH=test/testconfig/luigi.cfg python -m pytest test/some_test.py -v
```

### Running Race-Condition-Sensitive Tests in Isolation

Some tests use multiprocessing, real network ports, or timing-dependent scheduler state. They pass reliably in isolation but may flake when run alongside the full suite due to port conflicts or shared resources. Run them individually:

```bash
# Multiprocess worker tests (spawn new processes; sensitive to system load)
python -m pytest test/worker_multiprocess_test.py -v

# Dynamic dependency tests with multiple workers (timing-sensitive)
python -m pytest "test/worker_test.py::DynamicDependenciesWithMultipleWorkersTest" -v

# Scheduler tests that start a real server process
python -m pytest test/scheduler_test.py -v

# RPC / server tests (bind to real ports)
python -m pytest test/rpc_test.py test/server_test.py -v

# Remote scheduler tests
python -m pytest test/remote_scheduler_test.py -v
```

If a test fails in the full suite but passes in isolation, treat it as a pre-existing race condition rather than a regression.
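
The port conflicts mentioned above usually come from tests binding to a fixed port. A minimal sketch of one common workaround, asking the OS for an unused port, is shown below. The helper name `find_free_port` is hypothetical and is not part of Luigi or its test suite.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on `host` (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the kernel pick an ephemeral port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

The port is released when the socket closes, so another process could still claim it before the test server binds. This narrows the race window but does not remove it.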

**Known macOS-only failures (pass on Linux CI):** `worker_multiprocess_test` and `rpc_test::RequestsFetcherTest::test_fork_changes_session` fail on macOS because, since Python 3.8, the default multiprocessing start method on macOS is `spawn` rather than `fork` (Linux still defaults to `fork` on both 3.9 and 3.12). Spawn requires every subprocess target to be picklable, which in practice means defined at module top level, and the targets in these tests are not. Do not attempt to fix these locally.
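
To see why `spawn` is stricter than `fork`, the sketch below checks whether a callable survives pickling, which is what `spawn` needs in order to hand a target to the child process. The helper `is_picklable` is a hypothetical illustration, not something the test suite uses.

```python
import multiprocessing
import pickle


def is_picklable(target):
    """Return True if `target` can be pickled, as the spawn start method requires."""
    try:
        pickle.dumps(target)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    # Reports "spawn" on macOS (Python 3.8+) and "fork" on Linux for 3.9 and 3.12.
    print("default start method:", multiprocessing.get_start_method())
    print("builtin len picklable:", is_picklable(len))
    print("lambda picklable:", is_picklable(lambda: None))
```

Functions are pickled by reference (module plus qualified name), so lambdas and functions defined inside a test method cannot be looked up again in the child process.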

## Project Structure
- `luigi/` - Main package source code
- `test/` - Test files
- `test/contrib/` - Tests for contrib modules (AWS, databases, etc.)
- `test/testconfig/` - Test configuration files

## Key Files
- `luigi/worker.py` - Task execution worker
- `luigi/scheduler.py` - Central scheduler
- `luigi/task.py` - Base Task class
- `luigi/parameter.py` - Parameter types
- `luigi/contrib/` - Integration modules (S3, ECS, Batch, etc.)

## Python 3.9 / 3.12 Dual Compatibility

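This branch targets **both Python 3.9 and 3.12**. All changes use APIs available in both versions; most date back to Python 3.3 or earlier, while `importlib.resources.files()` requires 3.9.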

### Py39 Test Environment

```bash
# Create a Py39 virtualenv (requires pyenv 3.9.18 installed)
PYENV_VERSION=3.9.18 python -m venv .venv39
source .venv39/bin/activate
pip install -e ".[toml]"
pip install psutil six sqlalchemy mock boto3 hypothesis pygments # test deps from tox.ini

# Run the test suite
python -m pytest test/ --ignore=test/contrib/mysqldb_test.py --ignore=test/visualiser \
--continue-on-collection-errors -x -q 2>&1 | tee /tmp/luigi-test-py39.log
```

### Py312 Compatibility Notes
- `random.seed()` no longer accepts tuples — use `hash()` to convert
- `random.randrange()` no longer accepts floats — use `int(1e10)` instead of `1e10`
- `pickle.dump()` requires binary mode (`"wb"`)
- `pickle.dump()` for scheduler state uses `protocol=3` for cross-version portability
- `collections.Mapping/MutableSet/Iterable` → `collections.abc.*` (top-level aliases removed in Python 3.10, so gone in Py312)
- `inspect.getargspec()` → `inspect.getfullargspec()` (`getargspec` removed in Python 3.11, so gone in Py312)
- `pkg_resources.resource_filename()` → `importlib.resources.files()` (pkg_resources deprecated)
- `nose` module uses removed `imp` module — use pytest marks instead
- `logging.config.fileConfig()` raises `FileNotFoundError` for missing files (was `KeyError`)
- External `six` package (e.g. `from six.moves.urllib...`) → native `urllib.*` (Python 3.0+)
- `six.PY3` checks can be removed entirely — always `True` on any supported Python 3.x

### Known Pre-existing Test Failures (not caused by Py312 changes)
- `test/contrib/mysqldb_test.py` — requires MySQL connector not installed in dev env
- `test/visualiser/` — requires Selenium not installed in dev env
- ~39 other failures confirmed pre-existing on both Py39 and Py312 baselines

## Running Luigi

### Quick Test with Local Scheduler (No Server)
For quick testing without starting a server, use `--local-scheduler` which runs an in-memory scheduler:
```bash
# Run the hello world example with in-memory scheduler (no web UI)
# Note: PYTHONPATH=. is needed to find the examples module from project root
PYTHONPATH=. luigi --module examples.hello_world examples.HelloWorldTask --local-scheduler
```

### Central Scheduler with Web UI (luigid)
The `luigid` daemon provides a central scheduler with a web interface at http://localhost:8082.

#### Run in Foreground
```bash
# Create log directory first
mkdir -p /tmp/luigi-logs

# Start the scheduler with web UI (http://localhost:8082)
luigid --logdir /tmp/luigi-logs

# Or with state persistence (survives restarts)
luigid --port 8082 --logdir /tmp/luigi-logs --state-path /tmp/luigi-state.pickle

# In another terminal, run a task against the central scheduler
PYTHONPATH=. luigi --module examples.hello_world examples.HelloWorldTask
```

#### Run in Background
```bash
mkdir -p /tmp/luigi-logs

# Start scheduler in background
luigid --background --logdir /tmp/luigi-logs --pidfile /tmp/luigi.pid

# Run a task
PYTHONPATH=. luigi --module examples.hello_world examples.HelloWorldTask

# Kill the scheduler
kill $(cat /tmp/luigi.pid)
# Or if pidfile not used
pkill -f luigid
```

## Building and Publishing

### Build the package
```bash
source .venv/bin/activate
python setup.py sdist bdist_wheel
twine check dist/*
```

### Publish to Artifactory
Credentials are in `.pypirc`. Upload using the `pypi-local` index:
```bash
twine upload --config-file .pypirc -r pypi-local dist/*
```

## Common Test Issues
- boto3 tests require AWS region configuration or proper mocking
- SQLAlchemy tests need eager loading for relationships to avoid DetachedInstanceError
- Process-related tests may need small delays for `/proc` filesystem to be ready
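
For the "small delays" mentioned in the last item, polling with a timeout tends to be more reliable than a fixed `sleep`. This is a minimal sketch; the helper `wait_until` is hypothetical and does not exist in the Luigi test suite.

```python
import time


def wait_until(predicate, timeout=2.0, interval=0.05):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A process-related test could then call `wait_until(lambda: os.path.exists(f"/proc/{pid}"))` before reading from `/proc`, assuming `os` is imported and `pid` is the child's process ID.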