Binary file added .DS_Store
Binary file not shown.
55 changes: 55 additions & 0 deletions PYDREAM_BUGS_DOCUMENTATION.md
@@ -0,0 +1,55 @@
# PyDream Diagnostics and Bug Fix Report

**Current Status:** All identified critical bugs have been **FIXED**. The codebase has been modernized and verified through unit tests and reproduction scripts. No further bugs are currently known.

---

## Fixed Bug 1: The `multitry=2` Crash (NumPy Dimensionality Squeeze)

### Symptom
When running `pydream` with `multitry=2` and `parallel=True`, the execution immediately crashed inside the parallel worker pool with: `TypeError: object of type 'numpy.float64' has no len()`.

### Root Cause
With `multitry=2`, only a single alternative proposal point is generated. The resulting `(1, N_parameters)` array was "squeezed" into a 1D array of shape `(N_parameters,)`. When that 1D array was passed to `pool.map()`, the pool iterated over individual parameter values instead of over parameter sets, so the likelihood function received single floats.

### Implementation of Fix
- **Location**: `pydream/Dream.py`, `mt_evaluate_logps` method.
- **Action**: Enforced 2D dimensionality using `np.atleast_2d(proposed_pts)` before any iteration or mapping.
- **Action**: Standardized the evaluation loop in the serial/nested block to handle any number of points correctly, preventing unpacking errors.
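
The shape problem and the guard can be illustrated with a small, self-contained sketch. The likelihood `toy_logp` and the helper `evaluate_points` are hypothetical stand-ins, not PyDREAM's actual code; only the use of `np.atleast_2d` is taken from the fix described above.

```python
import numpy as np


def toy_logp(params: np.ndarray) -> float:
    """Hypothetical likelihood that expects one full parameter vector."""
    return -0.5 * float(np.sum(params ** 2))


def evaluate_points(proposed_pts: np.ndarray) -> list[float]:
    """Return one log-probability per proposal point (one per row)."""
    # Promote a lone 1D point of shape (N,) to shape (1, N) so that
    # iteration always yields whole parameter vectors, never scalars.
    pts_2d = np.atleast_2d(proposed_pts)
    return [toy_logp(row) for row in pts_2d]


# The squeezed case: a single proposal stored as a 1D array of shape (3,).
single_point = np.array([1.0, 2.0, 3.0])

# Iterating the 1D array directly would hand out three separate scalars.
n_items_without_guard = len(list(single_point))

# With the guard, the same input produces exactly one evaluation.
logps = evaluate_points(single_point)
```

Arrays that are already 2D pass through `np.atleast_2d` unchanged, so the same code path serves any number of proposal points.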

---

## Fixed Bug 2: The Nested Multiprocessing Bottleneck (`multitry` + `parallel=True`)

### Symptom
When running with `parallel=True` and `multitry > 2`, CPU usage reached 100% but sampling slowed to a near standstill because of inter-process communication (IPC) overhead.

### Root Cause
Worker pools were spawned recursively. The main process spawned a pool for the chains, and each chain worker then spawned its own sub-pool for multi-try evaluations, because both levels read the same `parallel=True` flag. The result was heavy serialization overhead and contention among the many worker processes.

### Implementation of Fix
- **Location**: `pydream/Dream.py`, `mt_evaluate_logps` method.
- **Action**: Added a check for the current process name: `if parallel and mp.current_process().name == 'MainProcess':`.
- **Result**: Multi-try evaluations are now only parallelized if the chains themselves are running serially (or if there's only one chain). If chains are already in a parallel pool, the multi-try evaluations run sequentially within their worker, eliminating the IPC bottleneck.
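
A minimal sketch of this guard follows. The function name `use_pool_for_multitry` and its `process_name` parameter are hypothetical and exist only to make the decision testable in isolation; the comparison against `'MainProcess'` is the check quoted above. The worker name in the usage note is illustrative.

```python
import multiprocessing as mp
from typing import Optional


def use_pool_for_multitry(parallel: bool, process_name: Optional[str] = None) -> bool:
    """Decide whether multi-try evaluations may open their own pool.

    A sub-pool is allowed only when parallelism is requested AND the
    caller is the main process, i.e. not already inside a chain worker.
    """
    if process_name is None:
        # Look up the name of the process that is executing this call.
        process_name = mp.current_process().name
    return parallel and process_name == "MainProcess"
```

Called as `use_pool_for_multitry(True, "ForkPoolWorker-1")`, the function returns `False`, so a chain worker would evaluate its multi-try points sequentially instead of spawning a nested pool.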

---

## Modernization for Python 3.11+ and NumPy 2.x

### NumPy 2.x Compatibility
- **Explicit Dtypes**: Updated all `np.frombuffer` calls in `pydream/Dream.py` to include `dtype=np.float64`. `np.frombuffer` already defaults to `dtype=float`, so this does not change behavior; stating the dtype explicitly documents how the shared memory buffer is interpreted and removes any ambiguity.
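
As an illustration of the pattern, the sketch below wraps a shared buffer in a NumPy view with an explicit dtype. The variable names and array sizes are assumptions chosen for the example, and `mp.RawArray` is used here for simplicity; the source does not say which shared-array type `Dream.py` uses.

```python
import multiprocessing as mp

import numpy as np

# Hypothetical sizes, chosen only for this example.
n_chains, n_params = 2, 3

# Shared, lock-free buffer of C doubles ("d" is the typecode for float64).
shared = mp.RawArray("d", n_chains * n_params)

# View the buffer as a NumPy array without copying. The explicit dtype
# matches the "d" typecode, so every element is read as a float64.
view = np.frombuffer(shared, dtype=np.float64).reshape(n_chains, n_params)

# Writes through the view land in the shared buffer itself.
view[1, 2] = 4.5
```

Because the view shares memory with `shared`, the value written at row 1, column 2 is visible at flat index 5 of the underlying buffer.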

### Python 3.11+ Standards
- **Test Modernization**: Updated `pydream/tests/test_dream.py` to remove legacy Python 2 checks (`sys.version_info[0] < 3`).
- **Standard Library Usage**: Replaced `assertRaisesRegexp` (deprecated since Python 3.2 and removed in Python 3.12) with `assertRaisesRegex`.
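
For reference, a minimal test written against the current assertion name might look like the sketch below. The helper `check_positive` and the test class are hypothetical and are not taken from `pydream/tests/test_dream.py`.

```python
import unittest


def check_positive(value: float) -> float:
    """Hypothetical helper that rejects non-positive input."""
    if value <= 0:
        raise ValueError(f"value must be positive, got {value}")
    return value


class TestCheckPositive(unittest.TestCase):
    def test_rejects_negative(self) -> None:
        # assertRaisesRegex checks both the exception type and that
        # the message matches the given regular expression.
        with self.assertRaisesRegex(ValueError, "must be positive"):
            check_positive(-1.0)


# Run the single test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCheckPositive)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```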

---

## Verification and Validation

The fixes were validated as follows:
1. **Bug Reproduction Script**: A minimal reproduction script confirmed that `multitry=2` with `parallel=True` no longer crashes and that `multitry=5` with `parallel=True` runs without the nested-pool slowdown.
2. **Unit Tests**: Existing unit tests in `pydream/tests/` were updated and passed (verified using Python 3.11 and modern NumPy).
3. **Cross-Environment Check**: Verified compatibility in environments where external dependencies like PySB or BioNetGen might be absent, ensuring the core algorithm remains robust.

**Conclusion:** The reported issues are resolved. The `pydream` package is now stable and performant under multi-try parallel configurations.
37 changes: 37 additions & 0 deletions gemini.md
@@ -0,0 +1,37 @@
# Gemini Code Assist Agent Directives: PyDream Modernization & Bug Fixing

## Objective
Your task is to fix critical bugs and modernize the `pydream` codebase to be fully compatible with Python 3.11+ and NumPy 2.4+.

## Strict Constraints
1. **Retain Functionality**: The underlying mathematical and algorithmic logic of DiffeRential Evolution Adaptive Metropolis (DREAM) sampling must remain strictly unchanged.
2. **Measured Refactoring**: While modernizing the codebase, allow for stylistic refactoring to meet basic Pylance, PEP-8, and type-hinting standards. Ensure these changes improve readability and maintainability without altering the mathematical logic.
3. **Code Clarity & Efficiency**: Where changes are required, use standard, efficient, and readable Python/NumPy paradigms.
4. **NumPy 2.4+ Compatibility**: Ensure no deprecated or removed NumPy types or functions are used (e.g., `np.float`, `np.int`, and `np.object` must be converted to native `float`, `int`, and `object`; prefer `bool` over `np.bool`).
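
As a hedged illustration of constraint 4, the sketch below shows the kind of replacement intended. The variable names are hypothetical and the snippet is not taken from the `pydream` sources.

```python
import numpy as np

# Before: np.zeros(3, dtype=np.float) raises AttributeError on current
# NumPy because the np.float alias no longer exists.

# After: builtin Python types are accepted wherever a dtype is expected.
weights = np.zeros(3, dtype=float)   # builtin float maps to float64
counts = np.zeros(3, dtype=int)      # builtin int maps to the default integer
mask = np.zeros(3, dtype=bool)       # builtin bool maps to np.bool_

# An explicit NumPy dtype is equally valid when the width matters.
explicit = np.zeros(3, dtype=np.float64)
```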

## Identified Bugs
For a comprehensive list of previously identified bugs, their root causes, and resolution details, please refer to the `PYDREAM_BUGS_DOCUMENTATION.md` file.

**Current Status:** As noted in the documentation, all previously identified critical bugs (such as the `multitry=2` crash and the nested multiprocessing bottleneck) are currently marked as **FIXED**. Maintain these fixes and refer to the documentation if investigating related regressions.

## Modernization Checklist
1. **Python 3.11+ Standards**:
   - Verify that `multiprocessing` context management is handled correctly. `mp.get_context()` is currently used; ensure its implementation does not conflict with Python 3.11+ daemon process constraints.
   - Remove compatibility fallbacks for Python 2.x (e.g., checking `sys.version_info[0] < 3` for `assertRaisesRegexp`).
2. **NumPy 2.4+ Strict Compatibility**:
   - The aliases `np.float`, `np.int`, and `np.object` were removed in NumPy 1.24 and remain unavailable in 2.x (`np.bool` was reintroduced in 2.0 as an alias of `np.bool_`). Scan the codebase for these names and replace them with standard Python types or valid NumPy dtypes (e.g., `np.float64`, `bool`).
   - Verify that boolean masking and indexing arrays return 1D or ND arrays as expected.
   - Pay attention to `np.frombuffer` usages with multiprocessing shared arrays to ensure no strict casting violations exist.
   - Check `np.nan_to_num` and `np.linalg.norm` operations to ensure keyword arguments and behaviors align with NumPy 2.4+.
3. **Pylance & Type Hinting (New)**:
   - Introduce basic type hinting (e.g., `int`, `float`, `bool`, `list`, `Callable`, and basic `numpy.typing`) for function and method signatures.
   - Resolve basic Pylance warnings (e.g., unused imports, undefined variables, unreachable code).
   - Modernize string formatting (e.g., replace `'string %s' % var` with f-strings where it improves readability).
   - Clean up excessive empty lines and enforce standard indentation.
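
The type-hinting and string-formatting items in the checklist can be sketched as follows. The function `describe_logp` and its parameters are hypothetical, introduced only to show the style; they do not correspond to any existing `pydream` function.

```python
from typing import Callable

import numpy as np
import numpy.typing as npt


def describe_logp(
    logp_fn: Callable[[npt.NDArray[np.float64]], float],
    point: npt.NDArray[np.float64],
    chain_index: int,
) -> str:
    """Evaluate a likelihood at one point and format a log message."""
    logp = logp_fn(point)
    # f-string instead of the older 'chain %d: logp = %.3f' % (...) style.
    return f"chain {chain_index}: logp = {logp:.3f}"


# Usage with a hypothetical standard-normal log-likelihood.
message = describe_logp(
    lambda p: -0.5 * float(np.sum(p ** 2)),
    np.array([1.0, 1.0]),
    0,
)
```

Annotating the callable and array parameters lets Pylance flag calls that pass the wrong kind of argument, without changing any numerical behavior.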

## Workflow Instructions for the Agent
When asked to implement these fixes:
1. Focus on one specific module or bug at a time.
2. First verify the nature of the bugs within the codebase, and then proceed to resolve them.
3. Prioritize providing full diffs (Unified Diff Format) for modified files.
4. Verify the changes against the Constraints listed above before outputting the code.
Binary file added pydream/.DS_Store
Binary file not shown.