
[codex] add TorchRL Flame DQN example#453

Merged
k82cn merged 1 commit into
xflops:mainfrom
k82cn:codex/torchrl-flame-dqn-example
May 14, 2026

Conversation

Contributor

@k82cn k82cn commented May 13, 2026

Summary

  • Add a TorchRL DQN example under examples/rl/torchrl_dqn based on the upstream TorchRL tutorial loop.
  • Wire distributed collection through Flame Runner services and replay storage through patch_object.
  • Add configurable replay modes, including sharded replay with parallel sampling controls.
  • Document local and distributed usage, heavier discrete environments, and validation notes.
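The sharded replay mode mentioned above can be sketched in plain Python. Note this is an illustrative assumption about the design, not the example's actual implementation: the `ShardedReplay` name, the round-robin write policy, and the even batch split are all hypothetical.

```python
import random


class ShardedReplay:
    """Illustrative sharded replay buffer: transitions are spread
    round-robin across shards, and a sample request is split into
    per-shard sub-requests that could be issued in parallel."""

    def __init__(self, num_shards: int):
        self.shards = [[] for _ in range(num_shards)]
        self._next = 0  # round-robin cursor for writes

    def add(self, transition):
        self.shards[self._next].append(transition)
        self._next = (self._next + 1) % len(self.shards)

    def sample(self, batch_size: int):
        # Split the request as evenly as possible across shards,
        # then draw each sub-batch from its own shard.
        per_shard, extra = divmod(batch_size, len(self.shards))
        batch = []
        for i, shard in enumerate(self.shards):
            want = per_shard + (1 if i < extra else 0)
            batch.extend(random.sample(shard, min(want, len(shard))))
        return batch


buf = ShardedReplay(num_shards=2)
for t in range(10):
    buf.add({"step": t})
print(len(buf.sample(4)))  # 4 transitions drawn across both shards
```

In the real example the per-shard sub-requests would go through Flame's `patch_object`-backed replay storage rather than local lists.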

Validation

  • python3 -m py_compile examples/rl/torchrl_dqn/main.py examples/rl/torchrl_dqn/model.py examples/rl/torchrl_dqn/collector.py examples/rl/torchrl_dqn/replay_buffer.py
  • python3 examples/rl/torchrl_dqn/main.py --help
  • sdk/python/.venv/bin/ruff check examples/rl/torchrl_dqn
  • sdk/python/.venv/bin/ruff format --check examples/rl/torchrl_dqn
  • uv run main.py --local --env acrobot --iterations 1 --collections 1 --frames-per-collection 2 --batch-size 1 --warmup-frames 1 --replay simple --metrics-json /tmp/torchrl-main-simple-smoke.json
  • uv run main.py --local --env acrobot --iterations 1 --collections 1 --frames-per-collection 2 --batch-size 1 --warmup-frames 1 --replay sharded --replay-shards 2 --sample-work 8 --sample-parallelism 2 --metrics-json /tmp/torchrl-main-sharded-smoke.json
  • git diff --check

Distributed Flame runtime smoke was not run because it requires an active Flame cluster.

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a new reinforcement learning example, examples/rl/torchrl_dqn, which adapts the TorchRL CartPole DQN tutorial to the Flame Runner framework. The implementation includes a distributed rollout collector, a sharded replay buffer using Flame's ObjectRef and patch_object capabilities, and a local training mode for validation. Feedback on the code changes highlights opportunities to improve performance by moving imports out of the hot collection loop and suggests removing an unused helper function in the main entry point.

Comment thread examples/rl/torchrl_dqn/collector.py Outdated
Comment on lines +51 to +55
import random

import torch
from model import flatten_observation
from tensordict import TensorDict
Contributor


Severity: medium

These imports are performed inside the _select_action method, which is called for every environment step (e.g., 100 times per collection). Although Python caches imports in sys.modules, repeated lookups in a performance-sensitive reinforcement learning loop add unnecessary overhead. Moving these imports to the top of the file is more efficient and adheres to PEP 8 guidelines.

References
  1. Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants. (link)

Comment thread examples/rl/torchrl_dqn/main.py Outdated
Comment on lines +49 to +55
def _sample_request_sizes(batch_size: int, sample_parallelism: int) -> list[int]:
    if sample_parallelism < 1:
        raise ValueError("sample_parallelism must be at least 1")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    return split_batch(batch_size, sample_parallelism)
Contributor


Severity: medium

The _sample_request_sizes function is defined but never used in the script. _sample_shard_plan (line 58) implements its own logic using split_batch directly. Removing this dead code improves the maintainability of the example.
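For context, a `split_batch` helper of the kind both functions rely on typically divides `batch_size` as evenly as possible across `sample_parallelism` requests. This sketch assumes that contract; the real helper in the example may differ in details such as how it handles more parts than items.

```python
def split_batch(batch_size: int, parts: int) -> list[int]:
    """Split batch_size into `parts` near-equal chunks.

    The first (batch_size % parts) chunks get one extra item,
    and empty chunks are dropped when parts > batch_size.
    """
    base, extra = divmod(batch_size, parts)
    sizes = [base + (1 if i < extra else 0) for i in range(parts)]
    return [s for s in sizes if s > 0]


print(split_batch(10, 3))  # [4, 3, 3]
print(split_batch(2, 4))   # [1, 1]
```

Since `_sample_shard_plan` calls `split_batch` directly with the same validation implied, the wrapper adds nothing and can be deleted.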


codecov Bot commented May 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@k82cn k82cn force-pushed the codex/torchrl-flame-dqn-example branch 2 times, most recently from 46529e8 to 7f8e3f1 Compare May 13, 2026 23:55
@k82cn k82cn force-pushed the codex/torchrl-flame-dqn-example branch from 7f8e3f1 to fb06d43 Compare May 14, 2026 00:02
@k82cn k82cn merged commit 62d3783 into xflops:main May 14, 2026
6 checks passed
@k82cn k82cn deleted the codex/torchrl-flame-dqn-example branch May 14, 2026 01:32

Labels

None yet

Projects

None yet


1 participant