
Sweeps updates #2480

Open
ngrayluna wants to merge 5 commits into main from sweeps_updates

Conversation

@ngrayluna
Contributor

Description

Realized that Sweeps flattens YAML objects to make accessing values easier; however, this made debugging my script a pain. Updated the docs to make this clearer.

@ngrayluna ngrayluna marked this pull request as ready for review April 17, 2026 18:35
@ngrayluna ngrayluna requested a review from a team as a code owner April 17, 2026 18:35
@github-actions
Contributor

📚 Mintlify Preview Links

🔗 View Full Preview

📝 Changed (2 total)

📄 Pages (2)

| File | Preview |
| --- | --- |
| models/sweeps/add-w-and-b-to-your-code.mdx | Add W And B To Your Code |
| models/sweeps/define-sweep-configuration.mdx | Define Sweep Configuration |

🤖 Generated automatically when Mintlify deployment succeeds
📍 Deployment: fd4a0d7 at 2026-04-17 20:38:14 UTC

@github-actions
Contributor

🔗 Link Checker Results

All links are valid!

No broken links were detected.

Checked against: https://wb-21fd5541-sweeps-updates.mintlify.app

configuration file contains the hyperparameters you want the sweep to explore. In
the following example, the batch size (`batch_size`), epochs (`epochs`), and
the learning rate (`lr`) hyperparameters are varied during each sweep.
Create a YAML file that defines the hyperparameters to optimize and the metric to optimize. W&B uses this file to determine which hyperparameters to vary during the sweep and which metric to optimize.
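A minimal sketch of such a file, using the hyperparameters named above (the specific ranges, the `val_acc` metric name, and `train.py` are hypothetical illustrations, not values from this PR):

```yaml
program: train.py
method: bayes
metric:
  name: val_acc
  goal: maximize
parameters:
  batch_size:
    values: [16, 32, 64]
  epochs:
    values: [5, 10]
  lr:
    min: 0.0001
    max: 0.1
```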
Contributor


nit: "the hyperparameters to optimize and the metric to optimize" feels a bit redundant. Maybe: "the hyperparameters and metric to optimize"?

5. Log the metric you want to optimize with [`wandb.Run.log()`](/models/ref/python/experiments/run.md/#method-runlog). You must log the metric defined in your configuration. Within the configuration dictionary (`sweep_configuration` in this example) you define the sweep to maximize the `val_acc` value.
1. Import the W&B Python SDK (`wandb`).
2. Initialize a [run](/models/runs) with `wandb.init()`.
3. Read the YAML configuration file with a Python package such as yaml, and pass the configuration to `wandb.init()`.
Contributor


Steps 3 and 4 are not accurate. The training script for sweeps training just needs to call wandb.init() with no arguments, as the sweeps agents will automatically pass in the appropriate arguments.

2. Initialize a [run](/models/runs) with `wandb.init()`.
3. Read the YAML configuration file with a Python package such as yaml, and pass the configuration to `wandb.init()`.
4. Pass the configuration object to the config parameter of `wandb.init()`.
5. Retrieve the hyperparameter values from `wandb.Run.config` so that your script uses the values defined in the YAML file instead of hard-coded values. W&B flattens configuration values, so you can access nested values with dot notation or bracket notation as though they were top-level keys.
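The flattening behavior described in step 5 can be imitated in plain Python, without `wandb` installed. This is an illustration of the access pattern only; `wandb.Run.config`'s exact semantics may differ:

```python
# Sketch: imitate how W&B exposes nested config values as top-level keys.
# This is NOT wandb's implementation, just an illustration of the behavior
# described in the docs.

class FlatConfig:
    """Dict-like object that promotes nested keys to the top level."""

    def __init__(self, nested):
        self._data = {}
        self._flatten(nested)

    def _flatten(self, obj):
        for key, value in obj.items():
            if isinstance(value, dict):
                self._flatten(value)  # promote nested keys to the top level
            else:
                self._data[key] = value

    def __getitem__(self, key):       # bracket notation: config["goal"]
        return self._data[key]

    def __getattr__(self, key):       # dot notation: config.goal
        try:
            return self._data[key]
        except KeyError:
            raise AttributeError(key)

config = FlatConfig({"metric": {"name": "val_acc", "goal": "maximize"},
                     "epochs": 5})
print(config["goal"])  # reachable without config["metric"]["goal"]
print(config.goal)     # same value via dot notation
```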
Contributor


Probably more idiomatic to just use wandb.config.

so that your script uses the values defined in the YAML file

I would say:

so that your script uses the suggested arguments for each run

3. Read the YAML configuration file with a Python package such as yaml, and pass the configuration to `wandb.init()`.
4. Pass the configuration object to the config parameter of `wandb.init()`.
5. Retrieve the hyperparameter values from `wandb.Run.config` so that your script uses the values defined in the YAML file instead of hard-coded values. W&B flattens configuration values, so you can access nested values with dot notation or bracket notation as though they were top-level keys.
6. Log the metric that you want to optimize with `wandb.Run.log()`.
Contributor


Similarly, wandb.log() is more idiomatic now


def main():
    # Set up your default hyperparameters
    # Read in the configuration file
Contributor


This is a bit out of date. The most idiomatic way to do this would be to just go:

with wandb.init() as run:

with no arguments. No need to read any config files. It'll pick up the arguments automatically.

You can note that if you DO set config in wandb.init for a single run case, it will actually be overridden by the sweep arguments. (That's why this code as written still does work; it's just confusing because the config it loads gets ignored.)
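The pattern the reviewer recommends might look like the following sketch. `train_one_epoch` and the `val_acc` values are hypothetical placeholders, and `wandb` is imported inside `main()` so the pure-Python helper can run without it installed:

```python
# Sketch of the recommended pattern: call wandb.init() with NO arguments and
# let the sweep agent inject the hyperparameters. Names below are placeholders.

def train_one_epoch(lr, batch_size, epoch):
    # Placeholder training step: returns a fake validation accuracy so the
    # sketch stays self-contained. Replace with real training code.
    return min(0.5 + 0.05 * epoch, 0.99)

def main():
    import wandb  # deferred so the helper above runs without wandb installed

    # No config= argument and no YAML loading: the sweep agent supplies the
    # values chosen from the sweep config for this run.
    with wandb.init() as run:
        for epoch in range(run.config["epochs"]):
            acc = train_one_epoch(run.config["lr"],
                                  run.config["batch_size"], epoch)
            run.log({"val_acc": acc})  # log the metric the sweep optimizes

if __name__ == "__main__":
    main()
```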


You can then access `nested_value1` with `yaml_sample["key2"]["nested_key1"]` or `yaml_sample.key2.nested_key1`.

When you pass a configuration to `wandb.init(config=)`, W&B flattens the values. This means that you access nested values as though they were top-level keys.
Contributor


I think the previous documentation was confusing and caused you to conflate RUN configs with SWEEPS configs.

with open("config.yaml") as file:
    config = yaml.load(file, Loader=yaml.FullLoader)
with wandb.init(config=config) as run:

This is intended to open a RUN config file, not a SWEEPS config file. You don't actually need to specify any run config file when using Sweeps.

That said, that run config file, as mentioned above, is promptly IGNORED by the sweep args generated from the sweep config that was provided to wandb sweep.

max: 0.1
```

After you read in the file and pass the configuration to `wandb.init(config=)`, access the `goal` value with `run.config["goal"]` instead of `run.config["metric"]["goal"]` or `run.config.metric.goal`.
Contributor


Yeah this is mixing up the files. The issue here is that you've provided a sweep config to use as a run config... and then the run config is being ignored in favor of the args generated by the sweep and silently passed in... (The only thing that confuses me is that I don't think there should even exist a goal field at all??)

In any case please rewrite the code to just use wandb.init() with no config loading at all, and see what happens then.

values: ["adam", "sgd"]
```

Within the top level `parameters` key (line 7), the following keys are nested: `learning_rate` (line 8), `batch_size` (line 11), `epochs` (line 14), and `optimizer` (line 17). For each of the nested keys you specify, you can provide one or more values, a distribution, a probability, and more.
Contributor


I wouldn't really frame the first layer of keys under parameters as nested. Those are actually the top-level parameters. It's only if you nest another level deep that I'd call it nested.
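The distinction might be sketched as follows, assuming W&B's nested-parameter syntax (an inner `parameters` key); all values here are hypothetical:

```python
# Sketch: top-level parameters vs. a genuinely nested one.
# learning_rate and batch_size sit directly under "parameters", so they are
# top-level parameters. Only optimizer, which has its own inner "parameters"
# block, is nested one level deeper.
sweep_configuration = {
    "method": "random",
    "metric": {"name": "val_acc", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "optimizer": {
            "parameters": {  # this inner block makes these keys nested
                "name": {"values": ["adam", "sgd"]},
                "momentum": {"values": [0.0, 0.9]},
            }
        },
    },
}
```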
