The Core W&B API
Every W&B integration starts with `wandb.init()` and ends with `wandb.finish()`. Between them, you log metrics, parameters, and artifacts:
- `wandb.init(project="my-project", config=hyperparams, tags=["v2", "llm"])` — Initialises a new run. `project` groups related runs; `config` stores the hyperparameters for this run (dict or argparse Namespace); `tags` allow filtering runs in the dashboard.
- `wandb.log({"train_loss": loss, "val_accuracy": acc}, step=epoch)` — Logs a dict of metrics at the current step. `step` controls the x-axis in the W&B dashboard. Call once per training step or epoch.
- `wandb.watch(model, log="all", log_freq=100)` — Automatically logs model parameter gradients and weights as histograms every `log_freq` steps. `log="all"` logs both gradients and weights; use `log="gradients"` or `log="parameters"` for one or the other. Useful for diagnosing vanishing/exploding gradients.
- `wandb.config` — Accessible within a run to read the hyperparameter config. Supports dot notation: `wandb.config.learning_rate`. During a Sweep, W&B overwrites these values with the sweep's suggested configuration.
Framework integrations: WandbCallback for Keras and HuggingFace Trainer; WandbLogger for PyTorch Lightning. These autolog all standard metrics, model checkpoints, and hyperparameters without explicit log calls.
Sweeps: Hyperparameter Optimisation
W&B Sweeps provide a managed hyperparameter search that coordinates multiple training runs across multiple machines. The sweep controller (run by W&B) tracks all completed runs, fits a probabilistic model, and serves next-configuration recommendations to agents.
Sweep configuration: Define as a Python dict with method (bayes/random/grid), metric (the target metric to optimise, including goal: minimize or maximize), and parameters (the search space):
- Continuous: `{'distribution': 'log_uniform_values', 'min': 1e-5, 'max': 1e-2}` for learning rate (log-uniform is appropriate for scale-varying parameters)
- Discrete: `{'values': [32, 64, 128, 256]}` for batch size
- Integer: `{'distribution': 'int_uniform', 'min': 2, 'max': 16}` for LoRA rank
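Combining the three parameter types above into one sweep configuration might look like this sketch. The metric name `val_loss` and the project name are assumptions for illustration:

```python
sweep_config = {
    "method": "bayes",                                   # or "random" / "grid"
    "metric": {"name": "val_loss", "goal": "minimize"},  # target metric to optimise
    "parameters": {
        # Continuous, log-uniform: appropriate for scale-varying parameters
        "learning_rate": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-2},
        # Discrete set of candidate values
        "batch_size": {"values": [32, 64, 128, 256]},
        # Integer range
        "lora_rank": {"distribution": "int_uniform", "min": 2, "max": 16},
    },
}

# Registering the sweep returns an ID that agents use to join it:
# sweep_id = wandb.sweep(sweep_config, project="my-project")
```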
Early termination: Configure early_terminate with Hyperband to stop underperforming runs early, freeing compute for more promising configurations. Essential for expensive GPU training runs — can reduce total compute by 2–5× compared to running all configurations to completion.
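As a sketch, Hyperband early termination is added as an `early_terminate` key in the sweep configuration; the specific `min_iter` and `eta` values here are illustrative choices, not recommendations:

```python
sweep_config_with_stopping = {
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {"learning_rate": {"values": [1e-4, 1e-3]}},
    "early_terminate": {
        "type": "hyperband",
        "min_iter": 3,  # earliest iteration at which a run may be stopped
        "eta": 3,       # bracket spacing: checkpoints at min_iter, min_iter*eta, ...
    },
}
```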
Parallel agents: Launch multiple agent processes (each calling wandb.agent(sweep_id, train_fn)) on different machines or GPU instances. Each agent requests a configuration from the controller, runs training, and reports results. The controller's Bayesian model improves with each completed run, directing subsequent agents to more promising regions of the search space.
Artifacts and W&B Tables
W&B Artifacts version and track any file used or produced by a training run. The key use cases in ML:
- Dataset versioning: Log your training dataset as an artifact at the start of each experiment. W&B deduplicates content across versions — only changed files are uploaded. The run records which dataset version it used, creating full lineage: dataset v3 → training run abc → model checkpoint.
- Model checkpoints: Log model weights as artifacts with type='model'. The Model Registry in W&B uses artifacts as the storage backend. Checkpoint artifacts are linked to training runs and can be promoted to the registry for deployment tracking.
W&B Tables are interactive spreadsheets for visualising predictions, errors, and distributions. Log a Table with: `table = wandb.Table(columns=["input", "prediction", "ground_truth", "confidence"])`; `table.add_data(text, pred, label, conf)`; `wandb.log({"predictions": table})`. In the W&B dashboard, filter, sort, and group table rows interactively — essential for error analysis: filtering by low confidence, examining mislabelled examples, or checking for systematic failure modes on specific input types.
Frequently Asked Questions
What is W&B and what problems does it solve?
W&B is an MLOps platform for experiment tracking, dataset versioning, hyperparameter optimisation, and model evaluation. It solves the problem that, without systematic tracking, results are impossible to reproduce and experiments impossible to compare. Provides a hosted dashboard with auto-logged metrics, hyperparameters, GPU stats, and model outputs. Preferred at AI-native companies and research teams for richer visualisation and collaboration.
What is a W&B Sweep and how does Bayesian hyperparameter search work?
Sweeps define a hyperparameter search space. Strategies: grid (exhaustive), random (sample from distributions), or Bayesian (builds a probabilistic model of hyperparameter → metric, suggests configurations with highest Expected Improvement). Bayesian significantly outperforms random for expensive experiments. Configure as YAML, register with wandb.sweep(), run agents with wandb.agent(sweep_id, train_fn).
What are W&B Artifacts?
Versioned storage for datasets, model checkpoints, preprocessing scripts, evaluation results. Content-addressed hashing with cross-version deduplication. Log: wandb.Artifact('dataset', type='dataset'), artifact.add_dir('./data'), run.log_artifact(artifact). Use: run.use_artifact('dataset:latest'), artifact.download('./data'). Full lineage tracking: dataset → model → predictions.
How does W&B compare to MLflow for UK ML engineers?
W&B: richer visualisation, better collaboration (shared dashboards, Reports), superior Sweeps, SaaS-only (data leaves your infrastructure). MLflow: open-source, self-hostable, strong model registry with stage transitions, deeper Databricks/AWS integration. W&B is preferred at AI-native companies; MLflow at enterprises with data residency requirements. Knowing both is a significant advantage.
What are W&B Reports and when should you use them?
Reports are W&B's notebook-like documents for communicating experiment findings. Embed live charts from runs and sweeps directly in a Report, with auto-updating visualisations. Used for: ML project post-mortems, sharing A/B test results with stakeholders, documenting model evaluation for model cards, and team learning documents. Reports replace manually-exported screenshots in slides and make findings reproducible (clicking a chart navigates to the underlying run).