PyTorch for Deep Learning
The 2026 Skills Guide
PyTorch is the dominant deep learning framework at UK AI companies. This guide covers everything from tensors and autograd to production training patterns, mixed precision, distributed training, and model export — the skills interviewers actually test.
Tensors and Autograd
The torch.Tensor is the fundamental data structure in PyTorch — an n-dimensional array that can live on CPU or GPU and optionally participates in automatic differentiation. Three concepts you must understand deeply:
- Device management — .to(device) moves a tensor to a given device ('cpu', 'cuda', 'cuda:1', 'mps' for Apple Silicon). Writing device-agnostic code — device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') — is standard practice. Forgetting to move a tensor to the right device is one of the most common runtime errors.
- Dtypes — float32 is the default. float16 and bfloat16 (brain float) are used for mixed-precision training. bfloat16 has the same exponent range as float32 (avoiding overflow/underflow) but far fewer mantissa bits, so lower precision — it is the preferred mixed-precision dtype on recent NVIDIA and Google TPU hardware. int8 is used for quantisation.
- Autograd — Setting requires_grad=True on a tensor tells PyTorch to track all operations involving it in a computation graph. Calling .backward() on a scalar loss traverses this graph via reverse-mode automatic differentiation, accumulating ∂L/∂θ in each leaf tensor's .grad — exactly the gradients that gradient descent needs. Wrap inference in torch.no_grad() to skip graph construction and save memory. All three concepts appear in the sketch after this list.
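A minimal sketch of all three concepts, assuming only a stock PyTorch install; the tensor shapes and values are illustrative:

```python
import torch

# Device-agnostic setup: prefer the GPU when one is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# A leaf tensor that autograd will track.
w = torch.randn(3, device=device, requires_grad=True)
x = torch.ones(3, device=device)      # inputs must live on the same device

loss = (w * x).sum()                  # forward pass builds the graph
loss.backward()                       # reverse-mode autodiff
print(w.grad)                         # dL/dw == x == tensor([1., 1., 1.])

with torch.no_grad():                 # inference: no graph, less memory
    y = (w * x).sum()
```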
Building Models with nn.Module
torch.nn.Module is the base class for all neural network components in PyTorch. Every model — from a single linear layer to a billion-parameter transformer — is a subclass of nn.Module. Understanding how it works internally is essential for debugging and implementing custom architectures.
- __init__ and forward() — Define layers (as nn.Parameter or sub-modules) in __init__; define the forward computation in forward(). Parameters registered as attributes or via nn.ModuleList/nn.ModuleDict are automatically tracked by .parameters() and included in state_dict().
- .parameters() and .named_parameters() — Return iterators over all learnable parameters. Used to construct the optimiser: optim.Adam(model.parameters(), lr=1e-4). .named_parameters() gives (name, param) pairs, useful for weight decay exclusion (biases and layer norms are typically excluded).
- .train() and .eval() — Switch the model between training and evaluation modes, affecting Dropout (disabled in eval) and BatchNorm (uses running statistics in eval rather than batch statistics). Forgetting to call model.eval() during inference is a common bug that causes inconsistent results.
- Key layers — nn.Linear, nn.Conv2d (with kernel_size, stride, padding parameters), nn.MultiheadAttention (the building block of transformers), nn.Embedding (for token embeddings), nn.LayerNorm (now preferred over BatchNorm in transformer architectures), nn.Dropout. A minimal module using these pieces is sketched after this list.
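A minimal sketch of the nn.Module pattern; the class name, layer sizes, and dropout rate are hypothetical:

```python
import torch
from torch import nn

class TinyClassifier(nn.Module):
    """A hypothetical two-layer MLP illustrating the nn.Module pattern."""
    def __init__(self, in_dim: int, hidden: int, n_classes: int):
        super().__init__()                 # required before registering modules
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Dropout(p=0.1),             # disabled by model.eval()
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)                 # call model(x), not model.forward(x)

model = TinyClassifier(in_dim=20, hidden=64, n_classes=3)
print(sum(p.numel() for p in model.parameters()))  # params tracked automatically
model.eval()                               # switch Dropout off for inference
```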
The PyTorch Training Loop
The canonical PyTorch training loop follows a specific pattern. Understanding why each step happens in this order — and what goes wrong if you change it — is a common interview topic:
- optimizer.zero_grad() — Clear gradients from the previous step. PyTorch accumulates gradients by default (useful for gradient accumulation); call this at the start of each step unless you are deliberately accumulating.
- Forward pass — outputs = model(inputs). This builds the computation graph.
- Compute loss — loss = criterion(outputs, targets). The loss must be a scalar (or you must pass a gradient argument to .backward() for non-scalar losses).
- loss.backward() — Reverse traversal of the computation graph, computing and accumulating gradients.
- Gradient clipping (optional) — torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0) before the optimiser step. Essential for transformer training; prevents gradient explosions that destabilise training.
- optimizer.step() — Update parameters using the computed gradients according to the optimisation algorithm (Adam, AdamW, SGD).
- LR scheduler step — scheduler.step() after optimizer.step(). Common schedulers: CosineAnnealingLR, OneCycleLR, and linear warmup followed by cosine decay (standard for transformers, typically composed via SequentialLR or LambdaLR). The full loop is sketched after this list.
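Putting the steps together, a minimal sketch with a hypothetical linear model and random data, so only the ordering is meaningful:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)                         # stand-in for a real model
criterion = nn.MSELoss()
optimizer = optim.AdamW(model.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for step in range(100):
    inputs, targets = torch.randn(32, 10), torch.randn(32, 1)

    optimizer.zero_grad()                        # 1. clear old gradients
    outputs = model(inputs)                      # 2. forward pass (builds graph)
    loss = criterion(outputs, targets)           # 3. scalar loss
    loss.backward()                              # 4. backprop
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # 5. clip
    optimizer.step()                             # 6. parameter update
    scheduler.step()                             # 7. advance the LR schedule
```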
Gradient accumulation: When GPU memory prevents using the desired batch size, accumulate gradients over N mini-batches before calling optimizer.step(). Effective batch size = mini-batch size × N. Divide the loss by N to keep gradient scale consistent.
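A sketch of the accumulation pattern under the same hypothetical setup, where N = 4 mini-batches of 8 give an effective batch of 32:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = optim.AdamW(model.parameters(), lr=1e-3)
batches = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(100)]
accum_steps = 4                                  # effective batch = 8 * 4 = 32

optimizer.zero_grad()
for i, (inputs, targets) in enumerate(batches):
    loss = criterion(model(inputs), targets) / accum_steps  # keep gradient scale
    loss.backward()                              # gradients accumulate in .grad
    if (i + 1) % accum_steps == 0:               # step once every N mini-batches
        optimizer.step()
        optimizer.zero_grad()
```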
DataLoader and Dataset
PyTorch's data loading system is designed for performance at scale. Understanding Dataset and DataLoader is essential for building efficient training pipelines.
- Map-style Dataset — Implement __len__() and __getitem__(idx). DataLoader calls these to fetch samples. Suitable when all data fits in a structure accessible by index.
- Iterable-style IterableDataset — Implement __iter__(). Used for streaming datasets that cannot be indexed — e.g., data read from a database or a stream. Requires careful handling of worker processes to avoid duplicate data.
- DataLoader key parameters — num_workers: number of subprocesses for data loading (typically 4–8; 0 runs loading in the main process). pin_memory=True: allocates batches in pinned (page-locked) host memory for faster CPU→GPU transfers. prefetch_factor: number of batches loaded in advance per worker. persistent_workers=True: keeps worker processes alive between epochs.
- Custom collate_fn — Controls how a list of samples is assembled into a batch. Essential for variable-length sequences (e.g., padding text to the longest sequence in the batch), multi-modal data, or nested data structures. See the sketch after this list.
- Samplers — WeightedRandomSampler for imbalanced datasets. DistributedSampler for DDP training (ensures each process gets a non-overlapping subset). The sampler is passed to DataLoader; using DDP without DistributedSampler causes each process to train on the full dataset.
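A sketch of a map-style dataset with a padding collate_fn; the dataset contents and the pad id of 0 are hypothetical:

```python
import torch
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pad_sequence

class TokenDataset(Dataset):
    """Hypothetical map-style dataset of variable-length token sequences."""
    def __init__(self, sequences):
        self.sequences = sequences

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return torch.tensor(self.sequences[idx], dtype=torch.long)

def pad_collate(batch):
    # Pad every sequence in the batch to the longest one (pad id 0 assumed).
    return pad_sequence(batch, batch_first=True, padding_value=0)

data = [[5, 2, 9], [7, 1], [3, 3, 3, 3]]
loader = DataLoader(
    TokenDataset(data),
    batch_size=2,
    shuffle=True,
    num_workers=0,            # >0 spawns loader subprocesses
    collate_fn=pad_collate,
)
for batch in loader:
    print(batch.shape)        # e.g. torch.Size([2, 4]) after padding
```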
Mixed Precision, Serialisation, and Deployment
Mixed precision training with torch.cuda.amp runs most operations in float16 or bfloat16 (speeding up compute and roughly halving activation memory) while keeping the master weights in float32. Use torch.autocast(device_type='cuda') as a context manager around the forward pass, and torch.cuda.amp.GradScaler to scale the loss before .backward(), which prevents float16 gradients from underflowing to zero (loss scaling is unnecessary with bfloat16).
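A minimal AMP training step, assuming a CUDA device and the same hypothetical linear model as above:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1).cuda()                  # this sketch requires a GPU
criterion = nn.MSELoss()
optimizer = optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for _ in range(100):
    inputs = torch.randn(32, 10, device='cuda')
    targets = torch.randn(32, 1, device='cuda')

    optimizer.zero_grad()
    with torch.autocast(device_type='cuda'):     # forward in float16/bfloat16
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()                # scaled loss avoids underflow
    scaler.step(optimizer)                       # unscales grads, then steps
    scaler.update()                              # adapt the scale factor
```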
Model serialisation options in order of preference:
- state_dict — Save with torch.save(model.state_dict(), path); load with model.load_state_dict(torch.load(path)). Decoupled from the class definition. Always preferred for checkpointing (sketched after this list).
- TorchScript — torch.jit.script(model) or torch.jit.trace(model, example_input) compiles the model to a serialisable, Python-independent representation. Used for deployment in C++ environments via LibTorch. torch.jit.trace works for most models; torch.jit.script handles data-dependent control flow (if/else, loops).
- ONNX — torch.onnx.export(model, example_input, 'model.onnx') exports to the Open Neural Network Exchange format for cross-framework deployment (ONNX Runtime, TensorRT, CoreML). Standard for deploying PyTorch models in non-PyTorch serving environments.
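A checkpointing sketch with state_dict; the file name is hypothetical, and saving the optimiser state alongside the model allows exact resumption:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)
optimizer = optim.AdamW(model.parameters(), lr=1e-3)

# Checkpoint: save model and optimiser state together for exact resumption.
torch.save(
    {'model': model.state_dict(), 'optimizer': optimizer.state_dict()},
    'checkpoint.pt',                             # hypothetical path
)

# Resume: rebuild the objects first, then load the state dicts into them.
checkpoint = torch.load('checkpoint.pt')
model.load_state_dict(checkpoint['model'])
optimizer.load_state_dict(checkpoint['optimizer'])
model.eval()                                     # if loading for inference
```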
Learning Path for PyTorch Skills
Foundations (0–2 months)
- Tensors: creation, indexing, device management, dtypes
- Autograd: requires_grad, .backward(), .grad, torch.no_grad()
- nn.Module: defining __init__ and forward(), .parameters(), .state_dict()
- Basic training loop with a simple MLP on tabular data
Core Skills (2–5 months)
- CNNs with nn.Conv2d for image classification
- RNNs / LSTMs for sequential data (nn.LSTM, nn.GRU)
- Custom Dataset and DataLoader with collate_fn
- AdamW optimiser with cosine LR schedule, gradient clipping
- Mixed precision with torch.autocast and GradScaler
Production Skills (5–10 months)
- PyTorch Lightning: LightningModule, Trainer, callbacks, LightningDataModule
- Transformer implementation from scratch with nn.MultiheadAttention
- DistributedDataParallel (DDP) for multi-GPU training
- Model export: state_dict, TorchScript, ONNX
- Profiling with torch.profiler and identifying bottlenecks
Expert Level (10+ months)
- Custom CUDA kernels via torch.utils.cpp_extension
- FlashAttention integration for memory-efficient attention
- torch.compile (Dynamo + Inductor) for kernel fusion and acceleration
- Quantisation: post-training quantisation and quantisation-aware training with torch.quantization
- Large-scale distributed training with FSDP (Fully Sharded Data Parallel)
Frequently Asked Questions
Is PyTorch better than TensorFlow for UK jobs?
For most UK AI companies — startups, scale-ups, research labs — PyTorch is the dominant and expected skill. TensorFlow retains a presence in larger enterprises. Know PyTorch deeply; have enough TensorFlow familiarity to read existing code. The PyTorch ecosystem (HuggingFace, Lightning, TorchServe) is the de facto standard for new projects.
What is autograd and why does it matter?
Autograd is PyTorch's automatic differentiation engine. Operations on tensors with requires_grad=True build a computation graph; calling .backward() traverses it in reverse, accumulating gradients. This explains why you call optimizer.zero_grad() before each backward pass, use torch.no_grad() during inference, and why in-place operations on leaf tensors raise errors.
When should you save state_dict vs the full model?
Always prefer state_dict (torch.save(model.state_dict(), ...)). Saving the full model with torch.save(model, ...) pickles the class definition alongside weights — loading requires the class to be importable from the exact same path. This breaks during refactoring. state_dict is decoupled from the class definition and is the standard approach.
What is the difference between DataParallel and DistributedDataParallel?
DataParallel replicates the model to each GPU on a single machine and gathers outputs on the primary GPU — which becomes a memory and compute bottleneck. DistributedDataParallel (DDP) spawns one process per GPU, synchronises gradients via all-reduce (NCCL), and avoids that single-GPU bottleneck. DDP is always preferred for multi-GPU training and is the only one of the two that supports multi-node training.
Should ML engineers learn PyTorch Lightning?
Yes. Lightning handles training loop boilerplate, distributed training, mixed precision, logging, and checkpointing through a structured LightningModule. Widely used at UK AI companies. You should understand both raw PyTorch (for debugging and custom implementations) and Lightning (for productive engineering work).