Prompt engineering — patterns that work

Why prompting still matters

Frontier models in 2025-2026 are far more capable than their 2022-23 predecessors, but prompt structure still has a large effect on output quality. A plausible reason is that training rewards certain shapes of input: step-by-step reasoning, structured examples, explicit role assignments. Prompts that match those shapes tend to outperform raw queries.

The foundational patterns

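Chain-of-Thought (Wei et al., 2022): show worked reasoning in the few-shot exemplars so the model reasons step by step before answering; the zero-shot variant (Kojima et al., 2022) simply appends 'Let's think step by step'. Both raise scores on reasoning benchmarks, mainly for large models. ReAct (Yao et al., 2022): interleave reasoning and action steps for tool-use agents. Tree of Thoughts (Yao et al., 2023): generalize CoT to branching exploration for deliberate problem-solving. InstructGPT (Ouyang et al., 2022): supervised fine-tuning on instruction-response demonstrations followed by RLHF is a large part of why current models follow instructions reliably.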
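The branching exploration behind Tree of Thoughts can be sketched as a beam search over partial solutions. This is a minimal illustration, not the paper's implementation: `propose`, `score`, and `is_goal` are hypothetical callables that would normally wrap model calls, and here they are replaced by a toy arithmetic task so the sketch runs on its own.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # a partial solution: the sequence of numbers visited so far


def tree_of_thoughts(
    root: State,
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    is_goal: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 4,
) -> Optional[State]:
    """Breadth-first search that keeps only the best `beam_width` states per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every state on the frontier into its candidate next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_goal(candidate):
                return candidate
        # Keep the most promising candidates and discard the rest.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return None


# Toy stand-ins for model calls (assumptions for illustration only):
# reach TARGET from 1 using the moves "+3" and "*2".
TARGET = 10


def propose(state: State) -> List[State]:
    last = state[-1]
    return [state + (last + 3,), state + (last * 2,)]


def score(state: State) -> float:
    return -abs(TARGET - state[-1])  # closer to the target is better


def is_goal(state: State) -> bool:
    return state[-1] == TARGET


print(tree_of_thoughts((1,), propose, score, is_goal))  # prints (1, 4, 7, 10)
```

With a real model, `propose` would ask for several candidate next steps and `score` would ask the model (or a heuristic) to rate each one; the beam width trades cost against how much of the tree is explored.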

What still doesn't work reliably

Self-evaluation (asking the model 'are you sure?') is poorly calibrated: the confidence the model states tracks its actual correctness only loosely. Few-shot prompting tends to beat zero-shot for narrow extraction tasks but helps much less with open-ended generation. 'Adversarial' prompts that try to bypass safety training increasingly fail on aligned models. Prompt engineering ≠ jailbreaking; the patterns that survive are the ones grounded in published research.
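To make "poorly calibrated" concrete, one common way to quantify it is expected calibration error (ECE): bucket answers by the confidence the model stated, then compare each bucket's average confidence with its actual accuracy. The sketch below assumes you have already collected (stated confidence, was it correct) pairs; the sample numbers are invented purely for illustration and are not measurements of any model.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than overflowing.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Invented example: the model says "90% sure" four times but is right only once.
stated = [0.9, 0.9, 0.9, 0.9]
was_correct = [True, False, False, False]
print(round(expected_calibration_error(stated, was_correct), 2))  # prints 0.65
```

An ECE near zero means stated confidence can be trusted as a probability; a large value like the one above is the signature of an overconfident self-check.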

Defined terms (4)

Chain-of-Thought (CoT)

Prompting technique that elicits step-by-step reasoning before the final answer. Wei et al. (Google Brain, 2022) reported large gains on arithmetic, commonsense, and symbolic reasoning benchmarks from adding worked reasoning to few-shot exemplars.
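As a rough sketch of what a Chain-of-Thought prompt looks like in practice, the helpers below build the few-shot form (exemplars that show their reasoning) and the zero-shot form (a reasoning trigger appended to the question). The function names and the "The answer is" convention are assumptions chosen for this example, not a standard API.

```python
from typing import List, Optional, Tuple

ANSWER_MARKER = "The answer is "


def few_shot_cot(exemplars: List[Tuple[str, str, str]], question: str) -> str:
    """Build a prompt whose exemplars spell out reasoning before the answer.

    Each exemplar is (question, reasoning, answer).
    """
    blocks = [
        f"Q: {q}\nA: {reasoning} {ANSWER_MARKER}{answer}."
        for q, reasoning, answer in exemplars
    ]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


def zero_shot_cot(question: str) -> str:
    """Build a prompt that asks for reasoning without any exemplars."""
    return f"Q: {question}\nA: Let's think step by step."


def extract_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'The answer is', or None if it is absent."""
    index = completion.rfind(ANSWER_MARKER)
    if index == -1:
        return None
    return completion[index + len(ANSWER_MARKER):].strip().rstrip(".")


prompt = few_shot_cot(
    [("Ann has 5 pens and buys 2 packs of 3. How many pens now?",
      "She buys 2 * 3 = 6 pens, and 5 + 6 = 11.", "11")],
    "Ben has 4 apples and eats 1. How many are left?",
)
print(prompt)
```

Ending every exemplar with the same answer phrase is a design choice that makes the final answer easy to parse out of the model's longer reasoning.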

ReAct

Pattern that interleaves reasoning and acting (Yao et al., Princeton and Google, 2022). Foundational to agent loops: the model emits a Thought and an Action, the environment returns an Observation, and the cycle repeats until the model finishes.
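The Thought, Action, Observation cycle can be sketched as a small loop. Everything here is an assumption made for illustration: the `Action: tool[input]` text format, the `finish` action name, and the scripted stand-in for the model, which lets the loop run without any API.

```python
import re
from typing import Callable, Dict

# Hypothetical action syntax for this sketch: "Action: tool_name[argument]"
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def react_loop(
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model steps and tool calls until the model emits finish[...]."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)  # expected: "Thought: ...\nAction: name[arg]"
        transcript += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            break  # the model produced no parseable action
        name, arg = match.group(1), match.group(2)
        if name == "finish":
            return arg
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        # Feed the tool result back so the next step can reason over it.
        transcript += f"Observation: {observation}\n"
    return "no answer"


def multiply(arg: str) -> str:
    a, b = (int(part) for part in arg.split(","))
    return str(a * b)


def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM: call the tool once, then finish with its result."""
    if "Observation:" not in transcript:
        return "Thought: I need the product.\nAction: multiply[6,7]"
    last_observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have the result.\nAction: finish[{last_observation}]"


print(react_loop(scripted_model, {"multiply": multiply}, "What is 6 times 7?"))  # prints 42
```

The `max_steps` cap is a deliberate safeguard: without it, a model that never emits a finish action would loop forever.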

In-context learning

The ability of LLMs to pick up new patterns from examples in the prompt, without weight updates. It emerged at GPT-3 scale (Brown et al., 2020) and remains the primary mechanism behind few-shot prompting.
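For a narrow extraction task, in-context learning amounts to placing a handful of input/output pairs ahead of the new input. The sketch below shows one possible layout; the `Input:`/`Output:` labels and the sample sentences are assumptions for illustration, and any consistent format would do.

```python
from typing import List, Tuple


def few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Lay out an instruction, worked examples, and the new input to complete."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    # Leave the final Output open so the model continues the pattern.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


prompt = few_shot_prompt(
    "Extract the city mentioned in each sentence.",
    [
        ("The conference moved to Lisbon this year.", "Lisbon"),
        ("She flew home from Nairobi on Friday.", "Nairobi"),
    ],
    "Our next office opens in Osaka.",
)
print(prompt)
```

No weights change here: the examples only condition the model's next-token predictions, which is why swapping the examples changes the behavior immediately.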

Instruction tuning

Fine-tuning a pretrained LM on instruction-response pairs, often followed by RLHF, so that the model follows natural-language instructions. The InstructGPT paper (Ouyang et al., 2022) is the canonical reference.
