Modern LLMs are less creative than they could be. Post-training alignment (RLHF) makes them helpful and safe — but it also causes mode collapse, where the model favors a narrow set of predictable responses over diverse, creative ones.
Verbalized Sampling is a training-free prompting technique that bypasses this limitation. No fine-tuning required — just change how you ask.
The Results

In the researchers' evaluations, verbalized sampling increased output diversity by 1.6–2.1x over direct prompting while maintaining or improving response quality, with no fine-tuning required.
The Technique

Prepend one instruction to your prompt:

Generate 5 responses with their corresponding probabilities. [Your actual prompt here]

That's it. By asking for a distribution instead of a single answer, you prompt the model to draw on the diverse knowledge it learned during pre-training, before alignment narrowed its outputs.
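Since the technique is just a string prefix, it is trivial to automate. A minimal sketch (the helper name `verbalize` is my own, not from the paper):

```python
def verbalize(prompt: str, n: int = 5) -> str:
    """Prepend the Verbalized Sampling instruction to any prompt."""
    return (
        f"Generate {n} responses with their corresponding probabilities. "
        + prompt
    )

# The wrapped prompt can be sent to any chat model as-is.
print(verbalize("Write a one-sentence horror story."))
```

Raising `n` asks the model to verbalize more of its distribution at once, at the cost of shorter individual responses.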
Why This Works
During RLHF, human annotators rate LLM responses. But humans naturally prefer answers that are familiar, easy to read, and predictable — even when creative alternatives are equally good. This "typicality bias" gets baked into the reward model, which then aggressively sharpens the LLM's probability distribution toward safe, boring outputs.
Key insight: The LLM still has two personalities after alignment — the original pre-trained model with rich, diverse knowledge, and the safety-focused aligned model. Verbalized Sampling acts as a "mental switch" to access the original.
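The "sharpening" effect can be illustrated with a toy model (my own simplification, not the paper's formalism): raising each probability to a power β > 1 and renormalizing mimics a reward model concentrating mass on the most typical response, which lowers the distribution's entropy.

```python
import math

def sharpen(probs, beta):
    """Raise each probability to the power beta and renormalize.

    beta > 1 mimics alignment concentrating probability mass on the
    most 'typical' response; beta = 1 leaves the distribution alone.
    """
    powered = [p ** beta for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def entropy(probs):
    """Shannon entropy in bits; higher means more diverse."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy pre-training distribution over four candidate responses.
pretrained = [0.40, 0.30, 0.20, 0.10]
aligned = sharpen(pretrained, beta=4)

print(entropy(pretrained))  # relatively diverse
print(entropy(aligned))     # collapsed toward the top response
```

In this toy model, the top response's share jumps from 40% to over 70% after sharpening: the other options are still in there, just rarely surfaced, which is what Verbalized Sampling exploits.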
Variants

The same pattern has two easy knobs: ask for more responses (e.g., 10 or 20) for broader coverage, or request low-probability responses (e.g., "responses with probability below 0.10") to push the model toward the tails of its distribution.
When to Use This
- Creative writing — stories, jokes, metaphors, analogies
- Brainstorming — idea generation, product names, taglines
- Problem solving — when you want multiple approaches, not just the obvious one
- Content variety — social posts, email variants, headlines
- Any task where "predictable" = bad
Example Applications
"Generate 5 responses with their corresponding probabilities. Write a one-sentence horror story."
"Generate 5 responses with their corresponding probabilities. Give me a startup idea for the education space."
"Generate 5 responses with their corresponding probabilities. Write a tagline for a productivity app."
This technique comes from Stanford researchers studying mode collapse in aligned LLMs. The paper demonstrates that verbalized sampling significantly enhances diversity (1.6–2.1x) while maintaining or improving quality.
The Template
Generate 5 responses with their corresponding probabilities. [Your actual prompt here]
Just prepend that one instruction to any prompt. The model will output multiple options with probability estimates, giving you access to its full creative range instead of only the most "aligned" answer.
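If you want the options in structured form, a small parser helps. Models format their answers differently, so the numbered-list-with-probability shape below is an assumption; adjust the pattern to match what your model actually returns:

```python
import re

# Assumed reply shape (varies by model; tweak the regex as needed):
#   1. The door locked itself again. (probability: 0.30)
LINE = re.compile(r"^\s*\d+\.\s*(.+?)\s*\(probability:\s*([0-9.]+)\)\s*$")

def parse_responses(text: str) -> list[tuple[str, float]]:
    """Extract (response, probability) pairs from a model reply."""
    pairs = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            pairs.append((m.group(1), float(m.group(2))))
    return pairs

reply = """\
1. The door locked itself again. (probability: 0.30)
2. My reflection waved first. (probability: 0.15)
"""
for response, p in parse_responses(reply):
    print(p, response)
```

From there you can pick the lowest-probability option for maximum novelty, or sample among the pairs weighted by the verbalized probabilities.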