Objective
Temperature is a parameter in Large Language Models (LLMs) that regulates the balance between deterministic accuracy and creative variability during text generation. The setting, typically exposed as a value between 0.0 and 2.0 depending on the provider, reshapes the probability distribution over the model's next-token choices. A lower temperature (0.0–0.3) produces consistent, precise, and predictable outputs suited to technical or legal documentation. Conversely, a higher temperature (0.8+) encourages more varied and imaginative responses, making it ideal for brainstorming or creative writing.
While models like ChatGPT and Claude expose temperature for manual tuning via an API parameter, Gemini often manages it dynamically to preserve reasoning performance.
Technically, temperature is applied just before the softmax function of the neural network: the logits are divided by the temperature value before being converted into probabilities, which controls how heavily the model favors the most likely next token. A temperature of 0 effectively results in "greedy decoding," where the model always chooses the highest-probability token. For advanced models like Gemini 3, developers recommend keeping temperature at its default of 1.0, as manual overrides can occasionally degrade the complex "thinking" paths and logic loops the model relies on for multi-step reasoning.
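The logit-scaling mechanism described above can be sketched in a few lines. This is an illustrative toy implementation, not the code of any particular model: the function names and the three-token vocabulary are invented for the example, and real decoders combine temperature with other strategies such as top-k or top-p sampling.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature before softmax.

    T < 1 sharpens the distribution (mass concentrates on the top token);
    T > 1 flattens it toward uniform.
    """
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

def sample_next_token(logits, temperature=1.0, rng=None):
    # A temperature of 0 means greedy decoding: always pick the argmax,
    # since dividing by zero is undefined.
    if temperature == 0.0:
        return int(np.argmax(logits))
    rng = rng or np.random.default_rng()
    probs = softmax_with_temperature(logits, temperature)
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.5]  # toy logits for a 3-token vocabulary
print(softmax_with_temperature(logits, 0.5))   # sharper distribution
print(softmax_with_temperature(logits, 1.5))   # flatter distribution
print(sample_next_token(logits, 0.0))          # greedy decoding picks token 0
```

Running the two `softmax_with_temperature` calls on the same logits makes the effect concrete: at T = 0.5 the top token absorbs most of the probability mass, while at T = 1.5 the mass spreads across the vocabulary.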
