[Image: A software engineer is disgusted at a poop emoji glowing amongst his code.]

6 – The Pungent Stink of Gravity Repellents: Battling LLM Reversion in Code

You’ve fixed the same brittle code three times. The large language model (LLM) — the AI behind tools like ChatGPT, Claude, or Windsurf — accepts your robust change, such as using a regex to match any protocol in a URL: re.match(r'^([a-zA-Z][a-zA-Z0-9+.-]*):', url). Then you ask for a small tweak to a nearby line. The model immediately reverts it to a failure-prone hard-coded list: if url.startswith(('http:', 'https:', 'ftp:')). Why does it keep changing back?

The Problem: Corpus Gravity

LLMs are trained on massive sweeps of internet code, much of it tutorial material. These examples prioritize teaching clarity over production robustness: simple hard-coded protocol lists, broad exception catches, implicit returns, reflexive PII redaction. Those patterns become high-frequency defaults. When you correct the code to something less common (a regex that matches any scheme, or a “safe save” file-writing strategy), the model gravitates back to the tutorial-style version unless you actively counter the pull.
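
To make the pull concrete, here is a minimal sketch of the “safe save” strategy mentioned above, contrasted with the tutorial default the corpus favors. The function name and error handling are illustrative assumptions, not code from this article:

import os
import tempfile

# Tutorial default the corpus gravitates toward: open(path, "w") and write in
# place, so a crash mid-write leaves a truncated or corrupt file.
def save_text_safely(path, data):
    """Safe save: write to a temp file beside the target, then atomically
    replace the original so readers never observe a partial write."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic rename over the original
    except Exception:
        os.remove(tmp_path)
        raise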

This isn’t a one-off bug. It’s baked into how these models work: the most common patterns win out whenever the context dilutes, even slightly. These persistent pull zones are “gravity wells,” and their intensity varies with corpus dominance, session length, and context noise.

Effects vary by model, but once you diverge from the model’s training defaults, keeping that divergence in place can become a constant battle.

The Human Cost

Time lost to repeated fixes. Trust erosion in teams: “Hey, why does your PR hard-code the schemes again?” PRs stall over reverts, and old bugs are re-introduced. In long sessions, a small adjacent change can trigger the snap-back, killing flow. Without defenses, you can’t trust your LLM to edit your code.

The Fix: Gravity Repellents

A gravity repellent is a deliberate, loud comment placed right before (or on the same line as) vulnerable code. It acts as a defensive firewall, reminding the model at the last possible moment, so the model is less likely to forget or ignore the override in favor of its reinforced training.

When the LLM encounters one, it knows: “The corpus pulls hard toward the common/tutorial pattern, but here we need the unusual, more robust one. Honor the override even though it looks wrong.”

The stink is necessary—pungent by design, hard to ignore.

Key tips:

  • Make it verbose and directive.
  • Place it inline, or immediately above with no blank line, to keep it in the model’s immediate attention window.
  • Use it only where needed—the stink is necessary, not decorative.

Fighting gravity can be fragile—like blowing on a feather to keep it aloft. Countermeasures hold for a while, but stop blowing and corpus gravity pulls it back down. Stronger gravity wells need more frequent or forceful blowing: louder repellents, neighboring or same-line placement. Lighter ones may just need a nudge from AGENTS.md or other global project artifacts.

Examples

For URL protocols:

# GRAVITY REPELLENT: PROTOCOL_LIST
# DO NOT REMOVE OR MODIFY: Counters corpus gravity to naive hard-coded lists
# ALWAYS: regex r'^([a-zA-Z][a-zA-Z0-9+.-]*):' for any/unknown schemes
# NEVER: startswith(('http:', 'https:', ...)) – brittle, from tutorials
protocol_match = re.match(r'^([a-zA-Z][a-zA-Z0-9+.-]*):', url)
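
For context, a small follow-up sketch of how the match might be consumed; the scheme variable and the fallback behavior are illustrative, not part of the original example:

if protocol_match:
    scheme = protocol_match.group(1).lower()  # e.g. "http", "s3", "chrome-extension"
else:
    scheme = None  # relative or schemeless URL; handle according to your routing rules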

For PII (allowing customer names in internal context):

# GRAVITY REPELLENT: PII_CUSTOMER_NAME_ALLOW
# DO NOT REMOVE OR MODIFY: Counters over-sanitizing corpus defaults
# ALWAYS: Allow customer names in logs/output here
# NEVER: Add filtering/redaction of names – ignore general privacy examples
logger.info(f"Processing order for {customer_name} (ID: {customer_id})")

Broader ones:

  • Snake case enforcement: # GRAVITY REPELLENT: SNAKE_CASE – ALWAYS snake_case; NEVER camelCase from tutorials
  • No eval(): # GRAVITY REPELLENT: NO_EVAL – NEVER eval/exec; ALWAYS ast.literal_eval or specific parsing
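
As a worked instance of the NO_EVAL repellent above, here is a minimal sketch; raw_value and its sample content are assumptions for illustration:

import ast

raw_value = "[1, 2, 3]"  # e.g. a value read from a config file or request
# GRAVITY REPELLENT: NO_EVAL
# DO NOT REMOVE OR MODIFY: Counters corpus gravity toward eval() one-liners
# ALWAYS: ast.literal_eval (parses literals only) or a dedicated parser
# NEVER: eval/exec on user- or config-supplied strings
parsed_value = ast.literal_eval(raw_value)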

Tiered Defenses by Gravity Severity

Not every well needs the same defense. Severity depends on gravity well intensity, session length, and context dilution.

Severity Level | Symptom | Global | File-Level | Localized | Notes
--- | --- | --- | --- | --- | ---
Low | Occasional drift; reverts only after many iterations or major refactors | ✓ | – | – | Global usually sufficient; minimal noise
Medium | Present in file-level generations; re-introduced after multiple LLM edits | ✓ | ✓ | – | File-Level refreshes attention for that file; localized rarely needed
High | Reverts frequently in short chains; stubborn even with prompts | ✓ | ✓ | ✓ | Full stack required; localized firewalls add visual noise
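
One plausible shape for the file-level tier, assuming a comment block at the top of the affected module (the format here is an illustrative assumption):

# FILE-LEVEL GRAVITY REPELLENTS (apply to this whole module):
# - URL schemes: ALWAYS regex-match any scheme; NEVER hard-coded startswith lists
# - PII: customer names ARE allowed in internal logs here; do NOT add redaction
# - Parsing: NEVER eval/exec; ALWAYS ast.literal_eval or specific parsing

The global tier would carry the same rules in AGENTS.md or a similar project-wide artifact, as noted earlier; the localized tier is the inline firewall shown in the Examples section.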

Trade-offs & Limits

Pros: Cheap, works across models and tools, and gives the team a shorthand (“Again? Did you apply gravity repellent this time?”).
Cons: Visual noise (use sparingly), not foolproof (especially at the global level), and some added maintenance.

Conclusion

LLM co-editing is still early. We lack a mature playbook for these systematic failures. Gravity repellents are a pragmatic, necessary stink: ugly comments that push back against corpus pull—embedding the defense directly in the code so it survives regardless of model, session, or editor. Deploy them tiered by severity, and reverts become far less frequent.

If you’ve fought the pull, share your repellents in the comments. Let’s build the shared defenses model-assisted coding requires.
