The Real Fear
Open your IDE.
File name: OrderProcessor.java
Scroll to the bottom. Takes 5 seconds.
Cursor on line 1. Bottom right shows: 4,127 lines
Search for if. 847 results.
This function handles all order logic. Runs 100,000 times a day. One wrong line, and the entire order system explodes.
No comments. No documentation. Original author left three years ago.
Your task: “Just update the discount logic.”
Sounds simple.
But you can’t even find where the discount logic is.
How many times have you lived this?
Why Legacy Code Is So Hard
Surface Problems vs Real Problems
| What you think the problem is | What the problem actually is |
|---|---|
| Code is too long | Don’t know which parts matter, which are historical garbage |
| No comments | Don’t know “why” it was written this way |
| Original author left | No one can answer “will this break if I change it” |
| Can’t understand it | No confidence, afraid to touch it |
Three Layers of Fear
Layer 1: Cognitive Overload
4000 lines of code. You need to simultaneously track:
- What this section does
- Where variables come from
- What it affects
- Edge cases to watch for
Human working memory holds about 7 items. 4000 lines far exceeds capacity.
Layer 2: Uncertainty
You change line 2847.
What does it affect? Don’t know. Are there other places depending on this? Don’t know. Are there tests to verify? No.
Uncertainty amplifies fear.
Layer 3: Responsibility Pressure
If it breaks, it’s on you.
Even though you didn’t write this code. Even though this code was always a minefield. When things go wrong, everyone asks: “Who changed it?”
These three layers of fear stack up to one result:
Don’t touch it.
If you can avoid changing it, avoid it. If you can work around it, work around it. If you can delay it, delay it.
Then the code gets worse, and the next person is even more afraid to touch it.
Vicious cycle.
What AI Can Help With: A Decision Framework
Not everything should go to AI. Not everything should be done by you.
✅ AI Takes Over Completely
Explaining what code does
- Translation machine: turn 4000 lines into 20 paragraph summaries
- You don't read line by line; you read summaries to grasp the whole picture

Finding dependencies
- Which DB tables does this function read
- Which external APIs does it call
- What gets affected if you change it

Finding potential issues
- Null pointer risks
- Performance bottlenecks
- Security vulnerabilities

Generating test cases
- Happy path
- Edge cases
- Error handling

Suggesting how to split it
- Which functions to extract
- Which one is safest to start with
These have “standard answers.” AI is faster and more thorough than humans.
🤝 Human-AI Collaboration
Confirming AI’s understanding is correct
AI will guess wrong on business logic.
It doesn’t know “why this multiplies by 1.05.” It doesn’t know “this workaround handles a special requirement from a major client.”
You need to verify AI’s explanations and add context it doesn’t have.
Deciding refactoring priorities
AI says you can split into 7 functions. Which first? Which later? Which can’t be touched?
This requires your judgment on business priorities and risk.
Reviewing test cases
AI-generated tests may miss critical scenarios. You review, supplement, and confirm coverage.
❌ Human Decides
Whether to keep or delete logic
Code that looks useless might handle an edge case. Delete or not—that’s a business decision, not a technical one.
When to refactor
Now or next sprint? Is this risk worth taking? Does the team have bandwidth?
Communicating with PM/management
Why spend time refactoring? What benefit to users? What happens if we don’t?
AI can’t help with these.
Step 1: Build a Cognitive Map
Before changing anything, understand what this code does.
Let AI Be Your Translator
Don’t dump 4000 lines at once. AI will crash, and you’ll get garbage.
Feed it in sections.
First round—give the overview:
This is an e-commerce system's processOrder function.
4000 lines total. I'll give it to you in sections.
Here's the structure:
- Lines 1-400: Order validation
- Lines 401-800: Inventory check
- Lines 801-1200: Shipping calculation
- Lines 1201-1800: Discount processing
- Lines 1801-2500: Payment processing
- Lines 2501-3200: Create shipment
- Lines 3201-4000: Send notifications
Understand this structure first.
Second round—dive into one section:
Now I'll give you lines 1-400 (order validation).
Tell me:
1. Main logic in one sentence
2. List 3-5 key steps
3. Potential issues
4. How this connects to the inventory check that follows
AI will respond:
This section validates orders before processing.
Key steps:
1. Check user login status
2. Verify cart is not empty
3. Check if products are still available
4. Validate shipping address format
5. Check for incomplete pending orders
Potential issues:
- Line 45: user.getCart() has no null check
- Line 120: Address validation uses hardcoded regex
- Line 380: Magic number 1.05, purpose unknown
Connection:
After validation passes, validatedOrder object passes to inventory check.
Repeat this process. You’ll get summaries for 20 sections.
Now you have a “table of contents.”
Let AI Draw a Dependency Map
Analyze this function and list:
1. Which database tables it reads
2. Which external APIs it calls
3. Which global variables or configs it uses
4. Which states it modifies (DB, cache, files)
Why does this matter?
Knowing what it touches tells you what changes will affect.
This is your “blast radius assessment.”
Let AI Find the Landmines
Check this code for:
1. Potential bugs (null pointers, edge cases, race conditions)
2. Performance issues (N+1 queries, redundant calculations, massive loops)
3. Security risks (SQL injection, missing permission checks)
4. Maintainability issues (magic numbers, duplicate logic, deep nesting)
This isn’t about fixing now—it’s about knowing where the landmines are.
Know the landmine locations before you step on them.
Step 2: Write Tests Before Changing Anything
Core Principle: Refactoring Without Tests Is Suicide
You changed line 2847 out of 4000.
How do you know it didn’t break anything?
Run it and see? Manually test a few scenarios? Pray?
This isn’t engineering. This is gambling.
Three Testing Strategies
Strategy A: Golden Master Testing (Most Conservative)
Concept: Record current behavior first, compare after refactoring.
Steps:
1. Prepare 50 sets of real input data (pull from production logs)
2. Run current code once, record all outputs
3. This is your "golden master"
4. After refactoring, run again, compare if outputs match exactly
5. Any difference means you broke something
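A minimal harness is only a few dozen lines. Here's a Java sketch; `Order` and `processOrder` are placeholders standing in for the real legacy types, and in practice you'd feed it the 50 logged production inputs:

```java
// A minimal golden-master harness (sketch; Order and processOrder are
// stand-ins for the real legacy types, toString-style serialization is the crudest option).
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class GoldenMaster {
    record Order(String id, int quantity) {}        // placeholder input type

    static String processOrder(Order o) {           // placeholder for the 4000-line original
        return o.id() + ":" + o.quantity() * 2;
    }

    public static void main(String[] args) throws Exception {
        List<Order> inputs = List.of(new Order("A1", 1), new Order("A2", 99));
        Path master = Path.of("golden_master.txt");

        StringBuilder current = new StringBuilder();
        for (Order in : inputs) {
            current.append(processOrder(in)).append('\n'); // record observable output
        }

        if (!Files.exists(master)) {
            Files.writeString(master, current.toString()); // first run: record the master
            System.out.println("Golden master recorded");
        } else if (!Files.readString(master).equals(current.toString())) {
            throw new AssertionError("Output differs from golden master"); // behavior changed
        } else {
            System.out.println("Matches golden master");
        }
    }
}
```

The first run records the master; every run after that diffs against it, so any refactor that changes observable output fails loudly.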
Ask AI to generate test data:
Based on this function's parameters,
generate 20 test inputs covering different scenarios:
- 5 normal cases (typical orders)
- 5 edge cases (empty cart, single item, many items)
- 5 special cases (discount codes, international shipping, pre-orders)
- 5 error cases (invalid address, out of stock, payment failure)
Strategy B: Characterization Testing (Describe Current State)
Concept: Regardless of whether the code is correct, turn its current behavior into tests.
This function's current behavior:
- Input X returns Y
- Input A throws exception B
- Input null returns empty array
Write these behaviors as tests.
Doesn't matter if they're bugs—lock in current state first.
Why?
The goal of refactoring is "behavior unchanged." Lock in the current state first, then decide separately whether to change behavior.
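In JUnit terms, a characterization test might look like this sketch; the inputs and expected values are illustrative, and `legacyProcess` stands in for a call into the real function:

```java
// Characterization tests: pin down today's behavior, bugs included.
// Expected values are illustrative; legacyProcess stands in for the real call.
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;
import org.junit.jupiter.api.Test;

class ProcessOrderCharacterizationTest {

    @Test
    void typicalOrderReturnsCurrentTotal() {
        // Whatever the legacy code returns today IS the expected value
        assertEquals("total=107.10", legacyProcess("1x widget, code SAVE5"));
    }

    @Test
    void nullInputCurrentlyThrows() {
        // Possibly a bug, but we lock it in so a refactor can't change it silently
        assertThrows(NullPointerException.class, () -> legacyProcess(null));
    }

    // Stub standing in for a call into the real processOrder
    private String legacyProcess(String input) {
        if (input == null) throw new NullPointerException();
        return "total=107.10";
    }
}
```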
Strategy C: Critical Path Testing (Pragmatic)
If there’s no time for comprehensive tests:
What are the 5 most important use cases for this function?
Write tests only for these 5.
80/20 rule: 20% of tests cover 80% of risk.
Something is better than nothing.
Pitfalls of AI-Generated Tests
AI-generated tests have common problems:
Problem 1: Only tests happy path
AI generates “normal flow” tests but misses edge cases.
Problem 2: Too much mocking
AI mocks all external dependencies, so tests only test “mock behavior,” not real integration.
Problem 3: Weak assertions
// AI-generated test
assertNotNull(result); // Only confirms not null, doesn't verify content
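A stronger version pins down the content you actually care about (the getters here are hypothetical):

```java
// Stronger: verify the fields that matter, not just non-nullness
assertEquals(OrderStatus.CONFIRMED, result.getStatus());    // hypothetical getter
assertEquals(new BigDecimal("107.10"), result.getTotal());  // exact expected total
assertEquals(3, result.getItems().size());                  // expected item count
```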
What you need to do: Review AI’s tests
Ask AI:
Did the tests you just generated miss any important scenarios?
Especially:
- Empty data cases
- Large data volume cases
- Concurrent execution cases
- Insufficient permission cases
- External service timeout cases
Let AI check itself.
Step 3: Splitting Strategy
Why Did It Become 4000 Lines
A 4000-line function didn’t happen overnight.
Cause 1: Feature stacking
Every new requirement gets stuffed in.
“Just add feature A here.” “And add check B there too.”
Three years later, 4000 lines.
Cause 2: Copy-paste
Similar logic copied 10 times.
“This is similar to that, just copy and modify.”
Every copy swells the line count further.
Cause 3: Fear-driven
Afraid to create new functions, worried about breaking existing calls.
“Just stuff it here, it works anyway.”
Cause 4: No code review
No one said “this is too long.” No one said “should be split.” No one cared about code quality.
Understanding causes tells you how to split.
Let AI Suggest Splitting Approach
This 4000-line function needs splitting.
Consider:
1. Single responsibility principle (one function does one thing)
2. Testability (each split should be independently testable)
3. Minimize change scope (minimize impact on callers)
Tell me:
1. Which independent functions can it split into
2. What's each function's responsibility
3. How do they call each other
4. Which is most independent, safest to extract first
AI might suggest:
Suggest splitting into 7 functions:
1. validateOrder() - Order validation (lines 1-400)
Responsibility: Ensure order data complete, user authorized
Dependencies: Lowest, only needs Order object
Suggested order: Extract first
2. checkInventory() - Inventory check (lines 401-800)
Responsibility: Confirm all products have sufficient stock
Dependencies: Needs validateOrder result
3. calculateShipping() - Shipping calculation (lines 801-1200)
Responsibility: Calculate shipping based on address and products
Dependencies: Needs validated address data
...
Three Principles for Splitting
Principle 1: Extract the most independent part first
Which section has the least dependencies on others? Extract that first.
Ask AI:
Of these 7 suggested functions,
which has the lowest dependencies on other parts?
I want to start with the safest one.
Usually:
- Pure calculation logic (no DB reads, no API calls)
- Validation logic (only checks, doesn't modify)
- Notification logic (last step, doesn't affect main flow)
Principle 2: Gradual replacement (Strangler Fig Pattern)
Don’t change everything at once.
Steps:
1. Create a new validateOrder() function
2. Copy lines 1-400 logic to it
3. In the original location, change to call validateOrder()
4. Run tests, confirm behavior matches
5. Commit
6. Next sprint, extract the next section
Each time, only change one small piece. Each time, you can rollback. Each time, tests protect you.
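In code, step one of the strangler pattern might look like this sketch (`Order`, `ValidatedOrder`, and `Receipt` are placeholder types):

```java
// Step one of the strangler pattern: processOrder keeps its exact signature,
// but lines 1-400 now live in their own independently testable function.
public class OrderProcessor {
    record Order(String id) {}
    record ValidatedOrder(Order order) {}
    record Receipt(String orderId) {}

    public Receipt processOrder(Order order) {
        ValidatedOrder validated = validateOrder(order); // was: 400 inline lines
        // ...lines 401-4000 stay inline, untouched in this commit...
        return new Receipt(validated.order().id());
    }

    // Extracted verbatim from lines 1-400; behavior identical by construction
    private ValidatedOrder validateOrder(Order order) {
        // (the copied validation logic goes here, unchanged)
        return new ValidatedOrder(order);
    }
}
```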
Principle 3: Keep function signature unchanged
First phase of refactoring: only change internal structure, not inputs/outputs.
Why?
- Callers don't need changes
- Tests don't need changes
- Minimum risk
Once internal structure stabilizes, then consider adjusting the interface.
Step 4: Handling “Only the Person Who Left Knows” Logic
Black Holes Even AI Can’t Understand
Some code, AI can’t understand either:
// Don't know why it's like this, but removing it breaks things
price = price * 1.05 * 0.97 * 1.02;
Ask AI, it’ll say:
“This appears to adjust the price, but I don’t know why these specific numbers. Could be tax rates, discounts, or other business rules.”
This is a business logic black hole.
AI can analyze code structure, but it doesn't know:
- Which client's special requirement this was
- Which historical bug this works around
- Who added this workaround at 3 AM three years ago
Archaeology Strategies
Strategy 1: Git Blame
git blame -L 2847,2847 OrderProcessor.java
Find who added this line and when.
git show abc123 # See the full commit
git log --grep="price adjustment" # Search related commit messages
If you’re lucky, the commit message explains why. Even luckier, there’s an issue or ticket number.
Strategy 2: Search Related Documentation
Ask AI:
This code mentions 'specialDiscountRate'.
Search the project for related:
1. Config files (application.yml, config.json)
2. Documentation (README, wiki, confluence)
3. Test cases (might have comments explaining purpose)
4. Comments in other code
Strategy 3: Find Senior Employees
Not to explain the code—to explain the history.
“That big client project in 2021, was there any special pricing logic?”
“This 1.05 number, do you remember what it’s for?”
They might not remember the code, but might remember the business context.
Strategy 4: Mark and Isolate
If you really can’t find the answer:
/**
* WARNING: Unknown pricing adjustment logic
*
* Possibly related to 2021 client project (unconfirmed)
* Original author John has left, cannot ask
*
* Do not modify until confirmed
* If modification needed, confirm business rules with PM first
*
* Discovered: 2025-12-08
* Discovered by: Tim
* Related ticket: Not found
*/
price = price * 1.05 * 0.97 * 1.02;
At least the next person knows:
- This is a "known unknown"
- Not your oversight
- Be careful before touching it
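One step beyond the comment: funnel the mystery through a single named method, so when someone finally confirms the rule there is exactly one place to change (the method name is mine):

```java
// Funnel the unexplained adjustment through one named choke point
private double applyUnknownPricingAdjustment(double price) {
    // WARNING: factors unexplained; see the comment block above before modifying
    return price * 1.05 * 0.97 * 1.02;
}
```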
Mindset: This Is Not Your Problem
Legacy Code Is Historical Debt
You didn’t write these 4000 lines.
They were written this way for reasons:
- Tight deadlines
- Constantly changing requirements
- No one reviewed
- No time to refactor
Your responsibility is to “make it a little better,” not “make it perfect.”
Three Refactoring Traps
Trap 1: Wanting to refactor everything at once
“I’ll spend two weeks completely rewriting this!”
Result:
- Two weeks later, not done
- New requirements come in, and you have to work on the old code
- Your refactoring branch is 500 commits behind main
- It never gets merged
Correct approach: Each time you go in, change one small piece, merge immediately.
Trap 2: Pursuing “clean”
You spend three days making code beautiful.
PM asks: “What does the user experience differently?”
You: “Uh… nothing, but the code is cleaner.”
PM: “…”
Correct approach: Refactoring follows requirements.
When you need to change that functionality, clean up that code. This way refactoring is “part of the requirement,” not “extra work.”
Trap 3: Refactoring becomes rewriting
“This code is too bad, rewriting would be faster.”
90% of the time, this is wrong.
Rewriting means:
- Losing all the edge case handling (you don't know what it covers)
- Losing all the bug fixes (you don't know what was fixed)
- Hitting every pitfall again (ones your predecessors already hit for you)
The temptation to rewrite is strong, but the cost is higher.
Boy Scout Rule
“Leave the campground cleaner than you found it.”
Every time you change code:
- Rename one variable to be clearer
- Extract one small function to clarify logic
- Add one comment so the next person guesses less
Not big refactoring. Just doing it incidentally.
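An incidental improvement can be this small (the names are illustrative):

```java
// Before: opaque code you pass while fixing something else
double r = p * d / 100;

// After: thirty seconds of renaming; the next reader guesses less
double discountAmount = price * discountPercent / 100; // discount in currency units
```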
One year later, 4000 lines becomes 3000. Not from one refactoring session, but daily small improvements.
Practical Operation: Complete Claude Code Conversation Flow
Opening Move: Give AI Enough Context
Wrong approach: Dump 4000 lines directly
You: Help me explain this code
[paste 4000 lines]
AI will crash or give you garbage.
Correct approach: Feed in layers
You: I'm inheriting a legacy project.
There's a processOrder function, 4000 lines.
I'll give it to you in sections.
Background:
- This is the e-commerce system's order processing core
- Runs 100,000 times daily
- Original author left
- No tests, no documentation
My goals:
1. First understand what it does
2. Find risk points
3. Plan how to split and refactor
Ready?
Five Most Useful Prompt Templates
Template 1: Code Translation
Explain what this code does in one sentence.
Then list 3-5 main steps using bullet points.
Don't explain syntax, focus on business logic.
Template 2: Dependency Analysis
Analyze this function:
1. Input: What parameters needed, where from
2. Output: What it returns, what state it changes
3. Dependencies: Which DB tables read/written, which external services called
4. Side effects: Sends email? Writes logs? Changes global variables?
Template 3: Risk Assessment
If I want to modify the logic on line 2847:
1. What other places might it affect?
2. What edge cases should I watch for?
3. What tests do you suggest I write first to protect this change?
Template 4: Test Case Generation
Design test cases for this function:
1. Happy path: 3 normal flows
2. Edge cases: 3 boundary conditions
3. Error cases: 3 error handling scenarios
For each case: input, expected output, test purpose
Template 5: Refactoring Suggestions
This function is too long. Suggest how to split it:
1. What independent functions can it become?
2. What's each function's responsibility?
3. How do they call each other?
4. Which is most independent, safest to extract first?
Complete Conversation Example
You: I'm inheriting a legacy project with a 4000-line processOrder function.
I'll give you the first 200 lines. Help me understand.
[paste code]
AI: This section is order validation logic, doing five things:
1. Check user login status
2. Verify cart not empty
3. Check products still available
4. Validate shipping address format
5. Check for incomplete pending orders
Potential issues:
- Line 45 null check might miss edge case
- Line 120 address validation uses hardcoded regex
You: What specifically is the issue with line 45's null check?
AI: Line 45 is user.getCart().getItems().
If getCart() returns null, it throws NullPointerException.
Suggest adding null check or using Optional.
You: Good, I want to address this section first. Write test cases.
AI: [generates 5 test cases]
You: Do these tests cover the line 45 issue you mentioned?
AI: No, let me add a test case for when cart is null...
You: Now here's lines 201-400...
Key point: Conversation is iterative, not one-shot.
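The null guard the AI proposes above might look like this (a sketch; `Cart` and `Item` are assumed types):

```java
// Line 45, defensively: an absent cart becomes an empty list instead of an NPE
// (requires java.util.Optional, java.util.List, java.util.Collections)
List<Item> items = Optional.ofNullable(user.getCart())
        .map(Cart::getItems)
        .orElse(Collections.emptyList());
```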
What to Do When AI Is Wrong
AI will guess wrong, especially:
- Business logic: It doesn’t know your company’s rules
- Misleading names: A function called `validate` that actually does `transform`
- Historical workarounds: Looks like a bug, but it's intentional
How to handle:
You: You said this section does X, but when I actually run it, it does Y.
What could cause this?
Please re-analyze, considering [your observation].
Let AI correct itself rather than switching to a different question.
Team Collaboration: How to Lead Juniors Through Legacy Code
Why Juniors Get Stuck with Legacy Code
They’re stuck not because they’re lazy or dumb.
It's because:
- Not enough context: They don't know the system overview or why it's designed this way
- No confidence: Afraid of breaking things, afraid of getting blamed, afraid of asking stupid questions
- No methodology: They don't know where to start or how to break down the problem
Pair Programming with AI
New pair programming mode:
Senior + Junior + AI (trio)
Flow:
1. Senior gives context: "This function is order core, affects revenue"
2. Junior talks to AI, Senior listens
3. AI explains code, Junior asks questions
4. Senior intervenes at key moments: "AI is wrong here, actually it's..."
5. Junior learns not just code, but Senior's judgment process
Benefits:
- Senior doesn’t need to explain basic concepts constantly (AI handles it)
- Senior can focus on “what AI doesn’t know” (business logic, historical context)
- Junior has AI as safety net, comfortable asking questions
- Junior sees how Senior collaborates with AI, learns methodology
Build a Legacy Code Knowledge Base
After each legacy code session, leave records:
## OrderProcessor.java Exploration Log
### 2025-12-08 by Tim
- Understood lines 1-400 (order validation)
- Found potential null pointer at line 45
- Added 3 test cases
- Unsolved mystery: What's the magic number 1.05 on line 380?
### 2025-12-15 by Amy
- Asked senior John, 1.05 is special tax rate from 2021 client
- Added comment explaining it
- Extracted validateOrder() function, 400 lines → 80 lines
### 2025-12-22 by Tim
- Processed lines 401-800 (inventory check)
- Found N+1 query issue, not fixing this time, documented
This way knowledge isn’t just in one person’s head.
Next person who comes in reads this log first, knows what predecessors did and discovered.
New Code Review Standards
Old review standard: “Is this code correct?”
New review standard: “Does this code make the system easier to maintain?”
New review checklist:
- Are there tests? (at least the happy path)
- Is readability improved? (variable naming, extracted functions)
- Are there comments? (especially for non-intuitive logic)
- Is the change scope minimized? (don't sneak in big refactors)
- If it's legacy code, is the knowledge base updated?
Managing Up: How to Convince Your Manager to Give You Time
Managers Don’t Care About “Dirty Code”
You say: “This function is 4000 lines, maintainability is poor, needs refactoring.”
Manager hears: “You want to spend time on something that won’t produce new features.”
You need to translate into manager language.
Three Persuasion Frameworks
Framework 1: Risk Language
This function handles all orders, runs 100,000 times daily.
Currently no tests, no docs, original author left.
Last incident took 12 hours to locate the bug.
If something happens on weekend, no one can handle it quickly.
Suggest investing 3 days for basic cleanup:
- Add tests for critical paths
- Write a structure document
- Mark known risk points
This way next incident, resolution time drops from 12 hours to 2 hours.
Framework 2: Efficiency Language
Last time changing order discount logic, estimated 2 days, actually took 5 days.
3 days spent on:
- Understanding 4000 lines (1.5 days)
- Manual testing to ensure nothing broke (1 day)
- Fixing issues hit during changes (0.5 days)
Next three months have 5 requirements touching this area.
If we spend 3 days cleaning up:
- Each requirement saves 2 days
- Three months saves 10 days
- ROI: 3 days investment, 10 days return
Framework 3: Talent Language
Amy joined the team 1 month ago.
She still won't touch the order module because the code is too complex and there's no documentation.
In contrast, user module has tests and docs, she could work independently by week 2.
If we don't clean up order module:
- Only I can ever change this area
- Amy's growth is limited
- I can never take vacation (no one can backup)
Don’t Ask for Too Much Time at Once
Wrong: “I need two weeks to focus on refactoring this.”
Manager will say: “When we have time.”
Then there’s never time.
Right: “Each sprint I spend half a day cleaning one small piece, following requirements.”
Manager easily agrees, and you actually do it.
Show Progress
After each cleanup, send a brief update:
This week cleaned up processOrder's order validation section:
- Extracted validateOrder(), 350 lines → 80 lines
- Added 5 test cases
- Next time changing this area, estimated 2 hours saved
Cumulative progress: 4000 lines, 400 cleaned (10%)
Let your manager see:
- What you're doing
- Concrete results
- Benefits to the team
Quantifying Technical Debt: Explaining to Non-Technical People
Analogy: Hidden Costs of a House
Imagine you bought a 30-year-old house.
Looks livable, but:
- Old wiring, occasional power trips
- Rusted pipes, unstable water pressure
- No blueprints, every renovation requires guessing where pipes are
You can keep living there, but:
- Every problem costs more to fix
- Contractors are afraid to touch it, worried about cascading effects
- One day the whole system might collapse
Technical debt is like this.
System runs, but maintenance costs keep rising.
Every change brings bigger risks.
Three Quantifiable Metrics
Metric 1: Time to Fix
Last order system bug:
- Locating problem: 8 hours (searching through 4000 lines)
- Fixing problem: 2 hours
- Testing verification: 2 hours (manual, no automated tests)
Total: 12 hours
With tests and documentation, estimated only 4 hours needed.
Metric 2: Cost of Change
Last time changing order discount logic:
- Requirement itself: change 10 lines of code
- Actual time spent: 5 days
Where time went:
- Understanding code: 2 days
- Manual testing: 1 day
- Fixing issues encountered: 1 day
- Actual changes: 0.5 days
- Code review + deployment: 0.5 days
Modules with tests, same requirement takes only 1 day.
Metric 3: Onboarding Time
Amy joined team 1 month ago:
- User module: Independent by week 2 (has tests, has docs)
- Payment module: Independent by week 3 (has tests, lacks docs)
- Order module: 1 month in, still won't touch (no tests, no docs, too complex)
Every new hire steps in the same pits.
Visual Presentation
Draw a simple chart for your manager:
Module Health Dashboard
User Module ████████░░ 80% ✓ Has tests, has docs
Payment Module ██████░░░░ 60% △ Has tests, lacks docs
Order Module ██░░░░░░░░ 20% ✗ No tests, no docs ← Highest risk
Manager sees the problem at a glance.
Build a Technical Debt List
Don’t just complain “code is bad.” Build a concrete list:
| Item | Risk Level | Estimated Cleanup Time | Benefit After Cleanup |
|---|---|---|---|
| processOrder no tests | High | 3 days | Fix time -8 hours/incident |
| User module no docs | Medium | 1 day | Onboarding -1 week |
| Payment API no error handling | High | 2 days | Reduce complaints, lower refund rate |
| Reports module slow | Low | 5 days | Query time 30s → 3s |
With a list, you can discuss priorities and put it in the roadmap.
Complete Process Summary
Week 1: Build Cognition
Goal: Know what this code does and its risks
- Have AI explain each section → Output: Section summaries
- Have AI draw dependency map → Output: Dependency map
- Have AI find landmines → Output: Risk list
Deliverable: A “system map” document
Week 2: Build Safety Net
Goal: Have enough tests before daring to change
- Generate Golden Master test data
- Have AI generate tests, you review and supplement
- Ensure critical paths have test coverage
Deliverable: Test coverage from 0% → 30%+ (critical paths)
Week 3+: Gradual Improvement
Goal: Each time you go in, make it a little better
- Start with most independent section
- Each time change only one small piece, merge immediately
- Mark logic you can’t understand, don’t force changes
- Update knowledge base for the next person
Deliverable: Each sprint reduces 5-10% of code lines
From Fear to Control
Before, opening 4000 lines:
- Brain freezes
- Don’t know where to start
- Afraid to change anything
- Want to quit
Now:
- AI translates, you read summaries
- AI maps dependencies, you know boundaries
- AI writes tests, you have safety net
- AI suggests splits, you decide order
Those 4000 lines are still 4000 lines.
But you’re not afraid anymore.
Because you have a method. Because you have tools. Because you know this isn’t your problem—it’s historical debt.
Your job is to make it a little better, bit by bit.
Not heroic big refactoring. Daily small Boy Scout Rule improvements.
One year later, 4000 lines becomes 2000. Has tests, has documentation, people dare to change it.
That’s your accomplishment.