Tutorial: Clean Up a Legacy Codebase
You've inherited a codebase. It's big, it's messy, and nobody knows what's actually used. This tutorial walks you through using Skylos to clean it up safely.
Before You Start
Prerequisites:
- Git repository with clean working tree
- Test suite (even partial coverage helps)
- 2-4 hours of uninterrupted time
# Create a cleanup branch
git checkout -b cleanup/dead-code-removal
Phase 1: Assessment (30 min)
First, understand the scope of the problem.
Step 1: Initial Scan
pip install skylos
skylos . --json -o baseline.json
Step 2: Review the Numbers
skylos .
Skylos Python Static Analysis Results
Analyzed 156 file(s)
Unreachable: 47 Unused imports: 234 Unused params: 89 Unused vars: 56
Don't panic. These numbers are normal for legacy codebases.
Step 3: Categorize the Findings
# High-confidence findings (safe to act on)
skylos . --confidence 80 --json -o high-confidence.json
# Count them
cat high-confidence.json | python -c "import json,sys; d=json.load(sys.stdin); print(sum(len(v) for v in d.values() if isinstance(v,list)))"
Start with high-confidence findings. These are the safest to remove.
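For a per-category breakdown rather than a single total, a small script can do the counting. This is a minimal sketch that assumes, like the one-liner above, that the report is a JSON object whose list-valued entries hold findings. The key names in the sample (`unused_imports`, `unreachable_functions`, `analysis_summary`) are placeholders for illustration, not confirmed Skylos output keys.

```python
import json
from pathlib import Path


def count_findings(report: dict) -> dict:
    """Return {category: number_of_findings} for every list-valued key."""
    return {key: len(value) for key, value in report.items() if isinstance(value, list)}


def load_report(path: str) -> dict:
    """Read a JSON report such as the high-confidence.json written above."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Hypothetical sample data; a real report will use Skylos's own key names.
    sample = {
        "unused_imports": [{"name": "json"}, {"name": "Optional"}],
        "unreachable_functions": [{"name": "legacy_handler"}],
        "analysis_summary": {"total_files": 156},  # non-list values are skipped
    }
    for category, count in sorted(count_findings(sample).items()):
        print(f"{category}: {count}")
```

To use it on a real scan, pass the result of `load_report("high-confidence.json")` to `count_findings`.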
Phase 2: Quick Wins (1 hour)
Remove obvious dead code that's safe to delete.
Step 1: Unused Imports
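Unused imports are the safest to remove: deleting one rarely changes runtime behavior. The exception is a module imported only for its side effects (for example, one that registers a plugin when loaded), so check for those before deleting.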
# See unused imports
skylos . --confidence 90 | grep -A 100 "Unused Imports"
────────────────────── Unused Imports ───────────────────────
# Name Location
1 json api/views.py:3
2 Optional models.py:1
3 deprecated_lib utils/helpers.py:5
Step 2: Interactive Removal
Run Skylos in interactive mode and select only imports for now:
? Select unused imports to act on (space to select)
❯ ◉ json (api/views.py:3)
◉ Optional (models.py:1)
◉ deprecated_lib (utils/helpers.py:5)
Step 3: Apply Changes
# Remove selected imports
skylos . -i
Step 4: Verify
# Run tests
pytest
# If tests pass, commit
git add -A
git commit -m "chore: remove unused imports (47 files)"
Commit frequently. Small commits are easier to revert if something breaks.
Phase 3: Dead Functions (1-2 hours)
Functions are riskier than imports. Proceed carefully.
Step 1: List Candidates
Run the scan again and look at the unreachable functions section of the report:
─────────────────── Unreachable Functions ───────────────────
# Name Location
1 legacy_handler api/legacy.py:45
2 unused_helper utils/helpers.py:120
3 old_validator models/validators.py:30
Step 2: Investigate Before Deleting
For each function, check:
# Is it called dynamically?
grep -r "getattr.*legacy_handler" .
grep -r "'legacy_handler'" .
grep -r '"legacy_handler"' .
# Is it in a public API?
grep -r "from.*import.*legacy_handler" .
# Is it referenced in configs?
grep -r "legacy_handler" --include="*.yaml" --include="*.json" --include="*.toml" .
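The same checks can be run in one pass from Python, which helps when there are many candidate names to investigate. The sketch below is an illustration, not part of Skylos: `find_references` is a hypothetical helper, and the default suffix list simply mirrors the file types searched above.

```python
import re
from pathlib import Path


def find_references(root, name, suffixes=(".py", ".yaml", ".json", ".toml")):
    """Return (path, line_number, line) for each line mentioning `name` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_references(".", "legacy_handler"):
        print(f"{path}:{number}: {line}")
```

The function's own `def` line will appear among the hits. Any other hit, especially a quoted string or a config entry, is a reason to keep the function until you know who calls it.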
Step 3: Comment Out First (Safer)
Have Skylos comment out the selected functions rather than delete them. This adds markers instead of deleting:
# SKYLOS DEADCODE: def legacy_handler():
# SKYLOS DEADCODE: """Old handler, replaced in v2."""
# SKYLOS DEADCODE: pass
Step 4: Run Full Test Suite
pytest
# Or your full CI command
make test
Step 5: Verify in Staging
If tests pass, deploy to staging:
git add -A
git commit -m "chore: comment out potentially dead functions"
git push origin cleanup/dead-code-removal
# Deploy to staging
Monitor for a few days. If nothing breaks, convert comments to deletions.
Step 6: Permanent Deletion
After staging validation:
# Find all marked dead code
grep -r "SKYLOS DEADCODE" . --include="*.py"
# Delete the commented lines (or do it manually)
# Then commit
git commit -am "chore: remove confirmed dead functions"
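If there are too many marked lines to delete by hand, the removal can be scripted. This is a minimal sketch, assuming every commented-out line begins (after any indentation) with the `# SKYLOS DEADCODE` marker shown in Step 3. `strip_marked_lines` is a hypothetical helper, not a Skylos command.

```python
MARKER = "# SKYLOS DEADCODE"  # marker text as shown in Step 3


def strip_marked_lines(source: str) -> str:
    """Return `source` without any line whose first non-blank text is the marker."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not line.lstrip().startswith(MARKER)
    ]
    return "".join(kept)


if __name__ == "__main__":
    example = (
        "import os\n"
        "# SKYLOS DEADCODE: def legacy_handler():\n"
        "# SKYLOS DEADCODE:     pass\n"
        "\n"
        "def live():\n"
        "    return 1\n"
    )
    print(strip_marked_lines(example))
```

Apply it file by file on a clean working tree and review `git diff` before committing, so that an unexpected match is easy to spot and revert.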
Phase 4: Framework Code (30 min)
Framework code needs special attention due to implicit calling.
Step 1: Lower Confidence Threshold
Re-run the scan with a lower --confidence value than the 80 used in Phase 1.
You'll see more findings, but also more false positives.
Step 2: Check Framework Patterns
For each finding, verify it's not:
# Django view (called via URL routing)
def user_detail(request, pk):  # Check urls.py
    pass

# Flask route (called via decorator)
@app.route("/users")
def get_users():  # Skylos should catch this, but verify
    pass

# Celery task (called via .delay())
@celery.task
def process_job():  # Called asynchronously
    pass

# Signal receiver
@receiver(post_save)
def on_save():  # Called implicitly
    pass
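Three of the four patterns above are registered through a decorator, so listing the decorated functions in a file is a quick way to triage findings. The sketch below uses Python's standard `ast` module and requires Python 3.9 or later for `ast.unparse`. `decorated_functions` is a hypothetical helper and says nothing about how Skylos itself detects framework code.

```python
import ast


def decorated_functions(source: str) -> dict:
    """Map each decorated function name to the source text of its decorators."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.decorator_list:
            found[node.name] = [ast.unparse(d) for d in node.decorator_list]
    return found


if __name__ == "__main__":
    sample = '''
@app.route("/users")
def get_users():
    pass

@celery.task
def process_job():
    pass

def plain_helper():
    pass
'''
    for name, decorators in decorated_functions(sample).items():
        print(name, decorators)
```

A flagged function that shows up here is likely called by the framework. Undecorated entry points, such as the Django view wired up in urls.py, still need a manual check.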
Step 3: Add Exclusions for False Positives
# pyproject.toml
[tool.skylos]
ignore = ["specific_false_positive_function"]
Or inline:
def specific_false_positive_function():  # noqa: skylos
    pass
Phase 5: Quality Issues (Optional)
While you're cleaning up, address quality issues too.
skylos . --quality
────────────────────────── Quality Issues ──────────────────────────
# Type Function Detail Location
1 Complexity process_order McCabe=23 (target ≤10) orders.py:45
2 Nesting validate_all Depth 7 (target ≤3) validators.py:23
Prioritize by Severity
| Severity | Action |
|---|---|
| CRITICAL | Fix now (complexity 25+) |
| HIGH | Fix soon (complexity 15-24) |
| MEDIUM | Add to backlog |
Quick Complexity Fixes
# Before: Complexity 15
def process(data):
    if data.valid:
        if data.type == "A":
            for item in data.items:
                if item.active:
                    # nested logic...

# After: Complexity 6 (extract + early return)
def process(data):
    if not data.valid:
        return None
    if data.type == "A":
        return process_type_a(data)
    return process_other(data)

def process_type_a(data):
    return [handle(item) for item in data.items if item.active]
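To check that a refactor actually reduced branching, you can count decision points yourself. The sketch below is a simplified approximation of McCabe complexity built on the standard `ast` module. It is not Skylos's implementation, and its numbers will not necessarily match the figures Skylos reports. Use it only to compare a function before and after a change.

```python
import ast

# Node types that each add one decision point in this simplified count.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def rough_complexity(func_source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(func_source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            score += len(node.values) - 1  # each extra and/or operand adds a path
        elif isinstance(node, ast.comprehension):
            score += 1 + len(node.ifs)  # the loop itself plus each filter
    return score


if __name__ == "__main__":
    refactored = '''
def process(data):
    if not data.valid:
        return None
    if data.type == "A":
        return process_type_a(data)
    return process_other(data)
'''
    print(rough_complexity(refactored))  # prints 3: the base path plus two if statements
```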
Phase 6: Set Up Prevention
Don't let dead code accumulate again.
Pre-commit Hook
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/duriantaco/skylos
    rev: v2.6.0
    hooks:
      - id: skylos-scan
        args: [".", "--confidence", "80", "--danger"]
CI Gate
# .github/workflows/quality.yml
- name: Skylos Gate
  run: skylos . --danger --quality --gate
Scheduled Cleanup
# .github/workflows/cleanup-reminder.yml
name: Monthly Cleanup Reminder
on:
  schedule:
    - cron: '0 9 1 * *'  # First of each month
jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install skylos
      - run: |
          skylos . --json -o report.json
          echo "## Monthly Skylos Report" >> $GITHUB_STEP_SUMMARY
          skylos . >> $GITHUB_STEP_SUMMARY
Checklist
□ Create cleanup branch
□ Run baseline scan
□ Remove unused imports (commit)
□ Comment out dead functions (commit)
□ Run tests
□ Deploy to staging
□ Monitor for 3-5 days
□ Convert comments to deletions (commit)
□ Address high-severity complexity (commit)
□ Set up pre-commit hook
□ Set up CI gate
□ Merge to main
Troubleshooting
Tests fail after removing 'dead' code
The code wasn't dead: it was called dynamically or by tests.
# Revert
git checkout -- path/to/file.py
# Add suppression
def not_actually_dead():  # noqa: skylos
    pass
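The example below illustrates what "called dynamically" means in practice. The handler methods are never referenced by name anywhere in the source. The name is assembled from a string at runtime, so a search for `handle_create(` finds no call site. All names here (`Handlers`, `dispatch`) are hypothetical, and whether a given tool flags such methods depends on its own heuristics.

```python
class Handlers:
    """Hypothetical handler collection; nothing calls these methods by name in source."""

    def handle_create(self, payload):
        return f"created {payload}"

    def handle_delete(self, payload):
        return f"deleted {payload}"


def dispatch(action: str, payload: str) -> str:
    """Look the handler up by a name built at runtime."""
    handler = getattr(Handlers(), f"handle_{action}", None)
    if handler is None:
        raise ValueError(f"unknown action: {action}")
    return handler(payload)


if __name__ == "__main__":
    print(dispatch("create", "user 42"))  # prints "created user 42"
```

A test that exercises `dispatch` with each supported action would have caught the removal of either handler, which is why adding such a test is worthwhile alongside the suppression.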
Too many findings to review
Raise the confidence threshold:
skylos . --confidence 90
Focus on the highest-confidence items first.
Framework code keeps getting flagged
Skylos might not recognize your specific framework pattern. Add inline suppression:
@custom_framework_decorator  # noqa: skylos
def handler():
    pass
Production broke after removing code
If code was called dynamically and tests didn't catch it:
- Revert immediately: git revert HEAD
- Add monitoring for that code path
- Add suppression: # noqa: skylos
- Consider adding a test for the dynamic call
Expected Results
After completing this tutorial:
| Metric | Before | After |
|---|---|---|
| Unused imports | 234 | < 10 |
| Dead functions | 47 | < 5 |
| Lines of code | 50,000 | 42,000 |
| Test coverage | 60% | 72% |
| Build time | 45s | 38s |
Next Steps
- CI/CD Integration: Set up automated quality gates
- Quality Gate: Configure gate policies