
Tutorial: Clean Up a Legacy Codebase

You've inherited a codebase. It's big, it's messy, and nobody knows what's actually used. This tutorial walks you through using Skylos to clean it up safely.

Before You Start

Warning: make sure these prerequisites are in place.

  • Git repository with clean working tree
  • Test suite (even partial coverage helps)
  • 2-4 hours of uninterrupted time

# Create a cleanup branch
git checkout -b cleanup/dead-code-removal

Phase 1: Assessment (30 min)

First, understand the scope of the problem.

Step 1: Initial Scan

pip install skylos

skylos . --json -o baseline.json

Step 2: Review the Numbers

skylos .
Skylos Python Static Analysis Results
Analyzed 156 file(s)

Unreachable: 47 Unused imports: 234 Unused params: 89 Unused vars: 56

Don't panic. These numbers are normal for legacy codebases.

Step 3: Categorize the Findings

# High-confidence findings (safe to act on)
skylos . --confidence 80 --json -o high-confidence.json

# Count them
cat high-confidence.json | python -c "import json,sys; d=json.load(sys.stdin); print(sum(len(v) for v in d.values() if isinstance(v,list)))"

Start with high-confidence findings. These are the safest to remove.
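A per-category breakdown is often more useful than a single total. The sketch below makes the same assumption as the one-liner above: the report is a JSON object whose findings live in list-valued top-level keys. The key names in the sample are hypothetical, not the documented Skylos schema.

```python
import json
from collections import Counter


def count_findings(report: dict) -> Counter:
    """Count the items under every top-level key whose value is a list."""
    return Counter(
        {key: len(value) for key, value in report.items() if isinstance(value, list)}
    )


def load_and_count(path: str) -> Counter:
    """Read a JSON report from disk and count its findings per category."""
    with open(path, encoding="utf-8") as fh:
        return count_findings(json.load(fh))


# Hypothetical report shape, for illustration only.
sample_report = {
    "unused_imports": [{"name": "json"}, {"name": "Optional"}],
    "unused_functions": [{"name": "legacy_handler"}],
    "summary": {"files": 156},  # not a list, so it is ignored
}

for category, count in count_findings(sample_report).most_common():
    print(f"{category:<20} {count}")
```

Calling load_and_count("high-confidence.json") gives the same breakdown for the real report, provided its layout matches the assumption.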


Phase 2: Quick Wins (1 hour)

Remove obvious dead code that's safe to delete.

Step 1: Unused Imports

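Unused imports are usually the safest findings to remove. The exceptions are imports kept for their side effects (for example, a module that registers plugins when imported) and names re-exported from a package's __init__.py, so still run the tests after removing them.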

# See unused imports
skylos . --confidence 90 | grep -A 100 "Unused Imports"
────────────────────── Unused Imports ───────────────────────
#  Name            Location
1  json            api/views.py:3
2  Optional        models.py:1
3  deprecated_lib  utils/helpers.py:5

Step 2: Interactive Removal

In the interactive prompt, select only imports for now:

? Select unused imports to act on (space to select)
❯ ◉ json (api/views.py:3)
◉ Optional (models.py:1)
◉ deprecated_lib (utils/helpers.py:5)

Step 3: Apply Changes

# Remove selected imports
skylos . -i

Step 4: Verify

# Run tests
pytest

# If tests pass, commit
git add -A
git commit -m "chore: remove unused imports (47 files)"

Tip: Commit frequently. Small commits are easier to revert if something breaks.


Phase 3: Dead Functions (1-2 hours)

Functions are riskier than imports. Proceed carefully.

Step 1: List Candidates

Rerun the scan and look at the Unreachable Functions section of the output:

─────────────────── Unreachable Functions ───────────────────
#  Name            Location
1  legacy_handler  api/legacy.py:45
2  unused_helper   utils/helpers.py:120
3  old_validator   models/validators.py:30

Step 2: Investigate Before Deleting

For each function, check:

# Is it called dynamically?
grep -r "getattr.*legacy_handler" .
grep -r "'legacy_handler'" .
grep -r '"legacy_handler"' .

# Is it in a public API?
grep -r "from.*import.*legacy_handler" .

# Is it referenced in configs?
grep -rn "legacy_handler" --include="*.yaml" --include="*.yml" --include="*.json" --include="*.toml" .
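When there are many candidates, the same checks can be scripted. This is a minimal sketch, not part of Skylos: find_references and TEXT_SUFFIXES are hypothetical names, and the script only finds literal mentions of the name, so a name assembled at runtime (for example "legacy_" + suffix) will still slip through.

```python
import re
from pathlib import Path

# File types worth searching; extend this for your project (assumption).
TEXT_SUFFIXES = {".py", ".yaml", ".yml", ".json", ".toml", ".cfg", ".ini"}


def find_references(root: str, name: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for each mention of `name`
    that is not the function's own `def` line."""
    word = re.compile(rf"\b{re.escape(name)}\b")
    definition = re.compile(rf"^\s*(async\s+)?def\s+{re.escape(name)}\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        if ".git" in path.parts:
            continue  # skip repository metadata
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if word.search(line) and not definition.match(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result from find_references(".", "legacy_handler") is a good sign but not proof. Commenting the function out first, as in the next step, stays the safer route.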

Step 3: Comment Out First (Safer)

Have Skylos comment the functions out rather than delete them. Each affected line gets a marker prefix:

# SKYLOS DEADCODE: def legacy_handler():
# SKYLOS DEADCODE:     """Old handler, replaced in v2."""
# SKYLOS DEADCODE:     pass

Step 4: Run Full Test Suite

pytest

# Or your full CI command
make test

Step 5: Verify in Staging

If tests pass, deploy to staging:

git add -A

git commit -m "chore: comment out potentially dead functions"
git push origin cleanup/dead-code-removal
# Deploy to staging

Monitor for 3-5 days. If nothing breaks, convert the commented-out code to deletions.

Step 6: Permanent Deletion

After staging validation:

# Find all marked dead code
grep -r "SKYLOS DEADCODE" . --include="*.py"

# Delete the commented lines (or do it manually)
# Then commit
git commit -am "chore: remove confirmed dead functions"
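If there are too many marked lines to delete by hand, a short script can strip them. This sketch assumes every commented-out line starts (after indentation) with the exact marker shown in Step 3; strip_marked_lines and clean_tree are hypothetical helper names, not Skylos commands.

```python
from pathlib import Path

MARKER = "# SKYLOS DEADCODE:"


def strip_marked_lines(source: str) -> str:
    """Drop every line whose first non-blank text is the marker."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not line.lstrip().startswith(MARKER)
    ]
    return "".join(kept)


def clean_tree(root: str) -> list[str]:
    """Rewrite each .py file under `root` that contains marked lines.
    Returns the paths that were changed."""
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_marked_lines(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Review the result with git diff before committing, since the script cannot tell a marker Skylos added from one someone typed by hand.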

Phase 4: Framework Code (30 min)

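Framework code needs special attention because frameworks call it implicitly (through URL routing, decorators, and signals), so static analysis may see no direct caller.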

Step 1: Lower Confidence Threshold

Rerun the scan with a --confidence value lower than the 80 used in Phase 1.

You'll see more findings, but also more false positives.

Step 2: Check Framework Patterns

For each finding, verify it isn't one of these implicitly called patterns:

# Django view (called via URL routing)
def user_detail(request, pk):  # Check urls.py
    pass

# Flask route (called via decorator)
@app.route("/users")
def get_users():  # Skylos should catch this, but verify
    pass

# Celery task (called via .delay())
@celery.task
def process_job():  # Called asynchronously
    pass

# Signal receiver
@receiver(post_save)
def on_save(sender, **kwargs):  # Called implicitly when the signal fires
    pass
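Three of the four patterns above rely on a decorator, so one quick triage step is to list which flagged functions are decorated at all. The sketch below uses only the standard ast module (ast.unparse needs Python 3.9+). decorated_functions is a hypothetical helper, and a decorator is a hint that a framework may call the function, not proof.

```python
import ast


def decorated_functions(source: str) -> dict[str, list[str]]:
    """Map each decorated function name to the source text of its decorators."""
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.decorator_list:
            result[node.name] = [ast.unparse(d) for d in node.decorator_list]
    return result
```

Undecorated entry points such as Django views will not show up here. Check those against urls.py by hand.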

Step 3: Add Exclusions for False Positives

# pyproject.toml
[tool.skylos]
ignore = ["specific_false_positive_function"]

Or inline:

def specific_false_positive_function():  # noqa: skylos
    pass

Phase 5: Quality Issues (Optional)

While you're cleaning up, address quality issues too.

skylos . --quality

────────────────────────── Quality Issues ──────────────────────────
#  Type        Function       Detail                  Location
1  Complexity  process_order  McCabe=23 (target ≤10)  orders.py:45
2  Nesting     validate_all   Depth 7 (target ≤3)     validators.py:23

Prioritize by Severity

Severity   Action
CRITICAL   Fix now (complexity 25+)
HIGH       Fix soon (complexity 15-24)
MEDIUM     Add to backlog
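The thresholds above are easy to encode if you want to sort a long list of findings. In this sketch the CRITICAL and HIGH bands come from the table. The MEDIUM band (11-14) is an assumption inferred from the "target ≤10" shown in the sample output, and severity is a hypothetical helper, not a Skylos API.

```python
def severity(complexity: int) -> str:
    """Bucket a McCabe complexity score using the thresholds in this tutorial."""
    if complexity >= 25:
        return "CRITICAL"  # fix now
    if complexity >= 15:
        return "HIGH"  # fix soon
    if complexity > 10:  # assumption: anything above the target of 10
        return "MEDIUM"  # add to backlog
    return "OK"


# process_order from the sample output has McCabe=23.
print(severity(23))
```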

Quick Complexity Fixes

# Before: Complexity 15
def process(data):
    if data.valid:
        if data.type == "A":
            for item in data.items:
                if item.active:
                    # nested logic...
                    pass

# After: Complexity 6 (extract + early return)
def process(data):
    if not data.valid:
        return None
    if data.type == "A":
        return process_type_a(data)
    return process_other(data)

def process_type_a(data):
    return [handle(item) for item in data.items if item.active]
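To check that a refactor really reduced branching, you can count decision points yourself. The sketch below is a rough approximation of McCabe complexity built on the standard ast module. It is not the algorithm Skylos uses, so its absolute numbers may differ from the Skylos report, and rough_complexity is a hypothetical name.

```python
import ast

# Node types that each add one decision point.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def rough_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        if isinstance(node, ast.comprehension):
            score += len(node.ifs)  # each `if` filter in a comprehension
        elif isinstance(node, ast.BoolOp):
            score += len(node.values) - 1  # each extra `and` / `or` operand
    return score
```

Run it on the source of a function before and after extraction. What matters is that the number drops, not the exact value.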

Phase 6: Set Up Prevention

Don't let dead code accumulate again.

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/duriantaco/skylos
    rev: v2.6.0
    hooks:
      - id: skylos-scan
        args: [".", "--confidence", "80", "--danger"]

CI Gate

# .github/workflows/quality.yml
- name: Skylos Gate
  run: skylos . --danger --quality --gate

Scheduled Cleanup

# .github/workflows/cleanup-reminder.yml
name: Monthly Cleanup Reminder
on:
  schedule:
    - cron: '0 9 1 * *' # 09:00 UTC on the first of each month

jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install skylos
      - run: |
          skylos . --json -o report.json
          echo "## Monthly Skylos Report" >> "$GITHUB_STEP_SUMMARY"
          skylos . >> "$GITHUB_STEP_SUMMARY"
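The monthly report is most useful when compared against the baseline.json saved in Phase 1. The sketch below flags categories whose finding count has grown. It assumes both files are JSON objects with list-valued top-level keys; the key names in the example are hypothetical, and regressions is an illustrative helper rather than a Skylos feature.

```python
import json


def list_counts(report: dict) -> dict[str, int]:
    """Count the items under each list-valued top-level key."""
    return {key: len(value) for key, value in report.items() if isinstance(value, list)}


def regressions(baseline: dict, current: dict) -> dict[str, tuple[int, int]]:
    """Return {category: (before, after)} for every category that grew."""
    before, after = list_counts(baseline), list_counts(current)
    return {
        key: (before.get(key, 0), count)
        for key, count in after.items()
        if count > before.get(key, 0)
    }


def load(path: str) -> dict:
    """Read one JSON report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, regressions(load("baseline.json"), load("report.json")) returns an empty dict when nothing has regressed.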

Checklist

□ Create cleanup branch
□ Run baseline scan
□ Remove unused imports (commit)
□ Comment out dead functions (commit)
□ Run tests
□ Deploy to staging
□ Monitor for 3-5 days
□ Convert comments to deletions (commit)
□ Address high-severity complexity (commit)
□ Set up pre-commit hook
□ Set up CI gate
□ Merge to main

Troubleshooting

Tests fail after removing 'dead' code

The code wasn't dead; it was called dynamically or by the tests.

# Revert
git checkout -- path/to/file.py

# Add suppression
def not_actually_dead():  # noqa: skylos
    pass

Too many findings to review

Raise the confidence threshold, for example to the 90 used for imports in Phase 2:

skylos . --confidence 90

Focus on the highest-confidence items first.

Framework code keeps getting flagged

Skylos might not recognize your specific framework pattern. Add inline suppression:

@custom_framework_decorator  # noqa: skylos
def handler():
    pass

Production broke after removing code

If code was called dynamically and tests didn't catch it:

  1. Revert immediately: git revert HEAD
  2. Add monitoring for that code path
  3. Add suppression: # noqa: skylos
  4. Consider adding a test for the dynamic call
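A test for a dynamic call can be small. The sketch below shows a handler reached only through getattr, which is the kind of call static analysis tends to miss, and a pytest-style test that exercises that path. Handlers, dispatch, and the action name are all hypothetical.

```python
class Handlers:
    def handle_refund(self, payload):
        # Never called by name anywhere; only reached through dispatch().
        return f"refund:{payload}"


def dispatch(handlers, action, payload):
    """Look up `handle_<action>` at runtime and call it."""
    method = getattr(handlers, f"handle_{action}", None)
    if method is None:
        raise ValueError(f"unknown action: {action}")
    return method(payload)


def test_dispatch_reaches_refund_handler():
    # Fails as soon as handle_refund is deleted or renamed.
    assert dispatch(Handlers(), "refund", 42) == "refund:42"
```

With a test like this in place, deleting the handler breaks the test suite instead of production.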

Expected Results

After completing this tutorial:

Metric           Before   After
Unused imports   234      < 10
Dead functions   47       < 5
Lines of code    50,000   42,000
Test coverage    60%      72%
Build time       45s      38s

Next Steps

  • CI/CD Integration: Set up automated quality gates
  • Quality Gate: Configure gate policies