Skylos isn’t a regex-based linter. It’s a multi-pass static analyzer that builds a project-wide model of your code's definitions and references before reporting issues.

The Analysis Pipeline

When you run skylos ., here’s what happens:

Phase 1: Discovery

Skylos starts by mapping your project:
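# Illustrative sketch of discovery, not Skylos's actual implementation
from pathlib import Path

# Stand-in for config.exclude
exclude = {"__pycache__", ".git", "venv", ".venv", "node_modules", "build", "dist"}
files = [p for p in Path(".").rglob("*.py") if not exclude.intersection(p.parts)]
# Result: list of files to analyze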
Default exclusions: __pycache__, .git, venv, .venv, node_modules, build, dist.
These are skipped because they contain non-source files or third-party code you don’t control.

Phase 2: Parsing

Each file is parsed into an Abstract Syntax Tree (AST), a structured representation of your code. From the AST, Skylos extracts:

| Extraction | What It Captures |
| --- | --- |
| Definitions | Functions, classes, methods, variables, imports |
| References | Function calls, attribute access, name lookups |
| Framework signals | Decorators, base classes, magic patterns |

Why AST, Not Regex?

Regex can’t understand code structure:
# Regex sees "def unused" and might flag it
"""
def unused():  # This is in a docstring, not real code!
    pass
"""

def real_function():
    pass
AST parsing knows the difference between code and strings.
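The difference can be checked directly with Python's standard `re` and `ast` modules. This is a minimal sketch of the idea, not Skylos's own parser; the `SOURCE` string simply reuses the snippet above.

```python
import ast
import re

SOURCE = '''"""
def unused():  # This is in a docstring, not real code!
    pass
"""

def real_function():
    pass
'''

# A line-oriented regex matches every "def" at the start of a line,
# including the one inside the string literal.
regex_hits = re.findall(r"^def (\w+)", SOURCE, flags=re.MULTILINE)

# The AST only contains FunctionDef nodes for real definitions.
tree = ast.parse(SOURCE)
ast_hits = [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]

print(regex_hits)  # ['unused', 'real_function']
print(ast_hits)    # ['real_function']
```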

Phase 3: Analysis Engines

Skylos runs multiple analysis engines in parallel:

Reference Graph Builder

Creates a map of what calls what. Any definition with zero incoming edges is potentially dead code.
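As a rough illustration of the "zero incoming edges" idea, the sketch below collects defined names and referenced names from a single module with the standard `ast` module and reports the difference. The function name `find_unreferenced` and the name-based matching are assumptions made for this example; a real reference graph would also need to resolve scopes, imports, and cross-file references.

```python
import ast

SOURCE = """
def helper():
    return 1

def orphan():
    return 2

def main():
    return helper()
"""

def find_unreferenced(source):
    """Return names that are defined but never loaded or accessed."""
    tree = ast.parse(source)
    defined = {
        n.name
        for n in ast.walk(tree)
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }
    referenced = set()
    for n in ast.walk(tree):
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load):
            referenced.add(n.id)       # plain name lookups, e.g. helper()
        elif isinstance(n, ast.Attribute):
            referenced.add(n.attr)     # attribute access, e.g. obj.method
    return defined - referenced

print(sorted(find_unreferenced(SOURCE)))  # ['main', 'orphan']
```

Note that `main` is reported alongside `orphan`: an entry point has no incoming edges either, which is why a raw "is it referenced?" check is only a starting point.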

Taint Analysis Engine

Traces data flow from sources to sinks. The taint “flows” through assignments. When it reaches a sink, we flag it.
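A deliberately small sketch of this propagation is shown below. It handles only straight-line, top-level assignments, and the `SOURCES` and `SINKS` sets are hypothetical stand-ins chosen for the example, not Skylos's actual source and sink lists.

```python
import ast

SOURCES = {"input"}        # hypothetical: calls that produce untrusted data
SINKS = {"eval", "exec"}   # hypothetical: calls that must not receive it

CODE = """
user = input()
query = "SELECT " + user
safe = "constant"
eval(query)
eval(safe)
"""

def call_name(node):
    """Return the function name for simple calls like f(...), else None."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    return None

def is_tainted(expr, tainted):
    """An expression is tainted if it uses a tainted name or calls a source."""
    for n in ast.walk(expr):
        if isinstance(n, ast.Name) and n.id in tainted:
            return True
        if call_name(n) in SOURCES:
            return True
    return False

def find_tainted_sinks(source):
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:
        # Taint flows through assignments: x = <tainted expr> taints x.
        if isinstance(stmt, ast.Assign) and is_tainted(stmt.value, tainted):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    tainted.add(target.id)
        # Flag any sink call that receives a tainted argument.
        for n in ast.walk(stmt):
            if call_name(n) in SINKS and any(is_tainted(a, tainted) for a in n.args):
                findings.append((n.lineno, call_name(n)))
    return findings

print(find_tainted_sinks(CODE))  # [(5, 'eval')]
```

Only `eval(query)` is flagged: `query` was built from `user`, which came from `input()`, while `safe` never touched a source.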

Complexity Calculator

Walks function bodies counting decision points:
def example(x, y):      # Base: 1
    if x > 0:           # +1 = 2
        for i in y:     # +1 = 3
            if i:       # +1 = 4
                pass
    return x            # Total: 4
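
The count above can be reproduced with a short `ast` walk. This is a sketch under assumptions: the set of node types treated as decision points (`DECISION_NODES`) and the handling of boolean operators are choices made for this example and may differ from what Skylos counts.

```python
import ast

# Assumed decision points; the example above only exercises if and for.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(func_source):
    """Start at 1 and add 1 for every decision point in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(func_source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths
            complexity += len(node.values) - 1
    return complexity

EXAMPLE = """
def example(x, y):
    if x > 0:
        for i in y:
            if i:
                pass
    return x
"""

print(cyclomatic_complexity(EXAMPLE))  # 4
```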

Secret Scanner

Pattern-matches against known credential formats:
AKIA[0-9A-Z]{16}     → AWS Access Key
ghp_[a-zA-Z0-9]{36}  → GitHub Token
sk_live_[a-zA-Z0-9]+ → Stripe Key
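
A line-by-line scan with these three patterns might look like the sketch below. The `scan_text` helper and its `(line number, label)` result format are assumptions for illustration; the sample key is the placeholder value AWS uses in its own documentation.

```python
import re

# The three patterns listed above, compiled once.
PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub Token": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    "Stripe Key": re.compile(r"sk_live_[a-zA-Z0-9]+"),
}

def scan_text(text):
    """Return (line_number, label) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings

SAMPLE = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
print(scan_text(SAMPLE))  # [(2, 'AWS Access Key')]
```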

Phase 4: Confidence Scoring

Not every “unused” definition is actually dead, so Skylos scores confidence based on signals. This is why Skylos reports far fewer false positives than tools that only ask “is it referenced?”.
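To make the idea concrete, here is a hypothetical scoring function. The `Definition` fields, the specific signals, and every weight are invented for this sketch; they are not Skylos's actual signals or values. The point is only that each signal suggesting implicit use lowers the confidence that a definition is dead.

```python
from dataclasses import dataclass, field

@dataclass
class Definition:
    name: str
    reference_count: int = 0
    decorators: list = field(default_factory=list)

def dead_code_confidence(definition):
    """Return a 0-100 confidence that the definition is dead (weights are hypothetical)."""
    if definition.reference_count > 0:
        return 0  # referenced somewhere, so not a dead-code candidate
    score = 100
    if definition.decorators:
        score -= 40  # decorators often register handlers with a framework
    if definition.name.startswith("__") and definition.name.endswith("__"):
        score -= 50  # dunder methods are invoked implicitly by Python
    if definition.name.startswith("test_"):
        score -= 30  # test functions are collected by the test runner
    return max(score, 0)

print(dead_code_confidence(Definition("orphan")))                           # 100
print(dead_code_confidence(Definition("index", decorators=["app.route"])))  # 60
print(dead_code_confidence(Definition("__repr__")))                         # 50
```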

Phase 5: Output & Gating

Results are formatted and optionally checked against gate policies.
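One way to picture a gate is as a set of per-severity limits that the findings must stay within. The sketch below assumes that shape: the `check_gate` function, the `severity` field, and the policy dictionary are all hypothetical and do not reflect Skylos's real configuration format.

```python
def check_gate(findings, max_allowed):
    """Return (passed, violations) given per-severity limits (hypothetical policy shape)."""
    counts = {}
    for finding in findings:
        severity = finding["severity"]
        counts[severity] = counts.get(severity, 0) + 1
    # A severity violates the gate when its count exceeds the configured limit.
    violations = {
        severity: count
        for severity, count in counts.items()
        if severity in max_allowed and count > max_allowed[severity]
    }
    return not violations, violations

findings = [
    {"rule_id": "SKY-D210", "severity": "high"},  # severity assigned for illustration
    {"rule_id": "SKY-X001", "severity": "low"},   # hypothetical rule ID
    {"rule_id": "SKY-X002", "severity": "low"},   # hypothetical rule ID
]

passed, violations = check_gate(findings, {"high": 0, "low": 5})
print(passed, violations)  # False {'high': 1}
```

In a CI job, a result like this would typically be turned into a non-zero exit code so the pipeline fails when the gate is not met.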

Performance

Skylos is designed for speed:
| Optimization | How It Helps |
| --- | --- |
| Parallel file parsing | Multi-core AST parsing |
| Single-pass collection | Definitions and references in one walk |
| Lazy taint analysis | Only runs when --danger is enabled |
| Early filtering | Exclusions applied before parsing |

Typical performance:
  • 10K LOC: < 2 seconds
  • 100K LOC: < 10 seconds
  • 1M LOC: < 60 seconds
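
The "single-pass collection" optimization listed above can be sketched with one `ast.NodeVisitor` that records definitions and references during the same traversal, instead of walking the tree once for each. The class name `SinglePassCollector` and its attributes are assumptions for this example, not Skylos internals.

```python
import ast

class SinglePassCollector(ast.NodeVisitor):
    """Collect definitions and references in a single tree traversal."""

    def __init__(self):
        self.definitions = []
        self.references = []

    def visit_FunctionDef(self, node):
        self.definitions.append(node.name)
        self.generic_visit(node)  # keep descending into the body

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_ClassDef(self, node):
        self.definitions.append(node.name)
        self.generic_visit(node)

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self.references.append(node.id)

    def visit_Attribute(self, node):
        self.references.append(node.attr)
        self.generic_visit(node)  # also record the object being accessed

collector = SinglePassCollector()
collector.visit(ast.parse("def f():\n    return g(x.y)\n"))
print(collector.definitions)  # ['f']
print(collector.references)   # ['g', 'y', 'x']
```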

Extensibility

Skylos uses a rule-based architecture:
from abc import ABC

class SkylosRule(ABC):
    @property
    def rule_id(self): ...      # e.g., "SKY-D210"

    @property
    def name(self): ...         # e.g., "SQL Injection"

    def visit_node(self, node, context):
        # Return findings or None
        ...
Rules are organized by category:
  • rules/danger/ — Security rules
  • rules/quality/ — Complexity, nesting, structure
  • rules/secrets.py — Credential detection
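
To show how a rule might plug into this interface, here is a self-contained sketch. The `SkylosRule` class below is a stand-in that mirrors the interface shown above so the example runs on its own; the `EvalCallRule` class, the `SKY-X000` rule ID, the dictionary-shaped findings, the `context` contents, and the `run_rule` driver are all hypothetical.

```python
import ast
from abc import ABC, abstractmethod

class SkylosRule(ABC):
    """Stand-in base class mirroring the interface shown above."""

    @property
    @abstractmethod
    def rule_id(self): ...

    @property
    @abstractmethod
    def name(self): ...

    @abstractmethod
    def visit_node(self, node, context): ...

class EvalCallRule(SkylosRule):
    rule_id = "SKY-X000"   # hypothetical ID, not a real Skylos rule
    name = "Use of eval"

    def visit_node(self, node, context):
        # Flag direct calls of the form eval(...)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ):
            return [{
                "rule_id": self.rule_id,
                "message": self.name,
                "file": context["filename"],
                "line": node.lineno,
            }]
        return None

def run_rule(rule, source, filename="example.py"):
    """Hypothetical driver: offer every AST node to the rule and gather findings."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        result = rule.visit_node(node, {"filename": filename})
        if result:
            findings.extend(result)
    return findings

print(run_rule(EvalCallRule(), "x = 1\ny = eval('x + 1')\n"))
# [{'rule_id': 'SKY-X000', 'message': 'Use of eval', 'file': 'example.py', 'line': 2}]
```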

Next Steps