First iteration
This commit is contained in:
12
.envrc
Normal file
12
.envrc
Normal file
@@ -0,0 +1,12 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
export DIRENV_WARN_TIMEOUT=20s
|
||||
|
||||
eval "$(devenv direnvrc)"
|
||||
|
||||
# `use devenv` supports the same options as the `devenv shell` command.
|
||||
#
|
||||
# To silence all output, use `--quiet`.
|
||||
#
|
||||
# Example usage: use devenv --quiet --impure --option services.postgres.enable:bool true
|
||||
use devenv
|
||||
13
.gitignore
vendored
Normal file
13
.gitignore
vendored
Normal file
@@ -0,0 +1,13 @@
|
||||
# Devenv
|
||||
.devenv*
|
||||
devenv.local.nix
|
||||
devenv.local.yaml
|
||||
|
||||
# direnv
|
||||
.direnv
|
||||
|
||||
# pre-commit
|
||||
.pre-commit-config.yaml
|
||||
|
||||
# Temporary
|
||||
/refs/
|
||||
215
README.md
Normal file
215
README.md
Normal file
@@ -0,0 +1,215 @@
|
||||
# gulie
|
||||
|
||||
A linter, static analyser, and formatter for [Guile Scheme](https://www.gnu.org/software/guile/).
|
||||
|
||||
```
|
||||
$ gulie gulie/
|
||||
gulie/engine.scm:97:80: warning: line-length: line exceeds 80 characters (82)
|
||||
gulie/tokenizer.scm:131:0: warning: trailing-whitespace: trailing whitespace
|
||||
```
|
||||
|
||||
## Why
|
||||
|
||||
No linter, formatter, or static analysis tool exists for Guile Scheme. gulie
|
||||
fills that gap with a two-pass architecture that catches both surface-level
|
||||
formatting issues and deep semantic problems.
|
||||
|
||||
## Features
|
||||
|
||||
- **Surface rules** (no parsing needed): trailing whitespace, line length, tabs,
|
||||
excessive blank lines, comment style conventions
|
||||
- **Semantic rules** (via Guile's compiler): unused variables, unbound
|
||||
variables, arity mismatches, format string errors, shadowed top-levels,
|
||||
unused modules
|
||||
- **Inline suppression**: `; gulie:suppress rule-name` on a line, or region
|
||||
disable/enable blocks
|
||||
- **Auto-fix mode**: `--fix` applies automatic corrections where available
|
||||
- **Configuration**: `.gulie.sexp` in your project root, overridable via CLI
|
||||
- **CI friendly**: exit code 0 for clean, 1 for findings
|
||||
|
||||
## Requirements
|
||||
|
||||
- [Guile](https://www.gnu.org/software/guile/) 3.0 or later
|
||||
|
||||
## Installation
|
||||
|
||||
Clone the repository and ensure `bin/gulie` is on your `PATH`, or run it
|
||||
directly:
|
||||
|
||||
```sh
|
||||
bin/gulie --check .
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
```
|
||||
gulie [OPTIONS] [FILE|DIR...]
|
||||
|
||||
Options:
|
||||
-h, --help Show help message
|
||||
-v, --version Print version
|
||||
--check Check mode (default): report issues, exit non-zero on findings
|
||||
--fix Fix mode: auto-fix what's possible, report the rest
|
||||
--init Generate .gulie.sexp template in current directory
|
||||
--pass PASS Run only: surface, semantic, all (default: all)
|
||||
--config FILE Config file path (default: auto-discover .gulie.sexp)
|
||||
--rule RULE Enable only this rule
|
||||
--disable RULE Disable this rule
|
||||
--severity SEV Minimum severity: error, warning, info
|
||||
--output FORMAT Output format: standard (default), json, compact
|
||||
--list-rules List all available rules
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
Check a single file:
|
||||
|
||||
```sh
|
||||
gulie mylib.scm
|
||||
```
|
||||
|
||||
Check an entire project:
|
||||
|
||||
```sh
|
||||
gulie src/
|
||||
```
|
||||
|
||||
Auto-fix trailing whitespace and other fixable issues:
|
||||
|
||||
```sh
|
||||
gulie --fix src/
|
||||
```
|
||||
|
||||
Generate a config template:
|
||||
|
||||
```sh
|
||||
gulie --init
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
gulie looks for `.gulie.sexp` in the current directory and parent directories.
|
||||
Generate a template with `gulie --init`.
|
||||
|
||||
```scheme
|
||||
((line-length . 80)
|
||||
(indent . 2)
|
||||
(max-blank-lines . 2)
|
||||
(enable trailing-whitespace line-length no-tabs blank-lines
|
||||
comment-semicolons unused-variable unbound-variable
|
||||
arity-mismatch)
|
||||
(disable)
|
||||
(rules
|
||||
(line-length (max . 100)))
|
||||
(indent-rules
|
||||
(define . 1) (let . 1) (lambda . 1)
|
||||
(with-syntax . 1) (match . 1))
|
||||
(ignore "build/**" ".direnv/**"))
|
||||
```
|
||||
|
||||
## Inline Suppression
|
||||
|
||||
Suppress a rule on the current line:
|
||||
|
||||
```scheme
|
||||
(define x "messy") ; gulie:suppress trailing-whitespace
|
||||
```
|
||||
|
||||
Suppress all rules on the next line:
|
||||
|
||||
```scheme
|
||||
;; gulie:suppress
|
||||
(define intentionally-long-variable-name "value")
|
||||
```
|
||||
|
||||
Region disable/enable:
|
||||
|
||||
```scheme
|
||||
;; gulie:disable line-length
|
||||
(define long-line ...............................................)
|
||||
(define another .................................................)
|
||||
;; gulie:enable line-length
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
gulie uses a two-pass design:
|
||||
|
||||
```
|
||||
.gulie.sexp
|
||||
|
|
||||
file.scm --+--> [Tokenizer] --> tokens --> [CST parser] --> CST
|
||||
| |
|
||||
| [Pass 1: Surface] line rules + CST rules
|
||||
| |
|
||||
| diagnostics-1
|
||||
|
|
||||
+--> [Guile compiler] --> Tree-IL --> CPS
|
||||
|
|
||||
[Pass 2: Semantic] Guile's built-in analyses
|
||||
|
|
||||
diagnostics-2
|
||||
|
|
||||
[merge + suppress + sort + report]
|
||||
```
|
||||
|
||||
**Pass 1** uses a hand-written tokenizer that preserves all whitespace, comments,
|
||||
and exact source text. The critical invariant:
|
||||
`(string-concatenate (map token-text (tokenize input)))` reproduces the input
|
||||
exactly. This feeds a lightweight concrete syntax tree for formatting checks.
|
||||
|
||||
**Pass 2** delegates to Guile's own compiler and analysis infrastructure:
|
||||
`unused-variable-analysis`, `arity-analysis`, `format-analysis`, and others.
|
||||
These are battle-tested and handle macroexpansion correctly.
|
||||
|
||||
The two passes are independent because Guile's reader irrecoverably strips
|
||||
comments and whitespace — there is no way to get formatting info and semantic
|
||||
info from a single parse.
|
||||
|
||||
## Rules
|
||||
|
||||
| Rule | Type | Category | Description |
|
||||
|------|------|----------|-------------|
|
||||
| `trailing-whitespace` | line | format | Trailing spaces or tabs |
|
||||
| `line-length` | line | format | Line exceeds maximum width |
|
||||
| `no-tabs` | line | format | Tab characters in source |
|
||||
| `blank-lines` | line | format | Excessive consecutive blank lines |
|
||||
| `comment-semicolons` | cst | style | Comment style conventions (`;`/`;;`/`;;;`) |
|
||||
| `unused-variable` | semantic | correctness | Unused local variable |
|
||||
| `unused-toplevel` | semantic | correctness | Unused top-level definition |
|
||||
| `unused-module` | semantic | correctness | Unused module import |
|
||||
| `unbound-variable` | semantic | correctness | Reference to undefined variable |
|
||||
| `arity-mismatch` | semantic | correctness | Wrong number of arguments |
|
||||
| `shadowed-toplevel` | semantic | correctness | Top-level binding shadows import |
|
||||
| `format-string` | semantic | correctness | Format string validation |
|
||||
|
||||
## Module Structure
|
||||
|
||||
```
|
||||
gulie/
|
||||
cli.scm Command-line interface
|
||||
config.scm Configuration loading and merging
|
||||
diagnostic.scm Diagnostic record type and formatting
|
||||
tokenizer.scm Hand-written lexer preserving all tokens
|
||||
cst.scm Token stream to concrete syntax tree
|
||||
compiler.scm Guile compiler wrapper for semantic analysis
|
||||
rule.scm Rule record type and registry
|
||||
engine.scm Orchestrator: file discovery, pass sequencing
|
||||
suppression.scm Inline suppression parsing and filtering
|
||||
rules/
|
||||
surface.scm Line-based formatting rules
|
||||
comments.scm Comment style rules
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```sh
|
||||
guile --no-auto-compile -L . -s test/run-tests.scm
|
||||
```
|
||||
|
||||
84 tests covering tokenizer roundtrip, CST parsing, surface rules, suppression,
|
||||
and semantic analysis.
|
||||
|
||||
## Licence
|
||||
|
||||
[TODO: add licence]
|
||||
24
bin/gulie
Executable file
24
bin/gulie
Executable file
@@ -0,0 +1,24 @@
|
||||
#!/usr/bin/env -S guile --no-auto-compile -e main -s
|
||||
!#
|
||||
;;; gulie — a linter and formatter for Guile Scheme
|
||||
|
||||
;; Add project root to load path
|
||||
(let ((dir (dirname (dirname (current-filename)))))
|
||||
(set! %load-path (cons dir %load-path)))
|
||||
|
||||
;; Load rule modules (registering rules as a side effect)
|
||||
(use-modules (gulie rules surface))
|
||||
|
||||
;; Load optional modules if available
|
||||
(false-if-exception (use-modules (gulie tokenizer)))
|
||||
(false-if-exception (use-modules (gulie cst)))
|
||||
(false-if-exception (use-modules (gulie rules comments)))
|
||||
(false-if-exception (use-modules (gulie rules indentation)))
|
||||
(false-if-exception (use-modules (gulie compiler)))
|
||||
(false-if-exception (use-modules (gulie rules semantic)))
|
||||
|
||||
;; Run
|
||||
(use-modules (gulie cli))
|
||||
|
||||
(define (main args)
|
||||
(exit ((@@ (gulie cli) main) args)))
|
||||
123
devenv.lock
Normal file
123
devenv.lock
Normal file
@@ -0,0 +1,123 @@
|
||||
{
|
||||
"nodes": {
|
||||
"devenv": {
|
||||
"locked": {
|
||||
"dir": "src/modules",
|
||||
"lastModified": 1775040883,
|
||||
"owner": "cachix",
|
||||
"repo": "devenv",
|
||||
"rev": "c277ffa27759cd230089700da568864446528e80",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"dir": "src/modules",
|
||||
"owner": "cachix",
|
||||
"repo": "devenv",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"flake-compat": {
|
||||
"flake": false,
|
||||
"locked": {
|
||||
"lastModified": 1767039857,
|
||||
"owner": "NixOS",
|
||||
"repo": "flake-compat",
|
||||
"rev": "5edf11c44bc78a0d334f6334cdaf7d60d732daab",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"owner": "NixOS",
|
||||
"repo": "flake-compat",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"git-hooks": {
|
||||
"inputs": {
|
||||
"flake-compat": "flake-compat",
|
||||
"gitignore": "gitignore",
|
||||
"nixpkgs": [
|
||||
"nixpkgs"
|
||||
]
|
||||
},
|
||||
"locked": {
|
||||
"lastModified": 1775036584,
|
||||
"owner": "cachix",
|
||||
"repo": "git-hooks.nix",
|
||||
"rev": "4e0eb042b67d863b1b34b3f64d52ceb9cd926735",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"owner": "cachix",
|
||||
"repo": "git-hooks.nix",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"gitignore": {
|
||||
"inputs": {
|
||||
"nixpkgs": [
|
||||
"git-hooks",
|
||||
"nixpkgs"
|
||||
]
|
||||
},
|
||||
"locked": {
|
||||
"lastModified": 1762808025,
|
||||
"owner": "hercules-ci",
|
||||
"repo": "gitignore.nix",
|
||||
"rev": "cb5e3fdca1de58ccbc3ef53de65bd372b48f567c",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"owner": "hercules-ci",
|
||||
"repo": "gitignore.nix",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"nixpkgs": {
|
||||
"inputs": {
|
||||
"nixpkgs-src": "nixpkgs-src"
|
||||
},
|
||||
"locked": {
|
||||
"lastModified": 1774287239,
|
||||
"owner": "cachix",
|
||||
"repo": "devenv-nixpkgs",
|
||||
"rev": "fa7125ea7f1ae5430010a6e071f68375a39bd24c",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"owner": "cachix",
|
||||
"ref": "rolling",
|
||||
"repo": "devenv-nixpkgs",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"nixpkgs-src": {
|
||||
"flake": false,
|
||||
"locked": {
|
||||
"lastModified": 1769922788,
|
||||
"narHash": "sha256-H3AfG4ObMDTkTJYkd8cz1/RbY9LatN5Mk4UF48VuSXc=",
|
||||
"owner": "NixOS",
|
||||
"repo": "nixpkgs",
|
||||
"rev": "207d15f1a6603226e1e223dc79ac29c7846da32e",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
"owner": "NixOS",
|
||||
"ref": "nixpkgs-unstable",
|
||||
"repo": "nixpkgs",
|
||||
"type": "github"
|
||||
}
|
||||
},
|
||||
"root": {
|
||||
"inputs": {
|
||||
"devenv": "devenv",
|
||||
"git-hooks": "git-hooks",
|
||||
"nixpkgs": "nixpkgs",
|
||||
"pre-commit-hooks": [
|
||||
"git-hooks"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
"root": "root",
|
||||
"version": 7
|
||||
}
|
||||
48
devenv.nix
Normal file
48
devenv.nix
Normal file
@@ -0,0 +1,48 @@
|
||||
{ pkgs, lib, config, inputs, ... }:
|
||||
|
||||
{
|
||||
# https://devenv.sh/basics/
|
||||
# env.GREET = "devenv";
|
||||
|
||||
# https://devenv.sh/packages/
|
||||
packages = with pkgs; [
|
||||
guile
|
||||
];
|
||||
|
||||
# https://devenv.sh/languages/
|
||||
# languages.rust.enable = true;
|
||||
|
||||
# https://devenv.sh/processes/
|
||||
# processes.dev.exec = "${lib.getExe pkgs.watchexec} -n -- ls -la";
|
||||
|
||||
# https://devenv.sh/services/
|
||||
# services.postgres.enable = true;
|
||||
|
||||
# https://devenv.sh/scripts/
|
||||
# scripts.hello.exec = ''
|
||||
# echo hello from $GREET
|
||||
# '';
|
||||
|
||||
# https://devenv.sh/basics/
|
||||
# enterShell = ''
|
||||
# hello # Run scripts directly
|
||||
# git --version # Use packages
|
||||
# '';
|
||||
|
||||
# https://devenv.sh/tasks/
|
||||
# tasks = {
|
||||
# "myproj:setup".exec = "mytool build";
|
||||
# "devenv:enterShell".after = [ "myproj:setup" ];
|
||||
# };
|
||||
|
||||
# https://devenv.sh/tests/
|
||||
# enterTest = ''
|
||||
# echo "Running tests"
|
||||
# git --version | grep --color=auto "${pkgs.git.version}"
|
||||
# '';
|
||||
|
||||
# https://devenv.sh/git-hooks/
|
||||
# git-hooks.hooks.shellcheck.enable = true;
|
||||
|
||||
# See full reference at https://devenv.sh/reference/options/
|
||||
}
|
||||
15
devenv.yaml
Normal file
15
devenv.yaml
Normal file
@@ -0,0 +1,15 @@
|
||||
# yaml-language-server: $schema=https://devenv.sh/devenv.schema.json
|
||||
inputs:
|
||||
nixpkgs:
|
||||
url: github:cachix/devenv-nixpkgs/rolling
|
||||
|
||||
# If you're using non-OSS software, you can set allowUnfree to true.
|
||||
# allowUnfree: true
|
||||
|
||||
# If you're willing to use a package that's vulnerable
|
||||
# permittedInsecurePackages:
|
||||
# - "openssl-1.1.1w"
|
||||
|
||||
# If you have more than one devenv you can merge them
|
||||
#imports:
|
||||
# - ./backend
|
||||
813
docs/INSPIRATION.md
Normal file
813
docs/INSPIRATION.md
Normal file
@@ -0,0 +1,813 @@
|
||||
# Inspiration: Existing Lisp Linters, Formatters & Static Analysers
|
||||
|
||||
Survey of reference tools in `./refs/` — what they do, how they work, and what
|
||||
we can steal for a Guile linter/formatter.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
| Tool | Ecosystem | Type | Language |
|
||||
|------|-----------|------|----------|
|
||||
| [Eastwood](#eastwood) | Clojure | Linter (bug-finder) | Clojure/JVM |
|
||||
| [fmt](#fmt) | Racket | Formatter | Racket |
|
||||
| [Kibit](#kibit) | Clojure | Linter (idiom suggester) | Clojure |
|
||||
| [Mallet](#mallet) | Common Lisp | Linter + formatter + fixer | Common Lisp |
|
||||
| [OCICL Lint](#ocicl-lint) | Common Lisp | Linter + fixer | Common Lisp |
|
||||
| [racket-review](#racket-review) | Racket | Linter | Racket |
|
||||
| [SBLint](#sblint) | Common Lisp (SBCL) | Compiler-driven linter | Common Lisp |
|
||||
|
||||
---
|
||||
|
||||
## Eastwood
|
||||
|
||||
**Repo:** `refs/eastwood/` — Clojure linter (v1.4.3, by Jonas Enlund)
|
||||
|
||||
### What it does
|
||||
|
||||
A **bug-finding linter** for Clojure. Focuses on detecting actual errors
|
||||
(wrong arity, undefined vars, misplaced docstrings) rather than enforcing style.
|
||||
Achieves high accuracy by using the same compilation infrastructure as the
|
||||
Clojure compiler itself.
|
||||
|
||||
### How it works
|
||||
|
||||
```
|
||||
File discovery (tools.namespace)
|
||||
→ Topological sort by :require/:use deps
|
||||
→ For each namespace:
|
||||
Parse → Macroexpand → AST (tools.analyzer.jvm) → eval
|
||||
→ Run linter functions over AST nodes
|
||||
→ Filter warnings by config
|
||||
→ Report
|
||||
```
|
||||
|
||||
Key: uses `tools.analyzer.jvm/analyze+eval` — it actually **compiles and
|
||||
evaluates** source code to build an AST. This gives compiler-grade accuracy but
|
||||
means it can only lint code that successfully loads.
|
||||
|
||||
### Architecture
|
||||
|
||||
- **`lint.clj`** — Central coordinator: linter registry, namespace ordering,
|
||||
main analysis loop
|
||||
- **`analyze-ns.clj`** — AST generation via tools.analyzer
|
||||
- **`passes.clj`** — Custom analysis passes (reflection validation, def-name
|
||||
propagation)
|
||||
- **`linters/*.clj`** — Individual linter implementations (~8 files)
|
||||
- **`reporting-callbacks.clj`** — Output formatters (multimethod dispatch)
|
||||
- **`util.clj`** — Config loading, AST walking, warning filtering
|
||||
|
||||
### Rules (25+)
|
||||
|
||||
| Category | Examples |
|
||||
|----------|----------|
|
||||
| Arity | `:wrong-arity` — function called with wrong arg count |
|
||||
| Definitions | `:def-in-def`, `:redefd-vars`, `:misplaced-docstrings` |
|
||||
| Unused | `:unused-private-vars`, `:unused-fn-args`, `:unused-locals`, `:unused-namespaces` |
|
||||
| Suspicious | `:constant-test`, `:suspicious-expression`, `:suspicious-test` |
|
||||
| Style | `:unlimited-use`, `:non-dynamic-earmuffs`, `:local-shadows-var` |
|
||||
| Interop | `:reflection`, `:boxed-math`, `:performance` |
|
||||
| Types | `:wrong-tag`, `:deprecations` |
|
||||
|
||||
### Configuration
|
||||
|
||||
Rules are suppressed via **Clojure code** (not YAML/JSON):
|
||||
|
||||
```clojure
|
||||
(disable-warning
|
||||
{:linter :suspicious-expression
|
||||
:for-macro 'clojure.core/let
|
||||
:if-inside-macroexpansion-of #{'clojure.core/when-first}
|
||||
:within-depth 6
|
||||
:reason "False positive from when-first expansion"})
|
||||
```
|
||||
|
||||
Builtin config files ship for `clojure.core`, contrib libs, and popular
|
||||
third-party libraries. Users add their own via `:config-files` option.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Macroexpansion-aware suppression** — Can distinguish user code from
|
||||
macro-generated code; suppression rules can target specific macro expansions.
|
||||
Critical for any Lisp linter.
|
||||
- **Topological namespace ordering** — Analyse dependencies before dependents.
|
||||
Relevant if we want cross-module analysis.
|
||||
- **Linter registry pattern** — Each linter is a map `{:name :fn :enabled-by-default :url}`.
|
||||
Simple, extensible.
|
||||
- **Warning filtering pipeline** — Raw warnings → handle result → remove ignored
|
||||
faults → remove excluded kinds → filter by config → final warnings. Clean
|
||||
composable chain.
|
||||
- **Metadata preservation through AST transforms** — Custom `postwalk` that
|
||||
preserves metadata. Essential for accurate source locations.
|
||||
|
||||
---
|
||||
|
||||
## fmt
|
||||
|
||||
**Repo:** `refs/fmt/` — Racket code formatter (v0.0.3, by Sorawee Porncharoenwase)
|
||||
|
||||
### What it does
|
||||
|
||||
An **extensible code formatter** for Racket. Reads source, reformats according
|
||||
to style conventions using **cost-based optimal layout selection**. Supports
|
||||
custom formatting rules via pluggable formatter maps.
|
||||
|
||||
### How it works
|
||||
|
||||
Clean **4-stage pipeline**:
|
||||
|
||||
```
|
||||
Source string
|
||||
→ [1] Tokenize (syntax-color/module-lexer)
|
||||
→ [2] Read/Parse → tree of node/atom/wrapper structs
|
||||
→ [3] Realign (fix sexp-comments, quotes)
|
||||
→ [4] Pretty-print (pretty-expressive library, cost-based)
|
||||
→ Formatted string
|
||||
```
|
||||
|
||||
The pretty-printer uses the **Wadler/Leijen optimal layout algorithm** via the
|
||||
`pretty-expressive` library. It evaluates multiple layout alternatives and
|
||||
selects the one with the lowest cost vector.
|
||||
|
||||
### Architecture
|
||||
|
||||
- **`tokenize.rkt`** (72 lines) — Lexer wrapper around Racket's `syntax-color`
|
||||
- **`read.rkt`** (135 lines) — Token stream → tree IR; preserves comments
|
||||
- **`realign.rkt`** (75 lines) — Post-process sexp-comments and quote prefixes
|
||||
- **`conventions.rkt`** (640 lines) — **All formatting rules** for 100+ Racket forms
|
||||
- **`core.rkt`** (167 lines) — `define-pretty` DSL, AST-to-document conversion
|
||||
- **`main.rkt`** (115 lines) — Public API, cost factory, entry point
|
||||
- **`params.rkt`** (38 lines) — Configuration parameters (width, indent, etc.)
|
||||
- **`raco.rkt`** (148 lines) — CLI interface (`raco fmt`)
|
||||
|
||||
### Formatting rules (100+)
|
||||
|
||||
Rules are organised by form type in `conventions.rkt`:
|
||||
|
||||
| Category | Forms |
|
||||
|----------|-------|
|
||||
| Control flow | `if`, `when`, `unless`, `cond`, `case-lambda` |
|
||||
| Definitions | `define`, `define-syntax`, `lambda`, `define/contract` |
|
||||
| Bindings | `let`, `let*`, `letrec`, `parameterize`, `with-handlers` |
|
||||
| Loops | `for`, `for/list`, `for/fold`, `for/hash` (15+ variants) |
|
||||
| Modules | `module`, `begin`, `class`, `interface` |
|
||||
| Macros | `syntax-rules`, `match`, `syntax-parse`, `syntax-case` |
|
||||
| Imports | `require`, `provide` — vertically stacked |
|
||||
|
||||
### Configuration
|
||||
|
||||
**Pluggable formatter maps** — a function `(string? → procedure?)`:
|
||||
|
||||
```racket
|
||||
;; .fmt.rkt
|
||||
(define (the-formatter-map s)
|
||||
(case s
|
||||
[("my-form") (format-uniform-body/helper 4)]
|
||||
[else #f])) ; delegate to standard
|
||||
```
|
||||
|
||||
Formatter maps compose via `compose-formatter-map` (chain of responsibility).
|
||||
|
||||
**Runtime parameters:**
|
||||
|
||||
| Parameter | Default | Purpose |
|
||||
|-----------|---------|---------|
|
||||
| `current-width` | 102 | Page width limit |
|
||||
| `current-limit` | 120 | Computation width limit |
|
||||
| `current-max-blank-lines` | 1 | Max consecutive blank lines |
|
||||
| `current-indent` | 0 | Extra indentation |
|
||||
|
||||
### Cost-based layout selection
|
||||
|
||||
The pretty-printer evaluates layout alternatives using a **3-dimensional cost
|
||||
vector** `[badness, height, characters]`:
|
||||
|
||||
- **Badness** — Quadratic penalty for exceeding page width
|
||||
- **Height** — Number of lines used
|
||||
- **Characters** — Total character count (tiebreaker)
|
||||
|
||||
This means the formatter provably selects the **optimal layout** within the
|
||||
configured width, not just the first one that fits.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Cost-based layout is the gold standard** for formatter quality. Worth
|
||||
investing in an optimal pretty-printer (Wadler/Leijen family) rather than
|
||||
ad-hoc heuristics.
|
||||
- **Staged pipeline** (tokenize → parse → realign → pretty-print) is clean,
|
||||
testable, and easy to reason about. Each stage has well-defined I/O.
|
||||
- **Form-specific formatting rules** (`define-pretty` DSL) — each Scheme
|
||||
special form gets a dedicated formatter. Extensible via user-provided maps.
|
||||
- **Comment preservation as metadata** — Comments are attached to AST nodes, not
|
||||
discarded. Essential for a practical formatter.
|
||||
- **Pattern-based extraction** — `match/extract` identifies which elements can
|
||||
stay inline vs. must be on separate lines. Smart structural analysis.
|
||||
- **Memoisation via weak hash tables** — Performance optimisation for AST
|
||||
traversal without memory leaks.
|
||||
- **Config file convention** — `.fmt.rkt` in project root, auto-discovered. We
|
||||
should do similar (`.gulie.scm` or similar).
|
||||
|
||||
---
|
||||
|
||||
## Kibit
|
||||
|
||||
**Repo:** `refs/kibit/` — Clojure idiom suggester (v0.1.11, by Jonas Enlund)
|
||||
|
||||
### What it does
|
||||
|
||||
A **static code analyser** that identifies non-idiomatic Clojure code and
|
||||
suggests more idiomatic replacements. Example: `(if x y nil)` → `(when x y)`.
|
||||
Supports auto-replacement via `--replace` flag.
|
||||
|
||||
**Status:** Maintenance mode. Authors recommend **Splint** as successor
|
||||
(faster, more extensible).
|
||||
|
||||
### How it works
|
||||
|
||||
```
|
||||
Source file
|
||||
→ Parse with edamame (side-effect-free reader)
|
||||
→ Extract S-expressions
|
||||
→ Tree walk (depth-first via clojure.walk/prewalk)
|
||||
→ Match each node against rules (core.logic unification)
|
||||
→ Simplify (iterative rewriting until fixpoint)
|
||||
→ Report or Replace (via rewrite-clj zippers)
|
||||
```
|
||||
|
||||
The key insight: rules are expressed as **logic programming patterns** using
|
||||
`clojure.core.logic`. Pattern variables (`?x`, `?y`) unify against arbitrary
|
||||
subexpressions.
|
||||
|
||||
### Architecture
|
||||
|
||||
- **`core.clj`** (33 lines) — Core simplification logic (tiny!)
|
||||
- **`check.clj`** (204 lines) — Public API for checking expressions/files
|
||||
- **`check/reader.clj`** (189 lines) — Source parsing with alias tracking
|
||||
- **`rules.clj`** (39 lines) — Rule aggregation and indexing
|
||||
- **`rules/*.clj`** (~153 lines) — Rule definitions by category
|
||||
- **`reporters.clj`** (59 lines) — Output formatters (text, markdown)
|
||||
- **`replace.clj`** (134 lines) — Auto-replacement via rewrite-clj zippers
|
||||
- **`driver.clj`** (144 lines) — CLI entry point, file discovery
|
||||
|
||||
Total: ~1,105 lines. Remarkably compact.
|
||||
|
||||
### Rules (~60)
|
||||
|
||||
Rules are defined via the `defrules` macro:
|
||||
|
||||
```clojure
|
||||
(defrules rules
|
||||
;; Control structures
|
||||
[(if ?x ?y nil) (when ?x ?y)]
|
||||
[(if ?x nil ?y) (when-not ?x ?y)]
|
||||
[(if (not ?x) ?y ?z) (if-not ?x ?y ?z)]
|
||||
[(do ?x) ?x]
|
||||
|
||||
;; Arithmetic
|
||||
[(+ ?x 1) (inc ?x)]
|
||||
[(- ?x 1) (dec ?x)]
|
||||
|
||||
;; Collections
|
||||
[(not (empty? ?x)) (seq ?x)]
|
||||
[(into [] ?coll) (vec ?coll)]
|
||||
|
||||
;; Equality
|
||||
[(= ?x nil) (nil? ?x)]
|
||||
[(= 0 ?x) (zero? ?x)])
|
||||
```
|
||||
|
||||
Categories: **control structures**, **arithmetic**, **collections**,
|
||||
**equality**, **miscellaneous** (string ops, Java interop, threading macros).
|
||||
|
||||
### Auto-replacement
|
||||
|
||||
Uses **rewrite-clj zippers** — functional tree navigation that preserves
|
||||
whitespace, comments, and formatting when applying replacements. Navigate to the
|
||||
target node, swap it, regenerate text.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Logic programming for pattern matching** is beautifully expressive for
|
||||
"suggest X instead of Y" rules. `core.logic` unification makes patterns
|
||||
concise and bidirectional. We could use Guile's pattern matching or even a
|
||||
miniKanren implementation.
|
||||
- **Rule-as-data pattern** — Rules are just vectors `[pattern replacement]`.
|
||||
Easy to add, easy to test, easy for users to contribute.
|
||||
- **Iterative rewriting to fixpoint** — Apply rules until nothing changes.
|
||||
Catches nested patterns that only become apparent after an inner rewrite.
|
||||
- **Zipper-based source rewriting** — Preserves formatting/comments when
|
||||
applying fixes. Critical for auto-fix functionality.
|
||||
- **Side-effect-free parsing** — Using edamame instead of `clojure.core/read`
|
||||
avoids executing reader macros. Important for security and for analysing code
|
||||
with unknown dependencies.
|
||||
- **Guard-based filtering** — Composable predicates that decide whether to
|
||||
report a suggestion. Users can plug in custom guards.
|
||||
- **Two resolution modes** — `:toplevel` (entire defn) vs `:subform` (individual
|
||||
expressions). Different granularity for different use cases.
|
||||
|
||||
---
|
||||
|
||||
## Mallet
|
||||
|
||||
**Repo:** `refs/mallet/` — Common Lisp linter + formatter + fixer (~15,800 LOC)
|
||||
|
||||
### What it does
|
||||
|
||||
A **production-grade linter** for Common Lisp with 40+ rules across 7
|
||||
categories, auto-fixing, a powerful configuration system (presets with
|
||||
inheritance), and multiple suppression mechanisms. Targets SBCL.
|
||||
|
||||
### How it works
|
||||
|
||||
**Three-phase pipeline:**
|
||||
|
||||
```
|
||||
File content
|
||||
→ [1] Tokenize (hand-written tokenizer, preserves all tokens incl. comments)
|
||||
→ [2] Parse (Eclector reader with parse-result protocol → forms with precise positions)
|
||||
→ [3] Rule checking (text rules, token rules, form rules)
|
||||
→ Suppression filtering
|
||||
→ Auto-fix & formatting
|
||||
→ Report
|
||||
```
|
||||
|
||||
**Critical design decision:** Symbols are stored as **strings**, not interned.
|
||||
This means the parser never needs to resolve packages — safe to analyse code
|
||||
with unknown dependencies.
|
||||
|
||||
### Architecture
|
||||
|
||||
| Module | Lines | Purpose |
|
||||
|--------|-------|---------|
|
||||
| `main.lisp` | ~600 | CLI parsing, entry point |
|
||||
| `engine.lisp` | ~900 | Linting orchestration, suppression filtering |
|
||||
| `config.lisp` | ~1,200 | Config files, presets, path-specific overrides |
|
||||
| `parser/reader.lisp` | ~800 | Eclector integration, position tracking |
|
||||
| `parser/tokenizer.lisp` | ~200 | Hand-written tokenizer |
|
||||
| `suppression.lisp` | ~600 | Suppression state management |
|
||||
| `formatter.lisp` | ~400 | Output formatters (text, JSON, line) |
|
||||
| `fixer.lisp` | ~300 | Auto-fix application |
|
||||
| `rules/` | ~5,500 | 40+ individual rule implementations |
|
||||
|
||||
### Rules (40+)
|
||||
|
||||
| Category | Rules | Examples |
|
||||
|----------|-------|----------|
|
||||
| Correctness | 2 | `ecase` with `otherwise`, missing `otherwise` |
|
||||
| Suspicious | 5 | Runtime `eval`, symbol interning, `ignore-errors` |
|
||||
| Practice | 6 | Avoid `:use` in `defpackage`, one package per file |
|
||||
| Cleanliness | 4 | Unused variables, unused loop vars, unused imports |
|
||||
| Style | 5 | `when`/`unless` vs `if` without else, needless `let*` |
|
||||
| Format | 6 | Line length, trailing whitespace, tabs, blank lines |
|
||||
| Metrics | 3 | Function length, cyclomatic complexity, comment ratio |
|
||||
| ASDF | 8 | Component strings, redundant prefixes, secondary systems |
|
||||
| Naming | 4 | `*special*` and `+constant+` conventions |
|
||||
| Documentation | 4 | Missing docstrings (functions, packages, variables) |
|
||||
|
||||
Rules are **classes** inheriting from a base `rule` class with generic methods:
|
||||
|
||||
```lisp
|
||||
(defclass if-without-else-rule (base:rule)
|
||||
()
|
||||
(:default-initargs
|
||||
:name :missing-else
|
||||
:severity :warning
|
||||
:category :style
|
||||
:type :form))
|
||||
|
||||
(defmethod base:check-form ((rule if-without-else-rule) form file)
|
||||
...)
|
||||
```
|
||||
|
||||
### Configuration system
|
||||
|
||||
**Layered presets with inheritance:**
|
||||
|
||||
```lisp
|
||||
(:mallet-config
|
||||
(:extends :strict)
|
||||
(:ignore "**/vendor/**")
|
||||
(:enable :cyclomatic-complexity :max 15)
|
||||
(:disable :function-length)
|
||||
(:set-severity :metrics :info)
|
||||
(:for-paths ("tests")
|
||||
(:enable :line-length :max 120)
|
||||
(:disable :unused-variables)))
|
||||
```
|
||||
|
||||
Built-in presets: `:default`, `:strict`, `:all`, `:none`.
|
||||
|
||||
Precedence: CLI flags > config file > preset inheritance > built-in defaults.
|
||||
|
||||
### Suppression mechanisms (3 levels)
|
||||
|
||||
1. **Declarations** — `#+mallet (declaim (mallet:suppress-next :rule-name))`
|
||||
2. **Inline comments** — `; mallet:suppress rule-name`
|
||||
3. **Region-based** — `; mallet:disable rule-name` / `; mallet:enable rule-name`
|
||||
4. **Stale suppression detection** — Warns when suppressions don't match any violation
|
||||
|
||||
### Auto-fix
|
||||
|
||||
Fixes are collected, sorted bottom-to-top (to preserve line numbers), and
|
||||
applied in a single pass. Fix types: `:replace-line`, `:delete-range`,
|
||||
`:delete-lines`, `:replace-form`.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Symbols as strings** is a crucial insight for Lisp linters. Avoids
|
||||
package/module resolution entirely. We should do the same for Guile — parse
|
||||
symbols without interning them.
|
||||
- **Eclector-style parse-result protocol** — Every sub-expression gets precise
|
||||
line/column info. Invest in this early; it's the foundation of accurate error
|
||||
reporting.
|
||||
- **Three rule types** (text, token, form) — Clean separation. Text rules don't
|
||||
need parsing, token rules don't need a full AST, form rules get the full tree.
|
||||
Efficient and composable.
|
||||
- **Preset inheritance with path-specific overrides** — Powerful configuration
|
||||
that scales from solo projects to monorepos. `:for-paths` is particularly
|
||||
useful (different rules for `src/` vs `tests/`).
|
||||
- **Multiple suppression mechanisms** — Comment-based, declaration-based,
|
||||
region-based. Users need all three for real-world use.
|
||||
- **Stale suppression detection** — Prevents suppression comments from
|
||||
accumulating after the underlying issue is fixed. Brilliant.
|
||||
- **Rule metaclass pattern** — Base class + generic methods scales cleanly to
|
||||
40+ rules. Each rule is self-contained with its own severity, category, and
|
||||
check method.
|
||||
- **Bottom-to-top fix application** — Simple trick that avoids line number
|
||||
invalidation when applying multiple fixes to the same file.
|
||||
|
||||
---
|
||||
|
||||
## OCICL Lint
|
||||
|
||||
**Repo:** `refs/ocicl/` — Common Lisp linter (part of the OCICL package manager)
|
||||
|
||||
### What it does
|
||||
|
||||
A **129-rule linter with auto-fix** for Common Lisp, integrated into the OCICL
|
||||
package manager as a subcommand (`ocicl lint`). Supports dry-run mode,
|
||||
per-line suppression, and `.ocicl-lint.conf` configuration.
|
||||
|
||||
### How it works
|
||||
|
||||
**Three-pass analysis:**
|
||||
|
||||
```
|
||||
File content
|
||||
→ [Pass 1] Line-based rules (text-level: whitespace, tabs, line length)
|
||||
→ [Pass 2] AST-based rules (via rewrite-cl zippers: naming, bindings, packages)
|
||||
→ [Pass 3] Single-pass visitor rules (pattern matching: 50+ checks in one traversal)
|
||||
→ Suppression filtering (per-line ; lint:suppress comments)
|
||||
→ Auto-fix (via fixer registry)
|
||||
→ Report
|
||||
```
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
lint/
|
||||
├── linter.lisp — Main orchestrator, issue aggregation, output formatting
|
||||
├── config.lisp — .ocicl-lint.conf parsing
|
||||
├── parsing.lisp — rewrite-cl wrapper (zipper API)
|
||||
├── fixer.lisp — Auto-fix infrastructure with RCS/backup support
|
||||
├── main.lisp — CLI entry point
|
||||
├── rules/
|
||||
│ ├── line-based.lisp — Text-level rules (9 rules)
|
||||
│ ├── ast.lisp — AST-based rules (naming, lambda lists, bindings)
|
||||
│ └── single-pass.lisp — Pattern matching rules (50+ in one walk)
|
||||
└── fixes/
|
||||
├── whitespace.lisp — Formatting fixes
|
||||
└── style.lisp — Style rule fixes
|
||||
```
|
||||
|
||||
### Rules (129)
|
||||
|
||||
| Category | Count | Examples |
|
||||
|----------|-------|---------|
|
||||
| Formatting | 9 | Trailing whitespace, tabs, line length, blank lines |
|
||||
| File structure | 3 | SPDX headers, package declarations, reader errors |
|
||||
| Naming | 6 | Underscores, `*special*` style, `+constant+` style, vague names |
|
||||
| Boolean/conditionals | 18 | `(IF test T NIL)` → `test`, `(WHEN (NOT x) ...)` → `(UNLESS x ...)` |
|
||||
| Logic simplification | 12 | Flatten nested `AND`/`OR`, redundant conditions |
|
||||
| Arithmetic | 4 | `(+ x 1)` → `(1+ x)`, `(= x 0)` → `(zerop x)` |
|
||||
| List operations | 13 | `FIRST`/`REST` vs `CAR`/`CDR`, `(cons x nil)` → `(list x)` |
|
||||
| Comparison | 5 | `EQL` vs `EQ`, string equality, membership testing |
|
||||
| Sequence operations | 6 | `-IF-NOT` variants, `ASSOC` patterns |
|
||||
| Advanced/safety | 26 | Library suggestions, destructive ops on constants |
|
||||
|
||||
### Configuration
|
||||
|
||||
INI-style `.ocicl-lint.conf`:
|
||||
|
||||
```ini
|
||||
max-line-length = 180
|
||||
suppress-rules = rule1, rule2, rule3
|
||||
suggest-libraries = alexandria, uiop, serapeum
|
||||
```
|
||||
|
||||
Per-line suppression:
|
||||
|
||||
```lisp
|
||||
(some-code) ; lint:suppress rule-name1 rule-name2
|
||||
(other-code) ; lint:suppress ;; suppress ALL rules on this line
|
||||
```
|
||||
|
||||
### Fixer registry
|
||||
|
||||
```lisp
|
||||
(register-fixer "rule-name" #'fixer-function)
|
||||
```
|
||||
|
||||
Fixers are decoupled from rule detection. Each fixer takes `(content issue)` and
|
||||
returns modified content or NIL. Supports RCS backup before modification.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Single-pass visitor for pattern rules** — 50+ pattern checks in one tree
|
||||
traversal. Much faster than running each rule separately. Good model for
|
||||
performance-sensitive linting.
|
||||
- **Quote awareness** — Detects quoted contexts (`'x`, `quote`, backtick) to
|
||||
avoid false positives inside macro templates. We'll need the same for Guile.
|
||||
- **Fixer registry pattern** — Decouples detection from fixing. Easy to add
|
||||
auto-fix for a rule without touching the rule itself.
|
||||
- **Library suggestion rules** — "You could use `(alexandria:when-let ...)`
|
||||
instead of this pattern." Interesting category that could work for Guile
|
||||
(SRFI suggestions, etc.).
|
||||
- **Three-pass architecture** — Line-based first (fastest, no parsing needed),
|
||||
then AST, then pattern matching. Each pass adds cost; skip what you don't need.
|
||||
|
||||
---
|
||||
|
||||
## racket-review
|
||||
|
||||
**Repo:** `refs/racket-review/` — Racket linter (v0.2, by Bogdan Popa)
|
||||
|
||||
### What it does
|
||||
|
||||
A **surface-level linter** for Racket modules. Intentionally does NOT expand
|
||||
macros — analyses syntax only, optimised for **speed**. Designed for tight
|
||||
editor integration (ships with Flycheck for Emacs).
|
||||
|
||||
### How it works
|
||||
|
||||
```
|
||||
File → read-syntax (Racket's built-in reader)
|
||||
→ Validate as module form (#lang)
|
||||
→ Walk syntax tree via syntax-parse
|
||||
→ Track scopes, bindings, provides, usages
|
||||
→ Report problems
|
||||
```
|
||||
|
||||
The entire rule system is built on Racket's `syntax/parse` — pattern matching
|
||||
on syntax objects with guard conditions and side effects.
|
||||
|
||||
### Architecture
|
||||
|
||||
Remarkably compact:
|
||||
|
||||
| File | Lines | Purpose |
|
||||
|------|-------|---------|
|
||||
| `lint.rkt` | 1,130 | **All linting rules** + semantic tracking |
|
||||
| `problem.rkt` | 26 | Problem data structure |
|
||||
| `cli.rkt` | 25 | CLI interface |
|
||||
| `ext.rkt` | 59 | Extension mechanism |
|
||||
|
||||
### Semantic tracking
|
||||
|
||||
Maintains multiple **parameter-based state machines**:
|
||||
|
||||
- **Scope stack** — Hierarchical scope with parent links, binding hash at each level
|
||||
- **Binding info** — Per-identifier: syntax object, usage count, check flag,
|
||||
related identifiers
|
||||
- **Provide tracking** — What's explicitly `provide`d vs `all-defined-out`
|
||||
- **Punted bindings** — Forward references resolved when definition is encountered
|
||||
- **Savepoints** — Save/restore state for tentative matching in complex patterns
|
||||
|
||||
### Rules
|
||||
|
||||
**Errors (23 patterns):**
|
||||
- Identifier already defined in same scope
|
||||
- `if` missing else branch
|
||||
- `let`/`for` missing body
|
||||
- `case` clauses not quoted literals
|
||||
- Wrong match fallthrough pattern (`_` not `else`)
|
||||
- Provided but not defined
|
||||
|
||||
**Warnings (17+ patterns):**
|
||||
- Identifier never used
|
||||
- Brackets: `let` bindings should use `[]`, not `()`
|
||||
- Requires not sorted (for-syntax first, then alphabetical)
|
||||
- Cond without else clause
|
||||
- Nested if (flatten to cond)
|
||||
- `racket/contract` → use `racket/contract/base`
|
||||
|
||||
### Suppression
|
||||
|
||||
```racket
|
||||
#|review: ignore|# ;; Ignore entire file
|
||||
;; noqa ;; Ignore this line
|
||||
;; review: ignore ;; Ignore this line
|
||||
```
|
||||
|
||||
### Extension mechanism
|
||||
|
||||
Plugins register via Racket's package system:
|
||||
|
||||
```racket
|
||||
(define review-exts
|
||||
'((module-path predicate-proc lint-proc)))
|
||||
```
|
||||
|
||||
Extensions receive a `current-reviewer` parameter with API:
|
||||
`recur`, `track-error!`, `track-warning!`, `track-binding!`, `push-scope!`,
|
||||
`pop-scope!`, `save!`, `undo!`.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Surface-level analysis is fast and useful** — No macro expansion means
|
||||
instant feedback. Catches the majority of real mistakes. Good default for
|
||||
editor integration; deeper analysis can be opt-in.
|
||||
- **syntax-parse as rule DSL** — Pattern matching on syntax objects is a natural
|
||||
fit for Lisp linters. Guile has `syntax-case` and `match` which serve a
|
||||
similar role.
|
||||
- **Scope tracking with punted bindings** — Handles forward references in a
|
||||
single pass. Elegant solution for `letrec`-style bindings and mutual recursion.
|
||||
- **Savepoints for tentative matching** — Save/restore state when the parser
|
||||
enters a complex branch. If the branch fails, roll back. Useful for `cond`,
|
||||
`match`, etc.
|
||||
- **Plugin API via reviewer parameter** — Extensions get a well-defined API
|
||||
surface. Clean contract between core and plugins.
|
||||
- **Snapshot-based testing** — 134 test files with `.rkt`/`.rkt.out` pairs.
|
||||
Lint a file, compare output to expected. Simple, maintainable, high coverage.
|
||||
- **Bracket style enforcement** — Racket uses `[]` for bindings, `()` for
|
||||
application. Guile doesn't have this, but we could enforce consistent bracket
|
||||
usage or other parenthesis conventions.
|
||||
|
||||
---
|
||||
|
||||
## SBLint
|
||||
|
||||
**Repo:** `refs/sblint/` — SBCL compiler-driven linter (~650 LOC)
|
||||
|
||||
### What it does
|
||||
|
||||
A **compiler-assisted linter** for Common Lisp. Doesn't implement its own rules —
|
||||
instead, it **compiles code through SBCL** and surfaces all compiler diagnostics
|
||||
(errors, warnings, style notes) with proper file locations.
|
||||
|
||||
### How it works
|
||||
|
||||
```
|
||||
Source code
|
||||
→ Resolve ASDF dependencies (topological sort)
|
||||
→ Load dependencies via Quicklisp
|
||||
→ Compile project via SBCL (handler-bind captures conditions)
|
||||
→ Extract file/position from compiler internals (Swank protocol)
|
||||
→ Convert byte offset → line:column
|
||||
→ Deduplicate and report
|
||||
```
|
||||
|
||||
No custom parser. No AST. Just the compiler.
|
||||
|
||||
### Architecture
|
||||
|
||||
| File | Lines | Purpose |
|
||||
|------|-------|---------|
|
||||
| `run-lint.lisp` | 277 | Core logic: lint file/system/directory |
|
||||
| `compiler-aux.lisp` | 33 | SBCL introspection bridge |
|
||||
| `asdf.lisp` | 153 | Dependency resolution graph |
|
||||
| `file-position.lisp` | 18 | Byte offset → line:column conversion |
|
||||
| `quicklisp.lisp` | 41 | Auto-install missing dependencies |
|
||||
| `sblint.ros` | — | CLI entry point (Roswell script) |
|
||||
|
||||
### What it catches
|
||||
|
||||
Whatever SBCL catches:
|
||||
- Undefined variables and functions
|
||||
- Type mismatches (with SBCL's type inference)
|
||||
- Style warnings (ANSI compliance, naming)
|
||||
- Reader/syntax errors
|
||||
- Dead code paths
|
||||
- Unused declarations
|
||||
|
||||
Filters out: redefinition warnings, Quicklisp dependency warnings, SBCL
|
||||
contrib warnings.
|
||||
|
||||
### What we can learn
|
||||
|
||||
- **Leverage the host compiler** — Guile itself has `compile` and can produce
|
||||
warnings. We should capture Guile's own compiler diagnostics (undefined
|
||||
variables, unused imports, etc.) as a baseline — it's "free" accuracy.
|
||||
- **Condition-based error collection** — CL's condition system (≈ Guile's
|
||||
exception/handler system) lets you catch errors without stopping compilation.
|
||||
`handler-bind` continues execution after catching. Guile's `with-exception-handler`
|
||||
can do the same.
|
||||
- **Dependency-aware compilation** — Load dependencies first, then compile
|
||||
project. Catches "symbol not found" errors that surface-level analysis misses.
|
||||
- **Deduplication** — Multiple compilation passes can report the same issue.
|
||||
Hash table dedup is simple and effective.
|
||||
- **Minimal is viable** — 650 LOC total. A compiler-driven linter layer could
|
||||
be our first deliverable, augmented with custom rules later.
|
||||
|
||||
---
|
||||
|
||||
## Cross-cutting themes
|
||||
|
||||
### Parsing strategies
|
||||
|
||||
| Strategy | Used by | Pros | Cons |
|
||||
|----------|---------|------|------|
|
||||
| Host compiler | SBLint, Eastwood | Maximum accuracy, type checking | Requires loading code, slow |
|
||||
| Custom reader with positions | Mallet, fmt | Full control, no side effects | Must maintain parser |
|
||||
| Language's built-in reader | racket-review | Free, well-tested | May lack position info |
|
||||
| Side-effect-free reader lib | Kibit (edamame) | Safe, preserves metadata | External dependency |
|
||||
| Zipper-based AST | OCICL (rewrite-cl) | Preserves formatting for fixes | Complex API |
|
||||
|
||||
**For Guile:** We should explore whether `(ice-9 read)` or Guile's reader
|
||||
provides sufficient source location info. If not, a custom reader (or a reader
|
||||
wrapper that annotates with positions) is needed. Guile's `read-syntax` (if
|
||||
available) or source properties on read forms could be the answer.
|
||||
|
||||
### Rule definition patterns
|
||||
|
||||
| Pattern | Used by | Character |
|
||||
|---------|---------|-----------|
|
||||
| Logic programming (unification) | Kibit | Elegant, concise; slow |
|
||||
| OOP classes + generic methods | Mallet | Scales well, self-contained rules |
|
||||
| Registry maps | Eastwood | Simple, data-driven |
|
||||
| Syntax-parse patterns | racket-review, fmt | Natural for Lisps |
|
||||
| Single-pass visitor | OCICL | High performance |
|
||||
| Compiler conditions | SBLint | Zero-effort, limited scope |
|
||||
|
||||
**For Guile:** A combination seems right — `match`/`syntax-case` patterns for
|
||||
the rule DSL (natural in Scheme), with a registry for rule metadata (name,
|
||||
severity, category, enabled-by-default).
|
||||
|
||||
### Configuration patterns
|
||||
|
||||
| Feature | Mallet | OCICL | Eastwood | Kibit | racket-review | fmt |
|
||||
|---------|--------|-------|----------|-------|---------------|-----|
|
||||
| Config file | `.mallet.lisp` | `.ocicl-lint.conf` | Clojure maps | `project.clj` | - | `.fmt.rkt` |
|
||||
| Presets | Yes (4) | - | - | - | - | - |
|
||||
| Preset inheritance | Yes | - | - | - | - | - |
|
||||
| Path-specific rules | Yes | - | - | - | - | - |
|
||||
| Inline suppression | Yes (3 mechanisms) | Yes | Yes | - | Yes | - |
|
||||
| Stale suppression detection | Yes | - | - | - | - | - |
|
||||
| CLI override | Yes | Yes | Yes | Yes | - | Yes |
|
||||
|
||||
**For Guile:** Mallet's configuration system is the most sophisticated and
|
||||
worth emulating — presets, inheritance, path-specific overrides, and stale
|
||||
suppression detection.
|
||||
|
||||
### Auto-fix patterns
|
||||
|
||||
| Tool | Fix mechanism | Preserves formatting? |
|
||||
|------|--------------|----------------------|
|
||||
| Kibit | rewrite-clj zippers | Yes |
|
||||
| Mallet | Bottom-to-top line replacement | Partial |
|
||||
| OCICL | Fixer registry + zipper AST | Yes |
|
||||
|
||||
**For Guile:** Zipper-based AST manipulation (or Guile's SXML tools) for
|
||||
formatting-preserving fixes. The fixer registry pattern (OCICL) keeps rule
|
||||
detection and fixing decoupled.
|
||||
|
||||
### Output formats
|
||||
|
||||
All tools support at minimum: `file:line:column: severity: message`
|
||||
|
||||
Additional formats: JSON (Mallet), Markdown (Kibit), line-only for CI (Mallet).
|
||||
|
||||
---
|
||||
|
||||
## Feature wishlist for gulie
|
||||
|
||||
Based on this survey, the features worth cherry-picking:
|
||||
|
||||
### Must-have (core)
|
||||
|
||||
1. **Guile compiler diagnostics** — Capture Guile's own warnings as baseline (SBLint approach)
|
||||
2. **Custom reader with source positions** — Every form, subform, and token gets line:column
|
||||
3. **Staged pipeline** — Text rules → token rules → form rules (Mallet/OCICL)
|
||||
4. **Pattern-based rule DSL** — Using Guile's `match` or `syntax-case` (Kibit/racket-review inspiration)
|
||||
5. **Rule registry** — `{name, severity, category, enabled-by-default, check-fn}` (Eastwood)
|
||||
6. **Standard output format** — `file:line:column: severity: rule: message`
|
||||
7. **Inline suppression** — `; gulie:suppress rule-name` (Mallet/OCICL)
|
||||
|
||||
### Should-have (v1)
|
||||
|
||||
8. **Config file** — `.gulie.scm` with presets and rule enable/disable (Mallet)
|
||||
9. **Auto-fix infrastructure** — Fixer registry, bottom-to-top application (OCICL/Mallet)
|
||||
10. **Idiom suggestions** — Pattern → replacement rules (Kibit style)
|
||||
11. **Unused binding detection** — Scope tracking with forward reference handling (racket-review)
|
||||
12. **Quote/unquote awareness** — Don't lint inside quoted forms (OCICL)
|
||||
13. **Snapshot-based testing** — `.scm`/`.expected` pairs (racket-review)
|
||||
|
||||
### Nice-to-have (v2+)
|
||||
|
||||
14. **Code formatter** — Cost-based optimal layout (fmt)
|
||||
15. **Pluggable formatter maps** — Per-form formatting rules (fmt)
|
||||
16. **Path-specific rule overrides** — Different rules for `src/` vs `tests/` (Mallet)
|
||||
17. **Stale suppression detection** (Mallet)
|
||||
18. **Editor integration** — Flycheck/flymake for Emacs (racket-review)
|
||||
19. **Macroexpansion-aware analysis** — Suppress false positives from macro output (Eastwood)
|
||||
20. **Cyclomatic complexity and other metrics** (Mallet)
|
||||
470
docs/PLAN.md
Normal file
470
docs/PLAN.md
Normal file
@@ -0,0 +1,470 @@
|
||||
# Gulie — Guile Linter/Formatter: Architecture & Implementation Plan
|
||||
|
||||
## Context
|
||||
|
||||
No linter, formatter, or static analyser exists for Guile Scheme. We're building
|
||||
one from scratch, called **gulie**. The tool is written in Guile itself, reusing
|
||||
as much of Guile's infrastructure as possible (reader, compiler, Tree-IL
|
||||
analyses, warning system). The design draws on patterns observed in 7 reference
|
||||
tools (see `docs/INSPIRATION.md`).
|
||||
|
||||
Guile 3.0.11 is available in the devenv. No source code exists yet.
|
||||
|
||||
---
|
||||
|
||||
## High-level architecture
|
||||
|
||||
Two independent passes, extensible to three:
|
||||
|
||||
```
|
||||
.gulie.sexp (config)
|
||||
|
|
||||
file.scm ──┬──> [Tokenizer] ──> tokens ──> [CST parser] ──> CST
|
||||
| |
|
||||
| [Pass 1: Surface] line rules + CST rules
|
||||
| |
|
||||
| diagnostics-1
|
||||
|
|
||||
└──> [Guile reader] ──> s-exprs ──> [Guile compiler] ──> Tree-IL
|
||||
|
|
||||
[Pass 2: Semantic] built-in analyses + custom Tree-IL rules
|
||||
|
|
||||
diagnostics-2
|
||||
|
|
||||
[merge + suppress + sort + report/fix]
|
||||
```
|
||||
|
||||
**Why two passes?** Guile's reader (`ice-9/read.scm:949-973`) irrecoverably
|
||||
strips comments, whitespace, and datum comments in `next-non-whitespace`. There
|
||||
is no way to get formatting info AND semantic info from one parse. Accepting this
|
||||
and building two clean, independent passes is simpler than fighting the reader.
|
||||
|
||||
---
|
||||
|
||||
## Module structure
|
||||
|
||||
```
|
||||
gulie/
|
||||
bin/gulie # CLI entry point (executable Guile script)
|
||||
gulie/
|
||||
cli.scm # (gulie cli) — arg parsing, dispatch
|
||||
config.scm # (gulie config) — .gulie.sexp loading, defaults, merging
|
||||
diagnostic.scm # (gulie diagnostic) — record type, sorting, formatting
|
||||
tokenizer.scm # (gulie tokenizer) — hand-written lexer, preserves everything
|
||||
cst.scm # (gulie cst) — token stream → concrete syntax tree
|
||||
compiler.scm # (gulie compiler) — Guile compile wrapper, warning capture
|
||||
rule.scm # (gulie rule) — rule record, registry, define-rule macros
|
||||
engine.scm # (gulie engine) — orchestrator: file discovery, pass sequencing
|
||||
fixer.scm # (gulie fixer) — fix application (bottom-to-top edits)
|
||||
suppression.scm # (gulie suppression) — ; gulie:suppress parsing/filtering
|
||||
formatter.scm # (gulie formatter) — cost-based optimal pretty-printer
|
||||
rules/
|
||||
surface.scm # (gulie rules surface) — trailing-ws, line-length, tabs, blanks
|
||||
indentation.scm # (gulie rules indentation) — indent checking vs CST
|
||||
comments.scm # (gulie rules comments) — comment style conventions
|
||||
semantic.scm # (gulie rules semantic) — wrappers around Guile's analyses
|
||||
idiom.scm # (gulie rules idiom) — pattern-based suggestions via match
|
||||
module-form.scm # (gulie rules module-form) — define-module checks
|
||||
test/
|
||||
test-tokenizer.scm
|
||||
test-cst.scm
|
||||
test-rules-surface.scm
|
||||
test-rules-semantic.scm
|
||||
fixtures/
|
||||
clean/ # .scm files producing zero diagnostics
|
||||
violations/ # .scm + .expected pairs (snapshot testing)
|
||||
```
|
||||
|
||||
~16 source files. Each has one clear job.
|
||||
|
||||
---
|
||||
|
||||
## Key components
|
||||
|
||||
### Tokenizer (`gulie/tokenizer.scm`)
|
||||
|
||||
Hand-written character-by-character state machine. Must handle the same lexical
|
||||
syntax as Guile's reader but **preserve** what the reader discards.
|
||||
|
||||
```scheme
|
||||
(define-record-type <token>
|
||||
(make-token type text line column)
|
||||
token?
|
||||
(type token-type) ;; symbol (see list below)
|
||||
(text token-text) ;; string: exact source text
|
||||
(line token-line) ;; integer: 1-based
|
||||
(column token-column)) ;; integer: 0-based
|
||||
```
|
||||
|
||||
Token types (~15): `open-paren`, `close-paren`, `symbol`, `number`, `string`,
|
||||
`keyword`, `boolean`, `character`, `prefix` (`'`, `` ` ``, `,`, `,@`, `#'`,
|
||||
etc.), `special` (`#;`, `#(`, `#vu8(`, etc.), `line-comment`, `block-comment`,
|
||||
`whitespace`, `newline`, `dot`.
|
||||
|
||||
**Critical invariant:** `(string-concatenate (map token-text (tokenize input)))` must
|
||||
reproduce the original input exactly. This is our primary roundtrip test.
|
||||
|
||||
Estimated size: ~200-250 lines. Reference: Mallet's tokenizer (163 lines CL).
|
||||
|
||||
### CST (`gulie/cst.scm`)
|
||||
|
||||
Trivial parenthesised tree built from the token stream:
|
||||
|
||||
```scheme
|
||||
(define-record-type <cst-node>
|
||||
(make-cst-node open close children)
|
||||
cst-node?
|
||||
(open cst-node-open) ;; <token> for ( [ {
|
||||
(close cst-node-close) ;; <token> for ) ] }
|
||||
(children cst-node-children)) ;; list of <cst-node> | <token>
|
||||
```
|
||||
|
||||
Children is a flat list of interleaved atoms (tokens) and nested nodes. Comments
|
||||
and whitespace are children like anything else.
|
||||
|
||||
The first non-whitespace symbol child of a `<cst-node>` identifies the form
|
||||
(`define`, `let`, `cond`, etc.) — enough for indentation rules.
|
||||
|
||||
Estimated size: ~80-100 lines.
|
||||
|
||||
### Compiler wrapper (`gulie/compiler.scm`)
|
||||
|
||||
Wraps Guile's compile pipeline to capture warnings as structured diagnostics:
|
||||
|
||||
```scheme
|
||||
;; Key Guile APIs we delegate to:
|
||||
;; - (system base compile): read-and-compile, compile, default-warning-level
|
||||
;; - (language tree-il analyze): make-analyzer, analyze-tree
|
||||
;; - (system base message): %warning-types, current-warning-port
|
||||
```
|
||||
|
||||
Strategy: call `read-and-compile` with `#:to 'tree-il` and `#:warning-level 2`
|
||||
while redirecting `current-warning-port` to a string port, then parse the
|
||||
warning output into `<diagnostic>` records. Alternatively, invoke `make-analyzer`
|
||||
directly and hook the warning printers.
|
||||
|
||||
Guile's built-in analyses (all free):
|
||||
- `unused-variable-analysis`
|
||||
- `unused-toplevel-analysis`
|
||||
- `unused-module-analysis`
|
||||
- `shadowed-toplevel-analysis`
|
||||
- `make-use-before-definition-analysis` (unbound variables)
|
||||
- `arity-analysis` (wrong arg count)
|
||||
- `format-analysis` (format string validation)
|
||||
|
||||
### Rule system (`gulie/rule.scm`)
|
||||
|
||||
```scheme
|
||||
(define-record-type <rule>
|
||||
(make-rule name description severity category type check-proc fix-proc)
|
||||
rule?
|
||||
(name rule-name) ;; symbol
|
||||
(description rule-description) ;; string
|
||||
(severity rule-severity) ;; 'error | 'warning | 'info
|
||||
(category rule-category) ;; 'format | 'style | 'correctness | 'idiom
|
||||
(type rule-type) ;; 'line | 'cst | 'tree-il
|
||||
(check-proc rule-check-proc) ;; procedure (signature depends on type)
|
||||
(fix-proc rule-fix-proc)) ;; procedure | #f
|
||||
```
|
||||
|
||||
Three rule types with different check signatures:
|
||||
- **`'line`** — `(lambda (file line-num line-text config) -> diagnostics)` — fastest, no parsing
|
||||
- **`'cst`** — `(lambda (file cst config) -> diagnostics)` — needs tokenizer+CST
|
||||
- **`'tree-il`** — `(lambda (file tree-il env config) -> diagnostics)` — needs compilation
|
||||
|
||||
Global registry: `*rules*` alist, populated at module load time via
|
||||
`register-rule!`. Convenience macros: `define-line-rule`, `define-cst-rule`,
|
||||
`define-tree-il-rule`.
|
||||
|
||||
### Diagnostic record (`gulie/diagnostic.scm`)
|
||||
|
||||
```scheme
|
||||
(define-record-type <diagnostic>
|
||||
(make-diagnostic file line column severity rule message fix)
|
||||
diagnostic?
|
||||
(file diagnostic-file) ;; string
|
||||
(line diagnostic-line) ;; integer, 1-based
|
||||
(column diagnostic-column) ;; integer, 0-based
|
||||
(severity diagnostic-severity) ;; symbol
|
||||
(rule diagnostic-rule) ;; symbol
|
||||
(message diagnostic-message) ;; string
|
||||
(fix diagnostic-fix)) ;; <fix> | #f
|
||||
```
|
||||
|
||||
Standard output: `file:line:column: severity: rule: message`
|
||||
|
||||
### Config (`gulie/config.scm`)
|
||||
|
||||
File: `.gulie.sexp` in project root (plain s-expression, read with `(read)`,
|
||||
never evaluated):
|
||||
|
||||
```scheme
|
||||
((line-length . 80)
|
||||
(indent . 2)
|
||||
(enable trailing-whitespace line-length unused-variable arity-mismatch)
|
||||
(disable tabs)
|
||||
(rules
|
||||
(line-length (max . 100)))
|
||||
(indent-rules
|
||||
(with-syntax . 1)
|
||||
(match . 1))
|
||||
(ignore "build/**" ".direnv/**"))
|
||||
```
|
||||
|
||||
Precedence: CLI flags > config file > built-in defaults.
|
||||
|
||||
`--init` generates a template with all rules listed and commented.
|
||||
|
||||
### Suppression (`gulie/suppression.scm`)
|
||||
|
||||
```scheme
|
||||
;; gulie:suppress trailing-whitespace — suppress on next line
|
||||
(define x "messy")
|
||||
|
||||
(define x "messy") ; gulie:suppress — suppress on this line
|
||||
|
||||
;; gulie:disable line-length — region disable
|
||||
... code ...
|
||||
;; gulie:enable line-length — region enable
|
||||
```
|
||||
|
||||
Parsed from raw text before rules run. Produces a suppression map that filters
|
||||
diagnostics after all rules have emitted.
|
||||
|
||||
---
|
||||
|
||||
## Indentation rules
|
||||
|
||||
The key data is `scheme-indent-function` values from `.dir-locals.el` — an
|
||||
integer N meaning "N arguments on first line, then body indented +2":
|
||||
|
||||
```scheme
|
||||
(define *default-indent-rules*
|
||||
'((define . 1) (define* . 1) (define-public . 1) (define-syntax . 1)
|
||||
(define-module . 0) (lambda . 1) (lambda* . 1)
|
||||
(let . 1) (let* . 1) (letrec . 1) (letrec* . 1)
|
||||
(if . #f) (cond . 0) (case . 1) (when . 1) (unless . 1)
|
||||
(match . 1) (syntax-case . 2) (with-syntax . 1)
|
||||
(begin . 0) (do . 2) (parameterize . 1) (guard . 1)))
|
||||
```
|
||||
|
||||
Overridable via config `indent-rules`. The indentation checker walks the CST,
|
||||
identifies the form by its first symbol child, looks up the rule, and compares
|
||||
actual indentation to expected.
|
||||
|
||||
---
|
||||
|
||||
## Formatting conventions (Guile vs Guix)
|
||||
|
||||
Both use 2-space indent, same special-form conventions. Key difference:
|
||||
- **Guile:** 72-char fill column, `;;; {Section}` headers
|
||||
- **Guix:** 78-80 char fill column, `;;` headers
|
||||
|
||||
Our default config targets Guile conventions. A Guix preset can override
|
||||
`line-length` and comment style.
|
||||
|
||||
---
|
||||
|
||||
## Formatter: cost-based optimal pretty-printing
|
||||
|
||||
The formatter (`gulie/formatter.scm`) is a later-phase component that
|
||||
**rewrites** files with correct layout, as opposed to the indentation checker
|
||||
which merely **reports** violations.
|
||||
|
||||
### Why cost-based?
|
||||
|
||||
When deciding where to break lines in a long expression, there are often multiple
|
||||
valid options. A greedy approach (fill as much as fits, then break) produces
|
||||
mediocre output — it can't "look ahead" to see that a break earlier would produce
|
||||
a better overall layout. The Wadler/Leijen family of algorithms evaluates
|
||||
alternative layouts and selects the optimal one.
|
||||
|
||||
### The algorithm (Wadler/Leijen, as used by fmt's `pretty-expressive`)
|
||||
|
||||
The pretty-printer works with an abstract **document** type:
|
||||
|
||||
```
|
||||
doc = text(string) — literal text
|
||||
| line — line break (or space if flattened)
|
||||
| nest(n, doc) — increase indent by n
|
||||
| concat(doc, doc) — concatenation
|
||||
| alt(doc, doc) — choose better of two layouts
|
||||
| group(doc) — try flat first, break if doesn't fit
|
||||
```
|
||||
|
||||
The key operator is `alt(a, b)` — "try layout A, but if it overflows the page
|
||||
width, use layout B instead." The algorithm evaluates both alternatives and
|
||||
picks the one with the lower **cost vector**:
|
||||
|
||||
```
|
||||
cost = [badness, height, characters]
|
||||
|
||||
badness — quadratic penalty for exceeding page width
|
||||
height — number of lines used
|
||||
characters — total chars (tiebreaker)
|
||||
```
|
||||
|
||||
This produces provably optimal output: the layout that minimises overflow while
|
||||
using the fewest lines.
|
||||
|
||||
### How it fits our architecture
|
||||
|
||||
```
|
||||
CST (from tokenizer + cst.scm)
|
||||
→ [doc generator] convert CST nodes to abstract doc, using form-specific rules
|
||||
→ [layout solver] evaluate alternatives, select optimal layout
|
||||
→ [renderer] emit formatted text with comments preserved
|
||||
```
|
||||
|
||||
The **doc generator** uses the same form-identification logic as the indentation
|
||||
checker (first symbol child of a CST node) to apply form-specific layout rules.
|
||||
For example:
|
||||
|
||||
- `define` — name on first line, body indented
|
||||
- `let` — bindings as aligned block, body indented
|
||||
- `cond` — each clause on its own line
|
||||
|
||||
These rules are data (the `indent-rules` table extended with layout hints),
|
||||
making the formatter configurable just like the checker.
|
||||
|
||||
### Implementation approach
|
||||
|
||||
We can either:
|
||||
1. **Port `pretty-expressive`** from Racket — the core algorithm is ~300 lines,
|
||||
well-documented in academic papers
|
||||
2. **Upgrade Guile's `(ice-9 pretty-print)`** — it already knows form-specific
|
||||
indentation rules but uses greedy layout; we'd replace the layout engine with
|
||||
cost-based selection
|
||||
|
||||
Option 1 is cleaner (purpose-built). Option 2 reuses more existing code but
|
||||
would be a heavier modification. We'll decide when we reach that phase.
|
||||
|
||||
### Phase note
|
||||
|
||||
The formatter is **Phase 6** work. Phases 0-4 deliver a useful checker without
|
||||
it. The indentation checker (Phase 4) validates existing formatting; the
|
||||
formatter (Phase 6) rewrites it. The checker comes first because it's simpler
|
||||
and immediately useful in CI.
|
||||
|
||||
---
|
||||
|
||||
## CLI interface
|
||||
|
||||
```
|
||||
gulie [OPTIONS] [FILE|DIR...]
|
||||
|
||||
--check Report issues, exit non-zero on findings (default)
|
||||
--fix Fix mode: auto-fix what's possible, report the rest
|
||||
--format Format mode: rewrite files with optimal layout
|
||||
--init Generate .gulie.sexp template
|
||||
--pass PASS Run only: surface, semantic, all (default: all)
|
||||
--rule RULE Enable only this rule (repeatable)
|
||||
--disable RULE Disable this rule (repeatable)
|
||||
--severity SEV Minimum severity: error, warning, info
|
||||
--output FORMAT Output: standard (default), json, compact
|
||||
--config FILE Config file path (default: auto-discover)
|
||||
--list-rules List all rules and exit
|
||||
--version Print version
|
||||
```
|
||||
|
||||
Exit codes: 0 = clean, 1 = findings, 2 = config error, 3 = internal error.
|
||||
|
||||
---
|
||||
|
||||
## Implementation phases
|
||||
|
||||
### Phase 0: Skeleton
|
||||
- `bin/gulie` — shebang script, loads CLI module
|
||||
- `(gulie cli)` — basic arg parsing (`--check`, `--version`, file args)
|
||||
- `(gulie diagnostic)` — record type + standard formatter
|
||||
- `(gulie rule)` — record type + registry + `register-rule!`
|
||||
- `(gulie engine)` — discovers `.scm` files, runs line rules, reports
|
||||
- One trivial rule: `trailing-whitespace` (line rule)
|
||||
- **Verification:** `gulie --check some-file.scm` reports trailing whitespace
|
||||
|
||||
### Phase 1: Tokenizer + CST + surface rules
|
||||
- `(gulie tokenizer)` — hand-written lexer
|
||||
- `(gulie cst)` — token → tree
|
||||
- Surface rules: `trailing-whitespace`, `line-length`, `no-tabs`, `blank-lines`
|
||||
- Comment rule: `comment-semicolons` (check `;`/`;;`/`;;;` usage)
|
||||
- Roundtrip test: tokenize → concat = original
|
||||
- Snapshot tests for each rule
|
||||
|
||||
### Phase 2: Semantic rules (compiler pass)
|
||||
- `(gulie compiler)` — `read-and-compile` wrapper, warning capture
|
||||
- Semantic rules wrapping Guile's built-in analyses:
|
||||
`unused-variable`, `unused-toplevel`, `unbound-variable`, `arity-mismatch`,
|
||||
`format-string`, `shadowed-toplevel`, `unused-module`
|
||||
- **Verification:** run against Guile and Guix source files, check false-positive rate
|
||||
|
||||
### Phase 3: Config + suppression
|
||||
- `(gulie config)` — `.gulie.sexp` loading + merging
|
||||
- `(gulie suppression)` — inline comment suppression
|
||||
- `--init` command
|
||||
- Rule enable/disable via config and CLI
|
||||
|
||||
### Phase 4: Indentation checking
|
||||
- `(gulie rules indentation)` — CST-based indent checker
|
||||
- Default indent rules for standard Guile forms
|
||||
- Configurable `indent-rules` in `.gulie.sexp`
|
||||
|
||||
### Phase 5: Fix mode + idiom rules
|
||||
- `(gulie fixer)` — bottom-to-top edit application
|
||||
- Auto-fix for: trailing whitespace, line-length (where possible)
|
||||
- `(gulie rules idiom)` — `match`-based pattern suggestions on Tree-IL
|
||||
- `(gulie rules module-form)` — `define-module` form checks (sorted imports, etc.)
|
||||
|
||||
### Phase 6: Formatter (cost-based optimal layout)
|
||||
- `(gulie formatter)` — Wadler/Leijen pretty-printer with cost-based selection
|
||||
- Abstract document type: `text`, `line`, `nest`, `concat`, `alt`, `group`
|
||||
- Form-specific layout rules (reuse indent-rules table + layout hints)
|
||||
- Comment preservation through formatting
|
||||
- `--format` CLI mode
|
||||
- **Verification:** format Guile/Guix source files, diff against originals,
|
||||
verify roundtrip stability (format twice = same output)
|
||||
|
||||
### Phase 7: Cross-module analysis (future)
|
||||
- Load multiple modules, walk dependency graph
|
||||
- Unused exports, cross-module arity checks
|
||||
- `--pass cross-module` CLI option
|
||||
|
||||
---
|
||||
|
||||
## Testing strategy
|
||||
|
||||
1. **Roundtrip test** (tokenizer): tokenize → concat must equal original input
|
||||
2. **Snapshot tests**: `fixtures/violations/rule-name.scm` + `.expected` pairs
|
||||
3. **Clean file tests**: `fixtures/clean/*.scm` must produce zero diagnostics
|
||||
4. **Unit tests**: `(srfi srfi-64)` for tokenizer, CST, config, diagnostics
|
||||
5. **Real-world corpus**: run against `test/guix/` and `refs/guile/module/` for
|
||||
false-positive rate validation
|
||||
6. **Formatter idempotency**: `format(format(x)) = format(x)` for all test files
|
||||
|
||||
---
|
||||
|
||||
## Key design decisions
|
||||
|
||||
| Decision | Rationale |
|
||||
|----------|-----------|
|
||||
| Hand-written tokenizer, not extending Guile's reader | The reader is ~1000 lines of nested closures not designed for extension. A clean 200-line tokenizer is easier to write/test. |
|
||||
| Two independent passes, not a unified AST | Reader strips comments irrecoverably. Accepting this gives clean separation. |
|
||||
| Delegate to Guile's built-in analyses | They're battle-tested, handle macroexpansion edge cases, and are maintained upstream. |
|
||||
| `(ice-9 match)` for idiom rules, not logic programming | Built-in, fast, sufficient. miniKanren can be added later if needed. |
|
||||
| S-expression config, not YAML/TOML | Zero deps. Our users write Scheme. `(read)` does the parsing. |
|
||||
| Flat CST (parens + interleaved tokens), not rich AST | Enough for indentation/formatting checks. No overengineering. |
|
||||
| Cost-based optimal layout for the formatter | Greedy formatters produce mediocre output. Wadler/Leijen is cleaner and provably correct. Worth the investment when we reach that phase. |
|
||||
| Checker first, formatter later | Checking is simpler, immediately useful in CI, and validates the tokenizer/CST infrastructure that the formatter will build on. |
|
||||
|
||||
---
|
||||
|
||||
## Critical files to reference during implementation
|
||||
|
||||
- `refs/guile/module/ice-9/read.scm:949-973` — what the reader discards (our tokenizer must keep)
|
||||
- `refs/guile/module/language/tree-il/analyze.scm:1461-1479` — `make-analyzer` API
|
||||
- `refs/guile/module/system/base/compile.scm:298-340` — `read-and-compile` / `compile`
|
||||
- `refs/guile/module/system/base/message.scm:83-220` — `%warning-types` definitions
|
||||
- `refs/guile/module/language/tree-il.scm` — Tree-IL node types and traversal
|
||||
- `refs/guile/module/ice-9/pretty-print.scm` — existing pretty-printer (form-specific rules to extract)
|
||||
- `refs/mallet/src/parser/tokenizer.lisp` — reference tokenizer (163 lines)
|
||||
- `refs/fmt/conventions.rkt` — form-specific formatting rules (100+ forms)
|
||||
- `refs/fmt/main.rkt` — cost-based layout selection implementation
|
||||
105
gulie/cli.scm
Normal file
105
gulie/cli.scm
Normal file
@@ -0,0 +1,105 @@
|
||||
;;; (gulie cli) — command-line argument parsing and main dispatch
|
||||
|
||||
(define-module (gulie cli)
|
||||
#:use-module (ice-9 getopt-long)
|
||||
#:use-module (ice-9 format)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:use-module (gulie config)
|
||||
#:use-module (gulie engine)
|
||||
#:use-module (gulie rule)
|
||||
#:use-module (gulie diagnostic)
|
||||
#:export (main))
|
||||
|
||||
(define version "0.1.0")
|
||||
|
||||
(define option-spec
|
||||
'((help (single-char #\h) (value #f))
|
||||
(version (single-char #\v) (value #f))
|
||||
(check (value #f))
|
||||
(fix (value #f))
|
||||
(init (value #f))
|
||||
(pass (value #t))
|
||||
(config (value #t))
|
||||
(rule (value #t))
|
||||
(disable (value #t))
|
||||
(severity (value #t))
|
||||
(output (value #t))
|
||||
(list-rules (value #f))))
|
||||
|
||||
(define (show-help)
|
||||
(display "gulie — a linter and formatter for Guile Scheme\n\n")
|
||||
(display "Usage: gulie [OPTIONS] [FILE|DIR...]\n\n")
|
||||
(display "Options:\n")
|
||||
(display " -h, --help Show this help message\n")
|
||||
(display " -v, --version Print version\n")
|
||||
(display " --check Check mode (default): report issues\n")
|
||||
(display " --fix Fix mode: auto-fix what's possible\n")
|
||||
(display " --init Generate .gulie.sexp template\n")
|
||||
(display " --pass PASS Run only: surface, semantic, all (default: all)\n")
|
||||
(display " --config FILE Config file path\n")
|
||||
(display " --rule RULE Enable only this rule (repeatable)\n")
|
||||
(display " --disable RULE Disable this rule\n")
|
||||
(display " --severity SEV Minimum severity: error, warning, info\n")
|
||||
(display " --output FORMAT Output format: standard, json, compact\n")
|
||||
(display " --list-rules List all available rules\n"))
|
||||
|
||||
(define (show-version)
|
||||
(format #t "gulie ~a~%" version))
|
||||
|
||||
(define (list-all-rules)
|
||||
(let ((rules (all-rules)))
|
||||
(if (null? rules)
|
||||
(display "No rules registered.\n")
|
||||
(for-each
|
||||
(lambda (r)
|
||||
(format #t " ~20a ~8a ~10a ~a~%"
|
||||
(rule-name r)
|
||||
(rule-severity r)
|
||||
(rule-category r)
|
||||
(rule-description r)))
|
||||
(sort rules (lambda (a b)
|
||||
(string<? (symbol->string (rule-name a))
|
||||
(symbol->string (rule-name b)))))))))
|
||||
|
||||
(define (main args)
|
||||
(let* ((options (getopt-long args option-spec))
|
||||
(rest (option-ref options '() '())))
|
||||
|
||||
(cond
|
||||
((option-ref options 'help #f)
|
||||
(show-help)
|
||||
0)
|
||||
|
||||
((option-ref options 'version #f)
|
||||
(show-version)
|
||||
0)
|
||||
|
||||
((option-ref options 'list-rules #f)
|
||||
;; Ensure rules are loaded
|
||||
(list-all-rules)
|
||||
0)
|
||||
|
||||
((option-ref options 'init #f)
|
||||
(let ((path ".gulie.sexp"))
|
||||
(if (file-exists? path)
|
||||
(begin
|
||||
(format (current-error-port) "~a already exists~%" path)
|
||||
2)
|
||||
(begin
|
||||
(call-with-output-file path generate-template)
|
||||
(format #t "i Generated ~a~%" path)
|
||||
0))))
|
||||
|
||||
(else
|
||||
(let* ((config-path (option-ref options 'config #f))
|
||||
(user-config (load-config config-path))
|
||||
(config (merge-configs default-config user-config))
|
||||
(paths (if (null? rest) (list ".") rest))
|
||||
(ignore-pats (config-ignore-patterns config))
|
||||
(files (discover-scheme-files paths ignore-pats)))
|
||||
(if (null? files)
|
||||
(begin
|
||||
(display "No Scheme files found.\n" (current-error-port))
|
||||
0)
|
||||
(let ((count (lint-files files config)))
|
||||
(if (> count 0) 1 0))))))))
|
||||
92
gulie/compiler.scm
Normal file
92
gulie/compiler.scm
Normal file
@@ -0,0 +1,92 @@
|
||||
;;; (gulie compiler) — Guile compiler wrapper for semantic analysis
|
||||
;;;
|
||||
;;; Wraps Guile's compile pipeline to capture compiler warnings
|
||||
;;; (unused variables, arity mismatches, format string errors, etc.)
|
||||
;;; as structured <diagnostic> records.
|
||||
|
||||
(define-module (gulie compiler)
|
||||
#:use-module (system base compile)
|
||||
#:use-module (system base message)
|
||||
#:use-module (ice-9 regex)
|
||||
#:use-module (ice-9 match)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:use-module (gulie diagnostic)
|
||||
#:export (compile-and-capture-warnings))
|
||||
|
||||
;; Regex to parse Guile's warning format:
|
||||
;; ;;; file:line:column: warning: message
|
||||
(define *warning-re*
|
||||
(make-regexp "^;;; ([^:]+):([0-9]+):([0-9]+): warning: (.+)$"))
|
||||
|
||||
(define (parse-warning-line text file)
|
||||
"Parse a warning line from Guile's compiler output into a <diagnostic>."
|
||||
(let ((m (regexp-exec *warning-re* text)))
|
||||
(if m
|
||||
(let ((wfile (match:substring m 1))
|
||||
(wline (string->number (match:substring m 2)))
|
||||
(wcol (string->number (match:substring m 3)))
|
||||
(wmsg (match:substring m 4)))
|
||||
(make-diagnostic
|
||||
(if (string=? wfile "<unknown-location>") file wfile)
|
||||
wline wcol
|
||||
'warning
|
||||
(classify-warning wmsg)
|
||||
wmsg
|
||||
#f))
|
||||
#f)))
|
||||
|
||||
(define (classify-warning msg)
|
||||
"Derive a rule name symbol from a warning message."
|
||||
(cond
|
||||
((string-contains msg "unused variable") 'unused-variable)
|
||||
((string-contains msg "unused local top-level") 'unused-toplevel)
|
||||
((string-contains msg "unused module") 'unused-module)
|
||||
((string-contains msg "shadows previous") 'shadowed-toplevel)
|
||||
((string-contains msg "unbound variable") 'unbound-variable)
|
||||
((string-contains msg "wrong number of arguments") 'arity-mismatch)
|
||||
((string-contains msg "used before definition") 'use-before-definition)
|
||||
((string-contains msg "macro") 'macro-use-before-definition)
|
||||
((string-contains msg "format") 'format-string)
|
||||
((string-contains msg "non-idempotent") 'non-idempotent-definition)
|
||||
((string-contains msg "duplicate datum") 'duplicate-case-datum)
|
||||
((string-contains msg "cannot be meaningfully") 'bad-case-datum)
|
||||
(else 'compiler-warning)))
|
||||
|
||||
(define (compile-and-capture-warnings file text config)
|
||||
"Compile TEXT (as if from FILE) and capture all compiler warnings.
|
||||
Returns a list of <diagnostic> records."
|
||||
(let* ((warning-output (open-output-string))
|
||||
(diagnostics '()))
|
||||
(parameterize ((current-warning-port warning-output))
|
||||
(catch #t
|
||||
(lambda ()
|
||||
(let ((port (open-input-string text)))
|
||||
(set-port-filename! port file)
|
||||
;; Compile to CPS — analyses run during tree-il→CPS lowering
|
||||
(read-and-compile port
|
||||
#:from 'scheme
|
||||
#:to 'cps
|
||||
#:warning-level 3
|
||||
#:env (make-fresh-user-module))))
|
||||
(lambda (key . args)
|
||||
;; Compilation errors become diagnostics too
|
||||
(let ((msg (call-with-output-string
|
||||
(lambda (p)
|
||||
(display key p)
|
||||
(display ": " p)
|
||||
(for-each (lambda (a) (display a p) (display " " p))
|
||||
args)))))
|
||||
(set! diagnostics
|
||||
(cons (make-diagnostic file 1 0 'error 'compile-error msg #f)
|
||||
diagnostics))))))
|
||||
;; Parse captured warnings
|
||||
(let* ((output (get-output-string warning-output))
|
||||
(lines (string-split output #\newline)))
|
||||
(for-each
|
||||
(lambda (line)
|
||||
(when (> (string-length line) 0)
|
||||
(let ((diag (parse-warning-line line file)))
|
||||
(when diag
|
||||
(set! diagnostics (cons diag diagnostics))))))
|
||||
lines))
|
||||
diagnostics))
|
||||
129
gulie/config.scm
Normal file
129
gulie/config.scm
Normal file
@@ -0,0 +1,129 @@
|
||||
;;; (gulie config) — configuration loading and merging
|
||||
;;;
|
||||
;;; Reads .gulie.sexp from project root (or CLI-specified path),
|
||||
;;; merges with built-in defaults, and provides config accessors.
|
||||
|
||||
(define-module (gulie config)
|
||||
#:use-module (ice-9 rdelim)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:export (default-config
|
||||
load-config
|
||||
merge-configs
|
||||
config-ref
|
||||
config-line-length
|
||||
config-indent-width
|
||||
config-max-blank-lines
|
||||
config-enabled-rules
|
||||
config-disabled-rules
|
||||
config-ignore-patterns
|
||||
config-indent-rules
|
||||
generate-template))
|
||||
|
||||
(define default-config
|
||||
'((line-length . 80)
|
||||
(indent . 2)
|
||||
(max-blank-lines . 2)
|
||||
(enable . ())
|
||||
(disable . ())
|
||||
(ignore . ())
|
||||
(indent-rules
|
||||
(define . 1)
|
||||
(define* . 1)
|
||||
(define-public . 1)
|
||||
(define-syntax . 1)
|
||||
(define-syntax-rule . 1)
|
||||
(define-module . 0)
|
||||
(define-record-type . 1)
|
||||
(lambda . 1)
|
||||
(lambda* . 1)
|
||||
(let . 1)
|
||||
(let* . 1)
|
||||
(letrec . 1)
|
||||
(letrec* . 1)
|
||||
(let-values . 1)
|
||||
(if . special)
|
||||
(cond . 0)
|
||||
(case . 1)
|
||||
(when . 1)
|
||||
(unless . 1)
|
||||
(match . 1)
|
||||
(match-lambda . 0)
|
||||
(match-lambda* . 0)
|
||||
(syntax-case . 2)
|
||||
(syntax-rules . 1)
|
||||
(with-syntax . 1)
|
||||
(begin . 0)
|
||||
(do . 2)
|
||||
(parameterize . 1)
|
||||
(guard . 1)
|
||||
(with-exception-handler . 1)
|
||||
(call-with-values . 1)
|
||||
(receive . 2)
|
||||
(use-modules . 0)
|
||||
(with-fluids . 1)
|
||||
(dynamic-wind . 0))))
|
||||
|
||||
(define (config-ref config key . default)
|
||||
"Look up KEY in CONFIG alist, returning DEFAULT if not found."
|
||||
(let ((pair (assq key config)))
|
||||
(if pair
|
||||
(cdr pair)
|
||||
(if (null? default) #f (car default)))))
|
||||
|
||||
(define (config-line-length config)
|
||||
(or (config-ref config 'line-length) 80))
|
||||
|
||||
(define (config-indent-width config)
|
||||
(or (config-ref config 'indent) 2))
|
||||
|
||||
(define (config-max-blank-lines config)
|
||||
(or (config-ref config 'max-blank-lines) 2))
|
||||
|
||||
(define (config-enabled-rules config)
|
||||
(or (config-ref config 'enable) '()))
|
||||
|
||||
(define (config-disabled-rules config)
|
||||
(or (config-ref config 'disable) '()))
|
||||
|
||||
(define (config-ignore-patterns config)
|
||||
(or (config-ref config 'ignore) '()))
|
||||
|
||||
(define (config-indent-rules config)
|
||||
(or (config-ref config 'indent-rules) '()))
|
||||
|
||||
(define (load-config path)
|
||||
"Load a .gulie.sexp config file at PATH. Returns an alist."
|
||||
(if (and path (file-exists? path))
|
||||
(call-with-input-file path
|
||||
(lambda (port)
|
||||
(let ((data (read port)))
|
||||
(if (list? data) data '()))))
|
||||
'()))
|
||||
|
||||
(define (merge-configs base override)
|
||||
"Merge OVERRIDE config on top of BASE. Override wins for scalar values;
|
||||
lists are replaced, not appended."
|
||||
(let lp ((result base)
|
||||
(pairs override))
|
||||
(if (null? pairs)
|
||||
result
|
||||
(let ((pair (car pairs)))
|
||||
(lp (assq-set! (list-copy result) (car pair) (cdr pair))
|
||||
(cdr pairs))))))
|
||||
|
||||
(define (find-config-file start-dir)
|
||||
"Search upward from START-DIR for .gulie.sexp. Returns path or #f."
|
||||
(let lp ((dir start-dir))
|
||||
(let ((candidate (string-append dir "/.gulie.sexp")))
|
||||
(cond
|
||||
((file-exists? candidate) candidate)
|
||||
((string=? dir "/") #f)
|
||||
(else (lp (dirname dir)))))))
|
||||
|
||||
(define (generate-template port)
|
||||
"Write a template .gulie.sexp to PORT."
|
||||
(display ";;; gulie configuration\n" port)
|
||||
(display ";;; Place this file as .gulie.sexp in your project root.\n" port)
|
||||
(display ";;; All fields are optional — defaults are shown below.\n\n" port)
|
||||
(write default-config port)
|
||||
(newline port))
|
||||
95
gulie/cst.scm
Normal file
95
gulie/cst.scm
Normal file
@@ -0,0 +1,95 @@
|
||||
;;; (gulie cst) — concrete syntax tree from token stream
|
||||
;;;
|
||||
;;; Builds a tree that mirrors the parenthesised structure of the source
|
||||
;;; while preserving ALL tokens (whitespace, comments, atoms).
|
||||
|
||||
(define-module (gulie cst)
|
||||
#:use-module (srfi srfi-9)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:use-module (srfi srfi-11)
|
||||
#:use-module (gulie tokenizer)
|
||||
#:export (<cst-node>
|
||||
make-cst-node
|
||||
cst-node?
|
||||
cst-node-open
|
||||
cst-node-close
|
||||
cst-node-children
|
||||
|
||||
parse-cst
|
||||
cst-form-name
|
||||
cst-significant-children))
|
||||
|
||||
;; A parenthesised group in the CST.
|
||||
(define-record-type <cst-node>
|
||||
(make-cst-node open close children)
|
||||
cst-node?
|
||||
(open cst-node-open) ;; <token> for ( [ {
|
||||
(close cst-node-close) ;; <token> for ) ] } or #f if unmatched
|
||||
(children cst-node-children)) ;; list of <cst-node> | <token>
|
||||
|
||||
(define (parse-cst tokens)
|
||||
"Parse a flat list of tokens into a CST.
|
||||
Returns a <cst-node> with a synthetic root (no open/close parens)
|
||||
whose children are the top-level forms."
|
||||
(let lp ((remaining tokens) (children '()))
|
||||
(if (null? remaining)
|
||||
(make-cst-node #f #f (reverse children))
|
||||
(let ((tok (car remaining))
|
||||
(rest (cdr remaining)))
|
||||
(case (token-type tok)
|
||||
((open-paren)
|
||||
;; Recursively parse until matching close
|
||||
(let-values (((node remaining*) (parse-group tok rest)))
|
||||
(lp remaining* (cons node children))))
|
||||
((close-paren)
|
||||
;; Unmatched close paren at top level — keep as-is
|
||||
(lp rest (cons tok children)))
|
||||
(else
|
||||
(lp rest (cons tok children))))))))
|
||||
|
||||
(define (parse-group open-token tokens)
|
||||
"Parse tokens after an open paren until the matching close.
|
||||
Returns (values <cst-node> remaining-tokens)."
|
||||
(let lp ((remaining tokens) (children '()))
|
||||
(if (null? remaining)
|
||||
;; Unclosed paren
|
||||
(values (make-cst-node open-token #f (reverse children))
|
||||
'())
|
||||
(let ((tok (car remaining))
|
||||
(rest (cdr remaining)))
|
||||
(case (token-type tok)
|
||||
((close-paren)
|
||||
(values (make-cst-node open-token tok (reverse children))
|
||||
rest))
|
||||
((open-paren)
|
||||
(let-values (((node remaining*) (parse-group tok rest)))
|
||||
(lp remaining* (cons node children))))
|
||||
(else
|
||||
(lp rest (cons tok children))))))))
|
||||
|
||||
(define (cst-form-name node)
|
||||
"For a <cst-node>, return the first significant symbol child as a string,
|
||||
or #f if the node has no symbol children."
|
||||
(let lp ((children (cst-node-children node)))
|
||||
(if (null? children)
|
||||
#f
|
||||
(let ((child (car children)))
|
||||
(cond
|
||||
((and (token? child)
|
||||
(eq? (token-type child) 'symbol))
|
||||
(token-text child))
|
||||
((and (token? child)
|
||||
(memq (token-type child) '(whitespace newline line-comment
|
||||
block-comment prefix)))
|
||||
(lp (cdr children)))
|
||||
(else #f))))))
|
||||
|
||||
(define (cst-significant-children node)
|
||||
"Return children of NODE that are not whitespace, newlines, or comments."
|
||||
(filter (lambda (child)
|
||||
(or (cst-node? child)
|
||||
(and (token? child)
|
||||
(not (memq (token-type child)
|
||||
'(whitespace newline line-comment
|
||||
block-comment))))))
|
||||
(cst-node-children node)))
|
||||
91
gulie/diagnostic.scm
Normal file
91
gulie/diagnostic.scm
Normal file
@@ -0,0 +1,91 @@
|
||||
;;; (gulie diagnostic) — diagnostic record type, sorting, formatting
|
||||
;;;
|
||||
;;; A <diagnostic> represents a single finding from a rule check.
|
||||
;;; Diagnostics are the universal currency between rules, the engine,
|
||||
;;; suppression filtering, and the reporter.
|
||||
|
||||
(define-module (gulie diagnostic)
|
||||
#:use-module (srfi srfi-9)
|
||||
#:use-module (ice-9 format)
|
||||
#:export (<diagnostic>
|
||||
make-diagnostic
|
||||
diagnostic?
|
||||
diagnostic-file
|
||||
diagnostic-line
|
||||
diagnostic-column
|
||||
diagnostic-severity
|
||||
diagnostic-rule
|
||||
diagnostic-message
|
||||
diagnostic-fix
|
||||
|
||||
<fix>
|
||||
make-fix
|
||||
fix?
|
||||
fix-type
|
||||
fix-line
|
||||
fix-column
|
||||
fix-end-line
|
||||
fix-end-column
|
||||
fix-replacement
|
||||
|
||||
diagnostic<?
|
||||
format-diagnostic
|
||||
format-diagnostics))
|
||||
|
||||
;; A single finding from a rule check.
|
||||
(define-record-type <diagnostic>
|
||||
(make-diagnostic file line column severity rule message fix)
|
||||
diagnostic?
|
||||
(file diagnostic-file) ;; string: file path
|
||||
(line diagnostic-line) ;; integer: 1-based line number
|
||||
(column diagnostic-column) ;; integer: 0-based column
|
||||
(severity diagnostic-severity) ;; symbol: error | warning | info
|
||||
(rule diagnostic-rule) ;; symbol: rule name
|
||||
(message diagnostic-message) ;; string: human-readable message
|
||||
(fix diagnostic-fix)) ;; <fix> | #f
|
||||
|
||||
;; An auto-fix that can be applied to resolve a diagnostic.
|
||||
(define-record-type <fix>
|
||||
(make-fix type line column end-line end-column replacement)
|
||||
fix?
|
||||
(type fix-type) ;; symbol: replace-line | delete-range | replace-range
|
||||
(line fix-line) ;; integer: 1-based
|
||||
(column fix-column) ;; integer: 0-based
|
||||
(end-line fix-end-line) ;; integer: 1-based
|
||||
(end-column fix-end-column) ;; integer: 0-based
|
||||
(replacement fix-replacement)) ;; string | #f
|
||||
|
||||
(define (diagnostic<? a b)
|
||||
"Compare two diagnostics by file, then line, then column."
|
||||
(let ((fa (diagnostic-file a))
|
||||
(fb (diagnostic-file b)))
|
||||
(cond
|
||||
((string<? fa fb) #t)
|
||||
((string>? fa fb) #f)
|
||||
(else
|
||||
(let ((la (diagnostic-line a))
|
||||
(lb (diagnostic-line b)))
|
||||
(cond
|
||||
((< la lb) #t)
|
||||
((> la lb) #f)
|
||||
(else (< (diagnostic-column a) (diagnostic-column b)))))))))
|
||||
|
||||
(define (severity->string sev)
|
||||
(symbol->string sev))
|
||||
|
||||
(define (format-diagnostic diag)
|
||||
"Format a diagnostic as file:line:column: severity: rule: message"
|
||||
(format #f "~a:~a:~a: ~a: ~a: ~a"
|
||||
(diagnostic-file diag)
|
||||
(diagnostic-line diag)
|
||||
(diagnostic-column diag)
|
||||
(severity->string (diagnostic-severity diag))
|
||||
(diagnostic-rule diag)
|
||||
(diagnostic-message diag)))
|
||||
|
||||
(define (format-diagnostics diags port)
|
||||
"Write all diagnostics to PORT, sorted by location."
|
||||
(for-each (lambda (d)
|
||||
(display (format-diagnostic d) port)
|
||||
(newline port))
|
||||
(sort diags diagnostic<?)))
|
||||
139
gulie/engine.scm
Normal file
139
gulie/engine.scm
Normal file
@@ -0,0 +1,139 @@
|
||||
;;; (gulie engine) — orchestrator: file discovery, pass sequencing
|
||||
;;;
|
||||
;;; The engine ties everything together: discovers files, runs rules
|
||||
;;; by type (line → cst → tree-il), collects diagnostics, applies
|
||||
;;; suppression, and reports results.
|
||||
|
||||
(define-module (gulie engine)
|
||||
#:use-module (ice-9 ftw)
|
||||
#:use-module (ice-9 rdelim)
|
||||
#:use-module (ice-9 regex)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:use-module (gulie diagnostic)
|
||||
#:use-module (gulie rule)
|
||||
#:use-module (gulie config)
|
||||
#:use-module (gulie suppression)
|
||||
#:export (lint-file
|
||||
lint-files
|
||||
discover-scheme-files))
|
||||
|
||||
(define (read-file-to-string path)
|
||||
"Read entire file at PATH into a string."
|
||||
(call-with-input-file path
|
||||
(lambda (port)
|
||||
(let lp ((acc '()))
|
||||
(let ((line (read-line port)))
|
||||
(if (eof-object? line)
|
||||
(string-join (reverse acc) "\n")
|
||||
(lp (cons line acc))))))))
|
||||
|
||||
(define (run-line-rules file lines config)
|
||||
"Run all line-type rules against LINES. Returns list of diagnostics."
|
||||
(let ((line-rules (rules-of-type 'line))
|
||||
(diagnostics '())
|
||||
(consecutive-blanks 0))
|
||||
(when (not (null? line-rules))
|
||||
(let lp ((remaining lines) (line-num 1))
|
||||
(when (not (null? remaining))
|
||||
(let* ((line-text (car remaining))
|
||||
(is-blank (or (string-null? line-text)
|
||||
(string-every char-whitespace? line-text))))
|
||||
(set! consecutive-blanks
|
||||
(if is-blank (1+ consecutive-blanks) 0))
|
||||
(let ((augmented-config
|
||||
(cons (cons '%consecutive-blanks consecutive-blanks)
|
||||
config)))
|
||||
(for-each
|
||||
(lambda (rule)
|
||||
(let ((results ((rule-check-proc rule)
|
||||
file line-num line-text augmented-config)))
|
||||
(set! diagnostics (append results diagnostics))))
|
||||
line-rules))
|
||||
(lp (cdr remaining) (1+ line-num))))))
|
||||
diagnostics))
|
||||
|
||||
(define (run-cst-rules file cst config)
|
||||
"Run all cst-type rules against CST. Returns list of diagnostics."
|
||||
(let ((cst-rules (rules-of-type 'cst)))
|
||||
(if (null? cst-rules)
|
||||
'()
|
||||
(append-map
|
||||
(lambda (rule)
|
||||
((rule-check-proc rule) file cst config))
|
||||
cst-rules))))
|
||||
|
||||
(define (lint-file file config)
|
||||
"Lint a single FILE with CONFIG. Returns a sorted list of diagnostics."
|
||||
(let* ((text (read-file-to-string file))
|
||||
(lines (string-split text #\newline))
|
||||
(diagnostics '()))
|
||||
;; Pass 1: line-based surface rules
|
||||
(set! diagnostics (append (run-line-rules file lines config)
|
||||
diagnostics))
|
||||
;; Pass 1b: CST rules (if tokenizer is loaded)
|
||||
;; Dynamically check if tokenizer module is available
|
||||
(let ((tok-mod (resolve-module '(gulie tokenizer) #:ensure #f)))
|
||||
(when tok-mod
|
||||
(let ((tokenize (module-ref tok-mod 'tokenize))
|
||||
(cst-mod (resolve-module '(gulie cst) #:ensure #f)))
|
||||
(when cst-mod
|
||||
(let* ((parse-cst (module-ref cst-mod 'parse-cst))
|
||||
(tokens (tokenize text file))
|
||||
(cst (parse-cst tokens)))
|
||||
(set! diagnostics (append (run-cst-rules file cst config)
|
||||
diagnostics)))))))
|
||||
;; Pass 2: semantic rules (if compiler module is loaded)
|
||||
(let ((comp-mod (resolve-module '(gulie compiler) #:ensure #f)))
|
||||
(when comp-mod
|
||||
(let ((compile-and-capture (module-ref comp-mod 'compile-and-capture-warnings)))
|
||||
(set! diagnostics (append (compile-and-capture file text config)
|
||||
diagnostics)))))
|
||||
;; Filter suppressions
|
||||
(let ((suppressions (parse-suppressions text)))
|
||||
(set! diagnostics (filter-suppressions diagnostics suppressions)))
|
||||
;; Sort by location
|
||||
(sort diagnostics diagnostic<?)))
|
||||
|
||||
(define (lint-files files config)
|
||||
"Lint multiple FILES. Returns total diagnostic count."
|
||||
(let ((total 0))
|
||||
(for-each
|
||||
(lambda (file)
|
||||
(let ((diags (lint-file file config)))
|
||||
(set! total (+ total (length diags)))
|
||||
(format-diagnostics diags (current-output-port))))
|
||||
files)
|
||||
total))
|
||||
|
||||
(define (scheme-file? path)
|
||||
"Is PATH a Scheme source file?"
|
||||
(let ((ext (and (string-index path #\.)
|
||||
(substring path (1+ (string-rindex path #\.))))))
|
||||
(and ext (member ext '("scm" "sld" "sls" "ss")))))
|
||||
|
||||
(define (discover-scheme-files paths ignore-patterns)
|
||||
"Expand PATHS into a list of .scm files, recursing into directories.
|
||||
IGNORE-PATTERNS is a list of glob-like strings (currently supports simple suffix matching)."
|
||||
(define (ignored? file)
|
||||
(any (lambda (pat)
|
||||
(string-contains file pat))
|
||||
ignore-patterns))
|
||||
(append-map
|
||||
(lambda (path)
|
||||
(cond
|
||||
((and (file-exists? path) (not (file-is-directory? path)))
|
||||
(if (and (scheme-file? path) (not (ignored? path)))
|
||||
(list path)
|
||||
'()))
|
||||
((file-is-directory? path)
|
||||
(let ((files '()))
|
||||
(ftw path
|
||||
(lambda (filename statinfo flag)
|
||||
(when (and (eq? flag 'regular)
|
||||
(scheme-file? filename)
|
||||
(not (ignored? filename)))
|
||||
(set! files (cons filename files)))
|
||||
#t))
|
||||
(sort files string<?)))
|
||||
(else '())))
|
||||
paths))
|
||||
61
gulie/rule.scm
Normal file
61
gulie/rule.scm
Normal file
@@ -0,0 +1,61 @@
|
||||
;;; (gulie rule) — rule record type, registry, convenience macros
|
||||
;;;
|
||||
;;; Rules are the units of analysis. Each rule has a name, metadata,
|
||||
;;; and a check procedure whose signature depends on the rule type.
|
||||
|
||||
(define-module (gulie rule)
|
||||
#:use-module (srfi srfi-9)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:export (<rule>
|
||||
make-rule
|
||||
rule?
|
||||
rule-name
|
||||
rule-description
|
||||
rule-severity
|
||||
rule-category
|
||||
rule-type
|
||||
rule-check-proc
|
||||
rule-fix-proc
|
||||
|
||||
register-rule!
|
||||
all-rules
|
||||
rules-of-type
|
||||
find-rule
|
||||
clear-rules!))
|
||||
|
||||
;; A lint/format rule.
|
||||
(define-record-type <rule>
|
||||
(make-rule name description severity category type check-proc fix-proc)
|
||||
rule?
|
||||
(name rule-name) ;; symbol
|
||||
(description rule-description) ;; string
|
||||
(severity rule-severity) ;; symbol: error | warning | info
|
||||
(category rule-category) ;; symbol: format | style | correctness | idiom
|
||||
(type rule-type) ;; symbol: line | cst | tree-il
|
||||
(check-proc rule-check-proc) ;; procedure
|
||||
(fix-proc rule-fix-proc)) ;; procedure | #f
|
||||
|
||||
;; Global rule registry.
|
||||
(define *rules* '())
|
||||
|
||||
(define (register-rule! rule)
|
||||
"Register a rule in the global registry."
|
||||
(set! *rules* (cons rule *rules*)))
|
||||
|
||||
(define (all-rules)
|
||||
"Return all registered rules."
|
||||
(reverse *rules*))
|
||||
|
||||
(define (rules-of-type type)
|
||||
"Return all registered rules of the given TYPE."
|
||||
(filter (lambda (r) (eq? (rule-type r) type))
|
||||
*rules*))
|
||||
|
||||
(define (find-rule name)
|
||||
"Find a rule by NAME, or #f."
|
||||
(find (lambda (r) (eq? (rule-name r) name))
|
||||
*rules*))
|
||||
|
||||
(define (clear-rules!)
|
||||
"Clear all registered rules. Useful for testing."
|
||||
(set! *rules* '()))
|
||||
71
gulie/rules/comments.scm
Normal file
71
gulie/rules/comments.scm
Normal file
@@ -0,0 +1,71 @@
|
||||
;;; (gulie rules comments) — comment style conventions
|
||||
;;;
|
||||
;;; Checks that comments follow standard Scheme conventions:
|
||||
;;; ; — inline comments (after code on same line)
|
||||
;;; ;; — line comments (own line, aligned with code)
|
||||
;;; ;;; — section/file-level comments
|
||||
;;; ;;;; — file headers
|
||||
|
||||
(define-module (gulie rules comments)
|
||||
#:use-module (gulie rule)
|
||||
#:use-module (gulie diagnostic))
|
||||
|
||||
;; Count leading semicolons in a comment string.
|
||||
(define (count-semicolons text)
|
||||
(let lp ((i 0))
|
||||
(if (and (< i (string-length text))
|
||||
(char=? (string-ref text i) #\;))
|
||||
(lp (1+ i))
|
||||
i)))
|
||||
|
||||
;; Is a line (before the comment) only whitespace?
|
||||
(define (comment-only-line? line-text comment-col)
|
||||
(let lp ((i 0))
|
||||
(cond
|
||||
((>= i comment-col) #t)
|
||||
((char-whitespace? (string-ref line-text i)) (lp (1+ i)))
|
||||
(else #f))))
|
||||
|
||||
(register-rule!
|
||||
(make-rule
|
||||
'comment-semicolons
|
||||
"Check comment semicolon count follows conventions"
|
||||
'info 'style 'line
|
||||
(lambda (file line-num line-text config)
|
||||
(let ((pos (string-index line-text #\;)))
|
||||
(if (not pos)
|
||||
'()
|
||||
;; Check if the semicolon is inside a string (rough heuristic:
|
||||
;; count quotes before the semicolon position)
|
||||
(let ((quotes-before (let lp ((i 0) (count 0) (in-escape #f))
|
||||
(cond
|
||||
((>= i pos) count)
|
||||
((and (not in-escape) (char=? (string-ref line-text i) #\\))
|
||||
(lp (1+ i) count #t))
|
||||
((and (not in-escape) (char=? (string-ref line-text i) #\"))
|
||||
(lp (1+ i) (1+ count) #f))
|
||||
(else (lp (1+ i) count #f))))))
|
||||
(if (odd? quotes-before)
|
||||
;; Inside a string — not a real comment
|
||||
'()
|
||||
(let* ((semis (count-semicolons (substring line-text pos)))
|
||||
(own-line? (comment-only-line? line-text pos)))
|
||||
(cond
|
||||
;; Inline comment (after code) should use single ;
|
||||
;; But we don't enforce this strictly — just flag ;;; or more inline
|
||||
((and (not own-line?) (>= semis 3))
|
||||
(list (make-diagnostic
|
||||
file line-num pos
|
||||
'info 'comment-semicolons
|
||||
"inline comments should use ; or ;; not ;;;"
|
||||
#f)))
|
||||
;; Own-line comment with single ; (should be ;;)
|
||||
((and own-line? (= semis 1) (> (string-length line-text) (1+ pos))
|
||||
(not (char=? (string-ref line-text (1+ pos)) #\!)))
|
||||
(list (make-diagnostic
|
||||
file line-num pos
|
||||
'info 'comment-semicolons
|
||||
"line comments should use ;; not ;"
|
||||
#f)))
|
||||
(else '()))))))))
|
||||
#f))
|
||||
93
gulie/rules/surface.scm
Normal file
93
gulie/rules/surface.scm
Normal file
@@ -0,0 +1,93 @@
|
||||
;;; (gulie rules surface) — surface-level line rules
|
||||
;;;
|
||||
;;; These rules operate on raw text lines. They need no parsing —
|
||||
;;; just the file path, line number, and line content.
|
||||
|
||||
(define-module (gulie rules surface)
|
||||
#:use-module (gulie rule)
|
||||
#:use-module (gulie diagnostic))
|
||||
|
||||
;;; trailing-whitespace — trailing spaces or tabs at end of line
|
||||
|
||||
(register-rule!
|
||||
(make-rule
|
||||
'trailing-whitespace
|
||||
"Line has trailing whitespace"
|
||||
'warning 'format 'line
|
||||
(lambda (file line-num line-text config)
|
||||
(let ((trimmed (string-trim-right line-text)))
|
||||
(if (and (not (string=? line-text trimmed))
|
||||
(> (string-length line-text) 0))
|
||||
(list (make-diagnostic
|
||||
file line-num
|
||||
(string-length trimmed)
|
||||
'warning 'trailing-whitespace
|
||||
"trailing whitespace"
|
||||
(make-fix 'replace-line line-num 0
|
||||
line-num (string-length line-text)
|
||||
trimmed)))
|
||||
'())))
|
||||
#f))
|
||||
|
||||
;;; line-length — line exceeds maximum width
|
||||
|
||||
(define (config-max-line-length config)
|
||||
(or (assq-ref config 'line-length) 80))
|
||||
|
||||
(register-rule!
|
||||
(make-rule
|
||||
'line-length
|
||||
"Line exceeds maximum length"
|
||||
'warning 'format 'line
|
||||
(lambda (file line-num line-text config)
|
||||
(let ((max-len (config-max-line-length config)))
|
||||
(if (> (string-length line-text) max-len)
|
||||
(list (make-diagnostic
|
||||
file line-num max-len
|
||||
'warning 'line-length
|
||||
(format #f "line exceeds ~a characters (~a)"
|
||||
max-len (string-length line-text))
|
||||
#f))
|
||||
'())))
|
||||
#f))
|
||||
|
||||
;;; no-tabs — tab characters in source
|
||||
|
||||
(register-rule!
|
||||
(make-rule
|
||||
'no-tabs
|
||||
"Tab character found in source"
|
||||
'warning 'format 'line
|
||||
(lambda (file line-num line-text config)
|
||||
(let ((pos (string-index line-text #\tab)))
|
||||
(if pos
|
||||
(list (make-diagnostic
|
||||
file line-num pos
|
||||
'warning 'no-tabs
|
||||
"tab character found; use spaces for indentation"
|
||||
#f))
|
||||
'())))
|
||||
#f))
|
||||
|
||||
;;; blank-lines — excessive consecutive blank lines
|
||||
|
||||
(register-rule!
|
||||
(make-rule
|
||||
'blank-lines
|
||||
"Excessive consecutive blank lines"
|
||||
'warning 'format 'line
|
||||
(lambda (file line-num line-text config)
|
||||
;; This rule uses a stateful approach: the engine tracks consecutive
|
||||
;; blank lines and passes the count via config. See engine.scm for
|
||||
;; the blank-line counting logic.
|
||||
(let ((max-blanks (or (assq-ref config 'max-blank-lines) 2))
|
||||
(consecutive (or (assq-ref config '%consecutive-blanks) 0)))
|
||||
(if (and (string-every char-whitespace? line-text)
|
||||
(> consecutive max-blanks))
|
||||
(list (make-diagnostic
|
||||
file line-num 0
|
||||
'warning 'blank-lines
|
||||
(format #f "more than ~a consecutive blank lines" max-blanks)
|
||||
#f))
|
||||
'())))
|
||||
#f))
|
||||
139
gulie/suppression.scm
Normal file
139
gulie/suppression.scm
Normal file
@@ -0,0 +1,139 @@
|
||||
;;; (gulie suppression) — inline suppression via comments
|
||||
;;;
|
||||
;;; Parses ; gulie:suppress and ; gulie:disable/enable directives
|
||||
;;; from raw source text. Returns a suppression map used to filter
|
||||
;;; diagnostics after all rules have run.
|
||||
|
||||
(define-module (gulie suppression)
|
||||
#:use-module (ice-9 regex)
|
||||
#:use-module (ice-9 rdelim)
|
||||
#:use-module (srfi srfi-1)
|
||||
#:use-module (gulie diagnostic)
|
||||
#:export (parse-suppressions
|
||||
filter-suppressions))
|
||||
|
||||
;; A suppression entry.
|
||||
;; line: 1-based line number this applies to
|
||||
;; rules: list of rule name symbols, or #t for "all rules"
|
||||
;; kind: 'this-line | 'next-line | 'region-start | 'region-end
|
||||
(define (make-suppression line rules kind)
|
||||
(list line rules kind))
|
||||
|
||||
(define (suppression-line s) (car s))
|
||||
(define (suppression-rules s) (cadr s))
|
||||
(define (suppression-kind s) (caddr s))
|
||||
|
||||
(define *suppress-re*
|
||||
(make-regexp ";+\\s*gulie:suppress\\s*(.*)$"))
|
||||
|
||||
(define *disable-re*
|
||||
(make-regexp ";+\\s*gulie:disable\\s+(.+)$"))
|
||||
|
||||
(define *enable-re*
|
||||
(make-regexp ";+\\s*gulie:enable\\s+(.+)$"))
|
||||
|
||||
(define (parse-rule-names str)
|
||||
"Parse space-separated rule names from STR. Empty → #t (all rules)."
|
||||
(let ((trimmed (string-trim-both str)))
|
||||
(if (string-null? trimmed)
|
||||
#t
|
||||
(map string->symbol (string-split trimmed #\space)))))
|
||||
|
||||
(define (parse-suppressions text)
|
||||
"Parse suppression directives from source TEXT.
|
||||
Returns a list of (line rules kind) entries."
|
||||
(let ((lines (string-split text #\newline))
|
||||
(result '()))
|
||||
(let lp ((lines lines) (line-num 1) (acc '()))
|
||||
(if (null? lines)
|
||||
(reverse acc)
|
||||
(let ((line (car lines)))
|
||||
(cond
|
||||
;; ; gulie:suppress [rules...] — on the SAME line if code precedes it,
|
||||
;; or on the NEXT line if the line is comment-only
|
||||
((regexp-exec *suppress-re* line)
|
||||
=> (lambda (m)
|
||||
(let* ((rules (parse-rule-names (match:substring m 1)))
|
||||
(trimmed (string-trim line))
|
||||
;; If line starts with ;, it's comment-only → suppress next line
|
||||
(kind (if (char=? (string-ref trimmed 0) #\;)
|
||||
'next-line
|
||||
'this-line))
|
||||
(target-line (if (eq? kind 'next-line)
|
||||
(1+ line-num)
|
||||
line-num)))
|
||||
(lp (cdr lines) (1+ line-num)
|
||||
(cons (make-suppression target-line rules kind) acc)))))
|
||||
;; ; gulie:disable rule — region start
|
||||
((regexp-exec *disable-re* line)
|
||||
=> (lambda (m)
|
||||
(let ((rules (parse-rule-names (match:substring m 1))))
|
||||
(lp (cdr lines) (1+ line-num)
|
||||
(cons (make-suppression line-num rules 'region-start) acc)))))
|
||||
;; ; gulie:enable rule — region end
|
||||
((regexp-exec *enable-re* line)
|
||||
=> (lambda (m)
|
||||
(let ((rules (parse-rule-names (match:substring m 1))))
|
||||
(lp (cdr lines) (1+ line-num)
|
||||
(cons (make-suppression line-num rules 'region-end) acc)))))
|
||||
(else
|
||||
(lp (cdr lines) (1+ line-num) acc))))))))
|
||||
|
||||
(define (build-suppression-set suppressions)
|
||||
"Build a procedure (line rule-name) -> #t if suppressed."
|
||||
;; Point suppressions: hash of line-num -> rules
|
||||
(let ((point-map (make-hash-table))
|
||||
(regions '()))
|
||||
;; Collect point suppressions and regions
|
||||
(for-each
|
||||
(lambda (s)
|
||||
(case (suppression-kind s)
|
||||
((this-line next-line)
|
||||
(hashv-set! point-map (suppression-line s)
|
||||
(suppression-rules s)))
|
||||
((region-start)
|
||||
(set! regions (cons s regions)))))
|
||||
suppressions)
|
||||
;; Build region intervals
|
||||
(let ((region-intervals
|
||||
(let lp ((remaining (reverse regions)) (intervals '()))
|
||||
(if (null? remaining)
|
||||
intervals
|
||||
(let* ((start (car remaining))
|
||||
;; Find matching end
|
||||
(end-entry (find (lambda (s)
|
||||
(and (eq? (suppression-kind s) 'region-end)
|
||||
(> (suppression-line s) (suppression-line start))
|
||||
(equal? (suppression-rules s)
|
||||
(suppression-rules start))))
|
||||
suppressions)))
|
||||
(lp (cdr remaining)
|
||||
(cons (list (suppression-line start)
|
||||
(if end-entry (suppression-line end-entry) 999999)
|
||||
(suppression-rules start))
|
||||
intervals)))))))
|
||||
;; Return predicate
|
||||
(lambda (line rule-name)
|
||||
(or
|
||||
;; Check point suppressions
|
||||
(let ((rules (hashv-ref point-map line)))
|
||||
(and rules
|
||||
(or (eq? rules #t)
|
||||
(memq rule-name rules))))
|
||||
;; Check region suppressions
|
||||
(any (lambda (interval)
|
||||
(and (>= line (car interval))
|
||||
(<= line (cadr interval))
|
||||
(let ((rules (caddr interval)))
|
||||
(or (eq? rules #t)
|
||||
(memq rule-name rules)))))
|
||||
region-intervals))))))
|
||||
|
||||
(define (filter-suppressions diagnostics suppressions)
|
||||
"Remove diagnostics that are suppressed."
|
||||
(if (null? suppressions)
|
||||
diagnostics
|
||||
(let ((suppressed? (build-suppression-set suppressions)))
|
||||
(filter (lambda (d)
|
||||
(not (suppressed? (diagnostic-line d) (diagnostic-rule d))))
|
||||
diagnostics))))
|
||||
348
gulie/tokenizer.scm
Normal file
348
gulie/tokenizer.scm
Normal file
@@ -0,0 +1,348 @@
|
||||
;;; (gulie tokenizer) — hand-written lexer preserving all tokens
|
||||
;;;
|
||||
;;; Tokenises Guile Scheme source code into a flat list of <token>
|
||||
;;; records, preserving whitespace, comments, and exact source text.
|
||||
;;;
|
||||
;;; Critical invariant:
|
||||
;;; (string-concatenate (map token-text (tokenize input)))
|
||||
;;; ≡ input
|
||||
|
||||
(define-module (gulie tokenizer)
|
||||
#:use-module (srfi srfi-9)
|
||||
#:use-module (srfi srfi-11)
|
||||
#:export (<token>
|
||||
make-token
|
||||
token?
|
||||
token-type
|
||||
token-text
|
||||
token-line
|
||||
token-column
|
||||
|
||||
tokenize))
|
||||
|
||||
(define-record-type <token>
|
||||
(make-token type text line column)
|
||||
token?
|
||||
(type token-type) ;; symbol
|
||||
(text token-text) ;; string: exact source text
|
||||
(line token-line) ;; integer: 1-based
|
||||
(column token-column)) ;; integer: 0-based
|
||||
|
||||
(define (tokenize input filename)
|
||||
"Tokenise INPUT string. Returns a list of <token> records.
|
||||
FILENAME is used only for error messages."
|
||||
(let ((port (open-input-string input))
|
||||
(tokens '())
|
||||
(line 1)
|
||||
(col 0))
|
||||
|
||||
(define (peek) (peek-char port))
|
||||
|
||||
(define (advance!)
|
||||
(let ((ch (read-char port)))
|
||||
(cond
|
||||
((eof-object? ch) ch)
|
||||
((char=? ch #\newline)
|
||||
(set! line (1+ line))
|
||||
(set! col 0)
|
||||
ch)
|
||||
(else
|
||||
(set! col (1+ col))
|
||||
ch))))
|
||||
|
||||
(define (emit! type text start-line start-col)
|
||||
(set! tokens (cons (make-token type text start-line start-col)
|
||||
tokens)))
|
||||
|
||||
(define (collect-while first pred)
|
||||
"Collect chars starting with FIRST while PRED holds on peek."
|
||||
(let lp ((acc (list first)))
|
||||
(let ((ch (peek)))
|
||||
(if (and (not (eof-object? ch)) (pred ch))
|
||||
(begin (advance!) (lp (cons ch acc)))
|
||||
(list->string (reverse acc))))))
|
||||
|
||||
(define (char-hex? ch)
|
||||
(or (char-numeric? ch)
|
||||
(memv ch '(#\a #\b #\c #\d #\e #\f
|
||||
#\A #\B #\C #\D #\E #\F))))
|
||||
|
||||
(define (delimiter? ch)
|
||||
(or (eof-object? ch)
|
||||
(char-whitespace? ch)
|
||||
(memv ch '(#\( #\) #\[ #\] #\{ #\} #\" #\; #\#))))
|
||||
|
||||
(define (read-string-literal)
|
||||
"Read string body after opening quote. Returns full string including quotes."
|
||||
(let lp ((acc (list #\")))
|
||||
(let ((ch (advance!)))
|
||||
(cond
|
||||
((eof-object? ch)
|
||||
(list->string (reverse acc)))
|
||||
((char=? ch #\\)
|
||||
(let ((next (advance!)))
|
||||
(if (eof-object? next)
|
||||
(list->string (reverse (cons ch acc)))
|
||||
(lp (cons next (cons ch acc))))))
|
||||
((char=? ch #\")
|
||||
(list->string (reverse (cons ch acc))))
|
||||
(else
|
||||
(lp (cons ch acc)))))))
|
||||
|
||||
(define (read-line-comment first)
|
||||
"Read from FIRST (;) to end of line, not including the newline."
|
||||
(collect-while first (lambda (ch) (not (char=? ch #\newline)))))
|
||||
|
||||
(define (read-block-comment)
|
||||
"Read block comment body after #|. Returns full text including #| and |#."
|
||||
(let lp ((acc (list #\| #\#)) (depth 1))
|
||||
(let ((ch (advance!)))
|
||||
(cond
|
||||
((eof-object? ch)
|
||||
(list->string (reverse acc)))
|
||||
((and (char=? ch #\|) (eqv? (peek) #\#))
|
||||
(let ((hash (advance!)))
|
||||
(if (= depth 1)
|
||||
(list->string (reverse (cons hash (cons ch acc))))
|
||||
(lp (cons hash (cons ch acc)) (1- depth)))))
|
||||
((and (char=? ch #\#) (eqv? (peek) #\|))
|
||||
(let ((pipe (advance!)))
|
||||
(lp (cons pipe (cons ch acc)) (1+ depth))))
|
||||
(else
|
||||
(lp (cons ch acc) depth))))))
|
||||
|
||||
(define (read-character-literal)
|
||||
"Read character literal after #\\. Returns full text including #\\."
|
||||
(let ((ch (advance!)))
|
||||
(cond
|
||||
((eof-object? ch) "#\\")
|
||||
((and (char-alphabetic? ch)
|
||||
(let ((pk (peek)))
|
||||
(and (not (eof-object? pk))
|
||||
(char-alphabetic? pk))))
|
||||
(string-append "#\\"
|
||||
(collect-while ch char-alphabetic?)))
|
||||
((and (char=? ch #\x)
|
||||
(let ((pk (peek)))
|
||||
(and (not (eof-object? pk))
|
||||
(char-hex? pk))))
|
||||
(string-append "#\\"
|
||||
(collect-while ch
|
||||
(lambda (c) (or (char-hex? c) (char=? c #\x))))))
|
||||
(else
|
||||
(string-append "#\\" (string ch))))))
|
||||
|
||||
(define (read-shebang)
|
||||
"Read shebang/directive after #!. Returns full text."
|
||||
(let lp ((acc (list #\! #\#)))
|
||||
(let ((c (advance!)))
|
||||
(cond
|
||||
((eof-object? c)
|
||||
(list->string (reverse acc)))
|
||||
((and (char=? c #\!) (eqv? (peek) #\#))
|
||||
(let ((h (advance!)))
|
||||
(list->string (reverse (cons h (cons c acc))))))
|
||||
(else
|
||||
(lp (cons c acc)))))))
|
||||
|
||||
(define (read-other-sharp next)
|
||||
"Read # sequence where NEXT is first char after #. Returns full text."
|
||||
(let ((text (string-append "#"
|
||||
(collect-while next
|
||||
(lambda (c) (not (delimiter? c)))))))
|
||||
(if (eqv? (peek) #\()
|
||||
(begin (advance!)
|
||||
(values 'special (string-append text "(")))
|
||||
(values 'special text))))
|
||||
|
||||
;; Main tokenisation loop
|
||||
(let lp ()
|
||||
(let ((ch (peek)))
|
||||
(cond
|
||||
((eof-object? ch)
|
||||
(reverse tokens))
|
||||
|
||||
;; Newline
|
||||
((char=? ch #\newline)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'newline "\n" sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Whitespace (non-newline)
|
||||
((char-whitespace? ch)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'whitespace
|
||||
(collect-while ch
|
||||
(lambda (c) (and (char-whitespace? c)
|
||||
(not (char=? c #\newline)))))
|
||||
sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Line comment
|
||||
((char=? ch #\;)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'line-comment (read-line-comment ch) sl sc)
|
||||
(lp)))
|
||||
|
||||
;; String
|
||||
((char=? ch #\")
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'string (read-string-literal) sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Open paren
|
||||
((memv ch '(#\( #\[ #\{))
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'open-paren (string ch) sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Close paren
|
||||
((memv ch '(#\) #\] #\}))
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'close-paren (string ch) sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Quote
|
||||
((char=? ch #\')
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'prefix "'" sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Quasiquote
|
||||
((char=? ch #\`)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(emit! 'prefix "`" sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Unquote / unquote-splicing
|
||||
((char=? ch #\,)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(cond
|
||||
((eqv? (peek) #\@)
|
||||
(advance!)
|
||||
(emit! 'prefix ",@" sl sc))
|
||||
(else
|
||||
(emit! 'prefix "," sl sc)))
|
||||
(lp)))
|
||||
|
||||
;; Sharp sequences
|
||||
((char=? ch #\#)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!) ;; consume #
|
||||
(let ((next (peek)))
|
||||
(cond
|
||||
;; Block comment #|...|#
|
||||
((eqv? next #\|)
|
||||
(advance!)
|
||||
(emit! 'block-comment (read-block-comment) sl sc)
|
||||
(lp))
|
||||
|
||||
;; Datum comment #;
|
||||
((eqv? next #\;)
|
||||
(advance!)
|
||||
(emit! 'special "#;" sl sc)
|
||||
(lp))
|
||||
|
||||
;; Boolean #t, #f, #true, #false
|
||||
((or (eqv? next #\t) (eqv? next #\f))
|
||||
(advance!)
|
||||
(let* ((rest (if (and (not (eof-object? (peek)))
|
||||
(char-alphabetic? (peek)))
|
||||
(collect-while next char-alphabetic?)
|
||||
(string next)))
|
||||
(text (string-append "#" rest)))
|
||||
(emit! 'boolean text sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Character literal #\x
|
||||
((eqv? next #\\)
|
||||
(advance!)
|
||||
(emit! 'character (read-character-literal) sl sc)
|
||||
(lp))
|
||||
|
||||
;; Keyword #:foo
|
||||
((eqv? next #\:)
|
||||
(advance!)
|
||||
(let ((name (if (and (not (eof-object? (peek)))
|
||||
(not (delimiter? (peek))))
|
||||
(collect-while (advance!)
|
||||
(lambda (c) (not (delimiter? c))))
|
||||
"")))
|
||||
(emit! 'keyword (string-append "#:" name) sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Syntax shorthands: #', #`, #,, #,@
|
||||
((eqv? next #\')
|
||||
(advance!)
|
||||
(emit! 'prefix "#'" sl sc)
|
||||
(lp))
|
||||
((eqv? next #\`)
|
||||
(advance!)
|
||||
(emit! 'prefix "#`" sl sc)
|
||||
(lp))
|
||||
((eqv? next #\,)
|
||||
(advance!)
|
||||
(cond
|
||||
((eqv? (peek) #\@)
|
||||
(advance!)
|
||||
(emit! 'prefix "#,@" sl sc))
|
||||
(else
|
||||
(emit! 'prefix "#," sl sc)))
|
||||
(lp))
|
||||
|
||||
;; Vector #(
|
||||
((eqv? next #\()
|
||||
(advance!)
|
||||
(emit! 'special "#(" sl sc)
|
||||
(lp))
|
||||
|
||||
;; Shebang #!...!#
|
||||
((eqv? next #\!)
|
||||
(advance!)
|
||||
(emit! 'block-comment (read-shebang) sl sc)
|
||||
(lp))
|
||||
|
||||
;; Other # sequences: #vu8(, #*, etc.
|
||||
((and (not (eof-object? next))
|
||||
(not (delimiter? next)))
|
||||
(advance!)
|
||||
(let-values (((type text) (read-other-sharp next)))
|
||||
(emit! type text sl sc)
|
||||
(lp)))
|
||||
|
||||
;; Bare # at delimiter boundary
|
||||
(else
|
||||
(emit! 'symbol "#" sl sc)
|
||||
(lp))))))
|
||||
|
||||
;; Dot
|
||||
((char=? ch #\.)
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(if (delimiter? (peek))
|
||||
(begin
|
||||
(emit! 'dot "." sl sc)
|
||||
(lp))
|
||||
(let ((text (collect-while ch
|
||||
(lambda (c) (not (delimiter? c))))))
|
||||
(emit! 'symbol text sl sc)
|
||||
(lp)))))
|
||||
|
||||
;; Everything else: symbol or number
|
||||
(else
|
||||
(let ((sl line) (sc col))
|
||||
(advance!)
|
||||
(let ((text (collect-while ch
|
||||
(lambda (c) (not (delimiter? c))))))
|
||||
(emit! (if (string->number text) 'number 'symbol)
|
||||
text sl sc)
|
||||
(lp)))))))))
|
||||
13
test/fixtures/clean/well-formatted.scm
vendored
Normal file
13
test/fixtures/clean/well-formatted.scm
vendored
Normal file
@@ -0,0 +1,13 @@
|
||||
;;; A well-formatted Guile source file.
|
||||
;;; This should produce zero surface diagnostics.
|
||||
|
||||
(define-module (test well-formatted)
|
||||
#:export (greet add))
|
||||
|
||||
;; Greet a person by name.
|
||||
(define (greet name)
|
||||
(string-append "Hello, " name "!"))
|
||||
|
||||
;; Add two numbers.
|
||||
(define (add a b)
|
||||
(+ a b))
|
||||
9
test/fixtures/violations/semantic.scm
vendored
Normal file
9
test/fixtures/violations/semantic.scm
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
(define-module (test semantic)
|
||||
#:use-module (ice-9 format))
|
||||
|
||||
(define (foo x y)
|
||||
(let ((unused 42))
|
||||
(+ x 1)))
|
||||
|
||||
(define (bar a)
|
||||
(baz a))
|
||||
12
test/fixtures/violations/surface.scm
vendored
Normal file
12
test/fixtures/violations/surface.scm
vendored
Normal file
@@ -0,0 +1,12 @@
|
||||
(define x 42)
|
||||
(define y "hello")
|
||||
(define z (+ x y))
|
||||
|
||||
;; This line is fine
|
||||
(define (long-function-name-that-exceeds-the-default-eighty-character-limit arg1 arg2 arg3 arg4 arg5)
|
||||
(+ arg1 arg2))
|
||||
|
||||
|
||||
|
||||
;; After too many blank lines
|
||||
(define w 99)
|
||||
34
test/run-tests.scm
Normal file
34
test/run-tests.scm
Normal file
@@ -0,0 +1,34 @@
|
||||
#!/usr/bin/env -S guile --no-auto-compile -s
|
||||
!#
|
||||
;;; Test runner for gulie
|
||||
|
||||
;; Add project root to load path
|
||||
(let ((dir (dirname (dirname (current-filename)))))
|
||||
(set! %load-path (cons dir %load-path)))
|
||||
|
||||
(use-modules (srfi srfi-64))
|
||||
|
||||
;; Configure test runner for CI-friendly output
|
||||
(test-runner-current
|
||||
(let ((runner (test-runner-simple)))
|
||||
(test-runner-on-final! runner
|
||||
(lambda (runner)
|
||||
(let ((pass (test-runner-pass-count runner))
|
||||
(fail (test-runner-fail-count runner))
|
||||
(skip (test-runner-skip-count runner)))
|
||||
(newline)
|
||||
(format #t "Results: ~a passed, ~a failed, ~a skipped~%"
|
||||
pass fail skip)
|
||||
(when (> fail 0)
|
||||
(exit 1)))))
|
||||
runner))
|
||||
|
||||
;; Load and run all test files (paths relative to project root)
|
||||
(let ((root (dirname (dirname (current-filename)))))
|
||||
(define (load-test name)
|
||||
(load (string-append root "/test/" name)))
|
||||
(load-test "test-tokenizer.scm")
|
||||
(load-test "test-cst.scm")
|
||||
(load-test "test-rules.scm")
|
||||
(load-test "test-suppression.scm")
|
||||
(load-test "test-compiler.scm"))
|
||||
56
test/test-compiler.scm
Normal file
56
test/test-compiler.scm
Normal file
@@ -0,0 +1,56 @@
|
||||
;;; Tests for (gulie compiler) — semantic analysis pass
|
||||
|
||||
(use-modules (srfi srfi-64)
|
||||
(srfi srfi-1)
|
||||
(gulie compiler)
|
||||
(gulie diagnostic))
|
||||
|
||||
(test-begin "compiler")
|
||||
|
||||
(test-group "unused-variable"
|
||||
(let ((diags (compile-and-capture-warnings
|
||||
"test.scm"
|
||||
"(define (foo x)\n (let ((unused 42))\n x))\n"
|
||||
'())))
|
||||
(test-assert "detects unused variable"
|
||||
(any (lambda (d) (eq? (diagnostic-rule d) 'unused-variable))
|
||||
diags))))
|
||||
|
||||
(test-group "unbound-variable"
|
||||
(let ((diags (compile-and-capture-warnings
|
||||
"test.scm"
|
||||
"(define (foo x)\n (+ x unknown-thing))\n"
|
||||
'())))
|
||||
(test-assert "detects unbound variable"
|
||||
(any (lambda (d) (eq? (diagnostic-rule d) 'unbound-variable))
|
||||
diags))))
|
||||
|
||||
(test-group "arity-mismatch"
|
||||
(let ((diags (compile-and-capture-warnings
|
||||
"test.scm"
|
||||
"(define (foo x) x)\n(define (bar) (foo 1 2 3))\n"
|
||||
'())))
|
||||
(test-assert "detects arity mismatch"
|
||||
(any (lambda (d) (eq? (diagnostic-rule d) 'arity-mismatch))
|
||||
diags))))
|
||||
|
||||
(test-group "clean-code"
|
||||
(let ((diags (compile-and-capture-warnings
|
||||
"test.scm"
|
||||
"(define (foo x) (+ x 1))\n"
|
||||
'())))
|
||||
;; May have unused-toplevel but no real errors
|
||||
(test-assert "no compile errors"
|
||||
(not (any (lambda (d) (eq? (diagnostic-severity d) 'error))
|
||||
diags)))))
|
||||
|
||||
(test-group "syntax-error"
|
||||
(let ((diags (compile-and-capture-warnings
|
||||
"test.scm"
|
||||
"(define (foo x) (+ x"
|
||||
'())))
|
||||
(test-assert "catches syntax error"
|
||||
(any (lambda (d) (eq? (diagnostic-severity d) 'error))
|
||||
diags))))
|
||||
|
||||
(test-end "compiler")
|
||||
59
test/test-cst.scm
Normal file
59
test/test-cst.scm
Normal file
@@ -0,0 +1,59 @@
|
||||
;;; Tests for (gulie cst)
|
||||
|
||||
(use-modules (srfi srfi-1)
|
||||
(srfi srfi-64)
|
||||
(gulie tokenizer)
|
||||
(gulie cst))
|
||||
|
||||
(test-begin "cst")
|
||||
|
||||
(test-group "basic-parsing"
|
||||
(let* ((tokens (tokenize "(define x 42)" "test.scm"))
|
||||
(cst (parse-cst tokens)))
|
||||
(test-assert "root is cst-node" (cst-node? cst))
|
||||
(test-assert "root has no open paren" (not (cst-node-open cst)))
|
||||
(let ((sig (cst-significant-children cst)))
|
||||
(test-equal "one top-level form" 1 (length sig))
|
||||
(test-assert "top-level is cst-node" (cst-node? (car sig))))))
|
||||
|
||||
(test-group "form-name"
|
||||
(let* ((tokens (tokenize "(define x 42)" "test.scm"))
|
||||
(cst (parse-cst tokens))
|
||||
(form (car (cst-significant-children cst))))
|
||||
(test-equal "form name is define" "define" (cst-form-name form))))
|
||||
|
||||
(test-group "nested-forms"
|
||||
(let* ((tokens (tokenize "(let ((x 1)) (+ x 2))" "test.scm"))
|
||||
(cst (parse-cst tokens))
|
||||
(form (car (cst-significant-children cst))))
|
||||
(test-equal "form name is let" "let" (cst-form-name form))
|
||||
;; Should have nested cst-nodes for ((x 1)) and (+ x 2)
|
||||
(let ((inner-nodes (filter cst-node? (cst-node-children form))))
|
||||
(test-assert "has nested nodes" (>= (length inner-nodes) 2)))))
|
||||
|
||||
(test-group "multiple-top-level"
|
||||
(let* ((tokens (tokenize "(define a 1)\n(define b 2)\n(define c 3)" "test.scm"))
|
||||
(cst (parse-cst tokens))
|
||||
(sig (cst-significant-children cst)))
|
||||
(test-equal "three top-level forms" 3 (length sig))))
|
||||
|
||||
(test-group "comments-preserved"
|
||||
(let* ((tokens (tokenize ";; header\n(define x 1)\n" "test.scm"))
|
||||
(cst (parse-cst tokens))
|
||||
(children (cst-node-children cst)))
|
||||
;; Should include the comment as a token child
|
||||
(test-assert "has comment token"
|
||||
(any (lambda (c)
|
||||
(and (token? c) (eq? (token-type c) 'line-comment)))
|
||||
children))))
|
||||
|
||||
(test-group "prefix-handling"
|
||||
(let* ((tokens (tokenize "'(1 2 3)" "test.scm"))
|
||||
(cst (parse-cst tokens))
|
||||
(children (cst-node-children cst)))
|
||||
(test-assert "has prefix token"
|
||||
(any (lambda (c)
|
||||
(and (token? c) (eq? (token-type c) 'prefix)))
|
||||
children))))
|
||||
|
||||
(test-end "cst")
|
||||
104
test/test-rules.scm
Normal file
104
test/test-rules.scm
Normal file
@@ -0,0 +1,104 @@
|
||||
;;; Tests for rule modules
|
||||
|
||||
(use-modules (srfi srfi-64)
|
||||
(srfi srfi-1)
|
||||
(gulie rule)
|
||||
(gulie diagnostic)
|
||||
(gulie rules surface)
|
||||
(gulie rules comments))
|
||||
|
||||
(test-begin "rules")
|
||||
|
||||
;;; Surface rules
|
||||
|
||||
(test-group "trailing-whitespace"
|
||||
(let ((rule (find-rule 'trailing-whitespace)))
|
||||
(test-assert "rule registered" rule)
|
||||
|
||||
(test-equal "clean line produces no diagnostics"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 1 "(define x 42)" '()))
|
||||
|
||||
(let ((diags ((rule-check-proc rule) "f.scm" 1 "(define x 42) " '())))
|
||||
(test-equal "trailing spaces detected" 1 (length diags))
|
||||
(test-equal "correct column"
|
||||
(string-length "(define x 42)")
|
||||
(diagnostic-column (car diags))))
|
||||
|
||||
(test-equal "empty line no diagnostic"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 1 "" '()))))
|
||||
|
||||
(test-group "line-length"
|
||||
(let ((rule (find-rule 'line-length)))
|
||||
(test-assert "rule registered" rule)
|
||||
|
||||
(test-equal "short line ok"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 1 "(define x 42)" '()))
|
||||
|
||||
(let* ((long-line (make-string 81 #\x))
|
||||
(diags ((rule-check-proc rule) "f.scm" 1 long-line '())))
|
||||
(test-equal "long line detected" 1 (length diags)))
|
||||
|
||||
(let* ((config '((line-length . 120)))
|
||||
(line (make-string 100 #\x))
|
||||
(diags ((rule-check-proc rule) "f.scm" 1 line config)))
|
||||
(test-equal "respects config" 0 (length diags)))))
|
||||
|
||||
(test-group "no-tabs"
|
||||
(let ((rule (find-rule 'no-tabs)))
|
||||
(test-assert "rule registered" rule)
|
||||
|
||||
(test-equal "no tabs ok"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 1 " (define x 1)" '()))
|
||||
|
||||
(let ((diags ((rule-check-proc rule) "f.scm" 1 "\t(define x 1)" '())))
|
||||
(test-equal "tab detected" 1 (length diags)))))
|
||||
|
||||
(test-group "blank-lines"
|
||||
(let ((rule (find-rule 'blank-lines)))
|
||||
(test-assert "rule registered" rule)
|
||||
|
||||
(test-equal "normal blank ok"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 5 ""
|
||||
'((max-blank-lines . 2) (%consecutive-blanks . 1))))
|
||||
|
||||
(let ((diags ((rule-check-proc rule) "f.scm" 5 ""
|
||||
'((max-blank-lines . 2) (%consecutive-blanks . 3)))))
|
||||
(test-equal "excessive blanks detected" 1 (length diags)))))
|
||||
|
||||
;;; Comment rules
|
||||
|
||||
(test-group "comment-semicolons"
|
||||
(let ((rule (find-rule 'comment-semicolons)))
|
||||
(test-assert "rule registered" rule)
|
||||
|
||||
(test-equal "double semicolon on own line ok"
|
||||
'()
|
||||
((rule-check-proc rule) "f.scm" 1 " ;; good comment" '()))
|
||||
|
||||
;; Single semicolon on own line
|
||||
(let ((diags ((rule-check-proc rule) "f.scm" 1 " ; bad comment" '())))
|
||||
(test-equal "single ; on own line flagged" 1 (length diags)))))
|
||||
|
||||
;;; Diagnostic formatting
|
||||
|
||||
(test-group "diagnostic-format"
|
||||
(let ((d (make-diagnostic "foo.scm" 10 5 'warning 'test-rule "oops" #f)))
|
||||
(test-equal "format matches expected"
|
||||
"foo.scm:10:5: warning: test-rule: oops"
|
||||
(format-diagnostic d))))
|
||||
|
||||
(test-group "diagnostic-sorting"
|
||||
(let ((d1 (make-diagnostic "a.scm" 10 0 'warning 'r "m" #f))
|
||||
(d2 (make-diagnostic "a.scm" 5 0 'warning 'r "m" #f))
|
||||
(d3 (make-diagnostic "b.scm" 1 0 'warning 'r "m" #f)))
|
||||
(let ((sorted (sort (list d1 d2 d3) diagnostic<?)))
|
||||
(test-equal "first is a.scm:5" 5 (diagnostic-line (car sorted)))
|
||||
(test-equal "second is a.scm:10" 10 (diagnostic-line (cadr sorted)))
|
||||
(test-equal "third is b.scm" "b.scm" (diagnostic-file (caddr sorted))))))
|
||||
|
||||
(test-end "rules")
|
||||
43
test/test-suppression.scm
Normal file
43
test/test-suppression.scm
Normal file
@@ -0,0 +1,43 @@
|
||||
;;; Tests for (gulie suppression)
|
||||
|
||||
(use-modules (srfi srfi-64)
|
||||
(gulie suppression)
|
||||
(gulie diagnostic))
|
||||
|
||||
(test-begin "suppression")
|
||||
|
||||
(test-group "parse-inline-suppress"
|
||||
(let ((supps (parse-suppressions
|
||||
"(define x 1) ; gulie:suppress trailing-whitespace\n")))
|
||||
(test-equal "one suppression" 1 (length supps))
|
||||
(test-equal "this-line kind" 'this-line (caddr (car supps)))))
|
||||
|
||||
(test-group "parse-next-line-suppress"
|
||||
(let ((supps (parse-suppressions
|
||||
";; gulie:suppress line-length\n(define x 1)\n")))
|
||||
(test-equal "one suppression" 1 (length supps))
|
||||
(test-equal "targets line 2" 2 (car (car supps)))))
|
||||
|
||||
(test-group "parse-suppress-all"
|
||||
(let ((supps (parse-suppressions
|
||||
"(define x 1) ; gulie:suppress\n")))
|
||||
(test-equal "one suppression" 1 (length supps))
|
||||
(test-eq "all rules" #t (cadr (car supps)))))
|
||||
|
||||
(test-group "filter-diagnostics"
|
||||
(let ((diags (list (make-diagnostic "f.scm" 1 0 'warning 'trailing-whitespace "tw" #f)
|
||||
(make-diagnostic "f.scm" 2 0 'warning 'line-length "ll" #f)))
|
||||
(supps (parse-suppressions
|
||||
"(define x 1) ; gulie:suppress trailing-whitespace\n(define y 2)\n")))
|
||||
(let ((filtered (filter-suppressions diags supps)))
|
||||
(test-equal "one diagnostic filtered" 1 (length filtered))
|
||||
(test-eq "remaining is line-length" 'line-length
|
||||
(diagnostic-rule (car filtered))))))
|
||||
|
||||
(test-group "region-suppression"
|
||||
(let ((supps (parse-suppressions
|
||||
";; gulie:disable line-length\n(define x 1)\n(define y 2)\n;; gulie:enable line-length\n(define z 3)\n")))
|
||||
;; Should have region-start and region-end
|
||||
(test-assert "has region entries" (>= (length supps) 2))))
|
||||
|
||||
(test-end "suppression")
|
||||
127
test/test-tokenizer.scm
Normal file
127
test/test-tokenizer.scm
Normal file
@@ -0,0 +1,127 @@
|
||||
;;; Tests for (gulie tokenizer)
|
||||
|
||||
(use-modules (srfi srfi-1)
|
||||
(srfi srfi-64)
|
||||
(ice-9 rdelim)
|
||||
(gulie tokenizer))
|
||||
|
||||
(test-begin "tokenizer")
|
||||
|
||||
;;; Roundtrip invariant — the most critical test
|
||||
|
||||
(test-group "roundtrip"
|
||||
(define (roundtrip-ok? input)
|
||||
(let* ((tokens (tokenize input "test.scm"))
|
||||
(result (string-concatenate (map token-text tokens))))
|
||||
(string=? input result)))
|
||||
|
||||
(test-assert "empty input"
|
||||
(roundtrip-ok? ""))
|
||||
|
||||
(test-assert "simple expression"
|
||||
(roundtrip-ok? "(define x 42)"))
|
||||
|
||||
(test-assert "nested expressions"
|
||||
(roundtrip-ok? "(define (foo x)\n (+ x 1))\n"))
|
||||
|
||||
(test-assert "string with escapes"
|
||||
(roundtrip-ok? "(define s \"hello \\\"world\\\"\")"))
|
||||
|
||||
(test-assert "line comment"
|
||||
(roundtrip-ok? ";; a comment\n(define x 1)\n"))
|
||||
|
||||
(test-assert "block comment"
|
||||
(roundtrip-ok? "#| block\ncomment |#\n(define x 1)"))
|
||||
|
||||
(test-assert "nested block comment"
|
||||
(roundtrip-ok? "#| outer #| inner |# outer |#"))
|
||||
|
||||
(test-assert "datum comment"
|
||||
(roundtrip-ok? "#;(skip this) (keep this)"))
|
||||
|
||||
(test-assert "character literals"
|
||||
(roundtrip-ok? "(list #\\space #\\newline #\\a #\\x41)"))
|
||||
|
||||
(test-assert "keywords"
|
||||
(roundtrip-ok? "(foo #:bar #:baz)"))
|
||||
|
||||
(test-assert "booleans"
|
||||
(roundtrip-ok? "(list #t #f #true #false)"))
|
||||
|
||||
(test-assert "vectors"
|
||||
(roundtrip-ok? "#(1 2 3)"))
|
||||
|
||||
(test-assert "quasiquote and unquote"
|
||||
(roundtrip-ok? "`(a ,b ,@c)"))
|
||||
|
||||
(test-assert "syntax shorthands"
|
||||
(roundtrip-ok? "#'x #`x #,x #,@x"))
|
||||
|
||||
(test-assert "dot notation"
|
||||
(roundtrip-ok? "(a . b)"))
|
||||
|
||||
(test-assert "numbers"
|
||||
(roundtrip-ok? "(+ 1 2.5 -3 +4 1/3 #xff)"))
|
||||
|
||||
(test-assert "square brackets"
|
||||
(roundtrip-ok? "(let ([x 1] [y 2]) (+ x y))"))
|
||||
|
||||
(test-assert "multiline with mixed content"
|
||||
(roundtrip-ok? "(define-module (foo bar)
|
||||
#:use-module (ice-9 format)
|
||||
#:export (baz))
|
||||
|
||||
;;; Section header
|
||||
|
||||
(define (baz x)
|
||||
;; body comment
|
||||
(format #t \"value: ~a\\n\" x))
|
||||
"))
|
||||
|
||||
;; Real-world file roundtrip
|
||||
(test-assert "real guile source file"
|
||||
(let ((text (call-with-input-file "refs/guile/module/ice-9/pretty-print.scm"
|
||||
(lambda (port)
|
||||
(let lp ((acc '()))
|
||||
(let ((ch (read-char port)))
|
||||
(if (eof-object? ch)
|
||||
(list->string (reverse acc))
|
||||
(lp (cons ch acc)))))))))
|
||||
(roundtrip-ok? text))))
|
||||
|
||||
;;; Token type classification
|
||||
|
||||
(test-group "token-types"
|
||||
(define (first-token-type input)
|
||||
(token-type (car (tokenize input "test.scm"))))
|
||||
|
||||
(test-eq "symbol" 'symbol (first-token-type "foo"))
|
||||
(test-eq "number" 'number (first-token-type "42"))
|
||||
(test-eq "string" 'string (first-token-type "\"hello\""))
|
||||
(test-eq "open-paren" 'open-paren (first-token-type "("))
|
||||
(test-eq "close-paren" 'close-paren (first-token-type ")"))
|
||||
(test-eq "boolean-true" 'boolean (first-token-type "#t"))
|
||||
(test-eq "boolean-false" 'boolean (first-token-type "#f"))
|
||||
(test-eq "keyword" 'keyword (first-token-type "#:foo"))
|
||||
(test-eq "character" 'character (first-token-type "#\\a"))
|
||||
(test-eq "line-comment" 'line-comment (first-token-type ";; hi"))
|
||||
(test-eq "prefix-quote" 'prefix (first-token-type "'"))
|
||||
(test-eq "prefix-quasiquote" 'prefix (first-token-type "`"))
|
||||
(test-eq "dot" 'dot (first-token-type ". "))
|
||||
(test-eq "newline" 'newline (first-token-type "\n")))
|
||||
|
||||
;;; Source location tracking
|
||||
|
||||
(test-group "source-locations"
|
||||
(let ((tokens (tokenize "(define\n x\n 42)" "test.scm")))
|
||||
(test-equal "first token line" 1 (token-line (car tokens)))
|
||||
(test-equal "first token col" 0 (token-column (car tokens)))
|
||||
;; Find 'x' token
|
||||
(let ((x-tok (find (lambda (t) (and (eq? (token-type t) 'symbol)
|
||||
(string=? (token-text t) "x")))
|
||||
tokens)))
|
||||
(test-assert "found x token" x-tok)
|
||||
(test-equal "x on line 2" 2 (token-line x-tok))
|
||||
(test-equal "x at column 2" 2 (token-column x-tok)))))
|
||||
|
||||
(test-end "tokenizer")
|
||||
Reference in New Issue
Block a user