Chorus::Engine

Chorus is a Perl inference engine that turns a normative corpus into a conformity-checking pipeline. An AI agent builds the knowledge base; the engine executes it deterministically and traceably — no LLM, no network, on any machine with Perl.

The system works in two distinct phases:

Phase A — Build   [AI agent, supervised, once per standard]
  Raw corpus → chorus-feed → KB + YAML rules
             → chorus-check → deployable Perl pipeline

Phase B — Execute [Chorus alone, no LLM, for every project]
  project.json → perl run.pl → conformity report
  100 % deterministic · reproducible · certifiable

The LLM intervenes only in Phase A: it reads the corpus, structures the knowledge, and generates the artefacts. Phase B involves no model at all; the generated Perl pipeline runs on its own.

Normative corpus (PDF, plain text, Word, Excel)
        │
   chorus-pdf / chorus-word / chorus-excel + chorus-feed   ← AI agent extracts and formalises the rules
        │
   KB: ontology · YAML rules · normative tables
        │
   chorus-check               ← generates the Perl pipeline, runs it
        │
   perl run.pl project.json   ← deterministic, reproducible, no AI agent
        ▼
  ✅ COMPLIANT / ❌ NON_COMPLIANT  (per element, per agent, with reason and reference)

Origin

Chorus belongs to the tradition of symbolic AI (explicit knowledge representation, typed structures, deterministic inference), in the lineage of expert systems and Marvin Minsky's Frames.

The first version was born in 2013 from the porting to Perl of an original LISP project. The goal was twofold: to show that Perl was perfectly suited to this kind of implementation, and to offer the CPAN community an inference engine inspired by Minsky's Frames — typed objects, slots, inheritance, inference chain.

More than a decade later, an LLM's analysis of the project revealed an unexpected complementarity: where the symbolic engine excels at executing rules deterministically and traceably, the LLM excels at reading a corpus and formalising its rules. The real friction, the tedious task of writing YAML rules by hand, was the LLM's natural ground.

That encounter gave rise to version 2.

Chorus v2 is an augmented symbolic system: the inference engine remains sovereign — frames, slots, inference chain, no neural network in the decision layer. The LLM is a preprocessing tool, not a decision-maker. Two forms of AI, complementary rather than competing.

Why an LLM cannot run the verification itself

Chorus occupies a specific position in the current AI landscape. Most hybrid systems use a language model as the decision layer and rules as guardrails. Chorus inverts this: the LLM is an extraction tool that reads documents and formalises rules; the inference engine handles all reasoning. The LLM never draws a conclusion.

1. Exhaustive corpus coverage — impossible to guarantee. A language model does probabilistic completion, not exhaustive enumeration. Rare clauses, normative footnotes, and cross-references between standards are silently omitted. The problem: the model does not know what it omits.

2. Consistency across a full project dossier — certain degradation. A real dossier includes many heterogeneous documents — specifications, calculation notes, product data sheets, supporting evidence. On long contexts, an LLM loses precision on items introduced early and does not reliably detect cross-document contradictions.

3. Reproducibility — absent by nature. Two runs on the same project can produce different verdicts. For a control bureau or an insurer, this is disqualifying.

4. Traceability — structurally absent. An LLM may hallucinate references, paraphrase imprecisely, or conflate two clauses. It cannot guarantee that each assertion is anchored to a specific article of a specific standard.

5. Normative updates — opaque. When a standard is revised, there is no way to know which part of the LLM's reasoning is affected. With an explicit rule engine, the update is surgical: the affected YAML rules are identified, corrected, and re-tested in isolation.
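The last point can be illustrated with a small sketch. It assumes each rule records the clause it was derived from; the index structure, the rule names and the `EXAMPLE-STD` references are hypothetical and are not Chorus's actual storage format.

```perl
use strict;
use warnings;

# Hypothetical rule index: rule name => clause the rule was derived from.
my %rule_source = (
    'R01-beam-span'   => 'EXAMPLE-STD 4.2',
    'R02-post-height' => 'EXAMPLE-STD 5.1',
    'R03-beam-grade'  => 'EXAMPLE-STD 4.2',
);

# Return the sorted names of the rules derived from any revised clause.
sub affected_rules {
    my ($index, @revised) = @_;
    my %revised = map { $_ => 1 } @revised;
    my @hits = grep { $revised{ $index->{$_} } } keys %$index;
    return sort @hits;
}

# Clause 4.2 was revised: only the rules anchored to it need re-testing.
print "$_\n" for affected_rules(\%rule_source, 'EXAMPLE-STD 4.2');
```

With an explicit anchor per rule, the impact of a revision becomes a lookup rather than a guess.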

The division of labour

An LLM is an excellent extractor and translator of normative text into formal rules. It is a poor conformity checker.

This is precisely the division of labour Chorus implements: the LLM generates and formalises the rules (chorus-feed); the inference engine executes them deterministically and traceably (chorus-check). Together they cover what neither can do alone.

Running chorus-check twice on the same project file, on any machine, always produces the same output — no sampling, no temperature, no randomness in the decision layer.
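One way to check this claim on a generated pipeline is to run the same command twice and compare digests of the output. The sketch below uses only core modules; the helper names are illustrative, and the `run.pl` / `project.json` invocation in the final comment assumes a pipeline has already been generated.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Run a command and return the MD5 digest of everything it prints to STDOUT.
sub output_digest {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "cannot run @cmd: $!";
    local $/;                       # slurp mode: read the whole output at once
    my $out = <$fh>;
    close($fh);
    return md5_hex($out // '');
}

# True when two consecutive runs print byte-identical output.
sub is_reproducible {
    my @cmd = @_;
    return output_digest(@cmd) eq output_digest(@cmd);
}

# Hypothetical usage against a generated pipeline:
#   print is_reproducible($^X, 'run.pl', 'project.json') ? "stable\n" : "differs\n";
```

Comparing digests rather than raw text keeps the check cheap for long reports and gives a value that can be stored alongside an audit record.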

AI-assisted pipeline — chorus-* commands

The chorus-* commands are AI agent skills — not shell scripts. Each is loaded by an AI agent (Claude, Copilot, ECA…) and executed interactively in your development environment. The Perl pipeline they produce runs entirely on its own: no AI agent, no LLM, no network connection required at runtime.

Pipeline overview

Normative corpus (PDF, plain text, Word, Excel)
        │
   chorus-pdf          ← extracts PDFs (hybrid by default / text / auto / images)
   chorus-word         ← extracts Word documents (.docx)
   chorus-excel        ← extracts Excel spreadsheets and CSV (.xlsx, .csv)
        │
   corpus/<NNN>-<slug>.txt / -vision.md
        │
   chorus-feed         ← builds the KB: ontology, YAML rules, Helpers.pm
        │
   agent/agents/*.org · rules/**/*.yml · lib/.../Helpers.pm
        │                 ← domain expert reviews and corrects
   chorus-check        ← generates Feed.pm, Agent/*.pm, Expert.pm, run.pl
        │                   then runs: perl run.pl project.json
        ▼
  ✅ COMPLIANT / ❌ NON_COMPLIANT  (per element, per agent, with reason)
        │
   chorus-strengthen   ← classifies gaps, produces enrichment roadmap
        │
   chorus-feed --enrich ← targeted KB enrichment
        └──────────────────────────────────────────┐
                                                   │ reinforcement loop
                                            chorus-check --all ✅

The project file fed to chorus-check can be:

| Level | Meaning |
|---|---|
| ✅ certain | Exact or trivially equivalent match |
| ⚠️ probable | Close match with documented transformation |
| ❓ ambiguous | Multiple KB candidates: human decision required |
| ⛔ gap | Required slot absent from source: blocks the pipeline |
| ⬜ out-of-scope | Present in source, absent from KB: noted but ignored |

The alignment report (import-report-NNN.org) is the audit trail for each mapping decision. The thesaurus is re-read and enriched on subsequent imports, so matches against the corpus terminology improve over time.
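The levels listed above imply a simple precedence when summarising an import: any gap blocks, any ambiguity needs a human, and everything else can proceed. A minimal sketch of that reading follows; the function name and the three status labels are illustrative and are not part of the Chorus tooling.

```perl
use strict;
use warnings;

# Derive an overall import status from per-slot alignment levels.
# Level names come from the table above; status labels are illustrative.
sub import_status {
    my @levels = @_;
    my %seen = map { $_ => 1 } @levels;
    return 'blocked'      if $seen{gap};        # a required slot is missing
    return 'needs-review' if $seen{ambiguous};  # a human decision is required
    return 'ready';                             # certain, probable, out-of-scope
}

print import_status(qw(certain probable out-of-scope)), "\n";
print import_status(qw(certain ambiguous)), "\n";
print import_status(qw(probable gap ambiguous)), "\n";
```

The three calls print `ready`, `needs-review` and `blocked` in turn.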

Commands at a glance

| Command | Role |
|---|---|
| chorus-quickstart | Guided overview: start here if new to Chorus |
| chorus-pdf | Extract a PDF corpus (hybrid by default / text / auto / images) |
| chorus-word | Extract a Word document (.docx) into an enriched corpus |
| chorus-excel | Extract an Excel spreadsheet or CSV into an enriched corpus |
| chorus-feed | Build or enrich the KB from a corpus |
| chorus-check | Generate infrastructure + run conformity check |
| chorus-create-project | Generate a synthetic project JSON from the KB |
| chorus-import-project | Align engineer documents with KB slot names |
| chorus-strengthen | Identify rule gaps, produce enrichment roadmap |

Reinforcement loop

Once the first pipeline is running, chorus-strengthen classifies every discordance (rule too strict, rule too permissive, Feed targeting gap) and recommends the corpus needed to close each gap:

chorus-create-project <sb> --batch          ← 4-file coverage suite
chorus-check <sb> --all                     ← synthesis table
chorus-strengthen <sb>                      ← gap report + roadmap
chorus-feed <sb> corpus-fix.txt --enrich    ← targeted enrichment
chorus-check <sb> --all                     ← verify convergence ✅

Once generated, runs without an AI agent

# On any machine with Perl installed:
perl run.pl project.json

# Re-run with a different project — no regeneration:
perl run.pl other-project.json

Full command reference: doc/en/04-chorus-commands.md

Application domains

Chorus is not tied to any particular sector. A domain is Chorus-compatible whenever three conditions hold:

  1. The project is described by typed elements — each object to validate (structural member, contractual clause, software component…) has measurable attributes and a discriminating type.
  2. The standard states thresholds, conditions and reference tables — explicit requirements, not open-ended prose.
  3. The decision must be traceable and reproducible — audit, certification, regulatory filing, litigation.
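The three conditions can be made concrete with a toy check: a typed element, a threshold table keyed by type, and a verdict that carries its reason and reference. This is a sketch in plain Perl and does not use the Chorus API; the limits, attribute names, `EXAMPLE-STD` references and the `NOT_COVERED` verdict are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical threshold table keyed by element type. Values and reference
# strings are invented for illustration, not taken from any standard.
my %LIMITS = (
    beam => { attr => 'span_m',   max => 6.0, ref => 'EXAMPLE-STD 4.2' },
    post => { attr => 'height_m', max => 3.0, ref => 'EXAMPLE-STD 5.1' },
);

# Check one typed element; return a verdict with its reason and reference.
sub check_element {
    my ($el) = @_;
    my $limit = $LIMITS{ $el->{type} }
        or return { verdict => 'NOT_COVERED', reason => "no rule for type $el->{type}" };
    my $value = $el->{ $limit->{attr} };
    return { verdict => 'NON_COMPLIANT', reason => "$limit->{attr} missing", ref => $limit->{ref} }
        unless defined $value;
    return {
        verdict => $value <= $limit->{max} ? 'COMPLIANT' : 'NON_COMPLIANT',
        reason  => "$limit->{attr} = $value, limit $limit->{max}",
        ref     => $limit->{ref},
    };
}

my $r = check_element({ type => 'beam', span_m => 7.5 });
print "$r->{verdict}: $r->{reason} ($r->{ref})\n";
```

Because the verdict is a pure function of the element and the table, the same input always yields the same result and the same cited reference.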

| Domain | Typical corpus |
|---|---|
| 🔐 Cybersecurity / NIS2 / DORA | SecNumCloud v3.2, NIS2 Annex II, DORA, ETSI EN 319 412 |
| 🌿 CSRD / Environment | ESRS E1–E5, S1–S4, GHG Protocol, EU Taxonomy |
| 🏗️ Construction / BIM | Eurocodes EC2/EC3/EC5, Building Regs, DTU |
| ⚖️ GDPR / Public procurement | GDPR Art. 13/14/28/30/35, NIS2, procurement code |
| 🏦 Finance / RegTech | Basel IV (CRR3), MiFID II, EMIR |
| 💊 Pharmaceuticals / GMP | EU GMP Annex 1, ICH Q8/Q9/Q10, European Pharmacopoeia |
| 🏥 Medical devices | MDR 2017/745, ISO 13485, IEC 62304, ISO 14971 |
| 🚗 Automotive / ISO 26262 | ASIL A/B/C/D, ASPICE v3.1, MISRA C:2012 |
| ✈️ Aerospace / DO-178C | DO-178C, ARP4754A, AMC 20-115 (EASA) |
| ⚡ Energy / Nuclear | RCC-M, IEC 61511, ASN safety guide, IEC 62351 |

The key variable is corpus quality, not domain complexity. A well-structured corpus (numbered requirements, explicit reference tables, defined hierarchy levels) onboards in 2 to 4 weeks.

Full domain reference: doc/en/03-applications.md

Full working example

sandboxes/demo_en — timber-frame construction compliance against BS EN 338, EC5, Building Regulations Part L/B, BS EN 13501 (simulation).

perl sandboxes/demo_en/run.pl sandboxes/demo_en/project-01.json

Engine internals (YAML DSL, Chorus::Frame API, _MAX_CYCLES, _reset()): doc/en/01-intro.md

The core — Perl inference engine

The chorus-* pipeline runs on a pure Perl inference engine whose only runtime dependencies are three widely available modules: YAML (from CPAN), plus Scalar::Util and Digest::MD5, which ship with Perl.

Chorus implements the classic recognise–act cycle of the expert-system tradition: at each iteration, the engine identifies rules applicable to the current working memory, fires them, then loops — until nothing changes or a goal is reached.

The working memory is made of Chorus::Frame objects whose properties (slots) carry domain knowledge. Chorus::Expert chains several specialised engines over a shared working memory.
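The recognise-act cycle itself fits in a few lines of plain Perl. The sketch below is a generic illustration, not the Chorus::Engine implementation: working memory is a list of hashes standing in for frames, rules are pairs of hypothetical `when` / `then` closures, and the cycle budget is an assumed safety bound.

```perl
use strict;
use warnings;

# Generic recognise-act loop: fire every applicable rule, then repeat until
# a full pass changes nothing or the cycle budget is exhausted.
sub run_rules {
    my ($memory, $rules, $max_cycles) = @_;
    $max_cycles //= 100;                                # assumed safety bound
    for my $cycle (1 .. $max_cycles) {
        my $changed = 0;
        for my $rule (@$rules) {
            for my $item (grep { $rule->{when}->($_) } @$memory) {
                $changed++ if $rule->{then}->($item);   # true = memory modified
            }
        }
        return $cycle unless $changed;                  # fixed point reached
    }
    return;                                             # budget exhausted
}

my @memory = (
    { color => 'blue', label => 'sky'  },
    { color => 'red',  label => 'fire' },
);
my @rules = (
    {   # tag blue items once; the guard stops the rule from firing forever
        when => sub { $_[0]{color} eq 'blue' && !$_[0]{tagged} },
        then => sub { $_[0]{tagged} = 'yes'; 1 },
    },
);

my $cycles = run_rules(\@memory, \@rules);
print "stopped after $cycles cycles\n";
```

The loop stops on the second pass: the first pass tags the blue item, the second finds nothing left to do. The guard in `when` plays the role of a rule's exception clause; without it the rule would fire on every pass until the budget ran out.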

| Module | Role |
|---|---|
| Chorus::Frame | Knowledge representation: slots, inheritance, global registries, forward/backward chaining |
| Chorus::Engine | Inference loop: rules, scope combinatorics, flow control, YAML loading |
| Chorus::Expert | Multi-agent orchestration: shared BOARD, outer loop |
| Chorus::Collection::List | Ordered Frame sequences: bidirectional prev/succ navigation, merge, positional tests |
| Chorus::Collection::Filter | Regex-like filtering on Frame sequences: capture groups in @_VFILTER |

Direct API

use Chorus::Engine;
use Chorus::Frame;

my $agent = Chorus::Engine->new();

Chorus::Frame->new(color => 'blue', label => 'sky');
Chorus::Frame->new(color => 'red',  label => 'fire');

$agent->addrule(
    _SCOPE => { f => sub { [ grep { $_->{color} eq 'blue' } fmatch(slot => 'color') ] } },
    _APPLY => sub {
        my %o = @_;
        return if $o{f}->{tagged};
        $o{f}->set('tagged', 'yes');
        print "Tagged: ", $o{f}->{label}, "\n";   # → Tagged: sky
        return 1;
    },
);

$agent->loop();

The YAML DSL expresses the same logic without repetitive Perl code:

RULE: tag-blue-frames
FIND:
  f:
    attribut: color
    filtre:   blue
EXCEPTION: defined $f->{tagged}
ACTION: |
  $f->set('tagged', 'yes');
  print "Tagged: $f->{label}\n";   # → Tagged: sky
  return 1;

Each YAML rule lives in its own .yml file. To load them, save the rule as rules/tag-blue-frames.yml and call loadRules() instead of addrule():

use Chorus::Engine;
use Chorus::Frame;

my $agent = Chorus::Engine->new();

Chorus::Frame->new(color => 'blue', label => 'sky');
Chorus::Frame->new(color => 'red',  label => 'fire');

$agent->loadRules('rules/');   # loads all *.yml in the directory

$agent->loop();

Files are compiled in alphabetical order — prefix with R01-, R02-… to control priority. Multiple loadRules() calls accumulate.
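Zero-padding the prefix matters if "alphabetical" means plain string comparison, which is an assumption here and is not confirmed by the text above. The file names below are hypothetical.

```perl
use strict;
use warnings;

# Plain string sort, used here as an approximation of "alphabetical order".
my @files = sort qw(R10-report.yml R2-tag.yml R01-init.yml R02-tag.yml);
print "$_\n" for @files;
```

Under string comparison `R2-tag.yml` sorts after `R10-report.yml`, because the character `2` is greater than `1`. Consistent two-digit prefixes (R01, R02, ... R10) avoid that surprise.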

Full technical reference: perldoc Chorus::Engine · perldoc Chorus::Frame · perldoc Chorus::Expert

Installation

cpanm Chorus::Engine

Or from source:

perl Makefile.PL && make && make test && make install

Documentation

Contributing

Contributions are welcome — bug reports, documentation fixes, new examples, or rule engine improvements.

Repository

https://github.com/civorra/Chorus