Project 05

Nicene Heresy Detector

Deterministic content moderation for Christian text — 11 named heresies, rule-level audit trail.

What You're Seeing

A first-pass filter for content moderators, not a doctrinal teacher

This tool flags sentences that contradict named heresies condemned by the Nicene-Constantinopolitan Creed. It is intended for content-moderation pipelines — sermon-archive screening, Christian publishing platforms, AI-generated devotional review, denominational compliance — where a human reviewer makes the final call and the detector's job is to surface candidates worth a closer look.

You enter a sentence. The system either: (1) emits N/A if the sentence engages no Trinitarian entity, no theological vocabulary, and is semantically distant from the reference banks (e.g. "I like pizza"); (2) emits PASS if no rule fires; or (3) emits FAIL with the specific named heresy (Arianism, Docetism, Pneumatomachianism, …), the canonical predicate the rule watches, and the evidence span that triggered the flag.

Architecturally this is a rule-based detector with neural- and NLP-augmented surface coverage. The decision-maker is a symbolic rule base over a fixed Trinitarian ontology; a sentence-transformer is a fallback proposer whose outputs are gated by an orthodox-affirmation regex veto. Every flag is reachable via a named rule a human reviewer can audit and override.

How It Works

11 deterministic rules (NIC-001 … NIC-011) cover the named-heresy taxonomy: Arianism, denial of the Son's divinity, Modalism, Pneumatomachianism, Spirit-as-force, Docetism, Ebionism, ontological subordinationism, temporal origin of the Son, Tritheism, and religious pluralism — each keyed to a forbidden value of a canonical Trinitarian predicate (e.g. created(Son) = True → NIC-001 Arianism)
Three-strategy neural extraction maps free-form sentences to formal propositions: (A) hand-curated regex patterns — the primary detector; (B) a BAAI/bge-small-en-v1.5 bi-encoder comparing the sentence to forbidden- and orthodox-reference banks — a fallback proposer; (C) lightweight negation- and comparative-aware NLP regexes
Orthodox-affirmation veto is what makes the neural fallback safe to ship: ~17 regexes match explicit Nicene phrases ("eternally begotten", "coequal", "of one essence", "one God in three persons", "the only way", …) and unconditionally drop forbidden propositions whose key they cancel, regardless of what the bi-encoder suggested
Relevance gate short-circuits to N/A when a sentence names no canonical Trinitarian entity, uses no theological vocabulary, and the bi-encoder rates it distant from any reference phrase — rule firing is skipped entirely
Every FAIL ships with a full audit trail: rule id, named heresy, canonical predicate, originating extraction strategy (A/B/C), and evidence span. GET /api/rules exposes the full deterministic rule set as JSON for external auditing

Try:

Sentences are split and checked individually.

Paragraph Regression

One heretical paragraph (every sentence must FAIL with the matching rule) and one orthodox paragraph (every sentence must PASS). 9 sentences total — the acceptance test for the three-component pipeline.

Named-Heresy Regression Suite

One canonical sentence per rule (NIC-001 … NIC-011), six orthodox formulations whose vocabulary overlaps with forbidden references, and three off-topic anchors.

Methodology & Honest Caveats

Pipeline: A relevance gate (canonical entity OR theological vocabulary OR bi-encoder topicality ≥ 0.70) decides whether a sentence enters rule firing at all. Three extraction strategies run in parallel — ~18 regex patterns (Strategy A), a bi-encoder against forbidden- and orthodox-reference banks with margin gating (Strategy B), and 6 negation/comparative/temporal NLP regexes (Strategy C). The union is filtered by ~17 affirmation-veto regexes that drop forbidden propositions contradicted by Nicene phrases in the same sentence. Surviving propositions are checked against 11 deterministic rules.

Pattern fit, not benchmark: The regression suite was developed alongside the patterns. A clean run reflects pattern-fit, not generalisation. For real deployment, the patterns would need to be evaluated on a held-out corpus of sermon transcripts (or comparable real-world text), and the false-positive rate against orthodox texts would need to be measured directly. That work has not been done in this prototype.

Known limitations (out of scope by design): reported speech ("Arius taught that…" still flags); conditionals and counterfactuals; recension- and tradition-dependent claims (Filioque); doctrines the Creed presupposes but does not state (Theotokos, hypostatic union); composite orthodox-heretical statements; adversarial paraphrase. The system flags explicit claims for human review; it is not a doctrinal-reasoning system.

What this is not: not a calibrated four-label NLI classifier with abstention (that is a parallel research project); not a doctrinal teacher or authority — the Creed itself, denominational catechisms, and human teachers are the authority; not "AlphaGeometry-inspired" — this system does flat boolean rule firing over a fixed ontology, calling it analogous would overclaim.