● refactoringbuildingsamridhlimbu.com/projects/r1gpt · v0.1


R1GPT

● open source · NEM

An AI audit engine for R1 connection-approval submissions in Australia's National Electricity Market. Upload a submission package — GPS baseline, FAT report, OEM metadata, optional PSCAD/EMT studies — and get back a clause-by-clause audit against AEMO's Power System Model Guidelines, with a deterministic readiness score and findings traced to the rule that triggered them.

Aarav261/R1gpt

6 assessors · 0–95 readiness scale · 24/24 cases passing

Context

Built with Aarav. Getting a generator or battery connected to the NEM means clearing AEMO's R1 model-acceptance gate — a slow, expensive RFI cycle where a single missed clause or stale baseline costs weeks. A general LLM will happily give a confident, unverifiable opinion on a submission. R1GPT does the opposite: it runs six deterministic assessors against the AEMO Power System Model Guidelines v3.0 (effective 25 Sep 2025) and NER S5.2, and every finding points back to the exact section and source document that triggered it. The score is arithmetic, not vibes — so the same package always scores the same, and the result holds up to a regulator.

Six deterministic assessors

Impedance / firmware delta

Flags changes in machine impedance and firmware against the lodged baseline.

Clause evidence matrix

Maps each NER S5.2 clause to the documents that actually evidence it.

Mandatory-artefact checks

Verifies the required artefacts (GPS baseline, FAT report, OEM metadata) are present and valid.

Firmware DMAT comparison

Compares declared firmware against the DMAT record — a mismatch is a 45-point, lodgement-blocking finding.

NSP staleness

Detects baselines and access standards that have gone stale relative to the network service provider.

EMT / RMS model adequacy

Checks PSCAD/EMT and RMS models are adequate for the connection size and type.
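
As a rough illustration of how three of these assessors could be written as plain rules, here is a minimal Python sketch. The field names (`declared_firmware`, `dmat_firmware`, `baseline_date`), the 365-day staleness threshold, and the severities given to a missing artefact and a stale baseline are assumptions made for the example. Only the `dmat` severity for a firmware mismatch comes from the description above, and none of this is taken from the R1GPT source.

```python
from dataclasses import dataclass
from datetime import date

# The three artefacts named by the mandatory-artefact check above.
REQUIRED_ARTEFACTS = ("gps_baseline", "fat_report", "oem_metadata")

# Hypothetical threshold; the page does not state the real staleness rule.
MAX_BASELINE_AGE_DAYS = 365


@dataclass(frozen=True)
class Finding:
    assessor: str
    severity: str  # "low" | "medium" | "high" | "dmat"
    message: str


def check_mandatory_artefacts(package: dict) -> list[Finding]:
    """One finding per required artefact that is missing or empty."""
    return [
        # "high" is an assumed severity for a missing artefact.
        Finding("mandatory-artefacts", "high", f"missing artefact: {name}")
        for name in REQUIRED_ARTEFACTS
        if not package.get(name)
    ]


def check_firmware_dmat(package: dict) -> list[Finding]:
    """A declared firmware that differs from the DMAT record is a dmat finding."""
    declared = package["declared_firmware"]
    recorded = package["dmat_firmware"]
    if declared != recorded:
        return [Finding("firmware-dmat", "dmat",
                        f"declared {declared!r} differs from DMAT {recorded!r}")]
    return []


def check_nsp_staleness(package: dict, today: date) -> list[Finding]:
    """Flag a baseline older than the (assumed) maximum age."""
    age_days = (today - package["baseline_date"]).days
    if age_days > MAX_BASELINE_AGE_DAYS:
        # "medium" is an assumed severity for a stale baseline.
        return [Finding("nsp-staleness", "medium",
                        f"baseline is {age_days} days old")]
    return []


# Illustrative package: empty FAT report, firmware mismatch, old baseline.
package = {
    "gps_baseline": "gps.pdf",
    "fat_report": "",
    "oem_metadata": "oem.json",
    "declared_firmware": "2.4.1",
    "dmat_firmware": "2.3.9",
    "baseline_date": date(2024, 1, 1),
}
findings = (
    check_mandatory_artefacts(package)
    + check_firmware_dmat(package)
    + check_nsp_staleness(package, today=date(2025, 9, 25))
)
for f in findings:
    print(f.assessor, f.severity, f.message)
```

Each rule is a pure function of the package, so running it twice on the same input cannot produce different findings.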

The readiness index

kairos/scheduler.py

# severity weights: plain subtraction, no sigmoid, no fabricated percentiles
# (f.weight below is the entry in this table for the finding's severity)
WEIGHTS = {"low": 4, "medium": 12, "high": 25, "dmat": 45}

def score(findings, predicted_rfis, rfi_risk_band, coverage):
    # capped at 95: the last 5 points are reserved for chartered-engineer review
    readiness = max(0, min(95, 95 - sum(f.weight for f in findings)))

    # every finding traces back to a PSMG v3.0 section + the source document
    return dict(readiness=readiness, findings=findings, predicted_rfis=predicted_rfis,
                rfi_risk_band=rfi_risk_band, coverage=coverage)

Each finding carries a severity weight: low 4, medium 12, high 25, and 45 for a DMAT-triggering mismatch. The readiness index is 95 minus the sum of those weights, clamped to 0–95. Alongside the number, R1GPT emits a qualitative RFI-cycle risk band (minimal → elevated → high → severe), an “X of N checks passed” coverage metric, predicted AEMO/NSP questions, and rectification plans with effort estimates. All 24 deterministic stress-test cases pass (24/24).
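
To make the arithmetic concrete, here is a small worked example in Python. The weights and the 0–95 clamp are the ones stated above; the function names `readiness_index` and `coverage`, and the sample findings, are illustrative and not taken from the project.

```python
WEIGHTS = {"low": 4, "medium": 12, "high": 25, "dmat": 45}
CEILING = 95  # the index never exceeds 95


def readiness_index(severities):
    """95 minus the summed severity weights, clamped to the range 0..95."""
    penalty = sum(WEIGHTS[s] for s in severities)
    return max(0, min(CEILING, CEILING - penalty))


def coverage(passed, total):
    """The 'X of N checks passed' metric as a display string."""
    return f"{passed} of {total} checks passed"


# No findings: the ceiling itself.
print(readiness_index([]))                                   # 95
# 25 + 12 + 4 = 41, so 95 - 41 = 54.
print(readiness_index(["high", "medium", "low"]))            # 54
# 45 + 25 + 25 + 12 = 107, so 95 - 107 = -12, clamped to 0.
print(readiness_index(["dmat", "high", "high", "medium"]))   # 0
print(coverage(21, 24))                                      # 21 of 24 checks passed
```

A single DMAT mismatch on an otherwise clean package therefore lands at 50, which shows how heavily that one finding is weighted.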

Key technical decisions

deterministic subtraction › LLM-scored confidence

The readiness index is plain arithmetic: 95 minus the sum of severity weights, clamped to 0–95. No sigmoid, no confidence intervals, no fabricated percentiles. The same package always yields the same score — auditable, repeatable, defensible to a regulator.
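
One way to check the repeatability claim is to score the same findings in every possible order and confirm that only one value ever comes out. The sketch below does that with the weights stated above; the helper name and the sample findings are illustrative.

```python
from itertools import permutations

WEIGHTS = {"low": 4, "medium": 12, "high": 25, "dmat": 45}


def readiness_index(severities):
    """95 minus the summed severity weights, clamped to the range 0..95."""
    return max(0, min(95, 95 - sum(WEIGHTS[s] for s in severities)))


sample = ["medium", "low", "high", "low"]  # weights sum to 45

# Score every ordering of the same findings and collect the distinct results.
scores = {readiness_index(order) for order in permutations(sample)}
print(scores)  # {50}
```

Because the score is a sum followed by a clamp, it depends only on which findings exist, not on the order in which assessors report them.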

capped at 95 › a perfect 100

The index can never read 100. The last five points are deliberately reserved for chartered-engineer review — the tool gets you submission-ready, it doesn't pretend to replace the signature that AEMO actually requires.

six anchored assessors › one general-purpose prompt

Each assessor is scoped to a regulatory concern and anchored to a specific PSMG v3.0 section, so every finding cites where the rule lives — not a conversational opinion from a general model. The LLM extracts; the rules decide.
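
The split between extraction and decision could look roughly like the Python sketch below. It assumes the extraction step has already produced a dictionary of structured fields, and it uses the impedance check as the example rule. The field names, the `high` severity, and the section string are placeholders invented for the example, since the page does not list the real anchors.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    severity: str
    message: str
    section: str          # where the rule lives
    source_document: str  # the document that triggered the finding


@dataclass(frozen=True)
class Assessor:
    name: str
    section: str  # anchor attached to every finding this assessor emits
    # The rule returns (severity, message, source_document) tuples.
    rule: Callable[[dict], list[tuple[str, str, str]]]


def impedance_rule(fields: dict) -> list[tuple[str, str, str]]:
    """Flag any change in machine impedance against the lodged baseline."""
    if fields["impedance_pu"] != fields["baseline_impedance_pu"]:
        # "high" is an assumed severity for this example.
        return [("high", "machine impedance differs from lodged baseline",
                 fields["impedance_source"])]
    return []


def run(assessors: list[Assessor], fields: dict) -> list[Finding]:
    """Rules decide; the assessor's own anchor is stamped on each finding."""
    return [
        Finding(severity, message, assessor.section, source)
        for assessor in assessors
        for severity, message, source in assessor.rule(fields)
    ]


# Placeholder anchor, not a real PSMG section number.
ASSESSORS = [Assessor("impedance-delta", "PSMG v3.0 (placeholder section)", impedance_rule)]

# Stands in for the output of the extraction step.
extracted = {
    "impedance_pu": 0.21,
    "baseline_impedance_pu": 0.18,
    "impedance_source": "fat_report.pdf",
}

findings = run(ASSESSORS, extracted)
for f in findings:
    print(f.severity, f.section, f.source_document)
```

In this arrangement a finding cannot exist without a section and a source document, because both are required fields of `Finding`.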

no database › persisted submission store

Everything runs from the uploaded package in-request. Zod-validated extraction, pdf-parse for documents, no persistence layer — Vercel-ready with nothing to provision and no submitter data sitting at rest.
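
The project does this validation with Zod in TypeScript. As a rough analogue only, the Python sketch below shows the same idea using the standard library: check the extracted payload against a schema inside the request, reject it if malformed, and keep nothing afterwards. The schema fields and the firmware value are hypothetical.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {
    "project_name": str,
    "capacity_mw": float,
    "declared_firmware": str,
}


def validate_extraction(payload: dict) -> dict:
    """Return only the schema fields, or raise ValueError listing every problem."""
    errors = [
        f"{key}: expected {expected.__name__}"
        for key, expected in SCHEMA.items()
        if not isinstance(payload.get(key), expected)
    ]
    if errors:
        raise ValueError("; ".join(errors))
    # Unknown keys are dropped, so nothing unvalidated flows downstream.
    return {key: payload[key] for key in SCHEMA}


good = validate_extraction({
    "project_name": "Ironbark Solar Farm",
    "capacity_mw": 400.0,
    "declared_firmware": "2.4.1",
    "unexpected": "ignored",
})
print(good)

try:
    validate_extraction({"project_name": "Ironbark Solar Farm", "capacity_mw": "400"})
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Everything here lives in local variables for the duration of one call, which mirrors the no-persistence design described above.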

Stack

App: Next.js 14 (App Router) · TypeScript (strict) · Tailwind CSS

Extraction: OpenAI gpt-4o · Zod schema validation · pdf-parse

Utils: nanoid · IBM Plex Mono (technical values) · Inter

Deploy: Vercel · no database

Demos: Ironbark Solar Farm (400 MW · failing) · Wattle Creek BESS (200 MW · passing)
