After 3 Invoices, the AI Recognises Your Supplier. After 10, It Stops Asking.

The first diesel invoice needs manual confirmation. The third comes pre-filled. After ten verified matches, the system auto-confirms that supplier without asking. Here's how a learning system cuts quarterly review from 100 hours to 15.

Carbonly Team · April 5, 2026 · 11 min read
AI Carbon Accounting · Emission Factor Matching · Supplier Recognition · NGER Compliance · ASRS Reporting · Data Quality · Automation

Your sustainability analyst opens a diesel invoice from a fuel supplier they've never seen before. New format. Different layout. The consumption figure is on page two, buried between a delivery address and an ABN. The analyst finds the litres, looks up the NGA emission factor (2.71 kg CO2-e per litre for diesel oil), classifies it as Scope 1, and creates the record. That takes about four minutes.
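For the record, the calculation itself is the easy part; it's the finding and matching that eats the four minutes. The final step fits in a few lines (a rough sketch, using the diesel factor quoted above; the delivery volume is illustrative):

```python
# The calculation the analyst repeats by hand: litres x NGA factor = kg CO2-e.
DIESEL_FACTOR_KG_CO2E_PER_LITRE = 2.71  # NGA Factors value quoted above

def diesel_scope1_emissions_kg(litres: float) -> float:
    """Scope 1 emissions for a diesel delivery, in kg CO2-e."""
    return litres * DIESEL_FACTOR_KG_CO2E_PER_LITRE

print(diesel_scope1_emissions_kg(1_200))  # a 1,200 L delivery is roughly 3,252 kg CO2-e
```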

Now multiply that by 2,000 invoices a quarter. That's over 130 hours of someone's life spent doing the same cognitive task: find the number, match the material, apply the factor, confirm the scope. Every single time. Even when 60% of those invoices come from the same twelve suppliers, in the same format, for the same materials.

This is the part of carbon accounting that nobody talks about at conferences. Not the strategy. Not the science. The mind-numbing repetition.

Static Tools vs. Systems That Learn

Most carbon accounting software treats every document the same way, every time. Upload an invoice. Extract the data. Present it for review. Doesn't matter if you've seen that exact supplier format 400 times. The system has no memory. It doesn't get better with use.

That's fine for a company processing 50 invoices a year. It's not fine for a Tier 1 contractor processing 2,000 a quarter across 200+ suppliers, or a property manager with 40 buildings generating 640 utility documents every three months.

We built Carbonly's document processing differently. The system remembers. When your team confirms a match between a specific supplier, a specific material, and a specific emission factor, that confirmation feeds back into how the system handles the next invoice from that supplier. Not as a rigid template. As a learned pattern that gets more confident over time.

The first invoice from a new diesel supplier? The system extracts the data and presents it for review. It suggests a material match and an emission factor, but it asks you to confirm.

The second invoice from the same supplier? The suggestion comes pre-filled. Confidence is higher. Your analyst glances at it and approves in two seconds.

By the third invoice, the system recognises the pattern: this supplier, this invoice format, this material, this emission factor from the NGA Factors workbook. Confidence hits 90%+.

After ten verified matches with zero corrections, the system crosses a threshold. It auto-confirms that specific supplier-material-factor combination. Your team gets a notification, but nobody needs to manually review it. The record goes straight into your emissions data with a full audit trail showing exactly why it was auto-confirmed.

That progression from "please confirm" to "I've got this" is the difference between a static tool and a system that learns your supply chain.

The Maths on 2,000 Invoices a Quarter

Let's be specific about what this means for a company with real volume.

A mid-to-large construction company or property portfolio might process 2,000 emission-relevant documents per quarter. At an average of three minutes per manual review (finding the consumption figure, confirming the material, checking the emission factor, verifying the scope classification), that's 6,000 minutes. One hundred hours. Two and a half full working weeks of someone's time, every quarter, just reviewing data that's already been extracted.

| Period | Auto-confirmed | Manual review | Hours saved |
| --- | --- | --- | --- |
| Month 1 | ~10% (new system, few verified suppliers) | ~90% | ~10 hrs |
| Month 3 | ~55% (common suppliers verified) | ~45% | ~55 hrs |
| Month 6 | ~80% (most repeat suppliers learned) | ~20% | ~80 hrs |
| Month 9+ | ~85-90% (steady state) | ~10-15% | ~85 hrs |
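The hours-saved column is just the three-minute review figure applied to whatever no longer needs a human. A quick sketch of the same arithmetic, with the volumes and rates taken from the table above:

```python
# Hours saved per quarter = auto-confirmed records x 3 minutes of review avoided.
INVOICES_PER_QUARTER = 2_000
MINUTES_PER_MANUAL_REVIEW = 3

def hours_saved(auto_confirm_rate: float) -> float:
    return INVOICES_PER_QUARTER * auto_confirm_rate * MINUTES_PER_MANUAL_REVIEW / 60

for period, rate in [("Month 1", 0.10), ("Month 3", 0.55), ("Month 6", 0.80), ("Month 9+", 0.85)]:
    print(f"{period}: ~{hours_saved(rate):.0f} hrs saved")
```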

By month nine, you're reviewing maybe 200 to 300 records per quarter instead of 2,000. And those 200 records are the ones that actually need human judgment: new suppliers, unusual materials, format changes, low-confidence matches.

The 85% that auto-confirm aren't being ignored. They're being processed by a system that has demonstrated accuracy on that specific supplier-material combination through repeated verified matches. Every auto-confirmed record is logged with its confidence score, the historical match count, and the specific emission factor applied. Your auditor can see exactly why the system was confident enough to auto-confirm.

We're not claiming 100% auto-confirmation. We're not sure that's even desirable. There's a long tail of one-off suppliers, unusual invoices, and edge cases where you want human eyes. The goal isn't zero review. It's reviewing the right 10-15% instead of reviewing everything.

How the Learning Actually Works

When your analyst reviews a record and confirms the match ("yes, this is diesel, and 2.71 kg CO2-e per litre is the correct NGA factor"), several things happen behind the scenes.

The system records the combination: this supplier, this invoice layout pattern, this material description, this emission factor. It doesn't just remember the supplier name. It learns the relationship between the way that supplier describes the material on their invoices and the correct classification in your emissions system. So when the same supplier writes "automotive diesel" on one invoice and "diesel fuel (distillate)" on another, the system learns that both map to the same NGA factor.
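You can picture that learned relationship as a keyed record. The sketch below is illustrative only (the supplier ID and field names are ours, not Carbonly's actual schema):

```python
# Illustrative: each supplier + normalised description the team has confirmed
# points at the same canonical material, factor, and scope.
from dataclasses import dataclass

@dataclass(frozen=True)
class VerifiedMatch:
    material: str                    # canonical name, e.g. "diesel_oil"
    factor_kg_co2e_per_unit: float   # e.g. 2.71 per litre
    scope: int                       # e.g. 1

def normalise(description: str) -> str:
    return " ".join(description.lower().split())

learned: dict[tuple[str, str], VerifiedMatch] = {}
diesel = VerifiedMatch("diesel_oil", 2.71, 1)

# Two different wordings from the same supplier end up pointing at one match.
learned[("FUEL-CO-123", normalise("Automotive Diesel"))] = diesel
learned[("FUEL-CO-123", normalise("Diesel Fuel (Distillate)"))] = diesel
```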

Each new confirmed match for that combination increases a confidence counter. Think of it like a trust score that builds over time. One confirmation is a data point. Three confirmations are a pattern. Ten confirmations with zero corrections are a proven relationship.

The threshold for auto-confirmation is configurable. The default is ten confirmed matches, but your admin can set it higher for sensitive categories or lower for straightforward ones. Some organisations set it to five for electricity bills (simple format, single emission factor per state) and twenty for construction materials (more variability, higher consequence of error).
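In configuration terms, that's nothing more exotic than a per-category threshold. Something like this, with the values taken from the examples above and the names purely hypothetical:

```python
# Hypothetical per-category thresholds: verified matches required before a
# supplier-material-factor combination is allowed to auto-confirm.
AUTO_CONFIRM_THRESHOLDS = {
    "default": 10,
    "electricity": 5,               # simple format, one factor per state
    "construction_materials": 20,   # more variability, higher consequence of error
}

def threshold_for(category: str) -> int:
    return AUTO_CONFIRM_THRESHOLDS.get(category, AUTO_CONFIRM_THRESHOLDS["default"])
```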

And here's the part that matters for compliance: if the system ever gets one wrong and your analyst corrects it, the confidence score for that combination resets. It doesn't drop by a bit. It goes back to zero and starts earning trust again. One correction undoes ten confirmations. That asymmetry is deliberate. In carbon reporting under NGER and AASB S2, a false positive (auto-confirming an incorrect emission factor) is far worse than a false negative (asking a human to review something that turns out to be fine).
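Put together, the trust score is a counter with a deliberately harsh failure mode. Roughly (illustrative, not the production logic):

```python
# A confirmation adds one to the counter; a single correction wipes it out.
from collections import defaultdict

confirmations: dict[tuple[str, str], int] = defaultdict(int)

def on_analyst_confirmed(combination: tuple[str, str]) -> None:
    confirmations[combination] += 1

def on_analyst_corrected(combination: tuple[str, str]) -> None:
    confirmations[combination] = 0      # back to zero: trust is re-earned from scratch

def may_auto_confirm(combination: tuple[str, str], threshold: int = 10) -> bool:
    return confirmations[combination] >= threshold
```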

What Never Auto-Confirms

Not everything should be automated, even at high confidence. We're pretty clear-eyed about this.

Certain categories are excluded from auto-confirmation entirely, regardless of how many times the system has seen them. Scope classification changes never auto-confirm. If a record that's historically been Scope 1 suddenly looks like it might be Scope 3, a human decides. New emission factor assignments never auto-confirm. If the NGA Factors workbook gets updated (it happens annually, and the 2025-26 NGER legislation amendments changed several factors), every affected combination resets and goes through manual review again.

Cross-scope materials don't auto-confirm. Electricity purchased for facilities you operate is Scope 2; the same retailer's bill for a leased or tenant-controlled site can belong in Scope 3, and the location-based versus market-based treatment (with renewable energy certificates) changes the reported Scope 2 figure. Those distinctions have real implications for both NGER and AASB S2 reporting, and the system doesn't make those calls on its own.

Your admin also has a kill switch. One click disables auto-confirmation across the entire system, reverting everything to manual review. Not next week. Not after a support request. Immediately. We built this because we've seen what happens when organisations lose confidence in automated systems and can't turn them off fast enough. The Clean Energy Regulator's compliance priorities for 2025-26 emphasise accurate and timely reporting. If your process breaks, you need to be able to fall back to manual without downtime.
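Expressed as code, these guardrails are a short list of hard stops checked before the trust score is even consulted. A sketch, with field names of our own choosing for illustration:

```python
# Hard stops that apply regardless of how high the trust score is.
from dataclasses import dataclass

AUTO_CONFIRM_ENABLED = True               # the admin kill switch: one flag, immediate effect
CROSS_SCOPE_MATERIALS = {"electricity"}   # scope is always a human call here

@dataclass
class Candidate:
    material: str
    scope: int
    learned_scope: int
    factor_id: str
    learned_factor_id: str

def eligible_for_auto_confirm(c: Candidate) -> bool:
    if not AUTO_CONFIRM_ENABLED:
        return False
    if c.scope != c.learned_scope:              # scope classification changed
        return False
    if c.factor_id != c.learned_factor_id:      # new emission factor assignment
        return False
    if c.material in CROSS_SCOPE_MATERIALS:     # cross-scope material
        return False
    return True
```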

Why This Matters for AASB S2 and NGER

Group 2 entities under ASRS start mandatory climate reporting for financial years beginning 1 July 2026. Group 3 follows from 1 July 2027. Both groups need to report Scope 1 and 2 emissions from day one, with Scope 3 required from the second reporting period. NGER reporters already face annual deadlines by 31 October, with civil penalties of up to $660,000 per contravention under Section 19 of the NGER Act.

The ANAO's performance audit of the NGER scheme found that 72% of 545 reports examined contained errors. Seventeen percent had significant errors. These were organisations using manual processes, spreadsheets, and consultants. The data quality problem in Australian emissions reporting isn't theoretical. It's documented by the government's own auditor.

A learning system doesn't just save time. It reduces the error surface. Once a supplier-material-factor combination has been verified ten times and auto-confirmed, that combination is locked in. No one can accidentally fat-finger the emission factor. No one can misclassify diesel as Scope 2. No one can apply the Victorian grid factor (0.78 kg CO2-e/kWh) to a facility in Tasmania (0.20 kg CO2-e/kWh). The verified combination is the combination.

That's worth something to an auditor. Under ASSA 5000, the assurance practitioner needs to understand your internal controls over sustainability information. A system that documents every match, tracks confidence scores over time, logs every human confirmation and correction, and maintains a clear record of which records were auto-confirmed vs. manually reviewed? That's a control environment. A spreadsheet with 2,000 rows and no trace of who entered what? That's a risk.

The First Three Months Are the Hardest

We won't pretend the onboarding period is effortless. Month one with Carbonly is more work than month six. That's by design.

In the first few weeks, almost everything requires manual review. The system is seeing your suppliers for the first time. It's learning your material descriptions. It's building confidence from zero. Your analyst will spend time confirming matches, correcting the occasional misclassification, and teaching the system how your supply chain describes things.

This is the investment period. And it's tempting to look at the review queue in week two and think "this isn't saving me any time." It isn't. Yet.

But the learning compounds. By week four, the most common suppliers are starting to auto-fill. By month two, electricity and gas bills from your regular retailers are barely hitting the review queue. By month three, the bulk of your supply chain is learned. The review queue shrinks from hundreds of records to dozens.

We've seen this pattern consistently across organisations with 50+ regular suppliers. The crossover point, where the system is saving more time than it costs to review, usually hits somewhere around week six to eight. After that, the gap widens every month.

Consider a property manager tracking emissions across 40 buildings. They have maybe eight energy retailers across three states. After three months, every one of those retailers' invoice formats is learned. The quarterly electricity data that used to take two full days of manual entry now processes overnight with zero review needed. The time goes to the things that actually require judgment: new tenants, unusual consumption spikes, reconciling estimated vs. actual meter reads.

The Audit Trail Makes It Defensible

Every auto-confirmed record in Carbonly carries a complete history. Not just the final emission figure. The full chain: source document, extracted fields, matched material, applied emission factor, calculated emissions, confidence score at time of confirmation, number of prior verified matches for that combination, and whether it was auto-confirmed or manually reviewed.
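If you sketched that history as a data structure, it would look something like the record below. The fields follow the list above; the exact shape is illustrative, not Carbonly's internal format:

```python
# Illustrative shape of the history attached to every auto-confirmed record.
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditEntry:
    source_document: str           # invoice or bill reference
    extracted_fields: dict         # e.g. {"litres": 1200, "supplier": "..."}
    matched_material: str
    emission_factor: float         # kg CO2-e per unit, from the current NGA workbook
    calculated_emissions_kg: float
    confidence_at_confirmation: float
    prior_verified_matches: int
    auto_confirmed: bool           # False means a human reviewed it
```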

When an assurance practitioner (or the Clean Energy Regulator) asks how a specific record was generated, you don't say "the AI did it." You show them that this supplier-material combination has been manually verified 47 times over eight months with zero corrections, that the applied emission factor matches the current NGA Factors workbook, and that the auto-confirmation threshold was set at ten by your sustainability manager with admin approval.

That's the difference between using AI as a black box and using AI as a documented, auditable control system. The ACCC's continued focus on greenwashing (including an $8.25 million penalty against Clorox in 2025 for misleading environmental claims) means your emissions numbers need to be traceable. Not approximately right. Demonstrably right, with evidence.

What Happens When Suppliers Change

Suppliers change their invoice formats. It happens. A fuel company updates their billing system. A concrete supplier gets acquired and rebadged. An energy retailer redesigns their invoice template mid-year.

When the system encounters a document from a known supplier that doesn't match the learned pattern, it doesn't guess. Confidence drops. The record goes back to manual review. Your analyst confirms the new format, and the learning cycle starts again for that supplier's updated layout.
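In routing terms it's one comparison: if the document doesn't look enough like the layout the system has learned for that supplier, it goes back to a person. A sketch (the threshold value is illustrative):

```python
# Known supplier, unfamiliar layout: don't guess, send it back to review.
def route_known_supplier(layout_similarity: float, threshold: float = 0.9) -> str:
    if layout_similarity < threshold:
        return "manual_review"     # confidence drops; the learning cycle restarts
    return "auto_confirm_check"
```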

This is intentional friction. We'd rather ask your team to re-confirm a supplier whose format changed than silently extract the wrong number from a layout the system hasn't seen. The cost of a five-second manual confirmation is negligible. The cost of an incorrect emission factor flowing into your NGER return is not.

The same logic applies when emission factors change. DCCEEW updates the NGA Factors workbook annually. When Carbonly ingests the new factors, any supplier-material combination affected by the change gets flagged. Auto-confirmation pauses for those combinations until your team verifies the new factor. It's a reset, not a failure. The system is doing exactly what it should: recognising that something material has changed and asking a human to validate.
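The factor-update reset can be pictured the same way: diff the new workbook against the values each combination was verified with, and pause anything that moved. A minimal sketch:

```python
# Pause auto-confirmation for any material whose factor changed in the new workbook.
def materials_needing_reverification(verified: dict[str, float],
                                     new_workbook: dict[str, float]) -> set[str]:
    """Return materials whose factor value differs between the two dicts."""
    return {m for m, old in verified.items() if new_workbook.get(m, old) != old}
```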

Start With Your Highest-Volume Suppliers

If you're evaluating whether this approach works for your organisation, here's a concrete way to test it.

Pick your top twenty suppliers by invoice volume. For most companies, those twenty suppliers account for 60-70% of all emission-relevant documents. Upload three months of invoices from those suppliers. Spend the first two weeks confirming matches carefully. By the end of month one, most of those suppliers should be auto-confirming.

Then look at the numbers. How many records went through without correction? How much review time did you save in month two vs. month one? What's the error rate on auto-confirmed records vs. manually entered ones?

That data tells you whether a learning system fits your reporting workflow better than a static one. If you want to run that test, reach out at hello@carbonly.ai and we'll walk you through it.