Duplicate Invoices Are Silently Inflating Your Carbon Numbers
The same diesel delivery gets uploaded as a fuel card CSV, a supplier PDF, a photographed docket, and a line on the monthly statement. Four entries. One delivery. Your Scope 1 figure is now 3-8% too high - and your auditor will find it before you do.
A sustainability manager we spoke with recently ran the numbers on her company's FY2025 diesel consumption and got a figure that seemed too high. Not dramatically - about 6% above what operations expected. She'd already submitted the draft to her consultant. Nobody questioned it.
Then she noticed two delivery dockets for the same date, same site, same quantity - 8,200 litres of diesel to a construction yard in western Sydney. One came from the fuel card portal as a CSV export. The other arrived as a PDF invoice emailed by the supplier. Different file names. Different formats. Different upload dates. Same fuel.
She went looking for more. Found eleven pairs like it across a single quarter. Each pair counted the same delivery twice in the emissions calculation. The total over-report: 74 tonnes of CO2-e. Not enough to raise eyebrows on its own - but multiplied across a full year and four fuel suppliers, it added roughly 5% to the company's reported Scope 1 emissions.
This isn't an unusual story. It's the most common data quality problem in carbon accounting that nobody talks about.
The Mechanics of Accidental Duplication
Duplicate invoices in accounts payable are a well-documented problem. APQC research puts the rate of duplicate or erroneous payments at 0.8% to 2% of total disbursements. The Institute of Finance and Management estimates up to 1.5% of outgoing cash flow goes to duplicate payments.
Carbon accounting inherits this problem and makes it worse. Much worse.
In AP, invoices typically arrive through one or two channels - a supplier portal and an email inbox. The AP team knows what to expect. In carbon accounting, the same transaction can enter through four or five channels simultaneously. And because the people uploading data often don't talk to each other, nobody realises the same fuel delivery is being counted multiple times.
Here's how it happens with a single diesel delivery:
The fuel card company exports a monthly CSV through their portal. An admin downloads it and uploads it to the carbon accounting system. Meanwhile, the fuel supplier emails a PDF invoice to accounts payable, who forward it to the sustainability team. The site manager photographs the delivery docket on their phone and uploads it through a mobile app. And at the end of the month, the supplier sends a consolidated statement that includes the same transaction as a line item - which gets processed again.
Same fuel. Same litres. Same date. Four entries.
This pattern repeats across electricity bills (retailer portal download plus emailed PDF plus quarterly summary), gas invoices, water bills, and waste collection records. Across a portfolio of 20+ sites with multiple suppliers, the duplicate rate we observe in carbon accounting datasets is consistently higher than the AP benchmark - closer to 3-8% of total transaction volume when you include near-duplicates.
That 3-8% range isn't from a single study. It's an industry-wide observation based on what we and others in the carbon accounting space see when companies first run proper deduplication checks against their historical data. The exact rate depends on how many ingestion channels you have, how many people are involved in data collection, and whether you have any matching controls in place.
Why Simple Deduplication Doesn't Catch It
If you've read our earlier post on SHA-256 file hashing for document deduplication, you know that exact-match detection is a solved problem. Same file, same hash, caught instantly.
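For a refresher, the exact-match layer is only a few lines. Here's a minimal Python sketch - the function name is ours, not from the earlier post - that digests a file in chunks so large PDFs don't need to fit in memory:

```python
import hashlib

def file_hash(path: str) -> str:
    """Return the SHA-256 digest of a file, read in 64 KB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Two byte-identical uploads produce the same digest;
# any difference in content - even one byte - produces a different one.
```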
But that's not what's happening here.
The fuel card CSV and the supplier PDF are completely different files. Different formats. Different layouts. Different metadata. A hash-based check won't flag them because they aren't the same document. They're different documents that describe the same real-world event.
And the differences go deeper than file format. The amounts might not match exactly. The CSV might show 8,000 litres. The PDF invoice might show 8,012 litres - the same delivery, measured by a different flow meter with slightly different calibration. The delivery docket photograph might be partially illegible, with the AI reading it as 8,100 litres. Close enough to be the same delivery. Different enough to slip past an exact-match check.
Dates create another problem. The fuel card transaction shows the delivery date: 15 March. The supplier invoice is dated 18 March - when they raised it, not when they delivered. The monthly statement shows 31 March. The site manager uploaded the docket photo on 17 March. Four different dates for the same event. Any deduplication system that requires exact date matching will miss it entirely.
Then there's the supplier name. "Acme Fuels" on the fuel card CSV. "ACME FUELS PTY LTD" on the invoice. "Acme" on the delivery docket. "Acme Fuels Pty Ltd trading as AF Energy" on the statement. Simple string matching fails. Even case-insensitive matching fails when the legal name differs from the trading name.
This is why spreadsheet-based carbon accounting is particularly vulnerable. You can't build fuzzy matching logic into a VLOOKUP.
What This Costs You - Beyond the Wrong Number
An inflated emissions figure isn't just an accuracy problem. It triggers a chain of real consequences.
Under NGER, your reported emissions determine whether you hit regulatory thresholds. The Safeguard Mechanism applies to facilities emitting over 100,000 tonnes CO2-e per year, with baselines declining at 4.9% annually to 2030. If duplicate data pushes your reported emissions above your baseline, you're buying ACCUs you don't actually need. At current spot prices around $30-35 per tonne, a 5% over-report on a 120,000-tonne facility is 6,000 phantom tonnes - roughly $180,000 to $210,000 in unnecessary ACCU purchases. For nothing.
And if the over-reporting pushes you above the 100,000-tonne facility threshold in the first place? You've just triggered Safeguard Mechanism obligations that shouldn't apply to you. That's not just money - it's an entire compliance program built on inflated data.
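The ACCU arithmetic above is worth making explicit. A quick sketch using the illustrative figures from this section - the spot price range and over-report rate are the scenario's numbers, not a forecast:

```python
# Illustrative only: figures taken from the scenario above.
reported_t = 120_000            # facility's reported t CO2-e
over_report_rate = 0.05         # duplicate-driven inflation
spot_low, spot_high = 30, 35    # assumed ACCU spot price range, $/t

phantom_t = reported_t * over_report_rate      # tonnes that never happened
cost_range = (phantom_t * spot_low, phantom_t * spot_high)
# 6,000 phantom tonnes -> $180,000-$210,000 in unnecessary ACCU purchases
```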
Under AASB S2, overstated emissions in your sustainability report are a potential material misstatement. AASB S2 requires disclosure of absolute gross greenhouse gas emissions expressed as metric tonnes of CO2-e, classified separately for Scope 1 and Scope 2. If your Scope 1 number is 5% too high because of duplicate fuel transactions, and that percentage moves the figure by thousands of tonnes, an auditor performing ASSA 5010 limited assurance over your Scope 1 and 2 disclosures has a problem with your data.
Under ASSA 5010, assurance practitioners performing even limited assurance are required to obtain sufficient appropriate evidence. They're testing for both completeness and accuracy. Duplicates fail the accuracy test. And when an auditor traces a sampled emission figure back to its source document and finds two source documents for the same real-world transaction, that's an assurance finding - it calls into question whether the rest of the dataset has the same problem. One flagged duplicate can turn a limited assurance engagement into a much longer (and more expensive) conversation.
For your reduction targets, the damage is subtler but just as real. If your baseline year contains duplicate-inflated emissions, your subsequent "reductions" are partly just improved data hygiene. You haven't actually reduced emissions - you've reduced data errors. That's a problem for SBTi validation, CDP disclosure, and any public commitment. And if the ACCC decides your reduction claims are misleading because your baseline was wrong, the penalties are significant - up to $50 million per contravention under Australian Consumer Law.
What Actually Works for Near-Duplicate Detection
Catching near-duplicates requires a fundamentally different approach from catching exact duplicates. Instead of comparing files, you need to compare the real-world events those files describe. That means working with extracted data, not raw documents.
Configurable match fields. The system needs to compare records across multiple dimensions simultaneously: supplier, material type, quantity, delivery date, site, and where available, invoice or reference numbers. No single field is sufficient on its own - a company might receive 8,000 litres of diesel from the same supplier to the same site twice in a month. That's two legitimate deliveries, not a duplicate. But 8,000 litres and 8,012 litres from the same supplier, to the same site, within three days? That's almost certainly the same delivery recorded twice.
Tolerance settings. Quantities need a matching window - not exact equality. Two flow meters measuring the same diesel delivery will give slightly different readings. We've found that a 5% quantity tolerance and a 7-day date window catches the vast majority of near-duplicates without generating excessive false positives. But these tolerances need to be configurable, because a 5% tolerance on a 500-litre delivery is 25 litres (probably just measurement variance), while 5% on a 50,000-litre delivery is 2,500 litres (which could be a genuinely different transaction).
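The two ideas above - multi-field matching and configurable tolerances - can be sketched in a few lines. This is an illustrative Python sketch, not Carbonly's implementation: the record fields and function names are ours, and the 5% / 7-day defaults are the values discussed above:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class FuelRecord:
    supplier: str
    site: str
    litres: float
    delivered: date

def likely_duplicate(a: FuelRecord, b: FuelRecord,
                     qty_tol: float = 0.05,
                     date_window: timedelta = timedelta(days=7)) -> bool:
    """Flag two records as probably describing one real delivery.
    No single field decides it: supplier, site, quantity, and date
    must all line up within the configured tolerances."""
    if a.supplier != b.supplier or a.site != b.site:
        return False
    qty_close = abs(a.litres - b.litres) <= qty_tol * max(a.litres, b.litres)
    date_close = abs((a.delivered - b.delivered).days) <= date_window.days
    return qty_close and date_close
```

With the example from this article - 8,000 litres on 15 March and 8,012 litres on 18 March, same supplier and site - this returns True. Note that supplier names are compared exactly here, which is why the fuzzy name matching discussed next matters.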
Fuzzy supplier name matching. Matching "Acme Fuels" to "ACME FUELS PTY LTD" requires more than case-insensitive string comparison. It needs normalisation - stripping common suffixes (Pty Ltd, Pty Limited, P/L, ABN numbers), standardising case and whitespace, and applying similarity scoring. This is table-stakes functionality in AP automation, but most carbon accounting tools haven't caught up.
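A normalisation pass along these lines might look like the following sketch. The suffix list and helper names are illustrative - a production matcher would need a broader suffix dictionary and a tuned similarity threshold:

```python
import re
from difflib import SequenceMatcher

# Common Australian entity suffixes; longer patterns first so
# "pty ltd" is stripped whole rather than just the trailing "ltd".
SUFFIXES = r"\b(pty\s+ltd|pty\s+limited|proprietary\s+limited|p/l|ltd|limited)\b"

def normalise_supplier(name: str) -> str:
    """Lowercase, drop trading-as tails and entity suffixes,
    strip punctuation, collapse whitespace."""
    s = name.lower()
    s = re.split(r"\btrading as\b|\bt/a\b", s)[0]
    s = re.sub(SUFFIXES, " ", s)
    s = re.sub(r"[^\w\s]", " ", s)
    return re.sub(r"\s+", " ", s).strip()

def supplier_similarity(a: str, b: str) -> float:
    """Similarity score (0-1) between two normalised supplier names."""
    return SequenceMatcher(None, normalise_supplier(a),
                           normalise_supplier(b)).ratio()
```

After normalisation, "ACME FUELS PTY LTD" and "Acme Fuels Pty Ltd trading as AF Energy" both reduce to "acme fuels" and score a perfect match; the bare "Acme" on the docket scores lower and would fall to a similarity threshold.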
Pair-wise comparison with evidence. When the system flags a potential duplicate, it shouldn't just say "possible duplicate found." It should present the evidence side-by-side: both records, the matched fields, the differences, the calculated CO2-e impact of keeping both versus keeping one, and a recommendation. Something like: "Both records show approximately 8,000 litres of diesel delivered to Site 7 within 3 days. Reference numbers DEL-2026-0847 and DEL-0847-2026 appear to be the same delivery with transposed numbering. Keeping both adds 21,680 kg CO2-e to Scope 1. Recommend keeping the original, flagging the duplicate."
CO2-e impact quantification. Every flagged duplicate should show the emissions impact. Not just "this might be a duplicate" - but "this duplicate represents 21.7 tonnes of CO2-e that would inflate your Scope 1 figure." That's what makes the sustainability manager pay attention. And it's what makes the business case for investing in data quality controls.
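Converting a flagged duplicate into an emissions impact is a one-line calculation once you have an emission factor. A sketch, assuming a combined diesel factor of about 2.71 kg CO2-e per litre - consistent with the 21,680 kg figure above, though in practice you'd use the current published NGA Factors:

```python
DIESEL_KG_CO2E_PER_L = 2.71  # illustrative combined factor; check current NGA Factors

def duplicate_impact_kg(litres: float) -> float:
    """CO2-e a duplicate fuel record would wrongly add to Scope 1."""
    return litres * DIESEL_KG_CO2E_PER_L

# An 8,000-litre duplicate inflates Scope 1 by about 21,680 kg CO2-e.
```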
We're not going to pretend this is fully automated end-to-end. The fuzzy matching catches probable duplicates, but a human still needs to make the final call on borderline cases. Two diesel deliveries of 7,800 and 8,200 litres to the same site five days apart - is that a duplicate with measurement variance, or two separate deliveries? The system flags it. A person decides.
The Broader Data Quality Problem
Duplicates are just one species of data quality error. They tend to be the most financially material because they systematically inflate your numbers, but they're not the only thing that should keep you up at night.
Missing data is the mirror image of duplication. A gas bill that never arrived, a site that went unreported for a quarter, a supplier who changed their invoicing schedule. Missing data understates your emissions, which creates its own NGER compliance risk - the Clean Energy Regulator requires complete reporting, and an unexplained 25% drop in a facility's gas consumption will attract scrutiny just as much as an unexplained spike. Our anomaly detection module flags gaps in expected periodic data - if a monthly bill stops arriving, the system notices within the next billing cycle.
Round number patterns suggest estimation rather than measurement. If every quarterly gas bill for a facility ends in 000, someone is probably rounding or estimating. That's not necessarily wrong - estimated reads happen - but it means the data carries more uncertainty than measured reads, and your Basis of Preparation document should acknowledge it.
Unit confusion is the one that produces the most spectacular single-point errors. A gas bill entered as 45,000 GJ instead of 45,000 MJ is off by a factor of 1,000. At the NGA Factors 2025 rate of 51.53 kg CO2-e per GJ for natural gas, that's roughly a 2,300-tonne error from one data entry. We've written separately about unit conversion problems - they're rarer than duplicates but far more destructive when they occur.
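The GJ/MJ example can be checked directly. A sketch using the factor cited above:

```python
NATURAL_GAS_KG_CO2E_PER_GJ = 51.53  # NGA Factors 2025 rate cited above

def gas_emissions_t(quantity: float, unit: str) -> float:
    """Tonnes CO2-e from a natural gas quantity given in GJ or MJ."""
    gj = quantity if unit == "GJ" else quantity / 1000.0  # MJ -> GJ
    return gj * NATURAL_GAS_KG_CO2E_PER_GJ / 1000.0       # kg -> t

wrong = gas_emissions_t(45_000, "GJ")   # the mis-keyed unit: ~2,319 t
right = gas_emissions_t(45_000, "MJ")   # what the bill meant: ~2.3 t
# One wrong unit inflates the figure by roughly 2,300 tonnes CO2-e.
```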
Statistical outliers are the hardest to categorise. A facility whose electricity consumption doubles in Q3 might have a data error, or it might have installed new equipment. A waste collection that drops to zero might be a missing invoice, or the site might have genuinely stopped producing waste. Outlier detection flags the anomaly; only domain knowledge resolves it. We use z-score analysis against rolling baselines to catch these, but we won't pretend the system can distinguish between a real operational change and a data error. That still takes a human.
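A rolling z-score check of the kind described can be sketched as follows. The window size and threshold are illustrative defaults, not recommendations:

```python
from statistics import mean, stdev

def zscore_flags(series, window=8, threshold=3.0):
    """Flag points whose z-score against a trailing baseline
    exceeds the threshold. Returns (index, z-score) pairs -
    a human still decides whether each flag is an error or a
    genuine operational change."""
    flags = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: any change is notable, handle separately
        z = (series[i] - mu) / sigma
        if abs(z) > threshold:
            flags.append((i, round(z, 2)))
    return flags
```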
The common thread across all of these is that they're invisible in a spreadsheet. You can stare at 2,000 rows of utility data and not see the duplicate, the missing quarter, or the unit error. Your eyes glaze over around row 150. The ANAO found that 72% of NGER reports contained errors, with 17% containing significant ones. Manual processes aren't just slow - they're unreliable.
Building a Deduplication Control That Auditors Trust
When an ASSA 5010 assurance practitioner walks in the door, they're looking for evidence that your data pipeline has integrity controls. Not just that your final number is right - but that your system is designed to produce the right number consistently. That's the lesson from Beach Energy's enforceable undertaking with the Clean Energy Regulator: the CER doesn't just want correct reports, they want evidence of controls.
A deduplication control that auditors trust needs three things.
First, it needs to run automatically on every record, not just the ones someone happens to check. If your deduplication process depends on a person eyeballing a spreadsheet, that's not a control - it's a hope. Auditors distinguish between detective controls (which find problems after the fact) and preventive controls (which stop problems from entering the system). Automated deduplication at the ingestion layer is preventive. That's what they want to see.
Second, it needs a complete audit trail. Every flagged duplicate needs a record: when it was flagged, what matching criteria triggered it, what the emissions impact would have been, and how it was resolved. Kept or removed? By whom? When? Under NGER record-keeping requirements, you need to maintain these records for five years from the end of the reporting year. Under ASRS, your auditor needs to trace any emissions figure back through the full evidence chain.
Third, it needs to be documented in your Basis of Preparation. AASB S2 requires entities to explain their measurement approach, inputs, and assumptions. If your deduplication process uses a 5% quantity tolerance and a 7-day date window, say so. If you resolve borderline cases manually, say so. Auditors don't expect perfection. They expect transparency about your methodology - including its limitations.
We're honest that our near-duplicate matching isn't perfect for every edge case. Amended invoices versus corrections remain genuinely ambiguous. Two legitimate deliveries that happen to be similar quantities on similar dates will occasionally get flagged as potential duplicates when they're not. But false positives are a minor inconvenience - false negatives (missed duplicates that inflate your numbers) are a compliance risk. We'd rather over-flag and let a person decide than under-flag and let duplicates through.
What to Do This Quarter
If you haven't checked your emissions dataset for duplicates, do it before your next reporting cycle. Not after.
Start with your highest-volume, highest-emission transaction type. For most companies, that's diesel. Pull every diesel record for the last quarter. Sort by site, then by date, then by quantity. Look for records within 7 days of each other, from the same supplier, to the same site, with quantities within 5% of each other. You'll probably find some. We've never seen a company run this check for the first time and find zero.
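That sort-and-scan check is simple enough to run as a throwaway script. A sketch, assuming your records reduce to (site, supplier, date, litres) tuples:

```python
from datetime import date

def first_pass_duplicates(records, qty_tol=0.05, day_window=7):
    """records: list of (site, supplier, date, litres) tuples.
    Sort by site, supplier, date, then compare adjacent rows -
    the manual check described above, automated."""
    recs = sorted(records, key=lambda r: (r[0], r[1], r[2]))
    suspects = []
    for a, b in zip(recs, recs[1:]):
        same_origin = a[0] == b[0] and a[1] == b[1]
        close_dates = abs((b[2] - a[2]).days) <= day_window
        close_qty = abs(a[3] - b[3]) <= qty_tol * max(a[3], b[3])
        if same_origin and close_dates and close_qty:
            suspects.append((a, b))
    return suspects
```

This only compares adjacent rows after sorting, so it's a first pass, not a complete check - a production system compares every pair within the date window. But it's enough to tell you whether you have a problem.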
Then count the ingestion channels. How many ways does a utility bill or fuel record enter your system? If it's more than two, you almost certainly have duplicates. Every additional channel multiplies the risk. And if you're heading into ASRS Group 2 reporting from July 2026 or preparing for your next NGER submission, your auditor will ask about this. Better to find the duplicates yourself than to have them found for you.
If you want to see how Carbonly handles this - both the exact-match hashing and the near-duplicate fuzzy matching - upload a batch of your real documents and check the deduplication log. The system shows you exactly what it caught, what it flagged, and what the CO2-e impact would have been.
Because the worst duplicate is the one you don't know about.
Related reading: