
vgrid diff

Reconcile two datasets by key column. Reports matched rows, rows only in left/right, and value differences with optional numeric tolerance.

Either <left> or <right> can be - to read from stdin. Format is inferred from the other file’s extension, or set with --stdin-format.

vgrid diff <left> <right> --key <column> [options]

| Option | Description |
| --- | --- |
| --key | Key column: name, letter (A), or 1-indexed number (required) |
| --match | Matching mode: exact (default) or contains (substring) |
| --key_transform | Transform keys before matching: none, trim (default), digits, alnum |
| --compare | Columns to compare (comma-separated; omit for all non-key) |
| --tolerance | Numeric tolerance, absolute (default: 0) |
| --on_duplicate | Policy for duplicate keys: error (default) |
| --on_ambiguous | Policy for ambiguous matches (contains mode): error (default), report |
| --save-ambiguous | Export ambiguous matches to CSV file (written before exit, even on error) |
| --contains-column | Right-side column to search for substring matches (default: key column). Only valid with --match contains. Accepts name, letter, or 1-indexed number. |
| --no-fail | Exit 0 if the command ran successfully, even with diffs or ambiguous matches. Parse and usage errors still exit non-zero. |
| --out | Output format: json (default) or csv |
| --output | Output file (default: stdout) |
| --summary | Summary destination: stderr (default), json, none |
| --export | Export rows by status to CSV (repeatable). Format: STATUS:PATH. Statuses: only_left, only_right, matched, diff, ambiguous. |
| --export-side | Which side’s columns to include in exports: left (default), right, both. both adds a metadata prefix plus right_-prefixed columns. |
| --no_headers | Treat first row as data (generate A, B, C headers) |
| --header_row | Header row number, 1-indexed |
| --delimiter | CSV delimiter (default: ,) |
| --stdin-format | Format for stdin when using - (inferred from other file if omitted) |
| --strict-exit | Exit 1 on any diff, even within tolerance (Unix-diff semantics) |
| -q, --quiet | Suppress stderr summary and warnings |

# Basic diff by name column
vgrid diff q3.csv q4.csv --key name

# With numeric tolerance (e.g., rounding differences)
vgrid diff expected.csv actual.csv --key sku --tolerance 0.01

# Left side from stdin
cat export.csv | vgrid diff - baseline.csv --key id

# Right side from stdin
docker exec db dump.sh | vgrid diff expected.csv - --key sku

# Stdin with explicit format
cat data.tsv | vgrid diff - reference.csv --key id --stdin-format tsv

# Substring matching (left key contained in right key)
vgrid diff short_codes.csv full_names.csv --key id --match contains --on_ambiguous report

# Export ambiguous matches for manual review
vgrid diff vendor.xlsx ours.csv --key Order --match contains \
--on_ambiguous error --save-ambiguous ambiguous.csv

# Compare specific columns only
vgrid diff before.csv after.csv --key id --compare "price,quantity"

# CSV output to file
vgrid diff left.csv right.csv --key name --out csv --output result.csv

# Search a different column for substring matches
vgrid diff excel.csv line_items.csv \
--key Invoice --match contains \
--contains-column description --on_ambiguous report --out csv

# Agent-friendly mode: always exit 0
vgrid diff before.csv after.csv --key id --no-fail

# Export unmatched rows to CSV for a second pass
vgrid diff remit.csv ledger.csv --key Invoice --no-fail \
--export only_left:/tmp/unmatched.csv

# Export matched rows with both sides for audit
vgrid diff left.csv right.csv --key Invoice --no-fail \
--export matched:/tmp/matched.csv --export-side both

# Multiple exports in one invocation
vgrid diff left.csv right.csv --key Invoice --no-fail \
--export only_left:/tmp/unmatched.csv \
--export matched:/tmp/matched.csv

In contains mode, --key selects the left key column; the right “search text” is either the right key column or --contains-column.

Duplicate right-side keys are allowed in contains mode and routed through --on_ambiguous. This is expected, because substring matching naturally produces multiple candidates. Use --save-ambiguous to export ambiguous matches for manual review.

# Match invoice IDs from Excel against line item descriptions
vgrid diff orders.csv line_items.csv \
--key Invoice --match contains \
--contains-column description \
--on_ambiguous report --save-ambiguous ambiguous.csv --no-fail
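
How contains mode sorts left keys into outcomes can be pictured with a short Python sketch. It illustrates the behavior described above and is not vgrid's implementation: the function name classify_contains is hypothetical, the labels are borrowed from the export statuses, and "matched" here only means a unique candidate was found, before any value comparison.

```python
def classify_contains(left_key, right_texts):
    """Return (status, candidates) for one left key under substring matching."""
    # A right-side text is a candidate when it contains the left key.
    candidates = [text for text in right_texts if left_key in text]
    if not candidates:
        return "only_left", candidates
    if len(candidates) == 1:
        return "matched", candidates
    # More than one candidate: this is the case --on_ambiguous decides.
    return "ambiguous", candidates


right = ["100154612", "100154312", "200998877"]
print(classify_contains("12", right))    # two candidates contain "12"
print(classify_contains("9988", right))  # exactly one candidate
print(classify_contains("555", right))   # no candidate
```

Short left keys such as 12 are the usual source of ambiguity, which is why reviewing the --save-ambiguous file is worthwhile before trusting a contains-mode run.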

Either side of diff can be - to read from stdin (but not both). This enables piping live exports directly into reconciliation without temp files:

# Pipe a database export into diff
rails runner 'Ledger.export_csv' | vgrid diff - expected.csv --key id
# Pipe right side
curl -s api.example.com/data.csv | vgrid diff baseline.csv - --key sku

When one side is -, the stdin format is inferred from the other file’s extension. Use --stdin-format when the other file’s extension doesn’t match the stdin data, or when the extension is ambiguous.

{
  "summary": {
    "left_rows": 50,
    "right_rows": 48,
    "matched": 45,
    "only_left": 5,
    "only_right": 3,
    "diff": 12,
    "diff_outside_tolerance": 8,
    "tolerance": 0.01,
    "key": "name",
    "match": "exact",
    "key_transform": "trim"
  },
  "results": [
    {
      "status": "matched",
      "key": "Alice",
      "diffs": null
    },
    {
      "status": "diff",
      "key": "Bob",
      "diffs": [
        { "column": "amount", "left": "1200", "right": "1350", "delta": 150.0, "within_tolerance": false }
      ]
    },
    {
      "status": "only_left",
      "key": "Carol"
    }
  ]
}
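
A script consuming this JSON might pull out only the material differences. The following Python sketch assumes the field names shown in the sample above; the helper name material_diffs is hypothetical, and treating a missing within_tolerance field as "outside tolerance" is an assumption, not documented behavior.

```python
import json


def material_diffs(report):
    """Yield (key, column, delta) for every diff outside tolerance."""
    for result in report["results"]:
        # "diffs" may be null (matched rows) or absent (only_left rows).
        for diff in result.get("diffs") or []:
            if not diff.get("within_tolerance", False):
                yield result["key"], diff["column"], diff.get("delta")


# Trimmed copy of the sample output above.
sample = json.loads("""
{
  "summary": {"diff": 12, "diff_outside_tolerance": 8, "tolerance": 0.01},
  "results": [
    {"status": "matched", "key": "Alice", "diffs": null},
    {"status": "diff", "key": "Bob", "diffs": [
      {"column": "amount", "left": "1200", "right": "1350",
       "delta": 150.0, "within_tolerance": false}
    ]},
    {"status": "only_left", "key": "Carol"}
  ]
}
""")

for key, column, delta in material_diffs(sample):
    print(key, column, delta)
```

In practice the report would come from vgrid's stdout or the --output file rather than an inline string.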
| Transform | Behavior |
| --- | --- |
| none | Compare keys as-is |
| trim | Strip leading/trailing whitespace (default) |
| digits | Extract only ASCII digits from keys |
| alnum | Strip non-ASCII-alphanumeric characters and uppercase. Example: Order #O2025-X becomes ORDERO2025X |
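
The four transforms are simple enough to approximate in a few lines of Python. This is a sketch of the documented behavior, useful for predicting which keys will collide after transformation; it is not vgrid's own code, and the function name transform_key is hypothetical.

```python
import string

ASCII_ALNUM = set(string.ascii_letters + string.digits)


def transform_key(key, mode="trim"):
    """Approximate the documented key transforms."""
    if mode == "none":
        return key
    if mode == "trim":
        return key.strip()
    if mode == "digits":
        # Keep only ASCII digits 0-9.
        return "".join(ch for ch in key if ch in string.digits)
    if mode == "alnum":
        # Keep ASCII letters and digits, then uppercase.
        return "".join(ch for ch in key if ch in ASCII_ALNUM).upper()
    raise ValueError(f"unknown key transform: {mode}")


print(transform_key("  INV-0042 ", "trim"))
print(transform_key("INV-0042", "digits"))
print(transform_key("Order #O2025-X", "alnum"))
```

Note that digits and alnum are lossy: under digits, INV-0042 and PO-0042 both reduce to 0042, so aggressive transforms can turn distinct keys into duplicates.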

When --tolerance is set, numeric values are compared with absolute tolerance. Financial formats are parsed automatically ($1,234.56, parenthesized negatives like (500)). The within_tolerance field in output indicates whether a delta falls within the threshold. Within-tolerance diffs do not cause a non-zero exit code — only diffs outside tolerance are considered material.
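
The comparison described above can be sketched in Python as follows. Several details are assumptions rather than documented behavior: the delta is taken as right minus left (consistent with the sample output, where 1200 against 1350 gives 150.0), the tolerance boundary is treated as inclusive, and the helper names parse_amount and compare_cell are hypothetical.

```python
import re

NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def parse_amount(text):
    """Parse '1200', '$1,234.56', or '(500)' into a float; None if not numeric."""
    s = text.strip()
    # Parenthesized values are accounting-style negatives.
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1]
    s = s.replace("$", "").replace(",", "").strip()
    if not NUMBER.fullmatch(s):
        return None
    value = float(s)
    return -value if negative else value


def compare_cell(column, left, right, tolerance=0.0):
    """Return a diff record for one cell, or None when the values agree."""
    l, r = parse_amount(left), parse_amount(right)
    if l is None or r is None:
        # Non-numeric cells fall back to plain string comparison.
        return None if left == right else {"column": column, "left": left, "right": right}
    delta = r - l  # assumption: right minus left
    if delta == 0:
        return None
    return {
        "column": column, "left": left, "right": right,
        "delta": delta,
        "within_tolerance": abs(delta) <= tolerance,  # assumption: inclusive
    }


print(compare_cell("amount", "1200", "1350", tolerance=0.01))
print(compare_cell("amount", "(500)", "-500.004", tolerance=0.01))
print(compare_cell("amount", "$1,234.56", "1234.56"))
```

Under this model, only records whose within_tolerance is false would count as material.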

--export STATUS:PATH writes rows matching a status to a clean CSV file. Repeatable — use multiple --export flags in one invocation.

Exports are written before exit code decisions, so the files always exist if requested, even when diff would normally exit non-zero.

# Chain into a second diff pass
vgrid diff remit.csv ledger.csv --key Invoice --no-fail \
--export only_left:/tmp/unmatched.csv
vgrid diff /tmp/unmatched.csv other_source.csv --key Invoice --no-fail

--export-side controls the CSV schema:

| Side | Schema | Use case |
| --- | --- | --- |
| left (default) | Original left columns, no metadata | Feed into next vgrid diff pass |
| right | Original right columns, no metadata | Feed into next vgrid diff pass |
| both | _status, _key, ... metadata + left columns + right_-prefixed columns | Audit artifacts, manual review |

In both mode, ambiguous rows expand to one row per candidate with _candidate_count and _candidate_index fields.

only_left always exports left-side data regardless of --export-side. only_right always exports right-side data. These never invert.

When --match contains produces ambiguous matches (one left key matches multiple right keys), --save-ambiguous <path> writes them to CSV before exiting:

vgrid diff vendor.xlsx ours.csv --key Order --match contains \
--on_ambiguous error --save-ambiguous ambiguous.csv

The CSV has three columns: left_key, candidate_count, and candidate_keys (pipe-separated):

left_key,candidate_count,candidate_keys
12,2,100154612|100154312

The file is written even when --on_ambiguous error causes exit code 4, so it is always available for manual review.
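
For a manual-review step, the ambiguous-matches file can be loaded with Python's standard csv module. The sketch below assumes the three-column layout shown above; the function name read_ambiguous is hypothetical, and an in-memory sample stands in for ambiguous.csv.

```python
import csv
import io


def read_ambiguous(fileobj):
    """Parse the ambiguous-matches CSV into a list of dicts."""
    rows = []
    for row in csv.DictReader(fileobj):
        rows.append({
            "left_key": row["left_key"],
            "candidate_count": int(row["candidate_count"]),
            # candidate_keys is a pipe-separated list.
            "candidates": row["candidate_keys"].split("|"),
        })
    return rows


sample = "left_key,candidate_count,candidate_keys\n12,2,100154612|100154312\n"
for entry in read_ambiguous(io.StringIO(sample)):
    print(entry["left_key"], entry["candidates"])
```

With a real file, open("ambiguous.csv", newline="") would replace the io.StringIO sample.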