test(tuplefile): add unit tests for CSV and JSONL parsing#664

Open
archy-rock3t-cloud wants to merge 1 commit into openfga:main from sophotechlabs:tuplefile-parsing-tests

@archy-rock3t-cloud archy-rock3t-cloud commented Mar 30, 2026

Add unit tests for CSV and JSONL parsing

Summary by CodeRabbit

  • Tests
    • Added a comprehensive test suite for data parsing covering CSV and JSONL inputs: header validation, required-field checks, various column orders, row extraction, optional fields handling, condition parsing (including JSON context), skipping blank lines, and detailed error cases for invalid or missing data. Includes unit tests for condition-column combinations and header validation.

@archy-rock3t-cloud archy-rock3t-cloud requested a review from a team as a code owner March 30, 2026 14:33
Copilot AI review requested due to automatic review settings March 30, 2026 14:33
@coderabbitai
Contributor

coderabbitai Bot commented Mar 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cb2381f5-30b5-4434-85fb-0b75a0c8f0c1

📥 Commits

Reviewing files that changed from the base of the PR and between 2db58a8 and ba87fa9.

📒 Files selected for processing (1)
  • internal/tuplefile/tuplefile_test.go
✅ Files skipped from review due to trivial changes (1)
  • internal/tuplefile/tuplefile_test.go

Walkthrough

Adds a comprehensive Go test suite for the tuplefile package covering CSV header validation, CSV row parsing (including condition handling), JSONL parsing, and focused unit tests for condition-column validation and parsing edge cases.

Changes

Cohort / File(s) | Summary
Test Suite: internal/tuplefile/tuplefile_test.go | Adds ~450 lines of tests exercising CSV header reading/validation (required headers, unknown/missing headers, empty input), CSV row parsing (tuple fields, optional user_relation, condition parsing from condition_name + optional JSON condition_context, error cases), JSONL parsing (valid lines, blanks, condition objects, invalid JSON with line info, empty input), and unit tests for parseConditionColumnsForRow and csvColumns.validate edge cases.
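
To make the condition handling summarized above concrete, here is a minimal sketch of building a condition from a condition_name cell and an optional JSON condition_context cell. The `condition` type and `parseCondition` function are hypothetical stand-ins, not the package's parseConditionColumnsForRow, and ignoring a context when the name is empty is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// condition is a hypothetical stand-in for the SDK's condition type.
type condition struct {
	Name    string
	Context map[string]any
}

// parseCondition builds a condition from two CSV cell values.
// Assumption: an empty name means the row carries no condition, and a
// non-empty context must decode as a JSON object.
func parseCondition(name, rawContext string) (*condition, error) {
	if name == "" {
		return nil, nil
	}

	cond := &condition{Name: name}
	if rawContext == "" {
		return cond, nil
	}

	if err := json.Unmarshal([]byte(rawContext), &cond.Context); err != nil {
		return nil, fmt.Errorf("failed to read condition context: %w", err)
	}

	return cond, nil
}

func main() {
	cond, err := parseCondition("inOfficeIP", `{"ip_addr":"10.0.0.1"}`)
	fmt.Println(cond.Name, cond.Context["ip_addr"], err)

	_, err = parseCondition("inOfficeIP", `{not json}`)
	fmt.Println("invalid context rejected:", err != nil)
}
```

The three outcomes (no condition, name only, name plus context) correspond to the kinds of cases the summary lists for condition parsing.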

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

codex

Suggested reviewers

  • Siddhant-K-code
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.74% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately and specifically describes the main change: adding unit tests for CSV and JSONL parsing in the tuplefile package.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


Contributor

Copilot AI left a comment

Pull request overview

Adds a new tuplefile test suite to validate tuple parsing behavior for CSV and JSONL inputs, including optional condition fields and common error cases.

Changes:

  • Adds CSV header parsing tests (required headers, invalid headers, empty input, condition header dependency).
  • Adds CSV row parsing tests (user relations, conditions, invalid condition context, headers-only input).
  • Adds JSONL parsing tests (valid lines, blank lines, conditions, invalid JSON, empty input).


Comment thread internal/tuplefile/tuplefile_test.go Outdated
var tuples []client.ClientTupleKey
err := parseTuplesFromCSV([]byte(csv), &tuples)
require.Error(t, err)
assert.Contains(t, err.Error(), "failed to read condition context on line 0")

Copilot AI Mar 30, 2026

This assertion hard-codes the line number "0" in the error message, which is an implementation detail and may be semantically confusing (CSV files are typically 1-based and include the header line). Consider asserting only the stable part of the message (e.g., "failed to read condition context") or aligning the code/test to a consistent 1-based line numbering scheme.

Suggested change
- assert.Contains(t, err.Error(), "failed to read condition context on line 0")
+ assert.Contains(t, err.Error(), "failed to read condition context")

Comment thread internal/tuplefile/tuplefile_test.go Outdated
Comment on lines +66 to +73
for _, tc := range cases {
	t.Run(tc.name, func(t *testing.T) {
		t.Parallel()

		var tuples []client.ClientTupleKey
		err := parseTuplesFromCSV([]byte(tc.csv), &tuples)
		require.NoError(t, err)
	})

Copilot AI Mar 30, 2026

The subtests call t.Parallel() while ranging over cases without capturing the loop variable. This can make subtests read the wrong tc values (flaky/incorrect). Capture tc := tc inside the loop (or avoid t.Parallel in subtests).

Comment thread internal/tuplefile/tuplefile_test.go Outdated
Comment on lines +112 to +120
for _, tc := range cases {
	t.Run(tc.name, func(t *testing.T) {
		t.Parallel()

		var tuples []client.ClientTupleKey
		err := parseTuplesFromCSV([]byte(tc.csv), &tuples)
		require.Error(t, err)
		assert.Contains(t, err.Error(), tc.expectedErr)
	})

Copilot AI Mar 30, 2026

Same loop-variable capture issue here: subtests run in parallel but tc isn't re-bound inside the for _, tc := range cases loop, so assertions may use the wrong case data.

Comment on lines +17 to +73
cases := []struct {
	name     string
	csv      string
	expected csvColumns
}{
	{
		name: "minimal required headers",
		csv:  "user_type,user_id,relation,object_type,object_id\n",
		expected: csvColumns{
			UserType:         0,
			UserID:           1,
			UserRelation:     -1,
			Relation:         2,
			ObjectType:       3,
			ObjectID:         4,
			ConditionName:    -1,
			ConditionContext: -1,
		},
	},
	{
		name: "all headers present",
		csv:  "user_type,user_id,user_relation,relation,object_type,object_id,condition_name,condition_context\n",
		expected: csvColumns{
			UserType:         0,
			UserID:           1,
			UserRelation:     2,
			Relation:         3,
			ObjectType:       4,
			ObjectID:         5,
			ConditionName:    6,
			ConditionContext: 7,
		},
	},
	{
		name: "headers in different order",
		csv:  "object_id,relation,user_id,object_type,user_type\n",
		expected: csvColumns{
			UserType:         4,
			UserID:           2,
			UserRelation:     -1,
			Relation:         1,
			ObjectType:       3,
			ObjectID:         0,
			ConditionName:    -1,
			ConditionContext: -1,
		},
	},
}

for _, tc := range cases {
	t.Run(tc.name, func(t *testing.T) {
		t.Parallel()

		var tuples []client.ClientTupleKey
		err := parseTuplesFromCSV([]byte(tc.csv), &tuples)
		require.NoError(t, err)
	})

Copilot AI Mar 30, 2026

This test defines an expected csvColumns for each case but never asserts it, so it only verifies that parsing doesn't error. Consider calling readHeaders directly (or otherwise exposing the parsed header mapping) and asserting the returned csvColumns matches tc.expected to actually test header indexing/order handling.
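
For readers unfamiliar with the mapping being discussed, here is a rough sketch of header-to-index resolution where absent optional headers map to -1, mirroring the values in the fixture above. `headerIndexes` is a hypothetical helper written for this illustration; it is not the package's readHeaders, and its signature and error messages are assumptions.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// headerIndexes maps each known header name to its column position in the
// first CSV record, using -1 for known headers that are absent.
// Hypothetical helper for illustration only.
func headerIndexes(input string, known []string) (map[string]int, error) {
	record, err := csv.NewReader(strings.NewReader(input)).Read()
	if err != nil {
		return nil, fmt.Errorf("failed to read headers: %w", err)
	}

	indexes := make(map[string]int, len(known))
	for _, name := range known {
		indexes[name] = -1
	}

	for position, header := range record {
		if _, ok := indexes[header]; !ok {
			return nil, fmt.Errorf("unknown header %q", header)
		}

		indexes[header] = position
	}

	return indexes, nil
}

func main() {
	known := []string{
		"user_type", "user_id", "user_relation", "relation",
		"object_type", "object_id", "condition_name", "condition_context",
	}

	indexes, err := headerIndexes("object_id,relation,user_id,object_type,user_type\n", known)
	if err != nil {
		panic(err)
	}

	fmt.Println(indexes["user_type"], indexes["object_id"], indexes["user_relation"])
}
```

Asserting the full mapping, rather than only the absence of an error, is what catches a regression in column ordering.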

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
internal/tuplefile/tuplefile_test.go (2)

17-21: expected header mapping fixture is currently unused.

tc.expected is populated (e.g., Line 20 and case bodies) but never asserted in the test loop (Lines 66-73). Right now this block only verifies “no error”, so header-index mapping regressions could slip through unnoticed. Consider either asserting header mapping behavior explicitly or removing expected to avoid misleading test intent.

Also applies to: 66-73

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tuplefile/tuplefile_test.go` around lines 17 - 21, The test defines
an unused expected header mapping (tc.expected of type csvColumns) but never
asserts it; update the test loop in tuplefile_test.go to compare the actual
parsed header mapping to tc.expected (or if intended, remove the expected field
from the test cases). Specifically, locate where the test calls the parser and
checks for no error (the test loop that references tc.csv and tc.name), retrieve
the produced header mapping from the parser result, and add an assertion that it
equals tc.expected (using the same equality/assert helper used elsewhere in the
file), or delete the expected field and its per-case values to avoid misleading
fixtures.

252-252: Prefer stable error assertions over exact line-number text.

The checks on Line 252 and Line 330 hardcode parser line numbers in error messages. That makes tests fragile to harmless wording/indexing changes. Consider asserting a stable substring (e.g., error category/prefix) instead of exact line suffixes.

Also applies to: 330-330

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tuplefile/tuplefile_test.go` at line 252, Replace brittle assertions
that match exact line-number text by checking a stable error substring;
specifically, change assertions like assert.Contains(t, err.Error(), "failed to
read condition context on line 0") to assert.Contains(t, err.Error(), "failed to
read condition context") (and similarly for the other assertion that hardcodes a
line number). This targets the error message text used by the parser (the
"failed to read condition context on line X" string) and ensures tests assert
the error category/prefix rather than a fragile line-number suffix.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d1c6fb81-645d-4dee-ad9f-ea78e3a4d09f

📥 Commits

Reviewing files that changed from the base of the PR and between 43059ce and 2db58a8.

📒 Files selected for processing (1)
  • internal/tuplefile/tuplefile_test.go

Signed-off-by: Artem Muterko <artem@sopho.tech>
@archy-rock3t-cloud
Author

Thanks @copilot / @coderabbitai — addressed the review:

  • Unused expected csvColumns fixture: now asserted by calling readHeaders directly in the success path and comparing the parsed csvColumns to tc.expected, so header-index mapping regressions are actually caught.
  • Brittle line-number assertion: softened "failed to read condition context on line 0" to a substring check on "failed to read condition context" so the test isn't coupled to the parser's 0-based line numbering.
  • tc := tc loop-var capture: tried this first (the classic pre-Go-1.22 fix), but the repo's golangci-lint config flags it as copyloopvar: "The copy of the 'for' variable can be deleted (Go 1.22+)". Since go.mod is at go 1.25, per-iteration scoping is the language default and the explicit re-bind would actually fail lint. Renamed tc → testCase instead, which both satisfies varnamelen (the original Lints failure) and makes parallel safety explicit.
  • Also fixed the wsl_v5 whitespace violations that were tripping the Lints job (separator lines between var decls and following err := assignments).

Rebased on main. Local go test ./internal/tuplefile/... -race -count=1 passes; golangci-lint v2.10.1 -c .golangci.yaml ./internal/tuplefile/... reports 0 issues.

The remaining Security Audits red is GO-2026-4947 in the Go stdlib (tls.Conn.Read) reachable via bufio.Scanner.Scan in read.go — pre-existing on main and being handled separately in chore/vulncheck-fix, unrelated to this PR. License Compliance is FOSSA infra.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 28, 2026

@archy-rock3t-cloud Thanks for the detailed summary — all the fixes look well-reasoned:

  • The readHeaders assertion in the success path is a meaningful improvement; header-index mapping regressions will now actually be caught.
  • The substring check for "failed to read condition context" is the right call — it tests the relevant behaviour without coupling to internal line-counter offsets.
  • The tc → testCase rename is the correct approach for Go 1.22+: per-iteration variable scoping makes tc := tc a no-op that copyloopvar will flag, and the rename simultaneously satisfies varnamelen. Good reasoning there.
  • Noted that the GO-2026-4947 (tls.Conn.Read / bufio.Scanner.Scan) vulnerability is pre-existing on main and tracked in chore/vulncheck-fix — that's out of scope here.

Triggering a fresh review of the updated branch.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
