Add edge case tests for text handling #58

cardmagic · 2025-12-27T06:34:40Z

Summary

Add comprehensive edge case tests across all components to improve test coverage and document expected behavior.

Tests Added

Bayes (10 tests)

Empty string training/classification
Unicode text handling
Emoji in text
Special characters only
Very long text (10k repetitions)
Single word classification
Whitespace-only input
Mixed case handling
Numbers in text
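
As a rough illustration of the input categories listed above, the sketch below runs each kind of text through a small tokenizer. `word_counts` is a hypothetical helper written for this example; it is not the gem's tokenizer, and the real tests exercise the Bayes classifier directly.

```ruby
# Hypothetical helper (an assumption for illustration, not the gem's API):
# lowercases the text and counts runs of letters and digits.
def word_counts(text)
  text.downcase.scan(/[\p{L}\p{N}]+/).tally
end

# One sample per edge case category from the list above.
samples = {
  'empty string'            => '',
  'whitespace only'         => " \t\n ",
  'special characters only' => '!@#$%^&*()',
  'unicode'                 => "Caf\u00e9 na\u00efve",
  'emoji'                   => "great \u{1F389}",
  'mixed case'              => 'Ruby RUBY ruby',
  'numbers'                 => 'version 2 of 3',
  'very long text'          => 'word ' * 10_000
}

samples.each do |label, text|
  counts = word_counts(text)
  puts "#{label}: #{counts.size} distinct token(s)"
end
```

Under this sketch, empty, whitespace-only, and punctuation-only input all produce an empty hash, which is the kind of boundary a classifier has to handle without raising.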

LSI (8 tests)

Empty/single item index behavior
remove_item functionality
remove_item on nonexistent item
items method
find_related excludes self
Unicode mixed with ASCII
needs_rebuild? with auto_rebuild
categories_for nonexistent item

Extensions (18 tests)

without_punctuation method (5 tests)
sum_with_identity with block and custom identity
Vector magnitude (basic, zero vector, single element, negative values)
Vector normalize (basic, unit vector, preserves direction)
Matrix diag, trans alias, element assignment
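
The vector cases above (zero vector, single element, negative values) can be sketched on plain arrays. `magnitude` and `normalize` here are hypothetical stand-ins, not the gem's Vector extensions, and returning an all-zero array for a zero vector is an assumption made for this example rather than confirmed library behavior.

```ruby
# Hypothetical helpers on plain arrays (assumptions for illustration only).

# Euclidean length: square root of the sum of squared components.
def magnitude(values)
  Math.sqrt(values.sum { |v| v * v })
end

# Scales the vector to unit length. Assumption: a zero vector is returned
# as all zeros instead of dividing by zero.
def normalize(values)
  mag = magnitude(values)
  return values.map { 0.0 } if mag.zero?

  values.map { |v| v / mag }
end

puts magnitude([3, 4])    # basic case
puts magnitude([-3, -4])  # negative values give the same length
puts magnitude([7])       # single element
puts magnitude([0, 0])    # zero vector
p normalize([3, 4])       # unit vector pointing the same way
p normalize([0, 0])       # zero vector guard
```

Normalizing an already normalized vector should leave it unchanged up to floating point error, which is what a "unit vector" test would check.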

Coverage

| Metric | Before | After  |
|--------|--------|--------|
| Line   | 91.39% | 92.78% |
| Branch | 72.64% | 76.42% |

Fixes #52

Add comprehensive edge case tests across all components:

Bayes:
- Empty string training/classification
- Unicode and emoji handling
- Special characters, whitespace-only input
- Very long text, mixed case, numbers in text

LSI:
- Empty/single item index behavior
- remove_item and items methods
- categories_for edge cases
- Unicode mixed with ASCII

Extensions:
- without_punctuation method coverage
- Vector magnitude/normalize (including zero vector)
- Matrix diag, trans alias, element assignment
- Array sum_with_identity with blocks

Coverage improved to 93% line, 76% branch.

Fixes #52

Copilot

Pull request overview

This PR adds edge case tests for the Bayes, LSI, and Extensions components. The additions focus on boundary conditions, Unicode handling, and error cases, both to document expected behavior and to raise line coverage from 91.39% to 92.78% and branch coverage from 72.64% to 76.42%.

Key Changes:

Added 10 edge case tests for Bayesian classifier covering empty strings, Unicode, emojis, special characters, and very long text
Added 8 edge case tests for LSI covering index operations, item management, and category lookups
Added 18 tests for Extensions covering string punctuation handling, vector operations, and matrix utilities

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| test/bayes/bayesian_test.rb | Added 10 edge case tests for empty strings, Unicode text, emojis, special characters, very long text, single words, whitespace-only input, mixed case, and numbers in text |
| test/lsi/lsi_test.rb | Added 8 edge case tests for empty/single item indexes, item removal, related item finding, Unicode handling, auto-rebuild behavior, and category lookups |
| test/extensions/word_hash_test.rb | Added 18 tests covering array sum methods with blocks and custom identities, string punctuation removal, vector magnitude and normalization, and matrix operations |


Add missing test to confirm that LSI#remove_item properly increments the version, causing needs_rebuild? to return true. This completes the test coverage for issue #51; the other required tests were added in PR #58.

Fixes #51

cardmagic requested a review from Copilot December 27, 2025 06:34

cardmagic self-assigned this Dec 27, 2025

Copilot AI reviewed Dec 27, 2025


cardmagic merged commit 8d3528d into master Dec 27, 2025
4 checks passed

cardmagic deleted the test/edge-cases branch December 27, 2025 06:35

cardmagic mentioned this pull request Dec 27, 2025

Add test for remove_item triggering needs_rebuild #59

Merged

4 tasks
