Conversation

@abrookins
Collaborator

Allow using the modulo operator (%) to enable wildcard (prefix search) support in Tag filters, e.g.:

wildcard_match = Tag("category") % "tech*"

closes #453 / RAAE-1309

Copilot AI review requested due to automatic review settings December 16, 2025 22:42
Contributor

Copilot AI left a comment

Pull request overview

This PR adds wildcard pattern matching support for Tag filters using the modulo operator (%). Users can perform prefix, suffix, or middle wildcard searches on tag fields with the * character. The % operator preserves * (leaves it unescaped), whereas the equality operator escapes special characters, including *.

Key changes:

  • Added FilterOperator.LIKE support to the Tag class with the % operator
  • Enhanced TokenEscaper.escape() method with a preserve_wildcards parameter to selectively preserve * characters
  • Added comprehensive tests for wildcard pattern matching, including edge cases and combined filters
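
To make the escaping difference concrete, here is a minimal sketch built from the two character classes quoted later in this thread. The function name `escape_token` and its structure are assumptions for illustration only; this is not the actual `TokenEscaper.escape()` implementation.

```python
import re

# Character classes as quoted later in this review thread.
DEFAULT_ESCAPED_CHARS = r"[,.<>{}\[\]\\\"\':;!@#$%^&*()\-+=~\/ ]"
ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ ]"

_DEFAULT_RE = re.compile(DEFAULT_ESCAPED_CHARS)
_NO_WILDCARD_RE = re.compile(ESCAPED_CHARS_NO_WILDCARD)


def escape_token(value: str, preserve_wildcards: bool = False) -> str:
    """Hypothetical stand-in for TokenEscaper.escape(); not the library code."""
    regex = _NO_WILDCARD_RE if preserve_wildcards else _DEFAULT_RE
    # Prefix every matched special character with a single backslash.
    return regex.sub(lambda m: "\\" + m.group(0), value)


print(escape_token("tech*-pro"))                           # tech\*\-pro
print(escape_token("tech*-pro", preserve_wildcards=True))  # tech*\-pro
```

With `preserve_wildcards=True` only the hyphen is escaped, which matches the `"tech*-pro"` expectation in the parametrized test under review.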

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File | Description
tests/unit/test_filter.py | Adds comprehensive test coverage for the new wildcard filter functionality, including tests for various patterns, special character handling, and combination with exact match filters
redisvl/utils/token_escaper.py | Extends the escape method with a preserve_wildcards parameter and adds the ESCAPED_CHARS_NO_WILDCARD regex pattern, which excludes the asterisk from escaping
redisvl/query/filter.py | Implements the __mod__ operator for the Tag class to support wildcard patterns, adds LIKE operator mappings, and updates _formatted_tag_value to conditionally preserve wildcards

Comment on lines +58 to +77
@pytest.mark.parametrize(
    "pattern,expected",
    [
        # Basic prefix wildcard
        ("tech*", "@tag_field:{tech*}"),
        # Multiple patterns via list
        (["tech*", "soft*"], "@tag_field:{tech*|soft*}"),
        # Wildcard with special chars that still get escaped
        ("tech*-pro", "@tag_field:{tech*\\-pro}"),
        # Prefix with space (space escaped, wildcard preserved)
        ("hello w*", "@tag_field:{hello\\ w*}"),
        # Multiple wildcards in same pattern
        ("*test*", "@tag_field:{*test*}"),
        # Empty pattern returns wildcard match-all
        ("", "*"),
        ([], "*"),
        (None, "*"),
        # Pattern with special characters
        ("cat$*", "@tag_field:{cat\\$*}"),
    ],

Copilot AI Dec 16, 2025

The documentation example showing Tag("category") % "elec*|soft*" should have a corresponding test case to verify that passing a string containing pipe characters works correctly. While the test at line 64 verifies that a list produces the expected output with pipes, there's no explicit test for passing a string with embedded pipe characters. Consider adding a test case like ("elec*|soft*", "@tag_field:{elec*|soft*}") to ensure this documented pattern works as expected.
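
As a rough check of why the pipe-in-string form is expected to pass through unchanged: the `ESCAPED_CHARS_NO_WILDCARD` class quoted later in this thread contains neither `|` nor `*`. The helper `build_like_clause` below is a hypothetical illustration, not the library's `_formatted_tag_value`.

```python
import re

# Character class as quoted later in this review thread.
ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ ]"


def build_like_clause(field: str, pattern: str) -> str:
    """Hypothetical helper: escape specials except '*', then wrap as a tag clause."""
    escaped = re.sub(
        ESCAPED_CHARS_NO_WILDCARD, lambda m: "\\" + m.group(0), pattern
    )
    return "@" + field + ":{" + escaped + "}"


# Neither '|' nor '*' is in the class, so the string is left untouched.
clause = build_like_clause("tag_field", "elec*|soft*")
print(clause)  # @tag_field:{elec*|soft*}
```

Under that assumption, the suggested case `("elec*|soft*", "@tag_field:{elec*|soft*}")` would pin the documented behavior in the test suite.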

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +89
@pytest.mark.parametrize(
    "pattern,expected",
    [
        # Basic prefix wildcard
        ("tech*", "@tag_field:{tech*}"),
        # Multiple patterns via list
        (["tech*", "soft*"], "@tag_field:{tech*|soft*}"),
        # Wildcard with special chars that still get escaped
        ("tech*-pro", "@tag_field:{tech*\\-pro}"),
        # Prefix with space (space escaped, wildcard preserved)
        ("hello w*", "@tag_field:{hello\\ w*}"),
        # Multiple wildcards in same pattern
        ("*test*", "@tag_field:{*test*}"),
        # Empty pattern returns wildcard match-all
        ("", "*"),
        ([], "*"),
        (None, "*"),
        # Pattern with special characters
        ("cat$*", "@tag_field:{cat\\$*}"),
    ],
    ids=[
        "prefix_wildcard",
        "multiple_patterns",
        "wildcard_with_special_char",
        "prefix_with_space",
        "multiple_wildcards",
        "empty_string",
        "empty_list",
        "none",
        "special_char_with_wildcard",
    ],
)

Copilot AI Dec 16, 2025

Consider adding test cases for set and tuple inputs to the wildcard operator, since SUPPORTED_VAL_TYPES includes these types. For example, Tag("category") % {"tech*", "soft*"} and Tag("category") % ("tech*", "soft*") should be tested to ensure they work correctly. While the implementation handles these types (converting them to lists in _set_tag_value), explicit test coverage would increase confidence in this behavior.
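
One detail worth noting for such tests: a set has no guaranteed iteration order, so an assertion on the joined output should accept either ordering. The sketch below uses hypothetical helpers (`normalize_patterns`, `like_clause`) and deliberately omits escaping; it is not the library's `_set_tag_value`.

```python
from typing import Iterable, List, Optional, Union

PatternInput = Optional[Union[str, Iterable[str]]]


def normalize_patterns(value: PatternInput) -> List[str]:
    """Hypothetical normalizer: coerce None, str, list, set, or tuple to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value else []
    return [item for item in value if item]


def like_clause(field: str, value: PatternInput) -> str:
    """Join normalized patterns with '|'; empty input falls back to match-all."""
    patterns = normalize_patterns(value)
    if not patterns:
        return "*"
    return "@" + field + ":{" + "|".join(patterns) + "}"


from_tuple = like_clause("tag_field", ("tech*", "soft*"))
from_set = like_clause("tag_field", {"tech*", "soft*"})

# Tuples keep their order; sets may come out in either order.
assert from_tuple == "@tag_field:{tech*|soft*}"
assert from_set in {"@tag_field:{tech*|soft*}", "@tag_field:{soft*|tech*}"}
```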

Comment on lines 183 to 200
"""Create a Tag wildcard filter expression for prefix matching.

This enables wildcard pattern matching on tag fields using the ``*``
character. Unlike the equality operator, wildcards are not escaped,
allowing prefix searches like ``"tech*"`` to match "technology",
"technical", etc.

Args:
    other (Union[List[str], str]): The tag pattern(s) to filter on.
        Use ``*`` for prefix matching (e.g., ``"tech*"``).

.. code-block:: python

    from redisvl.query.filter import Tag

    f = Tag("category") % "tech*"  # Prefix match
    f = Tag("category") % "elec*|soft*"  # Multiple prefix patterns
    f = Tag("category") % ["tech*", "sci*"]  # List of patterns

Copilot AI Dec 16, 2025

The docstring describes this method as "for prefix matching" and emphasizes "prefix searches", but the implementation actually supports wildcards in any position (prefix, suffix, or middle), as demonstrated by the test case "*test*" on line 70. Consider updating the docstring to clarify that wildcards can be placed anywhere in the pattern, not just at the end for prefix matching. For example: "Create a Tag wildcard filter expression for pattern matching" and mention that wildcards can be used for prefix (tech*), suffix (*tech), or middle (*tech*) matching.

Suggested change
- """Create a Tag wildcard filter expression for prefix matching.
-
- This enables wildcard pattern matching on tag fields using the ``*``
- character. Unlike the equality operator, wildcards are not escaped,
- allowing prefix searches like ``"tech*"`` to match "technology",
- "technical", etc.
-
- Args:
-     other (Union[List[str], str]): The tag pattern(s) to filter on.
-         Use ``*`` for prefix matching (e.g., ``"tech*"``).
-
- .. code-block:: python
-
-     from redisvl.query.filter import Tag
-
-     f = Tag("category") % "tech*"  # Prefix match
-     f = Tag("category") % "elec*|soft*"  # Multiple prefix patterns
-     f = Tag("category") % ["tech*", "sci*"]  # List of patterns
+ """Create a Tag wildcard filter expression for pattern matching.
+
+ This enables wildcard pattern matching on tag fields using the ``*``
+ character. Unlike the equality operator, wildcards are not escaped,
+ allowing patterns with wildcards in any position, such as prefix
+ (``"tech*"``), suffix (``"*tech"``), or middle (``"*tech*"``)
+ matches.
+
+ Args:
+     other (Union[List[str], str]): The tag pattern(s) to filter on.
+         Use ``*`` for wildcard matching (e.g., ``"tech*"``, ``"*tech"``,
+         or ``"*tech*"``).
+
+ .. code-block:: python
+
+     from redisvl.query.filter import Tag
+
+     f = Tag("category") % "tech*"  # Prefix match
+     f = Tag("category") % "*tech"  # Suffix match
+     f = Tag("category") % "*tech*"  # Contains / middle match
+     f = Tag("category") % "elec*|*soft"  # Multiple wildcard patterns
+     f = Tag("category") % ["tech*", "*science"]  # List of patterns

Collaborator

@vishal-bala vishal-bala left a comment

LGTM 👍

DEFAULT_ESCAPED_CHARS = r"[,.<>{}\[\]\\\"\':;!@#$%^&*()\-+=~\/ ]"

# Same as above but excludes * to allow wildcard patterns
ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ ]"
Collaborator

Is there a reason we'd prefer to define this in its entirety instead of as

Suggested change
- ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ ]"
+ escaped_chars_no_wildcard = DEFAULT_ESCAPED_CHARS.replace("*", "")

?

Both make sense, just wondering about the thinking behind it!

Collaborator Author

I don't have a strong preference, but I prefer to see the pattern explicitly even if it's mostly a duplicate.
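
For reference, the two spellings discussed here are equivalent for the patterns quoted in this thread, since `*` appears exactly once in `DEFAULT_ESCAPED_CHARS`. A small sketch, assuming the constants are exactly as quoted above (the name `derived` is illustrative):

```python
import re

# Constants copied from the diff quoted in this thread.
DEFAULT_ESCAPED_CHARS = r"[,.<>{}\[\]\\\"\':;!@#$%^&*()\-+=~\/ ]"
ESCAPED_CHARS_NO_WILDCARD = r"[,.<>{}\[\]\\\"\':;!@#$%^&()\-+=~\/ ]"

# The derived form removes the single '*' from the character class.
derived = DEFAULT_ESCAPED_CHARS.replace("*", "")

assert DEFAULT_ESCAPED_CHARS.count("*") == 1
assert derived == ESCAPED_CHARS_NO_WILDCARD

# Both forms compile to a class that no longer matches '*'.
assert re.search(derived, "*") is None
assert re.search(DEFAULT_ESCAPED_CHARS, "*") is not None
```

The derived form stays in sync automatically if the default class changes, while the explicit literal keeps the full pattern visible at the definition site, which is the trade-off the two comments describe.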

Collaborator

@tylerhutcherson tylerhutcherson left a comment

looks good, thanks!

@abrookins abrookins merged commit ff415fb into main Dec 18, 2025
54 checks passed
Development

Successfully merging this pull request may close these issues.

RedisVL's Tag filter does not support wildcards

4 participants