@cbullinger
Collaborator

@cbullinger cbullinger commented Dec 16, 2025

Problem

Searching for "james cameron" returned ~240 results instead of ~15 because the text operator tokenizes multi-word queries and matches using OR logic ("James" OR "Cameron").

Solution

Backend (all 3 implementations):
Use the text operator's matchCriteria: "all" option to require ALL query terms to match (AND logic), while maintaining fuzzy matching for typo tolerance.

  • Use matchCriteria: "all" for directors, writers, and cast fields
  • Fuzzy settings: maxEdits: 1, prefixLength: 2 for typo tolerance
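As a minimal sketch of the backend change described above, the clause for one field might be built like this. The helper name `build_text_clause` is hypothetical and not taken from the PR; the option names (`matchCriteria`, `fuzzy`, `maxEdits`, `prefixLength`) are the ones listed in the description.

```python
from typing import Any, Dict


def build_text_clause(query: str, path: str) -> Dict[str, Any]:
    """Build a text-operator clause that requires every query term to match.

    Hypothetical helper for illustration; the real endpoint may assemble
    its search stage differently.
    """
    return {
        "text": {
            "query": query,
            "path": path,
            # "all" applies AND logic across the tokenized query terms.
            "matchCriteria": "all",
            # Allow one edit per term, but require the first two characters to match.
            "fuzzy": {"maxEdits": 1, "prefixLength": 2},
        }
    }


# Example: one clause per searchable people field.
clauses = [
    build_text_clause("james cameron", field)
    for field in ("directors", "writers", "cast")
]
```

The resulting dictionaries could then be placed inside a `$search` stage, for example under a compound clause, depending on how the endpoint combines fields.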

Frontend:

  • Add fuzzy matching hints to search form inputs
  • Improve search modal layout with grouped sections, better spacing, and cleaner styling

Files Changed

  • mflix/server/python-fastapi/src/routers/movies.py
  • mflix/server/js-express/src/controllers/movieController.ts
  • mflix/server/js-express/src/types/index.ts
  • mflix/server/java-spring/src/main/java/com/mongodb/samplemflix/service/MovieServiceImpl.java
  • mflix/client/app/components/SearchMovieModal/SearchMovieModal.tsx
  • mflix/client/app/components/SearchMovieModal/SearchMovieModal.module.css

…eries

Change directors, writers, and cast fields from text operator to phrase
operator in the search endpoint across all three backend implementations.

The text operator with fuzzy matching tokenizes multi-word queries into
individual terms and matches using OR logic, causing searches like
'james cameron' to return ~240 results instead of ~10-15.

The phrase operator performs exact phrase matching, ensuring that only
documents where the full phrase appears are returned.
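For comparison, a phrase-operator clause for the same field could look roughly like the sketch below. The helper name `build_phrase_clause` is hypothetical; only the `query` and `path` options are used, and no fuzzy setting appears because this commit describes phrase matching as exact.

```python
from typing import Any, Dict


def build_phrase_clause(query: str, path: str) -> Dict[str, Any]:
    """Build a phrase-operator clause (hypothetical helper for illustration)."""
    return {
        "phrase": {
            "query": query,  # the terms must appear together, in this order
            "path": path,
        }
    }


directors_clause = build_phrase_clause("james cameron", "directors")
```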

Affected files:
- Python FastAPI: mflix/server/python-fastapi/src/routers/movies.py
- Express TypeScript: mflix/server/js-express/src/controllers/movieController.ts
- Java Spring: mflix/server/java-spring/src/main/java/com/mongodb/samplemflix/service/MovieServiceImpl.java
…eries

Use compound queries with AND logic for directors, writers, and cast fields
to require ALL search terms to match, preventing 'james cameron' from matching
any director with 'James' OR 'Cameron'.

Changes:
- Split multi-word queries into individual terms
- Wrap terms in compound 'must' clause (AND logic)
- Adjust fuzzy settings: maxEdits=1, prefixLength=2 for better typo tolerance
  without over-matching (e.g., prevents 'james' matching 'jane')

Single-word queries continue to use simple text operator with fuzzy matching.

Affected files:
- Python FastAPI: mflix/server/python-fastapi/src/routers/movies.py
- Express TypeScript: mflix/server/js-express/src/controllers/movieController.ts
- Express types: mflix/server/js-express/src/types/index.ts
- Java Spring: mflix/server/java-spring/src/main/java/com/mongodb/samplemflix/service/MovieServiceImpl.java
- Update placeholder text with example names (e.g. James Cameron)
- Add helper text indicating fuzzy matching support for typo tolerance
- Group related fields into visual sections (Plot, People, Options)
- Add section headers with uppercase styling
- Use 3-column grid for directors/writers/cast fields
- Consolidate fuzzy matching hint at section level
- Improve spacing, padding, and border-radius
- Add gradient styling to primary search button
- Softer button styles (outline for Clear, subtle for Close)
- Better input hover/focus states and placeholder colors
- Improved responsive breakpoints for mobile
- Cleaner vector search layout with dedicated section
Comment on lines 165 to 189
director_terms = directors.split()
if len(director_terms) == 1:
    search_phrases.append({
        "text": {
            "query": directors,
            "path": "directors",
            "fuzzy": {"maxEdits": 1, "prefixLength": 2}
        }
    })
else:
    # Use compound must clause to require all terms match (AND logic)
    search_phrases.append({
        "compound": {
            "must": [
                {
                    "text": {
                        "query": term,
                        "path": "directors",
                        "fuzzy": {"maxEdits": 1, "prefixLength": 2}
                    }
                }
                for term in director_terms
            ]
        }
    })
Collaborator

Interesting development! The new changes reduce the number of matching results but do not enforce the must clause within a single name.

From my understanding, the text operator tokenizes each array element and searches across all the tokens in the array. Therefore, it doesn't require both terms to match within a single array element.

For example, "James Cameron" works, but so does "James Todd", even though James Todd is not a real person. The query matches if "James" and "Todd" both appear anywhere in the array, not only if "James Todd" appears as a single element.

I'm unsure, but I think we have to decide whether we want exact matches with no typo tolerance (phrase operator) or typo tolerance with potentially incorrect matches. We might be able to try autocomplete or scoring, but that will get very fiddly.

Collaborator Author

I considered autocomplete but didn't want to over-engineer a sample app

Collaborator Author

Updated to use the text operator with matchCriteria: all

…ueries

Simplify multi-word search logic by using the built-in matchCriteria option
instead of manually splitting terms and wrapping in compound must clauses.

- matchCriteria: 'all' requires ALL query terms to match (AND logic)
- Maintains fuzzy matching support for typo tolerance
- Significantly reduces code complexity
- Same behavior, cleaner implementation

Ref: https://www.mongodb.com/docs/atlas/atlas-search/operators-collectors/text/
@tmcneil-mdb
Collaborator

Hmmm what are you seeing on your end when testing "James Todd" in directors or "Goldie Webber" in cast?

I am still getting "Six by Sondheim", whose directors are James Lapine, Autumn DeWilde, and Todd Hayne, and "Private Benjamin", whose cast is Goldie Hawn, Eileen Brennan, Armand Assante, and Robert Webber.

That leads me to believe it's still tokenizing them as individual terms rather than as a phrase. However, I don't see that in the docs, only in my own experimentation. Let me know.

@cbullinger
Collaborator Author

cbullinger commented Dec 19, 2025

@tmcneil-mdb that's the correct behavior with matchCriteria: all on an array field -- it's matching across the array (i.e. when there are multiple directors or cast names, it's returning every movie with "James" and "Todd" but doesn't guarantee it's the same name). That's different than returning every movie with "James" or "Todd"
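The behavior discussed above can be illustrated with a toy model in plain Python. This is not Atlas Search itself: it assumes simple whitespace tokenization, ignores fuzzy matching and analyzers, and every function name is hypothetical. The sample directors are the ones quoted earlier in this thread.

```python
from typing import List


def tokens(value: str) -> List[str]:
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return value.lower().split()


def matches_any(query: str, names: List[str]) -> bool:
    """OR logic: at least one query term appears somewhere in the array."""
    field_tokens = {t for name in names for t in tokens(name)}
    return any(t in field_tokens for t in tokens(query))


def matches_all(query: str, names: List[str]) -> bool:
    """AND logic across the whole array: every term appears in some element."""
    field_tokens = {t for name in names for t in tokens(name)}
    return all(t in field_tokens for t in tokens(query))


def matches_same_element(query: str, names: List[str]) -> bool:
    """Stricter check: every term appears within one single element."""
    terms = tokens(query)
    return any(all(t in tokens(name) for t in terms) for name in names)


directors = ["James Lapine", "Autumn DeWilde", "Todd Hayne"]

print(matches_any("James Cameron", directors))        # OR logic matches on "james" alone
print(matches_all("James Cameron", directors))        # AND logic rejects: no "cameron"
print(matches_all("James Todd", directors))           # AND logic accepts: terms span two names
print(matches_same_element("James Todd", directors))  # no single name holds both terms
```

In this model, "James Todd" passes the array-wide AND check because "james" and "todd" come from different elements, which mirrors the "Six by Sondheim" result reported above.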
