Skip to content

Scc 5168#643

Merged
danamansana merged 41 commits intomainfrom
scc-5168
Mar 20, 2026
Merged

Scc 5168#643
danamansana merged 41 commits intomainfrom
scc-5168

Conversation

@danamansana
Copy link
Contributor

@danamansana danamansana commented Feb 11, 2026

Initial work for NYQL. Still a work in progress, but so far we have:

  • A mapping defining the fields used for different types of search scopes. Where applicable, I've based this on the existing search scopes, or filters
  • A grammar for parsing CQL queries
  • Methods to transform CQL queries into ElasticSearch
    EDIT: Merged in some other tickets which add:
  • date-related queries
  • exact match (==) queries
  • compatibility with filters
  • and related tests
  • some error handling/error messages
  • a compact representation of the parsed query, returned in the API response

(this covers SCC-5168/69/70)

Still to come:

  • The queries currently being generated are pretty complicated, if this affects performance we may want to come back and try to compress them, but I am not sure yet if this is going to be an issue
  • handling ES responses, particularly for inner_hits (this may be unnecessary)
  • need some more attention to issues around escaping
  • syntactic sugar in the grammar to make it more permissive
  • the tests are pretty literal currently, ideally I can find a way to make them a little more flexible

@danamansana danamansana marked this pull request as ready for review February 26, 2026 19:54
}

function displayParsed (string) {
const parsed = rightCqlParser.getAST(reverseString(string))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to better ensure that the chain of calls you're making to parse the string in this case matches the chain of calls in parseWithRightCql?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 good point, actually parseWithRightCql returns an AST so we can just call it directly

const queryJson = ElasticQueryBuilder.forApiRequest(request).query.toJson()
if (queryJson.bool && queryJson.bool.filter) {
return { filter: queryJson.bool.filter }
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is a temporary measure until we can migrate the filter stuff into something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, we could factor out the filter stuff into a separate module shared by the cql_query_builder and ElasticQueryBuilder.

})
.map(([fieldType, fieldNames]) => fieldNames)
.flat()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having trouble following what the types of these vars are and what they represent. If this part of the code is feeling final, let's add some documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is complicated, I've added a comment, let me know if it needs work

Copy link
Contributor

@charmingduchess charmingduchess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking merge into QA - mostly formatting/importing suggestions and clarifying questions.
Great work!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to ensure that the fields are the same for equivalent search_scopes? Or do we have a reason to keep these distinct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has to do something more complicated than the search_scopes mapping, and also doesn't use the boosting. Might be a way to combine them but not sure

// We do custom field matching for this search-scope
}
},
cql: {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth a comment directing the reader to cql/index_mapping.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I've deleted that


function buildBoolean (operator, queries) {
if (['NOT', 'AND NOT'].includes(operator)) return buildNegation(queries)
const esOperator = operator === 'AND' ? 'must' : 'should'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to assume that there is validation upstream that would ensure these would only be AND or OR? Also, would that validation allow for OR NOT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that validation is done by the parser, which only allows operators that match the grammar

return children
}

function displayParsed (string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write a comment describing the left/right/reverse switcheration that (I think) is happening here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

function buildNegation (queries) {
return {
bool: {
must: [buildEsQueryFromTree(queries[0])],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At what point do the queries get split into this array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done by the buildEsQueryFromTree, which filters out the subqueries and hands them to buildBoolean in case of a boolean query

}

const table = {
exact: { term: 'term', prefix: 'prefix', fields: 'X', exact_fields: 'term' },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does 'X' represent here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are fields that we don't use in that particular case. E.g. in the exact case, don't use the basic fields, since these are text, instead use the matching exact_fields. I've added a comment

multi_match: {
query: 'Hamlet',
fields: [
'title',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe import these fields from the index-mapping.json? Would be less brittle if/when we update scopes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I want to revisit the whole testing approach eventually

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file is quite long. are there any clear groupings that could be split across different files, or even just demarcated with comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps split the boolean and atomic cases?

@danamansana danamansana merged commit e046813 into main Mar 20, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants