There is currently no way to run very large queries, such as those that require complex spatial polygons or involve many species. There is a logical solution, which I believe is already implemented in galah-Python: upload the large query to the ALA using this API, then use the returned queryID for later downloads/queries. Syntax could look something like this:
# cache a query on the ALA
# note: `copy_to()` is a dplyr generic, but isn't implemented in galah yet
query_id <- galah_call() |>
  filter(taxonConceptID %in% vector_of_many_species) |>
  copy_to("data/query_id") # syntax is a guess for now; could this use the same string as `url_lookup()`?

# use that query ID to get counts for many species
result <- galah_call() |>
  filter(qid == query_id) |>
  distinct(speciesID, .keep_all = FALSE) |>
  select(count) |> # should really be `dplyr::add_count()`, but again not implemented yet
  collect()
As a side note, this would also help us close issue #53.