-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix: handler_mod func don't work when dealing None end date #2068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi, @lingbai-kong ,Thanks for your contribution to |
|
Hello @SunsetWolf, to make reproduction easier, I’ve prepared a minimal Google Colab notebook: |
|
Hi, @lingbai-kong , Thanks to your thoughtful reply, I have now successfully reproduced the problem. First of all, I would like to confirm that you are on the right track. I have one concern about the current implementation: treating test_seg_end is None as interval = -1 introduces an implicit magic value and mixes two different semantics (open-ended segments vs. time comparison) into a single numeric branch. This makes the logic a bit harder to reason about and may be error-prone for future maintenance. In this context, None has a clear business meaning in RollingGen: it represents an open-ended segment (i.e. “until now”). I think it would be clearer and more robust to handle this case explicitly before calling cal_interval, instead of encoding it indirectly. For example: I believe this version better reflects the semantics of RollingGen, avoids magic values, and improves readability and maintainability. |
1826b79 to
2e9a00a
Compare
|
Hello @SunsetWolf, thanks for your patient explaination and kind advice. I have pushed a new version. And it looks good to me. |
|
Hi, @lingbai-kong , I’ve updated and pushed the changes. This version extracts |
|
Hi, @lingbai-kong , The changes are ready, and the PR can be merged once the CLA check passes. |
|
@microsoft-github-policy-service agree |
|
Hi @SunsetWolf, I've completed the CLA signing just now. Thanks for your help! |
|
It looks good now, it's been merged. Thank you for your contribution. |

Description
Fix the issus that training data missing/trade date is NaT randomly occurs when using route RollingStrategy with OnlineManager.
Motivation and Context
In RollingGen, the handler_mod is used to deal the case that hander's data end_time is earlier than dataset's test_data's segments. However, when the RollingGen.gen_following_tasks shifts the current segment to the next prediction window and the expected test end date is later than the current date (i.e. the segment of the last rolling round), the test end date of the newly generated segment will be allocated None value.
Then, when RollingGen calling self._update_task_segs(t, segments), handler_mod calculate the interval of hander's data end_date and the end date of the dataset's test_data's segments as follows:
Due to task["dataset"]["kwargs"]["segments"][rolling_gen.test_key][1] is None, the cal_interval raises TypeError but there is no code to handle it. Thus, the task["dataset"]["kwargs"]["handler"]["kwargs"]["end_time"] keeps its original value and finally causes incomplete data in the follow process.
How to fix it?
How Has This Been Tested?
pytest qlib/tests/test_all_pipeline.pyunder upper directory ofqlib.Run this script to reproduce the problem. Please note: the dataset's version is 20251206.
Screenshots of Test Results (if appropriate):
Types of changes