fix: handler_mod func don't work when dealing None end date #2068

lingbai-kong · 2025-12-09T16:16:29Z

Description

Fix the issue where training data is missing or the trade date is NaT, which occurs randomly when running the routine of OnlineManager with RollingStrategy.

Motivation and Context

In RollingGen, handler_mod handles the case where the handler's data end_time is earlier than the end of the dataset's test segment. However, when RollingGen.gen_following_tasks shifts the current segment to the next prediction window and the expected test end date is later than the current date (i.e. the segment of the last rolling round), the test end date of the newly generated segment is set to None.
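The shift described above can be sketched as follows. This is a simplified stand-in and not qlib's actual implementation: the calendar is a plain list of date strings, and `get_date` and `shift_segment` are hypothetical helpers that return None when an index runs past the last known trading day.

```python
from typing import List, Optional, Tuple

# Toy trading calendar (assumption); the real calendar comes from the data provider.
CALENDAR: List[str] = [
    "2025-12-01",
    "2025-12-02",
    "2025-12-03",
    "2025-12-04",
    "2025-12-05",
]


def get_date(idx: int) -> Optional[str]:
    """Return the calendar date at idx, or None if idx is past the last day."""
    return CALENDAR[idx] if idx < len(CALENDAR) else None


def shift_segment(seg: Tuple[str, str], step: int) -> Tuple[Optional[str], Optional[str]]:
    """Slide a (start, end) segment forward by `step` trading days."""
    start_idx = CALENDAR.index(seg[0]) + step
    end_idx = CALENDAR.index(seg[1]) + step
    return get_date(start_idx), get_date(end_idx)


# The shifted window still starts inside the calendar but ends after its
# last day, so the end date comes back as None (an open-ended segment).
next_test_seg = shift_segment(("2025-12-02", "2025-12-04"), step=2)
print(next_test_seg)
```

With this toy calendar the new segment starts on 2025-12-04 and has no end date, which is the situation handler_mod later receives.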

Then, when RollingGen calls self._update_task_segs(t, segments), handler_mod calculates the interval between the handler's end_time and the end date of the dataset's test segment as follows:

cal_interval(
    task["dataset"]["kwargs"]["handler"]["kwargs"]["end_time"],
    task["dataset"]["kwargs"]["segments"][rolling_gen.test_key][1],
)

Because task["dataset"]["kwargs"]["segments"][rolling_gen.test_key][1] is None, cal_interval raises a TypeError, but no code handles this case. As a result, task["dataset"]["kwargs"]["handler"]["kwargs"]["end_time"] keeps its original value, which leads to incomplete data in the subsequent steps.
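A minimal sketch of this failure mode, under stated assumptions: `cal_interval_stub` is a hypothetical stand-in that raises TypeError on a None date (it is not qlib's cal_interval), and `handler_mod_before_fix` assumes the error ends up swallowed by an except clause, which would explain why end_time silently keeps its old value.

```python
import copy
from datetime import date
from typing import Optional


def cal_interval_stub(time_a: Optional[str], time_b: Optional[str]) -> int:
    """Days between two dates; hypothetical stand-in, not qlib's cal_interval."""
    if time_a is None or time_b is None:
        raise TypeError("end date is None")
    return (date.fromisoformat(time_a) - date.fromisoformat(time_b)).days


def handler_mod_before_fix(task: dict) -> None:
    """Extend the handler's end_time if it ends before the test segment."""
    try:
        handler_kwargs = task["dataset"]["kwargs"]["handler"]["kwargs"]
        test_end = task["dataset"]["kwargs"]["segments"]["test"][1]
        if cal_interval_stub(handler_kwargs["end_time"], test_end) < 0:
            handler_kwargs["end_time"] = copy.deepcopy(test_end)
    except (KeyError, TypeError):
        # Assumption: the TypeError caused by a None end date is swallowed
        # here, so the update above is skipped without any warning.
        pass


task = {
    "dataset": {
        "kwargs": {
            "handler": {"kwargs": {"end_time": "2025-11-25"}},
            # Open-ended test segment produced by the last rolling round.
            "segments": {"test": ("2025-11-26", None)},
        }
    }
}

handler_mod_before_fix(task)
# The handler still ends before the test segment begins.
print(task["dataset"]["kwargs"]["handler"]["kwargs"]["end_time"])
```

In this sketch the handler's end_time stays at 2025-11-25 although the test segment starts on 2025-11-26, so the handler would load no data for the test window.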

How to fix it?

Force an update of the handler's end_time when the end date of the dataset's test segment is None.
Please let me know if there is a better solution.

How Has This Been Tested?

Pass the test by running pytest qlib/tests/test_all_pipeline.py from the parent directory of qlib.
If you are adding a new feature, test on your own test scripts.
Run this script to reproduce the problem. Please note: the dataset's version is 20251206.

#!/usr/bin/env python3
"""
Test program for OnlineManager

This program tests the add_strategy and routine methods of OnlineManager.
"""

import sys
import os

# Add the grandparent directory of this file to the import path.
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import pandas as pd
from typing import List, Dict
from qlib.workflow.online.manager import OnlineManager
from qlib.workflow.online.strategy import OnlineStrategy, RollingStrategy
from qlib.workflow.task.gen import RollingGen
from qlib.model.trainer import TrainerR
from qlib.workflow.recorder import Recorder
import qlib

# Initialize qlib with the local data before any task is built.
qlib.init(provider_uri="~/.qlib/tushare_data/cn_data", region='cn')


def test_online_manager():
    """
    Test OnlineManager's add_strategy and routine methods
    """
    print("=== Testing OnlineManager ===")
    ###################################
    # online model
    ###################################

    online_segments = {
        "train": ("2025-05-22", "2025-09-09"),
        "valid": ("2025-09-10", "2025-10-14"),
        "test": ("2025-10-15", "2025-11-25"),
    }
    online_data_handler_config = {
        "start_time": online_segments["train"][0],
        "end_time": online_segments["test"][1],
        "fit_start_time": online_segments["train"][0],
        "fit_end_time": online_segments["train"][1],
        "instruments": 'csi300',
        "drop_raw": True
    }
    task = {
        "model": {
            "class": "LGBModel",
            "module_path": "qlib.contrib.model.gbdt",
            "kwargs": {
                "loss": "mse",
                "colsample_bytree": 0.8879,
                "learning_rate": 0.0421,
                "subsample": 0.8789,
                "lambda_l1": 205.6999,
                "lambda_l2": 580.9768,
                "max_depth": 8,
                "num_leaves": 210,
                "num_threads": 10,
                "verbosity": 2,
            }
        },
        "dataset": {
            "class": "DatasetH",
            "module_path": "qlib.data.dataset",
            "kwargs": {
                "handler": {
                    "class": "Alpha158",
                    "module_path": "qlib.contrib.data.handler",
                    "kwargs": online_data_handler_config,
                },
                "segments": online_segments,
            },
        },
        "record": [
            {
                "class": "SignalRecord",
                "module_path": "qlib.workflow.record_temp",
                "kwargs": {"dataset": "<DATASET>", "model": "<MODEL>"},
            },
            {"class": "SigAnaRecord", "module_path": "qlib.workflow.record_temp"},
        ],
        "strategy":{
            "rolling_step": 30
        }
    }
    strategy = RollingStrategy(
                    'test',
                    task,
                    RollingGen(step=task["strategy"]["rolling_step"], rtype=RollingGen.ROLL_SD),
                )
    print("Creating OnlineManager...")
    manager = OnlineManager(
        strategies=[],
        trainer=TrainerR()
    )
    
    print(f"Initial strategies count: {len(manager.strategies)}")
    
    # Test add_strategy method
    print("\n=== Testing add_strategy ===")    
    manager.add_strategy([strategy])
    print(f"Strategies count after add_strategy: {len(manager.strategies)}")
    
    # Test routine method
    print("\n=== Testing routine ===")
    test_time = pd.Timestamp("2025-12-06")
    manager.routine(cur_time=test_time, signal_kwargs={"over_write": True})
    
    print("\n=== Test completed successfully! ===")


if __name__ == "__main__":
    test_online_manager()

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

before

fixed

Types of changes

Fix bugs
Add new feature
Update documentation

SunsetWolf · 2025-12-12T10:16:53Z

Hi @lingbai-kong, thanks for your contribution to qlib. I'm still a bit confused after reading the detailed description: how can I test the difference before and after the modification?
I used your test code with qlib's official dataset and found no difference before and after the modification; both work fine.

lingbai-kong · 2025-12-20T07:09:50Z

Hello @SunsetWolf, to make reproduction easier, I’ve prepared a minimal Google Colab notebook:
https://colab.research.google.com/drive/1_ccS_YbLTuesTExpVKGxg1FTFOAxDuQ7?usp=sharing
Please let me know if you’re still unable to reproduce it on your side.

SunsetWolf · 2025-12-23T06:11:29Z

Hi @lingbai-kong, thanks to your thoughtful reply, I have now successfully reproduced the problem.

First of all, I would like to confirm that you are on the right track.

I have one concern about the current implementation: treating test_seg_end is None as interval = -1 introduces an implicit magic value and mixes two different semantics (open-ended segments vs. time comparison) into a single numeric branch. This makes the logic a bit harder to reason about and may be error-prone for future maintenance.

In this context, None has a clear business meaning in RollingGen: it represents an open-ended segment (i.e. “until now”). I think it would be clearer and more robust to handle this case explicitly before calling cal_interval, instead of encoding it indirectly.

For example:

handler_kwargs = task["dataset"]["kwargs"]["handler"]["kwargs"]
handler_end = handler_kwargs.get("end_time")
test_seg_end = task["dataset"]["kwargs"]["segments"][rolling_gen.test_key][1]

if test_seg_end is None or rolling_gen.ta.cal_interval(handler_end, test_seg_end) < 0:
    handler_kwargs["end_time"] = copy.deepcopy(test_seg_end)

I believe this version better reflects the semantics of RollingGen, avoids magic values, and improves readability and maintainability.
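To see how the check behaves, here is a self-contained sketch that runs it on three cases. Everything outside the four lines of the check is an assumption made for illustration: `StubAdjuster` is a hypothetical stand-in for rolling_gen.ta that counts calendar days, and `update_handler_end` and `make_task` are illustrative names, not qlib functions.

```python
import copy
from datetime import date
from types import SimpleNamespace


class StubAdjuster:
    """Hypothetical stand-in for rolling_gen.ta (not qlib's implementation)."""

    def cal_interval(self, time_a: str, time_b: str) -> int:
        # Negative when time_a is earlier than time_b.
        return (date.fromisoformat(time_a) - date.fromisoformat(time_b)).days


def update_handler_end(task: dict, rolling_gen) -> None:
    """Apply the suggested check; the function name is illustrative only."""
    handler_kwargs = task["dataset"]["kwargs"]["handler"]["kwargs"]
    handler_end = handler_kwargs.get("end_time")
    test_seg_end = task["dataset"]["kwargs"]["segments"][rolling_gen.test_key][1]

    # `or` short-circuits, so cal_interval never receives a None test end.
    if test_seg_end is None or rolling_gen.ta.cal_interval(handler_end, test_seg_end) < 0:
        handler_kwargs["end_time"] = copy.deepcopy(test_seg_end)


def make_task(handler_end, test_end) -> dict:
    """Build the smallest task dict that the check needs."""
    return {
        "dataset": {
            "kwargs": {
                "handler": {"kwargs": {"end_time": handler_end}},
                "segments": {"test": ("2025-11-26", test_end)},
            }
        }
    }


rolling_gen = SimpleNamespace(ta=StubAdjuster(), test_key="test")

results = {}
for name, handler_end, test_end in [
    ("open_ended", "2025-11-25", None),
    ("handler_too_short", "2025-11-25", "2025-12-05"),
    ("handler_long_enough", "2025-12-31", "2025-12-05"),
]:
    task = make_task(handler_end, test_end)
    update_handler_end(task, rolling_gen)
    results[name] = task["dataset"]["kwargs"]["handler"]["kwargs"]["end_time"]

print(results)
```

In the open-ended case the handler's end_time becomes None as well, matching the "until now" meaning described above; a handler that already covers the test segment is left unchanged.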

lingbai-kong · 2025-12-24T14:27:55Z

Hello @SunsetWolf, thanks for your patient explanation and kind advice. I have pushed a new version, and it looks good to me.

SunsetWolf · 2025-12-25T09:46:43Z

Hi @lingbai-kong, I've updated and pushed the changes. This version extracts handler_kwargs and uses .get("end_time"), which improves readability and maintainability and avoids hard-coded deep paths, making it more robust to missing fields.
Thanks a lot for the contribution. Once the license/CLA checks pass, this should be good to go.

SunsetWolf · 2025-12-26T06:51:34Z

Hi @lingbai-kong, the changes are ready, and the PR can be merged once the CLA check passes.
Please sign the contributor license agreement when you have a moment. After that, we’ll proceed with the merge. Appreciate your work!

lingbai-kong · 2025-12-27T04:15:34Z

@microsoft-github-policy-service agree

lingbai-kong · 2025-12-27T04:23:02Z

Hi @SunsetWolf, I've completed the CLA signing just now. Thanks for your help!

SunsetWolf · 2025-12-28T04:41:22Z

It looks good now, it's been merged. Thank you for your contribution.

lingbai-kong closed this Dec 24, 2025

lingbai-kong force-pushed the klb/fix_handler_mod branch from 1826b79 to 2e9a00a Compare December 24, 2025 13:54

[fix] handler_mod func don't work when dealing None end date

2c1f546

lingbai-kong reopened this Dec 24, 2025

refactor: avoid deep access by extracting handler_kwargs and using ge…

d16bd6c

…t(end_time)

SunsetWolf merged commit 3472e82 into microsoft:main Dec 27, 2025
77 checks passed
