    We considered the requirements of live data when designing components like datahandler
    The models are saved on disk (if you didn't change the configuration, the folder name will be mlflow).
    You can load a model via https://qlib.readthedocs.io/en/latest/component/recorder.html or read the files directly.
    I want to update the data with data_collector, but I don't know how to import it. What impact does it have on the original data?
    Pengrong Zhu
    @monkeycc Hi,
    If you need to use the latest data of yahoo finance, you can download and dump:
    download data: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo
    dump data: https://qlib.readthedocs.io/en/latest/component/data.html#converting-csv-format-into-qlib-format
    When you download and dump the data, use a target directory different from the "origin data" directory. As long as the target does not overwrite it, the "origin data" will not be affected.
    How can I get real-time stock data instead of downloading from 'http://fintech.msra.cn/stock_data/downloads'?
    @lucasli0121 We have released a related API for updating data.
    You can use it for your real-time data.
    Is it version 0.0.2.dev20 with pip install? Something seems wrong.
    Could you give more details about the error?
    @you-n-g That error is fixed, but now there is another problem: https://pastebin.com/zbrgwWh0
    Dingsu Wang
    @yaduns Hi, could you please try installing pyqlib without the Tsinghua pip mirror, or download the pyqlib package from the PyPI website and install it? Changing the mirror sometimes causes a "distribution not found" error because the host is not trusted on your machine.
    As for installing from source, could you provide the versions of cython and gcc on your machine? Thanks 😊
    Hello, I am having trouble with the data. I cannot open the bin file to see its structure.
    I want to modify my data to fit into Qlib, but I cannot see what is inside.
    Here are the docs which could help you to convert your data into Qlib Format
    Zhichong Fang
    Hi, I am a little confused about the label formula. The label formula in the example at https://qlib.readthedocs.io/en/latest/component/report.html#id2 is Ref($close, -2)/Ref($close, -1) - 1 (also used in Alpha158 and Alpha360), which is inconsistent with "the label is formulated as Ref($close, -1)/$close - 1" in the note. Thanks.
    The formula Ref($close, -1)/$close - 1 is used a lot in https://qlib.readthedocs.io/en/latest/reference/api.html . May I know the purpose of the difference?
    @fzc621 Hi, Ref($close, -1)/$close - 1 means the change from T to T+1, which can be used in backtest.
    In model training, we use the label Ref($close, -2)/Ref($close, -1) - 1, which means the change from T+1 to T+2, rather than Ref($close, -1)/$close - 1. The reason is that once you have the close price of a Chinese stock on day T, you can buy it on day T+1 and sell it on day T+2.
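To make the difference between the two labels concrete, here is a small pandas sketch. It assumes that, for a single instrument, `Ref($close, -n)` corresponds to `close.shift(-n)` (the value n trading days ahead); the prices are made up for illustration and this is not how Qlib evaluates expressions internally.

```python
import pandas as pd

# Hypothetical daily close prices for one instrument (illustrative values only).
close = pd.Series(
    [100.0, 110.0, 121.0, 108.9],
    index=pd.date_range("2021-01-04", periods=4, freq="B"),
    name="close",
)

# Ref($close, -1) / $close - 1: return from T to T+1.
ret_t_to_t1 = close.shift(-1) / close - 1

# Ref($close, -2) / Ref($close, -1) - 1: return from T+1 to T+2.
ret_t1_to_t2 = close.shift(-2) / close.shift(-1) - 1

labels = pd.DataFrame({"T->T+1": ret_t_to_t1, "T+1->T+2": ret_t1_to_t2})
print(labels)
```

In this sketch the training label on day T is simply the T to T+1 return of the following day, so the last two rows have no label.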
    Zhichong Fang
    @bxdd Thanks!
    Hi - how can I control the prediction horizon in qlib, for example how can I evaluate the LightGBM model to classify and predict returns over 3 or 5 days in the future on a rolling basis? Is there a way to control that in the config or .yaml file?
    Hi - this may be more of a future feature request but has anyone integrated this TA library to fast track technical features/indicators? https://twopirllc.github.io/pandas-ta/#indicators-by-category
    @BigW You can control the prediction horizon in Qlib by changing the label.
    For example, you can change the label in the config.
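As a rough sketch of what changing the label could look like, the helper below builds a label expression for an N-day horizon, following the T+1 entry convention of the formula Ref($close, -2)/Ref($close, -1) - 1 discussed earlier in this room. The helper name `forward_return_label` is hypothetical, and passing the expression under a `"label"` key in the handler kwargs is an assumption that should be checked against the handler version you use.

```python
def forward_return_label(horizon: int) -> str:
    """Build a Qlib-style label expression for an N-day forward return.

    The return runs from the T+1 close to the T+1+horizon close, so
    horizon=1 gives Ref($close, -2) / Ref($close, -1) - 1.
    """
    if horizon < 1:
        raise ValueError("horizon must be a positive number of trading days")
    return f"Ref($close, -{horizon + 1}) / Ref($close, -1) - 1"


# Hypothetical handler kwargs: the "label" key is an assumption, not a
# confirmed part of every handler's interface.
handler_kwargs = {
    "start_time": "2018-01-01",
    "end_time": "2020-08-01",
    "instruments": "all",
    "label": [forward_return_label(5)],
}
print(handler_kwargs["label"])
```

A 3-day horizon would use `forward_return_label(3)` in the same place.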

    The expression engine of Qlib provides functionality similar to https://twopirllc.github.io/pandas-ta/#indicators-by-category
    Users are encouraged to implement technical indicators with it.

    But Qlib doesn't force users to use it. If you prefer pandas-ta, you can implement your technical indicators in the datahandler after retrieving the raw data.
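A minimal sketch of computing indicators on raw data with plain pandas is shown below. It does not use pandas-ta or any Qlib API; the frame layout (an index of instrument and datetime with a `close` column), the function name, and the values are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw data indexed by (instrument, datetime). Values are made up.
index = pd.MultiIndex.from_product(
    [["AAA", "BBB"], pd.date_range("2021-01-04", periods=5, freq="B")],
    names=["instrument", "datetime"],
)
raw = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0, 14.0, 20.0, 19.0, 18.0, 17.0, 16.0]},
    index=index,
)


def add_sma_and_momentum(df: pd.DataFrame, window: int = 3) -> pd.DataFrame:
    """Add a simple moving average and an N-day momentum, per instrument."""
    out = df.copy()
    grouped = out.groupby(level="instrument")["close"]
    # Grouping by instrument keeps one stock's window from leaking into the next.
    out[f"sma_{window}"] = grouped.transform(lambda s: s.rolling(window).mean())
    out[f"mom_{window}"] = grouped.transform(lambda s: s / s.shift(window) - 1)
    return out


features = add_sma_and_momentum(raw, window=3)
print(features)
```

The same pattern applies if the per-instrument computation is delegated to another indicator library.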

    I get "TypeError: Can only swap levels on a hierarchical axis." when I run dataset = init_instance_by_config(task["dataset"]) with the new fund data.
    Thank you for your help @you-n-g
    Anaconda3,Spyder(Python 3.7.7),plotly 4.12.0,matplotlib 3.1.3,pyqlib 0.6.1.dev0.
    'XAxis' object has no attribute '_gridOnMajor'
    @yyll008 How can I reproduce this error ?
    @gxxuej Could you give more details about how to reproduce this bug?
    My backtests keep dying midway, with both qrun and the Python run_all_model approach: the process stops with a single line saying "killed". Has anyone seen this type of error before?

    Code sample below:
    Running the model: LightGBM for iteration 1...
    40048:MainThread INFO - qlib.Initialization - [config.py:276] - default_conf: client.
    40048:MainThread WARNING - qlib.Initialization - [config.py:292] - redis connection failed(host= port=6379), cache will not be used!
    40048:MainThread INFO - qlib.Initialization - [init.py:46] - qlib successfully initialized based on client settings.
    40048:MainThread INFO - qlib.Initialization - [init.py:47] - data_path=/home/warren/.qlib/qlib_data/cn_data

    Deleting the environment: /tmp/tmp01wsrv_h...
    Retrieving results...
    39831:MainThread ERROR - qlib.workflow - [utils.py:35] - An exception has been raised[ValueError: No valid experiment has been found, please make sure the input experiment name is correct.].
    File "run_all_model.py", line 305, in <module>
    fire.Fire(run) # run all the model
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
    File "run_all_model.py", line 62, in _return_wrapped
    return function_to_decorate(*args, **kwargs)
    File "run_all_model.py", line 287, in run
    results = get_all_results(folders)
    File "run_all_model.py", line 148, in get_all_results
    exp = R.get_exp(experiment_name=fn, create=False)
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/qlib/workflow/init.py", line 239, in get_exp
    return self.exp_manager.get_exp(experiment_id, experiment_name, create)
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/qlib/workflow/expm.py", line 149, in get_exp
    exp, is_new = self._get_exp(experiment_id=experiment_id, experiment_name=experiment_name), False
    File "/home/warren/anaconda3/envs/QLIB3TEST/lib/python3.8/site-packages/qlib/workflow/expm.py", line 298, in _get_exp
    raise ValueError(
    ValueError: No valid experiment has been found, please make sure the input experiment name is correct.


    @you-n-g Here is the code:
    import qlib
    import pandas as pd

    from qlib.config import REG_CN
    from qlib.workflow import R
    from qlib.contrib.report import analysis_model

    provider_uri = "~/.qlib/qlib_data/cn_data" # target_dir
    qlib.init(provider_uri=provider_uri, region=REG_CN)
    my_rid = "3f999767c71b499dbe0dc3c3480067d3"
    recorder = R.get_recorder(my_rid, experiment_name="backtest_analysis")
    pred_df = recorder.load_object("pred.pkl")
    label_df = recorder.load_object("label.pkl")
    label_df.columns = ['label']

    graph_names: list = ["group_return", "pred_ic", "pred_autocorr","pred_turnover"],

    Error : 'XAxis' object has no attribute '_gridOnMajor'

    f1 = analysis_model.model_performance_graph(pred_label,show_notebook=False)

    graph_names=["group_return"]: OK!

    f2 = analysis_model.model_performance_graph(pred_label,graph_names=["group_return"],\

    graph_names=["pred_ic"]: Error: 'XAxis' object has no attribute '_gridOnMajor'

    f3 = analysis_model.model_performance_graph(pred_label,graph_names=["pred_ic"],\

    May I ask if Qlib supports futures?

    @you-n-g
    from qlib.data import D
    from qlib.utils import init_instance_by_config

    df = D.features(D.instruments(market="all"), ["$DWJZ", "$LJJZ"], start_time="2018-01-01", end_time="2021-03-22", freq="day")
    market = "all"
    benchmark = "000001"
    data_handler_config = {
        "start_time": "2018-01-01",
        "end_time": "2020-08-01",
        "fit_start_time": "2018-01-01",
        "fit_end_time": "2018-12-31",
        "instruments": market,
    }

    task = {
        "model": {
            "class": "LGBModel",
            "module_path": "qlib.contrib.model.gbdt",
            "kwargs": {
                "loss": "mse",
                "colsample_bytree": 0.8879,
                "learning_rate": 0.0421,
                "subsample": 0.8789,
                "lambda_l1": 205.6999,
                "lambda_l2": 580.9768,
                "max_depth": 8,
                "num_leaves": 210,
                "num_threads": 20,
            },
        },
        "dataset": {
            "class": "DatasetH",
            "module_path": "qlib.data.dataset",
            "kwargs": {
                "handler": {
                    "class": "Alpha158",
                    "module_path": "qlib.contrib.data.handler",
                    "kwargs": data_handler_config,
                },
                "segments": {
                    "train": ("2018-01-01", "2018-12-31"),
                    "valid": ("2019-01-01", "2019-12-31"),
                    "test": ("2020-01-01", "2020-08-01"),
                },
            },
        },
    }

    # model initialization
    model = init_instance_by_config(task["model"])
    dataset = init_instance_by_config(task["dataset"])

    @you-n-g Thanks!
    How to join the Qlib team and what are the requirements for interns?
    Pengrong Zhu

    @gxxuej Hi,
    I can’t reproduce this bug in the same environment.

    For more information, you can use the latest qlib (https://github.com/microsoft/qlib) to collect details about your environment: cd scripts && python collect_info.py all

    @onepointfuck You can send your resume to our email qlib@microsoft.com
    @zhupr Thanks!
    Jagadish Kumar E 🇮🇳
    Hi team, currently Qlib supports datasets for only two regions. Is there a tentative date for supporting the India region? If I have to define the India region on my own, what needs to be done? Any pointers would be helpful.
    @javajk_twitter Qlib doesn't limit the region of the datasets.
    You can send a PR that adds a data collector to crawl data from the Indian market.
    Jagadish Kumar E 🇮🇳
    @you-n-g Thanks for your response. How do I raise a PR?
    @javajk_twitter You can raise a PR to Qlib as you would for any open-source project.
    Hi guys, I notice that pip install always fails. Could you please tell me how to fix it?

    @mozi7 Can you give us more details about your error?

    We have included a pip install test in our CI.

    It succeeded on all kinds of platforms. You can check the commands it uses.

    Hello, where can I set the number of trading stocks as a multiple of 100?
    Jagadish Kumar E 🇮🇳

    Hello, I have two datasets in CSV format. One is the historical dataset and the other is the delta, i.e. the current day's OHLC values for each share. Converting the historical CSV dataset to bin format works fine. But when I try to update the bin data with the current day's CSV, I get the following error: "NaT is not in list". For the first conversion from CSV to bin format I use the following command:

    dump_bin.py dump_all --csv_path=nf.csv --qlib_dir=C:/data/.qlib/qlib_data/in_data --include_fields open,close,high,low,factor,tottrdqty,tottrdval

    For the second conversion from CSV to bin format I use the following command:

    dump_bin.py dump_update --csv_path=nf.csv --qlib_dir=C:/data/.qlib/qlib_data/in_data --include_fields open,close,high,low,factor,tottrdqty,tottrdval

    Are the command and its arguments correct? Please let me know.

    This is the error I get when I run the above command to convert the current day's data to bin format:

    dump bin errors: {'ABB': 'concurrent.futures.process._RemoteTraceback: \n"""\nTraceback (most recent call last):\n File "D:\Anaconda\lib\concurrent\futures\process.py", line 239, in _process_worker\n r = call_item.fn(call_item.args, *call_item.kwargs)\n File "E:\qlib-jk\qlib\scripts\dump_bin.py", line 250, in _dump_bin\n self._data_to_bin(df, calendar_list, features_dir)\n File "E:\qlib-jk\qlib\scripts\dump_bin.py", line 220, in _data_to_bin\n date_index = self.get_datetime_index(_df, calendar_list)\n File "E:\qlib-jk\qlib\scripts\dump_bin.py", line 211, in get_datetime_index\n return calendar_list.index(df.index.min())\nValueError: NaT is not in list\n"""\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File "dump_bin.py", line 467, in _dump_features\n _future.result()\n File "D:\Anaconda\lib\concurrent\futures\_base.py", line 432, in result\n return self.get_result()\n File "D:\Anaconda\lib\concurrent\futures\_base.py", line 388, in get_result\n raise self._exception\nValueError: NaT is not in list\n'}
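The traceback above fails at `calendar_list.index(df.index.min())`, which suggests that the minimum of the date index is NaT for at least one symbol. One possible cause (an assumption, not a confirmed diagnosis) is a row with a missing or unparseable date in the update CSV. The sketch below checks a CSV for such rows before dumping; the function name `find_bad_dates`, the column name `date`, and the sample content are hypothetical.

```python
import io

import pandas as pd


def find_bad_dates(csv_source, date_column: str = "date") -> pd.DataFrame:
    """Return the rows whose date column cannot be parsed into a timestamp."""
    df = pd.read_csv(csv_source, dtype={date_column: str})
    # errors="coerce" turns missing or malformed dates into NaT.
    parsed = pd.to_datetime(df[date_column], errors="coerce")
    return df[parsed.isna()]


# Hypothetical CSV content: the second row has an empty date and the
# third row has a malformed one.
sample = io.StringIO(
    "symbol,date,open,close\n"
    "ABB,2021-03-22,100.0,101.0\n"
    "ABB,,101.0,102.0\n"
    "ABB,not-a-date,102.0,103.0\n"
)
bad_rows = find_bad_dates(sample)
print(bad_rows)
```

If this returns no rows for the real file, the cause is likely elsewhere, for example a symbol whose update file is empty.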


    Hello, where can I set the number of trading stocks as a multiple of 100?

    Marking this question as solved in microsoft/qlib#395.