    hitlixuan
    @hitlixuan
    A call to arms: let's set up a WeChat group.
    wzy461143268
    @wzy461143268
    3a79e094dbc7d53e542fffe1466391d.jpg
    igor17400
    @igor17400

    I have created a folder br_index inside data_collector to implement code that tracks the historical composition of Brazil's largest index (Ibovespa), very similar to what has been done in us_index and cn_index.

    I have opened issue #956 and would like to upload the code I've written so that it can be reviewed and, if approved, merged into the main repository. However, when I tried to git push my branch, I found that I don't have permission.
    How should I proceed?

    8 replies
    hotwind2015
    @hotwind2015
    @zhupr
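    The calendar function raises an error. The code is:
    D.calendar(start_time='2021-12-30', end_time='2021-12-31', freq='week', future = False)
    The error message is:
    ValueError: calendar not exists: K:\stock\qlib-data\cn_data_dump\calendars\1week.txt
    When the program looks up the calendar file, the file name has an extra "1" in front of it. From the code it looks like the "1" is prepended on purpose, but the files in the calendars directory are generated automatically by dump_bin, which does not prepend a "1" to the file name. Is there still a bug in calendar handling in qlib 0.8.4? (Renaming the file by hand makes the error go away, but I can't rename the file after every dump.)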
    3 replies
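The "1" in the error above is consistent with the frequency string being normalized to a count plus a unit before the calendar file is looked up. The sketch below is not Qlib's code: `normalize_freq` and `find_calendar` are hypothetical helpers that only illustrate the idea, together with a tolerant lookup that accepts either file name.

```python
import re
from pathlib import Path


def normalize_freq(freq: str) -> str:
    """Hypothetical normalizer: 'week' -> '1week', '5min' -> '5min'."""
    match = re.fullmatch(r"(\d*)([A-Za-z]+)", freq.strip())
    if match is None:
        raise ValueError(f"unrecognized frequency: {freq!r}")
    count, unit = match.groups()
    return f"{int(count) if count else 1}{unit.lower()}"


def find_calendar(calendar_dir: Path, freq: str) -> Path:
    """Return the first existing calendar file among the plausible names."""
    normalized = normalize_freq(freq)
    unit_only = re.sub(r"^\d+", "", normalized)
    candidates = [calendar_dir / f"{normalized}.txt"]
    if normalized == f"1{unit_only}":
        # A count of 1 may have been written without the prefix, e.g. week.txt.
        candidates.append(calendar_dir / f"{unit_only}.txt")
    for path in candidates:
        if path.exists():
            return path
    raise FileNotFoundError(f"no calendar file for {freq!r} in {calendar_dir}")
```

With a lookup like this, a directory that contains only week.txt would still resolve for freq='week', so no manual rename would be needed after each dump.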
    realamd
    @realamd
    Could you please post the WeChat group QR code again?
    XianfengJiao
    @XianfengJiao
    Has anyone run into this problem when running the example?
    ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[ValueError: instrument: {'__DEFAULT_FREQ': 'C:\Users\Alphonse\.qlib\qlib_data\cn_data'} does not contain data for day]
    1 reply
    workflow_config_lightgbm_Alpha158.yaml
    XianfengJiao
    @XianfengJiao
    Could you post the WeChat group again?
    Also, for the risk evaluation metrics below, is smaller better?
                                                      risk
    excess_return_without_cost mean               0.000692
                               std                0.005374
                               annualized_return  0.174495
                               information_ratio  2.045576
                               max_drawdown      -0.079103
    excess_return_with_cost    mean               0.000499
                               std                0.005372
                               annualized_return  0.125625
                               information_ratio  1.473152
                               max_drawdown      -0.088263
    I mean these metrics.
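The "risk" column is just the name of the value column; each row is a separate metric with its own direction. A minimal sketch of how such metrics are commonly derived from a series of daily excess returns follows. It is an assumption about the definitions, not Qlib's implementation; `risk_summary` and the 252 periods per year are illustrative (the figures posted above are roughly consistent with 252, since 0.000692 × 252 ≈ 0.1744).

```python
import math
import statistics
from itertools import accumulate


def risk_summary(daily_returns, periods_per_year=252):
    """Summarize a list of per-period returns (assumed definitions)."""
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation
    cumulative = list(accumulate(daily_returns))  # cumulative sum of returns
    running_peak = list(accumulate(cumulative, max))  # best level seen so far
    max_drawdown = min(c - p for c, p in zip(cumulative, running_peak))
    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * periods_per_year,
        "information_ratio": mean / std * math.sqrt(periods_per_year),
        "max_drawdown": max_drawdown,
    }


# Made-up returns, only to show the call.
print(risk_summary([0.01, -0.02, 0.03, -0.01]))
```

Under these definitions, a higher annualized_return and information_ratio are better, a lower std means less volatility, and a max_drawdown closer to zero is better.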
    igor17400
    @igor17400

    When I submitted PR #990, one check failed: Test MacOS / build (macos-latest, 3.7) (pull_request).

    The error comes from running the command python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --interval 1d --region cn:

    zipfile.BadZipFile: File is not a zip file
    Error: Process completed with exit code 1.

    However, when I create a virtual environment with Python 3.7 on my own macOS machine, the same version used by the test, the script runs successfully.

    2022-03-21 16:18:00.530 | WARNING  | qlib.tests.data:_download_data:57 - The data for the example is collected from Yahoo Finance. Please be aware that the quality of the data might not be perfect. (You can refer to the original data source: https://finance.yahoo.com/lookup.)
    2022-03-21 16:18:00.531 | INFO     | qlib.tests.data:_download_data:59 - qlib_data_cn_1d_latest.zip downloading......
    216677376it [01:30, 2393653.69it/s]
    2022-03-21 16:19:31.063 | WARNING  | qlib.tests.data:_unzip:82 - will delete the old qlib data directory(features, instruments, calendars, features_cache, dataset_cache): /Users/igorlimarochaazevedo/.qlib/qlib_data/cn_data
    2022-03-21 16:19:31.064 | INFO     | qlib.tests.data:_unzip:85 - /Users/.qlib/qlib_data/cn_data/20220321161759_qlib_data_cn_1d_latest.zip unzipping......
    100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 43788/43788 [00:07<00:00, 5577.22it/s]

    How could I resolve this error?

    It is worth mentioning that, as far as I understand, the code changes submitted in the PR do not interfere with the execution of scripts/get_data.py.

    1 reply
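zipfile.BadZipFile is raised when the file being opened does not have a valid zip structure. One possible cause in CI, which this thread does not confirm, is a truncated download or an error page saved under a .zip name. The sketch below uses only the standard library to check a file before extracting it; `extract_if_valid` is a hypothetical helper, not part of scripts/get_data.py.

```python
import zipfile
from pathlib import Path


def extract_if_valid(archive_path, target_dir) -> bool:
    """Extract the archive only if it is a real zip file; return True on success."""
    archive_path = Path(archive_path)
    if not archive_path.is_file() or not zipfile.is_zipfile(archive_path):
        # Show the first bytes: an HTML error page or an empty file is easy to spot.
        head = b""
        if archive_path.is_file():
            with archive_path.open("rb") as handle:
                head = handle.read(64)
        print(f"{archive_path} is not a zip file; first bytes: {head!r}")
        return False
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    return True
```

Printing the first bytes of the rejected file in the CI log would show whether the runner received an incomplete archive or something else entirely.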
    igor17400
    @igor17400

    In the "Data Layer: Data Framework & Usage" documentation, the Multiple Stock Modes section says:

    The trade unit defines the unit number of stocks can be used in a trade, and the limit threshold defines the bound set to the percentage of ups and downs of a stock.

    Can anyone give me further explanation or references on what trade unit and limit threshold are?

    I don't understand exactly how they work in a real stock market and, as a consequence, how they affect Qlib's execution when it is initialized with one region or another, as in the given example qlib.init(provider_uri='~/.qlib/qlib_data/cn_data', region=REG_CN).

    1 reply
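Going only by the quoted sentence, the trade unit is the lot size an order amount must be a multiple of, and the limit threshold is the bound on a stock's daily percentage move. The sketch below shows how a simulator could apply both. The helper names, the 100-share lot, and the 9.5% threshold are assumptions for illustration, not values read from Qlib's region settings.

```python
def round_to_trade_unit(shares: int, trade_unit: int) -> int:
    """Round an order size down to a whole number of lots."""
    return (shares // trade_unit) * trade_unit


def hits_price_limit(prev_close: float, close: float, limit_threshold) -> bool:
    """True if the daily change reaches the limit; None means no limit applies."""
    if limit_threshold is None:
        return False
    change = close / prev_close - 1
    return abs(change) >= limit_threshold


# Assumed market with 100-share lots and a 9.5% threshold.
print(round_to_trade_unit(250, 100))
print(hits_price_limit(10.0, 11.0, 0.095))

# Assumed market with single-share lots and no price limit.
print(round_to_trade_unit(250, 1))
print(hits_price_limit(10.0, 12.0, None))
```

In a backtest, a stock that has hit its limit would typically be treated as untradable for that day, and order sizes would be rounded to whole lots before execution.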
    igor17400
    @igor17400

    I'm running XGBoost with my own data, downloaded from Brazil's stock exchange. To better understand the training data that the DataHandler module provides to XGBoost, I exported it to .csv and opened it with pandas.

    It has a column named label which, if I understood correctly, comes from the label expression the user defines in the workflow config. In other words, this label is label: ["Ref($close, -2) / Ref($close, -1) - 1"], as in the Qlib sample.

    However, when I use Qlib's data retrieval to calculate the same formula, "Ref($close, -2) / Ref($close, -1) - 1", it returns a different value.
    Why is that?

    More info:


    Dataframe used for training in XGBoost:

    Date: 2008-01-02
    Symbol: bbas3.sa
    Ref($close, -2) / Ref($close, -1) - 1: 1.08125


    Value returned by qlib’s data retrieval:

    Date: 2008-01-02
    Symbol: bbas3.sa
    Ref($close, -2) / Ref($close, -1) - 1: -0.011585


    Code used for qlib’s data retrieval:

    df_ = D.features(D.instruments('ibov'), ['Ref($close, -2)/Ref($close, -1)-1'], '2008-01-02', '2014-12-30')
    df_.loc[(['BBAS3.SA'], '2008-01-02'), :]
    1 reply
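One possible explanation, which this thread does not confirm, is that the training data has passed through preprocessing that normalizes the label, while D.features returns the raw expression value. The sketch below shows a cross-sectional z-score, a common normalization of this kind. The function name, symbols, and values are made up for illustration and are not taken from the dataset discussed above.

```python
import statistics


def cross_sectional_zscore(labels_by_symbol):
    """Normalize the labels of all symbols on one date to mean 0, std 1."""
    values = list(labels_by_symbol.values())
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return {symbol: (value - mean) / std for symbol, value in labels_by_symbol.items()}


# Hypothetical raw labels for three symbols on a single date.
raw_labels = {"AAA": -0.01, "BBB": 0.00, "CCC": 0.01}
print(cross_sectional_zscore(raw_labels))
```

After such a step, the stored label of a symbol depends on every other symbol traded that day, so it would no longer match the raw ratio returned by a direct query.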
    gawinghe
    @gawinghe
    Could you please post the WeChat group QR code again?
    igor17400
    @igor17400

    Qlib comes with benchmark example models for the Chinese stock market, such as XGBoost. Those models come with predefined parameters which, if I understood correctly, are the optimal parameters.

    However, if I modify the Chinese dataset or use a dataset from a different stock market, such as US or BR (Brazilian), how can I obtain optimal parameters like the ones shown below?

    model:
            class: XGBModel
            module_path: qlib.contrib.model.xgboost
            kwargs:
                eval_metric: rmse
                colsample_bytree: 0.8879
                eta: 0.0421
                max_depth: 8
                n_estimators: 647
                subsample: 0.8789
                nthread: 20
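The thread does not say how the values above were obtained. A common way to find comparable values for a new dataset is to search the parameter space and keep the setting with the lowest validation loss. The sketch below is a plain random search. The sampling ranges are arbitrary assumptions, and `evaluate` is a hypothetical callback that would train the model with the given parameters and return its validation loss.

```python
import random

# Assumed sampling ranges for a few of the parameters in the config above.
SEARCH_SPACE = {
    "colsample_bytree": lambda rng: rng.uniform(0.5, 1.0),
    "eta": lambda rng: 10 ** rng.uniform(-3, -1),
    "max_depth": lambda rng: rng.randint(3, 10),
    "subsample": lambda rng: rng.uniform(0.5, 1.0),
}


def random_search(evaluate, n_trials=20, seed=0):
    """Sample n_trials parameter sets and return the one with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in SEARCH_SPACE.items()}
        loss = evaluate(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Toy objective standing in for "train on train, score on valid".
best, loss = random_search(lambda p: abs(p["max_depth"] - 8) + abs(p["eta"] - 0.04))
print(best, loss)
```

The winning parameters can then be copied into the kwargs section of the workflow config. The test segment should stay out of the search so that the final evaluation remains unbiased.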
    LUS8806
    @LUS8806
    Hi there, how can I get the label data when the dataset is a TSDatasetH?
    I am using dataset.prepare('test', col_set='label'), but it returns a TSDataSampler and I need a DataFrame.
    igor17400
    @igor17400

    The documentation says the following:

    In the Alpha158, Qlib uses the label Ref($close, -2)/Ref($close, -1) - 1 that means the change from T+1 to T+2, rather than Ref($close, -1)/$close - 1, of which the reason is that when getting the T day close price of a china stock, the stock can be bought on T+1 day and sold on T+2 day.

    However, in Alpha360, should we use the same expression, Ref($close, -2)/Ref($close, -1) - 1, for the label? Or should we use Ref($close, -1)/$close - 1?

    Another page of the documentation says:

    • For ic

      The Pearson correlation coefficient series between label and prediction score. In the above example, the label is formulated as Ref($close, -1)/$close - 1. Please refer to Data Feature for more details.

    Are those two formulas related? Shouldn't they be the same?

    Issue(Label: Question) #1060
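The difference between the two quoted expressions can be checked with a small example. The sketch below mimics Ref with a plain list shift, assuming (as the first quoted passage implies) that a negative offset looks forward in time. `ref` and the prices are illustrative and are not Qlib's operator implementation.

```python
def ref(series, n):
    """Value at position t is series[t - n]; a negative n looks ahead."""
    size = len(series)
    return [series[t - n] if 0 <= t - n < size else None for t in range(size)]


def ratio_minus_one(numerator, denominator):
    """Element-wise numerator / denominator - 1, keeping None for missing values."""
    return [
        None if a is None or b is None else a / b - 1
        for a, b in zip(numerator, denominator)
    ]


def label_t1_to_t2(close):
    """Ref($close, -2) / Ref($close, -1) - 1: change from T+1 to T+2."""
    return ratio_minus_one(ref(close, -2), ref(close, -1))


def label_t_to_t1(close):
    """Ref($close, -1) / $close - 1: change from T to T+1."""
    return ratio_minus_one(ref(close, -1), close)


close = [10.0, 11.0, 9.9, 9.9]  # made-up closing prices for days T, T+1, ...
print(label_t1_to_t2(close))
print(label_t_to_t1(close))
```

On the first day the two labels have opposite signs (the price rises from T to T+1 and falls from T+1 to T+2), so the two expressions are related but not interchangeable. A correlation metric such as IC is only meaningful when it is computed against the same label the model was trained on.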

    gxxuej
    @gxxuej
    @LUS8806 Try this: (dataset.prepare('test', col_set='label')).data
    igor17400
    @igor17400

    Why do some models require infer_processors / learn_processors definitions while others don't?

    I've read the documentation available at link and the code, but I couldn't understand why some models require infer_processors and others don't. CatBoost doesn't define any infer_processors, while MLP does define some. Why is that?

    I've found this explanation of the difference between inference and learning; however, I still can't understand why some models need inference processors and others don't ;-;

    1 reply
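As a rough illustration of the distinction asked about above, processors can be viewed as preprocessing steps, with one list preparing the data used for prediction and the other adding steps that only make sense when labels are available. The sketch uses plain Python rather than Qlib's classes. `fillna_feature`, `dropna_label`, the rows, and the choice to build the learning data on top of the inference data are all assumptions made for the example.

```python
def fillna_feature(rows):
    """Replace a missing feature with 0.0 (usable at inference time)."""
    return [{**r, "feature": 0.0 if r["feature"] is None else r["feature"]} for r in rows]


def dropna_label(rows):
    """Drop rows without a label (only meaningful when learning)."""
    return [r for r in rows if r["label"] is not None]


def apply_processors(rows, processors):
    """Run the rows through each processor in order."""
    for processor in processors:
        rows = processor(rows)
    return rows


# Made-up rows; None marks a missing value.
raw = [
    {"feature": 1.0, "label": 0.02},
    {"feature": None, "label": -0.01},
    {"feature": 3.0, "label": None},
]

infer_processors = [fillna_feature]
learn_processors = [dropna_label]

infer_data = apply_processors(raw, infer_processors)
# Assumption: learning data is built on top of the inference data.
learn_data = apply_processors(infer_data, learn_processors)

print(len(infer_data), len(learn_data))
```

Seen this way, a config without infer_processors simply passes unprocessed features to the model at prediction time. Tree-based models such as CatBoost generally tolerate that because they do not depend on feature scaling, whereas a neural network such as an MLP usually needs filled and normalized inputs.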
    lauht
    @lauht
    Hi everyone, could you please tell me whether any US stock index data is available, for example the S&P 500?
    CyberPlayerOne
    @CyberPlayerOne
    @wzy461143268 Hello, could you post the WeChat group again? Thanks.
    qianyongjun895
    @qianyongjun895
    Please post the WeChat group.
    LUS8806
    @LUS8806
    Qlib's ALSTM implementation is different from the one in the original paper, isn't it? The differences look fairly large to me.
    1 reply
    I'd like to know the reason for these changes.
    qianyongjun895
    @qianyongjun895
    The dataset is different.
    路旁的叶修
    @ChengzhenDu
    Is there a WeChat group?
    lukekingca
    @lukekingca
    Could you post the WeChat group again? Thanks!
    lukekingca
    @lukekingca
    What was the training environment for the DDG-DA model in the paper? Was it trained on CPU?
    pizi kuan
    @pizikuan_gitlab
    Is there a WeChat group?
    SITONGRUC
    @SITONGRUC
    Shall we set up another WeChat group?
    Quentin168
    @Quentin168
    While trying detailed_workflow.ipynb with a crypto dataset, I can get Alpha158 features like the ones below:
    image.png
    from qlib.utils import init_instance_by_config
    from qlib.workflow import R
    from qlib.workflow.record_temp import SignalRecord

    # hd (the Alpha158 handler) and EXP_NAME are defined earlier in the notebook.
    dataset_conf = {
        "class": "DatasetH",
        "module_path": "qlib.data.dataset",
        "kwargs": {
            "handler": hd,
            "segments": {
                "train": ("2018-01-01", "2018-12-31"),
                "valid": ("2019-01-01", "2019-12-31"),
                "test": ("2020-01-01", "2020-12-31"),
            },
        },
    }
    # Not shown in the original paste; the code below needs a dataset object.
    dataset = init_instance_by_config(dataset_conf)

    model = init_instance_by_config({
        "class": "LGBModel",
        "module_path": "qlib.contrib.model.gbdt",
        "kwargs": {
            "loss": "mse",
            "colsample_bytree": 0.8879,
            "learning_rate": 0.0421,
            "subsample": 0.8789,
            "lambda_l1": 205.6999,
            "lambda_l2": 580.9768,
            "max_depth": 8,
            "num_leaves": 210,
            "num_threads": 20,
        },
    })

    # start exp to train model
    with R.start(experiment_name=EXP_NAME):
        model.fit(dataset)
        R.save_objects(trained_model=model)
        rec = R.get_recorder()
        rid = rec.id  # save the record id

        # Inference and saving signal
        sr = SignalRecord(model, dataset, rec)
        sr.generate()
    13678:MainThread INFO - qlib.workflow - [expm.py:315] - <mlflow.tracking.client.MlflowClient object at 0x7f83f32ff3a0>
    13678:MainThread INFO - qlib.workflow - [exp.py:257] - Experiment 1 starts running ...
    13678:MainThread INFO - qlib.workflow - [recorder.py:293] - Recorder 575dc51dbd674d35a88d21e2e2815093 starts running under Experiment 1 ...
    Training until validation scores don't improve for 50 rounds
    [20] train's l2: 0.763141 valid's l2: 0.824096
    [40] train's l2: 0.763141 valid's l2: 0.824096
    [60] train's l2: 0.763141 valid's l2: 0.824096
    Early stopping, best iteration is:
    [17] train's l2: 0.763141 valid's l2: 0.824096
    13678:MainThread INFO - qlib.workflow - [record_temp.py:194] - Signal record 'pred.pkl' has been saved as the artifact of the Experiment 1
    'The following are prediction results of the LGBModel model.'
                                            score
    datetime   instrument
    2020-01-01 BNBUSDT_BINANCE_1D    3.969982e-09
               BTCUSDT_BINANCE_1D    3.969982e-09
               ETHUSDT_BINANCE_1D    3.969982e-09
               MATICUSDT_BINANCE_1D  3.969982e-09
               TRXUSDT_BINANCE_1D    3.969982e-09
    13678:MainThread INFO - qlib.timer - [log.py:117] - Time cost: 5.316s | waiting async_log Done
    Quentin168
    @Quentin168
    It seems that training and validation are stuck with no improvement, e.g. [20] train's l2: 0.763141 valid's l2: 0.824096. Can any expert help me figure out where the problem is? Thanks.
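One possible explanation, which this thread does not confirm, is that lambda_l1 and lambda_l2 in the config above are very large for a dataset with only a handful of instruments. In gradient-boosted trees with L1/L2 regularization, a leaf value is commonly computed by soft-thresholding the gradient sum by the L1 term and dividing by the hessian sum plus the L2 term. The sketch below applies that textbook formula. It is an assumption about the mechanism, not LightGBM's source code, and the gradient and hessian sums are made up.

```python
import math


def leaf_output(sum_gradients, sum_hessians, lambda_l1, lambda_l2):
    """Regularized leaf value: L1 soft-thresholding, then L2 shrinkage."""
    shrunk = max(abs(sum_gradients) - lambda_l1, 0.0)
    return -math.copysign(shrunk, sum_gradients) / (sum_hessians + lambda_l2)


# Made-up gradient statistics for one leaf.
print(leaf_output(-50.0, 100.0, lambda_l1=205.6999, lambda_l2=580.9768))
print(leaf_output(-50.0, 100.0, lambda_l1=0.0, lambda_l2=0.0))
```

If every candidate leaf has a gradient sum smaller in magnitude than lambda_l1, every leaf value is zero, the loss never changes between rounds, and all predictions collapse to the same constant. Retrying with much smaller lambda_l1 and lambda_l2 would be a cheap way to test this hypothesis.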
    Wing Light
    @winglight
    Has DatasetH not been updated since 2020/9/26? I can fetch daily trading data with D without any problem, but no data is fetched when I qrun any model that loads data through DatasetH. How can I update the dataset, the way dump_bin does?
    wony
    @wony-zheng
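    The only_tradable check in TopkDropoutStrategy has a problem: the filter does not specify the trade direction. Strictly speaking, a stock being swapped in should be buyable and a stock being swapped out should be sellable. As it stands, if either condition applies, the stock can be neither swapped in nor swapped out.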
    Roi Mallo
    @rmallof_twitter
    Hi all. I just opened issue microsoft/qlib#1196. I was using collect update_data_to_bin to update previously downloaded prices, but 1) it more or less ignored the date range I provided and 2) it crashed at the end.
    Is there a reliable way to update the prices efficiently?
    Another question: I already have a universe of stocks, but I've spent days trying to figure out the easiest way to include it in the data acquisition pipeline. My current approach is to patch the code... Any suggestions?
    wangzhen
    @huasir
    Could someone post a WeChat group QR code?
    xphynance
    @xphynance
    @wzy461143268 Could you please post the WeChat group QR code again? Thanks a lot 😊
    pirsoz
    @pirsoz
    Hi, I am a newbie. Is there a step-by-step tutorial I can use to learn Qlib?
    Lishowie
    @howie1013
    Is there a WeChat group?
    Aben
    @IndiestudioAben_twitter
    Is anyone online?