    SP Mohanty
    @spMohanty
    @newtiopon: we understand. We are considering alternative ways to deliver the data, and will hopefully send out official communication about the possible approaches.
    What's the bandwidth of your internet connection?
    newtiopon
    @newtiopon
    100M/s, fyi
    brianbrost
    @brianbrost
    @newtiopon in the meantime there are some suggestions on the discussion board (https://www.crowdai.org/topics/56g-is-the-training-set/discussion) which might help.
    newtiopon
    @newtiopon
    Thanks Brian. I have a question about the relation between the train_set filename and the "date" field. If I understand correctly, the date field refers to the date the session happened. What is confusing is that the file 'log_0_20180726_000000000000.csv' contains sessions whose date field varies from 2012 to 2018. So my questions are: 1) What does the datestamp in the file name refer to? 2) If the datestamp in the filename means something, how are sessions from different dates assigned to this file? Looking forward to further clarification.
    brianbrost
    @brianbrost
    @newtiopon, there are some occasional logging errors that can occur for a number of reasons, causing a mismatch between the date in the file name and the date when the session occurred according to the field. These mismatches should be relatively rare, and it's up to you exactly how you handle them.
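One way to see how rare the mismatches are in a given file is to compare each row's date field with the date embedded in the file name. The following is a minimal sketch, not an official tool: it assumes the file name pattern quoted above (`log_<n>_<YYYYMMDD>_<digits>.csv`) and assumes the `date` field is formatted as `YYYY-MM-DD`. The helper names `filename_date` and `count_date_mismatches` are hypothetical.

```python
import csv
import re
from collections import Counter
from datetime import datetime

# Pattern based on the file name quoted in the chat:
# log_<part>_<YYYYMMDD>_<digits>.csv
FILENAME_DATE = re.compile(r"log_\d+_(\d{8})_\d+\.csv$")


def filename_date(filename):
    """Return the date embedded in a training-set file name as YYYY-MM-DD."""
    match = FILENAME_DATE.search(filename.strip())
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d").date().isoformat()


def count_date_mismatches(filename, rows, date_column="date"):
    """Count rows whose date field matches or differs from the file-name date.

    `rows` is any iterable of dicts, for example a csv.DictReader.
    Assumption: the date field is a YYYY-MM-DD string.
    """
    expected = filename_date(filename)
    counts = Counter()
    for row in rows:
        key = "match" if row[date_column] == expected else "mismatch"
        counts[key] += 1
    return counts


def count_date_mismatches_in_file(path):
    """Convenience wrapper that streams one CSV file from disk."""
    with open(path, newline="") as handle:
        return count_date_mismatches(path, csv.DictReader(handle))
```

Calling `count_date_mismatches_in_file` on one extracted log file returns a `Counter` with `match` and `mismatch` totals, which is enough to decide whether to drop, keep, or re-date the mismatched sessions.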
    hjh1011
    @hjh1011
    Are we able to extract artists and album info through the web api for all 3M tracks?
    @brianbrost
    brianbrost
    @brianbrost
    Hi @hjh1011, this would have been very interesting. Unfortunately, it is no longer allowed, since we were required to remove the link between our internal track IDs for the dataset and the actual track IDs.
    hjh1011
    @hjh1011
    @brianbrost I just realized that the data was updated on 20 Nov, and that artist, album, and some other information was removed from the track features. I wonder about people who downloaded the data before that: they would have those columns, right? If that is the case, how is this a fair contest?
    brianbrost
    @brianbrost
    @hjh1011 As noted on the discussion board, participants will be required to open source their code, so we will be able to see if participants have used those features.
    RStudent
    @rstudent_gitlab
    @brianbrost, @spMohanty, or anyone else: please, I could use some help here. I'm a very late entrant to the contest but still want to give it a sincere try. I'm having trouble downloading the train set file, even with the splits. I keep getting the "gzip: stdin: invalid compressed data--format violated" error when running tar -xzf. If anyone else had similar issues and resolved them, I could use some help. Please note that I am able to successfully download and extract the test set. I am on Ubuntu 18.04. Thanks a lot in anticipation.
    SP Mohanty
    @spMohanty
    @brianbrost: Can you upload md5sums of all the files, so that participants can at least verify that the file they downloaded was not corrupted during transmission?
    RStudent
    @rstudent_gitlab
    @spMohanty: Thanks a lot for the response. For reference, this is what I get for the file throwing an error:
    ~/workspace/music/train$ md5sum 20181113_training_set.tar.gz
    1dd55ea738937c5a3c23f6d18eb9804c 20181113_training_set.tar.gz
    and the size, as expected, is 56G.
    SP Mohanty
    @spMohanty
    Well, for me the md5sum is this:
    f0b818a7cffd355d6ddeb368d2b244c0  20181113_training_set.tar.gz
    so something definitely seems off.
    Are you sure the download completed cleanly?
    brianbrost
    @brianbrost
    hi @rstudent_gitlab, I'll be home in about 10 minutes and I'll verify the md5sum, but have you tried downloading the split version of the training set? See the Training_Set_Split_Download.txt file in the Dataset tab. It contains links to download the training set split into 10 files, making it easier to download
    brianbrost
    @brianbrost
    ah sorry just saw that you tried the split version as well
    @rstudent_gitlab, do you get the exact same error message for the split version of the training set?
    brianbrost
    @brianbrost
    For what it's worth, here are the md5sums I get:
    f0b818a7cffd355d6ddeb368d2b244c0 20181113_training_set.tar.gz
    will list the ones for the split version of the training set when I finish re-downloading it
    brianbrost
    @brianbrost
    9bef4b0ed6ec4754c91d43fa0058213c training_set_0.tar.gz
    cb1a443f9613f11388c1c1aac703f7f6 training_set_1.tar.gz
    c0508e75ea300fd0e04b385d83a4ff04 training_set_2.tar.gz
    brianbrost
    @brianbrost
    66773b8a1f6d7a3034414afa223fe617 training_set_3.tar.gz
    99a88fa87ffadc40d1777d002e830805 training_set_4.tar.gz
    brianbrost
    @brianbrost
    a7193e27165ab849fb8e70156d9aa265 training_set_5.tar.gz
    brianbrost
    @brianbrost
    65d3b5731f1f735ccb8f7de1128c3354 training_set_6.tar.gz
    b2e6e6c0989b9995672cc92219ac4bd8 training_set_7.tar.gz
    1d716c77bcc64ca197372a89c7963d3d training_set_8.tar.gz
    58f1e0b1e3d2c91edef3903199f15e9a training_set_9.tar.gz
    @rstudent_gitlab please let me know if your checksums are different, and if you can extract any of the split files?
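For anyone who wants to compare all of the checksums in one pass, here is a minimal sketch using Python's standard `hashlib`. The checksum values are copied from the messages above. The names `EXPECTED_MD5`, `md5_of_file`, and `verify_downloads` are hypothetical, and the sketch assumes all archives sit in one local directory.

```python
import hashlib
from pathlib import Path

# Checksums as posted in the chat above.
EXPECTED_MD5 = {
    "20181113_training_set.tar.gz": "f0b818a7cffd355d6ddeb368d2b244c0",
    "training_set_0.tar.gz": "9bef4b0ed6ec4754c91d43fa0058213c",
    "training_set_1.tar.gz": "cb1a443f9613f11388c1c1aac703f7f6",
    "training_set_2.tar.gz": "c0508e75ea300fd0e04b385d83a4ff04",
    "training_set_3.tar.gz": "66773b8a1f6d7a3034414afa223fe617",
    "training_set_4.tar.gz": "99a88fa87ffadc40d1777d002e830805",
    "training_set_5.tar.gz": "a7193e27165ab849fb8e70156d9aa265",
    "training_set_6.tar.gz": "65d3b5731f1f735ccb8f7de1128c3354",
    "training_set_7.tar.gz": "b2e6e6c0989b9995672cc92219ac4bd8",
    "training_set_8.tar.gz": "1d716c77bcc64ca197372a89c7963d3d",
    "training_set_9.tar.gz": "58f1e0b1e3d2c91edef3903199f15e9a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-gigabyte archives fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Return {filename: status}, where status is 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == checksum:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Running `verify_downloads("path/to/train")` (a placeholder path) reports which archives are missing or corrupted, so only those need to be downloaded again.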
    RStudent
    @rstudent_gitlab
    I know I am being a bother. Could someone else who has downloaded the file successfully please confirm the md5sum? At least then I will know the corruption error is valid.
    brianbrost
    @brianbrost
    I just downloaded those from the competition website, so the md5sums are the ones I would expect anyone else to get too. Are some of your checksums different, or are all of them different?
    @rstudent_gitlab
    RStudent
    @rstudent_gitlab
    Thank you so much for helping me out on this, @brianbrost, @spMohanty. My md5sums are different for both the main file and the splits. For example:
    ~/workspace/music/train$ md5sum training_set_1.tar.gz
    9bdea6a8c4a9e47b47bace227bad252f training_set_1.tar.gz
    md5sum training_set_9.tar.gz
    b1b4393806a9d90711e619bfd08dd2f7 training_set_9.tar.gz
    So far I have tried clicking the download link, wget, and curl, with no luck. Could you please give me the actual sizes of these files? I will try clean downloads again and monitor them. Maybe this time I will get them to work :) Thanks a lot for looking into it.
    brianbrost
    @brianbrost

    @rstudent_gitlab

    with ls -l, I get the following sizes:

    6044161773 Dec 12 15:09 training_set_0.tar.gz
    6044989394 Dec 12 15:15 training_set_1.tar.gz
    6042349689 Dec 12 15:21 training_set_2.tar.gz
    6043674073 Dec 12 15:27 training_set_3.tar.gz
    6042105510 Dec 12 15:34 training_set_4.tar.gz
    6043173901 Dec 12 15:40 training_set_5.tar.gz
    6042906018 Dec 12 15:46 training_set_6.tar.gz
    6046086003 Dec 12 15:52 training_set_7.tar.gz
    6043512819 Dec 12 15:58 training_set_8.tar.gz
    6045573656 Dec 12 16:04 training_set_9.tar.gz
    60438525791 Dec 23 00:14 20181113_training_set.tar.gz

    Hope you manage to figure out the problem, and let me know if there's any other way I can help. Unfortunately I don't have any useful suggestions, except to try to re-download, which I guess is what you're trying right now. Please let us know if you sort out the problem!
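Comparing file sizes is a much quicker first check than hashing 6 GB archives, and it catches truncated downloads. The following is a minimal sketch using the byte counts from the `ls -l` listing above. The names `EXPECTED_SIZES` and `check_sizes` are hypothetical. A matching size does not prove the content is intact, so a checksum comparison is still needed afterwards.

```python
import os

# Sizes in bytes, copied from the ls -l listing above.
EXPECTED_SIZES = {
    "training_set_0.tar.gz": 6044161773,
    "training_set_1.tar.gz": 6044989394,
    "training_set_2.tar.gz": 6042349689,
    "training_set_3.tar.gz": 6043674073,
    "training_set_4.tar.gz": 6042105510,
    "training_set_5.tar.gz": 6043173901,
    "training_set_6.tar.gz": 6042906018,
    "training_set_7.tar.gz": 6046086003,
    "training_set_8.tar.gz": 6043512819,
    "training_set_9.tar.gz": 6045573656,
    "20181113_training_set.tar.gz": 60438525791,
}


def check_sizes(directory, expected=EXPECTED_SIZES):
    """Return {filename: (actual_size_or_None, expected_size)} for every
    file that is missing (actual is None) or has the wrong size."""
    problems = {}
    for name, size in expected.items():
        path = os.path.join(directory, name)
        actual = os.path.getsize(path) if os.path.isfile(path) else None
        if actual != size:
            problems[name] = (actual, size)
    return problems
```

An empty result from `check_sizes("path/to/train")` (a placeholder path) means every listed file is present with the expected size.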

    RStudent
    @rstudent_gitlab
    @brianbrost @spMohanty I cannot thank you enough. Back in business using the wonder of modern technology: Aria2 https://aria2.github.io/
    Now time to use the other wonder of modern technology: deep nets. Thanks again.
    Joey
    @joychengzhaoyue_twitter
    Hi
    I’m wondering which time zone the Jan 4th deadline is in. Thanks!
    brianbrost
    @brianbrost
    @spMohanty What timezone is the competition deadline currently set for?
    SP Mohanty
    @spMohanty
    @brianbrost: It's in UTC. Looking at the deadline, it's on Jan 4th, 12:00 UTC.
    Sainath Adapa
    @sainathadapa
    Just want to get a confirmation: The objective is to predict if a track was played briefly (skip_2 being true). It is not to predict if the track was skipped (that is not_skipped being false). Am I correct?
    brianbrost
    @brianbrost
    @sainathadapa that's correct. Apologies for the delay in replying. I don't get a notification if you don't use @ my username.
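The distinction confirmed above can be made concrete with a small sketch that builds the prediction target from `skip_2` alone and ignores `not_skipped`. The column names come from the question above. The assumption that values are stored as the strings "true"/"false" is not confirmed in this conversation, and the helper name `skip_2_label` and the sample rows are hypothetical.

```python
def skip_2_label(row):
    """Return 1 if the row's skip_2 field is true, else 0.

    Assumption: values are booleans or the strings 'true'/'false'
    (case-insensitive). Adjust the parsing if the files differ.
    """
    return 1 if str(row["skip_2"]).strip().lower() == "true" else 0


# Hypothetical rows, for illustration only.
rows = [
    {"skip_2": "true", "not_skipped": "false"},   # played briefly: target 1
    {"skip_2": "false", "not_skipped": "false"},  # skipped, but not briefly: target 0
    {"skip_2": "false", "not_skipped": "true"},   # played through: target 0
]
labels = [skip_2_label(row) for row in rows]
```

The second row is the case the question is about: `not_skipped` is false, yet the target is 0 because `skip_2` is not true.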