import os
import random
from tensorflow.io.gfile import GFile

def my_inputs(n_devices):
    while True:
        # Pick a random text file from the corpus directory and tokenize it.
        file = random.choice(os.listdir('files'))
        with GFile(os.path.join('files', file)) as f:  # path made consistent with os.listdir('files')
            text = f.read()
        IDS = TOKENIZER.EncodeAsIds(text)  # TOKENIZER: a SentencePiece processor defined elsewhere
        # (batch construction and the yield of the batch tuple are omitted in the original snippet)
@nkitaev I'm using this to feed in multiple text files. Do you think I can tweak any of the hyperparameters in the parse_config to run the model for longer than half an hour without running into memory issues?
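For context, here is a minimal sketch of how such a generator is typically wired into training, assuming the Trainer API from the Reformer example colabs; the model choice, output path, and exact argument names are assumptions and may differ across Trax versions:

import trax

trainer = trax.supervised.Trainer(
    model=trax.models.ReformerLM,                      # placeholder model choice
    loss_fn=trax.layers.CrossEntropyLoss(),
    optimizer=trax.optimizers.Adam,
    lr_schedule=trax.lr.MultifactorSchedule,
    inputs=trax.supervised.inputs.Inputs(my_inputs),   # wraps the generator above
    output_dir='~/train_dir/')                         # placeholder output path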
MultifactorSchedule controls the learning rate schedule, which only affects how long training takes, not how much memory is used. You can try running with a few more warmup steps and more steps_per_cycle in the cyclic cosine schedule. my_inputs will let you feed in your own data, and you can tune the model hyperparameters as well.
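For reference, a minimal sketch of the relevant gin overrides, using the binding names from the public Reformer gin configs; the values are illustrative placeholders, not recommendations:

MultifactorSchedule.factors = 'constant * linear_warmup * cosine_decay'
MultifactorSchedule.constant = 0.01           # illustrative base learning rate
MultifactorSchedule.warmup_steps = 8000       # raise this for a longer warmup
MultifactorSchedule.steps_per_cycle = 100000  # longer cosine cycles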
Just wanted to chime in to say we are really excited to be making use of Trax in the Project Clarify (https://github.com/projectclarify/clarify) codebase, and we would like to invite anyone with experience with Trax or Jax and an interest in mentoring to come give a tutorial session at one of our upcoming hackathon/training days (Jan 25th and Feb 29th): https://forms.gle/oFWkN7UuAxS7NUGJ9 Especially you, @afrozenator; I didn't have your email to send you an invite. Looking forward to adding value to Trax as we get up to speed with using it.
Thanks @cwbeitel - that looks exciting, will follow up over email.
Step 1563: train accuracy | 0.21875000
Step 1563: train loss | 13.42460442
-
Step 3126: train accuracy | 0.15625000
Step 3126: train loss | 2.90936065
-
Step 4689: train accuracy | 0.28125000
Step 4689: train loss | 1.86861885
-
Step 6252: train accuracy | 0.09375000
Step 6252: train loss | 20935.30468750
-
Step 7815: train accuracy | 0.46875000
Step 7815: train loss | 1.39475393
I0123 00:01:49.315093 140688036902784 trainer_lib.py:752] Step 1: train accuracy | 0.07968750
I0123 00:01:49.315915 140688036902784 trainer_lib.py:752] Step 1: train loss | 4623.56787109
-
I0123 00:03:25.528132 140688036902784 trainer_lib.py:752] Step 2000: train accuracy | 0.16718750
I0123 00:03:25.528812 140688036902784 trainer_lib.py:752] Step 2000: train loss | 5.00325918
-
I0123 00:04:39.171037 140688036902784 trainer_lib.py:752] Step 4000: train accuracy | 0.19687501
I0123 00:04:39.171646 140688036902784 trainer_lib.py:752] Step 4000: train loss | 3.77166605
-
I0123 00:05:53.784074 140688036902784 trainer_lib.py:752] Step 6000: train accuracy | 0.13750000
I0123 00:05:53.784580 140688036902784 trainer_lib.py:752] Step 6000: train loss | 4.01593637
-
I0123 00:07:08.749417 140688036902784 trainer_lib.py:752] Step 8000: train accuracy | 0.33906251
I0123 00:07:08.749950 140688036902784 trainer_lib.py:752] Step 8000: train loss | 1.65593171
-
...
-
I0123 00:10:54.362399 140688036902784 trainer_lib.py:752] Step 14000: train accuracy | 0.13125001
I0123 00:10:54.362976 140688036902784 trainer_lib.py:752] Step 14000: train loss | 78.67562866