Alexander Kuhnle
@AlexKuhnle
Right, I was wondering about this as well. It's better to change it, since the convolution is applied over the first dimension, and the feature size / second dimension should start off being 1.
(presumably that's where the transpose came from, but you can just swap it everywhere.)
Nicolas Neubauer
@nneubauer
Right, thanks. Another question: is there any documentation on how multiple states are fed to the neural net? Assume I have, e.g., an image which I want to feed to the convolutional net, and at the same time some other features (e.g. the size of the image in bytes or the time of the last edit) which I want to include but not run through the conv layers, instead feeding them, e.g., to the dense layer(s) at the end of the network.
Alexander Kuhnle
@AlexKuhnle
On the one hand, there is the 'auto' network, which provides a "default" way of combining multiple inputs. More generally, such networks can be specified via the Register and Retrieve layers here, by specifying the network not as a list of layers but as a list of lists, where each inner list generally starts with a retrieve layer and ends with a register layer, so that it constitutes a sequential "segment" of the overall network.
There used to be examples in the unittests, but good point, they are all gone by now due to the auto network. It would be worth adding this to the documentation; I will look into it.
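A rough sketch of what such a list-of-lists network spec could look like (the state names 'image' and 'metadata' and the layer sizes are made up for illustration, and the exact layer arguments may differ between Tensorforce versions):

    network = [
        [   # segment 1: run the image state through the conv layers
            dict(type='retrieve', tensors=['image']),
            dict(type='conv2d', size=32),
            dict(type='flatten'),
            dict(type='register', tensor='image-embedding')
        ],
        [   # segment 2: concatenate the embedding with the extra features
            dict(type='retrieve', tensors=['image-embedding', 'metadata'], aggregation='concat'),
            dict(type='dense', size=64)
        ]
    ]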
Alexander Kuhnle
@AlexKuhnle
(It may also help to have a look at the auto network class.)
Nicolas Neubauer
@nneubauer
Thanks, I will look into that.
IbraheemNofal
@IbraheemNofal
Hello there, I've recently been experimenting with Tensorforce and I've been pretty impressed with how easy and convenient it is to work with.
One thing I can't figure out is how to build the model using tensorflow or tensorflow-lite after training and saving it. Could anyone please help me understand whether that's possible?
Alexander Kuhnle
@AlexKuhnle
Hi. Thanks for the feedback. :-) Regarding your question: Saving and loading an agent can be done via agent.save(...) and Agent.load(...) (see docs). Is that what you're looking for, or what more specifically do you mean by "building the model using tf or tf-lite after training"?
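For reference, a minimal save/load sketch (the directory name is arbitrary; depending on the Tensorforce version, Agent.load may need additional arguments such as the environment):

    # after training
    agent.save(directory='saved-agent')

    # later, restore the agent from the same directory
    from tensorforce.agents import Agent
    agent = Agent.load(directory='saved-agent')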
IbraheemNofal
@IbraheemNofal
Thanks for the reply. Tensorforce is built on top of tensorflow, correct? In my case I'd like to deploy / test my model on a low-powered IoT device after training. I know that tf-lite would perform best in such a case, and that it's typically possible to convert tensorflow models to tf-lite, so I'm wondering whether the trained model would be tensorflow or tensorflow-lite compatible, and whether it'd be possible to translate it to tf-lite. I'm a beginner when it comes to machine learning, so I hope what I'm asking makes sense.
Alexander Kuhnle
@AlexKuhnle
I haven't worked with tf-lite or similar myself, but it should be possible and I would be interested in working through it. Do you know what format is typically used to load models in tf-lite / what loading in tf-lite generally looks like? Is it SavedModel and specialized loading functionality, or can you just create and load it regularly in Python code? Tensorforce is currently still using the older TensorFlow Saver feature, so it may require upgrading to the new SavedModel stuff (shouldn't be too difficult, can look into that).
IbraheemNofal
@IbraheemNofal
I haven't worked with TF-Lite yet as I'm still working on building and refining my model, so I'll hopefully get back to you with any issues that come up along the way so we can work through them. Reading the tensorflow documentation, it seems like TF-Lite uses a proprietary .tflite format, but there is a tool to convert tensorflow's SavedModel to TF-Lite's .tflite format. I guess I'll have to look into upgrading from the saver feature to SavedModel for it to work. Thank you for your help and for pointing me in the right direction.
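The conversion tool referred to here is part of TensorFlow itself. A rough sketch of that step, assuming the trained model has already been exported as a SavedModel (which is exactly the Saver-to-SavedModel upgrade discussed above):

    import tensorflow as tf

    # convert an exported SavedModel directory to TF-Lite's .tflite flatbuffer format
    converter = tf.lite.TFLiteConverter.from_saved_model('exported-saved-model')
    tflite_model = converter.convert()

    with open('model.tflite', 'wb') as f:
        f.write(tflite_model)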
Nicolas Neubauer
@nneubauer
Hi there, another question. I finally got tensorflow running with GPU support, and tensorforce as well. However, I am seeing zero benefit from using a GPU. One episode takes in my case 2 seconds, with 66% of the time spent in observe, no matter whether I run on GPU or CPU. I am positive it really runs on the GPU, as can be seen in nvidia-smi. The GPUs are 2x Tesla K80 on a cloud machine. Any hints what the problem might be?
(Btw, the more complex network setup worked very nicely.)
Only one GPU is utilized, at ~20%, while running. Can I tune some params to get more out of the GPUs?
Alexander Kuhnle
@AlexKuhnle
@IbraheemNofal I can certainly help with the SavedModel stuff from the Tensorforce side, as this is on the roadmap anyway. I'll try to look into that in the next few days.
@nneubauer 66% of the time spent in observe sounds like a lot, are you sure? What agent are you using, and could you post the spec? Usually, I would expect the GPU not to be used very much because the interactions are dominated by many batch-size-1 act calls, where a GPU is not particularly helpful (different from supervised learning, where this problem doesn't exist). But in your case this sounds different; depending on your agent spec, there might be a simple improvement.
phurichai
@phurichai
@AlexKuhnle Thanks a lot for your suggestion. I tried what you suggested: I wrote a python file named "RLenv1.py" and put an environment class called "Env1" there. Then, in another python file in the same folder, I call
environment = Environment.create(environment='RLenv1.Env1', max_episode_timesteps=10)
and I still get the error "ModuleNotFoundError: No module named 'RLenv1'". Do you have any idea where the problem might be? Is there some complete example of a custom environment file structure that I can just emulate and modify bit by bit?
@nneubauer Thanks for your suggestion --- I'm a complete newbie; how does one create and pass an instance in place of a custom environment file?
Alexander Kuhnle
@AlexKuhnle
@phurichai here is some explanation and here a corresponding unit test. Does this look similar / can you run the latter (python -m unittest test/test_documentation.py in the main Tensorforce directory)?
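Regarding the instance question: as far as I know, Environment.create also accepts an environment class or instance directly instead of the 'module.Class' string, which sidesteps the module-lookup problem entirely. A minimal sketch (the state/action specs and returned values are placeholders for illustration):

    from tensorforce.environments import Environment

    class Env1(Environment):

        def states(self):
            return dict(type='float', shape=(4,))

        def actions(self):
            return dict(type='int', num_values=2)

        def reset(self):
            return [0.0, 0.0, 0.0, 0.0]

        def execute(self, actions):
            next_state = [0.0, 0.0, 0.0, 0.0]
            terminal = False
            reward = 0.0
            return next_state, terminal, reward

    # pass the class (or an instance) instead of the 'RLenv1.Env1' string
    environment = Environment.create(environment=Env1, max_episode_timesteps=10)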
phurichai
@phurichai
Thanks @AlexKuhnle, I'm beginning to crack this!
Christian Hidber
@christianhidber
@AlexKuhnle I tried to figure out why model saving depends on the call to tf.compat.v1.disable_eager_execution() during initialisation, but failed. Do you think you might have a chance to take a look at it? Avoiding this call would be a big help for reintegrating tensorforce into our package. Thanks a lot in advance.
Alexander Kuhnle
@AlexKuhnle
@christianhidber I will try next week. There's something about initialization and TF2, I came across that before, but since it worked in the current setting, I didn't bother too much.
Niklas Nolte
@niklasnolte

Hello,
I wonder, is this also the right place to ask questions about tf-agents? If not, please ignore the following:

I am trying to run a simple dqn agent over an environment that has a collection of different specs as observation_spec. This means that my QNetwork wants a preprocessing_combiner, so I followed its documentation and gave it tf.keras.layers.Concatenate(). Concatenate does not like to concatenate tuples, so the observation_spec has to be a list. Next, I want to implement a replay_buffer, specifically a TFUniformReplayBuffer. Converting it to a dataset (as done in https://github.com/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb), I get an error and the explicit instruction to change my specs from lists to tuples. However, then I run into the problem that Concatenate doesn't like its input. Am I doing something conceptually wrong here?

Niklas Nolte
@niklasnolte
concat_layer = tf.keras.layers.Lambda(lambda x: tf.keras.layers.Concatenate()(list(x))) is my solution for a QNetwork.preprocessing_combiner, but I'm not sure if that's a good idea.
Alexander Kuhnle
@AlexKuhnle
hi Nicolas
^^ sorry, autocorrect, hi Niklas... this channel is for Tensorforce, which is an independent framework and unrelated to tf-agents.
Niklas Nolte
@niklasnolte
alright, thanks! sorry for the noise then
Alexander Kuhnle
@AlexKuhnle
no worries ;-)
Alexander Kuhnle
@AlexKuhnle
@STRATZ-Ken, if you're still interested in parallelized execution of environments, I've finished this feature now and am happy to talk through it.
Also @/all, FYI: parallelizing environments can now easily be done via the remote argument of Environment.create (either based on multiprocessing or socket) and the extended features of Runner / run.py, which has basically been merged with the functionality of ParallelRunner (which itself was removed).
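A rough sketch of what using the new parallelization could look like (the agent spec file and environment name are placeholders, and the exact Runner arguments may differ; see run.py and the Runner documentation):

    from tensorforce.execution import Runner

    # run 4 environment copies, each in its own process, via the 'multiprocessing' remote mode
    runner = Runner(
        agent='agent.json', environment='CartPole-v1',
        num_parallel=4, remote='multiprocessing'
    )
    runner.run(num_episodes=1000)
    runner.close()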
Christian Hidber
@christianhidber
@AlexKuhnle thanks!
Qiao.Zhang
@qZhang88
@AlexKuhnle Hi, I am reading the source code and have some questions. Does the PPO agent train on entire episodes, or on batch_size timesteps spread over one or more episodes? Also, there is a subsampling_fraction parameter; the code shows that this randomly samples a fraction from one batch? Is this because of the modular design and because PPO belongs to the policy gradient methods, which use all timesteps in an episode to train? Thanks in advance for answering my questions.
Alexander Kuhnle
@AlexKuhnle
Hi @qZhang88, hope the following explanation clarifies your question: PPO, like many other standard policy gradient algorithms, uses complete rollouts (episodes) for reward estimation. In Tensorforce this means that batch_size defines the number of episodes (each consisting of many timesteps) per update batch. Moreover, the way the PPO update works according to the paper is that it actually performs multiple updates based on randomly subsampled timestep-minibatches (the entire batch of n episodes is quite big). So subsampling_fraction specifies what fraction of the full batch is subsampled for each minibatch, and optimization_steps specifies how many of these mini-updates should happen.
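Put together as an agent spec, this could look roughly as follows (the numbers are made up for illustration, and parameter names may differ slightly between Tensorforce versions):

    from tensorforce.agents import Agent

    agent = Agent.create(
        agent='ppo', environment=environment,
        # update batch: 10 complete episodes
        batch_size=10,
        # each minibatch subsamples 20% of the timesteps in the batch ...
        subsampling_fraction=0.2,
        # ... and 10 such minibatch updates are performed per batch
        optimization_steps=10,
        learning_rate=1e-3
    )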
B Walker
@bob7827
I'm new to TensorForce. I have a simple question. I've created a PPO agent, but I can't save it (I get a JSON error regarding floating point). How can I access the underlying TensorFlow model so I can use HDF5 to save and load?
B Walker
@bob7827
I see AlexKuhnle's pull request to fix the JSON-encoding problem.
Alexander Kuhnle
@AlexKuhnle
Currently, the only way to save the agent is to use TensorFlow's saver format, unfortunately. However, it's planned to be updated and extended soon, including support for just extracting weight variables in numpy format or similar -- is this what you're looking for? (The recent JSON-encoding problem was about agent config saving.)
(Re TensorFlow's saver format: I don't know exactly what the format looks like, but it contains weights as well as meta information about the graph itself, so that the entire model can be reconstructed from the information.)
B Walker
@bob7827
Alex, thanks for the good information. This is what I need.
Qiao.Zhang
@qZhang88
I noticed the recent update about initial_internals(), what is that for? I haven't had time to read the code yet, hoping for a quick pointer, thanks.
Alexander Kuhnle
@AlexKuhnle
The new method is added to address a problem when evaluating agents with internal states (using an RNN).
When evaluating, one only calls agent.act with the independent/evaluation flag set. In this case, one is now additionally required to provide an internals argument (and in return gets an additional result, the next internal states). On the first call in an episode, this is supposed to be the result of initial_internals, afterwards just the return of the previous act call. See for instance the evaluation example in the docs under getting started. I will add a bit more documentation as well.
(this can be ignored if no internal states are used)
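A rough sketch of the resulting evaluation loop, along the lines of the getting-started example mentioned above (environment and agent are assumed to exist already):

    # one evaluation episode, threading the internal (RNN) states through the act calls
    states = environment.reset()
    internals = agent.initial_internals()
    terminal = False
    while not terminal:
        actions, internals = agent.act(
            states=states, internals=internals, independent=True
        )
        states, terminal, reward = environment.execute(actions=actions)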
Alexander Kuhnle
@AlexKuhnle
does this clarify the question?
Qiao.Zhang
@qZhang88
great! thanks a lot
IbraheemNofal
@IbraheemNofal

Hello,
I'm currently attempting to get a DQN agent to work for my current solution, and I'm finding a few things not entirely clear, so I have a couple of questions plus an error that I'm getting.

The questions:

1) Does the DQN agent automatically update the weights at the end of each episode or do I have to manually call the Update() method?

2) Does the agent automatically store the state, action, and reward it's given so it can use them to train afterwards, or do I have to do it manually by storing them in a memory module and then using that for training?

The error I'm getting:

It's the following:

InvalidArgumentError (see above for traceback): assertion failed: [] [Condition x == y did not hold element-wise:] [x (agent.observe/strided_slice:0) = ] [407] [y (agent.observe/strided_slice_1:0) = ] [0]
[[node agent.observe/assert_equal_1/Assert/AssertGuard/Assert (defined at F:\ProgramFiles\Anaconda3\envs\Tensorforce\lib\site-packages\tensorforce\core\models\model.py:1094) ]]
[[{{node GroupCrossDeviceControlEdges_0/agent.observe/agent.core_observe/agent.core_experience/estimator.enqueue/assert_equal/Assert/AssertGuard/Assert/data_4}}]]

Opening the model.py file, the error seems to occur at the following stage:

    # size of terminal equals buffer index
    tf.debugging.assert_equal(
        x=tf.shape(input=terminal, out_type=tf.int64)[0],
        y=tf.dtypes.cast(x=self.buffer_index[parallel], dtype=tf.int64)
    ),