Alexander Kuhnle
@AlexKuhnle
The error message you posted above seems to suggest that the value you pass in has shape (1440,) instead of (1, 1440); is that possible?
Tensorforce is not very forgiving with shapes, unlike what you may be used to from NumPy, since this sometimes leads to subtle errors.
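(Illustratively, the shape mismatch and one possible fix via reshape; this is a plain NumPy sketch, not specific to the Tensorforce API:)
import numpy as np

state = np.random.uniform(size=(1440,))  # shape (1440,): what the error suggests is being passed
state = state.reshape(1, -1)             # shape (1, 1440): what the states() spec declares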
Nicolas Neubauer
@nneubauer
@AlexKuhnle I made a more or less minimal example, which might make things easier; your help is very appreciated: https://gist.github.com/nneubauer/fb1dbb4e95b01cb643bf0eb26226c2d2 To my eyes it should be alright, but flatten didn't help.
Alexander Kuhnle
@AlexKuhnle
@STRATZ-Ken, just to let you know: I think I will finish the parallel stuff over the next days, hopefully, and would then be happy to talk you through it.
@nneubauer will look at it now.
Alexander Kuhnle
@AlexKuhnle
@nneubauer, the following environment implementation should work:
import numpy as np

from tensorforce import Environment


class ForexEnvironment(Environment):

    def __init__(self):
        super().__init__()
        self.data_points = 1440

    def states(self):
        # One "row" of data_points float values per state.
        return dict(type='float', shape=(1, self.data_points))

    def actions(self):
        return dict(type='int', num_values=3)

    def current_state(self):
        # Placeholder data; replace with real observations.
        random_state = np.random.uniform(size=(1, self.data_points))
        return random_state

    def reset(self):
        return self.current_state()

    def execute(self, actions):
        terminal = False
        next_state = self.current_state()
        reward = 1.0
        return next_state, terminal, reward
Moreover:
    from tensorforce import Agent

    environment = Environment.create(environment=ForexEnvironment, max_episode_timesteps=1000)

    agent = Agent.create(
        agent='dqn', environment=environment,
        seed=8, memory=5000,
        network=dict(type='layered', layers=[
            dict(type='conv1d', size=8),
            dict(type='flatten'),
            dict(type='linear', size=8)
        ])
    )
differences:
  • the size of np.random.uniform includes the 1 axis
  • transposing the state should not be necessary (?)
  • memory has to be explicitly set now (new change, since the old default was a bit random)
  • the environment is created via Environment.create, and the max episode length is set there (right now necessary, but I'm planning to look into that soon)
  • conv1d_transpose doesn't work properly in the new TensorFlow (some weird gradient exception), so I removed it
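(To tie the two snippets together, a minimal training loop; this is a sketch assuming the standard Runner utility, with an arbitrary episode count:)
from tensorforce.execution import Runner

# Train the agent on the environment defined above.
runner = Runner(agent=agent, environment=environment)
runner.run(num_episodes=100)
runner.close()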
Alexander Kuhnle
@AlexKuhnle
I think that's all.
Nicolas Neubauer
@nneubauer
Thanks, I got it to work this way. I am wondering, however: shouldn't the shape be (1440, 1)? The Keras documentation says "e.g. input_shape=(10, 128) for time series sequences of 10 time steps with 128 features per step", and in my case I have 1440 timesteps with 1 feature.
Alexander Kuhnle
@AlexKuhnle
Right, I was wondering about this as well. It's better to change it, since the convolution is applied over the first dimension, and the feature size / second dimension should start off being 1.
(presumably that's where the transpose came from, but you can just swap it everywhere.)
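(A sketch of the corresponding change to the ForexEnvironment above, making the feature dimension the second axis as discussed; only the shape-related methods are shown:)
class ForexEnvironment(Environment):  # as above, only states()/current_state() shown

    def states(self):
        # 1440 timesteps, 1 feature per step, so conv1d convolves over time.
        return dict(type='float', shape=(self.data_points, 1))

    def current_state(self):
        return np.random.uniform(size=(self.data_points, 1))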
Nicolas Neubauer
@nneubauer
Right, thanks. Another question: is there any documentation on how multiple states are fed to the neural net? Assume I have, e.g., an image which I want to feed to the convolutional net, and at the same time some other features (e.g. the size of the image in bytes or the time of the last edit) which I want to include but not run through the conv layers; instead they should be fed, e.g., to the dense layer(s) at the end of the network.
Alexander Kuhnle
@AlexKuhnle
On the one hand, there is the 'auto' network, which provides a "default" way of combining multiple inputs. More generally, such networks can be specified via the Register and Retrieve layers, by giving the network not as a list of layers but as a list of lists, where each inner list generally starts with a retrieve layer and ends with a register layer, so each constitutes a sequential "segment" of the overall network.
There used to be examples in the unittests, but good point, they are all gone by now due to the auto network. It would be worth adding this to the documentation; I will look into it.
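(A rough sketch of such a multi-input network spec, assuming the register/retrieve layer arguments of the 0.5/0.6 releases; the state names 'image' and 'extras' are placeholders, and exact arguments may differ:)
# To be passed as Agent.create(..., network=network), for an agent whose states
# dict contains an 'image' state and an 'extras' feature vector.
network = [
    # Segment 1: convolutional branch for the 'image' state.
    [
        dict(type='retrieve', tensors=['image']),
        dict(type='conv2d', size=32),
        dict(type='flatten'),
        dict(type='register', tensor='image-embedding')
    ],
    # Segment 2: pass the auxiliary features through unchanged.
    [
        dict(type='retrieve', tensors=['extras']),
        dict(type='register', tensor='extras-embedding')
    ],
    # Segment 3: concatenate both embeddings, then dense layers at the end.
    [
        dict(type='retrieve', tensors=['image-embedding', 'extras-embedding'], aggregation='concat'),
        dict(type='dense', size=64)
    ]
]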
Alexander Kuhnle
@AlexKuhnle
(It may also help to have a look at the auto network class.)
Nicolas Neubauer
@nneubauer
Thanks, I will look into that.
IbraheemNofal
@IbraheemNofal
Hello there, so I've been recently experimenting with Tensorforce, and I've been pretty impressed with how easy and convenient it is to work with. One thing I can't figure out is how to build the model using TensorFlow or TensorFlow Lite after training and saving it. Could anyone please help me understand whether that's possible?
Alexander Kuhnle
@AlexKuhnle
Hi. Thanks for the feedback. :-) Regarding your question: Saving and loading an agent can be done via agent.save(...) and Agent.load(...) (see docs). Is that what you're looking for, or what more specifically do you mean by "building the model using tf or tf-lite after training"?
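(A minimal save/load sketch along those lines, assuming the directory-based saver interface from the docs; the directory path is a placeholder:)
# Save the trained agent to a checkpoint directory (placeholder path).
agent.save(directory='saved-agent')
agent.close()

# Later, restore the agent from the same directory.
from tensorforce import Agent
agent = Agent.load(directory='saved-agent')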
IbraheemNofal
@IbraheemNofal
Thanks for the reply. Now, I know that Tensorforce is built on top of TensorFlow, correct? In my case, I'd like to deploy / test my model on a low-powered IoT device after training. I know that TF-Lite would perform best in such a case, and that typically it's possible to convert TensorFlow models to TF-Lite, so I'm wondering whether the trained model would be TensorFlow- or TensorFlow-Lite-compatible, and whether it'd be possible to translate it to TF-Lite. I'm a beginner when it comes to machine learning, so I hope what I'm asking about makes sense.
Alexander Kuhnle
@AlexKuhnle
I haven't worked with tf-lite or similar myself, but it should be possible and I would be interested in working through it. Do you know what format is typically used to load models in tf-lite / how loading in tf-lite generally looks like? Is it SavedModel and specialized loading functionality, or can you just create and load it regularly in Python code? Tensorforce is currently still using the older TensorFlow Saver feature, so it may require upgrading to the new SavedModel stuff (shouldn't be too difficult, can look into that).
IbraheemNofal
@IbraheemNofal
I haven't worked with TF-Lite yet as I'm still working on building and refining my model, so I'll hopefully get back to you with any issues that might come along the way so we can hopefully work through them. Reading the tensorflow documentation, it seems like TF-Lite uses a proprietary .tflite format, but there is a tool to convert tensorflow's SavedModel to TF-Lite's .tflite format. I guess I'll have to look into upgrading from the saver feature to SavedModel for it to work. Thank you for your help and for pointing me in the right direction.
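(For reference, the standard SavedModel-to-TFLite conversion looks roughly like this; 'saved_model_dir' is a placeholder, and this assumes the model has already been exported in SavedModel format:)
import tensorflow as tf

# Convert an exported SavedModel directory (placeholder path) into a .tflite file.
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model_dir')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)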
Nicolas Neubauer
@nneubauer
Hi there, another question. I finally got TensorFlow running with GPU support, and Tensorforce as well. However, I am seeing zero benefit from using a GPU. One episode takes 2 seconds in my case, with 66% of the time spent in observe, no matter whether I run on GPU or CPU. I am positive it really runs on the GPU, as can be seen in nvidia-smi. The GPUs are 2x Tesla K80 on a cloud machine. Any hints what the problem might be?
(Btw, the more complex network setup worked very nicely.)
Only one GPU is utilized, at ~20%, while running. Can I tune some params to get more utilization out of the GPUs?
Alexander Kuhnle
@AlexKuhnle
@IbraheemNofal I can certainly help with the SavedModel stuff from the Tensorforce side, as this is on the roadmap anyway. I'll try to look into that in the next days.
@nneubauer 66% of the time spent in observe sounds like a lot, are you sure? What agent are you using, and could you post the spec? Usually, I would expect the GPU not to be used very much, because the interactions are dominated by many batch-size-1 act calls where a GPU is not particularly helpful (different from supervised learning, where this problem doesn't exist). In your case this sounds different; however, depending on your agent spec, there might be a simple improvement.
phurichai
@phurichai
@AlexKuhnle Thanks a lot for your suggestion. I tried what you suggested: I wrote a Python file named "RLenv1.py" and put an environment class called "Env1" there. Then, in another Python file in the same folder, I call
environment = Environment.create( environment='RLenv1.Env1', max_episode_timesteps=10)
and I still get the error "ModuleNotFoundError: No module named 'RLenv1'". Do you have any idea where the problem might be? Is there some complete example of a custom environment file structure that I can just emulate and modify bit by bit?
@nneubauer Thanks for your suggestion --- I'm a complete newbie, how does one create and pass an instance in place of a custom environment file?
Alexander Kuhnle
@AlexKuhnle
@phurichai here is some explanation, and here a corresponding unit test. Does this look similar, and can you run the latter (python -m unittest test/test_documentation.py in the main Tensorforce directory)?
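(A sketch of the file layout this assumes, using the names from the messages above; the key point is that RLenv1.py must be importable, i.e. live in the working directory or on PYTHONPATH:)
# RLenv1.py
from tensorforce import Environment

class Env1(Environment):
    ...  # states(), actions(), reset(), execute() as usual

# main script, in the same folder:
from tensorforce import Environment

environment = Environment.create(environment='RLenv1.Env1', max_episode_timesteps=10)

# Alternatively, import the class and pass it directly:
# from RLenv1 import Env1
# environment = Environment.create(environment=Env1, max_episode_timesteps=10)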
phurichai
@phurichai
Thanks @AlexKuhnle, I'm beginning to crack this!
Christian Hidber
@christianhidber
@AlexKuhnle I tried to figure out why model saving depends on the call to tf.compat.v1.disable_eager_execution() during initialisation, but failed. Do you think you might have a chance to take a look at it? Avoiding this call would be a big help for reintegrating Tensorforce in our package. Thanks a lot in advance.
Alexander Kuhnle
@AlexKuhnle
@christianhidber I will try next week. There's something about initialization and TF2, I came across that before, but since it worked in the current setting, I didn't bother too much.
Niklas Nolte
@niklasnolte

Hello,
I wonder, is this also the right place to ask questions about tf-agents? If not, please ignore the following:

I am trying to run a simple DQN agent over an environment that has a collection of different specs as its observation_spec. This means that my QNetwork wants a preprocessing_combiner, so I followed its documentation and gave it tf.keras.layers.Concatenate(). Concatenate does not like to concatenate tuples, so the observation_spec has to be a list. Next, I want to implement a replay_buffer, specifically a TFUniformReplayBuffer. Converting it to a dataset (as done in https://github.com/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb), I get an error and the explicit instruction to change my specs from lists to tuples. However, then I run into the problem that Concatenate doesn't like its input. Am I doing something conceptually wrong here?

Niklas Nolte
@niklasnolte
concat_layer = tf.keras.layers.Lambda(lambda x: tf.keras.layers.Concatenate()(list(x)))
This is my solution for a QNetwork.preprocessing_combiner, but I'm not sure if that's a good idea.
Alexander Kuhnle
@AlexKuhnle
hi Nicolas
^^ sorry, autocorrect, hi Niklas... this channel is for Tensorforce, which is an independent framework and unrelated to tf-agents.
Niklas Nolte
@niklasnolte
alright, thanks! sorry for the noise then
Alexander Kuhnle
@AlexKuhnle
no worries ;-)
Alexander Kuhnle
@AlexKuhnle
@STRATZ-Ken, if you're still interested in parallelized execution of environments, I've finished this feature now, and happy to talk through it.
Also @/all, FYI: parallelizing environments can now easily be done via the remote argument of Environment.create (based on either multiprocessing or sockets) and the extended features of Runner / run.py, which has basically been merged with the functionality of ParallelRunner (which itself was removed).
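(A rough sketch of what that can look like in code; the agent/environment specs and numbers are placeholders, and the num_parallel/remote argument names follow the later documentation, so details may differ slightly:)
from tensorforce.execution import Runner

# Run several environment copies in parallel, each in its own process.
runner = Runner(
    agent=dict(agent='ppo', batch_size=10),
    environment=dict(environment='gym', level='CartPole-v1'),
    max_episode_timesteps=500,
    num_parallel=4, remote='multiprocessing'
)
runner.run(num_episodes=100)
runner.close()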
Christian Hidber
@christianhidber
@AlexKuhnle thanks !
Qiao.Zhang
@qZhang88
@AlexKuhnle Hi, I am reading the source code and have some questions. Does the PPO agent train on entire episodes, or on batch_size timesteps from one or more episodes? Also, there is a parameter subsampling_fraction; the code shows that this randomly samples a fraction from one batch? Is this because it is a modular design and PPO belongs to the policy gradient methods, which use all timesteps in an episode to train? Thanks in advance for answering my questions.
Alexander Kuhnle
@AlexKuhnle
Hi @qZhang88, hope the following explanation clarifies your question: PPO, as many other standard policy gradient algorithms, uses complete rollouts (episodes) for reward estimation. In Tensorforce this means that batch_size defines the number of episodes (each consisting of many timesteps) per update batch. Moreover, the way the PPO update works according to the paper is that it actually performs multiple updates based on randomly subsampled timestep-minibatches (the entire batch of n episodes is quite big). So the subsampling_fraction specifies what fraction of the full batch is subsampled for each minibatch, and optimization_steps specifies how often these mini-updates should happen.
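(For illustration, how those three parameters line up in an agent spec; the concrete numbers are placeholders, not recommendations, and environment is assumed to be defined already:)
from tensorforce import Agent

agent = Agent.create(
    agent='ppo', environment=environment,
    # Update batch: 10 complete episodes.
    batch_size=10,
    # Each mini-update subsamples a random 20% of the batch's timesteps ...
    subsampling_fraction=0.2,
    # ... and 25 such mini-updates are performed per batch.
    optimization_steps=25
)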
B Walker
@bob7827
I'm new to Tensorforce. I have a simple question. I've created a PPO agent, but I can't save it (I get a JSON error regarding floating point). How can I access the underlying TensorFlow model so I can use HDF5 to save and load?
B Walker
@bob7827
I see AlexKuhnle's pull request to fix the JSON-encoding problem.
Alexander Kuhnle
@AlexKuhnle
Currently, the only way to save the agent is to use TensorFlow's saver format, unfortunately. However, it's planned to be updated and extended soon, including support for just extracting weight variables in numpy format or similar -- is this what you're looking for? (The recent JSON-encoding problem was about agent config saving.)
(Re TensorFlow's saver format: I don't know exactly what the format looks like, but it contains weights as well as meta information about the graph itself, so that the entire model can be reconstructed from the information.)