```
/site-packages/tensorflow/python/framework/indexed_slices.py:433: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
```
For int action masking, Tensorforce pops `name + '_mask'` from the state spec, where `name` is any name found in the action_spec. It does this recursively using `fmap()`, calling this function:

```python
# Separate auxiliaries
def function(name, spec):
    auxiliary = ArrayDict()
    if self.config.enable_int_action_masking and spec.type == 'int' and \
            spec.num_values is not None:
        if name is None:
            name = 'action'
        # Mask, either part of states or default all true
        auxiliary['mask'] = states.pop(name + '_mask', np.ones(
            shape=(num_parallel,) + spec.shape + (spec.num_values,), dtype=spec.np_type()
        ))
    return auxiliary
```
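To illustrate the lookup, here's a minimal sketch (the names and shapes are hypothetical, and the default-mask construction is simplified from the function above):

```python
# Hypothetical example: for an int action 'promise_date' with num_values=4,
# the masking mechanism looks for a state entry named 'promise_date_mask';
# if it's absent, an all-True mask is used instead.
import numpy as np

num_parallel, num_values = 1, 4
states = {
    'promise_date': np.array([1]),
    # One mask row per parallel environment; False = action value disallowed.
    'promise_date_mask': np.array([[True, True, False, False]]),
}

# Mirrors states.pop(name + '_mask', default_all_true) from the function above.
mask = states.pop('promise_date' + '_mask',
                  np.ones(shape=(num_parallel, num_values), dtype=bool))
assert mask.shape == (num_parallel, num_values)
```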
However, when I run it I get a KeyError exception, where the key is a root name from the action space. I instrumented the line in `nested_dict.pop()` that threw the error, like so:

```python
elif '/' in key:
    key, subkey = key.split('/', 1)
    if not key in self:
        print(f"pop {key} {subkey}")
        import pprint
        pprint.pprint(self)
    value = super().__getitem__(key)
    assert isinstance(value, self.__class__)
    return value.pop(subkey, default)
```
This is what it printed:

```
pop MID_1_counter_0 promise_date_mask
{'MID_1_bid_0/price': array([1.15]),
 'MID_1_bid_0/promise_date': array([4]),
 'MID_1_bid_0/qty': array([44135.2]),
 'MID_1_bid_0/supplier_tier': array([0]),
 ...
```

The KeyError is then raised at the `value = super().__getitem__(key)` line: the states dict stores flat keys containing '/' rather than nested sub-dicts, so the root name is never found at the top level.
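A self-contained sketch of the failure mode (a simplification written from the snippet above, not the real Tensorforce class):

```python
# Simplified sketch of the pop logic: '/'-separated keys are resolved by
# recursing into nested sub-dicts.
class NestedDict(dict):
    def pop(self, key, default=None):
        if '/' in key:
            key, subkey = key.split('/', 1)
            value = super().__getitem__(key)  # raises KeyError if absent
            assert isinstance(value, NestedDict)
            return value.pop(subkey, default)
        return super().pop(key, default)

# Properly nested states: the recursive pop works.
nested = NestedDict(counter=NestedDict(promise_date_mask=[True]))
assert nested.pop('counter/promise_date_mask') == [True]

# Flat keys that merely *contain* '/' (as in the printout above): the split
# lookup asks for the root name 'counter', which is not a top-level key.
flat = NestedDict({'counter/promise_date_mask': [True]})
try:
    flat.pop('counter/promise_date_mask')
except KeyError as error:
    print('KeyError:', error)
```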
"Converting sparse IndexedSlices to a dense Tensor of unknown shape." comes up if you use embeddings (used by the "auto" network if a state is `int`), and maybe in other situations as well. I've read a bit about it a while ago, and it doesn't seem to be critical if, e.g., the number of embeddings (`num_values` of the `int` state) is reasonable. Model initialization may take a while if the network is bigger -- is this the case for you?
I'd like to set `self.config.enable_int_action_masking` to False, but I don't see a way to do that... The config object explicitly overrides `__setattr__`, and I wasn't able to pass it as a constructor arg to the agent. So, what's the right way to do that?
Setting `enable_int_action_masking` can be done via the `config` argument of any agent (docs here). That should hopefully work. Regarding the `NestedDict` issue -- would you mind posting the shape exception, since I don't know why that would come up?
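A sketch of what that might look like ('ppo' and the environment are placeholders, and the exact option set depends on the Tensorforce version):

```python
# Sketch, assuming a pre-created `environment`; the config dict carries
# agent config options such as enable_int_action_masking.
from tensorforce import Agent

agent = Agent.create(
    agent='ppo',  # placeholder agent type
    environment=environment,
    config=dict(enable_int_action_masking=False),
)
```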
```
  File "train_rl_agent.py", line 77, in run_agent
    runner.run(num_episodes=sim_config["train_episodes"])
  File "/home/hinrichs/build/tensorforce/tensorforce/execution/runner.py", line 545, in run
    self.handle_act(parallel=n)
  File "/home/hinrichs/build/tensorforce/tensorforce/execution/runner.py", line 579, in handle_act
    actions = self.agent.act(states=self.states[parallel], parallel=parallel)
  File "/home/hinrichs/build/tensorforce/tensorforce/agents/agent.py", line 388, in act
    deterministic=deterministic
  File "/home/hinrichs/build/tensorforce/tensorforce/agents/recorder.py", line 267, in act
    num_parallel=num_parallel
  File "/home/hinrichs/build/tensorforce/tensorforce/agents/agent.py", line 415, in fn_act
    states = self.states_spec.to_tensor(value=states, batched=True, name='Agent.act states')
  File "/home/hinrichs/build/tensorforce/tensorforce/core/utils/tensors_spec.py", line 57, in to_tensor
    value=value[name], batched=batched, recover_empty=recover_empty
  File "/home/hinrichs/build/tensorforce/tensorforce/core/utils/tensors_spec.py", line 57, in to_tensor
    value=value[name], batched=batched, recover_empty=recover_empty
  File "/home/hinrichs/build/tensorforce/tensorforce/core/utils/tensor_spec.py", line 149, in to_tensor
    raise TensorforceError.value(name=name, argument='value', value=value, hint='shape')
tensorforce.exception.TensorforceError: Invalid value for TensorSpec.to_tensor argument value: 0 shape.
```
@AlexKuhnle I've figured out what's happening, but I don't fully understand the cause. I instrumented the code where the exception is raised like so:

```python
# Check whether shape matches
if value.shape[int(batched):] != self.shape:
    print(f"\nvalue {value} type {type(value)} batched {batched}")
    print(f"value shape {value.shape[int(batched):]} self shape {self.shape}")
    import pprint
    pprint.pprint(self)
    pprint.pprint(value)
    raise TensorforceError.value(name=name, argument='value', value=value, hint='shape')
```

and this is what it prints:

```
value [0] type <class 'numpy.ndarray'>
value shape () self shape (1,)
TensorSpec(type=int, shape=(1,), num_values=4)
array([0])
```
The reason is that `batched` is True, but the value shape doesn't have a batch dimension. I tried omitting `batch_size` from the agent params, but it tells me that one is required.
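The check can be reproduced with plain NumPy (a sketch of the shape logic, not Tensorforce code): with `batched=True`, the value must carry a leading batch dimension in front of the spec shape `(1,)`.

```python
import numpy as np

spec_shape = (1,)

value = np.array([0])                   # shape (1,): no batch dimension
assert value.shape[1:] != spec_shape    # () != (1,): this is the raised error

batched_value = value[np.newaxis, ...]  # shape (1, 1): batch dimension added
assert batched_value.shape[1:] == spec_shape
```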
I'm getting `OSError: [Errno 12] Cannot allocate memory`. I also note that when I first instantiate the agent, it takes about 3 minutes to complete the `Agent.create()` process. So far I've been using `network='auto'`. I printed out the state and action spec, and my state space has a total of 59 scalar components, and the action space has a total of 80.
In `execution.Runner`, the constructor takes an argument `evaluation`, and the comment says that if there are multiple environments it will run the last one only, but in `Runner.run()` it raises an exception like so:

```python
if evaluation and (self.evaluation or len(self.environments) > 1):
    raise TensorforceError.unexpected()
```

.. and the comment says that it is an error to pass `evaluation=True` with multiple environments. Which behavior is intended to be the standard? As it is, `run()` controls because you have to call it. Also, why would it throw an error if `evaluation` and `self.evaluation` are both True?
`Runner.run()` can be used multiple times in the "standard" use case of a single environment, in particular training via `runner.run(num_episodes=???)` and subsequent evaluation via `runner.run(num_episodes=???, evaluation=True)` (that's the `run()` `evaluation` argument); on the other hand, it provides an interface to parallel execution, but in that case you can't just switch from a training `run()` to an evaluation `run()` -- however, you can specify that one of the parallel environments is used for evaluation throughout (that's the constructor `evaluation` argument).
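A sketch of the two patterns (the agent type, environment, and episode counts are placeholders, not from the discussion):

```python
# Sketch only: 'ppo' and 'CartPole-v1' are placeholder choices.
from tensorforce.execution import Runner

# Pattern 1: single environment -- train, then evaluate via run(evaluation=True).
runner = Runner(agent='ppo', environment='CartPole-v1', max_episode_timesteps=500)
runner.run(num_episodes=100)                  # training
runner.run(num_episodes=10, evaluation=True)  # evaluation
runner.close()

# Pattern 2: parallel execution -- dedicate one environment to evaluation
# via the constructor's evaluation argument.
runner = Runner(
    agent='ppo', environment='CartPole-v1', max_episode_timesteps=500,
    num_parallel=4, evaluation=True,  # the last environment only evaluates
)
runner.run(num_episodes=100)
runner.close()
```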
The `run()` arguments have not been separated in a principled way for a long time, and really a `Runner` should probably be a one-off specification of a run which cannot be re-used. Maybe `Runner.run()` should become a static method with all arguments moved there.
```
  File "socket_server.py", line 72, in handle_biotech
    max_episode_timesteps = weeks + 1
  File "/home/ubuntu/tensorforce/tensorforce/execution/runner.py", line 151, in __init__
    remote=remote, blocking=blocking, host=host[0], port=port[0]
  File "/home/ubuntu/tensorforce/tensorforce/environments/environment.py", line 139, in create
    environment=environment, max_episode_timesteps=max_episode_timesteps
  File "/home/ubuntu/tensorforce/tensorforce/environments/environment.py", line 344, in __init__
    print(f"environment.max_episode_timesteps() {environment.max_episode_timesteps()}")
TypeError: <lambda>() missing 1 required positional argument: 'self'
```
It's not a `tensorforce.Unexpected`, it's a `TypeError` of a lambda missing a `self` arg. The reason is that immediately below, the Runner checks whether the environment's `max_episode_timesteps()` returns None, and if so it equips it with a lambda that takes a `self` argument, just like a class method. But since it's a lambda assigned to the instance, it doesn't actually get passed that `self` argument. I changed that line to look like this, and it fixed the problem:

```python
if self._environment.max_episode_timesteps() is None:
    self._environment.max_episode_timesteps = (lambda: max_episode_timesteps)
```
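The underlying Python behavior can be shown in isolation (a generic sketch, not Tensorforce code): a function assigned to an instance attribute stays a plain function rather than becoming a bound method, so no `self` is passed.

```python
class Env:
    pass

env = Env()

# A lambda with a self parameter, assigned to the *instance*: calling
# env.max_episode_timesteps() passes zero arguments, so it fails.
env.max_episode_timesteps = lambda self: 100
try:
    env.max_episode_timesteps()
except TypeError as error:
    print(error)  # <lambda>() missing 1 required positional argument: 'self'

# The fix from above: a zero-argument lambda on the instance.
env.max_episode_timesteps = lambda: 100
assert env.max_episode_timesteps() == 100
```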
Each int action has a fixed number of options (`num_values`), and if you look at your action space as a product space, `(2,2,4) x 5 x (10,3,4) x 2 x (10,2,4,4) x 2`, that's of course gigantic. However, Tensorforce splits this space into its factors, so it's `(2,2,4)` actions with 5 options, and `(10,3,4) + (10,2,4,4)` actions with two options, and so each individual action is actually quite "low-dimensional", and this factorization works well if the actions are not correlated in very complex ways -- which presumably they aren't. I'm not sure how common such a factorization is for other frameworks, but I would be surprised if this is very uncommon. Anyway, that's the only additional feature I can think of in Tensorforce which is beneficial in such a context (and this factorization may go particularly well with the "dueling" part, but not sure).
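The size argument can be made concrete (shapes and `num_values` taken from the message above):

```python
# Joint product space vs. Tensorforce's factored view of the same actions.
from math import prod

specs = [((2, 2, 4), 5), ((10, 3, 4), 2), ((10, 2, 4, 4), 2)]

# Joint view: one decision over every combination of all scalar actions.
joint_size = prod(n ** prod(shape) for shape, n in specs)

# Factored view: each scalar sub-action is a separate small decision.
num_scalar_actions = sum(prod(shape) for shape, n in specs)

print(num_scalar_actions)      # 456 scalar actions with 5 or 2 options each
print(joint_size > 10 ** 100)  # True: the joint space is astronomically large
```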