These are chat archives for ethersphere/orange-lounge

30th
Sep 2017
lash
@nolash
Sep 30 2017 08:52
@zelig could you send me healthy snapshots of 64 and 128 nodes please? I think the ones I make of that size are bogus.
(or someone)
Viktor Trón
@zelig
Sep 30 2017 09:15
yes
i pm'd you btw :)
Aron
@homotopycolimit
Sep 30 2017 10:08

Hey everyone, please excuse this plug for Colony, but I just wanted to post this .... after many months of work, we've finally released our whitepaper! :)
Why Colony: https://blog.colony.io/why-colony-2a1e479dc40d
The Whitepaper: https://blog.colony.io/the-colony-whitepaper-502a7b5722b2
Why no ICO: https://blog.colony.io/the-colony-token-sale-7ac14c845bc0

feedback always welcome.

...and of course the whitepaper is on swarm at whitepaper.joincolony.eth

^ log + core dump (161M) of hanging test, when running pss test with 16 nodes and 4096 msgs. Freezes after around 1200 msgs. I'm wondering whether the deadlock is in the p2p layer. Have we made stress tests for p2p yet?
Viktor Trón
@zelig
Sep 30 2017 12:25
oh dear
did you switch discovery off yes?
lash
@nolash
Sep 30 2017 14:08
yes no discovery
eh .. yes, I've turned off discovery :)
lash
@nolash
Sep 30 2017 15:21
@zelig have a look now? ethersphere/go-ethereum@e3871bd
(pls)
lash
@nolash
Sep 30 2017 15:34
the concurrency makes the deadlock occur more frequently, it seems
the send concurrency. it was to be expected.
lash
@nolash
Sep 30 2017 18:35
INFO [09-30|20:24:12] network test nodecount=32 msgcount=512 addrhint size=2 caller=pss_test.go:499
TRACE[09-30|20:24:12] node fcdcb9c138c3bdc51b6280a1224419097c4fb188867ca381ac9b92ab08f9fdef0c6267d9ce9fa53944e39cf1b76ce6bf3b17272f97a457649bff6a838eccbf69 created caller=network.go:135
TRACE[09-30|20:24:12] starting node fcdcb9c138c3bdc51b6280a1224419097c4fb188867ca381ac9b92ab08f9fdef0c6267d9ce9fa53944e39cf1b76ce6bf3b17272f97a457649bff6a838eccbf69: false using exec-adapter caller=network.go:191
t=2017-09-30T20:24:12+0200 lvl=crit msg="error decoding _P2P_NODE_CONFIG" err="json: cannot unmarshal object into Go struct field Config.Logger of type log.Logger"
ERROR[09-30|20:24:22] node failed to start err="timed out waiting for WebSocket address on stderr" caller=exec.go:167
WARN [09-30|20:24:22] start up failed: timed out waiting for WebSocket address on stderr caller=network.go:193
@lmars trying to start ExecAdapter - seems to want Websocket address from stderr somehow, don't quite understand how it's supposed to work? ^
Lewis Marshall
@lmars
Sep 30 2017 18:37
ok, the real error is this one:
msg="error decoding _P2P_NODE_CONFIG" err="json: cannot unmarshal object into Go struct field Config.Logger of type log.Logger"
so the node is failing to start, and then the simulation fails because the node didn't print its WebSocket address on stderr (because it in fact did not start)
so it is the addition of the logger in ethereum/go-ethereum#15198 that is the issue, we probably need a test of the ExecAdapter on master
so the logger needs to be assigned in the child process rather than in the simulation (i.e. this https://github.com/ethersphere/go-ethereum/blob/p2p-simulations-fixes/p2p/simulations/adapters/exec.go#L107)
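
A minimal sketch of the failure described above, assuming the node config is passed to the child process as JSON. `Logger`, `stdoutLogger`, `BrokenConfig`, `FixedConfig` and `roundTrip` are hypothetical stand-ins, not the actual go-ethereum types. It shows that an interface-typed field with methods cannot be decoded from a JSON object, and that excluding the field with `json:"-"` and assigning the logger on the receiving side avoids the error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Logger stands in for log.Logger: an interface with at least one method.
type Logger interface {
	Info(msg string)
}

// stdoutLogger is a hypothetical concrete logger with no exported fields.
type stdoutLogger struct{}

func (stdoutLogger) Info(msg string) { fmt.Println("INFO", msg) }

// BrokenConfig serializes its Logger field, which cannot be decoded again.
type BrokenConfig struct {
	ID     string
	Logger Logger
}

// FixedConfig excludes the logger from JSON; the receiver assigns it.
type FixedConfig struct {
	ID     string
	Logger Logger `json:"-"`
}

// roundTrip encodes src and decodes the result into dst, as a parent
// process would when handing a config to a child process.
func roundTrip(src, dst interface{}) error {
	data, err := json.Marshal(src)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, dst)
}

func main() {
	var broken BrokenConfig
	err := roundTrip(BrokenConfig{ID: "node1", Logger: stdoutLogger{}}, &broken)
	fmt.Println("broken:", err)

	var fixed FixedConfig
	err = roundTrip(FixedConfig{ID: "node1", Logger: stdoutLogger{}}, &fixed)
	fmt.Println("fixed:", err)

	// The logger is assigned after decoding, on the receiving side.
	fixed.Logger = stdoutLogger{}
	fixed.Logger.Info("decoded config for " + fixed.ID)
}
```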
lash
@nolash
Sep 30 2017 18:47
ok
Lewis Marshall
@lmars
Sep 30 2017 18:48
it would be great if you could push that fix to the PR (and look at the other compilation issues :grin:) ethereum/go-ethereum#15198
oh there is no compilation issue actually, just a failing test
lash
@nolash
Sep 30 2017 18:53
I'll try to whack it together. I'm going to look at the netpipe too. There be deadlocks in the pss tests still, and I don't think it's the pss itself :/
Lewis Marshall
@lmars
Sep 30 2017 18:53
ah ok :/
Viktor Trón
@zelig
Sep 30 2017 19:04
is it not simply the net.Pipe that is the problem? I recommended to @nolash to try out a full implementation of net.Pipe with SetWriteDeadline and see if that solves the issue. Any insight?
Lewis Marshall
@lmars
Sep 30 2017 19:05
the ExecAdapter uses real connections with deadlines so it's worth checking if that works
Viktor Trón
@zelig
Sep 30 2017 19:05
yep
that is how we landed there ;)
instead of the simple net.Pipe I really wouldn't mind using a proper buffered pipe of some sort and going back on the forced async sends. I think it is really annoying to have to use them only because of the simulation
Lewis Marshall
@lmars
Sep 30 2017 19:07
ok, so we could use os.Pipe instead
Viktor Trón
@zelig
Sep 30 2017 19:08
oh is it that simple?
does that implement net.Conn?
Lewis Marshall
@lmars
Sep 30 2017 19:09
they are os.File so you need to wrap them with net.FileConn: https://golang.org/pkg/net/#FileConn
I have no idea if it works on Windows :wink:
Viktor Trón
@zelig
Sep 30 2017 19:09
oh but then we run into the too many file descriptors problem no?
windows i dont care about
Lewis Marshall
@lmars
Sep 30 2017 19:09
but on Linux it uses socketpair(2)
you will indeed have lots of descriptors
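
A rough sketch of the socket pair idea, for Unix-like systems only. As far as I know `net.FileConn` expects a socket descriptor, so this sketch calls `syscall.Socketpair` directly instead of `os.Pipe`. `socketPipe` and `sendOverSocketPipe` are hypothetical helpers. Each pair keeps two descriptors open for as long as the connections live.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"syscall"
	"time"
)

// socketPipe returns two connected net.Conn values backed by a Unix socket pair.
func socketPipe() (net.Conn, net.Conn, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	f1 := os.NewFile(uintptr(fds[0]), "pipe-a")
	f2 := os.NewFile(uintptr(fds[1]), "pipe-b")
	// net.FileConn duplicates the descriptor, so the originals are closed here.
	defer f1.Close()
	defer f2.Close()

	c1, err := net.FileConn(f1)
	if err != nil {
		return nil, nil, err
	}
	c2, err := net.FileConn(f2)
	if err != nil {
		c1.Close()
		return nil, nil, err
	}
	return c1, c2, nil
}

// sendOverSocketPipe writes msg on one end before anything reads, then
// reads it back from the other end. The kernel buffers the write.
func sendOverSocketPipe(msg string) (string, error) {
	a, b, err := socketPipe()
	if err != nil {
		return "", err
	}
	defer a.Close()
	defer b.Close()

	a.SetWriteDeadline(time.Now().Add(time.Second))
	if _, err := a.Write([]byte(msg)); err != nil {
		return "", err
	}
	b.SetReadDeadline(time.Now().Add(time.Second))
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(b, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := sendOverSocketPipe("ping")
	fmt.Println(got, err)
}
```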
Viktor Trón
@zelig
Sep 30 2017 19:10
we already have a problem with pss actually
Lewis Marshall
@lmars
Sep 30 2017 19:10
but you can raise that, at least on Linux
Viktor Trón
@zelig
Sep 30 2017 19:11
if we use 256 nodes it fails because each simnode uses its own dpa cache, i.e. a database...
how can we raise it?
Lewis Marshall
@lmars
Sep 30 2017 19:11
ok, that seems like an easy one to fix :)
with ulimit
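
Besides `ulimit -n` in the shell, the limit can also be raised from inside the test process. A minimal sketch, assuming Linux; `raiseFileLimit` is a hypothetical helper that lifts the soft open-file limit up to the hard limit. Going beyond the hard limit still needs system configuration.

```go
package main

import (
	"fmt"
	"syscall"
)

// raiseFileLimit raises the soft RLIMIT_NOFILE limit to the hard limit
// and returns the limit in effect afterwards.
func raiseFileLimit() (syscall.Rlimit, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return lim, err
	}
	lim.Cur = lim.Max
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return lim, err
	}
	// Read the limit back to report what the kernel actually applied.
	err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim)
	return lim, err
}

func main() {
	lim, err := raiseFileLimit()
	fmt.Println("soft:", lim.Cur, "hard:", lim.Max, "err:", err)
}
```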
Viktor Trón
@zelig
Sep 30 2017 19:12
genius marshall
thanks
Viktor Trón
@zelig
Sep 30 2017 19:13
@nolash copy?
Lewis Marshall
@lmars
Sep 30 2017 19:13
as for pss, can they not use one cache with a namespace or something?
Viktor Trón
@zelig
Sep 30 2017 19:13
exactly what i am planning with mock db for chunks
see the roadmap ;)
Lewis Marshall
@lmars
Sep 30 2017 19:14
:+1:
Viktor Trón
@zelig
Sep 30 2017 19:14
i want syncing tests to be able to run as if normal totally without passing chunks
actually i was thinking not to use namespace but an extra structure in the db using a bitvector to show which node is supposed to have the chunk
that way snapshotting etc is much leaner
although it wont allow concurrent retrievals ...
thoughts?
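
One way the bitvector idea could look, as a sketch only: a single shared store keeps each chunk once and tracks per chunk which simulated nodes are supposed to have it. `sharedStore` and its methods are hypothetical names, and the single mutex is a simplification that serializes all access.

```go
package main

import (
	"fmt"
	"math/big"
	"sync"
)

// sharedStore is a hypothetical mock chunk DB shared by all simulated nodes.
type sharedStore struct {
	mu     sync.Mutex
	data   map[string][]byte   // chunk payload, stored once
	owners map[string]*big.Int // bit i set: node i is supposed to have the chunk
}

func newSharedStore() *sharedStore {
	return &sharedStore{data: map[string][]byte{}, owners: map[string]*big.Int{}}
}

// Put records that node holds the chunk, storing the payload only once.
func (s *sharedStore) Put(node int, key string, chunk []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.data[key]; !ok {
		s.data[key] = chunk
		s.owners[key] = new(big.Int)
	}
	s.owners[key].SetBit(s.owners[key], node, 1)
}

// Has reports whether node is marked as holding the chunk.
func (s *sharedStore) Has(node int, key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	bits, ok := s.owners[key]
	return ok && bits.Bit(node) == 1
}

func main() {
	store := newSharedStore()
	store.Put(3, "chunk-a", []byte("payload"))
	store.Put(200, "chunk-a", []byte("payload"))
	fmt.Println(store.Has(3, "chunk-a"), store.Has(4, "chunk-a"), store.Has(200, "chunk-a"))
}
```

A snapshot then only needs the payloads plus one bitvector per chunk, rather than one database per node.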
Aron
@homotopycolimit
Sep 30 2017 20:18
updated theswarm.eth
http://30402.swarm-gateways.net/bzz:/3e6083fa1d228326572f40714134a2a5a241ed77fee306c921ed203ac2d1268c/
If I don't hear objections I'll update the ENS entry to this ^