These are chat archives for ethersphere/orange-lounge

30th
Mar 2018
lash
@nolash
Mar 30 09:49
@nonsense where u at wit dis dawg? ethersphere/go-ethereum#346
are you working through easter?
lash
@nolash
Mar 30 09:55
@MistokenToken I think you want https://gitter.im/ethereum/swarm
Anton Evangelatov
@nonsense
Mar 30 10:42
@nolash yes - it looks like travis on macOS go 1.10 is very slow
i can’t reproduce these macOS failures locally at all (and I am on an 2011 late macbook pro machine)
first they were failing with too many open files error, now they are just failing...
i’m thinking of just disabling them for macOS for the time being
we also have a failure - timer_test.go:51: tm.Max(): 35e6 > 110571792 || 110571792 > 95e6, which means that the machine is quite slow
lash
@nolash
Mar 30 10:46
@nonsense MacOS builds on travis were a problem all through the make-travis-green-again push, I remember.
So the rest is working now?
Anton Evangelatov
@nonsense
Mar 30 10:46
yes
lash
@nolash
Mar 30 10:46
I'll give it a spin!
Anton Evangelatov
@nonsense
Mar 30 10:48
yes, try it locally
ultimately the problem with the 9-10 messages missing was the startup time. after this was fixed (with a sleep), the problem is that we reuse some of the kademlia components from newServices
lash
@nolash
Mar 30 10:50
Brilliant!
It works :)
I think I ran the network tests separately, and they still failed. Which means the first test would be one of the ones that failed, with a fresh kademlia
Anton Evangelatov
@nonsense
Mar 30 10:52
on my end, the tests were passing in isolation
but were consistently failing when ran together
which makes sense, because i manually did the snapshots for 2 3 4 nodes
and they contain the same node ids, etc.
lash
@nolash
Mar 30 10:53
right
So, great work on finding this.
Is there a way to make tests contingent on architecture?
Or ... is there a different way to vet the MacOS builds? Actually, maybe we should talk to Travis about this, because they've been an issue for awhile it seems
Balint Gabor
@gbalint
Mar 30 10:56
@nonsense @janos @nolash FYI I merged the encryption PR, if you still find any problems I will fix them on the fly.
lash
@nolash
Mar 30 10:56
@gbalint congrats!
Janoš Guljaš
@janos
Mar 30 10:56
@gbalint :+1:
Balint Gabor
@gbalint
Mar 30 10:57
thanks :)
lash
@nolash
Mar 30 11:06
oooh lots of conflicts with the validator branch %)
Balint Gabor
@gbalint
Mar 30 11:07
@nolash tell me if i can help...
lash
@nolash
Mar 30 11:18
@gbalint thanks I think it's ok
I'll probably need your eyes on the PR after the rebase tho
Balint Gabor
@gbalint
Mar 30 11:42
@nolash :+1:
lash
@nolash
Mar 30 11:44
@nonsense there are still races in pretty much every pss test
(I must confess I've never used this -race switch before)
Anton Evangelatov
@nonsense
Mar 30 11:52
ahh damn
i did run them with -race a few months back
and i fixed all the issues it found back then
let me run it again and see what it reports
these tests are really nasty :D they pass and fail in such a weird way
lash
@nolash
Mar 30 11:56
So under no circumstances should races (according to go tools) be allowed to exist, even in contrived tests like these?
Anton Evangelatov
@nonsense
Mar 30 11:57
well it is very difficult to reason about whether this race may occur in real-world circumstances or not
if you don’t have a good intuition explaining that it WON’T occur, then we should fix it
lash
@nolash
Mar 30 11:58
Maybe we could do a hangout later today to look at this?
Anton Evangelatov
@nonsense
Mar 30 12:01
WARNING: DATA RACE
Write at 0x00c4202a40a8 by goroutine 190:
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).enqueue()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:694 +0x4fd
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).send()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:666 +0xba1
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).SendAsym.func1()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:613 +0x137

Previous read at 0x00c4202a40a8 by goroutine 59:
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).dequeue()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:699 +0x5d
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).Start.func2()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:186 +0x15a
this is def relevant
it is not the slice, but a counter
i am not sure a hangout would help, i am pretty much out of ideas at this point, have to go thoroughly through the current state of the code once again i guess
WARNING: DATA RACE
Write at 0x00c4201dc920 by goroutine 207:
  github.com/ethereum/go-ethereum/swarm/storage.(*PyramidChunker).incrementWorkerCount()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/pyramid.go:156 +0xa0
  github.com/ethereum/go-ethereum/swarm/storage.(*PyramidChunker).prepareChunks()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/pyramid.go:382 +0x174
  github.com/ethereum/go-ethereum/swarm/storage.(*PyramidChunker).Split()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/pyramid.go:183 +0x436
  github.com/ethereum/go-ethereum/swarm/storage.(*DPA).Store()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/dpa.go:104 +0xb3
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).storeMsg()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:834 +0x1bb
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).enqueue()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:678 +0x14a
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).send()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:666 +0xba1
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).SendAsym.func1()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:613 +0x137

Previous read at 0x00c4201dc920 by goroutine 193:
  github.com/ethereum/go-ethereum/swarm/storage.(*PyramidChunker).prepareChunks()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/pyramid.go:384 +0x191
  github.com/ethereum/go-ethereum/swarm/storage.(*PyramidChunker).Split()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/pyramid.go:183 +0x436
  github.com/ethereum/go-ethereum/swarm/storage.(*DPA).Store()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/storage/dpa.go:104 +0xb3
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).storeMsg()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:834 +0x1bb
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).enqueue()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:678 +0x14a
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).send()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:666 +0xba1
  github.com/ethereum/go-ethereum/swarm/pss.(*Pss).SendAsym.func1()
      /Users/nonsense/code/src/github.com/ethereum/go-ethereum/swarm/pss/pss.go:613 +0x137
lash
@nolash
Mar 30 12:03
ah
Anton Evangelatov
@nonsense
Mar 30 12:03
this is also relevant IMO
lash
@nolash
Mar 30 12:03
well don't worry about if it's in storage
because storage will be removed
Anton Evangelatov
@nonsense
Mar 30 12:03
ok, yeah i saw we use it for some kind of cache at the moment
lash
@nolash
Mar 30 12:03
it's in one of the PRs I've prepared
But the first "counter" ... what do you mean?
Anton Evangelatov
@nonsense
Mar 30 12:04
the slice race is also reported (the one I commented on the ticket)
well, the counter is read and modified in multiple goroutines
we increment it in one, whereas we compare it in another - it has to be locked
lash
@nolash
Mar 30 12:04
So use atomic
?
Anton Evangelatov
@nonsense
Mar 30 12:05
either use atomic, or just lock it
we should try the tests again after those 2 races are fixed
  1. slice
  2. atomic counter
i am not sure how relevant the other 2 are.
lash
@nolash
Mar 30 12:06
I did put in a lock
but there were still lots of races
Balint Gabor
@gbalint
Mar 30 12:06
and………. my first real encrypted swarm upload
swarm-test-cluster git:(master) ✗ curl -H "Content-Type: text/plain" --data-binary @BoardingPass.pdf http://0.0.0.0:8501/bzzr:/encrypt
a95e6fe0faf7998b44508464da19f43420f31b98842dcd39269db40f990fac50ac67d2cced88ab8fe775eb113105f057bf1143b8a144e26c0123ab29ee71a10f%
you can see that hash returned is 64 bytes - it contains the encryption key
:fireworks:
lash
@nolash
Mar 30 12:07
So as long as the counter is ok, the array access shouldn't be a problem (go tool is maybe a bit over-protective?)
Balint Gabor
@gbalint
Mar 30 12:07
(for the note: and retrieve works and it returns the same file)
Anton Evangelatov
@nonsense
Mar 30 12:07
yes, could be
lash
@nolash
Mar 30 12:07
@gbalint looks promising. I'm looking forward to trying this!
Anton Evangelatov
@nonsense
Mar 30 12:08
go tool reports 50 data races @nolash , we should go over them one by one, and probably fix those within pss if we think they are real. then hopefully the tests would be more stable
lash
@nolash
Mar 30 12:08
yeah it's a LOT
Tests are pretty stable now, though, it seems?
Or do you think these races could be what's messing with the MacOS build, for example?
Anton Evangelatov
@nonsense
Mar 30 12:09
@nolash i think these races are messing up the macOS build, yes
maybe they are not so visible on linux
sock adapter is more concurrent than the sim adapter. sim adapter is much more stable because it is synchronous
Anton Evangelatov
@nonsense
Mar 30 12:21
@nolash https://golang.org/doc/faq#atomic_maps - regarding maps, i somehow can’t find info about slices, but they are also not safe for concurrent use
but its old
and behavior allegedly changed in 1.6 for maps, before read was safe
@gbalint the ChunkerParams are gone?
The values set with them, where are they set now?
@nonsense Maybe for fun let's make some very simple code and disassemble to see how the array write is done
@gbalint Please, your eyes on the rebased code when you have a moment, thanks: ethersphere/go-ethereum#337
Balint Gabor
@gbalint
Mar 30 12:46
@nolash ChunkerParams had a Hash, that is in DPAParams now
(because chunker doesn’t depend on the hash anymore as it doesn’t do any hashing, that is done by the hasherstore)
It also had a Branches field, that is eliminated, it is just calculated from the DefaultChunkSize and the hash size
Anton Evangelatov
@nonsense
Mar 30 12:49
@nolash yeah, i wouldn’t risk it, better add the locks
As long as you can guarantee the areas won't overlap, it's fine. - on slice or array - i am not sure how true this is
i’ve heard a similar statement before, but i think they introduce some kind of change around go 1.5 - 1.6, so not sure if this still holds
lash
@nolash
Mar 30 13:18
It would be interesting to investigate closer. If it's possible to avoid the overhead of locks all the time, I'd like to know.
@gbalint cool
Balint Gabor
@gbalint
Mar 30 13:42
@nolash can you help me what should i mostly check, where were the conflicts?
lash
@nolash
Mar 30 13:53
the conflicts were
1) ConfigParams
2) several in swarm/storage/types.go
3)swarm.go where I had moved some initializations further down in NewBzz
I just want to make sure I haven't removed any code needed by your changes inadvertently (i had cut two functions needed by mine in the rebase, which I manually replaced)
lash
@nolash
Mar 30 16:04
@gbalint any open tasks for the merge I could look at?
lash
@nolash
Mar 30 16:57
This for example? ethersphere/go-ethereum#324
Balint Gabor
@gbalint
Mar 30 19:44
@nolash sure, that’s fine to take
and the rebase looks fine