Christian Muehlhaeuser
@muesli
What issues are you facing on Linux?
Christoph Haas
@Softener_gitlab
Morning. I'd like to try knoxite because it looks similar to 'restic' which has some rough edges I currently don't like. However I can't install it using "go get…". I end up with build errors apparently referring to github.com/klauspost/reedsolomon which eventually lead to "asm: too many errors". What might be the cause?
System in question is Ubuntu 18.04. "apt install golang".
Christian Muehlhaeuser
@muesli
@Softener_gitlab Sorry for the late response here, I hate gitter's notification model 😒 I think we resolved this over on github already, tho? :)
Kilian Lackhove
@crabmanX
@muesli I just stumbled over knoxite and went through the website/readme, but keep wondering what the main difference is to e.g. restic or duplicacy?
Christian Muehlhaeuser
@muesli
modular architecture, fault tolerance / Reed-Solomon encoding, future sasquatch integration (asymmetric backups)
@crabmanX ^
Prof-Bloodstone
@Prof-Bloodstone
Hey! I'm looking forward to release 1.0.
Were there any, at least initial, tests performed that compare knoxite to other solutions? I'm mainly thinking about disk, time and bandwidth usage - restic and kopia seem to be eating bandwidth a lot, especially downloading stuff when pruning is performed. I know that it's hard to test it all, because each backup source will trigger different behaviors in the tools.
Prof-Bloodstone
@Prof-Bloodstone
I wish the compression information was more complete. I can't seem to find a comparison of the algorithms used - apart from some random spreadsheet in https://github.com/klauspost/compress
It also seems like the default Go implementations are slowish, from what I read. Especially given that none of them seem to be parallel?
Prof-Bloodstone
@Prof-Bloodstone
If anyone is interested - quick and dirty comparison of the compression options available in knoxite: https://gist.github.com/Prof-Bloodstone/85b1d3077db7aeb0ac8bcbaa49f730bb
Wanted to rerun it on a bigger data set on a dedicated VM to get more accurate results, but I hit OOM knoxite/knoxite#188
Christian Muehlhaeuser
@muesli
Hey @Prof-Bloodstone !
Nice comparison!
I'll look into the memory issue asap! This is using the latest git-master?
I'm a bit surprised by zstd's weak performance. It's a native implementation in Go, which was actually supposed to be fairly decent.
Prof-Bloodstone
@Prof-Bloodstone
@muesli Yes, I mentioned the git commit used for compiling it. It's git-master with my PR merged :)
I'm also really surprised that it performed so poorly compared to others - given that all algorithms produced roughly the same sized repository. I've read good things about zstd and saw that restic wanted to go zstd only.
Christian Muehlhaeuser
@muesli
We could try an alternative implementation. The reference C implementation of zstd is probably a bit faster, even though I'd hate the cgo dependency
Makes releasing and cross-platform support quite a bit harder
We should probably add benchmarks to our test suite
Prof-Bloodstone
@Prof-Bloodstone
Yeah, I'm not sure if this is an issue in the implementation (in which case it'd probably make sense to open an issue on Klaus' repository), or my data set just wasn't ideal for it (maybe the default ZSTD writer tries harder to compress the files?).
I feel like there's too much variability to say. Maybe ZSTD is better at handling bigger blobs? I have no idea, never actually used it.
One thing that could be nice is adding --compression-level (fast|standard|max) flag, to manipulate the writer's compression, on supported ones.
Prof-Bloodstone
@Prof-Bloodstone
I wonder if there's something in knoxite that will limit the amount of data needed to be downloaded from remote when doing pruning? One of my bigger issues with restic is that it seems to eat a lot of download bandwidth - which usually is priced by backup providers. For example rn, where I have each backup operation followed by a prune of the oldest snapshot, I have almost the same download as I have upload usage. It's just 1TB, since it's for my personal stuff - but that can get quite a bit pricey.
Christian Muehlhaeuser
@muesli
oh wow... so yeah, the prune needs to download the chunk index, which should only be a few megabytes
it then has to go and delete (potentially) a ton of chunks. but it doesn't read them to do so
last but not least it updates the list of snapshots. typically a few hundred kb. again, no read operation required here
That could be a restic bug tho, you know. I'm not sure, though, I don't know their storage design too well
just a heads up: if you're an irc user, there's also a #knoxite channel on Freenode. Whatever you prefer tho, really.
Prof-Bloodstone
@Prof-Bloodstone
Yeah, I'd imagine only the index needs to be downloaded (if it's not in sync already). But on a machine which is solely an SFTP server for one restic backed-up machine, I have 922.5GiB rx (so received from restic) and 855.9GiB tx (sent to restic) just last month.
Christian Muehlhaeuser
@muesli
wow
and apparently a memory leak in knoxite 😂 darn it!
i hope i can quickly reproduce and resolve that though
Prof-Bloodstone
@Prof-Bloodstone
I use IRC from time to time, since right after Discord, IRC has the most communities for tools I'm using. I haven't used Gitter before (had to create an account), but because I don't have an IRC logging bot set up and knoxite is fresh (not much activity here nor there), I went with Gitter so I'd see a response whenever someone has time.
Christian Muehlhaeuser
@muesli
perfectly valid reason
Prof-Bloodstone
@Prof-Bloodstone
Hey @muesli , did you manage to reproduce my OOM issue?
Prof-Bloodstone
@Prof-Bloodstone
Regarding restic's high download usage - I found out that when running prune back to back (which only deletes/rewrites unused blobs and rewrites the index), without any other changes to the repo, restic always rewrites the whole index, downloading hundreds of megabytes of data to do so each time, even if the index shouldn't change.
And I got one prune (after deleting a snapshot) to download 577MiB of data from the repository, while uploading only 304MiB.
I understand that it needs to rewrite some blobs - I wonder what knoxite does differently, seeing how you said that it should only download the chunk index and nothing more.
Christian Muehlhaeuser
@muesli
@Prof-Bloodstone wow
what knoxite does: it fetches the index, finds chunks that are no longer referenced by any snapshot, deletes those chunks and updates the index
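A minimal sketch of those prune steps in Go. The `Snapshot` type and the `unreferencedChunks` helper are hypothetical names for illustration only, not knoxite's actual types or API; the point is just that finding deletable chunks needs the index and the snapshot list, never the chunk contents.

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot is a hypothetical stand-in, not knoxite's real type.
type Snapshot struct {
	ID     string
	Chunks []string // hashes of the chunks this snapshot references
}

// unreferencedChunks returns the chunk hashes listed in the index that no
// remaining snapshot references, sorted for stable output.
func unreferencedChunks(index []string, snapshots []Snapshot) []string {
	referenced := make(map[string]bool)
	for _, s := range snapshots {
		for _, c := range s.Chunks {
			referenced[c] = true
		}
	}

	var orphans []string
	for _, c := range index {
		if !referenced[c] {
			orphans = append(orphans, c)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	// The index lists every chunk in the repository.
	index := []string{"a1", "b2", "c3", "d4"}

	// Snapshots left after the oldest one was removed.
	snapshots := []Snapshot{
		{ID: "monday", Chunks: []string{"a1", "b2"}},
		{ID: "tuesday", Chunks: []string{"b2", "d4"}},
	}

	// A real prune would issue a delete per hash and then rewrite the
	// index; here we only print what would be deleted.
	for _, hash := range unreferencedChunks(index, snapshots) {
		fmt.Println("would delete chunk", hash)
	}
}
```

Here only `c3` is orphaned, so a prune along these lines would download the index and snapshot list, send one delete, and upload the updated index.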
Yonggan
@Y0ngg4n
Hey guys. I am currently deciding between kopia and knoxite. Maybe somebody can explain to me what advantages make knoxite better than kopia. Thanks for your help :)
Prof-Bloodstone
@Prof-Bloodstone
@Y0ngg4n I think it'll come down to preferences and use case. For me personally, kopia using a global config only is not acceptable (for example I can't use it on my personal machine - because if I switch the repo to restore files from some other server, it'll then try to back up there), and neither is its huge bandwidth usage (restic is also using a lot, but they recently merged a few PRs with noticeable improvements).
I'm looking forward to using knoxite at some point, but I'm waiting for the first release because it's not stable enough for my standards.
Probably @muesli or one of the other contributors can give you a more technical answer about what's under the hood.
Yonggan
@Y0ngg4n
@Prof-Bloodstone Thank you for your detailed answer. So for a server installation you would recommend Kopia for now, because knoxite is not stable enough yet?
Prof-Bloodstone
@Prof-Bloodstone
@Y0ngg4n As I said - depends on what you want. I haven't tested kopia in depth because the high bandwidth usage and global config put me off quite quickly. With knoxite, there are some issues with lzma/zstd compression (my dirty testing: https://gist.github.com/Prof-Bloodstone/85b1d3077db7aeb0ac8bcbaa49f730bb) and there's an OOM if you back up lots of files. It appears to show up after the backup has finished, and the backup itself seems fine (maybe something handling statistics at the end is borked?). Also, knoxite doesn't have locks, so you have to limit yourself to one process at a time (see: https://github.com/knoxite/knoxite/issues/130).
Prof-Bloodstone
@Prof-Bloodstone

It seems that almost every month I look for a personal and server backup solution - and there's nothing that doesn't have its flaws. It all depends on:

  1. SFTP only or also other backup locations (S3, B2, GCS, ...)
  2. Encrypted backups?
  3. Compression?
  4. Download usage on prune?
  5. Dedup?
  6. Incremental?

How many servers do you have? For having a central backup server for other servers, I'd look into elkarbackup or backuppc (haven't used them much, but both seem fine). If you want servers to push backups - take a look at: https://github.com/restic/others

I personally got back to using restic for now, but as soon as knoxite lands its first release and I test it - I'm thinking about switching to it.

Yonggan
@Y0ngg4n
@Prof-Bloodstone Thank you for your great support. I will go ahead with kopia first until knoxite hits its first release, and will then try to move to knoxite. Thank you very much, you helped me out :)
bigdataspin
@bigdataspin
Hi, I run a quiet little data storage company (Lima Labs) and also like what's being built here. I'd like to offer free 25GB SFTP storage areas and perhaps in the future get listed as a provider. I'm not sure who to reach out to.