Erlend Langseth
@Ploppz
oh :o ok, thanks, I'll just assume that the default is fine then
also wondering what the URI is used for. Just an identifier?
or an actual URI to some remote rdedup repository?
matrixbot
@matrixbot
dpc Unless you're going to be doing terabytes of data, 2 levels should be OK.
dpc There is WIP support for remote stores, yes.
dpc I never completed it though.
dpc The infra is there, just needs a bit of integration code for each backend.
dpc I've been using rclone instead, and I have no time left to dedicate.
Erlend Langseth
@Ploppz
I see. Thanks.
Erlend Langseth
@Ploppz
damn, it takes quite a while to tar like 300GB
should I try to split it into "old things that will never be updated / archive" and the rest? Several minutes have passed and not even 1 GB is stored in the rdedup store yet.
Erlend Langseth
@Ploppz
hm, I think I will. Is it expected or not that it takes this long? (not saying it's rdedup's fault, just the use case in general)
Ethan Smith
@ethanhs
Won't tar-ing things throw off deduplication?
matrixbot
@matrixbot
dpc tarring should be streaming ...
dpc And yeah, 300GB is quite a bit of data.
dpc Check your cpu usage and io usage.
Erlend Langseth
@Ploppz
cpu usage is not high, load is quite high (7-8), idk about disk usage - about 25-30 R/s and W/s
looking at ytop
@ethanhs you think so? I don't apply compression. I just do tar -C / -cf - <files>
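For reference, a minimal sketch (in Rust, using std::process) of that pipeline run the streaming way dpc describes, i.e. tar piping straight into rdedup store with no intermediate file. The function name and file list are placeholders, and it assumes the rdedup repo location is already configured elsewhere (environment/flags):

```rust
use std::io;
use std::process::{Command, Stdio};

// Hypothetical wrapper around the pipeline discussed above:
//   tar -C / -cf - <files> | rdedup store <name>
fn tar_into_rdedup(files: &[&str], store_name: &str) -> io::Result<()> {
    let mut tar = Command::new("tar")
        .args(["-C", "/", "-cf", "-"])
        .args(files)
        .stdout(Stdio::piped()) // stream the archive instead of writing a temp file
        .spawn()?;

    let status = Command::new("rdedup")
        .args(["store", store_name])
        .stdin(tar.stdout.take().expect("tar stdout was piped"))
        .status()?;

    tar.wait()?;
    if !status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "rdedup store failed"));
    }
    Ok(())
}
```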
Ethan Smith
@ethanhs
ahh then maybe not, I'm not familiar with the data layout of a tar, I think it would depend on that
matrixbot
@matrixbot
dpc 30MB/s is about right for spinning disk
Erlend Langseth
@Ploppz
well, maybe it's because my disk is about 8-9 years old... it's been running all night and it's only at 90GB, which makes it less than 3 MB/s
matrixbot
@matrixbot
dpc That's very slow.
Erlend Langseth
@Ploppz
hm.. maybe I should try to write it directly to another disk. I only have an external hard disk with that capacity though.. not sure if that would be any faster
matrixbot
@matrixbot
dpc Reading and writing at the same time to the same spinning disk is usually super slow.
Erlend Langseth
@Ploppz
I see. But it's at least as slow from the HDD to an external hard disk :o oh well, just leaving my computer on
Stefan Junker
@steveeJ
hey, any idea how to run gc when the device is filled up 100% with the backups?
matrixbot
@matrixbot
dpc rdedup gc shouldn't need much extra space. Just a little bit.
dpc It only creates a handful of directories and moves data files.
dpc But it will not start deleting stuff until the whole operation is complete.
dpc But that's an interesting problem, I admit.
dpc What I would do ... is I would move out some chunks to another device, and put a symlink in their place.
dpc Or one whole dir
dpc That should create enough space, without the need to move everything.
matrixbot
@matrixbot
dpc I'm quite sure rdedup will just follow symlinks.
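A minimal sketch of the workaround dpc describes above, assuming a Unix filesystem; the helper name and paths are made up for illustration. The copy-then-symlink order matters because a plain rename would fail across devices; for a whole chunk directory the same dance applies, just with a recursive copy.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Hypothetical helper: move one chunk file to a spare device and leave a
/// symlink behind so rdedup still finds it at the old path.
fn offload_chunk(chunk: &Path, spare_dir: &Path) -> io::Result<()> {
    let target = spare_dir.join(chunk.file_name().expect("chunk has a file name"));
    fs::copy(chunk, &target)?; // copy across devices (rename would fail with EXDEV)
    fs::remove_file(chunk)?;   // free space on the full device
    symlink(&target, chunk)?;  // leave a symlink at the original location
    Ok(())
}
```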
Jenda Kolena
@jendakol
Hi @dpc, I wanted to try to implement a PoC of a remote backend storage for rdedup, but as I've found, this is currently not possible, as one can only pass a Url into Repo::init, and that will fail for any scheme other than file:// or b2://. In other words, it's not possible to give it your own Backend implementation.
Is there some way to work around it other than compiling my own version of rdedup? :-D
Thx.
matrixbot
@matrixbot
dpc I'm not sure what you are asking about ...
dpc If you're trying to "implement" new remote backup storage then ... you have to add the new code and compile rdedup with it.
dpc The whole point of the URL is that you can add a new scheme, let's say mybackupprotocol://somehost/somedir
dpc And then you would have new code detecting mybackupprotocol and instantiating a new implementation of the necessary interface, so that rdedup sends you only the low-level reads and writes, and you deal with those.
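To illustrate the shape of what dpc describes, here is a hypothetical sketch, not rdedup's actual trait: the RemoteBackend trait, backend_for, and mybackupprotocol names are all made up, only the idea (raw reads/writes keyed by repo-relative paths, plus scheme dispatch) comes from the conversation.

```rust
use std::io;
use std::path::Path;

// Hypothetical interface: the repo hands the backend raw reads and writes
// keyed by repository-relative paths; the backend decides where bytes live.
trait RemoteBackend {
    fn write(&self, path: &Path, data: &[u8]) -> io::Result<()>;
    fn read(&self, path: &Path) -> io::Result<Vec<u8>>;
}

// Hypothetical scheme dispatch, in the spirit of dpc's example:
// "mybackupprotocol://somehost/somedir" would map to a custom implementation,
// anything else falls through to the built-in backends.
fn backend_for(url: &str) -> Option<Box<dyn RemoteBackend>> {
    if url.starts_with("mybackupprotocol://") {
        // ... instantiate and return the custom backend here
    }
    None
}
```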
Jenda Kolena
@jendakol
Yeah, but I'd like to implement my own backend without touching rdedup, only having it as a dependency. I've been digging into the problem since yesterday, so now I know it won't be that easy... It's nice how the backend abstraction is done (those low-level reads, writes, etc.) but the necessity to adjust rdedup itself is... unfortunate :-)
I'll make a PoC and then try to think through a possible refactoring of rdedup's internals O:-) I'll let you know.
Jenda Kolena
@jendakol

Anyway, a related question for you @dpc...
In the write method in the backend, I receive SGData and PathBuf. I expected the last part of the path to be the hash of the provided data. However, it seems that's not the case...

[2020-11-07T16:59:23Z DEBUG rdedup_lib::aio::remote] remote write: path="0000000000000000-e8b69f34102f2432/chunk/b2/65/b265ac0fabfa6aee1f4bf7685db5056cc0fab322edc5aa507a663454189dce96" hash=3cbfaa5146ff36d0fdcef2cf4d4686ecec70ca126033bd1530bf44acd7e38368 len=209744B idem=true

What did I miss? (I'm using your calculate_digest function to calculate the SHA256 of the SGData)
My goal was to transfer both the path and the data over the network and use the hash extracted from the path to verify that the data was transferred successfully.

matrixbot
@matrixbot
dpc It might be a hash of concatenated hashes of chunks, possibly recursive
Jenda Kolena
@jendakol
Or maybe a hash of the raw data, while the data in SGData is already encrypted?
It's important for me because otherwise I'd need to calculate the hash myself, which requires either one unnecessary read of the data or some magic hash calculation while streaming the data :-D
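A minimal sketch of that "hash while streaming" idea, assuming the sha2 crate; forward_and_hash and the 64 KiB buffer are made up for the example. The point is that hashing the bytes as they are forwarded avoids a second pass over the data, and the resulting digest can then be compared against the hash embedded in the chunk path on the receiving side.

```rust
use sha2::{Digest, Sha256};
use std::io::{self, Read, Write};

/// Copy `src` to `dst`, hashing the bytes as they stream through, so the
/// digest comes out of the same single pass as the transfer itself.
fn forward_and_hash<R: Read, W: Write>(mut src: R, mut dst: W) -> io::Result<Vec<u8>> {
    let mut hasher = Sha256::new();
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.update(&buf[..n]);  // feed the hash incrementally
        dst.write_all(&buf[..n])?; // and forward the same bytes downstream
    }
    Ok(hasher.finalize().to_vec())
}
```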
matrixbot
@matrixbot
dpc The hash is, by necessity, of the unencrypted data. Please check out https://github.com/dpc/rdedup/wiki/Rust's-fearless-concurrency-in-rdedup and then ...
Jenda Kolena
@jendakol
Oh, sorry. It was my fault, I accidentally used a hash other than SHA256 for the chunks and then compared it to the SHA256. Unfortunately both hashes are 64 hex characters long, so it didn't catch my eye... :-) Thanks though.
Vladyslav
@N-006_gitlab
Hi!
Not sure, maybe it's a problem on my side :/
I get the following panic when running rdedup store: https://pastebin.com/raw/kUUA4FeY
Vladyslav
@N-006_gitlab
@dpc related: contain-rs/linked-hash-map#100
Could you please tag a new version with the commit that fixes this: dpc/rdedup@5b4b5fd ?
dpc
@dpc:matrix.org [m]
Please create GitHub issues, so I won't forget.