matrixbot
@matrixbot
Ralith okay, because what you said was "the first tool to use cdc" :p
Dawid Ciężarkiewicz
@dpc
Oh, right. I meant "fastcdc". :)
Jenda Kolena
@jendakol
Did some research, didn't find any existing implementation of such an algorithm in Java/Scala. It would require a lot of work, and for my use case the benefits would be quite debatable. So... maybe in the future ;-)
Thanks for suggestions though! I've learnt some new things and that's always good.
mbl
@maobaolong
Hi all, is there a way to use fastcdc chunking in java?
matrixbot
@matrixbot
dpc I'd just reimplement
dpc It's a small piece of code
mbl
@maobaolong
Is it possible to do chunking and dedupe from oss?
matrixbot
@matrixbot
dpc oss?
dpc CDC in essence is like one line of code
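That "one line" is the cut condition of content-defined chunking. A toy sketch of the idea (not rdedup's actual code; fastcdc uses a precomputed "gear" table rather than this naive shift-and-add):

```rust
// Toy content-defined chunking: cut wherever a rolling value over the
// recent bytes hits a bit mask. The mask controls the average chunk
// size (0xF here means a cut roughly every 16 bytes).
fn chunk_boundaries(data: &[u8], mask: u32) -> Vec<usize> {
    let mut hash: u32 = 0;
    let mut cuts = Vec::new();
    for (i, &b) in data.iter().enumerate() {
        // naive rolling update: shift the state and mix in the new byte
        hash = (hash << 1).wrapping_add(b as u32);
        // the "one line": a chunk ends when the low bits are all zero
        if hash & mask == 0 {
            cuts.push(i + 1);
            hash = 0;
        }
    }
    // close the final partial chunk, if any
    if cuts.last() != Some(&data.len()) {
        cuts.push(data.len());
    }
    cuts
}

fn main() {
    let data = b"hello world, hello dedup, hello world again";
    println!("{:?}", chunk_boundaries(data, 0xF));
}
```

Because cut points depend only on local content, inserting bytes early in a stream only shifts boundaries near the edit, which is what makes dedup between versions work.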
Jenda Kolena
@jendakol
@maobaolong I was looking for it, didn't find any (working). However, it shouldn't be that hard to implement it. Once I have enough free time... :smile:
mbl
@maobaolong
@jendakol I have found a way to use it: use JNI or JNR. And I have already implemented it in Java. How about open-sourcing it on GitHub? :smile:
I've found a problem: fastcdc does not appear in the GitHub repository of rdedup, but it appears in crates.io 0.1.0. Why?
matrixbot
@matrixbot
dpc There used to be a group of people that wanted to have a common CDC crate, but everybody lost interest, and I had to release my own branch/fork or something.
dpc https://crates.io/crates/rdedup-cdc , if you click repository it points to https://github.com/aidanhs/rsroll
mbl
@maobaolong
Well, where is the true repository that contains fastCDC?
Jenda Kolena
@jendakol
@maobaolong Sure, if you implemented it, put it on GitHub :-) JNI is an option of course, I was talking about "native" Java/Scala implementation.
mbl
@maobaolong
@jendakol https://github.com/maobaolong/jfastcdc is a java version implementation
Jenda Kolena
@jendakol
@maobaolong Nice, thanks. I've already sent you a question through issues, since it doesn't really belong here :-)
Erlend Langseth
@Ploppz
How can I use rdedup? I don't understand its usage from the readme. I can't install rdup (compiler errors) so I'll have to do without it. In my understanding it just prints some paths? In the rdedup readme there's an example rdup -x /dev/null "$HOME" | rdedup store home.. does this just store a list of paths?
just for testing I did echo hey | rdedup store home, and it seems like loading this just returns "hey". Now... how can I use rdedup to backup file hierarchies? And how do I do it incrementally? Running rdedup store home again returns an error that home already exists.
Erlend Langseth
@Ploppz
I guess I misunderstand what rdup does? Probably also gives the contents of files?
Erlend Langseth
@Ploppz
Looked a bit more at the docs/example in rdedup, and I think I understand the general workings. You would pipe all data you want to back up into rdedup store, with a new (probably dated) name, and then rdedup will try to store that with as little data duplication as possible compared with other stored things? Is that right? So I was contemplating using tar to create an archive of everything I want to back up and feeding it to rdedup. But would that perhaps compromise the performance compared to rdup? Or not? (thinking that it might obscure some otherwise duplicated data)
matrixbot
@matrixbot

dpc > <@gitter_ploppz:matrix.org> Looked a bit more at the docs/example in rdedup, and I think I understand the general workings. […]

tar should work OK, I think.

Erlend Langseth
@Ploppz
Ok thanks. Does rdedup do any compression? If so it's a bit silly to use tar
Maybe I could implement dpc/rdedup#5 . But then I need to know a bit more about the requirements. The idea is to have an algorithm to traverse file hierarchy to generate a text that fully describes it, and an inverse operation?
@dpc
matrixbot
@matrixbot
dpc Yes, it does compression of deduplicated chunks.
dpc You must not use compression in tar, otherwise deduplication wouldn't work.
dpc And yes, traverser is about doing something like tar or rdup do.
dpc But if you designed it well, it would work better with deduplication in rdedup
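A minimal sketch of what such a traverser might emit (a hypothetical format, not rdup's nor the one proposed in dpc/rdedup#5): a sorted, deterministic text manifest of a tree, so that the byte stream fed to rdedup is stable across runs.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Walk a directory tree and append one line per entry to `out`.
// Sorting entries by name keeps the output deterministic, which is
// what makes deduplication between successive backups effective.
fn manifest(dir: &Path, out: &mut String) -> std::io::Result<()> {
    let mut entries: Vec<_> = fs::read_dir(dir)?.collect::<Result<_, _>>()?;
    entries.sort_by_key(|e| e.file_name());
    for e in entries {
        let meta = e.metadata()?;
        if meta.is_dir() {
            out.push_str(&format!("d {}\n", e.path().display()));
            manifest(&e.path(), out)?; // recurse into subdirectories
        } else {
            out.push_str(&format!("f {} {}\n", meta.len(), e.path().display()));
        }
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Build a tiny demo tree under the system temp directory.
    let root = std::env::temp_dir().join("traverser-demo");
    fs::create_dir_all(root.join("sub"))?;
    fs::File::create(root.join("a.txt"))?.write_all(b"hello")?;
    let mut m = String::new();
    manifest(&root, &mut m)?;
    print!("{}", m);
    Ok(())
}
```

A real traverser would also record permissions, mtimes, and file contents (the inverse operation restores the tree from the stream), but the ordering guarantee above is the part that interacts with dedup.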
Erlend Langseth
@Ploppz
Ok I see. I will use tar without compression for now then. I don't know how to make the traverser any better
matrixbot
@matrixbot
dpc Improving it is mostly about organizing it in a way that would minimize the number of data chunks affected when some minor things change.
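A toy illustration of why minimizing affected chunks matters (fixed-size chunks and a non-cryptographic hash here, unlike rdedup's real CDC and secure digests): appending data only introduces new chunks at the end, so everything before the edit dedups against the previous backup.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hash fixed-size chunks of a buffer. A stand-in for rdedup's
// content-defined chunks + cryptographic digests, just to count overlap.
fn chunk_digests(data: &[u8], chunk: usize) -> HashSet<u64> {
    data.chunks(chunk)
        .map(|c| {
            let mut h = DefaultHasher::new();
            c.hash(&mut h);
            h.finish()
        })
        .collect()
}

fn main() {
    // 4096 bytes of varied data, then a second version with bytes appended.
    let v1: Vec<u8> = (0..4096).map(|i| (i % 251) as u8).collect();
    let mut v2 = v1.clone();
    v2.extend_from_slice(b"new photo appended at the end");

    let d1 = chunk_digests(&v1, 256);
    let d2 = chunk_digests(&v2, 256);
    // Appending only adds one new tail chunk: all 16 old chunks dedup.
    println!("{} of {} chunks shared", d1.intersection(&d2).count(), d1.len());
}
```

An edit in the *middle* of a stream would shift every later fixed-size chunk, which is exactly the failure mode CDC avoids and the reason the on-disk layout of a backup stream matters.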
Erlend Langseth
@Ploppz
Hm I see
Erlend Langseth
@Ploppz
say I want to backup several folders. Should I create one rdedup store for each folder or doesn't it matter that much (thinking about performance I suppose)
the docs build fails https://docs.rs/crate/rdedup-lib/3.1.0/builds/139972 — "libsodium-sys v0.1.0: process didn't exit successfully"
Erlend Langseth
@Ploppz
the library aspect of rdedup is interesting; I'm tempted to create an application based on rdedup, that is more high-level about backups
Erlend Langseth
@Ploppz
compression is on by default? I stored some images once in an rdedup store, but rdedup du reports that they take as much space on disk in the rdedup as the originals
the pictures are 3.5 GB. Then I stored the exact same folder again (using tar), and this time it takes 2.2 GB. That's rather a lot considering it's the exact same directory? Is tar bad then? I did it with tar -cf - /path/to/dir
matrixbot
@matrixbot
dpc Dump everything in one tar.
dpc rdedup du reports original size of the data you stored
dpc Deduplication will happen between multiple backups of the same stuff (with minor changes).
Erlend Langseth
@Ploppz
Ah I was wrong yes, it doesn't actually take that much space on disk. Besides I wonder whether I misinterpreted 0.22GB for 2.2GB
Erlend Langseth
@Ploppz
I thought "disk usage" of du referred to, well, literal disk usage
Erlend Langseth
@Ploppz

Dump everything in one tar.

@dpc You mean, all folders I want to back up, I should put in one tar? Hm... It's just that I have like 300GB of stuff already that I want to backup - basically my whole life. I keep updating it with e.g. pictures from phone. And then I was thinking about having some "recent and relevant" store that is a bit smaller and keeps getting updated. This is turning into general backup advice :P

I was reading your "original usecase" text, about how you sync it across several devices, for redundancy. I wanted to do that, but idk if they all need everything shrug
Erlend Langseth
@Ploppz
--nesting <N> Set level of folder nesting [default: 2]
what does this mean?
matrixbot
@matrixbot
dpc Internally chunks are stored under ./firstbyte/secondbyte/restbytes path format.
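That layout can be sketched like this (a hypothetical helper, assuming the chunk name is a hex-encoded digest; the default --nesting of 2 peels off the first two bytes as directory names):

```rust
// Build the storage path for a chunk from its hex digest.
// With nesting 2, "deadbeef0123" lands in ./de/ad/beef0123.
// Assumes the digest has at least `nesting` bytes (2*nesting hex chars).
fn chunk_path(hex_digest: &str, nesting: usize) -> String {
    let mut path = String::from(".");
    let mut rest = hex_digest;
    for _ in 0..nesting {
        let (byte, tail) = rest.split_at(2); // one byte = two hex chars
        path.push('/');
        path.push_str(byte);
        rest = tail;
    }
    path.push('/');
    path.push_str(rest);
    path
}

fn main() {
    println!("{}", chunk_path("deadbeef0123", 2)); // ./de/ad/beef0123
}
```

The point of the nesting is just to fan chunks out across subdirectories so no single directory accumulates millions of files, which many filesystems handle poorly.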
Erlend Langseth
@Ploppz
oh :o ok thanks I will just assume that the default is fine then
also wondering what URI is used for. Just an identifier?