Louis
@louis:laureys.me
[m]
Ah, dns round robin. Doesn't that fail if one of them goes down?
But, port 80 on those ips isn't used by webmail at least then 😁
Andris Reinman
@andris9
well, yeah each of these servers is for a single purpose. eg an imap server does not serve pop3 etc. the new acme thing is still too experimental, so it is not actually used, but in the end each such server is probably going to have an acme handler running on port 80 indeed
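(For context: DNS round robin just means publishing several A records for one name, so resolvers rotate through the addresses. A minimal sketch of such a zone entry, with a placeholder hostname and documentation addresses; as Louis notes, a dead host keeps receiving its share of connections until its record is pulled, so this is load spreading rather than high availability:)

```
; round robin: one name, several A records, rotated by resolvers
imap.example.com.  300  IN  A  203.0.113.10
imap.example.com.  300  IN  A  203.0.113.11
imap.example.com.  300  IN  A  203.0.113.12
```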
Louis
@louis:laureys.me
[m]
Makes sense :)
Though I want to be sure of my understanding, that way of load balancing won't give high availability right? Just horizontal load scaling.
Andris Reinman
@andris9
yes, you're correct. there are plans for HA but these are not in any way urgent
Louis
@louis:laureys.me
[m]
Great :)
Andris Reinman
@andris9
so far all major issues have been with the DB and not, for example, an imap server burning down or anything, so making application servers more bulletproof has not been a priority
Louis
@louis:laureys.me
[m]
Interesting, anything that would cause the db servers to be more volatile?
Andris Reinman
@andris9
well, there's about 50TB of actual data (~70TB+ virtual data) to manage and there aren't any tutorials to follow to do it "properly". though so far most major issues have been caused by external things like exploding network switches etc
Louis
@louis:laureys.me
[m]
Ah, yeah the 50TB of data doesn't help hahaha
Andris Reinman
@andris9
if different db replica servers do not see each other anymore due to some fault in the network then strange things start to happen
Louis
@louis:laureys.me
[m]
Can't easily just spin up a new one
Ahh, yeah they go read only if they can't reach a quorum right?
Andris Reinman
@andris9
i'm not sure but there's at least 6 replica shards which means 6*3=18 physical servers. in addition there are mongodb mongos servers and a configuration shard (also 3 servers)
if you have a 3 member replica set then at least 2 of these must be able to communicate with each other, otherwise there would be no primary instance anymore (even if there's nothing wrong with the current primary, it steps down automatically once it does not have enough votes)
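(The voting rule Andris describes is a simple strict majority. A tiny illustrative sketch, not WildDuck code:)

```typescript
// A replica set member can only be (or remain) primary while it can
// see a strict majority of the set's voting members.
function majorityNeeded(votingMembers: number): number {
  return Math.floor(votingMembers / 2) + 1;
}

majorityNeeded(3); // 2 -> an isolated member of a 3-member set steps down,
                   // even if that member itself is perfectly healthy
```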
Louis
@louis:laureys.me
[m]
Yeah, that sounds hard to manage. Are you doing 3 data bearing nodes or 2 + 1 arbiter?
Seems more cost effective, but with additional risk
Andris Reinman
@andris9
3 data nodes. 2 in one DC and 1 in a 2nd DC. but if the DC with 2 nodes goes offline then the remaining 1 can't process anything anymore as it does not have enough votes. and the system makes so many write operations that using only a replica member is not possible. at first I tried to design the system in a way where emails would still be readable even if there is no primary member anymore, but each read causes several writes (eg. marking unseen emails as seen etc), so it requires an actual primary to be available
there are also separate disks for different kinds of data. so messages and user information are stored in a db that is on a fast SSD. attachments (but not attachment indexes) are stored on a very large but slow HDD. so if that HDD becomes inaccessible then most of the system still works, you can log in and read emails etc but you can not download any attachments
so when using IMAP you can only download messages without attachments. requests against messages that have at least one attachment fail. in webmail attachments are loaded later, so each message can be read (but attachment requests will fail, so no images etc)
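(WildDuck supports this split in configuration: the GridFS attachment store can point at a different database than the main metadata database, and that database's files can then live on a separate disk, e.g. via mongod's storage.directoryPerDB option and a dedicated mount. A hedged sketch; the key names follow my reading of wildduck's config/default.toml, so verify against the repo, and the connection string and database names here are placeholders:)

```toml
[dbs]
# users, mailboxes and message metadata: fast SSD-backed database
mongo = "mongodb://mongos.internal:27017/wildduck"
# attachment store (GridFS) in its own database, placed on a large slow HDD;
# if that disk dies, mail stays readable but attachment downloads fail
gridfs = "attachments"
```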
Louis
@louis:laureys.me
[m]
I like how you separated those databases :)
Makes a lot of sense
Andris Reinman
@andris9
yeah, it is much cheaper that way. SSD is quite pricey, and you don't even need to access attachments most of the time
Louis
@louis:laureys.me
[m]
And attachments use the most data as well
People love sending large files over email hahaha
My relatives always complain about the 25mb limit that's basically everywhere, but I know that I don't want a higher limit
Daviesmolly
@Daviesmolly
How do I integrate a simple text-based captcha into wildduck-webmail?
Andris Reinman
@andris9
@Daviesmolly wildduck webmail supports reCaptcha but it is disabled by default, https://github.com/nodemailer/wildduck-webmail/blob/3371984a32a7942d7859c3fcde923cf62484e7fa/config/default.toml#L48-L51
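(A hedged sketch of what enabling that section might look like; the linked default.toml is authoritative for the exact key names, and the values below are placeholders:)

```toml
[recaptcha]
enabled = true
siteKey = "your-recaptcha-site-key"
secretKey = "your-recaptcha-secret-key"
```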
Tiny product news. WildDuck Auditing System now generates verification hashes for email downloads (each download is logged and you can later download a verification hash for the downloaded file to verify whether the downloaded file has been changed or not)
Screenshot 2021-07-03 at 11.49.01.png
Screenshot 2021-07-03 at 11.49.22.png
Louis
@louis:laureys.me
[m]
Cool! What's the exact use case for this?
Andris Reinman
@andris9
Once an email is downloaded as evidence it must be possible to later validate that the email has not been tampered with and is the same as what was on the server
not every download is actually signed. Instead the download hash is logged, and once you request the verification hash the file is put together and signed with the server key
Louis
@louis:laureys.me
[m]
Ah, that's pretty cool
Andris Reinman
@andris9
Btw this does not hash the actual emails but the container, eg the downloaded zip file. Every time you download emails, be it a single email file or a zipped selection, that action is logged and you can later go and download a signed verification hash for that download. I would prefer to somehow include the hash with the initial download, but the zip files are streamed (they can be very large) and there is no way to know the hash before the file has actually been downloaded
Audit system does not show email contents, only metadata (including subject and to/from addresses). To actually see the email you have to download it and that action is logged.
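(The streaming constraint Andris mentions is easy to see in code. A hedged sketch, not WildDuck's actual implementation: the digest of a streamed zip only exists after the last byte has gone out, so it can be logged for later signed verification but not attached to the download itself. Names and the audit step are hypothetical:)

```typescript
import { createHash } from "crypto";
import { Readable } from "stream";

// Hash a download while it streams to the client, then log the digest.
async function streamWithAuditHash(zip: Readable, out: NodeJS.WritableStream) {
  const hash = createHash("sha256");
  for await (const chunk of zip) {
    hash.update(chunk); // hash the bytes as they pass through
    out.write(chunk);   // the client is already receiving data
  }
  out.end();
  const digest = hash.digest("hex");
  // hypothetical audit step: store the digest so it can later be signed
  // with the server key when someone requests verification
  console.log("download hash:", digest);
}
```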
venusian
@venusian:matrix.org
[m]
Hi all, I'm doing research for a medium to large installation and was wondering what scale of installations WildDuck is used for at the moment. it sounds like it should scale really well but I have not found any references so far
Andris Reinman
@andris9
@venusian:matrix.org WildDuck is mainly developed for a single specific email system. That system currently stores about 70TB of emails (that's virtual size, actual db size with deduplication is 47TB) and has 100k+ registered accounts. I'm not 100% sure but I guess that there are about 10k-20k logged in IMAP users in peak hours. There are 7 mongodb shards. New shards are added whenever free space runs out.
Louis
@louis:laureys.me
[m]
Have you ever run into RAM or CPU limitations before space ran out on a shard? Or is that generally not a problem?
Andris Reinman
@andris9
CPU is usually not an issue. The real problem is memory size, as MongoDB needs to keep indexes in memory
if there is not enough memory then Mongo loads only "hot" indexes into memory and keeps everything else on disk, which makes irregular operations (eg. search) quite slow
this is also the main thing that limits shard size - too much data on a single shard means that there is no way that indexes fit into memory
another limit is backups - regularly backing up a lot of TBs is a real pain
so if the shard is smaller then it is also easier to back it up as there is less data
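(One way to gauge whether a shard's indexes still fit in RAM is to sum totalIndexSize across its collections. A hedged sketch using the official Node.js MongoDB driver; the connection string and database name are placeholders:)

```typescript
import { MongoClient } from "mongodb";

// Sum index sizes on one shard to see how much RAM the indexes want.
async function indexFootprint(uri: string, dbName: string): Promise<void> {
  const client = await MongoClient.connect(uri);
  try {
    const db = client.db(dbName);
    let totalBytes = 0;
    for (const coll of await db.listCollections().toArray()) {
      const stats = await db.command({ collStats: coll.name });
      totalBytes += stats.totalIndexSize;
    }
    // if this exceeds available RAM, cold indexes spill to disk and
    // irregular operations like search get slow
    console.log(`total index size: ${(totalBytes / 1024 ** 3).toFixed(1)} GiB`);
  } finally {
    await client.close();
  }
}

indexFootprint("mongodb://shard1.internal:27017", "wildduck").catch(console.error);
```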
Venusian
@venusian:matrix.org
[m]
We're looking at a small multiple of those stats and are interested in alternatives to the known dovecot setups, however at this scale everything new is scary :)
Backing up would need to be 'smart' and not simply back up the mongodb files but back up contents based on actual changes, I think, as copying everything every time would murder any viable setup I can think of
Louis
@louis:laureys.me
[m]
Afaik the only viable backup method without mongodb enterprise is filesystem snapshots:
https://docs.mongodb.com/manual/tutorial/backup-sharded-cluster-with-filesystem-snapshots/
Andris Reinman
@andris9
we use PerconaDB where you can create db snapshots. so on each shard there is one replica set member with an extra 10TB disk. once a day we run the command to create a snapshot on that disk.
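(Percona Server for MongoDB exposes this as the createBackup admin command, which is not available in stock MongoDB. A hedged sketch of that daily step; the host and path are placeholders:)

```typescript
import { MongoClient } from "mongodb";

// Trigger Percona's hot backup on the replica set member that carries
// the extra 10TB backup disk.
async function nightlyBackup(): Promise<void> {
  const client = await MongoClient.connect("mongodb://backup-member.internal:27017");
  try {
    const result = await client.db("admin").command({
      createBackup: 1, // Percona-specific hot backup command
      backupDir: "/mnt/backup/wildduck-" + new Date().toISOString().slice(0, 10),
    });
    console.log(result); // { ok: 1 } on success
  } finally {
    await client.close();
  }
}

nightlyBackup().catch(console.error);
```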