    alidhamieh
    @alidhamieh
    That is the base image: mongo:3.6.15-xenial
    Adrian Reber
    @adrian:lisas.de
    @alidhamieh: did you try to use --file-locks
    It is an option to podman container checkpoint
    alidhamieh
    @alidhamieh
    sudo podman container checkpoint swac3-server_receiving-1 --export=/tmp/swac3-server_receiving-1.tar.gz --file-locks --tcp-established
    Error: unknown flag: --file-locks
    Adrian Reber
    @adrian:lisas.de
    ah, then you probably need a newer version of podman or try to drop file-locks into the criu configuration file
    alidhamieh
    @alidhamieh
    how to drop file-locks in the config
    Adrian Reber
    @adrian:lisas.de
    are you using runc or crun?
    alidhamieh
    @alidhamieh
    runc
    Adrian Reber
    @adrian:lisas.de
    echo "file-locks" >> /etc/criun/runc.conf
    alidhamieh
    @alidhamieh
    Sorry, how do I know if I am using runc or crun?
    seems runc
    it worked without the n in criun
    Adrian Reber
    @adrian:lisas.de
    yeah, that was a typo
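For later readers, the corrected step can be sketched as a small shell helper. This is a minimal sketch, not a command from the chat: `add_criu_option` is a hypothetical name, and the path /etc/criu/runc.conf is the one used above with the typo removed. It appends the option only if the file does not already contain it, so running it twice is harmless.

```shell
# add_criu_option is a hypothetical helper, not part of CRIU or Podman.
# It appends one option line to a CRIU configuration file unless it is already there.
add_criu_option() {
    conf_file=$1
    option=$2
    # Create the directory and the file if they do not exist yet.
    mkdir -p "$(dirname "$conf_file")"
    touch "$conf_file"
    # -x matches the whole line, -F treats the option as a fixed string.
    grep -qxF -- "$option" "$conf_file" || printf '%s\n' "$option" >> "$conf_file"
}

# Usage (as root), with the path from the conversation:
#   add_criu_option /etc/criu/runc.conf file-locks
```

On the runc-or-crun question, the output of `podman info` includes the OCI runtime that Podman is configured to use.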
    Do Hoang
    @huyhoang8398
    Can I ask where the block of code is in which CRIU uses pages.img for restoration?
    Pavel Tikhomirov
    @Snorch

    Sure you can =)

    That's how you normally find where image is used in CRIU code:

    1) First look for an image name "pages-<id>.img":

    [# criu]$ grep -r "pages-" criu
    criu/image-desc.c:      FD_ENTRY_F(PAGES,       "pages-%u", O_NOBUF),
    criu/image-desc.c:      FD_ENTRY_F(PAGES_OLD,   "pages-%d", O_NOBUF),
    criu/image-desc.c:      FD_ENTRY_F(SHM_PAGES_OLD, "pages-shmem-%ld", O_NOBUF),

    2) See what FD_ENTRY_F is:

    [# criu]$ git grep -A1 "#define FD_ENTRY_F"
    criu/image-desc.c:#define FD_ENTRY_F(_name, _fmt, _f)     \
    criu/image-desc.c-      [CR_FD_##_name] = {

    3) Look for CR_FD_PAGES open:

    [# criu]$ git grep open.*CR_FD_PAGES
    criu/image.c:   return open_image_at(dfd, CR_FD_PAGES, flags, *id);
    criu/mem.c:     pages = open_image(CR_FD_PAGES, opts.auto_dedup ? O_RDWR : O_RSTR, rsti(t)->pages_img_id);

    One is in open_pages_image_at() and the other is in prepare_vma_ios().

    4) The Vim CCTree plugin shows that prepare_vma_ios() is on the restore path:

      +-< prepare_vma_ios
        +-< prepare_vmas
        | +-< restore_one_alive_task
        | | +-< restore_one_task

    5) open_pages_image_at() is used on both dump and restore
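
Steps 1 to 3 above can be sketched as one shell function. `find_image_opens` is a hypothetical helper, not part of CRIU. It assumes image names are declared with FD_ENTRY or FD_ENTRY_F macros, as in the grep output above, and it uses plain grep instead of git grep so that it also works outside a git checkout.

```shell
# find_image_opens is a hypothetical helper, not part of CRIU.
# $1: source directory to search (for example criu)
# $2: start of the image file name (for example pages-)
find_image_opens() {
    src_dir=$1
    pattern=$2
    # Step 1: find the macro lines that declare an image name starting with the pattern.
    grep -rh -- "\"$pattern" "$src_dir" |
        # Step 2: the first macro argument NAME expands to the index CR_FD_NAME.
        sed -n 's/.*FD_ENTRY[_F]*(\([A-Z_]*\),.*/\1/p' |
        sort -u |
        while read -r name; do
            # Step 3: list the places that open exactly CR_FD_<name>
            # (the trailing group keeps CR_FD_PAGES from matching CR_FD_PAGES_OLD).
            grep -rnE -- "open.*CR_FD_${name}([^A-Z_]|\$)" "$src_dir"
        done
}

# Usage, from the top of a CRIU source tree:
#   find_image_opens criu pages-
```

Finding out whether each call site sits on the dump or the restore path (steps 4 and 5) still needs a call-tree tool or manual reading.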

    Pavel Tikhomirov
    @Snorch
    And if we add some magic from here https://github.com/Snorch/call_tree_builder we get a nice picture:
    pages-image.png
    Do Hoang
    @huyhoang8398
    Holy. Thanks a lot
    Do Hoang
    @huyhoang8398
    Is it possible to make a fake pages.img from a kernel module? :| If so, can you guys recommend a way to do it?
    Radostin Stoyanov
    @rst0git
    What do you mean by "fake pages.img"?
    Do Hoang
    @huyhoang8398
    I have to tweak CRIU for my own purposes, so I need to dump pages using a kernel module
    Radostin Stoyanov
    @rst0git
    I'm not sure I understand your question. CRIU uses the parasite code to dump memory pages: https://criu.org/Parasite_code
    Timo
    @TVH7

    Hi there,

    First of all, CRIU looks like a very cool project. Using the out-of-the-box "criu" CLI was super easy and it appears to do what I want: checkpoint a TCP connection and restore it somewhere else. For now I have only tried this by restoring a Docker container, following the guide on criu.org. My goal, however, is to move this functionality into an application running in a Kubernetes pod, which can then be used to "transfer" a TCP connection from one Kubernetes pod to another whenever a pod gets rebalanced.

    I learned that libsoccr has been built to do exactly this, so I'm trying to build a little C application that simulates a TCP client/server connection and uses libsoccr to checkpoint/restore the TCP connection of the client. The last time I used C was years ago, so I am really struggling to get the libsoccr.a library linked into my "demo-app".

    I tried the following:

    1. Built the criu repo.
    2. Moved libsoccr.a from the soccr directory to a "lib" folder inside my project.
    3. Copied the include directory from criu to my project.
    4. Ran GCC to link the lib: gcc doSocket.c -lsoccr -o doSocket.o -I include -L lib

    However it is failing trying to find libnet_init.

    Now I am guessing that I am linking the project incorrectly (as I probably also have to link libnet and other dependencies) but I am afraid I have to admit that I'm not really sure how to go from here as my C skills are lacking here.

    Can anyone point me in the right direction? Is there maybe a demo application on GitHub somewhere that demonstrates how to use libsoccr as a standalone library?

    Help would be really appreciated, I hope that I am not annoying you with my beginner-questions.
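
A possible way forward for the link error above, as a hedged sketch: libsoccr calls into libnet (that is where libnet_init comes from), so the link line most likely also needs -lnet, placed after -lsoccr, and the libnet development package has to be installed. `soccr_link_cmd` is a hypothetical helper that only prints such a command, and the include/ and lib/ directories are the project folders described in the message.

```shell
# soccr_link_cmd is a hypothetical helper: it prints (does not run) a gcc command
# for a program that uses the static libsoccr.a.
# $1: C source file, $2: name of the executable to produce
soccr_link_cmd() {
    src=$1
    out=$2
    # Libraries come after the source file, and -lsoccr comes before -lnet,
    # so the linker can resolve the libnet_* symbols that libsoccr references.
    printf 'gcc %s -I include -L lib -lsoccr -lnet -o %s\n' "$src" "$out"
}

soccr_link_cmd doSocket.c doSocket
```

The result is named doSocket rather than doSocket.o because the command produces an executable, not an object file.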

    Do Hoang
    @huyhoang8398
    Is it possible to translate a virtual address and number of pages of an application to a PFN or struct page in a kernel module?
    Shreyas Kharbanda
    @Alphacode18

    Hi all,

    I have been trying to set up CRIU v3.13 inside a privileged Docker container (a dockerized CRIU image is bundled as part of the Dockerfile on an Alpine Linux base) to checkpoint a certain process. The containers run inside a Kubernetes cluster on a fresh install of Ubuntu 20.04 LTS.

    I have done some sanity checks, namely running the criu check command from within the container. Apart from criu check, I have checked for various privilege accesses and all come back positive.

    The criu check command returns the following:

    Error (criu/util.c:610): exited, status=1
    Error (criu/util.c:610): exited, status=1
    Warn (criu/kerndat.c:839): Can't keep kdat cache on non-tempfs
    Looks good.

    I am currently trying to get the simple loop example (https://criu.org/Simple_loop) working. After following the instructions, I see that dumping the process fails with error code -1.

    pie: 52: Warn (criu/pie/parasite.c:648): /proc/self/cgroup was bigger than the page size
    pie: 52: __sent ack msg: 76 76 -1
    pie: 52: Close the control socket for writing
    (00.037284) Fetched ack: 76 76 -1
    pie: 52: Daemon waits for command
    (00.037294) Error (compel/src/lib/infect-rpc.c:72): Command 76 for daemon failed with -1
    (00.037305) Error (criu/parasite-syscall.c:447): Parasite failed to dump /proc/self/cgroup

    I haven't had much success in finding a solution to the problem in the GitHub issues, and I'd really appreciate it if you could point me in the right direction.

    Pavel Tikhomirov
    @Snorch
    That likely happens because you have too many nested cgroup directories for the dumpee process. If you show /proc/<dumpee_pid>/cgroup, that can be confirmed.
    One way of fixing/working around it is to use criu on the host =)
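A rough way to check this from a shell, as a sketch only: compare the size of the cgroup file with the system page size. `cgroup_fits_in_page` is a hypothetical helper, and the strict less-than comparison is an assumption about where CRIU draws the line.

```shell
# cgroup_fits_in_page is a hypothetical helper, not part of CRIU.
# $1: file to check, for example /proc/<dumpee_pid>/cgroup
# $2: page size in bytes (optional, defaults to the system page size)
# Returns success if the file is smaller than one page.
cgroup_fits_in_page() {
    file=$1
    page_size=${2:-$(getconf PAGESIZE)}
    # Read the file through cat: files under /proc report a size of 0 to stat.
    size=$(( $(cat "$file" | wc -c) ))
    echo "$file: $size bytes, page size $page_size"
    [ "$size" -lt "$page_size" ]
}

# Usage:
#   cgroup_fits_in_page /proc/<dumpee_pid>/cgroup || echo "too big for the parasite buffer"
```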
    Adrian Reber
    @adrian:lisas.de
    After almost two years my Kubernetes checkpoint support PR was finally merged today: kubernetes/kubernetes#104907 That took a lot longer than expected, but now it should be possible to checkpoint containers in Kubernetes with the help of CRIU 🎆
    Prajwal S N
    @snprajwal
    That's awesome!! Congratulations 🎊
    Zeyad Yasser
    @ZeyadYasser
    Awesome, Congrats 🎉
    Pavel Tikhomirov
    @Snorch
    Cool, Congrats!!!
    Andrei Vagin
    @avagin
    @adrian:lisas.de good job! Congrats!
    Adrian Reber
    @adrian:lisas.de
    @mihalicyn: have you seen this: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1973620 does this mean that CRIU is again broken on Ubuntu kernels?
    Alexander Mikhalitsyn
    @mihalicyn

    Ugh, something strange happening with that. It's a long story.

    1. Andrei filed bug https://bugs.launchpad.net/ubuntu/impish/+source/linux/+bug/1967924
      because our patch https://kernel.ubuntu.com/git/ubuntu/ubuntu-hirsute.git/commit/?id=c9dae9237803b4ae3517f9599b33c6d4b6b9c0a4
      from 2021-05-07 12:11:00 was lost

    2. They ported this patch themselves and discovered that it leads to a crash:
      https://bugs.launchpad.net/ubuntu/impish/+source/linux/+bug/1967924/comments/8
      The issue was successfully fixed and I've reviewed the solution:
      https://bugs.launchpad.net/ubuntu/impish/+source/linux/+bug/1967924/comments/14

    3. For some reason the fix for this patch wasn't squashed into the original commit but applied on top instead, and then a kernel release came out without the fix...
      [ https://lwn.net/Articles/899420/ ]

    AFAIK, current kernels no longer have this problem and contain both patches.

    Adrian Reber
    @adrian:lisas.de
    Thanks for the overview. Good to know.
    Alexander Mikhalitsyn
    @mihalicyn
    If something gets broken with that please ping me. I'm also looking into that but...
    Alexander Mikhalitsyn
    @mihalicyn
    and this version does not look correct, because it contains an additional fput(file) that is no longer needed because of https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal/commit/fs/overlayfs/file.c?h=hwe-5.15-next&id=2896900e22f8212606a1837d89a6bbce314ceeda
    Alexander Mikhalitsyn
    @mihalicyn
    JFYI: I've sent new patches for Ubuntu 22.04 kernels regarding this issue with overlayfs.
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1967924
    NoobTracker
    @NoobTracker
    Hello, I am unable to restore a program with criu as described here: https://www.youtube.com/watch?v=kjhuzSl6JYc
    Should I post the error messages?
    Adrian Reber
    @adrian:lisas.de
    @NoobTracker: yes please, it would probably be better to open a GitHub issue
    NoobTracker
    @NoobTracker
    okay ...
    NoobTracker
    @NoobTracker
    hm, how do I replicate it ... my memory is sooo poor
    Alexander Mikhalitsyn
    @mihalicyn

    It's not your memory's fault. Bugs tend to disappear when you try to reproduce them purposefully. :D

    https://en.wikipedia.org/wiki/Young%27s_interference_experiment

    NoobTracker
    @NoobTracker
    well I forgot the commands I typed in
    anyway, reproduced it
    NoobTracker
    @NoobTracker
    issue submitted
    Radostin Stoyanov
    @rst0git
    @NoobTracker you might need to set security.sandbox.content.level to 1 in about:config as criu does not yet support checkpoint of nested IPC namespaces
    https://wiki.mozilla.org/Security/Sandbox