sudo podman inspect get_counter_5 --format '{{.NetworkSettings.IPAddress}}'
10.88.0.11
curl 10.88.0.11:8088
counter: 0
sudo podman rm get_counter_5
sudo podman container restore --import=get_counter_5.tar.gz --tcp-established
curl 10.88.0.11:8088
counter: 1
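For context, here is a minimal sketch of the kind of counter service get_counter_5 might be running (hypothetical; the real image is not shown in this thread): a tiny HTTP server on port 8088 that answers each request with an incrementing in-memory counter, so the value going from 0 to 1 across the restore shows that process memory survived the checkpoint/restore.

/* Hypothetical stand-in for the service inside get_counter_5 (assumption,
 * the real image is not shown here): listen on 8088 and answer each HTTP
 * request with an incrementing in-memory counter. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(8088),
        .sin_addr.s_addr = htonl(INADDR_ANY),
    };
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1, counter = 0;

    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) || listen(srv, 8))
        return 1;

    for (;;) {
        char body[64], resp[256];
        int c = accept(srv, NULL, NULL);

        if (c < 0)
            continue;
        int blen = snprintf(body, sizeof(body), "counter: %d\n", counter++);
        snprintf(resp, sizeof(resp),
                 "HTTP/1.1 200 OK\r\nContent-Length: %d\r\n"
                 "Connection: close\r\n\r\n%s", blen, body);
        write(c, resp, strlen(resp));
        close(c);
    }
}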
Hi @minhbq-99!
Just a couple of guesses: the hugetlbfs mounts have no FS_USERNS_MOUNT flag (meaning hugetlbfs cannot be mounted from inside a user namespace), but shmemfs does have this fs_flag. I'm not sure that I've understood you correctly here:
successful on the host but failed when CI runs it in docker; running that container manually with a shell, the test is successful).
Did you manage to reproduce the problem on your machine? Can you show the command?
Another possible reason is that hugetlbfs has to be mounted manually (as far as I can see), while shmemfs is mounted automatically at system start. Maybe you have to add a direct hugetlbfs mount to your kerndat initialization function?
Just a couple of guesses: the hugetlbfs mounts have no FS_USERNS_MOUNT flag (meaning hugetlbfs cannot be mounted from inside a user namespace), but shmemfs does have this fs_flag.
Interesting information. I got an error with the user namespace test but had no idea why, so I've already disabled the uns test.
I think I understand the problem. hugetlbfs is mounted by default, as far as I can see, but I need to set the number of hugepages in sysfs before I can allocate them. Currently, I add some code to zdtm.py to reserve some hugepages before running the zdtm tests. The kerndat function may get called before hugepages are available, so I cannot allocate one and get the device number. The reason for the inconsistent zdtm results is that kerndat is cached: my cached version on the host has the correct device number, so the zdtm test there always succeeds.
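For reference, a minimal sketch of what such a probe could look like (my own illustration, not CRIU's actual kerndat code; it assumes 2 MB hugepages and the standard sysfs path): reserve a hugepage, create an anonymous shared MAP_HUGETLB mapping, and stat its map_files link to learn the hugetlbfs device number.

/* Sketch of a kerndat-style hugetlbfs probe (illustration only, not CRIU's
 * actual kerndat code); assumes 2 MB hugepages. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

#define HPAGE (2UL << 20)

int main(void)
{
    /* Without reserving hugepages first, the mmap below fails with ENOMEM. */
    FILE *f = fopen("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", "w");
    if (f) {
        fprintf(f, "1\n");
        fclose(f);
    }

    void *p = mmap(NULL, HPAGE, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return 1;
    }

    char path[64];
    struct stat st;
    snprintf(path, sizeof(path), "/proc/self/map_files/%lx-%lx",
             (unsigned long)p, (unsigned long)p + HPAGE);
    if (stat(path, &st)) {
        perror(path);
        return 1;
    }

    /* This is the device number the dump code would compare mappings against. */
    printf("hugetlbfs st_dev: %lx\n", (unsigned long)st.st_dev);
    return 0;
}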
Thank you @adrian:lisas.de @mihalicyn :)
Hi @minhbq-99, AFAICS, if we shmat a SysV IPC shm segment with hugetlb backing, it looks the same as a hugetlb mapping created by mmap, meaning that it also has a /proc/pid/map_files/ entry. But looking at your PR https://github.com/checkpoint-restore/criu/pull/1622/commits/3ec6dbfe29558c5067fdb7c04313f01743e694c7#diff-6f08d59ddde08ca75f7ccb0aac7f5ca6e011bd968b57d3de0ee7a1786f582763R238, I'm not sure that your dev comparison works even for mmap'ed mappings. If I run a simple test, https://gist.github.com/Snorch/ab5f86e5e8f3d7f9fecfd7eabdcadd7a:
[root@fedora helpers]# ./shm-huge
shm_ptr = 0x7f8868400000
map = 0x7f88689af000
map2m = 0x7f8868200000
All three different mappings have the same device:
[root@fedora snorch]# stat /proc/136984/map_files/{7f8868400000,7f88689af000,7f8868200000}* | grep Dev
Device: 16h/22d Inode: 1055674 Links: 1
Device: 16h/22d Inode: 1055681 Links: 1
Device: 16h/22d Inode: 1055673 Links: 1
On a pretty new 5.13.12-200.fc34.x86_64 kernel.
[root@fedora helpers]# ./shm-huge
shm_ptr = 0x7f555e800000
map = 0x7f555ed76000
map2m = 0x7f555e600000
[root@fedora helpers]# grep "7f555e800000\|7f555ed76000\|7f555e600000" /proc/158858/maps
7f555e600000-7f555e800000 rw-s 00000000 00:0f 1088051 /anon_hugepage (deleted)
7f555e800000-7f555ea00000 rw-s 00000000 00:0f 65567 /SYSV6129e7d0 (deleted)
7f555ed76000-7f555ed77000 rw-s 00000000 00:01 73556 /dev/zero (deleted)
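A rough reconstruction of what the shm-huge helper presumably does (an assumption on my part, the exact code is in the gist): a SysV shm segment with SHM_HUGETLB, a plain anonymous shared mapping, and an anonymous shared MAP_HUGETLB mapping, matching the shm_ptr/map/map2m lines above.

/* Rough reconstruction of the shm-huge helper (assumed, see the gist). */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#define HPAGE (2UL << 20)

int main(void)
{
    /* SysV shm segment backed by hugetlb pages ("/SYSV... (deleted)" in maps). */
    int id = shmget(IPC_PRIVATE, HPAGE, IPC_CREAT | SHM_HUGETLB | 0600);
    void *shm_ptr = shmat(id, NULL, 0);

    /* Plain anonymous shared mapping ("/dev/zero (deleted)" in maps). */
    void *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    /* Anonymous shared hugetlb mapping ("/anon_hugepage (deleted)" in maps). */
    void *map2m = mmap(NULL, HPAGE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    printf("shm_ptr = %p\nmap = %p\nmap2m = %p\n", shm_ptr, map, map2m);

    /* Stay alive so /proc/<pid>/map_files and /proc/<pid>/maps can be inspected. */
    pause();
    return 0;
}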
Hi, my pull request fails on a CentOS 7 user_namespace test case. The problem is in restoring hugetlb shmem mappings: when restoring shmem mappings, we try to use memfd, and if we cannot, we open the map_files link of that mapping. In the case of CentOS 7, we fall back to opening the map_files link, and we don't have the CAP_SYS_ADMIN capability:
https://elixir.bootlin.com/linux/v3.10/source/fs/proc/base.c#L1889
With some debugging, I found that the restored process has CAP_SYS_ADMIN, but its cred->user_ns is at a lower level than init_user_ns. But why can the checkpoint process open the map_files link? I see that the checkpoint process's cred->user_ns is the same as init_user_ns. So why is there a difference in cred->user_ns between the checkpoint and restore processes?
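To illustrate the two paths being discussed, here is a sketch under my own assumptions (not the PR's code): prefer memfd_create(), and only fall back to opening the /proc/self/map_files link of a freshly created mapping, the step that needs CAP_SYS_ADMIN on older kernels such as CentOS 7's 3.10.

/* Illustration only: memfd_create() first, map_files link as fallback. */
#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>

static int shmem_fd(size_t len)
{
    int fd = syscall(SYS_memfd_create, "criu-shmem", 0);
    if (fd >= 0)
        return fd;              /* preferred path */

    /* Fallback: create the mapping first, then open its map_files link. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    char path[64];
    snprintf(path, sizeof(path), "/proc/self/map_files/%lx-%lx",
             (unsigned long)p, (unsigned long)p + len);
    fd = open(path, O_RDWR);    /* EPERM here without CAP_SYS_ADMIN */
    munmap(p, len);
    return fd;
}

int main(void)
{
    int fd = shmem_fd(4096);
    if (fd < 0) {
        perror("shmem_fd");
        return 1;
    }
    printf("got shmem fd %d\n", fd);
    return 0;
}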
Traceback (most recent call last):
File "./criu-ns", line 231, in <module>
res = wrap_dump()
File "./criu-ns", line 200, in wrap_dump
set_pidns(pid, pid_idx)
File "./criu-ns", line 161, in set_pidns
raise OSError(errno.ENOENT, 'Cannot find NSpid field in proc')
FileNotFoundError: [Errno 2] Cannot find NSpid field in proc
cat /proc/self/status | grep NSpid
that command should work
Which version of Ubuntu?
Restoring with criu directly causes PID conflicts, so I restore through criu-ns in a new namespace. The process was restored successfully, but freezing (dumping) it again causes this problem.
uname -a ?
Linux 8194e282c3c5 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
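For what it's worth, that 3.10 kernel apparently does not provide the NSpid line in /proc/<pid>/status (as far as I know it only appeared around mainline Linux 4.1), which would explain why criu-ns's set_pidns() cannot find it. A quick sketch of the check (my own illustration, not the criu-ns code):

/* Sketch (not the criu-ns code) of the check that fails above: look for the
 * NSpid line in /proc/<pid>/status, which lists the pid in every nested pid
 * namespace. Old kernels such as the 3.10 one above do not emit it at all. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char path[64], line[256];

    snprintf(path, sizeof(path), "/proc/%s/status", argc > 1 ? argv[1] : "self");
    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }

    while (fgets(line, sizeof(line), f)) {
        if (!strncmp(line, "NSpid:", 6)) {
            printf("%s", line);   /* e.g. "NSpid:  1234    1" inside a pid namespace */
            fclose(f);
            return 0;
        }
    }
    fclose(f);
    fprintf(stderr, "Cannot find NSpid field in proc\n");
    return 1;
}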
While dumping a container (using the runc diskless method), I get a stats-dump file on the source node. But on the destination side, after restoring that container, I am not getting the stats-restore file. Isn't it possible to get that stats-restore file?
(I also used podman, where I got both stats-dump and stats-restore on the source and destination.)
Though I had asked this question before and thought I could solve the issue, I failed to do so.