    Guillaume Fraux
    @Luthaf
    Hey @dkonstan, thanks for taking the time to come over and discuss this! I think implementing this would be pretty easy. We already support a distance selection, and sub-selections in is_bonded and friends. The syntax would look like resname HOH and distance(#1, resname Y Z) < 12.5. The changes should be relatively simple, so if you want to dig a bit into C++ and implement it yourself, I would be happy to help you! Otherwise I'll open an issue on the repo and try to do it soon-ish.
    Looking through MDA documentation, they also have multiple selections related to the center of geometry of a sub-selection, which would also be interesting to implement, but a bit harder.
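A minimal sketch of what a selection like resname HOH and distance(#1, resname ...) < 12.5 is meant to compute, written without chemfiles. The Atom struct and the select_near function are hypothetical, and the sketch assumes the comparison holds as soon as one atom of the sub-selection is closer than the cutoff; the real semantics are whatever the final implementation settles on.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal atom record, not the chemfiles Atom class.
struct Atom {
    std::string resname;
    double x, y, z;
};

double atom_distance(const Atom& a, const Atom& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Indices of atoms with residue name `target` that are closer than `cutoff`
// to at least one atom with residue name `reference` (the sub-selection).
std::vector<std::size_t> select_near(const std::vector<Atom>& atoms,
                                     const std::string& target,
                                     const std::string& reference,
                                     double cutoff) {
    assert(cutoff >= 0.0);
    std::vector<std::size_t> selected;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        if (atoms[i].resname != target) continue;
        for (std::size_t j = 0; j < atoms.size(); ++j) {
            if (atoms[j].resname == reference &&
                atom_distance(atoms[i], atoms[j]) < cutoff) {
                selected.push_back(i);
                break;  // one match in the sub-selection is enough
            }
        }
    }
    return selected;
}
```

This is the brute-force reading of the selection; it only illustrates the result, not how the selection engine evaluates it.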
    Guillaume Fraux
    @Luthaf
    I opened chemfiles/chemfiles#327 to track this feature =)
    Mykola Dimura
    @mdimura
    Hi, everyone! How do I properly specify a property in a selection string? I tried "[chainname] A", but it gives an empty selection. Sorry, I could not find an example in the docs.
    Guillaume Fraux
    @Luthaf
    Hi! "[chainname] A" should work, but the issue here is that chainname is a residue property, and only atomic properties are checked when evaluating a selection. That's something I think we should fix, I'll open an issue. And yes, the docs for the selection language are not very good, sorry about that.
    Guillaume Fraux
    @Luthaf
    See chemfiles/chemfiles#331, and thanks for the report!
    In the meantime, you can modify the parser for the format you are using to store chainname on atoms too, and recompile chemfiles. If you want to go this route, I can help.
    Mykola Dimura
    @mdimura
    Thank you for your answer! I implemented a workaround in the src/selections/expr.cpp. Seems to work, please let me know if there are some caveats.
    Mykola Dimura
    @mdimura
    Chemfiles selection syntax does not seem to allow for ranged numerical properties. Would it be possible to introduce something like resid 3-17 or resid 3 to 17? Similar syntax exists in vmd, pymol, mdtraj, etc., and it is more compact than chemfiles' resid >=3 and resid <=17. Another option could be 3 <= resid <= 17, but resid 3-17 might be cleaner.
    Guillaume Fraux
    @Luthaf
    Hey @mdimura, sorry I missed your question here! I agree that having some kind of ranged selection could be nicer than resid >=3 and resid <=17. My favorite alternatives would be 3 <= resid <= 17, or between(resid, 3, 17). Could you open an issue to discuss what syntax this could use? resid 3-17 would clash with resid == 3 - 17, i.e. resid == -14.
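To make the proposed range forms concrete, here is a small sketch of the inclusive range test that resid >=3 and resid <=17, 3 <= resid <= 17 and between(resid, 3, 17) would all express. The between and filter_resids functions are hypothetical helpers written for this illustration, not part of chemfiles.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper mirroring the proposed between(resid, 3, 17) form:
// true when lo <= value <= hi, both bounds inclusive.
bool between(long value, long lo, long hi) {
    assert(lo <= hi);
    return value >= lo && value <= hi;
}

// Keep only the residue ids that fall inside [lo, hi].
std::vector<long> filter_resids(const std::vector<long>& resids, long lo, long hi) {
    std::vector<long> kept;
    for (long resid : resids) {
        if (between(resid, lo, hi)) kept.push_back(resid);
    }
    return kept;
}
```

With arithmetic allowed in selections, 3-17 evaluates to -14, which is why a function-like form avoids the ambiguity.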
    jmintser
    @jmintser
    Hi, I'm new to chemfiles and pretty new to MD trajectories in general. I'm trying to read data from a .dtr trajectory. Since chemfiles uses VMD plugins, I feel like .dtr could be easily supported as well, since I (perhaps naively) think that the interface is the same via molfile. Am I way off? Thanks
    Guillaume Fraux
    @Luthaf
    We don't have all VMD plugins enabled since I prefer having test files to ensure the code is working as intended! It should be pretty easy to add support for DTR files though. Can you open an issue for this, and link an example file that could be used for tests?
    BTW, which software produces DTR files?
    Guillaume Fraux
    @Luthaf
    After digging a bit more in the VMD dtrplugin code, it seems that it supports more than what is exported through the VMD API, and it would make sense to extract https://github.com/chemfiles/molfiles/blob/chemfiles/src/dtrplugin.hxx and https://github.com/chemfiles/molfiles/blob/chemfiles/src/dtrplugin.cxx and use them directly. In particular, they support reading only part of the trajectory, which helps reduce the amount of memory required.
    The main thing we would need is at least one trajectory to use as a test case, even better if we can get multiple files, to check all the different variations of the format: https://mdtraj.org/1.9.3/api/generated/mdtraj.load_dtr.html#mdtraj.load_dtr
    jmintser
    @jmintser
    Thank you so much for looking into this. I opened a new issue and added links to some published trajectories
    jmintser
    @jmintser
    Hi, just wanted to make sure: to use chemfiles in c++ code, I need chemfiles.hpp but that doesn't seem to be included with a linux binary install (tried on centos 8). I can just build from source but was wondering if this is expected behavior. Thanks.
    Guillaume Fraux
    @Luthaf
    Are you talking about the binaries from OpenSuse build, i.e. http://chemfiles.org/chemfiles/latest/installation.html#pre-compiled-binaries?
    These are following the usual linux naming scheme: chemfiles package only contains the shared library, chemfiles-static contains the static library & chemfiles-dev (Debian & friends) / chemfiles-devel (CentOS & friends) contains the headers.
    I realize this is not very clear in the documentation, I'll update it (unless you want to send a PR for this! :smiley:)
    jmintser
    @jmintser
    Thanks! Since it wasn't in the docs I didn't think to look. Sent a PR with copy/paste of the above
    jmintser
    @jmintser
    Hi, I'm reading a .dcd trajectory, which doesn't contain any metadata, so I assumed that I could populate a Topology from a pdb. But it doesn't look like a Topology can be populated from a file, only created from scratch?
    jmintser
    @jmintser
    Oh, sorry, I think I found the relevant docs
    jmintser
    @jmintser
    Hi again, I'm trying to figure out how to use SubSelection as implemented in chemfiles/chemfiles#327 and can't quite figure it out. I basically have the use case that was requested (say, find all atoms within 5A of residue ALA). Can't I do two sub-selections "distance(resname == ALA, resname != ALA) < 5"? Thanks much
    Guillaume Fraux
    @Luthaf
    Sub-selections are implemented on master, but not yet released, so if you are using the pre-built linux version (which is still on 0.9.3) that's expected. I would like to release a new version soon, but in the meantime you would have to build chemfiles from sources
    For your topology question, you can use this overload of Trajectory::set_topology
    jmintser
    @jmintser
    Got it, thanks!
    jmintser
    @jmintser
    Hi, sorry, more questions. I'm reading MD coordinates from one file and the topology from another using traj.set_topology and it works great. Now I want to add some new atomic properties and use them across multiple frames. It seems I can only access a topology and its atoms one frame at a time. One simple way to do that might be to define a topology from a file, add my properties and then use traj.set_topology(modified_topology), but there doesn't seem to be a way to just define a new topology from a file (topo = Topology(filename)), only to assign one to a trajectory. Am I missing something? Thanks
    jmintser
    @jmintser
    Another way may be to enable 'topology = traj.get_topology'. Then I could modify it and assign it back to the trajectory.
    Guillaume Fraux
    @Luthaf
    I'm not quite sure what you mean =) So you have one file (file1) with coordinates and another one (file2) with a topology that you want to modify and then use for reading file1? The easiest way to do this would be something like
    #include <chemfiles.hpp>
    using namespace chemfiles;

    int main() {
        // get a copy of the topology from the first frame of the file
        auto topology = Trajectory("file2").read().topology();
        // modify the topology: set a property on the atom at index 12
        topology[12].set("foo", false);

        // use the modified topology to read the other file
        auto traj = Trajectory("file1");
        traj.set_topology(topology);

        // read, etc.
        auto frame = traj.read();
    }
    jmintser
    @jmintser
    Great, thank you! Didn't realize I could get it from a frame.
    Moyassar Meshhal
    @mmeshhal
    Hey there, I'm a new user of chemfiles/cfiles and I have some questions:
    in the hbonds analysis using cfiles:
    1. if the trajectory is in xyz format but the system is periodic, so if I used the cell argument, will it make any difference?
    2. if yes, does it wrap the atoms within the cell?
    Guillaume Fraux
    @Luthaf
    Hey! If you are using the cell option to cfiles, the system is considered periodic, and distances are computed accordingly. If you don't use it, atoms on opposite sides of the cell will not see each other.
    However, using the option does not modify the structure or change the positions of the atoms (i.e. it does not wrap them). It just changes the way distances are computed.
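The difference between the two behaviours can be sketched with the minimum-image convention. This is an illustration only, assuming an orthorhombic cell; it is not the code cfiles uses, and plain_distance, periodic_distance and Vec3 are names made up for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;

// Plain Euclidean distance: what you get when no cell is given.
double plain_distance(const Vec3& a, const Vec3& b) {
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k) {
        const double d = a[k] - b[k];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Minimum-image distance in an orthorhombic cell with edge lengths `cell`.
// The positions are taken by const reference and never modified (no wrapping).
double periodic_distance(const Vec3& a, const Vec3& b, const Vec3& cell) {
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k) {
        assert(cell[k] > 0.0);
        double d = a[k] - b[k];
        d -= cell[k] * std::round(d / cell[k]);  // fold into [-L/2, L/2]
        sum += d * d;
    }
    return std::sqrt(sum);
}
```

For two atoms at x = 0.5 and x = 9.5 in a cell of length 10, the plain distance is 9 while the periodic distance is 1, and both positions stay where they were.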
    jmintser
    @jmintser
    Hi I'm analyzing a trajectory that includes ~ 35000 water atoms. I'd like to select and remove them from a frame before doing further analysis, using code essentially like in the examples (C++ Tutorials). This works perfectly for any other residue I select, but for waters, most of them get removed, but not all. About 32750 get removed, so my hunch is there's some limit somewhere? Feels close to max(short int). Does this kind of behavior ring any bells? Any additional selections on the remaining atoms results in "/usr/include/chemfiles/external/optional.hpp:685: std::experimental::optional<T&>::operator->() const [with T = const chemfiles::Residue]::<lambda()>: Assertion `false && "ref"' failed." Thanks much for any helpful hints
    Guillaume Fraux
    @Luthaf

    If the failed assertion comes from chemfiles, that's a bug on our side! What is the selection you are using? It sounds like you are checking a residue property/residue name.
    "so my hunch is there's some limit somewhere?"
    The limit should be the maximum value of size_t, i.e. 4294967295 on 32-bit systems and 18446744073709551615 on 64-bit systems.
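For reference, the two limits mentioned in this exchange can be checked with std::numeric_limits. The helper names below are hypothetical, and the sketch only compares the numbers; it says nothing about where the actual bug comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Largest value of std::size_t on the platform this is compiled for.
constexpr std::size_t max_size_t() {
    return std::numeric_limits<std::size_t>::max();
}

// True if `count` could be stored in a signed short (32767 with 16-bit short).
bool fits_in_short(std::size_t count) {
    return count <= static_cast<std::size_t>(std::numeric_limits<short>::max());
}

// Narrow an index to short, asserting instead of silently wrapping around.
short to_short_index(std::size_t index) {
    assert(fits_in_short(index));
    return static_cast<short>(index);
}
```

With a 16-bit short, 32750 removed atoms would still fit while 35000 would not, which is why the observed count looks suspicious even though indexes are meant to be size_t.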

    Yassine Naimi
    @ynaimi
    Hi all, I am working on optimizing the guess_bonds function and I had a question on multithreading. Does chemfiles offer or use an API for running and managing multithreaded tasks?
    Thank you in advance for any information/help.
    Guillaume Fraux
    @Luthaf
    No, everything is single-threaded currently. I can see two main contenders for multi-threading: OpenMP and std::thread. You may want to open an issue to discuss this, since there are a lot of design implications to adding multithreading =)
    For example: how can we guarantee that threading is supported with all compilers (including MSVC), and if we cannot, how do we ensure the code can still run single-threaded; which parallelisation strategy (tasks, splitting the input data in chunks, work stealing, ...) is the best one in this case; how can we ensure the absence of data races (maybe add thread sanitizer in CI?); and a few others.
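As an illustration of two of these points (chunked input and a single-threaded fallback), here is a minimal std::thread sketch. It is not chemfiles code: chunked_sum is a hypothetical function, and the workload is a plain sum chosen only because it is easy to check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <system_error>
#include <thread>
#include <vector>

// Sum `values` by splitting them into contiguous chunks, one per worker.
// Each worker writes only to its own slot of `partial`, so there is no data race.
double chunked_sum(const std::vector<double>& values, std::size_t n_workers) {
    assert(n_workers >= 1);
    if (values.empty()) return 0.0;
    n_workers = std::min(n_workers, values.size());

    std::vector<double> partial(n_workers, 0.0);
    auto work = [&](std::size_t w) {
        const auto begin = static_cast<std::ptrdiff_t>(values.size() * w / n_workers);
        const auto end = static_cast<std::ptrdiff_t>(values.size() * (w + 1) / n_workers);
        partial[w] = std::accumulate(values.begin() + begin, values.begin() + end, 0.0);
    };

    std::vector<std::thread> threads;
    threads.reserve(n_workers);
    for (std::size_t w = 1; w < n_workers; ++w) {
        try {
            threads.emplace_back(work, w);
        } catch (const std::system_error&) {
            work(w);  // the thread could not start: run this chunk on the caller
        }
    }
    work(0);  // the calling thread handles the first chunk itself
    for (auto& t : threads) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

Calling it with n_workers = 1 never creates a thread, so the same code path serves as the single-threaded version.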
    Yassine Naimi
    @ynaimi
    For sure, it is a big change and this should be discussed carefully in order to choose the best option and the best design for the library, as it will impact everything else. Maybe a good start would be the use of the multithreading API in VMD (under BSD-3 license), as it is cross-platform and seems to offer the possibility to run single-threaded code as well.
    Guillaume Fraux
    @Luthaf
    I would be a bit worried about importing code from VMD, since it is not the cleanest and easiest code to use. I don't know what their threading API looks like, but I've had bad experiences with the plugins. Also, the VMD codebase is not under the BSD license: the plugins are under the UIUC license, which looks a lot like the BSD one, and the core of the code is under a non-standard open source license: https://www.ks.uiuc.edu/Research/vmd/current/LICENSE.html
    To get back to the problem at hand, I think using a cell list for the neighbor search in the bond search would yield more gain for less pain than trying to use multithreading here. The current algorithm is O(N^2), a cell list would be O(N) (cf. https://hoomd-blue.readthedocs.io/en/stable/nlist.html)
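A rough sketch of the cell-list idea next to the O(N^2) reference, to show where the gain comes from. This is not the chemfiles guess_bonds code: it ignores periodic boundaries, uses a single cutoff instead of per-element radii, and stores a dense grid over the bounding box, which is only reasonable for compact systems. All names are made up for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;
using Pair = std::pair<std::size_t, std::size_t>;

double squared_distance(const Vec3& a, const Vec3& b) {
    double sum = 0.0;
    for (std::size_t k = 0; k < 3; ++k) sum += (a[k] - b[k]) * (a[k] - b[k]);
    return sum;
}

// Reference O(N^2) search: test every pair once.
std::vector<Pair> pairs_brute_force(const std::vector<Vec3>& pos, double cutoff) {
    std::vector<Pair> pairs;
    for (std::size_t i = 0; i < pos.size(); ++i)
        for (std::size_t j = i + 1; j < pos.size(); ++j)
            if (squared_distance(pos[i], pos[j]) < cutoff * cutoff) pairs.emplace_back(i, j);
    return pairs;
}

// Cell-list search: bin atoms into cubic cells of width `cutoff`, then test
// each atom only against atoms in its own cell and the adjacent cells.
std::vector<Pair> pairs_cell_list(const std::vector<Vec3>& pos, double cutoff) {
    assert(cutoff > 0.0);
    std::vector<Pair> pairs;
    if (pos.empty()) return pairs;

    // Bounding box of all positions.
    Vec3 lo = pos[0], hi = pos[0];
    for (const Vec3& p : pos) {
        for (std::size_t k = 0; k < 3; ++k) {
            lo[k] = std::min(lo[k], p[k]);
            hi[k] = std::max(hi[k], p[k]);
        }
    }

    // Number of cells per axis, and the cell coordinates of a position.
    std::array<std::size_t, 3> n;
    for (std::size_t k = 0; k < 3; ++k)
        n[k] = static_cast<std::size_t>((hi[k] - lo[k]) / cutoff) + 1;
    auto cell_of = [&](const Vec3& p) {
        std::array<std::size_t, 3> c;
        for (std::size_t k = 0; k < 3; ++k)
            c[k] = std::min(n[k] - 1, static_cast<std::size_t>((p[k] - lo[k]) / cutoff));
        return c;
    };
    auto flat = [&](std::size_t x, std::size_t y, std::size_t z) {
        return (x * n[1] + y) * n[2] + z;
    };

    // Fill the grid with atom indexes.
    std::vector<std::vector<std::size_t>> cells(n[0] * n[1] * n[2]);
    for (std::size_t i = 0; i < pos.size(); ++i) {
        const auto c = cell_of(pos[i]);
        cells[flat(c[0], c[1], c[2])].push_back(i);
    }

    // Visit the (up to) 27 cells around each atom.
    for (std::size_t i = 0; i < pos.size(); ++i) {
        const auto c = cell_of(pos[i]);
        for (std::size_t x = (c[0] > 0 ? c[0] - 1 : 0); x <= std::min(c[0] + 1, n[0] - 1); ++x)
        for (std::size_t y = (c[1] > 0 ? c[1] - 1 : 0); y <= std::min(c[1] + 1, n[1] - 1); ++y)
        for (std::size_t z = (c[2] > 0 ? c[2] - 1 : 0); z <= std::min(c[2] + 1, n[2] - 1); ++z)
            for (std::size_t j : cells[flat(x, y, z)])
                if (i < j && squared_distance(pos[i], pos[j]) < cutoff * cutoff)
                    pairs.emplace_back(i, j);
    }
    std::sort(pairs.begin(), pairs.end());  // only to ease comparison with the reference
    return pairs;
}
```

For roughly uniform density each atom sees a bounded number of neighbors, so the pair search itself scales linearly with the number of atoms; both functions should return the same pairs for the same input.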
    Yassine Naimi
    @ynaimi
    The header mentions the BSD-3 license, but I understand your concerns about using VMD code.
    Yassine Naimi
    @ynaimi
    Actually, as they are both using a cell list algorithm and running it on multiple threads, I thought about having multithreading in chemfiles as well. But I understand that this is a complex implementation which will impact several aspects of the library's code. It should be discussed extensively before choosing an option. I will keep you updated about my version of the code with a cell list algorithm.
    Julian Mintseris
    @jmintser_gitlab
    Hi, I had chemfiles 0.9.3 installed with dnf on centos 8 working fine. Then I had to remove and re-install for various reasons and now I just realized that the newly installed chemfiles is version 0.10 and that appears to be the only version available via packages. Also, the website http://chemfiles.org/chemfiles/latest now appears to refer to version "0.11-dev". I'm not sure what this means. Code that was working before with 0.9 now fails with a segfault using 0.10 version. I can install 0.9.3 from source but just wanted to check what's happening. Thanks
    Guillaume Fraux
    @Luthaf
    Sorry about that, we are in the process of releasing 0.10, and we did not intend for it to reach users yet. I guess you are getting chemfiles as a pre-compiled binary from here: http://chemfiles.org/chemfiles/latest/installation.html#pre-compiled-binaries ? I thought that using a pre-release tag (0.10.0-rc1) meant that users would not get the update yet.
    You should be able to use dnf to get back to 0.9.3 with dnf downgrade chemfiles-0.9.3 and potentially the same for chemfiles-devel.
    Out of interest, and to prevent this from happening again, which language are you using chemfiles from? If it is C or C++, could you share the compiled binary with me, potentially in private? I would like for users to get a proper error when trying to use a different version of chemfiles instead of a segfault.