Christian Diener
@cdiener
Yeah because neither optlang nor cobrapy checks that the IDs are compatible with the solvers.
Moritz E. Beber
@Midnighter
:grimacing: I suppose optlang is a logical place for that kind of validation?
Rodrigo Santibáñez
@glucksfall
No objective coefficients in model. Unclear what should be optimized
Christian Diener
@cdiener
Yep, there are some. I suspect there is a character that optlang does not catch (UTF maybe). For instance, optlang checks for spaces.
Rodrigo Santibáñez
@glucksfall
[image attached]
How do I change the optimizer? I used Gurobi.
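For reference, switching the solver in cobrapy is just an attribute assignment; a minimal sketch (the mod variable name is an assumption):

# Switch the optimizer cobrapy hands the problem to; any installed
# optlang-supported solver name works, e.g. "glpk", "gurobi", "cplex".
mod.solver = "glpk"
print(type(mod.solver))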
Rodrigo Santibáñez
@glucksfall
Thanks
I tried to use glpk and the kernel died
Christian Diener
@cdiener
The same issue was reported before: opencobra/cobrapy#964 and had to do with #955. Try to read it with read_sbml_model("iRP911.xml.gz", f_replace=None).
Moritz E. Beber
@Midnighter
GLPK is probably the most strict when it comes to identifiers/variable names. Good old C-strings, I suppose.
Christian Diener
@cdiener
After that it works for me:
In [10]: mod.objective = mod.reactions.BiomassSynthesis

In [11]: mod.optimize()
Out[11]: <Solution 18.622 at 0x7f3a8879b640>
Rodrigo Santibáñez
@glucksfall
Seems the problematic metabolites are:
b'adenosine_3\x05bisphosphate'
b'_D\x04phosphopantothenate'
b'_R\x04phosphopantothenoyl_L_cysteine'
b'2___5_triphosphoribosyl\x03dephospho_CoA'
b'2_5_diamino_6__ribosylamino\x043H__pyrimidinone_5__phosphate'
b'3_4_3\x04tetrahydrospirilloxanthin'
b'3\x04dihydrorhodovibrin'
b'5_amino_6\x05phosphoribosylamino_uracil'
Christian Diener
@cdiener
The ones actually crashing the kernel are the ones that contain things like __X__, because that gets converted to the character with ASCII code X, which results in weird control characters in this case.
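A minimal sketch of that decoding behaviour (illustrative only, not cobrapy's actual code):

import re

def decode_sbml_id(sbml_id):
    # "__NN__" is read as the character with ASCII code NN, so small
    # codes become invisible control characters.
    return re.sub(r"__(\d+)__", lambda m: chr(int(m.group(1))), sbml_id)

print(repr(decode_sbml_id("adenosine_3__5__bisphosphate")))
# 'adenosine_3\x05bisphosphate'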
Rodrigo Santibáñez
@glucksfall
Ohhh... the metabolite adenosine_3\x05bisphosphate was converted from adenosine_3__5__bisphosphate, right?
Christian Diener
@cdiener
Yes!
Rodrigo Santibáñez
@glucksfall
So sed -i 's|__|_|g' iRP911.xml solves the problem
Christian Diener
@cdiener
Yeah, I would guess so. Or deactivate the automatic replacement using f_replace=None.
jeremymchacon
@jeremymchacon
@glucksfall @cdiener you both rule! I was able to change the met codes for things like __X__ as suggested, after loading with f_replace=None, and then save, thus creating a model that doesn't crash the kernel. I greatly appreciate it!
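A rough sketch of that workflow (the file names here are assumptions):

import cobra

# Load without the automatic "__NN__" ID replacement.
model = cobra.io.read_sbml_model("my_model.xml", f_replace=None)

# Collapse the double underscores that would otherwise be decoded into
# control characters, then rebuild the model's internal indices.
for met in model.metabolites:
    met.id = met.id.replace("__", "_")
model.repair()

cobra.io.write_sbml_model(model, "my_model_fixed.xml")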
Moritz E. Beber
@Midnighter
@cdiener loky looks quite interesting. Have you used it much? Does it live up to its promises for you? I feel like the job submission API is a bit thin compared to a multiprocessing.Pool but maybe one doesn't really need more than a simple submit and map.
It'd be interesting to see if it has a performance degradation on Windows or runs just the same.
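For reference, loky's job-submission surface really is just submit and map, mirroring concurrent.futures; a minimal sketch:

from loky import get_reusable_executor

def square(x):
    return x * x

if __name__ == "__main__":
    # The reusable executor is loky's main entry point.
    executor = get_reusable_executor(max_workers=2)
    print(executor.submit(square, 7).result())   # 49
    print(list(executor.map(square, range(5))))  # [0, 1, 4, 9, 16]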
Christian Diener
@cdiener
I did not notice much difference compared to multiprocessing, but I don't use Windows. sklearn uses joblib, which uses loky. I know that some people prefer it because cloudpickle is faster and did not have the pre-3.8 size constraint. I think most people are waiting for https://docs.python.org/3/library/concurrent.futures.html though.
Moritz E. Beber
@Midnighter
Loky docs say cloudpickle is slower but can pickle objects defined in __main__?
Christian Diener
@cdiener
Oh yeah, and it seems it doesn't even use it for stuff outside __main__. My bad! Did you check if it has the same startup overhead on Windows?
It doesn't seem to be that actively maintained, though.
Moritz E. Beber
@Midnighter
No, I haven't checked the Windows thing yet.
Gustavo Tamasco
@tamascogustavo
Hey guys, I'm struggling to run gapfill on my model.
I want to use the whole of MetaCyc as my universal model, but after running the gapfill command, all the reactions from the database are incorporated into my model. Any suggestions on how to solve this?
I also tried to use a model from BiGG as the database, but since the IDs are not identical I don't know if it's a valid approach; even doing that, when I try model.optimize() it gives the error "system has no solution".
Gustavo Tamasco
@tamascogustavo
File "cobrapy_v4.py", line 697, in <module>
main()
File "cobrapy_v4.py", line 652, in main
solution = gapfill(model, database, demand_reactions=False)
File "/usr/local/anaconda3/lib/python3.7/site-packages/cobra/flux_analysis/gapfilling.py", line 350, in gapfill
return gapfiller.fill(iterations=iterations)
File "/usr/local/anaconda3/lib/python3.7/site-packages/cobra/flux_analysis/gapfilling.py", line 253, in fill
error_value=None, message="gapfilling optimization failed"
File "/usr/local/anaconda3/lib/python3.7/site-packages/cobra/core/model.py", line 1082, in slim_optimize
assert_optimal(self, message)
File "/usr/local/anaconda3/lib/python3.7/site-packages/cobra/util/solver.py", line 544, in assert_optimal
raise exception_cls(f"{message} ({status}).")
cobra.exceptions.Infeasible: gapfilling optimization failed (infeasible).
Christian Diener
@cdiener
Yeah if your IDs don't match it won't work. All metabolite IDs have to be compatible for gapfilling.
What is your objective for gapfilling?
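For context, a minimal gapfilling call looks like the sketch below; model and universal_model are assumed to be loaded already and to share one metabolite namespace:

from cobra.flux_analysis import gapfill

# gapfill optimizes the model's *current* objective and proposes the
# fewest reactions from the universal model needed to make it feasible.
solutions = gapfill(model, universal_model, demand_reactions=False)
for reaction in solutions[0]:
    print(reaction.id)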
Gustavo Tamasco
@tamascogustavo

I am trying to build a model for a lineage of Pseudomonas putida, using the whole genome as the base. The main goal of using gapfill is to enable flux through the BOF.

My SBML file was made using the .dat files from Pathway Tools, so the IDs are from MetaCyc.
I want to use the MetaCyc DB to gap-fill my model, but when I do that, all the reactions from the DB are added to my model.

I also tried to use a curated model for the organism as the DB, and it works.
But I am not sure if the result I get is the ideal one.

The best option would be to do a specific gap fill, targeting just reactions from energy biosynthesis (which I already have in dict format), but I have no idea if a focused gap fill is possible in cobrapy.

Christian Diener
@cdiener
So the gapfill objective will be whatever the current objective on the model is. What does model.objective.expression show before you run gapfill?
Christian Diener
@cdiener
But this kind of gapfilling is often done in the reconstruction phase itself since you can incorporate the genomic info for that (gapfill preferentially with reactions that should be present).
Gustavo Tamasco
@tamascogustavo

So the gapfill objective will be whatever the current objective on the model is. What does model.objective.expression show before you run gapfill?

1.0*PP_Biomass_core - 1.0*PP_Biomass_core_reverse_248f2

I defined my biomass based on the following paper: High-quality genome-scale metabolic modelling of Pseudomonas putida highlights its broad metabolic capabilities.

Gustavo Tamasco
@tamascogustavo

But this kind of gapfilling is often done in the reconstruction phase itself since you can incorporate the genomic info for that (gapfill preferentially with reactions that should be present).

I'm not sure if I got it right. Please correct me if I didn't.

So if a more focused gap-fill approach is needed, I need to check whether the core reactions (energy metabolism in my case) are in the model. If not, I need to add them, followed by gap-filling to add the remaining exchange reactions...

Christian Diener
@cdiener
Ah sorry, I just meant that a lot of model construction pipelines (CarveMe, gapseq) allow you to specify a growth medium during the construction, and that will use the genomic evidence. When you gapfill separately afterwards, this will not use your genomic data anymore. However, can you specify what your goal is here? Any reconstruction method should leave you with a model that can grow. What does model.optimize() give you before running gapfill?
Gustavo Tamasco
@tamascogustavo

Ah sorry, I just meant that a lot of model construction pipelines (CarveMe, gapseq) allow you to specify a growth medium during the construction, and that will use the genomic evidence. When you gapfill separately afterwards, this will not use your genomic data anymore. However, can you specify what your goal is here? Any reconstruction method should leave you with a model that can grow. What does model.optimize() give you before running gapfill?

Oh, now I got it.
The only tool that I used besides cobrapy was AuReMe.
But I got a lot of errors, so I decided to use cobra only.

The goal here is to enable flux through the BOF. When I use model.optimize() before gap-filling I have no flux (0.0).
After adding exchange reactions I have flux.
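As an aside, adding an exchange reaction per metabolite is a one-liner in cobrapy; a minimal sketch (the metabolite ID is a placeholder):

# Open an exchange for an extracellular metabolite so the model can
# import/export it; "glc__D_e" stands in for whatever ID the model uses.
met = model.metabolites.get_by_id("glc__D_e")
model.add_boundary(met, type="exchange")
print(model.optimize().objective_value)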

Christian Diener
@cdiener
I don't think I follow; how did you get the initial model for your strain of P. putida?
Gustavo Tamasco
@tamascogustavo
The idea here was to build a pipeline, and the chosen organism was P. putida. The steps can be described as:
1) the .gbk of P. putida from NCBI
2) PathoLogic from Pathway Tools
3) the .dat files were converted to SBML
4) the model was loaded into cobrapy
5) some reactions were deleted (those containing RNA and broken DNA)
6) addition of the BOF
7) gapfilling of the model
Christian Diener
@cdiener
I see. What did you use for step 3? Usually you would use a reconstruction tool there to do the initial mapping to pathways and the addition of the BOF, which also ensures the model can produce biomass. The most common ones are ModelSEED, CarveMe, and the still pretty new gapseq. AuReMe used to be a contender here, but it looks like it is unmaintained at this point. For manual reconstruction and curation Blade is also really good, but it would be hard to scale to 100s of models. All of them have been used to generate thousands of models, so they scale well. Then you usually gapfill to get growth on particular growth media like LB, or to reproduce experimental observations.
Gustavo Tamasco
@tamascogustavo

I see. What did you use for step 3? Usually you would use a reconstruction tool there to do the initial mapping to pathways and the addition of the BOF, which also ensures the model can produce biomass. The most common ones are ModelSEED, CarveMe, and the still pretty new gapseq. AuReMe used to be a contender here, but it looks like it is unmaintained at this point. For manual reconstruction and curation Blade is also really good, but it would be hard to scale to 100s of models. All of them have been used to generate thousands of models, so they scale well. Then you usually gapfill to get growth on particular growth media like LB, or to reproduce experimental observations.

About step 3:
I based my approach on AuReMe.

sbmlGenerator" \
" --padmet={}" \
" --output={} " \
"--sbml_lvl=3 " \
"--mnx_chem_prop=chem_prop.tsv " \
"--mnx_chem_xref=chem_xref.tsv"\
.format(pad_file,out_name)

--mnx_chem_prop=FILE path of the MNX chemical compound properties file.
--mnx_chem_xref=FILE path of the MNX dictionary of chemical compound ID mappings.

About CarveMe:

As I plan to implement a pipeline for thousands of organisms, I will take a look at this tool.

About gap fill:

Now I see, so gap fill has a different goal than the one I was using it for.

I will try to implement a reconstruction step using CarveMe, followed by loading the model into cobrapy for further analysis.

Thanks so much for the feedback!
Christian Diener
@cdiener
That sounds like a great plan, and it's the same one we are using (CarveMe plus additional curation afterwards). This way you will get models that are guaranteed to grow, and CarveMe is pretty robust to lower genome-completeness metrics as well (due to the "carving" approach).
Gustavo Tamasco
@tamascogustavo

That sounds like a great plan, and it's the same one we are using (CarveMe plus additional curation afterwards). This way you will get models that are guaranteed to grow, and CarveMe is pretty robust to lower genome-completeness metrics as well (due to the "carving" approach).

Nice! I hope it all goes well here; I will let you know how it goes.

Thanks so much!
Christian Diener
@cdiener
:+1:
Moritz E. Beber
@Midnighter

@/all

Dear COBRApy community member,

We cordially invite you to the next COBRApy community meeting. Please feel especially welcome if you have not previously attended any of the calls. We propose to talk about future directions for the project and look forward particularly to hearing about expectations from new users as well as long-term supporters.

We suggest the following agenda; please email moritz.beber@gmail.com to suggest further topics. Please vote in the Doodle poll at https://doodle.com/poll/m8cwipuyu58i42fk for a specific time slot. The proposed times are biased towards European and American time zones; please also email if this prevents you from participating.
Agenda

  1. Discuss and decide on the future structure of the COBRApy project, or rather of Python within openCOBRA. We suggest:
    1. A steering committee (3 people?)
      1. How does one become a member?
      2. For how long?
      3. Responsibilities?
    2. Core developers (unlimited?)
      1. How does one become a member?
      2. For how long?
      3. Responsibilities?
  2. Discuss and decide on streamlining Python packages within openCOBRA organization
    1. Do we see a benefit in organizing packages similarly?
      1. Structuring packages in the same way (using the cookiecutter) might make it easier for devs to work on any project due to familiar tooling.
      2. Documentation is created in the same way with the openCOBRA Sphinx template
    2. Do we see a benefit and is the code of conduct committee willing to be responsible for all Python projects, i.e., apply a single code of conduct to the organization?
  3. Communication channels: Which ones to open, keep, or close?
    1. Google group
    2. Gitter channel
    3. GitHub issues
    4. GitHub discussions
    5. Synthetic Biology StackExchange
  4. How can we organize support for COBRApy? Who writes applications?
    1. Google Summer of Code (GSoC)
    2. Google Season of Docs
    3. Chan-Zuckerberg Initiative (CZI)
    4. NumFOCUS membership application
    5. Other funding sources?
  5. Updated publication for COBRApy.
    1. Who can contribute to writing a first draft?
    2. How do we handle contributions?
  6. What are big unmet needs in the community that COBRApy should address?
    1. Omics data integration (something like driven needs to become official)
  7. Personal pledge from Moritz: Mentor one person from an underrepresented minority group (URM) to become a core developer/contributor

We look forward to seeing you at the meeting.

Best regards,
Moritz & Niko on behalf of the COBRApy project