Eduardo Pareja Tobes
@eparejatobes
but a significant improvement
happy to expand/clarify anything
Eduardo Pareja Tobes
@eparejatobes
@SandraCastilloPriego I just opened a PR for this bio4j/bio4j-titan#70
feel free to comment there
or maybe just tell me a bit more about your project etc
with a bit more info it'd be much easier for me to help
also from a more general perspective
like whether Bio4j could be useful for your project
or how you could use it
etc
Eduardo Pareja Tobes
@eparejatobes
oh @cosimosimeoneTR I forgot
as I told you there
if there's interest in a Neo4j distribution
it would be really easy to do
We only need a working angulillos implementation for Neo4j
I played a bit with it one year ago
the only issue I found was indexes
the Neo4j API was in a transitioning state
maybe things improved a bit there, didn't check
but nothing insurmountable
cosimosimeoneTR
@cosimosimeoneTR
@eparejatobes Thanks a ton!
Yep, I have the same needs (HUGE datasets) and I was having trouble with a relatively small (20M nodes) dataset with Neo4j...
Sandra Castillo
@SandraCastilloPriego
I have several projects. One of them is related to protein similarities. We are trying to compute a big protein similarity matrix using all the proteins found in UniProt and other databases. With it we are planning to apply some clustering methods and do some data mining.
Sandra Castillo
@SandraCastilloPriego
Also, we are creating metabolic models of different organisms
Eduardo Pareja Tobes
@eparejatobes
@SandraCastilloPriego sorry, I had to leave yesterday
I see
yep, Bio4j sounds like a good fit for that
I think that you'd only need to import data locally
and use the Bio4j API
which is backend-independent
if you prefer to discuss it in private send me an email or private chat here
Fadel
@FadelBerakdar
Is there any Python API for Bio4j? If not, how much Java should I know before I start a simple Bio4j project?
Fadel
@FadelBerakdar
btw I'm an undergraduate student and I chose a topic related to Bio4j for my final year project :D
sounds crazy but yeah I will do it :D
Alexey Alekhin
@laughedelic

@FadelBerakdar hi!
There is no Python API for Bio4j. But I believe that you don't really need any Java knowledge to start with Bio4j :wink:
The current version uses TitanDB v0.5.4 as a backend (see bio4j-titan repo). So you can interact with the database in a number of ways:

  • through the Bio4j Java API :sparkles:
  • through the raw TitanDB Java API
  • through the Tinkerpop/Gremlin API, for which, I think, there are various implementations including Python (probably this?).

Anyway, that may be good for getting started, but if you are really going to do something serious with Bio4j, it's worth using the Bio4j Java API (we are also working on a Scala version), because it lets you take advantage of the quite complex schema of the database :+1:

Fadel
@FadelBerakdar
thanks @laughedelic,
I still have some simple questions if u don't mind :)
Fadel
@FadelBerakdar
like
I couldn't find a RefSeq module in bio4j-lite or bio4j-full!
Is 500 GB the size of bio4j-lite.tar or of Bio4j-lite itself? Sorry if my questions are quite naive, I'm just trying to figure out exactly how much AWS will cost me :'(
Alexey Alekhin
@laughedelic
@FadelBerakdar, I think it's better to ask @eparejatobes about AWS
Eduardo Pareja Tobes
@eparejatobes
sorry didn't get notifications :(
@FadelBerakdar
About AWS
the required size would be less than 1TB for full
speaking from memory though
so could be wrong
:)
about costs in general:
  1. if you work in eu-west-1, all data transfer is free
  2. you can always find instance types with enough (ephemeral) instance store space so that Bio4j will fit there
in that scenario, your costs will only be EC2-based.
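A rough sketch of that arithmetic, in case it helps with budgeting. Everything here is an assumption for illustration: the function names are made up, and the hourly rate and sizes are placeholders, not real AWS prices or measured Bio4j sizes, so plug in the current numbers for whatever instance type you pick.

```python
def fits_on_instance_store(dataset_gb, instance_store_gb):
    """Return True if the dataset fits in the instance's ephemeral storage."""
    return dataset_gb <= instance_store_gb


def estimate_ec2_cost(hourly_rate_usd, hours, data_transfer_usd=0.0):
    """Estimate total cost as instance-hours times hourly rate, plus transfer.

    data_transfer_usd defaults to 0, matching the scenario where
    transfer is free and only EC2 time is billed.
    """
    if hourly_rate_usd < 0 or hours < 0 or data_transfer_usd < 0:
        raise ValueError("inputs must be non-negative")
    return hourly_rate_usd * hours + data_transfer_usd


if __name__ == "__main__":
    # Placeholder values only: a 1000 GB upper bound for the dataset,
    # a hypothetical instance with 1600 GB of instance store at 0.50 USD/hour.
    print(fits_on_instance_store(1000, 1600))
    print(estimate_ec2_cost(0.50, 40))
```

Since instance store is ephemeral, the data has to be fetched again on each fresh instance, so the hours should include import time as well as analysis time.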
Fadel
@FadelBerakdar
sounds great ... thanks @eparejatobes
Fadel
@FadelBerakdar
hey there :)
I just have one simple question and I couldn't find an answer ... how is the data being updated?
Fadel
@FadelBerakdar
I mean, how can I update the data after transferring it to my AWS instance?