These are chat archives for bio4j/bio4j

13th
Oct 2015
Sandra Castillo
@SandraCastilloPriego
Oct 13 2015 07:50
hey @eparejatobes
I used Neo4j a year ago, before going on maternity leave, and I've been completely disconnected from the world until now.
I don't know anything about Titan
and I can program in Java
cosimosimeoneTR
@cosimosimeoneTR
Oct 13 2015 09:24
Hi all, hi @eparejatobes, how are you?
Following your mail ("Is bio4j-neo4j still alive"), I'd like to ask what made you prefer Titan over the Neo4j database...
Answer me in private (and/or by mail) if you prefer.
Thank you very much, ciao!
Eduardo Pareja Tobes
@eparejatobes
Oct 13 2015 13:38
@SandraCastilloPriego OK fine. Then the easiest route for you would be to start with the Titan-backed version. Give me a sec and I'll review the docs for it
Sandra Castillo
@SandraCastilloPriego
Oct 13 2015 13:47
is Titan an alternative to Neo4j? or is it based on it?
why is it better?
Eduardo Pareja Tobes
@eparejatobes
Oct 13 2015 13:56
Titan is a different graph database engine
About whether it's better or not
I do think it's better for what we need
:)
@cosimosimeoneTR
about our choice
happy to discuss it here, of course
Eduardo Pareja Tobes
@eparejatobes
Oct 13 2015 14:01
we just had a lot of scalability issues with Neo4j
the Neo4j data storage design was IMHO a bit flawed
in that typing information
like edge labels etc
didn't affect data layout
we waited for ~2 years I think
for fixes etc
and we were only getting Neo4j sales people offering discounts on licenses :-|
so when we saw Titan
we didn't look back
more types
local indexes
a saner data storage layout
Titan is still far from perfect for us
but a significant improvement
Eduardo Pareja Tobes
@eparejatobes
Oct 13 2015 14:11
happy to expand/clarify anything
@SandraCastilloPriego I just opened a PR for this bio4j/bio4j-titan#70
feel free to comment there
or maybe just tell me a bit more about your project etc
with a bit more info it'd be much easier for me to help
also from a more general perspective
like whether Bio4j could be useful for your project
or how you could use it
etc
Eduardo Pareja Tobes
@eparejatobes
Oct 13 2015 14:19
oh @cosimosimeoneTR I forgot
as I told you there
if there's interest in a Neo4j distribution
it would be really easy to do
We only need a working angulillos implementation for Neo4j
I played a bit with it one year ago
the only issue I found was indexes
the Neo4j API was in a transitioning state
maybe things improved a bit there, didn't check
but nothing insurmountable
cosimosimeoneTR
@cosimosimeoneTR
Oct 13 2015 14:42
@eparejatobes Thanks a ton!
Yep, I have the same needs (HUGE datasets) and I was having trouble with a relatively small (20M nodes) dataset with Neo4j...
Sandra Castillo
@SandraCastilloPriego
Oct 13 2015 14:43
I have several projects. One of them is related to protein similarities. We are trying to compute a big protein similarity matrix using all the proteins found in UniProt and other databases. We plan to apply some clustering methods to it and do some data mining.
Sandra Castillo
@SandraCastilloPriego
Oct 13 2015 14:57
Also, we are creating metabolic models of different organisms