    I'm looking for a solution to detect changes in the graph DB and push all the mutations to a queue for further processing. I found that JanusGraph supports a trigger log, but I could not find any example of how to read that log. Secondly, is there a way to detect changes without using the trigger log, i.e. without knowing the name of the log?
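    For the trigger log part: JanusGraph's transaction log is consumed through the LogProcessorFramework. The sketch below follows the documented pattern; the log name "myLog", the open graph instance `graph`, and the println placeholder are assumptions:

    ```groovy
    // Assumes mutating transactions were started with
    //   tx = graph.buildTransaction().logIdentifier("myLog").start()
    // so their changes are written to the user transaction log named "myLog".
    logProcessor = JanusGraphFactory.openTransactionLog(graph)
    logProcessor.addLogProcessor("myLog").
        setProcessorIdentifier("myLogProcessor").
        setStartTimeNow().
        addProcessor(new ChangeProcessor() {
            @Override
            void process(JanusGraphTransaction tx, TransactionId txId, ChangeState changeState) {
                // this is where mutations could be pushed to a queue instead of printed
                changeState.getVertices(Change.ANY).each { v -> println "changed vertex: ${v}" }
            }
        }).
        build()
    ```

    Without a log identifier on the writing transactions there is no built-in change feed, so the alternatives are reading the log as above or intercepting writes in the application layer.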
    Matthias Leinweber
    we have a mixed index (ES), but when we try to query a property within the index we get the message that queries would be faster with an index. what could be the reason?
    ... the property is camelCase, if that could be an issue
    Matthias Leinweber
    ok, i don't exactly know why it's working now.. but sometimes querying the index is slower: several seconds instead of ms. how could i investigate this?
    Saurabh Verma
    Multiple vertices generated for the same index value, and vertex properties missing, with RF3
    with RF3 (replication factor 3) we are seeing index corruption, which is not seen with RF1

    Hi team, I'm connecting to JG from a NodeJS TypeScript application, making use of the npm gremlin package. I'm struggling to connect to an existing graph (created from the Console with the command below). Is there some other recommended package, or can someone perhaps give some guidance, please?

    graph = JanusGraphFactory.build().
      set("storage.backend", "cql").
      set("storage.hostname", "jg-cassandra").

    The code I currently have:

        const traversal = gremlin.process.AnonymousTraversalSource.traversal;
        const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
        const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin'));

    I've created a JG config file and added it to the container so that I can reference it when connecting to JG, but I don't see where to do this?

    If I run the above code, I connect to another (believed to be in-memory) graph, instead of the one created with the Console command above.
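    One thing worth checking: which graph a remote client sees is decided by the server, not by the driver. gremlin-server.yaml must point at the CQL-backed graph's properties file (otherwise the default sample graph is served), and an init script usually binds it to a traversal source. Paths and names below are illustrative:

    ```yaml
    # gremlin-server.yaml (server side): serve the CQL-backed graph
    # instead of the default sample graph
    graphs: {
      graph: conf/janusgraph-cql.properties
    }
    ```

    On the client side, the npm gremlin driver can select a non-default traversal source via `new DriverRemoteConnection('ws://localhost:8182/gremlin', { traversalSource: 'g' })` if the server exposes it under another name.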

    Sanjeev Ghimire
    Hi, I am deploying JanusGraph to an OpenShift cluster. It deploys fine, but when I start the Gremlin server it gives me this error:
    cp: cannot create regular file '/etc/opt/janusgraph/janusgraph.properties': Permission denied
    cp: cannot create regular file '/etc/opt/janusgraph/gremlin-server.yaml': Permission denied
    chown: changing ownership of '/var/lib/janusgraph': Operation not permitted
    chown: changing ownership of '/etc/opt/janusgraph': Operation not permitted
    chmod: changing permissions of '/var/lib/janusgraph': Operation not permitted
    chmod: changing permissions of '/etc/opt/janusgraph': Operation not permitted
    chmod: cannot access '/etc/opt/janusgraph/*': No such file or directory
    grep: /etc/opt/janusgraph/janusgraph.properties: No such file or directory
    /usr/local/bin/docker-entrypoint.sh: line 52: /etc/opt/janusgraph/janusgraph.properties: Permission denied
    grep: /etc/opt/janusgraph/janusgraph.properties: No such file or directory
    /usr/local/bin/docker-entrypoint.sh: line 52: /etc/opt/janusgraph/janusgraph.properties: Permission denied
    Error: stat /etc/opt/janusgraph/gremlin-server.yaml: no such file or directory
      yq write [yaml_file] [path_expression] [value] [flags]
      write, w
    2 replies
    Any help is appreciated
    I have a 'Date' property formatted like this:
    gremlin> g.V().has("Patent", "title", textContains("Cups")).values('date').limit(10);
    ==>Tue Jan 03 00:10:00 UTC 1978
    ==>Sun Jan 28 00:08:00 UTC 2007
    ==>Tue Jan 27 00:10:00 UTC 1987
    ==>Wed Jan 10 00:04:00 UTC 2001
    ==>Sun Jan 17 00:08:00 UTC 2010
    ==>Tue Jan 05 00:10:00 UTC 2010
    ==>Thu Jan 28 00:09:00 UTC 2010
    ==>Wed Jan 04 00:09:00 UTC 2012
    ==>Wed Jan 09 00:12:00 UTC 2008
    ==>Wed Jan 24 00:04:00 UTC 2018
    Is there a way to search for data from a specific year or do I need to create a separate property field with just 'year'?
    Florian Hockmann
    @maahutch You can compare dates in Gremlin just like other comparable types. So, if you want to ensure that your date property is within a certain time range, you can use this traversal:
    g.V().has("Patent", "title", textContains("Cups")).has('date', between(startDate, endDate)).limit(10);
    where startDate and endDate are Date objects that represent the start and end of your interval, so the beginning and end of the year in this case (note: between(start, end) is the range predicate; within(a, b) would test membership in the set {a, b})
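    As a concrete Console sketch for a whole year (2010 here is just an example; between is inclusive of the start and exclusive of the end):

    ```groovy
    // start and end of 2010 as java.util.Date values
    startDate = new GregorianCalendar(2010, Calendar.JANUARY, 1).getTime()
    endDate   = new GregorianCalendar(2011, Calendar.JANUARY, 1).getTime()
    g.V().has("Patent", "title", textContains("Cups")).
          has('date', between(startDate, endDate)).
          limit(10)
    ```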
    Vinayak Shiddappa Bali
    Hi All,
    I want to enable JMX authentication for the Cassandra daemon. What changes are needed on the JanusGraph side to enable the authentication?
    Piotr Joński
    Hi guys, is there a way to customize the log format in JanusGraph? i would like to apply our company-standard formatting so we can browse all logs in Kibana in a single place.
    i am not interested in workarounds like a Logstash pipeline that changes the log format on the fly -- that i know already
    I am running JanusGraph in k8s from the Docker image; configuration is done by mounting configuration files in the proper places.
    3 replies
    Razi Kheir
    Guys, regarding graph management: I can't find in the Javadocs or the JanusGraph docs when exactly to roll back management (not a transaction, i.e. mgmt.rollback()) and what specific effect it actually has. Can anyone explain this or give me a link to where it is explained?
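    What the docs imply, in short: openManagement() starts a management transaction that can hold schema locks; commit() persists the pending schema changes, while mgmt.rollback() discards them and releases the locks, which matters because a dangling open management transaction can block index status changes. A typical pattern (sketch; the property key is an example):

    ```groovy
    mgmt = graph.openManagement()
    try {
        mgmt.makePropertyKey('name').dataType(String.class).make()
        mgmt.commit()      // persist schema changes
    } catch (Exception e) {
        mgmt.rollback()    // discard pending changes and release schema locks
        throw e
    }
    ```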
    Vinayak Shiddappa Bali

    Hi All,

    The schema consists of A, B as nodes, and E as an edge with some other nodes and edges.
    A: 183468
    B: 437317
    E: 186513

    Query: g.V().has('property1', 'A').as('v1').outE().has('property1','E').as('e').inV().has('property1', 'B').as('v2').select('v1','e','v2').dedup().count()
    Output: 200166
    Time Taken: 1min

    Query: g.V().has('property1', 'A').aggregate('v').outE().has('property1','E').aggregate('e').inV().has('property1', 'B').aggregate('v').select('v').dedup().as('vertexCount').select('e').dedup().as('edgeCount').select('vertexCount','edgeCount').by(unfold().count())
    Output: ==>[vertexCount:383633,edgeCount:200166]
    Time: 3.5 mins
    Property1 is the indexed property.
    How can I optimize these queries? Minutes for a count query is not acceptable. Please suggest different approaches.
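    One way to find out where those minutes go is the profile() step, which reports per-step timings and whether each has() was answered by an index, e.g.:

    ```groovy
    g.V().has('property1', 'A').
          outE().has('property1', 'E').
          inV().has('property1', 'B').
          count().
          profile()
    ```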

    Thanks & Regards,

    Matthias Leinweber
    what else could be a reason that my index doesn't get registered/installed/disabled, although I close all connections and transactions?
    Matthias Leinweber
    and i am pretty confused about MapReduce jobs and Spark/Hadoop connectivity.. i am looking for more complete examples, or someone who has some time to answer stupid questions :)
    Ben Doan
    Hey everyone - has anyone here successfully implemented a SparkGraphComputer for OLAP on AWS EMR? I'm trying to implement it using JanusGraph 0.5.3 on AWS EMR Release version 5.23.0 and I'm running into some dependency issues
    Ben Doan

    I am currently trying to set up SparkGraphComputer using JanusGraph with a CQL storage and ElasticSearch Index backend, and am receiving an error when trying to complete a simple vertex count traversal in the gremlin console:

    gremlin> hadoop_graph = GraphFactory.open('conf/hadoop-graph/olap/olap-cassandra-HadoopGraph-YARN.properties')
    gremlin> hg = hadoop_graph.traversal().withComputer(SparkGraphComputer)
    gremlin> hg.V().count()
    ERROR org.apache.spark.SparkContext - Error initializing SparkContext. java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message

    Relevant cluster details

    • JanusGraph Version 0.5.3
    • Spark-Gremlin Version 3.4.6
    • AWS EMR Release 5.23.0
    • Spark Version 2.4.0
    • Hadoop 2.8.5
    • Cassandra/CQL version 3.11.10

    Implementation Details

    Properties File

    metrics.enabled = false
    ### Storage - CQL
    ### Hosts and ports
    ### InputFormat configuration
    ### SparkGraphComputer configuration
    ### Spark job configurations
    ### Gremlin and serializer configuration
    ### Special YARN configuration (WIP)
    ### Spark driver and executors
    Sanjeev Ghimire
    question: my JanusGraph is hosted in an OpenShift cluster. I have a Python app that can connect to it and load data successfully, but I can't connect to the server using the Gremlin Console; I get permission denied. Also, the data is not replicated on all the servers.
    Any idea why? Any help is appreciated
    Piotr Joński
    hi, quick question:
    we are testing things and don't want to spin up a full cluster with Elasticsearch, so we use Lucene indexing as the default.
    will that work with a multi-node JanusGraph cluster? (we have 2 instances)
    2 replies
    Hi, has anyone already tried to add a custom lib to JanusGraph? I created an empty class in a TestClass.java file with the package org.test, compiled it into a jar, and added it to my JANUS_HOME/lib folder. However, when I try to import my class with :i org.test.TestClass, it is not found. Should I do something special?
    I found a funny bug:
    These two requests look exactly the same, but they are not: they do not produce the same result, and if I replay them with the up arrow and Enter they still differ. I checked that there is no hidden special character.
    This occurs only with numbers combined with whitespace; at least I could not reproduce it with text containing letters + whitespace.
    In the second request I typed the value digit by digit on my keyboard, while in the first request I copied/pasted the value from the Gremlin console after doing a .values().
    My bad, Notepad detected that the whitespace character is not the same:
    Sanjeev Ghimire
    why is JanusGraph not replicating data on all pods?
    we are using all default settings
    when queried from an app, we don't get all the results
    I would like to ask how JanusGraph retrieves the properties of vertices. Take the following query as an example: g.V(12345).out().has("Name", "abc"), where V(12345) has 100 or so neighbors. It seems that JanusGraph fetches the properties of these vertices sequentially. Am I missing something, or is this intentional? I also found a related option, "query.batch-property-prefetch". After it is turned on, JanusGraph can prefetch properties in parallel, but with a drawback: it seems to fetch all the properties of a vertex even though the filters in later steps only need a single property. Am I missing anything?
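    For reference, the option mentioned above goes into the graph's properties file; it enables batched (parallel) property prefetching but, as observed, it fetches the whole property set of each touched vertex:

    ```properties
    # janusgraph.properties
    query.batch-property-prefetch=true
    ```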
    16 replies
    Sanjeev Ghimire

    why is janusgraph not replicating data on all pods?

    any help on this?

    7 replies
    Piotr Joński

    hi guys,
    question about

                    .serializer(new GraphSONMessageSerializerV2d0(GraphSONMapper.build().addRegistry(JanusGraphIoRegistry.instance())))

    if we use a k8s service (the service acts as a load balancer), shall we configure a single ContactPoint or put the 3 JanusGraph pod IPs there (we have 3 pods)? (whatever the ContactPoint is, I assume it is the same as the URL, is that correct?)
    and the next question: if we specify 3 ContactPoints (all pods directly), what will be the difference? is it only for client-side load balancing, or is some additional behaviour expected?

    Piotr Joński

    did anybody in the world, or at least in this channel :), try to scale up JanusGraph? i have serious issues with that, struggling for a few days already.
    it always returns a connection timeout if we have more than 1 replica :/
    i try to deploy it to k8s; sometimes a single pod works, sometimes it throws exceptions about connections.
    how do i set it up properly? i have read the articles from the JanusGraph website (multi-node) and nothing helps. do you have any examples of working configurations for a multi-node JanusGraph deployment?


    Florian Hockmann
    Hey Piotr, regarding your first question: You can use a load balancer in front of JanusGraph and then pass it as a contact point to the driver. You just need to make sure that the load balancer supports Websockets and ensures that subsequent requests on the same connection will be forwarded to the same endpoint. Nginx supports this for example. Not sure whether the k8s load balancer supports this or whether it's more "low level".
    If you however provide the driver directly with the IPs of the JanusGraph pods, then it should do the load balancing itself. That might be an option if your environment doesn't change much (e.g., if you scale JanusGraph up or down, you would need to change this in the driver)
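    For the second option, the TinkerPop driver's remote configuration can simply list all pod addresses and will balance requests across them; the IPs below are placeholders:

    ```yaml
    # remote-objects.yaml (client side)
    hosts: [10.0.0.11, 10.0.0.12, 10.0.0.13]
    port: 8182
    serializer: {
      className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
      config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }
    }
    ```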
    5 replies
    What exception do you get if you try it with just one pod for JanusGraph? I would definitely try to get that working before scaling up as you then get the added complexity from the load balancer
    5 replies
    Hope someone can chime in on my question above. The key question is why JanusGraph fetches properties sequentially, which hurts latency to a great degree.
    Sanjeev Ghimire
    Has anyone integrated Cassandra with JanusGraph?
    11 replies
    Piotr Joński

    hi guys,
    could you elaborate on how to connect from a Java application to JanusGraph, please?
    i found an SO question and some answers: https://stackoverflow.com/questions/45673861/how-can-i-remotely-connect-to-a-janusgraph-server
    but i cannot fully get it, and nowadays it looks really awkward to send "code as a string" to the server.

    i tried to find a description of this problem and its solutions in the official docs -- https://docs.janusgraph.org/ -- but did not manage :sad:
    the docs are great, but sometimes i feel like i'm reading an abstract view of a specific problem without a description of concrete solutions. after reading, i have the feeling that JanusGraph is supposed to be used only manually from the Gremlin Console, instead of automatically from a Java application.

    could someone shed more light on that topic, please?
    eventually update docs with examples, or point to examples to SO or any other pages?
    thank you :thumbsup:

    1 reply
    Florian Cäsar

    Hi, Gremlin/Janus question (though mostly Gremlin-related). I'm injecting vertices & edges in a Gremlin language variant for bulk insertion. To add the edges, I reference the connected vertices by their id using the Vertex step (i.e. V(<id>)). Since I'm adding the edges from an injected array of maps like [from:id, to:id], I need to dynamically look up vertices by their id from within the traversal, e.g. with select('from'). In essence, I want:

     g.inject([[from:vertex_id_1,to:vertex_id_2], ...])

    However, this doesn't work, since the outer map containing from/to isn't available inside the vertex lookup step. Something like this must be possible, but I haven't been able to figure out how. Any ideas?
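    One workaround that sidesteps the scoping problem is to issue the edge additions per pair with a mid-traversal V(), which accepts concrete ids; the `pairs` list and the 'link' label below are assumptions:

    ```groovy
    // pairs: a list of maps like [from: <vertexId>, to: <vertexId>]
    pairs.each { p ->
        g.V(p.from).as('a').
          V(p.to).
          addE('link').from('a').
          iterate()
    }
    ```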

    3 replies
    Marie Diana Tran

    Hi there,

    I have a setup with janusgraph server with BigTable as a storage backend and a remote Elasticsearch for search indexing.
    Following the documentation, I managed to have a remote traversal using Gremlin to the previous janusgraph server.

    Now, I want to occasionally run scripts to manage the graph schema and set up indexes.
    I understood that I have to use janusgraph-core and JanusGraphManagement.
    Nevertheless, I could not manage to set up the connection to the JanusGraph server

        JanusGraphFactory.Builder config = JanusGraphFactory.build();
        config.set("storage.backend", "hbase");
        config.set("storage.hbase.ext.hbase.client.connection.impl", "com.google.cloud.bigtable.hbase2_x.BigtableConnection");
        config.set("storage.hbase.ext.google.bigtable.project.id", "xxx");
        config.set("storage.hbase.ext.google.bigtable.instance.id", "xxx");
        config.set("index.search.backend", "elasticsearch");
        config.set("index.search.hostname", "xxx:9200");
        JanusGraph graph = config.open();
        GraphTraversalSource g = graph.traversal();
        JanusGraphManagement mgmt = graph.openManagement();

    I have the following debug log

    16:12:36.914 [main] DEBUG o.j.d.c.BasicConfiguration - Ignored configuration entry for storage.hbase.ext.hbase.client.connection.impl since it does not map to an option
    java.lang.IllegalArgumentException: Unknown configuration element in namespace [root.storage.hbase.ext]: client
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.ConfigElement.parse(ConfigElement.java:177) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.BasicConfiguration.getAll(BasicConfiguration.java:93) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:59) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:122) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory$Builder.open(JanusGraphFactory.java:261) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at live.yubo.JanusGraphApp.main(JanusGraphApp.java:44) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    16:12:36.915 [main] DEBUG o.j.d.c.BasicConfiguration - Ignored configuration entry for storage.hbase.ext.google.bigtable.project.id since it does not map to an option
    java.lang.IllegalArgumentException: Unknown configuration element in namespace [root.storage.hbase.ext]: bigtable
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.ConfigElement.parse(ConfigElement.java:177) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.BasicConfiguration.getAll(BasicConfiguration.java:93) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:59) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:122) [jg-tester-1.0-SNAPSHOT-jar-wit

    Can anyone help?

    Venkat Dasari
    Hi, JanusGraph architecture question. We are starting our evaluation of JanusGraph. From an architecture standpoint, can the graph be built by loading data directly from HDFS, and does it support Parquet? Even if it doesn't support Parquet that's fine, but will it be able to read the data from HDFS directly?
    Philipp Kraus
    Hello, I'm using JanusGraph with Python and Java for graph data storage, but now I would like to use JanusGraph in Python as a knowledge graph with logical or probabilistic reasoning. Is there any support for logical reasoning without writing the unification myself? And a general question: can I define inheritance on the vertex label types, e.g. label "car" inherits from "vehicle"?
    1 reply
    Vinayak Shiddappa Bali
    Hi All, 
    The Data Model of the graph is as follows:
    Label: Node1, count: 130K
    Label: Node2, count: 183K
    Label: Node3, count: 437K
    Label: Node4, count: 156
    Node1 to Node2 Label: Edge1, count: 9K
    Node2 to Node3 Label: Edge2, count: 200K
    Node2 to Node4 Label: Edge3, count: 71K
    Node4 to Node3 Label: Edge4, count: 15K
    Node4 to Node1 Label: Edge5 , count: 1K
    The Count query used to get vertex and edge count :
    g2.V().has('title', 'Node2').aggregate('v').outE().has('title','Edge2').aggregate('e').inV().has('title', 'Node3').aggregate('v').select('v').dedup().as('vertexCount').select('e').dedup().as('edgeCount').select('vertexCount','edgeCount').by(unfold().count())
    This query takes around 3.5 mins to execute and the output returned is as follows:
    The problem is that traversing the edges takes more time:
    g.V().has('title','Node3').dedup().count() takes 3 sec to return 437K nodes.
    g.E().has('title','Edge2').dedup().count() takes 1 min to return 200K edges.
    In some cases, subsequent calls are faster due to cache usage.
    I also considered the in-memory backend, but the data is large and I don't think that will work. Is there any way to cache the result on the first execution of a query? Or any approach to load the graph from the CQL backend into memory to improve performance?
    Please help me improve the performance; a count query should not take this long.
    JanusGraph: 0.5.2
    Storage: Cassandra (CQL)
    The server specification is high, so that is not the issue.
    Thanks & Regards,
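    If the same counts are queried repeatedly, JanusGraph's database-level cache may also help; it is configured in the properties file (values below are illustrative, and the cache trades cross-instance freshness for speed):

    ```properties
    # enable the shared database-level cache
    cache.db-cache = true
    # cache expiry in milliseconds
    cache.db-cache-time = 180000
    # fraction of the heap to use for the cache
    cache.db-cache-size = 0.25
    ```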