    Hi team, I'm connecting to JG from a NodeJS TypeScript application, using the npm gremlin package. I'm struggling to connect to an existing graph (created from the Console with the command below). Is there some other recommended package, or can someone give some guidance, please?

    graph = JanusGraphFactory.build().
      set("storage.backend", "cql").
      set("storage.hostname", "jg-cassandra").
      open()

    The code I currently have:

        const gremlin = require('gremlin');
        const traversal = gremlin.process.AnonymousTraversalSource.traversal;
        const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
        const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin'));

    I've created a JG config file and added it to the container so that I can reference it when connecting to JG, but I can't see where to specify it.

    If I run the above code, I connect to another graph (believed to be in-memory) instead of the one created with the Console command above.
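For context (a sketch, not the asker's actual setup): with the Gremlin drivers, the graph you reach is chosen server-side, not in the connection URL. The server's `gremlin-server.yaml` opens the graph from a properties file and an init script binds the `g` traversal source. File names below are placeholders:

```yaml
graphs: {
  # opened with JanusGraphFactory when the server starts
  graph: conf/janusgraph-cql.properties
}
scriptEngines: {
  gremlin-groovy: {
    # the init script typically binds: globals << [g : graph.traversal()]
    scripts: [scripts/empty-sample.groovy]
  }
}
```

If the server binds more than one traversal source, the JavaScript driver can select one by name (if I remember the option correctly): `new DriverRemoteConnection(url, { traversalSource: 'g' })`.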

    Sanjeev Ghimire
    Hi, I am deploying JanusGraph to an OpenShift cluster. It deploys fine, but when I start the Gremlin server it gives me this error:
    cp: cannot create regular file '/etc/opt/janusgraph/janusgraph.properties': Permission denied
    cp: cannot create regular file '/etc/opt/janusgraph/gremlin-server.yaml': Permission denied
    chown: changing ownership of '/var/lib/janusgraph': Operation not permitted
    chown: changing ownership of '/etc/opt/janusgraph': Operation not permitted
    chmod: changing permissions of '/var/lib/janusgraph': Operation not permitted
    chmod: changing permissions of '/etc/opt/janusgraph': Operation not permitted
    chmod: cannot access '/etc/opt/janusgraph/*': No such file or directory
    grep: /etc/opt/janusgraph/janusgraph.properties: No such file or directory
    /usr/local/bin/docker-entrypoint.sh: line 52: /etc/opt/janusgraph/janusgraph.properties: Permission denied
    grep: /etc/opt/janusgraph/janusgraph.properties: No such file or directory
    /usr/local/bin/docker-entrypoint.sh: line 52: /etc/opt/janusgraph/janusgraph.properties: Permission denied
    Error: stat /etc/opt/janusgraph/gremlin-server.yaml: no such file or directory
      yq write [yaml_file] [path_expression] [value] [flags]
      write, w
    2 replies
    Any help is appreciated
    I have a 'Date' property formatted like this:
    gremlin> g.V().has("Patent", "title", textContains("Cups")).values('date').limit(10);
    ==>Tue Jan 03 00:10:00 UTC 1978
    ==>Sun Jan 28 00:08:00 UTC 2007
    ==>Tue Jan 27 00:10:00 UTC 1987
    ==>Wed Jan 10 00:04:00 UTC 2001
    ==>Sun Jan 17 00:08:00 UTC 2010
    ==>Tue Jan 05 00:10:00 UTC 2010
    ==>Thu Jan 28 00:09:00 UTC 2010
    ==>Wed Jan 04 00:09:00 UTC 2012
    ==>Wed Jan 09 00:12:00 UTC 2008
    ==>Wed Jan 24 00:04:00 UTC 2018
    Is there a way to search for data from a specific year or do I need to create a separate property field with just 'year'?
    Florian Hockmann
    @maahutch You can compare dates in Gremlin just like other data types. So, if you want to ensure that your date property is within a certain time range, you can use this traversal:
    g.V().has("Patent", "title", textContains("Cups")).has('date', within(startDate, endDate)).limit(10);
    where startDate and endDate are Date objects that represent the start and end of your interval, in this case the beginning and end of the year.
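To illustrate the suggestion above, a minimal sketch of computing the year boundaries (assuming the gremlin-python client, where plain `datetime` objects serialize as dates; the traversal in the comment reuses property names from the question and is not tested here):

```python
from datetime import datetime, timezone

def year_range(year):
    """Return (start, end) datetimes covering the given calendar year.
    end is an exclusive upper bound: the first instant of the next year."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return start, end

start, end = year_range(2010)
# A range check could then be expressed with standard TinkerPop predicates, e.g.:
# g.V().has("Patent", "title", textContains("Cups")) \
#      .has("date", P.gte(start)).has("date", P.lt(end))
```

Using an exclusive upper bound avoids off-by-one issues with the last instant of December 31.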
    Vinayak Shiddappa Bali
    Hi All,
    I want to enable JMX authentication for the Cassandra daemon. What changes are needed on the JanusGraph side to enable the authentication?
    Piotr Joński
    Hi guys, is there a way to customize the log format in JanusGraph? I would like to apply our company-standard formatting so we can browse all logs in Kibana in a single place.
    I am not interested in workarounds like a Logstash pipeline that changes the log format on the fly; that I know already.
    I am running JanusGraph in k8s, from the Docker image. Configuration is done by mounting configuration files in the proper places.
    3 replies
    Razi Kheir
    Guys, regarding graph management: I can't find in the Javadocs or the JanusGraph docs when exactly to roll back management (not a transaction, i.e. mgmt.rollback()) and what specific effect it actually has. Can anyone please explain this or give me a link to where it is explained?
    Vinayak Shiddappa Bali

    Hi All,

    The schema consists of A, B as nodes, and E as an edge with some other nodes and edges.
    A: 183468
    B: 437317
    E: 186513

    Query: g.V().has('property1', 'A').as('v1').outE().has('property1','E').as('e').inV().has('property1', 'B').as('v2').select('v1','e','v2').dedup().count()
    Output: 200166
    Time Taken: 1min

    Query: g.V().has('property1', 'A').aggregate('v').outE().has('property1','E').aggregate('e').inV().has('property1', 'B').aggregate('v').select('v').dedup().as('vertexCount').select('e').dedup().as('edgeCount').select('vertexCount','edgeCount').by(unfold().count())
    Output: ==>[vertexCount:383633,edgeCount:200166]
    Time: 3.5 mins
    Property1 is the index.
    How can I optimize these queries? Minutes for a count query is not acceptable. Please suggest different approaches.

    Thanks & Regards,

    Matthias Leinweber
    what else could be a reason that my index doesn't get registered/installed/disabled, although I cleaned up all connections and transactions?
    Matthias Leinweber
    and I am pretty confused about MapReduce jobs and Spark/Hadoop connectivity. I am looking for more complete examples, or someone who has some time to answer stupid questions :)
    Ben Doan
    Hey everyone - has anyone here successfully implemented SparkGraphComputer for OLAP on AWS EMR? I'm trying to implement it using JanusGraph 0.5.3 on AWS EMR release 5.23.0 and I'm running into some dependency issues.
    Ben Doan

    I am currently trying to set up SparkGraphComputer using JanusGraph with a CQL storage backend and an Elasticsearch index backend, and am receiving an error when running a simple vertex-count traversal in the Gremlin console:

    gremlin> hadoop_graph = GraphFactory.open('conf/hadoop-graph/olap/olap-cassandra-HadoopGraph-YARN.properties')
    gremlin> hg = hadoop_graph.traversal().withComputer(SparkGraphComputer)
    gremlin> hg.V().count()
    ERROR org.apache.spark.SparkContext - Error initializing SparkContext. java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$GetNewApplicationRequestProto cannot be cast to org.apache.hadoop.hbase.shaded.com.google.protobuf.Message

    Relevant cluster details

    • JanusGraph Version 0.5.3
    • Spark-Gremlin Version 3.4.6
    • AWS EMR Release 5.23.0
    • Spark Version 2.4.0
    • Hadoop 2.8.5
    • Cassandra/CQL version 3.11.10

    Implementation Details

    Properties File

    metrics.enabled=false
    ### Storage - CQL
    ### Hosts and ports
    ### InputFormat configuration
    ### SparkGraphComputer configuration
    ### Spark job configurations
    ### Gremlin and serializer configuration
    ### Special YARN configuration (WIP)
    ### Spark driver and executors
    Sanjeev Ghimire
    Question: my JanusGraph is hosted in an OpenShift cluster. I have a Python app that can connect to it and load data successfully, but I can't connect to the server using the Gremlin console; I get permission denied. Also, the data is not replicated on all the servers.
    Any idea why? Any help is appreciated
    Piotr Joński
    hi, quick question:
    we are testing things and don't want to spin up a full cluster with Elasticsearch, so we use Lucene indexing as the default.
    will that work with a multi-node JanusGraph cluster? (we have 2 instances)
    2 replies
    Hi, has anyone already tried to add a custom lib to JanusGraph? I created an empty class in a TestClass.java file in the org.test package, compiled it into a jar, and added it to my JANUS_HOME/lib folder. However, when I try to import my class with :i org.test.TestClass, it is not found. Should I do something special?
    I found a funny bug:
    These two requests look exactly the same, but they are not: they do not produce the same result, and if I replay them with the up arrow and Enter they still differ. I checked that there is no hidden special character.
    This occurs only with numbers combined with a whitespace; at least I could not reproduce it with text containing letters plus whitespace.
    On the second request I typed the value digit by digit on my keyboard, while on the first request I copied/pasted the value from the Gremlin console after doing a .values()
    My bad: Notepad detected that the whitespace character is not the same.
    Sanjeev Ghimire
    why is JanusGraph not replicating data to all pods?
    we are using all default settings
    when queried from an app, we don't get all the results
    I would like to ask how JanusGraph retrieves the properties of vertices. Take the following query for example: g.V(12345).out().has("Name", "abc"), where V(12345) has 100 or so neighbors. It seems that JanusGraph fetches the properties of these vertices sequentially. Am I missing something, or is this done intentionally? I also found a related option, query.batch-property-prefetch. After it is turned on, JanusGraph can prefetch properties in parallel, but with a drawback: it seems to fetch all the properties of a vertex even though the filters in later steps only need a single property. Am I missing anything?
    16 replies
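For reference, the option mentioned above is an ordinary JanusGraph setting; a sketch of enabling it in janusgraph.properties:

```properties
# Prefetch properties of all adjacent vertices in one multi-vertex query
# (with the trade-off described above: all properties are fetched,
# not just the one needed by later filter steps)
query.batch-property-prefetch=true
```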
    Sanjeev Ghimire

    why is JanusGraph not replicating data to all pods?

    any help on this?

    7 replies
    Piotr Joński

    hi guys,
    question about

                    .serializer(new GraphSONMessageSerializerV2d0(GraphSONMapper.build().addRegistry(JanusGraphIoRegistry.instance())))

    if we use a k8s Service (the Service acts as a load balancer), shall we configure a single ContactPoint or put the IPs of our 3 JanusGraph pods there? (whatever the ContactPoint is, I assume it is the same as the URL; is that correct?)
    and the next question: if we specify 3 ContactPoints (all pods directly), what will be the difference? is it only for client-side load balancing, or is some additional behaviour expected?

    Piotr Joński

    did anybody in the world, or at least in this channel :), try to scale up JanusGraph? I have serious issues with that and have been struggling for a few days already.
    it always returns a connection timeout if we have more than 1 replica :/
    i try to deploy it to k8s; sometimes a single pod works, sometimes it throws exceptions about connections.
    how do I set it up properly? I have read the articles from the JanusGraph website (multi-node) and nothing helps. do you have any examples of working configurations for a multi-node JanusGraph deployment?


    Florian Hockmann
    Hey Piotr, regarding your first question: You can use a load balancer in front of JanusGraph and then pass it as a contact point to the driver. You just need to make sure that the load balancer supports Websockets and ensures that subsequent requests on the same connection will be forwarded to the same endpoint. Nginx supports this for example. Not sure whether the k8s load balancer supports this or whether it's more "low level".
    If you however provide the driver directly with the IPs of the JanusGraph pods, then it should do the load balancing itself. That might be an option if your environment doesn't change much (e.g., if you scale JanusGraph up or down, you would need to change this in the driver)
    5 replies
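The second option Florian describes (giving the driver all pod IPs and letting it balance across them) can be approximated even with a driver that accepts a single endpoint, by rotating through the endpoints client-side. A minimal sketch; the endpoint addresses are placeholders, not from this chat:

```python
from itertools import cycle

class EndpointRotator:
    """Round-robin over a fixed list of Gremlin endpoints
    (simple client-side load balancing)."""
    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("need at least one endpoint")
        self._endpoints = cycle(endpoints)

    def next_endpoint(self):
        # Each call yields the next endpoint, wrapping around at the end.
        return next(self._endpoints)

rotator = EndpointRotator([
    "ws://10.0.0.1:8182/gremlin",
    "ws://10.0.0.2:8182/gremlin",
    "ws://10.0.0.3:8182/gremlin",
])
```

As noted above, the trade-off is that scaling the pod set up or down requires updating this list, whereas a k8s Service hides membership changes (provided it handles long-lived websocket connections).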
    What exception do you get if you try it with just one pod for JanusGraph? I would definitely try to get that working before scaling up as you then get the added complexity from the load balancer
    5 replies
    Hope someone can chime in on my question above. The key question is why JanusGraph fetches properties sequentially, which hurts latency to a great degree.
    Sanjeev Ghimire
    Has anyone integrated Cassandra with JanusGraph?
    11 replies
    Piotr Joński

    hi guys,
    could you elaborate on how to connect from a Java application to JanusGraph, please?
    i found an SO question and some answers: https://stackoverflow.com/questions/45673861/how-can-i-remotely-connect-to-a-janusgraph-server
    but I cannot fully get it, and nowadays it looks really awkward to send "code as a string" to a server.

    i tried to find a description of that problem and its solutions in the official docs -- https://docs.janusgraph.org/ -- but did not manage to :sad:
    the docs are great, but sometimes I feel like I'm reading an abstract view of a specific problem, without a description of specific solutions. after reading I have the feeling that JanusGraph is supposed to be used only manually from the Gremlin console, instead of automatically from a Java application.

    could someone shed more light on that topic, please?
    perhaps update the docs with examples, or point to examples on SO or any other pages?
    thank you :thumbsup:

    1 reply
    Florian Cäsar

    Hi, Gremlin/Janus question (though mostly Gremlin-related). I'm injecting vertices & edges in a Gremlin language variant for bulk insertion. To add the edges, I reference the connected vertices by their id using the V-step (i.e. V(<id>)). Since I'm adding the edges from an injected array of maps like [from:id, to:id], I need to dynamically look up vertices by their id from within the traversal, e.g. with select('from'). In essence, I want:

     g.inject([[from:vertex_id_1,to:vertex_id_2], ...])

    However, this doesn't work, since the outer map containing from/to isn't available inside the vertex lookup step. Something like this must be possible, but I haven't been able to figure out how. Any ideas?

    3 replies
    Marie Diana Tran

    Hi there,

    I have a setup with JanusGraph Server, with BigTable as the storage backend and a remote Elasticsearch for search indexing.
    Following the documentation, I managed to get a remote traversal using Gremlin against that JanusGraph server.

    Now, I want to run one-off scripts to manage the graph schema and set up indexes.
    I understood that I have to use janusgraph-core and JanusGraphManagement.
    Nevertheless, I could not manage to set up the connection to the JanusGraph server:

        JanusGraphFactory.Builder config = JanusGraphFactory.build();
        config.set("storage.backend", "hbase");
        config.set("storage.hbase.ext.hbase.client.connection.impl", "com.google.cloud.bigtable.hbase2_x.BigtableConnection");
        config.set("storage.hbase.ext.google.bigtable.project.id", "xxx");
        config.set("storage.hbase.ext.google.bigtable.instance.id", "xxx");
        config.set("index.search.backend", "elasticsearch");
        config.set("index.search.hostname", "xxx:9200");
        JanusGraph graph = config.open();
        GraphTraversalSource g = graph.traversal();
        JanusGraphManagement mgmt = graph.openManagement();

    I get the following debug log:

    16:12:36.914 [main] DEBUG o.j.d.c.BasicConfiguration - Ignored configuration entry for storage.hbase.ext.hbase.client.connection.impl since it does not map to an option
    java.lang.IllegalArgumentException: Unknown configuration element in namespace [root.storage.hbase.ext]: client
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.ConfigElement.parse(ConfigElement.java:177) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.BasicConfiguration.getAll(BasicConfiguration.java:93) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:59) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:122) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory$Builder.open(JanusGraphFactory.java:261) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at live.yubo.JanusGraphApp.main(JanusGraphApp.java:44) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
    16:12:36.915 [main] DEBUG o.j.d.c.BasicConfiguration - Ignored configuration entry for storage.hbase.ext.google.bigtable.project.id since it does not map to an option
    java.lang.IllegalArgumentException: Unknown configuration element in namespace [root.storage.hbase.ext]: bigtable
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:164) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.ConfigElement.parse(ConfigElement.java:177) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.diskstorage.configuration.BasicConfiguration.getAll(BasicConfiguration.java:93) ~[jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.graphdb.configuration.builder.GraphDatabaseConfigurationBuilder.build(GraphDatabaseConfigurationBuilder.java:59) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:161) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:132) [jg-tester-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
        at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:122) [jg-tester-1.0-SNAPSHOT-jar-wit

    Can anyone help?

    Venkat Dasari
    Hi, a Janus architecture question. We are starting our evaluation of JanusGraph. From an architecture standpoint, can the graph be built by loading data directly from HDFS, and does it support Parquet? Even if it doesn't support Parquet that's fine, but will it be able to read the data from HDFS directly?
    Philipp Kraus
    Hello, I'm using JanusGraph with Python and Java for some graph data storage, but now I would like to use JanusGraph in Python as a knowledge graph with logical or probabilistic reasoning. Is there any support for logical reasoning without writing the unification myself? And a general question: can I define inheritance on the vertex label types, e.g. label "car" inherits from "vehicle"?
    1 reply
    Vinayak Shiddappa Bali
    Hi All, 
    The Data Model of the graph is as follows:
    Label: Node1, count: 130K
    Label: Node2, count: 183K
    Label: Node3, count: 437K
    Label: Node4, count: 156
    Node1 to Node2 Label: Edge1, count: 9K
    Node2 to Node3 Label: Edge2, count: 200K
    Node2 to Node4 Label: Edge3, count: 71K
    Node4 to Node3 Label: Edge4, count: 15K
    Node4 to Node1 Label: Edge5 , count: 1K
    The Count query used to get vertex and edge count :
    g2.V().has('title', 'Node2').aggregate('v').outE().has('title','Edge2').aggregate('e').inV().has('title', 'Node3').aggregate('v').select('v').dedup().as('vertexCount').select('e').dedup().as('edgeCount').select('vertexCount','edgeCount').by(unfold().count())
    This query takes around 3.5 mins to execute and the output returned is as follows:
    The problem is that traversing the edges takes more time:
    g.V().has('title','Node3').dedup().count() takes 3 sec to return 437K nodes.
    g.E().has('title','Edge2').dedup().count() takes 1 min to return 200K edges.
    In some cases, subsequent calls are faster due to cache usage.
    I also considered the in-memory backend, but the data is large and I don't think that will work. Is there any way to cache the result on the first execution of a query? Or any approach to load the graph from the CQL backend into memory to improve performance?
    Please help me improve the performance; a count query should not take this much time.
    Janusgraph : 0.5.2
    Storage: Cassandra cql
    The server specification is high and that is not the issue.
    Thanks & Regards,
    Mohammad Alian
    Hi, we currently call /?gremlin=g.V().none() to check if JG is up and has a healthy connection to its storage. Calling /?gremlin=graph.open doesn't check whether the storage connection is healthy. Is there any other way to do this?
    3 replies
    The JG docs pages on schema management recommend against spaces and special characters in PropertyKey names, but not in edge or vertex label names. Any particular reason for this? We were hoping to standardise on lower case with underscores; now it looks like we need e.g. camelCase on property keys. Does anyone know if this is more than a recommendation, e.g. some impact on performance or indexing?
    2 replies
    Michael Wilson

    Hey all. Hopefully this question isn't too noobish. I'm currently just getting started with JanusGraph, and I'm trying to get a general feel for failure modes. In particular, we're building out a service where various teams will have their own individual spaces that we want to protect from one another while still maintaining queryability. For example, if team 1 accidentally does something that takes down their space in JanusGraph, we want to ensure that teams 2, 3, 4, and 5 are not affected and their jobs/processes still continue.

    That being said, I'm not sure if this question is valid or realistic, so a gut check would be very much appreciated here.

    1 reply
    Vinayak Shiddappa Bali

    Hi All,

    Trying to implement OLAP to improve the performance of count queries.
    I referred to the document above; it is still not working.
    Error: Spark master not responding, although it is running.
    Error while invoking RpcHandler #receive() for one-way message, while the Spark job is hosted on JBoss and trying to connect to the master.


    Florian Cäsar

    Hi, I'm experiencing a strange no-response bug somewhere between Gremlin Python, JanusGraph and Gremlin Server:
    Some long traversals (with thousands of instructions) don't get a response. And I don't mean that they time out or that they get an error back; I mean they get nothing back whatsoever. No log entries at the server, nothing in the client.

    The threshold for this silent treatment isn't clear; it doesn't simply depend on bytes or number of instructions. For instance, this code will hang forever for me:

    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

    g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin', 't'))
    g = g.inject("")
    for i in range(0, 8000):
        g = g.constant("test")
    print(f"submitting traversal with length={len(g.bytecode.step_instructions)}")
    result = g.next()
    print(f"done, got: {result}")  # this is never reached

    I know it doesn't depend on just the number of bytes because the number of instructions beyond which this happens doesn't change even with very large strings instead of just "test".

    Just in case, I've already increased server-side maxHeaderSize, maxChunkSize, maxContentLength etc. to ridiculously high numbers. No change.

    Any ideas what I'm doing to deserve the silent treatment? This is driving me insane.

    3 replies
    Florian Cäsar

    I'm trying to use JanusGraph's full-text predicates in the gremlin-python client library. Using GraphSON serializer, I can just use a predicate with e.g. "textContains" as the operator and it works since JanusGraphIoRegistryV1d0 registers its custom deserializer for P objects:

    addDeserializer(P.class, new JanusGraphPDeserializerV2d0());

    However, as of v0.5.3, JanusGraph does not register any deserializers for GraphBinary (though that feature is already on the master branch). This means that when I submit the exact same traversal with P("textContains", "string") in GraphBinary format, I get:

    org.apache.tinkerpop.gremlin.server.handler.OpSelectorHandler  - Invalid OpProcessor requested [null]

    I presume this is because the "textContains" predicate isn't registered. Weirdly enough, in my Groovy console, the same traversal works fine even though it also uses graphbinary (according to the configuration).

    There are a couple options here and I don't have enough information on any of them, so I would appreciate input:

    1. Figure out what the Groovy console is doing differently and use that in the Python library
    2. Use a Docker image from master branch and adapt the Python library to use the new custom JanusgraphP type in graphbinary
    3. Use two separate clients with different serializations depending on which traversal I need to run (yuck)

    Note: I've already tested https://github.com/JanusGraph/janusgraph-python; it does the same thing I do manually and thus only works with GraphSON.
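To make the GraphSON path concrete: a P predicate travels on the wire roughly in the shape below (my reconstruction of the GraphSON 3.0 envelope, not taken from a server dump), which is why a client can emit a custom operator name like textContains without any client-side type registration:

```python
import json

# Hypothetical reconstruction of how a P predicate is framed in GraphSON 3.0;
# the operator name is just a string inside the "@value" envelope.
predicate = {
    "@type": "g:P",
    "@value": {"predicate": "textContains", "value": "Cups"},
}

wire = json.dumps(predicate)   # what the client embeds in its bytecode request
decoded = json.loads(wire)     # what the server-side P deserializer receives
```

GraphBinary, by contrast, is a typed binary format, so an unregistered predicate type cannot be framed this loosely, which matches the behaviour described above.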


    ok, found - configuration options:

    schema.constraints: 'true'
    schema.default: none

    How can these options be set on an already created graph? I can do it with ConfiguredGraphFactory, but what about a running graph?

    3 replies