    Sumedh Wale
    @sumwale
    @XZhi Looked at the DDL. Can you avoid disabling persistence and leave it at the default? It is recommended to keep it always on for column tables, and I don't think disabling it is required for your use-case. We are currently not testing the no-persistence+overflow combination for column tables and will likely disallow it. We depend on data being available on disk for corner cases where data was just evicted from the region yet is required a bit later by an async operation listener/... from its queue.
    @XZhi Also disabling that means that you will lose data in case of restarts.
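
    The advice above can be sketched as DDL. This is a hedged example: the table and column names are illustrative, and the option names follow SnappyData's documented column-table OPTIONS; check the CREATE TABLE reference for your version.

    ```sql
    -- Keep persistence enabled (the default) and combine it with heap eviction
    -- and overflow, so evicted rows remain readable from disk for async
    -- listeners and survive restarts.
    CREATE TABLE app_events (
      id BIGINT,
      payload VARCHAR(1000)
    ) USING column OPTIONS (
      PARTITION_BY 'id',
      PERSISTENCE 'sync',           -- default; do not disable
      EVICTION_BY 'LRUHEAPPERCENT', -- evict under heap pressure
      OVERFLOW 'true'               -- evicted rows overflow to disk
    );
    ```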
    Michael Tao
    @XZhi
    @sumwale Ok, thanks very much for your advice, I will change it and then test it.
    jxwr
    @jxwr
    @sumwale We were loading test data into SnappyData (persistence+overflow), but every time the "Memory Usage" was near 80%, the dashboard stopped working properly.
    [image: dashboard screenshot]
    Then all tables disappeared
    This part changed to Tables(0)
    jxwr
    @jxwr
    After that, querying on some table, I get: ERROR 38000: (SQLState=38000 Severity=20000) (Server=rz-data-olap-test01/10.16.118.38[8414] Thread=pool-3-thread-1) The exception 'com.gemstone.gemfire.SerializationException: An IOException was thrown while deserializing' was thrown while evaluating an expression.
    Sumedh Wale
    @sumwale

    @jxwr Can you go to our slack channel (https://snappydata-public.slack.com/messages)? It is a lot more active and many more developers can respond there. I may not be able to respond for a few days.

    For this issue, can you upload server logs and provide a link?

    jxwr
    @jxwr
    OK
    jxwr
    @jxwr
    @sumwale how can I join the channel?
    Do I need an invitation?
    Sumedh Wale
    @sumwale
    @jxwr I think that should be a simple signup. @piercelamb can you help?
    jxwr
    @jxwr
    @sumwale My email: jxwr.cn@gmail.com
    Pierce Lamb
    @piercelamb
    yep, that's it
    Giridhar Addepalli
    @giridhara
    Hi, I am a SnappyData newbie
    we currently have data in Hive tables
    can you please share code pointers for importing data from a Hive table into the SnappyData store
    Sumedh Wale
    @sumwale

    @giridhara Spark has support for reading Hive: https://spark.apache.org/docs/2.1.1/sql-programming-guide.html#hive-tables . The issue is that you need SnappySession for column tables, but using the hive-metastore (enableHiveSupport) would lead to it being used for column tables too (rather than the inbuilt meta-store). So the best option for now is to take the RDD[Row] from the Dataset, then insert that into the column table. Something like:

    val ds = spark.table("hiveTable")
    val rdd = ds.rdd
    val session = new SnappySession(sparkContext)
    val df = session.createDataFrame(rdd, ds.schema)
    df.write.format("column").saveAsTable("columnTable")

    This can be made more efficient by using RDD[InternalRow] to avoid conversions from InternalRow to Row and back, using package-private APIs, if you place your classes (or helper methods) in the org.apache.spark.sql package.

    Michael Tao
    @XZhi

    @sumwale hi, we got an exception on the data server as below, and its status is always waiting.

    2017-08-31 15:19:17,094 DEBUG [pool-3-thread-47]: server.SnappyThriftServerThreadPool (SnappyThriftServerThreadPool.java:run(276)) - Thrift error occurred during processing of message.
    org.apache.thrift.transport.TTransportException: Channel closed.
    at io.snappydata.thrift.common.SnappyTSocket.read(SnappyTSocket.java:373)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:634)
    at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:501)
    at io.snappydata.thrift.server.SnappyDataServiceImpl$Processor.process(SnappyDataServiceImpl.java:196)
    at io.snappydata.thrift.server.SnappyThriftServerThreadPool$WorkerProcess.run(SnappyThriftServerThreadPool.java:270)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    Sumedh Wale
    @sumwale
    @XZhi check server logs or post a link here
    Michael Tao
    @XZhi
    ok, a moment please
    2017-08-31 15:18:50,241 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Recovering oplog#2 /opt/meituan/snappydata_work/servers/./datadictionary/BACKUPGFXD-DD-DISKSTORE_2.drf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,242 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Recovering oplog#1 /opt/meituan/snappydata_work/servers/./datadictionary/BACKUPGFXD-DD-DISKSTORE_1.drf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,243 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Recovering oplog#2 /opt/meituan/snappydata_work/servers/./datadictionary/BACKUPGFXD-DD-DISKSTORE_2.crf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,244 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Recovering oplog#1 /opt/meituan/snappydata_work/servers/./datadictionary/BACKUPGFXD-DD-DISKSTORE_1.crf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,249 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - recovery oplog load took 8 ms
    2017-08-31 15:18:50,261 INFO [Oplog Delete Task]: snappystore (GFToSlf4jBridge.java:put(73)) - Deleted oplog#2 crf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,269 INFO [Oplog Delete Task]: snappystore (GFToSlf4jBridge.java:put(73)) - Deleted oplog#2 drf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,277 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Created oplog#3 drf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,485 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - Created oplog#3 crf for disk store GFXD-DD-DISKSTORE.
    2017-08-31 15:18:50,564 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - recoverOplogs oplog: oplog#2 parent: GFXD-DD-DISKSTORE, needskrf: true
    2017-08-31 15:18:50,564 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - createKrfAsync called for oplog: oplog#2, parent: GFXD-DD-DISKSTORE
    2017-08-31 15:18:50,564 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - recoverOplogs oplog: oplog#1 parent: GFXD-DD-DISKSTORE, needskrf: true
    2017-08-31 15:18:50,564 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - createKrfAsync called for oplog: oplog#1, parent: GFXD-DD-DISKSTORE
    2017-08-31 15:18:50,565 INFO [serverConnector]: snappystore (GFToSlf4jBridge.java:put(73)) - recovery region initialization took 316 ms
    2017-08-31 15:18:50,572 WARN [serverConnector]: snappystore (GFToSlf4jBridge.java:put(69)) - Creating persistent region GFXD_PdxTypes, but enable-network-partition-detection is set to false. Running with network partition detection disabled can lead to an unrecoverable system in the event of a network split.
    Michael Tao
    @XZhi
    It seems that a network split was detected, but we restarted the server several times and the status is always waiting.
    Sumedh Wale
    @sumwale
    @XZhi The network-split warning only recommends turning on detection; it does not mean there was a network split. Send a URG signal to the server (see the Process ID at the start of the log) to dump locks/stacks, and then post a link to the full server logs so we can see where it is stuck.
    Michael Tao
    @XZhi
    Ok, thanks very much!
    Sumedh Wale
    @sumwale
    @XZhi to clarify: kill -URG <process ID>. The product catches this signal and dumps locks/stacks in the log.
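
    The stack-dump procedure above can be sketched in shell. The "io.snappydata" process-name pattern is an assumption for illustration; in practice, take the actual PID from the top of the server log.

    ```shell
    # Locate the data server's PID (also printed near the top of its log) and
    # send SIGURG; the server catches the signal and appends a lock/thread dump
    # to its log. Adjust the name pattern for your deployment.
    SERVER_PID=$(pgrep -f "io.snappydata" | head -n 1)
    if [ -n "$SERVER_PID" ]; then
      kill -URG "$SERVER_PID"
    else
      echo "no SnappyData server process found"
    fi
    ```

    SIGURG is used because its default disposition is to be ignored, so sending it to a process that does not install a handler is harmless.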
    Michael Tao
    @XZhi

    2017-08-31 16:02:04,468 INFO [SIGURG handler]: snappystore (GFToSlf4jBridge.java:put(73)) - GfxdLocalLockService@70734388[gfxd-ddl-lock-service]: SIGURG received, full state dump

    __PRLS: 0 tokens, 0 locks held
    gfxd-ddl-lock-service: 0 tokens, 0 locks held

    TX states:

    Full Thread Dump:

    "SIGURG handler" Id=73 RUNNABLE
    at sun.management.ThreadImpl.dumpThreads0(Native Method)
    at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:454)
    at com.pivotal.gemfirexd.internal.engine.locks.GfxdLocalLockService.generateThreadDump(GfxdLocalLockService.java:384)
    at com.pivotal.gemfirexd.internal.engine.locks.GfxdLocalLockService.dumpAllRWLocks(GfxdLocalLockService.java:373)
    at com.pivotal.gemfirexd.internal.engine.locks.GfxdDRWLockService.dumpAllRWLocks(GfxdDRWLockService.java:750)
    at com.pivotal.gemfirexd.internal.engine.distributed.utils.GemFireXDUtils.dumpStacks(GemFireXDUtils.java:2843)
    at com.pivotal.gemfirexd.internal.engine.SigThreadDumpHandler.handle(SigThreadDumpHandler.java:107)
    at sun.misc.Signal$1.run(Signal.java:212)
    at java.lang.Thread.run(Thread.java:745)

    "Attach Listener" Id=72 RUNNABLE

    "Idle OplogCompactor" Id=71 TIMED_WAITING on [I@7a8b28f8
    at java.lang.Object.wait(Native Method)

    -  waiting on [I@7a8b28f8
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl.waitForIndexRecovery(DiskStoreImpl.java:5214)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl.waitForIndexRecoveryEnd(DiskStoreImpl.java:5190)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$ValueRecoveryTask.run(DiskStoreImpl.java:4841)
    -  locked java.lang.Object@7eb4a3f6
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$2.run(DiskStoreImpl.java:4985)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    
    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@4ec02ecf

    "Pooled Waiting Message Processor 1" Id=70 TIMED_WAITING on java.lang.Object@3f89a9cd
    at java.lang.Object.wait(Native Method)

    -  waiting on java.lang.Object@3f89a9cd
    at com.pivotal.gemfirexd.internal.engine.store.GemFireStore.waitForIndexLoadBegin(GemFireStore.java:2923)
    at com.pivotal.gemfirexd.internal.engine.store.RegionEntryUtils$1.waitForAsyncIndexRecovery(RegionEntryUtils.java:744)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$IndexRecoveryTask.run(DiskStoreImpl.java:4710)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$2.run(DiskStoreImpl.java:4985)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:730)
    at com.gemstone.gemfire.distributed.internal.DistributionManager$7$1.run(DistributionManager.java:1091)
    at java.lang.Thread.run(Thread.java:745)
    
    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@303269b4

    "Asynchronous disk writer for region GFXD-DD-DISKSTORE" Id=68 TIMED_WAITING on java.lang.Object@558cc2b7
    at java.lang.Object.wait(Native Method)

    -  waiting on java.lang.Object@558cc2b7
    at java.lang.Object.wait(Object.java:460)
    at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$FlusherThread.waitUntilFlushIsReady(DiskStoreImpl.java:1802)
    at com.gemstone.gemfire.internal.cache.DiskStoreImpl$FlusherThread.run(DiskStoreImpl.java:1837)
    at java.lang.Thread.run(Thread.java:745)

    "Idle OplogCompactor" Id=67 TIMED_WAITING on [I@2d13452c
    at java.lang.Object.wait(Native Method)

    -  waiting on [I@
    Michael Tao
    @XZhi
    @sumwale The dump log file is uploaded as above.
    Borislav Iordanov
    @bolerio
    Hi, it's really nice that SnappyData has so many community channels, but it makes it hard to decide where to post one's question :smile:
    Sumedh Wale
    @sumwale
    @bolerio Slack channel is the most active one (https://snappydata-public.slack.com). StackOverflow questions tagged with snappydata are also actively monitored but for fastest responses, use slack.
    Viacheslav Rodionov
    @bepcyc
    Hi! @sumwale or someone else, could you please tell me how to use this "public" slack channel? It says: "If you have an @snappydata.io email address, you can create an account.". I don't have such an e-mail. Does it mean that the channel is not public at all?
    Sumedh Wale
    @sumwale
    @bepcyc Please see the links on webpage: https://www.snappydata.io/community
    Diend
    @Diend

    [image: Spark cluster and SnappyData cluster diagram]

    Hi, I want to ask: if I have a Spark cluster and a Snappy cluster like in the image above,
    and I spark-submit on my Spark cluster to insert data into the SnappyData cluster, can I do that? Because I keep getting a thrift server error.

    Alessio Palma
    @GTD_B2C_IT5_AU_twitter
    Hello, is anyone using SnappyData for more than 2 years?
    xiaomayi123
    @xiaomayi123
    I have tried the "Using Spark Scala APIs" example, but when I execute [ds.write.insertInto("colTable")] I get the message "Table or view not found: coltable;". But when I execute [snappy.sql("show tables").show], I see that the table exists. What explains this?
    Alessio Palma
    @GTD_B2C_IT5_AU_twitter
    @xiaomayi123 does the table exist in HDFS?
    Jags Ramnarayan
    @jramnara
    @xiaomayi123 for this to work, you need to obtain ds using a SnappySession or SnappyContext, not SparkSession or SparkContext. See product documentation.
    @xiaomayi123 Try this if you cannot switch to SnappySession:
    val df = snappy.createDataFrame(ds.rdd, ds.schema)
    df.write.format("column").saveAsTable("colTable")
    xiaomayi123
    @xiaomayi123
    @jramnara thank you, you are right.
    Argenis Leon
    @argenisleon
    I am trying to use snappy with pyspark but I am getting this
    from pyspark.sql.snappy import SnappySession
    snappy = SnappySession(sc)
    
    ---------------------------------------------------------------------------
    ModuleNotFoundError                       Traceback (most recent call last)
    <ipython-input-7-b46794f153bf> in <module>()
    ----> 1 from pyspark.sql.snappy import SnappySession
          2 snappy = SnappySession(op.sc)
    
    ModuleNotFoundError: No module named 'pyspark.sql.snappy'
    Is there a pip or something I am missing?
    zhangpei529
    @zhangpei529
    Hi guys, I have an issue running SnappyData on multiple hosts; is there an example of a multi-host SnappyData setup? Thanks a lot
    Swati Sawant
    @swatisawant
    @zhangpei529, you can go through the multi-host-installation section in the documentation to set up a multi-host SnappyData cluster.
    Ivan
    @advancedwebdeveloper
    Hello. Is anyone here able to tell about EMC's use cases/feedback?
    yxtwang
    @yxtwang
    Hello, when will SnappyData support Spark 2.3? Do you have a plan? Thanks.
    Ishaq Sahibole
    @ishaq1189_gitlab
    This is Ishaq. We newly started a POC with SnappyData to replace the existing Spark and Cassandra cluster in our company. However, when we worked on SnappyData streaming, we saw JVM heap memory growing very rapidly for a small test case, and we could also see all stream data captured in the cache, visible from the Spark Cache tab in the Pulse UI. Could anyone please explain why stream data is getting cached in the Spark cache and when it will be cleared from the cache? Also, how should memory management be done for SnappyData streaming?
    Jags Ramnarayan
    @jramnara
    Can you share your program? Are you using structured streaming? If so, I wouldn't expect the behavior you see.