    luzhijing
    @luzhijing
    Click to join the Apache Doris community Slack!
    zhangwgch
    @zhangwgch
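    Is there a limit on disk writes during a Stream Load import? The write speed won't go up at the moment, and disk I/O is below 10%.
    Could any of the experts here help me make sense of this?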
    zebingtian
    @zebingtian
    Are any experts around?
    Mingyu Chen
    @morningman

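    Is there a limit on disk writes during a Stream Load import? The write speed won't go up at the moment, and disk I/O is below 10%.

    A single Stream Load runs in a single thread. If the data volume is large, you need to run several imports concurrently.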
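
A minimal sketch of the "several imports concurrently" idea, assuming the input can be split into independent chunks of rows. The names `chunk_lines`, `load_concurrently`, and `fake_send` are hypothetical; `fake_send` is a stub standing in for whatever issues one Stream Load request per chunk, so this illustrates the pattern and is not a working Doris client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def chunk_lines(lines: List[str], chunk_size: int) -> List[List[str]]:
    """Split rows into consecutive chunks of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def load_concurrently(
    chunks: List[List[str]],
    send_chunk: Callable[[List[str]], int],
    workers: int = 4,
) -> List[int]:
    """Run send_chunk on every chunk in a thread pool; results keep chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send_chunk, chunks))


def fake_send(chunk: List[str]) -> int:
    # Stub (assumption): a real sender would submit one Stream Load per chunk
    # and return the number of rows the server reports as loaded.
    return len(chunk)


rows = [f"{i},name{i}" for i in range(10)]
loaded = load_concurrently(chunk_lines(rows, 4), fake_send, workers=3)
print(loaded)  # rows handled per chunk, in chunk order
```

Each chunk becomes its own load, so a failure affects only that chunk; the worker count is a tuning knob to raise gradually while watching BE load.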

    青松
    @FireFreedomK

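    Hi everyone, a beginner question. I deployed Doris on three machines: one runs FE and BE together, and the other two run BE only. The Doris version is 0.15.0.

    When I import data into Doris from a Spark DataFrame, the import works at first, but later it fails with the following error: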
    2022-01-26 14:29:41,867 ERROR Executor:94 Executor task launch worker for task 11714 718847 - Exception in task 9.0 in stage 8206.0 (TID 11714)
    org.apache.doris.spark.exception.StreamLoadException: stream load error: too many filtered rows
            at org.apache.doris.spark.DorisStreamLoad.load(DorisStreamLoad.java:162)
            at org.apache.doris.spark.DorisStreamLoad.load(DorisStreamLoad.java:149)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1$1$$anonfun$apply$mcV$sp$1.apply$mcVI$sp(DorisSourceProvider.scala:96)
            at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1$1.apply$mcV$sp(DorisSourceProvider.scala:86)
            at scala.util.control.Breaks.breakable(Breaks.scala:38)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.org$apache$doris$spark$sql$DorisSourceProvider$$anonfun$$flush$1(DorisSourceProvider.scala:84)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$apply$2.apply(DorisSourceProvider.scala:70)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1$$anonfun$apply$2.apply(DorisSourceProvider.scala:62)
            at scala.collection.Iterator$class.foreach(Iterator.scala:743)
            at scala.collection.AbstractIterator.foreach(Iterator.scala:1174)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:62)
            at org.apache.doris.spark.sql.DorisSourceProvider$$anonfun$createRelation$1.apply(DorisSourceProvider.scala:60)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
            at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
            at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2121)
            at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2121)
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
            at org.apache.spark.scheduler.Task.run(Task.scala:121)
            at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
            at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1408)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)

    htyoung
    @htyoung
    image.png
    @FireFreedomK "too many filtered rows" means the imported data contains errors and Doris rejected those rows. Use the error log below it to track down the cause.
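
One way to start that investigation is to look at the JSON body a Stream Load returns. The sketch below assumes the response carries the fields `Status`, `Message`, `NumberTotalRows`, `NumberFilteredRows`, and `ErrorURL`, as described in the Stream Load documentation; verify them against your Doris version. The helper name `summarize_load_result` and the sample response are invented for illustration and do not come from the log above.

```python
import json


def summarize_load_result(body: str) -> dict:
    """Pull out the fields most useful for diagnosing filtered rows."""
    result = json.loads(body)
    total = int(result.get("NumberTotalRows", 0))
    filtered = int(result.get("NumberFilteredRows", 0))
    return {
        "status": result.get("Status"),
        "message": result.get("Message"),
        # Share of rows rejected by the load; compare with the allowed filter ratio.
        "filtered_ratio": filtered / total if total else 0.0,
        # When present, this URL points at the per-row rejection reasons.
        "error_url": result.get("ErrorURL"),
    }


# Invented sample response (assumption), shaped like a failed Stream Load reply.
sample = json.dumps({
    "Status": "Fail",
    "Message": "too many filtered rows",
    "NumberTotalRows": 100,
    "NumberFilteredRows": 25,
    "ErrorURL": "http://be-host.example:8040/api/_load_error_log?file=hypothetical",
})
summary = summarize_load_result(sample)
print(summary["filtered_ratio"], summary["error_url"])
```

If an error URL is returned, fetching it usually shows which rows were rejected and why (type mismatch, wrong column count, and so on), which is more direct than scanning the source data by eye.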
    jing zhang
    @kane0409_gitlab
    I set up three nodes, and show backends; reports all of them as alive, but two of the nodes show a disk capacity of 1 B. That's a problem, right?
    075a3e8eb78ebd61b89f9f00a56d1fd.png
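    Also, when I click the BackendId in the first column, the last two nodes show no information; only the first one does. Could you point me to where the problem might be? Thanks.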
    jing zhang
    @kane0409_gitlab
    I found this in the BE log: the last two nodes could not find the master. Where is this configured?
    image.png
    luzhijing
    @luzhijing
    The FE's priority_networks setting did not take effect.
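
To check whether a priority_networks value can match at all, it helps to test the host's addresses against the configured CIDR blocks. The sketch below assumes the setting is a semicolon-separated list of CIDR blocks and that the first local address inside any block is chosen. `pick_priority_ip` is a hypothetical helper, the addresses are made up, and this is not Doris's actual selection code.

```python
import ipaddress
from typing import List, Optional


def pick_priority_ip(local_ips: List[str], priority_networks: str) -> Optional[str]:
    """Return the first local IP that falls inside any configured CIDR block."""
    networks = [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in priority_networks.split(";")
        if item.strip()
    ]
    for ip in local_ips:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in networks):
            return ip
    return None  # nothing matched: the setting cannot take effect on this host


# Made-up addresses for illustration (assumption).
host_ips = ["127.0.0.1", "172.17.0.1", "192.168.1.12"]
print(pick_priority_ip(host_ips, "192.168.1.0/24"))
print(pick_priority_ip(host_ips, "10.0.0.0/8"))
```

A result of `None` for the values in fe.conf or be.conf suggests the CIDR block does not cover any address on that machine, which is one common reason a node registers with an unexpected IP.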
    青松
    @FireFreedomK
    @htyoung OK, thanks.
    青松
    @FireFreedomK
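    @htyoung I downloaded that data and looked through it, and found no obvious anomalies. Where can I find the database log files that would help me locate the problem?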
    JiangJungle
    @JiangJungle
    @kane0409_gitlab Which GUI tool are you using?
    JiangJungle
    @JiangJungle
    What is on the Doris roadmap for the first half of this year?
    JiangJungle
    @JiangJungle
    What do you use to program against Doris? The MySQL driver? Does it have its own driver, the way ClickHouse does?
    Mingyu Chen
    @morningman
    The Doris Slack channel is now open. You are welcome to join.
    The official Doris discussion channels will later be consolidated into Slack and the dev@doris mailing list.

    What is on the Doris roadmap for the first half of this year?

    Roadmap 2022: apache/incubator-doris#7502

    AUB
    @aubdiy
    hdfs-broker fails when importing data from HDFS; the log says a class cannot be found:
    java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.util.StringUtils
    Has anyone run into this?
    Versions: Doris 0.15, Hadoop 3.3.1. The log shows:
    receive a ping request, request detail: TBrokerPingBrokerRequest(version:VERSION_ONE, clientId:172.31.3.146)
    [INFO ] 2022-02-08 08:07:10,947 method:org.apache.doris.broker.hdfs.HDFSBrokerServiceImpl.listPath(HDFSBrokerServiceImpl.java:67)
    received a list path request, request detail: TBrokerListPathRequest(version:VERSION_ONE, path:hdfs://test.internal:8020/tmp/data.csv, isRecursive:false, properties:{_DORIS_STORAGE_TYPE_=BROKER})
    [INFO ] 2022-02-08 08:07:10,948 method:org.apache.doris.broker.hdfs.FileSystemManager.getDistributedFileSystem(FileSystemManager.java:244)
    create file system for new path: hdfs://test.internal:8020/tmp/data.csv
    Exception in thread "pool-2-thread-13" java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.util.StringUtils
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1437)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
        at org.apache.doris.broker.hdfs.FileSystemManager.getDistributedFileSystem(FileSystemManager.java:360)
        at org.apache.doris.broker.hdfs.FileSystemManager.getFileSystem(FileSystemManager.java:152)
        at org.apache.doris.broker.hdfs.FileSystemManager.listPath(FileSystemManager.java:427)
        at org.apache.doris.broker.hdfs.HDFSBrokerServiceImpl.listPath(HDFSBrokerServiceImpl.java:74)
        at org.apache.doris.thrift.TPaloBrokerService$Processor$listPath.getResult(TPaloBrokerService.java:815)
        at org.apache.doris.thrift.TPaloBrokerService$Processor$listPath.getResult(TPaloBrokerService.java:795)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
    [INFO ] 2022-02-08 08:07:10,963 method:org.apache.doris.broker.hdfs.HDFSBrokerServiceImpl.listPath(HDFSBrokerServiceImpl.java:67)
    received a list path request, request detail: TBrokerListPathRequest(version:VERSION_ONE, path:hdfs://test.internal:8020/tmp/data.csv, isRecursive:false, properties:{_DORIS_STORAGE_TYPE_=BROKER})
    Red12345678
    @Red12345678
    image.png
    Has anyone run into this? Could you tell me how to fix it?
    JiangJungle
    @JiangJungle
    Does Doris support cross-cluster search?
    airfreshchen
    @airfreshchen
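    I'm using Doris for tag-based audience selection, aggregating with the tag as the key, which leaves too many userIds under a single tag. Is there a good way to export the userIds with a row-to-column conversion?
    Do I need to develop my own UDTF, or is there another good way?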
    kevinliukai
    @kevinliukai
    Can the max_conn_per_user parameter be set dynamically?
    deemogsw
    @deemogsw
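    Has anyone used Doris on ES with external tables? My current requirement is to join a wide user-profile table with other small tables to preview generated audiences. I originally planned to sync the data into Doris daily, but the sync takes a long time, and if it stalls during the morning peak the preview feature becomes unavailable. So I'd like to use an ES external table: the wide-table data is already in ES, so I would only sync the small tables into Doris, then join them with the ES external table and run a count. After reading the official docs, I'm not sure whether predicates can still be pushed down to the ES query for this kind of join.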
    jiuqinyan
    @jiuqinyan
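    I'm just getting started. I read the official docs and found no explanation of how to do a full import from MySQL into Doris, although incremental import via binlog is covered. Could someone clear this up?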
    King0513
    @King0513
    For a full import, use DataX, or query directly through an external table.
    luzhijing
    @luzhijing

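    Does Doris support cross-cluster search?

    Not supported.

    I'm using Doris for tag-based audience selection, aggregating with the tag as the key, which leaves too many userIds under a single tag. Is there a good way to export the userIds with a row-to-column conversion?

    The next release will support Lateral View.

    Has anyone used Doris on ES with external tables? [...] After reading the official docs, I'm not sure whether predicates can still be pushed down to the ES query for this kind of join.

    Supported.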

    jiuqinyan
    @jiuqinyan
    image.png
    image.png
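    When I use ES as an external table, it reports that it cannot connect, but the IP in the error is not the one I specified when creating the table?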
    Jw9394
    @Jw9394
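    Hi everyone, I have 8 servers (16 cores, 256 GB RAM, 16 TB SSD each) on which I want to deploy BE. How many BE instances should I deploy on each machine?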
    ziliang-wan
    @ziliang-wan
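    Hi everyone, has anyone seen this error when a Doris cluster backs up a snapshot to HDFS through the broker: could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation. Another Doris cluster can back up to the same Hadoop cluster without problems.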
    wanhuhou
    @wanhuhou
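    Is there a limit on the number of rollups that can be created? I can't predict which fields users will query on, so I'd like to create several to speed up queries with different conditions.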
    luzhijing
    @luzhijing

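    Hi everyone, I have 8 servers (16 cores, 256 GB RAM, 16 TB SSD each) on which I want to deploy BE. How many BE instances should I deploy on each machine?

    We recommend deploying one BE per machine.

    Is there a limit on the number of rollups that can be created? I can't predict which fields users will query on, so I'd like to create several to speed up queries with different conditions.

    You can create several; there is no limit on the number. However, having multiple materialized views may reduce write or delete efficiency.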

    wanhuhou
    @wanhuhou
    Creating a table with many partitions is too slow. Is there a way to optimize this?