
Hayssam Saleh
@hayssams
I just restored gitter as the main chat channel on github
Mohamad Kassir
@mhdkassir
Hello, it seems that there is an issue with the snapshot workflow on GitHub; it has been failing recently.
Any idea why? We get this error:
[error] java.net.ProtocolException: Server redirected too many times (20)
Hayssam Saleh
@hayssams
Yep. I searched through SO and it seems it was due to Gigahorse.
We used to have it disabled in the build.sbt file, but recently that stopped working.
Moreover, sbt 1.5 seems to have fixed this bug,
so I moved to sbt 1.5.1, but it did not change anything.
This happens only on GitHub Actions;
publishing from my own laptop works.
I have not been able to solve it yet.
The issue is referenced here: sbt/sbt-pgp#150
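
For reference, the build.sbt workaround mentioned above is, as a minimal sketch (the exact setting may vary across sbt versions), to disable the Gigahorse HTTP client that sbt uses for resolution and publishing:

    // build.sbt
    // Disable Gigahorse so sbt falls back to its plain HTTP client, which
    // avoids the "Server redirected too many times" ProtocolException.
    updateOptions := updateOptions.value.withGigahorse(false)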
Hayssam Saleh
@hayssams
And it was supposed to be solved here sbt/librarymanagement#317
Hayssam Saleh
@hayssams
@cchepelov any idea why scala-steward is not helping anymore?
Mohamad Kassir
@mhdkassir
Ok I see , anyways it is not a blocking issue
Hayssam Saleh
@hayssams
@cchepelov problem solved: Scala Steward is now scheduled to run every day as a GitHub Action
Hayssam Saleh
@hayssams
From now on (v0.2.1), any invalid YAML file detected will stop the Comet job.
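
To illustrate this fail-fast policy, here is a minimal Scala sketch; the Domain type and the function names are hypothetical, not Comet's actual internals:

    import scala.util.{Failure, Success, Try}

    final case class Domain(name: String) // stand-in for a parsed YAML schema

    // Parse every YAML domain file up front and abort the whole job on the
    // first invalid file instead of silently skipping it.
    def loadDomains(paths: List[String], parse: String => Try[Domain]): List[Domain] =
      paths.map { path =>
        parse(path) match {
          case Success(domain) => domain
          case Failure(error) =>
            throw new IllegalStateException(s"Invalid YAML file: $path", error)
        }
      }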
Mohamad Kassir
@mhdkassir
Hello, was there an issue with the release of version 0.2.1? The build.sbt file is missing on this tag; it seems to have been deleted by the release commit.
Olivier Schultz
@olivierschultz

Hello, I have a question about the sink-to-file parameter, which has been set to false by default in 0.2.2.
According to the description in the documentation https://ebiznext.github.io/comet-data-pipeline/docs/reference/configuration#ingestion

Should ingested files be stored on the filesystem, or only in the sink defined in the YAML file?

If I understand it correctly, setting this parameter to false won't create parquet files under accepted/ and rejected/. So in my current Comet workflow, which is simply import -> watch, if it is set to false it will sink the dataset to BQ but not to the accepted/ and rejected/ directories on cloud storage?

Hayssam Saleh
@hayssams
Yes exactly :)
To force storage on disk in addition to BQ you'll have, as you said, to set sink-to-file to true.
@mhdkassir Yes, there was an issue with the release of 0.2.1, due to the update of the build.sbt file; releases work a little differently now.
I added a message in the release notes about it :)
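
To make the sink-to-file behavior above concrete, a minimal Scala sketch (the names are hypothetical, not Comet's actual code): the sink defined in the YAML file always receives the dataset, while the accepted/ and rejected/ files are written only when sink-to-file is true.

    final case class Settings(sinkToFile: Boolean)

    def sinkDataset(settings: Settings,
                    writeToSink:  () => Unit,  // e.g. BigQuery, as defined in the YAML file
                    writeToFiles: () => Unit   // parquet under accepted/ and rejected/
                   ): Unit = {
      writeToSink()                            // always executed
      if (settings.sinkToFile) writeToFiles()  // only when sink-to-file = true
    }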
Olivier Schultz
@olivierschultz
@hayssams Thanks
seyguai
@seyguai
ebiznext/comet-data-pipeline#536
A new PR is ready for review; is anyone available?
Fred904
@Fred904
Feature/multiple schemas #537
A new PR is ready for review, please check.
Abdelhakim Bendjabeur
@abdelhakimbendjabeur
Here is a new PR ebiznext/comet-data-pipeline#538
Can you please check it out?
Hayssam Saleh
@hayssams
@seyguai just merged your PR. Thanks for the contribution
Hayssam Saleh
@hayssams
@abdelhakimbendjabeur Thanks for the contribution. PR just merged.
@Fred904 Thanks for the PR. Just merged your contribution
Mohamad Kassir
@mhdkassir

I added a message in the release notes about it :)

okay noted :)

seyguai
@seyguai
Hello, I added a new PR: ebiznext/comet-data-pipeline#566
Can you check it?
Hayssam Saleh
@hayssams
Will look at it in a couple of hours and get back to you. Thanks
Hayssam Saleh
@hayssams
@seyguai Gr8 job on PR ebiznext/comet-data-pipeline#572.
Thank you!
@seyguai do not forget to merge your PR
seyguai
@seyguai
@hayssams I don't have that kind of power =)
Hayssam Saleh
@hayssams
Now you have it !
May the force be with you
seyguai
@seyguai
@hayssams I found a new blocker for me, and raised this: ebiznext/comet-data-pipeline#582
Do you know if this might be fixed with your latest rework, or if I should try to fix it myself?
Hayssam Saleh
@hayssams
I’ll look at it tomorrow
Hayssam Saleh
@hayssams
@seyguai what’s your contributor email please, so I can add you to the list of devs?
seyguai
@seyguai
You can add nb.seyguai@gmail.com
Hayssam Saleh
@hayssams
@seyguai @mhdkassir About Java NIO, we now have a bigger issue.
By using NIO we are no longer able to handle filesystems for which the Java NIO interface is not supported.
Such filesystems include the Hadoop filesystems,
and a lot of users run it on Hadoop.
I am wondering if we should fall back to the Hadoop interface when NIO is not available. What do you think?
Hayssam Saleh
@hayssams
@seyguai following our conversation, we keep the loading of the NIO provider in the code. This requires users to reference the Java NIO jar for the platform of their choice at runtime.
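
A minimal sketch of the fallback idea discussed above, assuming the goal is to try the NIO provider registered for the URI's scheme first and to fall back to the Hadoop FileSystem API when no provider jar is on the classpath (an illustration, not Comet's actual implementation):

    import java.net.URI
    import java.nio.file.{FileSystemNotFoundException, Files, Paths, ProviderNotFoundException}
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path => HadoopPath}

    def exists(uri: URI): Boolean =
      try {
        // Resolves through the registered NIO providers; throws when no
        // provider for this scheme (e.g. gs://, abfs://) is on the classpath.
        Files.exists(Paths.get(uri))
      } catch {
        case _: FileSystemNotFoundException | _: ProviderNotFoundException =>
          // Fall back to Hadoop, which resolves the scheme through its own
          // fs.<scheme>.impl mapping in the Hadoop configuration.
          val fs = FileSystem.get(uri, new Configuration())
          fs.exists(new HadoopPath(uri))
      }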