    Doug Whitehead
    @dsw9742
    just went through the somewhat frustrating experience of following the 2.0.0.M1 getting-started documentation at http://docs.spring.io/spring-cloud-dataflow/docs/2.0.0.M1/reference/htmlsingle/#getting-started, only to bump into the UI issues reported at spring-cloud/spring-cloud-dataflow-ui#1016, the current workaround for which is to use the shell or REST API
    downloading the shell jar to /maven isn't difficult or anything, but given how much its use is referenced in the documentation, plus the UI issue and the low LOE, it sure seems like an easy improvement to make to the Docker image
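    (for reference, the manual step is roughly the following; the URL follows the Spring milestone-repo pattern)
    wget https://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-shell/2.0.0.M1/spring-cloud-dataflow-shell-2.0.0.M1.jar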
    Doug Whitehead
    @dsw9742
    or even publishing images for related SC shells to Docker Hub
    just $0.02
    happy to file feature request on Github if that would help
    Sabby Anandan
    @sabbyanandan

    Hi, @dsw9742. It's no fun if the getting-started experience doesn't work as advertised. The UI bug has been fixed; it should be in the 2.0 BUILD-SNAPSHOT image [docs here]. We plan to release 2.0 M2 next week.

    Some background:
    We wanted to simplify the steps for development through docker-compose. I hope that was straightforward for you. Given our users have varied personas (app devs, ops, analysts, data engineers, data scientists, etc.), we don't take a position promoting REST APIs vs. the Shell vs. the UI - we leave that choice to the end users.

    That said, if packaging the Shell into the docker-compose experience helps further, sure. I'm not sure how folks would package a CLI binary and also expose it on a port via docker-compose, though.

    Feel free to drop a story or, even better, please consider contributing if you have experience with it.

    Doug Whitehead
    @dsw9742
    @sabbyanandan thanks for the response, and totally makes sense
    I'd be happy to take a pass at updating the docker-compose.yml file to also mount the shell jars for both dataflow and skipper ... what's the best way to go about doing so?
    I've signed the Pivotal individual CLA -- if taking this offline (e.g. email or otherwise) is best, just let me know
    Sabby Anandan
    @sabbyanandan
    Oh, that'd be great. Feel free to submit a PR against: https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-server/docker-compose.yml - we would be able to review/merge it then.
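    Something like the following volume mount might be all it takes (an untested sketch; the host path and jar name are assumptions):
    dataflow-server:
      volumes:
        # assumption: the shell jar was downloaded next to docker-compose.yml
        - ./spring-cloud-dataflow-shell.jar:/opt/shell/spring-cloud-dataflow-shell.jar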
    Doug Whitehead
    @dsw9742
    @sabbyanandan I've looked at a couple different ways of doing this; it seems like just copying the shell jar to the root of the dataflow-server image would be the easiest and most straightforward way ... but that probably involves updates to the dataflow-server Dockerfile ... is this the right one? https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-server-core/src/main/docker/Dockerfile
    Sabby Anandan
    @sabbyanandan
    Ah, I see. I think we use a Bamboo plugin to generate the image, so on every commit there'll be an image against the latest tag. Let me double-check, though.
    Doug Whitehead
    @dsw9742
    yeah, I'm digging around in all the projects and there are actually multiple Dockerfiles scattered around here ... the one I linked to doesn't look like it matches the docker history for the latest dataflow-server image
    Sabby Anandan
    @sabbyanandan
    Correct. That's some cruft from legacy versions; we will have to remove that file. The other two files relate to the Prometheus and Influx support for monitoring.
    So I was wrong about the Bamboo plugin. It appears we are actually using the Fabric8 docker-maven-plugin to generate the image as part of the commit/build. Here are the relevant bits.
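    For reference, an image definition with the Fabric8 plugin looks roughly like this in the pom.xml (an illustrative sketch; the image name and base image are assumptions):
    <plugin>
        <groupId>io.fabric8</groupId>
        <artifactId>docker-maven-plugin</artifactId>
        <configuration>
            <images>
                <image>
                    <!-- assumption: published image coordinates -->
                    <name>springcloud/spring-cloud-dataflow-server</name>
                    <build>
                        <from>openjdk:8-jdk-alpine</from>
                        <entryPoint>
                            <exec>
                                <arg>java</arg>
                                <arg>-jar</arg>
                                <!-- the plugin's assembly places the built jar under /maven -->
                                <arg>/maven/spring-cloud-dataflow-server.jar</arg>
                            </exec>
                        </entryPoint>
                    </build>
                </image>
            </images>
        </configuration>
    </plugin>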
    Doug Whitehead
    @dsw9742
    got it
    thanks
    Justin Fleck
    @jfleck1_gitlab
    Hello! I've been developing my own custom SCDF (2.0.0.BUILD-SNAPSHOT) apps and was wondering if the framework still supports Spring Boot Admin? I'm having a little trouble getting my app to talk to the admin server once deployed.
    Sabby Anandan
    @sabbyanandan

    Hi, @jfleck1_gitlab. We used to have documentation on Spring Boot Admin, but we have added native support for metrics via SCDF's Metrics Collector in recent releases.

    Also, in v2.0, we are deprecating the Metrics Collector in favor of Micrometer and the supported backends (e.g., Prometheus, InfluxDB, ...) for monitoring and metrics in general.

    All that said, if you are still interested only in Spring Boot Admin, here are the docs from an older release -> https://docs.spring.io/spring-cloud-dataflow/docs/1.3.0.RELEASE/reference/htmlsingle/#_spring_boot_admin - they might be useful.

    Justin Fleck
    @jfleck1_gitlab
    ah awesome @sabbyanandan. I'd rather push us forward to the newest tech and be a little future-proofed. Plus we have other products using Prometheus, so I can consolidate there. Originally I was after the logs, but I read about and tried redirecting them to the Skipper logs for debugging, so that works out. I'll go with that. Soon after that will be monitoring, especially message counts and flow for back-pressure scale-out. I'll look into Micrometer. Thanks!
    Sabby Anandan
    @sabbyanandan

    @jfleck1_gitlab: Exciting! Please have a look at the OOTB Grafana Dashboard widgets that we curated to work with streams running in SCDF.

    We focus on message rates, error rates, latency, throughput, CPU, and the other usual suspects. If you can think of any other important metrics, they can be added as well. Contributions welcome!

    We recently also integrated Grafana in the SCDF Dashboard via spring-cloud/spring-cloud-dataflow-ui#1027. You will now be able to click the icon at each stream (see screenshot in the issue), and that will automatically take you over to the Grafana dashboard with all the widgets automatically populated with context specific information. :smile:

    Christian Tzolov
    @tzolov
    @nWidart, @tkvangorder, I'm sorry for jumping into the micrometer/prometheus conversation just now! As Sabby has mentioned, we are working on providing a "generic" and, hopefully, easy-to-start-with integration for various TSDBs (Influx and Prometheus are the first) and Grafana.
    I'd be very interested to learn more about your environments and the types of metrics you are collecting:
    1. Are you running SCDF on K8s, Cloud Foundry, or Local?
    2. How do you handle Prometheus' target service discovery?
    3. Are you securing your Prometheus entry points?
      Collecting Task metrics is on our roadmap too. As you've pointed out, collecting metrics from short-lived tasks is easier for push-based TSDBs such as InfluxDB but tricky for pull-based ones such as Prometheus. The Pushgateway is one of the options we are exploring (so @tkvangorder, your snippets could come in handy), along with more elaborate approaches such as M3.
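      On the task side, the push-based route can be sketched with Spring Boot 2.1's Pushgateway properties (the address and job name below are assumptions):
      management.metrics.export.prometheus.pushgateway.enabled=true
      management.metrics.export.prometheus.pushgateway.base-url=http://localhost:9091
      management.metrics.export.prometheus.pushgateway.job=my-task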
    Doug Whitehead
    @dsw9742
    In addition to s.c.d.grafana-info.url, do other properties need to be set to enable the Prometheus data collection? Like the stream app.*.properties.metrics.destination or something?
    Doug Whitehead
    @dsw9742
    ah OK ... figured it out
    --management.endpoints.web.exposure.include=*
    Sabby Anandan
    @sabbyanandan
    Ah, we are still in the middle of documenting this feature. You'd need the following properties supplied at the SCDF-server level to use Prometheus as the backend:
    spring.cloud.dataflow.applicationProperties.stream.management.metrics.export.prometheus.enabled=true
    spring.cloud.dataflow.applicationProperties.stream.spring.cloud.streamapp.security.enabled=false
    spring.cloud.dataflow.applicationProperties.stream.management.endpoints.web.exposure.include=prometheus,info,health
    spring.cloud.dataflow.grafana-info.url=http://localhost:3000
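    If you run the server locally, these can simply be passed on the command line, e.g. (jar name abbreviated here):
    java -jar spring-cloud-dataflow-server.jar \
        --spring.cloud.dataflow.applicationProperties.stream.management.metrics.export.prometheus.enabled=true \
        --spring.cloud.dataflow.grafana-info.url=http://localhost:3000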
    Tyler Van Gorder
    @tkvangorder

    @tzolov

    Our use case:

    We are migrating jobs that used to run on a home-grown scheduler. All jobs were running in the same VM, and obviously this is less than ideal for a company of our size. So we have been pulling jobs out of that scheduler and reimplementing them as Spring Cloud Task applications. The legacy scheduler now just makes an API call to the SCDF server to launch the task. We have 100+ scheduled jobs, of which we have migrated roughly 10%.

    • We are currently running SCDF in a local environment; we have a cluster of 5 nodes running the Data Flow server and use Eureka + Ribbon to load-balance across those nodes. We even have some primitive customizations to the task-launching API to check resources on a node and throw an exception that results in a retry on a different node in our cluster. We have plans to move this workload to Kubernetes, which will make our lives a little less complicated (at least in theory).
    • We are just now moving toward Prometheus, so our setup is still pretty simple. Since we are still in a private data center, we provision both Prometheus and the push gateway on the same VM. We configure Prometheus to point at the push gateway and are not using service discovery. For an individual Spring Cloud Task application, we just point it to the push gateway via a Spring environment setting (rough scrape-config sketch below).
    • We do not currently secure the Prometheus endpoints.
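    The Prometheus side of that wiring is just a static scrape of the push gateway, roughly like this (a sketch; 9091 is the Pushgateway default port):
    scrape_configs:
      - job_name: pushgateway
        honor_labels: true    # keep the job/instance labels pushed by the tasks
        static_configs:
          - targets: ['localhost:9091']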
    Nicolas Widart
    @nWidart
    @tzolov Hey no worries,
    1. scdf running in k8s
    2. I just pointed prometheus to the pushgateway endpoint
    3. not currently.
      The main metric we needed is Spring's http.client.requests one. We'll probably add others based on our needs.
    majorisit
    @majorisit

    Hi @sabbyanandan, we are running a Spring Cloud Data Flow server on Kubernetes. When creating a task app via the scdf-shell, we reference a Maven metadata artifact in the task command for the whitelisted properties.

    app register --name smoketest --type task --uri docker://scdf-smoketest:latest --metadata-uri maven://com.pipeline.scdf:smoketest-task-app:jar:metadata:1.0.0-SNAPSHOT

    The app got created successfully. But when we run the app info command in the scdf-shell, we get an SSLHandshakeException in the SCDF server logs.

    Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

    Is there any way to skip SSL validation on the SCDF server side? Please help us resolve this issue.

    Sabby Anandan
    @sabbyanandan

    Hi, @majorisit. No-cert errors are no fun. I don't think we have a way to skip SSL validation when resolving the companion metadata artifact for the apps. If you cannot fix the cert issue, perhaps you could download the metadata JARs to an HTTP location inside your network and register them using the http location of the JAR.

    Something like:

    dataflow:>app register --name time2 --type source --metadata-uri http://repo.spring.io/milestone/org/springframework/cloud/stream/app/time-source-rabbit/2.1.0.M2/time-source-rabbit-2.1.0.M2-metadata.jar --uri docker:springcloudstream/time-source-kafka:2.1.0.M2
    Let's see if @markpollack has any other ideas.
    majorisit
    @majorisit
    @sabbyanandan, we are already doing that. We can create apps with companion-metadata artifacts via the http and file protocols.
    But we are interested in the Maven approach, as it would be more helpful for our CI/CD process.
    Doug Whitehead
    @dsw9742
    @chrisjs if I am butchering the #2797 pull request, please let me know ... I may need a bit of guidance to clean it up so it meets guidelines
    Chris Schaefer
    @chrisjs
    @dsw9742 you may want to create a new branch from master, cherry-pick your change into it, and make a new PR
    djgeary
    @djgeary

    Hi, we have just updated from SCDF 1.5.0 to 1.7.3 and have an issue where some of our existing streams can't be displayed in the dashboard. I think it might be due to commas or quotes in the stream definition (simple ones without either do work).

    For example we have a stream already defined:

    rabbit --queues=logs --outputType=text/plain | filter --expression=!#jsonPath(payload,'$.type').equals('test') | mongodb --mongodb.database=logs --collection=responses

    If I try to view the stream in the dashboard I just get 'Loading ....' and in the web console I see
    TokenizationError: Unexpected character
    ERROR TypeError: "t.lines[0].nodes is undefined"

    So I can't deploy etc from the dashboard, but deploy/undeploy still works from the shell.
    However, if I try to create a new version in the shell with the same definition, e.g.

    stream create --name newlogs --definition "rabbit --queues=logs --outputType=text/plain | filter --expression=!#jsonPath(payload,'$.type').equals('test') | mongodb --mongodb.database=logs --collection=responses"

    I get
    bash: syntax error near unexpected token `('

    Has something changed with the syntax or do I need to escape something?

    djgeary
    @djgeary
    sorry ignore the final part - creating via the shell is fine, the problem is displaying the stream in the dashboard
    djgeary
    @djgeary

    ok, the problem seems to be that I now need to quote anything with a comma (i.e. the filter expression here), whereas I didn't before. If I use

    stream create --name newlogs2 --definition "rabbit --queues=logs --outputType=text/plain | filter --expression='!#jsonPath(payload,'''$.type''').equals('''test''')' | mongodb --mongodb.database=logs --collection=responses"

    it will display correctly in the dashboard

    Doug Whitehead
    @dsw9742
    @chrisjs OK, I think I just did that and resubmitted as pr #2834
    Should I close the original one? And for future reference ... is closing unapproved PRs & resubmitting as new PRs the preferred workflow? Hoping to cause as little friction as possible for maintainers :)
    Sabby Anandan
    @sabbyanandan

    Hi, @djgeary! You're correct. We had to revise the parser when we added support for the Stream Application DSL; since this feature introduced a new delimiter for streams, the overall parser had to be reworked a bit via spring-cloud/spring-cloud-dataflow@7c3a99c.

    Typically, we are very careful not to break anything in a point release. Unfortunately, we had to in this case to avoid several downstream impacts, and we didn't do a good job of calling it out clearly in the docs. Sorry about that!

    I'm glad that with the extra quotes you're able to get everything up and running.
    Doug Whitehead
    @dsw9742
    is 2.0.0.M2 still scheduled to be released this week? if so, is there a target date that can be shared?
    Sabby Anandan
    @sabbyanandan
    Hi, @dsw9742. Yes; we are targeting to wrap it up this week. You can follow along here: https://github.com/spring-cloud/spring-cloud-dataflow/milestones
    Doug Whitehead
    @dsw9742
    that is great, thanks @sabbyanandan
    Nicolas Widart
    @nWidart

    hello, I've been getting the following error and am not sure what's causing it 🤔 Dependencies were updated, but that's it.

    The bean 'transactionManager', defined in org.springframework.cloud.task.configuration.SimpleTaskConfiguration, could not be registered. A bean with that name has already been defined in class path resource [org/springframework/batch/core/configuration/annotation/SimpleBatchConfiguration.class] and overriding is disabled.

    spring.main.allow-bean-definition-overriding is already set to true, yet the action the error suggests is setting exactly that property to true

    Glenn Renfro
    @cppwfs
    Is SCDF reporting this exception, or is it your task application?
    Sabby Anandan
    @sabbyanandan
    @nWidart: Perhaps it'd be good to repost it at https://gitter.im/spring-cloud/spring-cloud-task.
    Nicolas Widart
    @nWidart
    The task application indeed; sorry, will post there :)
    djgeary
    @djgeary
    Hi @sabbyanandan, yes, we have it working now, and we have a way to modify our existing streams to get them to display in the dashboard. Everything works from the shell without adding the quotes, though - the streams create and deploy successfully - which implies it's just a bug in the dashboard display? At least it should be consistent; currently you can create and deploy streams from the shell and not realise there is an issue until you try to view them in the dashboard.
    Another issue we encountered with the upgrade was the new concurrent-task-executions limit. We initially were unable to launch tasks, with the error message 'The maximum concurrent task executions [20] is at its limit.' Looking into this, SCDF calculates the number of running tasks as the number of rows in the task execution table with a null end_time. For various reasons (e.g. tasks that died or were killed by Kubernetes) we had a few hundred entries in there with null end_times. We've worked around this for now by setting spring.cloud.dataflow.task.maximum-concurrent-tasks to a very high value, effectively disabling the feature, but I'm not sure using end_time=null is a reliable way to determine the number of concurrent tasks?
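    For reference, the workaround we applied is just this one server-level property (the value here is arbitrary; ours is simply high enough to never trip):
    spring.cloud.dataflow.task.maximum-concurrent-tasks=100000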