    Meir Shamay
    @may215
    I tried that already, but nothing happened: ./gollum -c config/kafka_roundtrip.conf -ps -ll 3
    The topic was created, but none of the messages were processed.
    I am working with Go 1.8.1, gollum master, Kafka 0.10
    Marc Siebeneicher
    @msiebeneicher
    Hmm - weird - I guess it must be something with the Kafka setup or topics. I use the example config one-to-one for testing with the following docker-compose setup:
    zookeeper:
      image: wurstmeister/zookeeper
      ports:
        - "2181:2181"
        #- "32022:22"
        - "2888:2888"
        - "3888:3888"
    kafkaone:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9092:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: 192.168.34.24
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "21"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"
    kafkatwo:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9093:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: 192.168.34.24
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "22"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"
    kafkathree:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9094:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: 192.168.34.24
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "23"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"
    Meir Shamay
    @may215
    thanks man, trying.
    Marc Siebeneicher
    @msiebeneicher
    For some reason I now also get no data back from the Kafka consumer - I will investigate the issue tomorrow in the office.
    I will let you know if we find the issue.
    Marc Siebeneicher
    @msiebeneicher
    I created an issue on GitHub for follow-up: trivago/gollum#142
    Meir Shamay
    @may215
    thanks.. +1
    Arne Claus
    @arnecls
    We just checked this here. The problem is with the Kafka advertised hostname section.
    In the docker-compose.yml you see these "KAFKA_ADVERTISED_HOST_NAME" entries.
    Two options here:
    1. Insert your computer's IP here
    2. Insert some names here, e.g. kafka0, kafka1, kafka2, and add those to your /etc/hosts, pointing again to your IP
    If you don't do that, the Kafka client will have trouble finding the brokers.
    When starting gollum with the "-ll 3" flag you should also see that in the log output.
    TL;DR: kafka_roundtrip.conf is fine, docker-compose.yml is not.
    Arne Claus
    @arnecls
    A little bit more information on what is happening here:
    • The Kafka consumer/producer asks a random broker on the list "do you know this topic?"
    • The broker responds "yes, here are the brokers serving this topic"
    • A list of host names is returned; this list is built from the KAFKA_ADVERTISED_HOST_NAME entries
    • If the returned names are not known, you cannot poll
    • If you use 127.0.0.1 as KAFKA_ADVERTISED_HOST_NAME, this does not work either, because the brokers cannot talk to each other
    • If you use an IP known to Docker (e.g. the IP of your network interface), everything is fine
    • If you use an /etc/hosts entry, this is also fine, as those are propagated to your containers
    Marc Siebeneicher
    @msiebeneicher

    Here is now a working docker-compose file:

    zookeeper:
      image: wurstmeister/zookeeper
      ports:
        - "2181:2181"
        #- "32022:22"
        - "2888:2888"
        - "3888:3888"
    kafkaone:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9092:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: kafka0
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "21"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"
    kafkatwo:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9093:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: kafka1
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "22"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"
    kafkathree:
      image: wurstmeister/kafka:0.10.0.0
      ports:
        - "9094:9092"
      links:
        - zookeeper:zookeeper
      volumes:
        - /var/run/docker.sock:/var/run/docker.sock
      environment:
        KAFKA_ADVERTISED_HOST_NAME: kafka2
        KAFKA_ZOOKEEPER_CONNECT: "zookeeper"
        KAFKA_BROKER_ID: "23"
        KAFKA_CREATE_TOPICS: "test:1:3,Topic2:1:1:compact"

    And here is the necessary /etc/hosts entry:

    <YOUR PUBLIC IP> kafka0 kafka1 kafka2
    Meir Shamay
    @may215
    Hi,
    Thanks for the response, it looks like it's working now.
    Can I use the Kafka round-trip config for my use case to send the messages to S3 instead of to the console?
    Arne Claus
    @arnecls
    If you add another producer for s3 listening to the same stream - yes
    If that is what you want
    producerS3:
        type: producer.S3
        streams: "write"
        Endpoint: "…"
        Region: "…"
    This will send all messages on the "write" stream to S3 and Kafka at the same time.
    That's what I meant by "to S3 AND Kafka" vs. "S3 over Kafka". In the latter case you would need a second instance of gollum with a Kafka consumer and an S3 producer.
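    For the "S3 over Kafka" variant, a rough sketch of that second gollum instance could look like the following - the internal stream name, bucket path, and broker address are just placeholders to adapt:
    consumerKafka:
        type: consumer.Kafka
        Streams: "fromKafka"            # internal stream name, arbitrary
        Topic: "test"
        DefaultOffset: "Oldest"
        Servers:
            - 127.0.0.1:9092            # placeholder broker address

    producerS3:
        Type: "producer.S3"
        Streams: "fromKafka"
        Region: "us-east-1"
        Endpoint: "s3-us-east-1.amazonaws.com"
        # plus the Credential* settings shown earlier
        StreamMapping:
            "fromKafka" : "my_bucket/sub_folder"   # placeholder bucket/prefix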
    Meir Shamay
    @may215
    Hi, I tried the following configuration, and nothing goes to S3; the map object that needs to contain the object data is always empty.
    consumerConsole:
        type: consumer.Console
        Streams: "write"
    
    producerKafka:
        type: producer.Kafka
        Streams: "write"
        Compression: "zip"
        Topics:
            "write" : "test"
        Servers:
            - 127.0.0.1:9092
    
    consumerKafka:
        type: consumer.Kafka
        Streams: "read"
        Topic: "test"
        DefaultOffset: "Oldest"
        MaxFetchSizeByte: 100
        Servers:
            - 127.0.0.1:9092
    
    producerConsole:
        type: producer.Console
        Streams: "read"
        Modulators:
            - format.Envelope:
                Postfix: "\n"
    
    producerS3:
        Type: "producer.S3"
        Streams: "read"
        Region: "us-east-1"
        Endpoint: "s3-us-east-1.amazonaws.com"
        StorageClass: "STANDARD"
        CredentialType: "static"
        CredentialId: ""
        CredentialToken: "my_key"
        CredentialSecret: "my_secret"
        CredentialFile: ""
        CredentialProfile: ""
        BatchMaxMessages: 2
        ObjectMaxMessages: 2
        ObjectMessageDelimiter: "\n"
        SendTimeframeMs: 1000
        BatchTimeoutSec: 3
        TimestampWrite: "2017-01-02T15:04:05"
        #PathFormatter: ""
        Compress: false
        LocalPath: ""
        UploadOnShutdown: false
        #FileMaxAgeSec: 3600
        FileMaxMB: 1000
        StreamMapping:
          "*" : "main_bucket/sub_folder"
        Formatters:
            - "format.Base64Encode"
    Marc Siebeneicher
    @msiebeneicher

    Morning - without testing the config, I would try the following:

    1) Try to test the S3 producer on its own with a consumerConsole to verify the S3 config
    2) Try an explicit stream mapping for S3:

    StreamMapping:
          read: "main_bucket/sub_folder"
    Meir Shamay
    @may215
    Hi, I tried this already, but I see the same behavior:
    consumerConsole:
        type: consumer.Console
        Streams: "write"
    
    producerConsole:
        type: producer.Console
        Streams: "read"
        Modulators:
            - format.Envelope:
                Postfix: "\n"
    
    producerS3:
        Type: "producer.S3"
        Streams: "read"
        Region: "us-east-1"
        Endpoint: "s3-us-east-1.amazonaws.com"
        StorageClass: "STANDARD"
        CredentialType: "static"
        CredentialId: ""
        CredentialToken: "my_key"
        CredentialSecret: "my_secret"
        CredentialFile: ""
        CredentialProfile: ""
        BatchMaxMessages: 2
        ObjectMaxMessages: 2
        ObjectMessageDelimiter: "\n"
        SendTimeframeMs: 1000
        BatchTimeoutSec: 3
        TimestampWrite: "2017-01-02T15:04:05"
        #PathFormatter: ""
        Compress: false
        LocalPath: ""
        UploadOnShutdown: false
        #FileMaxAgeSec: 3600
        FileMaxMB: 1000
        StreamMapping:
          "read" : "main_bucket/sub_folder"
        Formatters:
            - "format.Base64Encode"
    Marc Siebeneicher
    @msiebeneicher
    In your config the S3 producer is listening to the wrong stream - "read" has no consumer creating messages.
    Try the following one:
    consumerConsole:
        type: consumer.Console
        Streams: "write"
    
    producerS3:
        Type: "producer.S3"
        Streams: "write"
        Region: "us-east-1"
        Endpoint: "s3-us-east-1.amazonaws.com"
        StorageClass: "STANDARD"
        CredentialType: "static"
        CredentialId: ""
        CredentialToken: "my_key"
        CredentialSecret: "my_secret"
        CredentialFile: ""
        CredentialProfile: ""
        BatchMaxMessages: 2
        ObjectMaxMessages: 2
        ObjectMessageDelimiter: "\n"
        SendTimeframeMs: 1000
        BatchTimeoutSec: 3
        TimestampWrite: "2017-01-02T15:04:05"
        #PathFormatter: ""
        Compress: false
        LocalPath: ""
        UploadOnShutdown: false
        #FileMaxAgeSec: 3600
        FileMaxMB: 1000
        StreamMapping:
          "write" : "main_bucket/sub_folder"
        Formatters:
            - "format.Base64Encode"
    Meir Shamay
    @may215
    No, that's not working. BTW, I checked that the S3 credentials are correct and they work fine, but as I said, the list of objects to upload is empty.
    Marc Siebeneicher
    @msiebeneicher
    Hmm - unfortunately the earliest I can debug the producer and check the config setup is at the end of next week.
    What I would try first is to set the Envelope formatter on the consumer, because you set \n as the object delimiter.
    Like:
    consumerConsole:
        type: consumer.Console
        Streams: "write"
        Modulators:
            - format.Envelope:
                Postfix: "\n"
    Maybe it helps if you create a temporary log entry in the producer method to check your received payload.
    Meir Shamay
    @may215
    :( No... I will try to check it also.
    Thanks for the replies ...
    Marc Siebeneicher
    @msiebeneicher

    You are welcome - let us know about your progress.

    If you find a bug, please create an issue on GitHub with the necessary configs to reproduce it.

    Marc Siebeneicher
    @msiebeneicher
    @may215 : FYI - we implemented a new AwsS3 producer which is now available in the master branch. The old one also still exists and works.
    darsenault
    @darsenault

    Thank you for making and releasing Gollum! Amazing framework for sure.

    Style question if I may. I want to make a component that reads messages, modifies their content, and writes messages. For example: read a message with a file path name, read the bytes of the file, and pass on the path and the file bytes. I'd love for this to be a top-level component like a Consumer or a Producer so I can drop in multiple of these to construct a flow of many/complex data changes. Right now I have this as a Modulator (formatter), but this attaches to other components.

    Please advise on the Gollum way to approach this design. Thanks!

    Arne Claus
    @arnecls
    There are a few ways to do this, but the modulator approach is what I would have chosen, too. As you rely on messages from other sources it's not a good fit for a consumer. As you are not directly sending to a service it's not a good fit for a producer either. You could do something like the spooler, but I wouldn't see it as a "clean" and easy-to-manage solution. ProducerConsumers are a bit tricky.
    If you want to react to multiple incoming streams you could add a stream/file map to your formatter. If you want to re-use the same formatter for multiple consumers or producers, try an aggregate.
    Arne Claus
    @arnecls
    Hope that helps a bit.
    darsenault
    @darsenault
    Arne, thanks. I am not sure this is what I was looking for. Let me try to explain with an example.
    I want a flow like this: DirectoryWatcher (custom consumer; emits paths of modified files) --> something to read the file from the path in the message --> something to read the file contents from the message and emit modified contents --> another read/modify/write step --> .... --> Producer.
    darsenault
    @darsenault
    Based on my work so far with Gollum, the closest thing I can find is perhaps a formatter to get the message, process it, then send the message, but this is not a top-level component. What about attaching a Modulator (formatter) to a Router?
    Basically I want some way to specify a stream to read, enable processing of the message data, then specify a stream to write to, and then chain these into a flow. Thanks in advance for the follow-up!
    Arne Claus
    @arnecls
    Hmm. I would maybe combine the watcher and the file content extraction into one consumer. In the end you care about the content, not the paths. And if you still need the paths later, put them into metadata.
    The content modification should be done by a chain of formatters.
    If you need to move messages to different streams based on content you should use a router.
    So: change detection and reading: consumer. Content modification: formatter. Target decision: router.
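    To make that shape concrete, here is a rough config sketch - consumer.DirectoryWatcher stands in for your hypothetical custom consumer and the formatter chain is only illustrative; a router would additionally sit between the streams if the target depends on the content:
    watcherConsumer:
        type: consumer.DirectoryWatcher     # hypothetical custom consumer: detects changed files, reads their contents, keeps the path in metadata
        Streams: "files"
        Modulators:                         # content modification as a chain of formatters
            - format.Envelope:
                Postfix: "\n"

    producerOut:
        type: producer.Console              # or any other producer listening on the target stream
        Streams: "files"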
    Steven Meadows
    @smeadows53_gitlab
    I'm interested in using gollum for project involving various consumers (file, ftp, http), and producer (kafka, avro, http). There appears to NOT have been any activity for sometime, is this project still an active?