Edward Roper
@eroper

Hi Everyone,
I'm attempting to use the aws_s3 sink with Ceph. The healthcheck passes but it fails to store any objects. On the server-side I'm seeing

'''
2020-08-11 00:52:50.837 7fa599155700 20 get_system_obj_state: s->obj_tag was set empty
2020-08-11 00:52:50.837 7fa599155700 20 Read xattr: user.rgw.idtag
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj recalculating target
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj reading permissions
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj init op
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj verifying op mask
2020-08-11 00:52:50.837 7fa599155700 20 req 2 0.020s s3:put_obj required_mask= 2 user.op_mask=7
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj verifying op permissions
2020-08-11 00:52:50.837 7fa599155700  0 setting obj tags failed with -2210
2020-08-11 00:52:50.837 7fa599155700 20 req 2 0.020s s3:put_obj get_params() returned ret=-22
2020-08-11 00:52:50.837 7fa599155700 20 op->ERRORHANDLER: err_no=-22 new_err_no=-22
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj op status=0
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj http status=400
2020-08-11 00:52:50.837 7fa599155700  1 ====== req done req=0x563d82950720 op status=0 http_status=400 latency=0.0199999s ======
'''

'''
data_dir = "/var/lib/vector"

[sources.journald]
batch_size = 16
current_boot_only = true
type = "journald"

[sinks.ceph]
bucket = "anubis-logs"
endpoint = "https://some.host"
healthcheck = true
inputs = ["journald"]
type = "aws_s3"
buffer.type = "memory"
'''

Any ideas on what might be wrong?

Interestingly, adding tags.Tag1 = "value1" seems to "fix" it.
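For reference, this is the workaround in config form: the same sink from the message above, with a single object tag added (Tag1/value1 is just the example value mentioned):

[sinks.ceph]
bucket = "anubis-logs"
endpoint = "https://some.host"
healthcheck = true
inputs = ["journald"]
type = "aws_s3"
buffer.type = "memory"
# adding at least one object tag appears to avoid the HTTP 400 from Ceph's RGW
tags.Tag1 = "value1"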
Matt Franz
@mdfranz_gitlab
Aug 10 21:30:21.451  INFO vector::sources::docker: Started listening logs on docker container id=7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2
Aug 10 21:30:21.454 TRACE vector::sources::docker: Received one event. event=Log(LogEvent { fields: {"container_created_at": Timestamp(2020-08-11T01:02:56.926028916Z), "container_id": Bytes(b"7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2"), "container_name": Bytes(b"mynginx13"), "image": Bytes(b"nginx"), "label": Map({"maintainer": Bytes(b"NGINX Docker Maintainers <docker-maint@nginx.com>")}), "message": Bytes(b"/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration"), "source_type": Bytes(b"docker"), "stream": Bytes(b"stdout"), "timestamp": Timestamp(2020-08-11T01:30:21.441879555Z)} })
Aug 10 21:30:21.454  WARN sink{name=cw_log type=aws_cloudwatch_logs}: vector::sinks::aws_cloudwatch_logs: keys in stream template do not exist on the event; dropping event. missing_keys=[Atom('host' type=inline)] rate_limit_secs=30
Aug 10 21:30:21.454 TRACE vector::sources::docker: Received one event. event=Log(LogEvent { fields: {"container_created_at": Timestamp(2020-08-11T01:02:56.926028916Z), "container_id": Bytes(b"7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2"), "container_name": Bytes(b"mynginx13"), "image": Bytes(b"nginx"), "label": Map({"maintainer": Bytes(b"NGINX Docker Maintainers <docker-maint@nginx.com>")}), "message": Bytes(b"/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/"), "source_type": Bytes(b"docker"), "stream": Bytes(b"stdout"), "timestamp": Timestamp(2020-08-11T01:30:21.441906500Z)} })
Aug 10 21:30:21.454  WARN sink{name=cw_log type=aws_cloudwatch_logs}: vector::sinks::aws_cloudwatch_logs: "keys in stream template do not exist on the event; dropping event." is being rate limited. rate_limit_secs=5
Are there known issues with docker logs and CloudWatch? (I didn't see anything obvious but could have missed it.)
7 replies
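The warning says the stream template references a host field that docker-source events don't carry. Two hedged sketches of how this could be addressed (the source name "docker" and the group/region values are assumptions, not from the message): either template the stream on a field the docker source does emit, or add a host field before the sink.

# Option A: derive the stream name from a field that exists on docker events
[sinks.cw_log]
type = "aws_cloudwatch_logs"
inputs = ["docker"]
region = "us-east-1"                # placeholder
group_name = "my-group"             # placeholder
stream_name = "{{ container_name }}"
encoding.codec = "json"

# Option B: add a host field so an existing "{{ host }}" stream template resolves
[transforms.add_host]
type = "add_fields"
inputs = ["docker"]
fields.host = "${HOSTNAME}"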
Bumsoo Kim
@bskim45
Hi all, what do you think of using Vector as a forwarder of analytics events from Kafka to Kinesis? Though the docs say Vector is not meant for non-observability data, it seems it would work fine for that simple purpose.
3 replies
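A minimal sketch of such a forwarder, assuming a plain kafka source feeding the aws_kinesis_streams sink (broker, topic, region and stream names are placeholders):

[sources.analytics_in]
type = "kafka"
bootstrap_servers = "kafka-1:9092"
group_id = "vector-kinesis-forwarder"
topics = ["analytics-events"]

[sinks.analytics_out]
type = "aws_kinesis_streams"
inputs = ["analytics_in"]
region = "us-east-1"
stream_name = "analytics-events"
encoding.codec = "json"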
Ghost
@ghost~5cd19fe2d73408ce4fbfa0bf
Is it possible in Vector to add a Lua script that uses the lunajson package?
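The lua transform can require external modules if they are on the Lua search path, so something along these lines might work; a rough sketch only, assuming lunajson is installed on the host and search_dirs points at it:

[transforms.parse_with_lunajson]
type = "lua"
version = "2"
inputs = ["my_source"]                # placeholder
search_dirs = ["/usr/share/lua/5.3"]  # wherever lunajson.lua lives (assumption)
hooks.process = """
function (event, emit)
  local lunajson = require("lunajson")
  -- parse the raw message and attach the result as a nested field
  local parsed = lunajson.decode(event.log.message)
  if parsed ~= nil then
    event.log.parsed = parsed
  end
  emit(event)
end
"""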
Grant Isdale
@grantisdale

Hey all :) When using the Loki sink, I am continuously getting "entry out of order" errors for my logs. Does anyone have experience with this?

I think the problem is that because we deploy Vector as a DaemonSet in our clusters (currently at 4 pods), it's a decentralised deployment, making things tricky for Loki to handle. But we thought adding a unique label to each instance would solve the problem.

I added a unique label for each instance with labels.vector_instance = "${HOSTNAME}" but I'm still getting the same error. I can verify the label exists by inspecting the logs in Grafana, because some are coming through, but most are being rejected!

I've also tried:

labels.vector_instance = {{ host }}
request.in_flight_limit = 1
request.rate_limit_num = 1

3 replies
Felipe Passos
@SharksT
Hello, I have a question: how do I visualize the aggregated data from Vector? I didn't see an option to use Grafana as a sink, for example.
1 reply
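There is no Grafana sink as such; one common pattern (a sketch, not from this thread) is to expose the aggregated metrics through the prometheus sink, scrape that endpoint with Prometheus, and chart it in Grafana:

[sinks.prometheus_out]
type = "prometheus"
inputs = ["my_metrics"]    # placeholder: whatever transform produces the aggregated metrics
address = "0.0.0.0:9598"   # Prometheus scrapes this endpoint; Grafana reads from Prometheus
namespace = "vector"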
Lakshmi-r21
@Lakshmi-r21
Hi team, could you please help us, it's urgent. We are trying to push the Vector logs from account A (where the server exists) to account B (which has the S3 bucket). The file is getting pushed but the canonical ID is not getting added. We tried adding the below part to the config file:

# ACL
acl = "private" # optional, no default
grant_full_control = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_read = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_read_acp = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_write_acp = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default

Even then the canonical ID is not getting added.
sagar-jatti
@sagar-jatti
Hi Team,
We are not able to pass the ACL permissions needed to give the bucket owner full control of pushed objects. Also, I see that the "bucket-owner-full-control" ACL is not listed on your Vector page at https://vector.dev/docs/reference/sinks/aws_s3/#acl. Please let us know how to grant full permissions to the bucket owner (by canonical ID) when Vector pushes data to the S3 bucket. Please respond back ASAP, it is very urgent, we have been stuck on this for almost 3-4 days; your help will be truly appreciated... :-)
25 replies
bhattchaitanya
@bhattchaitanya
In the s3 sink module, is there a way I can specify the original tail file name or parts of the file name as tokens in the s3 prefix path expression?
As of now it seems like only strftime date expressions are allowed
11 replies
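For reference, key_prefix takes event-field templates as well as strftime specifiers, so if the events come from the file source (which adds a file field with the original path) a sketch like this may do it; the bucket and prefix layout are illustrative:

[sinks.s3_out]
type = "aws_s3"
inputs = ["my_file_source"]   # placeholder
bucket = "my-bucket"          # placeholder
# {{ file }} is the path added by the file source; date=%F is a strftime expression
key_prefix = "logs/{{ file }}/date=%F/"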
Matt Franz
@mdfranz_gitlab
Hi, is there a transform that allows only specific fields to be passed on to a sink? I'm looking for a way to reduce the fields from journald (especially all the SYSTEMD* fields). I could use remove_fields, but I would have to specify a huge list. It would be nice to have something like pass_fields that would only output the specified fields.
5 replies
journalctl (which I believe the source calls) does have a --output-fields argument that could solve this as well, but this seems like a more general use case
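One possible workaround while no pass_fields-style transform exists is a lua transform that drops everything outside a whitelist; a sketch only, assuming the v2 Lua API exposes the log event as a plain table:

[transforms.keep_fields]
type = "lua"
version = "2"
inputs = ["journald"]
hooks.process = """
function (event, emit)
  -- keep only these fields; everything else (e.g. the SYSTEMD* fields) is removed
  local keep = { message = true, host = true, timestamp = true, source_type = true }
  for key, _ in pairs(event.log) do
    if not keep[key] then
      event.log[key] = nil
    end
  end
  emit(event)
end
"""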
bhattchaitanya
@bhattchaitanya
I see that the Vector process does not use multiple CPU cores; it gets pegged at 100% and the throughput drops. How do I make Vector use all the CPU cores available on the node?
33 replies
Christof Weickhardt
@somehowchris

Hey there

I got to know vector a few months ago and have been using it personally for all the non-profit projects I work on.
Now I would like to bring its awesome performance to the team I currently work with, but at that scope it's no longer just a small thing.

Our current setup consists of Logstash scaling up or down dynamically depending on the load, and all the apps pushing their Filebeat, Packetbeat, Metricbeat, etc. data (Elastic Stack Beats) to it. As I would like to transition to Vector, my first attempt would be to bring in Vector to replace Logstash, keeping the current setup Elastic Stack compliant while benefiting from Vector's performance.

I again read through some parts of your docs and found some stuff I'm currently concerned about.

One concern would be the at-least-once delivery guarantee: https://vector.dev/docs/about/guarantees/#at-least-once
You state that you should specify some kind of path to have events queued up until they can be delivered once the destination is available again. A nice feature, but does that mean my container has to be stateful?

I thought about having Vector read from a Kafka topic, but that also raised some questions. I read part of the kafka source code but couldn't find a part where the instances talk to each other. Does this mean each instance would consume every event? Is Vector cloud native?

I may be misunderstanding how Vector scales, but I couldn't find a part in the docs explaining these situations.

24 replies
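On the Kafka part specifically: the instances don't need to talk to each other. If every Vector instance consumes with the same consumer group_id, Kafka itself splits the topic's partitions between them, so each event is read by only one instance. A sketch (broker and topic names are placeholders):

[sources.from_kafka]
type = "kafka"
bootstrap_servers = "kafka-1:9092,kafka-2:9092"
group_id = "vector-aggregators"   # same group on every instance, so partitions are shared
topics = ["app-logs"]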
Vlad Pedosyuk
@vpedosyuk
Hi, I'd like to clarify batching in the kafka sink: the docs clearly state that it doesn't batch data and sends it event by event. Does this mean that the batch.num.messages / queue.buffering.max.ms / batch.size / etc. librdkafka parameters don't work in Vector?
7 replies
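For reference, the kafka sink has a librdkafka_options table that passes settings straight through to librdkafka, so those producer batching knobs can at least be set there; a sketch with illustrative values:

[sinks.to_kafka]
type = "kafka"
inputs = ["my_source"]              # placeholder
bootstrap_servers = "kafka-1:9092"  # placeholder
topic = "out-topic"
encoding.codec = "json"
# librdkafka producer settings, passed through as strings
librdkafka_options."queue.buffering.max.ms" = "50"
librdkafka_options."batch.num.messages" = "1000"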
Ashwanth Goli
@iamashwanth
@jszwedko Does vector provide init.d files for debian/ubuntu distributions?
5 replies
davidconnett-splunk
@davidconnett-splunk
Hey Everyone,
I am planning on writing a blog on Vector, I was wondering if I can use screenshots of your webpage on a publicly facing site.
5 replies
Ayush Goyal
@perfectayush

Is there a way to send internal_metrics to the statsd sink via Vector? I tried it with this config:
https://gist.github.com/perfectayush/3aaec7fd63439be69f07440f72b80ca6

But when listening via netcat on 8125, all I am getting is the namespace, not the metrics.

3 replies
ShadowNet
@shadownetro_twitter
Hello, I'm having trouble forcing a lowercase index name in my vector.toml config. I'm getting this error: ElasticSearch error response err_type=invalid_index_name_exception reason=Invalid index name [application-CRON-2020-08-19.{lc_identifier}], must be lowercase
Sorry for the double post. Is there a fast way of solving this? Thx
Jesse Szwedko
@jszwedko
@shadownetro_twitter I think the only way to do that right now might be the lua transform. I'll open an issue for this to track. There is some work happening right now around field transformations that I think this could fit into
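A rough sketch of that lua-transform approach: lowercase the field before it is used in the index template (the field, host, and index names here are illustrative, not taken from the thread):

[transforms.lowercase_index_field]
type = "lua"
version = "2"
inputs = ["my_source"]   # placeholder
hooks.process = """
function (event, emit)
  if event.log.lc_identifier ~= nil then
    event.log.lc_identifier = string.lower(event.log.lc_identifier)
  end
  emit(event)
end
"""

[sinks.es]
type = "elasticsearch"
inputs = ["lowercase_index_field"]
host = "http://localhost:9200"   # placeholder
index = "application-{{ lc_identifier }}-%Y-%m-%d"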
Jesse Szwedko
@jszwedko
Issue: timberio/vector#3496
ShadowNet
@shadownetro_twitter
thank you

Ghost
@ghost~5cd19fe2d73408ce4fbfa0bf
Is the sink for BigQuery ready?
Jesse Szwedko
@jszwedko
not yet, but there is an open PR for it: timberio/vector#1951
Jonathan Endy
@jonathan.endy.csr_gitlab

Hi All,
Hope you can help me, I'm trying to stream data from Kafka to GCS.
The requirement is to create an object for each event from Kafka, with the object name composed from content in the event.
The first question: is it possible not to use the batch option (or to batch 1)?
Second, I think I saw that it's possible to reference all fields; can I use conversion and split a date out of one field?
Third, if I'm reading from Kafka, can I skip the disk buffer and still achieve at-least-once?

Thank you all!

11 replies
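On the first question, a sketch of forcing (roughly) one object per event and templating the object name from event content; the field, bucket, and credential paths are placeholders, and it assumes the gcp_cloud_storage sink honors batch.max_events = 1:

[sinks.gcs]
type = "gcp_cloud_storage"
inputs = ["from_kafka"]                    # placeholder
bucket = "my-bucket"                       # placeholder
credentials_path = "/etc/vector/gcp.json"  # placeholder
# object name built from an event field; batches capped at a single event
key_prefix = "events/{{ event_id }}/"
batch.max_events = 1
encoding.codec = "ndjson"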
夜读书
@db2jlu_twitter
Hello all, I met the error below, could you please have a look? Thanks.
Aug 23 02:28:47.114 ERROR sink{name=clickhouse-apilog type=clickhouse}:request{request_id=212}: vector::sinks::util::sink: Response wasn't successful. response=Response { status: 400, version: HTTP/1.1, headers: {"date": "Sun, 23 Aug 2020 02:28:47 GMT", "connection": "Keep-Alive", "content-type": "text/tab-separated-values; charset=UTF-8", "x-clickhouse-server-display-name": "master-01", "transfer-encoding": "chunked", "x-clickhouse-query-id": "1188cca8-94ef-4b63-b3c9-19c7771ee72b", "x-clickhouse-format": "TabSeparated", "x-clickhouse-timezone": "UTC", "x-clickhouse-exception-code": "26", "keep-alive": "timeout=3", "x-clickhouse-summary": "{\"read_rows\":\"0\",\"read_bytes\":\"0\",\"written_rows\":\"0\",\"written_bytes\":\"0\",\"total_rows_to_read\":\"0\"}"}, body: b"Code: 26, e.displayText() = DB::Exception: Cannot parse JSON string: expected opening quote: (while read the value of key consumer.created_at): (at row 19)\n (version 20.6.3.28 (official build))\n" }
夜读书
@db2jlu_twitter
Seems the clickhouse sink doesn't support metrics, could I know the reason? Thanks!
Jesse Szwedko
@jszwedko

@db2jlu_twitter I'm not super familiar with Clickhouse, but there is an open issue for metrics support: timberio/vector#3435 . It may just not be implemented yet.

Looking at that though, are you sure that's the reason? It seems like it might be a mismatch in the schema or datatypes in clickhouse or, possibly, that vector is sending invalid JSON

夜读书
@db2jlu_twitter
@jszwedko sorry, those are two different questions. For the first question, I checked the ClickHouse logs; it seems to happen on the Vector side only, not on the ClickHouse side, maybe special characters? Not sure. For the second question, that issue was opened by me; I hope the feature can be implemented, Vector is so cool! Thank you again!
夜读书
@db2jlu_twitter
@jszwedko btw, what is the main difference between storing metrics and storing logs in a sink?
Jay Fenton
@jfenton

I just posted a blog about Vector: https://www.splunk.com/en_us/blog/it/meet-the-fastest-forwarder-on-the-net.html

huh...Splunk pulled the article?

3 replies
Liran Albeldas
@albeldas
Hi,
I'm trying to deploy Vector as a DaemonSet (Helm) and having some trouble with filter conditions.
I tried to add the namespace before with _ and / but it doesn't work.
If I remove the filter condition, all containers' logs go out to the console.
My pod label: app=liran-demo, namespace: demo
transforms:
   "liran-demo-logs":
     type: filter
     inputs: ["kubernetes_logs"]
     rawConfig: |
      [transforms.liran-demo-logs.condition]
      "kubernetes.pod_labels.component.eq" = "app=liran-demo"
        "stream.eq" = "stdout"

sinks:
   console:
     type: "console"
     inputs: ["liran-demo-logs"]
     target: "stdout"
     rawConfig: |
      # Encoding
      encoding.codec = "json" # required
1 reply
Liran Albeldas
@albeldas
Never mind, I had a misconfiguration in my labels, everything works.
jsomwaru
@jsomwaru
I have an issue where the s3 sink can't verify the SSL certificate of the S3 endpoint. I've looked in the docs and I can't find anything about it.
WARN sink{name=meraki_dump type=aws_s3}:request{request_id=2}: vector::sinks::util::retries2: retrying after error: Error during dispatch: error trying to connect: the handshake failed: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1915:: unable to get local issuer certificate
Is anyone aware of a workaround for this?
Liran Albeldas
@albeldas
Hi,
Which sink is the right one to send logs to Logstash?
1 reply
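There is no dedicated Logstash sink; one approach (a sketch, assuming Logstash is running a tcp input with a JSON codec on the port below) is the socket sink:

[sinks.to_logstash]
type = "socket"
inputs = ["my_source"]                 # placeholder
mode = "tcp"
address = "logstash.example.com:5000"  # placeholder: Logstash's tcp input
encoding.codec = "json"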
Andrey Afoninsky
@afoninsky

hello
I have a lot of spam messages after installing helm chart "vector-0.11.0-nightly-2020-08-24":

Aug 25 13:34:06.533  WARN source{name=kubernetes_logs type=kubernetes_logs}: vector::internal_events::kubernetes_logs: failed to annotate event with pod metadata event=Log(LogEvent { fields: {"file": Bytes(b"/var/log/pods/vector_cluster-logs-chf8d_290b7ab5-9752-49f1-81d7-cc9a51483c4d/vector/2.log"), "message": Bytes(b"{\"log\":\"Aug 25 13:19:17.029  INFO source{name=kubernetes type=kubernetes}:file_server: file_source::file_server: More than one file has same fingerprint. path=\\\"/var/log/pods/jaeger_jaeger-cassandra-2_3d357498-7fd7-448e-a0d7-54b8922b0050/jaeger-cassandra/6.log\\\" old_path=\\\"/var/log/pods/jaeger_jaeger-cassandra-2_3d357498-7fd7-448e-a0d7-54b8922b0050/jaeger-cassandra/5.log\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2020-08-25T13:19:17.02974474Z\"}"), "source_type": Bytes(b"kubernetes_logs"), "timestamp": Timestamp(2020-08-25T13:34:06.533091773Z)} })

config:

  kubernetesLogsSource:
    enabled: true
    sourceId: kubernetes_logs
  env:
    - name: LOGGLY_TOKEN
      value: ****-****-****-****-****
  sinks:
    # console:
    #   type: console
    #   inputs: ["kubernetes_logs"]
    #   rawConfig: |
    #     encoding.codec = "json"
    loggly:
      type: http
      inputs: ["kubernetes_logs"]
      rawConfig: |
        uri = "https://logs-01.loggly.com/bulk/${LOGGLY_TOKEN}/tag/olly,dev,k8s/"
        batch.max_size = 50000
        encoding.codec = "ndjson"

Should I create an issue, or is it already known and/or fixed? Thanks.

1 reply
Binary Logic
@binarylogic
@afoninsky please open an issue and we'll get the right person on it.
Jesse Orr
@jesseorr
Hello, should Vector be fingerprinting inputs from the file source when they are older than the ignore_older value?
I have an application that logs to many new files, so I have an arbitrarily low ignore_older value to limit the scope of what Vector sees, but I am running into issues with it opening too many files.
[sources.access-raw]
  # General
  type = "file"
  ignore_older = 300
  include = ["/var/log/od/access_*.log"]
  start_at_beginning = false
  oldest_first = true
  fingerprinting.strategy = "checksum"
  fingerprinting.ignored_header_bytes = 2048
  fingerprinting.fingerprint_bytes = 4096

Aug 25 14:39:14 vm8857 vector: Aug 25 14:39:14.117 ERROR source{name=access-raw type=file}:file_server: file_source::file_server: Error reading file for fingerprinting err=Too many open files (os error 24) file="/var/log/od/access_2020-02-24_13-53-24_pid_2074.log"
I could change max_open_files, which is limited to 1024 for the vector user, but it seems odd to have to do such a thing when only one log file at a time is being written.
Jesse Szwedko
@jszwedko
I tried this out. It looks like it isn't fingerprinting it, but I do see that it maintains an open file handle even if the file is older than the cutoff. I'll open an issue to see if this is expected
Jesse Orr
@jesseorr
Interesting, good to know that I'm not 100% crazy. Thank you Jesse =)
Mark Klass
@ChristianKlass
Hi, I'm trying to send logs to Loki, and it works, but I've only got one label (agent="vector") for every log. I've noticed there's a labels.key field in the configuration demo. What are they for, and how do I use them? Can I use them to tag my logs?
[sinks.loki]
  # General
  type = "loki" # required
  inputs = ["cleaned_traefik_logs"]
  endpoint = "http://loki:3100" # required
  healthcheck = true # optional, default

  # Encoding
  encoding.codec = "json" # optional, default

  # Labels
  labels.key = "value" # I'm not sure what this does
  labels.key = "{{ event_field }}" # nor this
4 replies
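For reference, the key part of labels.key is arbitrary: each key under labels.* becomes a Loki label name, and the value can be a fixed string or a {{ field }} template, so yes, they can be used to tag logs. A small sketch with the same sink (the label names are illustrative):

[sinks.loki]
type = "loki"
inputs = ["cleaned_traefik_logs"]
endpoint = "http://loki:3100"
encoding.codec = "json"

# each key under labels.* becomes a Loki label on the pushed streams
labels.app = "traefik"              # static label value
labels.source_host = "{{ host }}"   # label value taken from the event's host field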
alpi-ua
@alpi-ua
Hello!
Can someone help? I have a bug with Vector on SUSE: it doesn't clean its buffer, and I have plenty of files stored on the host after they have been sent to the server.
6 replies