Anderson Ferneda
@dersonf
I made something using regex_parser:
[transforms.filein_regex_message]
type = "regex_parser"
inputs = ["file"]
drop_field = true
field = "message"
patterns = ['^(?P<day>[\d-]+) (?P<hour>[\d:,\d]+) (?P<loglevel>.) {"key":"(?P<key>.)",(?P<key2>.*)}$']
When I try to use \ to have it removed I receive an error. I'm probably doing something wrong, but I couldn't find out what.
Anderson Ferneda
@dersonf
Thanks, I found the problem: the "\" was just a newline. If I remove it and use another regex, the error goes away.
Ashwanth Goli
@iamashwanth

Hi everyone, I came across vector recently and am thinking of replacing my existing filebeat + logstash pipeline with it. For some reason, I am not able to get multi-line parsing working.

I am trying to capture lines between tokens TESTS and TESTC, but vector is dumping all the lines to the sink. What am I doing wrong here?

[sources.test_run_log]
  # General
  type = "file"
  ignore_older = 3600
  include = ["/path_to_log.log"]
  start_at_beginning = false

  # Priority
  oldest_first = true

  [sources.test_run_log.multiline]
    start_pattern = ".*TESTS.*"
    mode = "halt_with"
    condition_pattern = ".*TESTE.*"
    timeout_ms = 1000
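For comparison, here is a minimal sketch of the multiline block with the condition pattern matching the TESTC token described above (this assumes TESTC, not TESTE, is the intended terminating token; all other values are taken from the config as posted):

  [sources.test_run_log.multiline]
    # aggregate from a line matching start_pattern until a line matching condition_pattern
    start_pattern = ".*TESTS.*"
    mode = "halt_with"
    condition_pattern = ".*TESTC.*"
    timeout_ms = 1000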
3 replies
Bindu-Mawat
@Bindu-Mawat
Hi
I am seeing this error when configuring http as the source:
Aug 05 21:55:03.249 ERROR vector: Configuration error: "/etc/vector/vector.toml": unknown variant http, expected one of docker, file, journald, kafka, kubernetes, logplex, prometheus, socket, splunk_hec, statsd, stdin, syslog, vector for key sources.bindu-in.type
^C
[1]+ Exit 78
Jesse Szwedko
@jszwedko
@Bindu-Mawat that should work. What version of vector are you using?
Bindu-Mawat
@Bindu-Mawat
Hi I am using vector 0.8.2 (v0.8.2 x86_64-unknown-linux-musl 2020-03-06)
Jesse Szwedko
@jszwedko
It's possible that version did not have the http source, let me check. The current version is 0.10.0 if you are able to upgrade
@Bindu-Mawat that source was added in 0.9.0. I would upgrade to the latest though, 0.10.0
Bindu-Mawat
@Bindu-Mawat
Thanks Jesse. I'll see if it is possible for me.
Grant Isdale
@grantisdale

Hey all,

AWS S3 Sink Q:

When using server_side_encryption = "aws:kms" I am trying to pass the relevant ssekms_key_id but the key exists in a different account (alongside the S3 bucket) from where the cluster itself exists.

I have used the assume_role key to assume a role in the target account (where the S3 bucket lives); this works for the aws_cloudwatch_logs sink: by assuming a role in another account it 'knows' to look for the specific log group in the target account, not the account that the cluster is running in. But I'm currently getting an error because vector is unable to find the KMS key: it is looking in the account where the cluster exists, not the account where the assumed role exists.
Is there something I should be doing differently? Is this generally possible for KMS keys the way it is for cloudwatch log groups?
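For reference, a sketch of the sink options under discussion, with placeholder input name, bucket, role ARN, and key ARN (this only illustrates where the options sit; it is not a confirmed fix for the cross-account lookup):

[sinks.s3_cross_account]
type = "aws_s3"
inputs = ["my_source"]            # hypothetical input
bucket = "target-account-bucket"
region = "us-east-1"
# role in the account that owns the bucket and the KMS key
assume_role = "arn:aws:iam::111111111111:role/vector-writer"
server_side_encryption = "aws:kms"
# key ID or full ARN of the KMS key in the target account
ssekms_key_id = "arn:aws:kms:us-east-1:111111111111:key/00000000-0000-0000-0000-000000000000"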

5 replies
夜读书
@db2jlu_twitter
hi all, is it possible to transform and parse multiple json fields? thanks!
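One common way to do this is to chain json_parser transforms, one per field; a minimal sketch with hypothetical source, transform, and field names:

[transforms.parse_payload]
type = "json_parser"
inputs = ["my_source"]       # hypothetical source
field = "payload"            # first JSON string field to parse
drop_invalid = false

[transforms.parse_metadata]
type = "json_parser"
inputs = ["parse_payload"]   # chained after the first parser
field = "metadata"           # second JSON string field to parse
drop_invalid = false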
3 replies
Ashwanth Goli
@iamashwanth

@jszwedko Is it possible to embed the entire event.log as a json value while writing to a sink?

Kafka REST proxy expects the records in the following format. I am having trouble connecting the HTTP sink to my Kafka REST endpoint because of this.

{
   "records": [
        {"value": event.log}
    ]
}
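One possible approach (only a sketch, assuming the lua transform's version 2 API is available in your release and that nested tables round-trip the way shown) is to rewrite the event so the original log sits under records[0].value, then let the HTTP sink's encoding.codec = "json" serialize it:

[transforms.wrap_for_rest_proxy]
type = "lua"
version = "2"
inputs = ["my_source"]   # hypothetical input
source = """
function process(event, emit)
  -- wrap the original log event in the structure the Kafka REST proxy expects
  event.log = { records = { { value = event.log } } }
  emit(event)
end
"""
hooks.process = "process"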
13 replies
Ghost
@ghost~5cd19fe2d73408ce4fbfa0bf
Hi, I have a problem with logstash JSON.
Can I save part of the log, like the path and the response status, to gcp_stackdriver_logs?
mcgfenwick
@mcgfenwick
What's the underlying data type for the gauge metric? I'm getting a lot of output values of zero; the input values are quite large.
Jesse Szwedko
@jszwedko
@mcgfenwick the internal representation is a 64 bit float
mcgfenwick
@mcgfenwick
Hmm, ok, then my problem is something else. Thanks
mcgfenwick
@mcgfenwick
My problem appears to be in the way the prometheus sink handles my metrics. The code generates about 20 metrics per second, but when I scrape the prometheus sink, I appear to get only the last metric generated, rather than an average or something more useful. Is there a way to change this?
Jesse Szwedko
@jszwedko
Gauges typically represent the current value of a metric. Vector is capable of aggregating samples into histograms. I'm not seeing a way to have it take an average though
What is the metric? That would help inform the best representation for it
mcgfenwick
@mcgfenwick
It's a count of packets; not exactly sure if it's a per-second value or some other interval, it would require quite a bit of digging to find that out. The value varies from billions to zero.
mcgfenwick
@mcgfenwick
But my question is really more about how the prometheus sink deals with the metrics it receives.
Jesse Szwedko
@jszwedko
If I had to guess, you probably want to represent it as a counter. The Prometheus sink just exposes a scrape endpoint for Prometheus. You can curl it yourself to see the values
(on my phone or I'd dig up more resources)
mcgfenwick
@mcgfenwick
ack
guy
@guy1976_gitlab
question: how can I run a different regex_parser transform depending on the pod label?
Rick Richardson
@rrichardson
@guy1976_gitlab - create a filter that only accepts that pod label, then use that filter as the input for your regex_parser transform
Jesse Szwedko
@jszwedko
You might also be interested in the swimlanes transform https://vector.dev/docs/reference/transforms/swimlanes/
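A minimal sketch of the filter-then-parse approach described above, assuming the pod label lands in a field called pod_labels.app (the actual field name depends on your source and version), with hypothetical transform names and pattern:

[transforms.only_nginx]
type = "filter"
inputs = ["kubernetes_logs"]             # hypothetical source
condition."pod_labels.app.eq" = "nginx"  # keep only events from pods labelled app=nginx

[transforms.nginx_regex]
type = "regex_parser"
inputs = ["only_nginx"]
field = "message"
patterns = ['^(?P<remote_addr>\S+) (?P<rest>.*)$']  # hypothetical pattern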
Ayush Goyal
@perfectayush
Hi, I am running into an issue with vector where it's not closing file descriptors on rotation of files via logrotate (file source). These are nginx logs. This is happening for already rotated *.access.log.1 files which are rotated a second time, to *.access.log.2.gz. These deleted file descriptors accumulate over a period of time and we have to restart vector to fix disk alerts. Fingerprinting is currently configured with the checksum strategy, with the file source configured to check only for *.access.log files.
7 replies
夜读书
@db2jlu_twitter
hello all, is it possible for the clickhouse sink to store metrics? thanks!
Edward Roper
@eroper

Hi Everyone,
I'm attempting to use the aws_s3 sink with Ceph. The healthcheck passes but it fails to store any objects. On the server-side I'm seeing

2020-08-11 00:52:50.837 7fa599155700 20 get_system_obj_state: s->obj_tag was set empty
2020-08-11 00:52:50.837 7fa599155700 20 Read xattr: user.rgw.idtag
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj recalculating target
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj reading permissions
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj init op
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj verifying op mask
2020-08-11 00:52:50.837 7fa599155700 20 req 2 0.020s s3:put_obj required_mask= 2 user.op_mask=7
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj verifying op permissions
2020-08-11 00:52:50.837 7fa599155700  0 setting obj tags failed with -2210
2020-08-11 00:52:50.837 7fa599155700 20 req 2 0.020s s3:put_obj get_params() returned ret=-22
2020-08-11 00:52:50.837 7fa599155700 20 op->ERRORHANDLER: err_no=-22 new_err_no=-22
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj op status=0
2020-08-11 00:52:50.837 7fa599155700  2 req 2 0.020s s3:put_obj http status=400
2020-08-11 00:52:50.837 7fa599155700  1 ====== req done req=0x563d82950720 op status=0 http_status=400 latency=0.0199999s ======

data_dir = "/var/lib/vector"

[sources.journald]
batch_size = 16
current_boot_only = true
type = "journald"

[sinks.ceph]
bucket = "anubis-logs"
endpoint = "https://some.host"
healthcheck = true
inputs = ["journald"]
type = "aws_s3"
buffer.type = "memory"

Any ideas on what might be wrong?

Interestingly, the addition of tags.Tag1 = "value1" seems to "fix" it.
Matt Franz
@mdfranz_gitlab
Aug 10 21:30:21.451  INFO vector::sources::docker: Started listening logs on docker container id=7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2
Aug 10 21:30:21.454 TRACE vector::sources::docker: Received one event. event=Log(LogEvent { fields: {"container_created_at": Timestamp(2020-08-11T01:02:56.926028916Z), "container_id": Bytes(b"7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2"), "container_name": Bytes(b"mynginx13"), "image": Bytes(b"nginx"), "label": Map({"maintainer": Bytes(b"NGINX Docker Maintainers <docker-maint@nginx.com>")}), "message": Bytes(b"/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration"), "source_type": Bytes(b"docker"), "stream": Bytes(b"stdout"), "timestamp": Timestamp(2020-08-11T01:30:21.441879555Z)} })
Aug 10 21:30:21.454  WARN sink{name=cw_log type=aws_cloudwatch_logs}: vector::sinks::aws_cloudwatch_logs: keys in stream template do not exist on the event; dropping event. missing_keys=[Atom('host' type=inline)] rate_limit_secs=30
Aug 10 21:30:21.454 TRACE vector::sources::docker: Received one event. event=Log(LogEvent { fields: {"container_created_at": Timestamp(2020-08-11T01:02:56.926028916Z), "container_id": Bytes(b"7d704b317c21a893a26de131a0495e08ef39ee5144e0a743e23a6027c85316e2"), "container_name": Bytes(b"mynginx13"), "image": Bytes(b"nginx"), "label": Map({"maintainer": Bytes(b"NGINX Docker Maintainers <docker-maint@nginx.com>")}), "message": Bytes(b"/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/"), "source_type": Bytes(b"docker"), "stream": Bytes(b"stdout"), "timestamp": Timestamp(2020-08-11T01:30:21.441906500Z)} })
Aug 10 21:30:21.454  WARN sink{name=cw_log type=aws_cloudwatch_logs}: vector::sinks::aws_cloudwatch_logs: "keys in stream template do not exist on the event; dropping event." is being rate limited. rate_limit_secs=5
Are there known issues with docker logs and cloudwatch? (I didn't see anything obvious but could have missed it.)
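Not necessarily a known issue, but the warning above says the cloudwatch stream_name template references a host field that the docker source events don't carry. One possible workaround (a sketch; the transform name, source name, and environment variable are hypothetical) is to add the field before the sink:

[transforms.add_host]
type = "add_fields"
inputs = ["docker"]           # hypothetical name of the docker source
fields.host = "${HOSTNAME}"   # populate the field the stream template expects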
7 replies
Bumsoo Kim
@bskim45
Hi all, what do you think of using vector as a forwarder of analytics events from Kafka to Kinesis? Though the docs say vector is not for non-observability logs, it seems vector would work fine for that simple purpose.
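A minimal sketch of that kind of pass-through pipeline, with placeholder brokers, topic, region, and stream name:

[sources.analytics_in]
type = "kafka"
bootstrap_servers = "kafka-1:9092,kafka-2:9092"
group_id = "vector-kinesis-forwarder"
topics = ["analytics-events"]

[sinks.analytics_out]
type = "aws_kinesis_streams"
inputs = ["analytics_in"]
region = "us-east-1"
stream_name = "analytics-events"
encoding.codec = "json"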
3 replies
Ghost
@ghost~5cd19fe2d73408ce4fbfa0bf
Is it possible in vector to add a Lua script that uses the lunajson package?
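Possibly; a rough sketch, assuming the lua transform's version 2 API and a lunajson.lua available on the module search path (the directory, input name, and fields are hypothetical):

[transforms.lua_json]
type = "lua"
version = "2"
inputs = ["my_source"]             # hypothetical input
search_dirs = ["/etc/vector/lua"]  # directory containing lunajson.lua
source = """
local lunajson = require("lunajson")

function process(event, emit)
  -- parse a JSON string field with lunajson and store the result in a new field
  local ok, parsed = pcall(lunajson.decode, event.log.message)
  if ok then
    event.log.parsed = parsed
  end
  emit(event)
end
"""
hooks.process = "process"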
Grant Isdale
@grantisdale

Hey all :) When using the Loki sink, I am continuously getting "entry out of order" errors for my logs. Does anyone have any experience with this?

I think the problem is that because we deploy vector as a daemonset in our clusters (currently at 4 pods) it's a decentralised deployment, which makes things tricky for loki to handle. But we thought adding a unique label to each instance would solve the problem.

I added a unique label for each instance with labels.vector_instance = "${HOSTNAME}" but I'm still getting the same error. I can verify the label exists by inspecting the logs in grafana, because some are coming through, but most are being rejected!

I've also tried:

labels.vector_instance = {{ host }}
request.in_flight_limit = 1
request.rate_limit_num = 1

3 replies
Felipe Passos
@SharksT
Hello, I have a question: how do I visualize the aggregated data from vector? I didn't see an option to use grafana as a sink, for example.
1 reply
Lakshmi-r21
@Lakshmi-r21
Hi team, could you please help us, it's urgent. We are trying to push the vector logs from account A (where the server exists) to account B (which has the S3 bucket). The file is getting pushed but the canonical ID is not getting added. We tried adding the below part to the config file:

# ACL
acl = "private" # optional, no default
grant_full_control = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_read = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_read_acp = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default
grant_write_acp = "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be" # optional, no default

Even then the canonical ID is not getting added.
sagar-jatti
@sagar-jatti
Hi Team,
We are not able to pass the ACL permissions to give the bucket owner full control when pushing objects to the S3 bucket. Also, I can see that the "bucket-owner-full-control" ACL is not present on your Vector page at https://vector.dev/docs/reference/sinks/aws_s3/#acl. Please let us know how to grant full permissions to the bucket owner (via canonical ID) when Vector pushes data to the S3 bucket. Please respond ASAP, it is very urgent; we have been stuck on this for almost 3-4 days. Your help will be truly appreciated... :-)
25 replies
bhattchaitanya
@bhattchaitanya
In the s3 sink module, is there a way I can specify the original tailed file name, or parts of the file name, as tokens in the s3 prefix path expression?
As of now it seems like only strftime date expressions are allowed.
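If key_prefix accepts event-field templates in your version (worth verifying against the template-syntax section of the docs for your release), a sketch could look like the following, using the file field that the file source attaches to each event; the sink name, bucket, and input are placeholders:

[sinks.s3_out]
type = "aws_s3"
inputs = ["my_files"]
bucket = "my-bucket"
key_prefix = "logs/{{ file }}/date=%F/"  # event-field template combined with strftime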
11 replies
Matt Franz
@mdfranz_gitlab
Hi, is there a transform that allows only specific fields to be passed to a sink? I'm looking for a way to reduce the fields from journald (especially all the SYSTEMD* fields). I could use remove_fields, but I would have to specify a huge list. It would be nice to have something like pass_fields (that would only output those specific fields).
5 replies
journalctl (which I believe the source calls) does have a --output-fields argument that could solve this as well, but this seems like a more general use case.
bhattchaitanya
@bhattchaitanya
I see that the vector process does not use multiple CPU cores: it gets pegged at 100% and the throughput drops. How do I make vector use all the CPU cores available on the node?
33 replies
Christof Weickhardt
@somehowchris

Hey there

I got to know vector a few months ago and have been using it personally for all the non-profit projects I work on.
Now I would like to bring its awesome performance to the team I currently work at, but at that scope it's no longer just a small thing.

Our current setup consists of logstash scaling up or down dynamically depending on the load, with all the apps pushing their filebeat, packetbeat, metricbeat, etc. data (elastic stack beats) to it. As I would like to transition to vector, my first attempt would be to bring in vector as a replacement for logstash, keeping the current setup elastic-stack compliant while benefiting from the performance of vector.

I read through some parts of your docs again and found some things I'm currently concerned about.

One would be the at-least-once delivery guarantee: https://vector.dev/docs/about/guarantees/#at-least-once
You state that you should specify some kind of path to have events queued up until they can be delivered once the destination is available again. A nice feature, but does that mean my container has to be stateful?

I thought about having vector read from a kafka topic, but that also raised some questions. I read part of the kafka source code but couldn't find a part where the instances talk to each other. Does this mean each instance would consume every event? Is vector cloud native?

I may be misunderstanding some aspects of scaling vector, but I couldn't find a part in the docs explaining these situations.
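For the kafka and buffering questions specifically, here is a sketch of the relevant knobs, with placeholder names throughout: the kafka source takes a group_id (a consumer group, which is how multiple instances would share a topic's partitions), and a sink can be given a disk buffer, which is the part that requires a persistent data_dir or volume:

[sources.events_in]
type = "kafka"
bootstrap_servers = "kafka-1:9092"   # placeholder brokers
group_id = "vector-consumers"        # instances sharing this group split the partitions
topics = ["app-logs"]

[sinks.es_out]
type = "elasticsearch"               # illustrative; option names vary by sink and version
inputs = ["events_in"]
host = "http://elasticsearch:9200"   # may be named endpoint in newer releases

  [sinks.es_out.buffer]
  type = "disk"                      # spills to data_dir, hence the statefulness question
  max_size = 104900000               # bytes
  when_full = "block"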

24 replies
Vlad Pedosyuk
@vpedosyuk
Hi, I'd like to clarify batching in the kafka sink: it's clearly stated in the docs that it doesn't batch data and sends it event by event. Does this mean that the batch.num.messages / queue.buffering.max.ms / batch.size / etc. librdkafka parameters don't work in Vector?
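For reference, a sketch of how librdkafka settings are passed through the kafka sink's librdkafka_options table (values are strings; the input, brokers, and topic are placeholders, and whether vector's own send path honors these batching settings is exactly the question above):

[sinks.kafka_out]
type = "kafka"
inputs = ["my_source"]
bootstrap_servers = "kafka-1:9092"
topic = "output-topic"
encoding.codec = "json"

  [sinks.kafka_out.librdkafka_options]
  "queue.buffering.max.ms" = "50"
  "batch.num.messages" = "10000"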
7 replies