valerypetrov
@valerypetrov
The avg events/sec is around 170,000-200,000 and bytes processed/sec is around 100-120 MB
Ana Hobden
@Hoverbear
I'm thinking we should add an encoding option to the socket sink, maybe? That way we could eat it off the wire as JSON, which should help a bit.
valerypetrov
@valerypetrov
Avg combined output to 2 kafka topics is 50,000-60,000 msg/sec.
Luke Steensen
@lukesteensen
@valerypetrov which version of vector are you on? we recently made some changes that should increase performance when you're dealing with a large number of fields
but it's still an area we're focusing on and expect to be able to improve significantly
valerypetrov
@valerypetrov
@lukesteensen , version="0.9.1" git_version="v0.9.1" released="Thu, 30 Apr 2020 15:51:58 +0000" arch="x86_64"
Luke Steensen
@lukesteensen
got it. two other things i'm curious about: how many cores the machine has that you're running on (or if you've limited vector to a certain number of threads, how many), and roughly how many concurrent incoming tcp connections to the source?
valerypetrov
@valerypetrov
@lukesteensen , Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (32 cores) , around 400-500 tcp connections
Luke Steensen
@lukesteensen
great, thank you! sounds like some of our upcoming improvements should help your situation quite a bit
valerypetrov
@valerypetrov
Sounds good, thanks!
logabot
@logabot
hello guys, where is the mistake?
type = "regex_parser"
drop_field = true
field = "message"
regex = "^(?P<timestamp>[\\S ]+) \\[(?<pid>[0-9]+)\\] (?P<log_type>[\\S]+): (?P<message>.*)$"
in logs:
May 15 19:35:32 pgsql-01 vector: May 15 19:35:32.537 INFO vector: Vector is starting. version="0.9.1" git_version="v0.9.1" released="Thu, 30 Apr 2020 15:51:58 +0000" arch="x86_64"
May 15 19:35:32 pgsql-01 vector: May 15 19:35:32.538 ERROR vector::topology: Configuration error: Transform "parse_postgres_log": Invalid regular expression: regex parse error:
May 15 19:35:32 pgsql-01 vector: ^(?P<timestamp>[\S ]+) \[(?<pid>[0-9]+)\] (?P<log_type>[\S]+): (?P<message>.*)$
May 15 19:35:32 pgsql-01 vector: ^
the regex validator looks good: https://regex101.com/r/FH1mRz/1
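Rust's regex crate, which the regex_parser transform uses, only accepts the `(?P<name>...)` syntax for named capture groups, while regex101 defaults to PCRE, which also allows the `(?<name>...)` shorthand used here for the pid group. A sketch of the corrected transform (note the added `P` in the pid group):

```toml
[transforms.parse_postgres_log]
  type = "regex_parser"
  drop_field = true
  field = "message"
  regex = "^(?P<timestamp>[\\S ]+) \\[(?P<pid>[0-9]+)\\] (?P<log_type>[\\S]+): (?P<message>.*)$"
```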
lumendes
@lumendes
I am having trouble trying to parse a timestamp with this format: 2020-05-16 18:29:43,501.
Coercing via the transform or via types always fails and the field is lost in the process. The only workaround so far is to remove, via regex, the last 3 digits and the comma.
But then you lose some precision in the timestamp and it will affect sorting, even if minimally. Has anyone faced a similar issue?
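Vector's type coercion parses timestamps with chrono strftime specifiers, and chrono's `%3f` matches a 3-digit fraction without a leading dot, so the comma can be written literally into the format string. A minimal sketch, assuming a coercer transform and a field named timestamp (the transform and input names are hypothetical):

```toml
[transforms.coerce_ts]
  type = "coercer"
  inputs = ["my_source"]  # hypothetical input name
  # literal comma, then %3f for the 3-digit millisecond fraction
  types.timestamp = "timestamp|%Y-%m-%d %H:%M:%S,%3f"
```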
Andrey Afoninsky
@afoninsky
hi all, a quick question: is vector ready for collecting logs from k8s pods? is there a best practice to do it?
asvolodin
@asvolodin

Hello. Can you help me with a newbie question?
I configured the file input like this:

[sources.log_sources_bolid]
type = "file"
include = ["Z:/QuarantineReports/vectorinput/bolid/*.txt"]
start_at_beginning = false
ignore_older = 86400

but vector still watches old files

May 18 14:32:37.552 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-13 bolid_log.txt" file_position=414508
May 18 14:32:37.568 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-14 bolid_log.txt" file_position=1772364
May 18 14:32:37.568 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-15 bolid_log.txt" file_position=1866780
May 18 14:32:37.568 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-16 bolid_log.txt" file_position=1228360
May 18 14:32:37.568 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-17 bolid_log.txt" file_position=1223140
May 18 14:32:37.583 INFO source{name=log_sources_bolid type=file}:file_server: file_source::file_server: Found file to watch. path="Z:/QuarantineReports\vectorinput\bolid\2020-05-18 bolid_log.txt" file_position=1098976

asvolodin
@asvolodin

Hello, another question. How do I add fingerprinting.fingerprint_bytes to a source?
I tried this

[sources.log_sources_bolid]
   type = "file"
   include = ["Z:/QuarantineReports/vectorinput/bolid/*.txt"]
   start_at_beginning = false 
   ignore_older = 86400
   fingerprinting.fingerprint_bytes = 64

But it's a syntax error.
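One thing worth trying, assuming the 0.9.x file source takes its fingerprinting options as a table: write them as an explicit sub-table rather than a dotted key appended after the other options. A sketch (the strategy value is an assumption; fingerprint_bytes applies to the checksum strategy):

```toml
[sources.log_sources_bolid]
  type = "file"
  include = ["Z:/QuarantineReports/vectorinput/bolid/*.txt"]
  start_at_beginning = false
  ignore_older = 86400

  [sources.log_sources_bolid.fingerprinting]
    strategy = "checksum"    # assumption: fingerprint_bytes pairs with the checksum strategy
    fingerprint_bytes = 64
```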

Diego Ragazzi
@diegoragazzi_gitlab

Hey there, I was following the doc (https://vector.dev/docs/reference/sinks/elasticsearch/) to get data from kafka and send it to es (hosted by amazon) but I'm getting the following errors:
https://gist.github.com/ragazzid/96daceda0eebc817933ae82d8b837174

Any idea?

Jason Fehr
@jasonmfehr_twitter

I'm trying to configure a vector sidecar to send data to a central vector service using the http sink/source with certificate authentication. The issue I am facing is that the central vector service is behind a reverse proxy. Thus, the sidecar service must access it via a uri such as: https://my.domain/observe/vector. The reverse proxy presents a valid certificate for my.domain, but the central vector service has its own certificate with a CN that matches the kubernetes service name. If I go behind the reverse proxy, everything works as it should.

Is it possible to use certificate auth to a vector http source that is running behind a reverse proxy?

Jesse Orr
@jesseorr
both the http and vector sinks let you specify a tls certificate, have you configured it in the sink config to use a certificate that you can provide?
presumably the same cert that the http reverse proxy server is using should do the trick
Ryan Schlesinger
@ryansch
So I have a dumb question: I stumbled across vector while looking at ways to set up timber.io. Why is there not a timber sink for vector? What am I missing?
Ana Hobden
@Hoverbear
@ryansch Not a dumb question! A good one!
We're working on a ninja-secret (not really) project related to this. If you ask @binarylogic he might let you in on more. :)
Ryan Schlesinger
@ryansch
ok!
Luke Steensen
@lukesteensen
the less exciting answer is that our standard http sink should work just fine for timber, but it would be good for us to add a configuration example somewhere
Ryan Schlesinger
@ryansch
I wondered about that!
Luke Steensen
@lukesteensen
hello everyone! we are considering removing a feature (custom dns resolution) that we believe is largely unused and causing some excessive maintenance burden
if you're using it and would be sad to see it go, please let us know on this issue: timberio/vector#2635
thanks!
Jason Fehr
@jasonmfehr_twitter

I'm running a vector source behind an AWS network load balancer, and the healthchecks are generating a large amount of warnings in the logs. For example:

May 21 21:14:33.126  WARN source{name=logs-vector type=vector}:connection{peer_addr=x.x.x.x}: vector::internal_events::tcp: connection error. error=TLS handshake failed: unexpected EOF
May 21 21:14:33.126  WARN source{name=logs-vector type=vector}:connection{peer_addr=x.x.x.x}: vector::sources::util::tcp: Error received while processing TCP source

Is there any way to quiet these warnings so they don't show up? Since I am using a k8s LoadBalancer service to create the AWS NLB, I cannot modify the healthcheck port.

Jason Fehr
@jasonmfehr_twitter

I am getting the following error in a vector source when attempting a vector sink to vector source setup. I am not seeing any errors in the vector sink. In fact, the vector sink is showing that the healthchecks are passing.

May 21 22:38:42.163  WARN source{name=logs-vector type=vector}:connection{peer_addr=x.x.x.x}: vector::internal_events::tcp: connection error. error=TLS handshake failed: error:1408F10B:SSL routines:ssl3_get_record:wrong version number:ssl/record/ssl3_record.c:332:
May 21 22:38:42.163  WARN source{name=logs-vector type=vector}:connection{peer_addr=x.x.x.x}: vector::sources::util::tcp: Error received while processing TCP source

I am using the exact same certs on both sides. I tried both X.509 and PKCS certs but received the same error. My certs have a CN that is different than the domain name in the vector sink address field and do not have any subject alternative names in them. Could that be the issue?

One final piece of information: these same exact certs work for http sink to http source communication. I can always use that setup but was hoping to use the vector protocol.

Ruben
@rubn-g

Hi everyone,

I created a transform to detect indented multiline logs automatically, for use with sources other than file.

The drawback is that when logs are non-JSON one-liners, the last log line will stay in memory until the next one comes in.

It's the first time I've experimented with Lua, so be kind :) Here's the link to the gist with the transform, just in case it's useful for someone:

https://gist.github.com/rubn-g/d07a499531f2663787274e82bef10779

Ana Hobden
@Hoverbear
@rubn-g Whoa! =D
Nice job!
Ruben
@rubn-g
@Hoverbear thx! I have a question now though, haha: I'm using the gcp stackdriver sink, but can't set the message severity level; it's always default. Looks like we should have a severity option for that sink.
Ana Hobden
@Hoverbear
@rubn-g That sounds good to me. Do you think you can open an issue?
Ruben
@rubn-g
@Hoverbear sure: timberio/vector#2678
Ana Hobden
@Hoverbear
@rubn-g Thanks! Looks good.
Andrey Afoninsky
@afoninsky
please correct me if I'm wrong: it's not possible to set the timestamp meta attribute in the kafka sink?
https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message
use case: to have a common service to work with historical data connected to a common pipeline instead of describing rules how to extract timestamp from data in each specific case
Karol Chrapek
@kaarolch
Hi
I tried to convert one nested field in lua. This field has "-" in its name:
event.log.response.headers.X-Runtime = tonumber(event.log.response.headers.X-Runtime) / 1000, but during vector start I get Cannot evaluate Lua code defining "hooks.process": syntax error: [string "?"]:1: <name> expected near '('.
Other fields without "-" work well.
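`-` is not legal in a Lua identifier, so `headers.X-Runtime` parses as a subtraction, which is why the parser complains near `(`. Indexing with a string key in brackets avoids the problem; a minimal sketch:

```lua
-- "X-Runtime" contains "-", so it must be indexed as a string key
event.log.response.headers["X-Runtime"] =
  tonumber(event.log.response.headers["X-Runtime"]) / 1000
```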
Lakshmi-r21
@Lakshmi-r21
Hello team, I tried https://vector.dev/guides/integrate/sources/syslog/aws_s3/ on my EC2 server, but we don't see any logs in the S3 bucket. Can anyone please help with this? Vector is running. Is there any other setting we need to do?
matrixbot
@matrixbot
Mike Cardwell Are we getting a new release of Vector any time soon?
Arnaud Esteve
@aesteve
Hi everyone, and thanks for your work on vector, that's looking very promising for our needs.
I have a few questions before we can use it for our own needs; before making some noise by opening an issue, maybe asking here is a better fit? Just tell me.
We're interested in measuring HTTP response time / status code of some APIs of ours. But I'm seeing 2 blockers at the moment for this use case.
  • The http source seems to be "log" and not "metric". Basically what we had been writing (a custom probe system in Rust) was a gauge for HTTP status, and a histogram for response time.
  • We'd like the http source to support OAuth so that we can measure the response time "as a client of ours would" (like, going through the whole API management stack, etc.)
    What would be the best way for us to get started? Forking the project and adapting the http source? Forking and creating a new source component? Is there a doc on how to write a custom component? (or a PR that does so, maybe?).
    At this point, any pointer on the ability to do so would be awesome.
    Thank you!
Quăng
@min0lta_twitter

Hi everyone, I have an nginx log; in the path field of GET requests, I want to remove the request params (after the ? symbol).

My log

/api/users?a=1&b=2&c=3

My expectation

/api/users

So what module should I use?

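One option is a lua transform that keeps everything before the first `?`. A sketch, assuming the request path sits in a field named `path` (the transform and input names are hypothetical, and the event access shape depends on the lua transform version):

```toml
[transforms.strip_query]
  type = "lua"
  inputs = ["nginx_logs"]  # hypothetical input name
  source = """
    -- Lua pattern: match the run of characters before the first '?'
    event["path"] = string.match(event["path"], "^[^?]*")
  """
```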
Windham Wong
@windhamwong
Hello, we are looking for Windows Event Log support as a source in Vector. Wondering if someone is working on it or if it's already supported? If not, we are planning to make one and contribute it.
Windham Wong
@windhamwong
looking forward to having Vector replace Winlogbeat
Lakshmi-r21
@Lakshmi-r21
Hi team, getting the below error when we try vector on Windows:
INFO vector: Vector is starting. version="0.9.1" git_version="v0.9.1" released="Thu, 30 Apr 2020 15:49:24 +0000" arch="x86_64"
Jun 01 09:14:02.742 ERROR vector::topology: Configuration error: Source "in": data_dir "/var/lib/vector/" does not exist
/var/lib/vector: this path doesn't exist on Windows, and even when the path is not specified, the same error comes up.
It worked when I gave a Windows path.
Lakshmi-r21
@Lakshmi-r21
Jun 01 09:20:22.431 ERROR source{name=in type=file}:file_server: file_source::file_server: Error reading file for fingerprinting err=Access is denied.
(os error 5) file="C:/Windows\System32\winevt\Logs"
when we mention the path as "C:/Windows/System32/winevt/Logs"
so is it not supported on Windows for the system log?