    martin-mao
    @martin-mao
    for aggregation, you cannot configure it in your coordinator, but you can with an aggregation tier
    the reason is that if you pass in old data to the coordinator, it needs to update an old value, but it no longer holds the current old value in memory, so it's not possible to do so
    for the aggregation tier, you can increase the buffer time, but it means that we won't flush the data until your buffer time has elapsed. So if you set a buffer of 1 hour in the aggregation tier, no data will be flushed until at least 1 hour after its timestamp has passed
    what is your use case for trying to aggregate old data?
    poorna-tsp
    @poorna-tsp

    @martin-mao In our pipeline, we sometimes get data older than 10min because of lag, so I wanted to understand how we can configure the carbon ingester to accept this.

    Also, we have an aggregator in between, which aggregates data in 10s windows. So I was trying multiple options: if we remove our aggregator, how can we handle this, and if we keep our aggregator, how can we handle this lag?

    As of now our aggregator has its own buffer to handle this old-data aggregation.
    For example, if we get 1 event in the first window (10s), our aggregator emits 1 event for that timestamp.
    If it gets another event for the same timestamp in the 2nd window, it still has the older event value in its buffer, recalculates the average, and sends a new event to Graphite. Can we achieve the same behaviour here? Meaning, instead of waiting for the complete 1 hour buffer time before flushing the data, can we configure a bigger buffer but flush in between?

    poorna-tsp
    @poorna-tsp

    The one without aggregation you configure in your node namespace config (bufferPastDuration in retentionOptions)

    If we configure without aggregation and increase the bufferPastDuration, then this will not have that flush issue, right? Data will be written to the DB as events are ingested, right? Also, when we fetch, we will get the last ingested value, right?

    martin-mao
    @martin-mao
    @poorna-tsp the coordinator/aggregation tier cannot do multi flush today. That's something we can consider in the future. I'd recommend you run your own aggregation tier and just write to an unaggregated namespace, with a 15m buffer.
    The DB will order writes in the buffer, so your final value within the 15m will be the one used
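    (For reference, a minimal sketch of what that unaggregated namespace with a 15m past buffer might look like. This is an illustration only: the exact endpoint and field names depend on the M3 version, so check the coordinator namespace API docs. Shown as YAML for readability; the namespace API itself takes an equivalent JSON body.)

        name: default
        options:
          retentionOptions:
            retentionPeriodDuration: 48h   # how long data is kept
            blockSizeDuration: 2h          # assumed block size for this sketch
            bufferFutureDuration: 10m      # accept writes slightly ahead of "now"
            bufferPastDuration: 15m        # accept writes up to 15m old, per the suggestion above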
    poorna-tsp
    @poorna-tsp
    @martin-mao what do you mean by aggregation tier?
    In the carbon configuration, when I give a policy that matches the unaggregated namespace, I am getting the error cannot enable carbon ingestion without a corresponding aggregated M3DB namespace
    arnikola
    @arnikola
    your policies stanza in the rules section needs to match a corresponding namespace
    poorna-tsp
    @poorna-tsp
    local:
        namespaces:
          - namespace: test
            type: aggregated
            retention: 48h
            resolution: 10s
          - namespace: default
            type: unaggregated
            retention: 10m
            resolution: 10s
    carbon:
        ingester:
          debug: true
          listenAddress: "0.0.0.0:7204"
          rules:
            - pattern: .*
              aggregation:
                enabled: false
              policies:
                - resolution: 10s
                  retention: 10m
    With the above configuration, I am getting the cannot enable carbon ingestion without a corresponding aggregated M3DB namespace error
    arnikola
    @arnikola
    yeah you'd need to change the carbon rules to be 48h/10s
    since we only allow writes from aggregated results to aggregated namespaces
    poorna-tsp
    @poorna-tsp

    since we only allow writes from aggregated results to aggregated namespaces

    But I have disabled aggregation in the carbon configuration, right?

    arnikola
    @arnikola
    oh, hmm
    may be an oversight on our part in how we sanity check disabled aggregations? will follow up
    Rob Skillington
    @robskillington
    @poorna-tsp it's because the type of the 10m/10s namespace
    is "unaggregated"
    even though you have aggregation enabled "false" it still needs to be specified as "aggregated" in the namespace declaration
    (i.e. the namespace declaration is declaring you'll store aggregated values there)
    and then the "aggregation: enabled: false" is declaring you want to directly write
    pre-aggregated values to that namespace
    poorna-tsp
    @poorna-tsp
    Oh ok, understood. So I need to have an aggregated namespace matching the policy defined in the carbon configuration.
    Rob Skillington
    @robskillington

    If you change

          - namespace: default
            type: unaggregated
            retention: 10m
            resolution: 10s

    to

          - namespace: default
            type: aggregated
            retention: 10m
            resolution: 10s

    that will work
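    (Putting that change together with the earlier snippet, the full corrected config would look like the sketch below; the values are the same as above, only the default namespace's type changes.)

        local:
            namespaces:
              - namespace: test
                type: aggregated
                retention: 48h
                resolution: 10s
              - namespace: default
                type: aggregated      # was "unaggregated"; carbon policies must map to an aggregated namespace
                retention: 10m
                resolution: 10s
        carbon:
            ingester:
              debug: true
              listenAddress: "0.0.0.0:7204"
              rules:
                - pattern: .*
                  aggregation:
                    enabled: false    # pass pre-aggregated values straight through
                  policies:
                    - resolution: 10s
                      retention: 10m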

    poorna-tsp
    @poorna-tsp
    Got it. Follow-up on the previous question: how do I run my own aggregation tier? What is an aggregation tier?
    Also, when we disable aggregation in the carbon configuration, we do not have to run an aggregation tier to have a bigger buffer; we can specify a different value for bufferPastDuration while adding the namespace.
    On the other hand, if I enable aggregation (any type like last/min/sum/mean), I need to run my own aggregation tier to increase the buffer.
    Rob Skillington
    @robskillington
    @poorna-tsp there's a bit of documentation here (still a work in progress):
    m3db/m3#1741
    Evanhuang
    @evanhuang996_twitter
    @robskillington no precompiled bin files?
    YY Wan
    @yywandb
    hi! I have a question about memory consumption for the m3db storage replicas. What should we expect as a reasonable memory to storage ratio, i.e. amount of memory to support a replica with 500 GB storage?
    martin-mao
    @martin-mao
    hey @yywandb it depends on retention and resolution, but for something like 10 second resolution and 2 day retention, a 1:4 memory to storage ratio is ideal. For more historical data, the ratio goes up, e.g. 1:8 or 1:16 for multiple months/years
    it also depends on the block size: with smaller block sizes, you can get away with less memory at the cost of somewhat slower query times
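    (As a rough worked example under those ratios, purely an estimate since actual usage also depends on block size, series churn, and query load: a replica holding 500 GB at 10s resolution / 2 day retention would want about 500 / 4 ≈ 125 GB of memory, while longer-retention data at 1:8 or 1:16 would need roughly 63 GB or 31 GB respectively.)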
    YY Wan
    @yywandb
    cool, thanks @martin-mao !
    Evanhuang
    @evanhuang996_twitter
    2019/08/24 14:50:59 Go Runtime version: go1.11.2
    2019/08/24 14:50:59 Build Version:      v0.11.0
    2019/08/24 14:50:59 Build Revision:     81b34bdd
    2019/08/24 14:50:59 Build Branch:       master
    2019/08/24 14:50:59 Build Date:         2019-08-23-11:08:20
    2019/08/24 14:50:59 Build TimeUnix:     1566572900
    unable to load config from conf/m3.yml: yaml: unmarshal errors:
      line 106: field interval not found in type config.RepairPolicy
      line 107: field offset not found in type config.RepairPolicy
      line 108: field jitter not found in type config.RepairPolicy
      line 153: field hostBlockMetadataSlicePool not found in type config.PoolingPolicy
    Evanhuang
    @evanhuang996_twitter
      repair:
        enabled: false
        interval: 2h
        offset: 30m
        jitter: 1h
        throttle: 2m
        checkInterval: 1m
    it's weird
    Benjamin Raskin
    @benraskin92
    do you by chance have tabs in your yaml file?
    Evanhuang
    @evanhuang996_twitter
    nope, I didn't change the config file
    Benjamin Raskin
    @benraskin92
    here’s the RepairPolicy struct:
    // RepairPolicy is the repair policy.
    type RepairPolicy struct {
        // Enabled or disabled.
        Enabled bool `yaml:"enabled"`
    
        // The repair throttle.
        Throttle time.Duration `yaml:"throttle"`
    
        // The repair check interval.
        CheckInterval time.Duration `yaml:"checkInterval"`
    
        // Whether debug shadow comparisons are enabled.
        DebugShadowComparisonsEnabled bool `yaml:"debugShadowComparisonsEnabled"`
    
        // If enabled, what percentage of metadata should perform a detailed debug
        // shadow comparison.
        DebugShadowComparisonsPercentage float64 `yaml:"debugShadowComparisonsPercentage"`
    }
    I think it might have changed since the last release
    let me check
    // RepairPolicy is the repair policy.
    type RepairPolicy struct {
        // Enabled or disabled.
        Enabled bool `yaml:"enabled"`
    
        // The repair interval.
        Interval time.Duration `yaml:"interval" validate:"nonzero"`
    
        Offset time.Duration `yaml:"offset" validate:"nonzero"`
    
        // The repair time jitter.
        Jitter time.Duration `yaml:"jitter" validate:"nonzero"`
    
        // The repair throttle.
        Throttle time.Duration `yaml:"throttle" validate:"nonzero"`
    
        // The repair check interval.
        CheckInterval time.Duration `yaml:"checkInterval" validate:"nonzero"`
    }
    hmm yeah looks like it changed
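    (Going by the errors and the first RepairPolicy struct above, the running binary no longer knows the interval, offset, and jitter fields, so a repair stanza that should unmarshal cleanly is just the remaining fields; a sketch:)

        repair:
          enabled: false
          throttle: 2m
          checkInterval: 1m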
    Evanhuang
    @evanhuang996_twitter
    yeah, it has been changed
        hostBlockMetadataSlicePool:
            size: 131072
            capacity: 3
            lowWatermark: 0.7
            highWatermark: 1.0
    Benjamin Raskin
    @benraskin92
    let me follow up tomorrow with the team to see what happened. apologies for that
    Evanhuang
    @evanhuang996_twitter
    this config option was also removed
    Benjamin Raskin
    @benraskin92
    gotcha, I'll follow up and make sure this doesn't happen again (at least not without proper warning)
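    (For the remaining unmarshal errors, the same approach should apply; a sketch based only on the error output above, since hostBlockMetadataSlicePool is evidently no longer a field of config.PoolingPolicy: delete that block from the pooling section of the config and leave the other pools untouched.)

        pooling:
          # ... other pools unchanged ...
          # hostBlockMetadataSlicePool removed: the field no longer exists in config.PoolingPolicy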