Resource Limits
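Overview
Configure the default limits for Cortex services (such as the distributor and the ingester). These defaults apply to all tenants; for example, you can set a default maximum number of samples or a default maximum number of labels.
You can also set limits for individual tenants. Tenant-specific limits override the defaults and apply only to that tenant, so each tenant can be given limits that match its needs while resource allocation stays fair and reasonable.
With limits_config you manage and tune the limits of each Cortex service to meet specific requirements and constraints. These limits help protect the Cortex cluster from abuse and overload, and keep resource allocation and performance predictable.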
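The relationship between default limits and per-tenant overrides can be sketched in Go as follows. This is a minimal illustration, not Cortex's own code: the `TenantLimits` and `Overrides` types and the tenant IDs are hypothetical names invented for the example, and the sketch replaces the whole set of limits for a tenant rather than merging individual fields.

```go
package main

import "fmt"

// TenantLimits is a hypothetical, trimmed-down set of limits used only in this sketch.
type TenantLimits struct {
	IngestionRate      float64
	MaxSeriesPerMetric int
}

// Overrides holds the cluster-wide defaults plus optional per-tenant limits.
type Overrides struct {
	defaults  TenantLimits
	perTenant map[string]TenantLimits
}

// LimitsFor returns the tenant-specific limits when they exist
// and falls back to the defaults otherwise.
func (o Overrides) LimitsFor(tenantID string) TenantLimits {
	if l, ok := o.perTenant[tenantID]; ok {
		return l
	}
	return o.defaults
}

func main() {
	o := Overrides{
		defaults: TenantLimits{IngestionRate: 25000, MaxSeriesPerMetric: 50000},
		perTenant: map[string]TenantLimits{
			// "tenant-a" is a made-up tenant ID with its own limits.
			"tenant-a": {IngestionRate: 100000, MaxSeriesPerMetric: 200000},
		},
	}
	fmt.Println("tenant-a rate:", o.LimitsFor("tenant-a").IngestionRate)
	fmt.Println("tenant-b rate:", o.LimitsFor("tenant-b").IngestionRate)
}
```

Here `tenant-a` gets its own limits, while `tenant-b`, which has no override, gets the defaults.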
Configuration example
limits:
  # Per-tenant ingestion rate limit, in samples per second.
  # When exceeded: http code=429 resp=ingestion rate limit (25000) exceeded
  ingestion_rate: 25000
  # Whether the ingestion rate limit is applied to each distributor instance individually (local)
  # or shared evenly across the whole cluster (global).
  ingestion_rate_strategy: local
  # Per-tenant allowed burst size, in number of samples.
  ingestion_burst_size: 50000
  # Flag that enables, for all users, handling of samples carrying the external labels
  # that identify replicas in a high-availability Prometheus setup.
  accept_ha_samples: false
  # Label whose value is read from a sample to identify the cluster.
  ha_cluster_label: cluster
  # Label whose value is read from a sample to identify the replica.
  ha_replica_label: __replica__
  ha_max_clusters: 0
  drop_labels: []
  max_label_name_length: 1024
  max_label_value_length: 2048
  max_label_names_per_series: 30
  max_labels_size_bytes: 0
  max_metadata_length: 1024
  reject_old_samples: false
  reject_old_samples_max_age: 2w
  creation_grace_period: 10m
  enforce_metadata_metric_name: true
  enforce_metric_name: true
  ingestion_tenant_shard_size: 0
  max_series_per_query: 100000
  max_series_per_user: 5000000
  max_series_per_metric: 50000
  max_global_series_per_user: 0
  max_global_series_per_metric: 0
  max_metadata_per_user: 8000
  max_metadata_per_metric: 10
  max_global_metadata_per_user: 0
  max_global_metadata_per_metric: 0
  max_fetched_chunks_per_query: 2000000
  max_fetched_series_per_query: 0
  max_fetched_chunk_bytes_per_query: 0
  max_fetched_data_bytes_per_query: 0
  max_query_lookback: 0s
  max_query_length: 0s
  max_query_parallelism: 14
  max_cache_freshness: 1m
  max_queriers_per_tenant: 0
  query_vertical_shard_size: 0
  ruler_evaluation_delay_duration: 0s
  ruler_tenant_shard_size: 0
  ruler_max_rules_per_rule_group: 0
  ruler_max_rule_groups_per_tenant: 0
  store_gateway_tenant_shard_size: 0
  compactor_blocks_retention_period: 0s
  compactor_tenant_shard_size: 0
  s3_sse_type: ""
  s3_sse_kms_key_id: ""
  s3_sse_kms_encryption_context: ""
  alertmanager_receivers_firewall_block_cidr_networks: ""
  alertmanager_receivers_firewall_block_private_addresses: false
  alertmanager_notification_rate_limit: 0
  alertmanager_notification_rate_limit_per_integration: {}
  alertmanager_max_config_size_bytes: 0
  alertmanager_max_templates_count: 0
  alertmanager_max_template_size_bytes: 0
  alertmanager_max_dispatcher_aggregation_groups: 0
  alertmanager_max_alerts_count: 0
  alertmanager_max_alerts_size_bytes: 0
Data structures
github.com/cortexproject/cortex/pkg/util/validation/limits.go
Limits
// Limits describe all the limits for users; can be used to describe global default
// limits via flags, or per-user limits via yaml config.
type Limits struct {
// Distributor enforced limits.
IngestionRate float64 `yaml:"ingestion_rate" json:"ingestion_rate"`
IngestionRateStrategy string `yaml:"ingestion_rate_strategy" json:"ingestion_rate_strategy"`
IngestionBurstSize int `yaml:"ingestion_burst_size" json:"ingestion_burst_size"`
AcceptHASamples bool `yaml:"accept_ha_samples" json:"accept_ha_samples"`
HAClusterLabel string `yaml:"ha_cluster_label" json:"ha_cluster_label"`
HAReplicaLabel string `yaml:"ha_replica_label" json:"ha_replica_label"`
HAMaxClusters int `yaml:"ha_max_clusters" json:"ha_max_clusters"`
DropLabels flagext.StringSlice `yaml:"drop_labels" json:"drop_labels"`
MaxLabelNameLength int `yaml:"max_label_name_length" json:"max_label_name_length"`
MaxLabelValueLength int `yaml:"max_label_value_length" json:"max_label_value_length"`
MaxLabelNamesPerSeries int `yaml:"max_label_names_per_series" json:"max_label_names_per_series"`
MaxLabelsSizeBytes int `yaml:"max_labels_size_bytes" json:"max_labels_size_bytes"`
MaxMetadataLength int `yaml:"max_metadata_length" json:"max_metadata_length"`
RejectOldSamples bool `yaml:"reject_old_samples" json:"reject_old_samples"`
RejectOldSamplesMaxAge model.Duration `yaml:"reject_old_samples_max_age" json:"reject_old_samples_max_age"`
CreationGracePeriod model.Duration `yaml:"creation_grace_period" json:"creation_grace_period"`
EnforceMetadataMetricName bool `yaml:"enforce_metadata_metric_name" json:"enforce_metadata_metric_name"`
EnforceMetricName bool `yaml:"enforce_metric_name" json:"enforce_metric_name"`
IngestionTenantShardSize int `yaml:"ingestion_tenant_shard_size" json:"ingestion_tenant_shard_size"`
MetricRelabelConfigs []*relabel.Config `yaml:"metric_relabel_configs,omitempty" json:"metric_relabel_configs,omitempty" doc:"nocli|description=List of metric relabel configurations. Note that in most situations, it is more effective to use metrics relabeling directly in the Prometheus server, e.g. remote_write.write_relabel_configs."`
MaxExemplars int `yaml:"max_exemplars" json:"max_exemplars"`
// Ingester enforced limits.
// Series
MaxSeriesPerQuery int `yaml:"max_series_per_query" json:"max_series_per_query"`
MaxLocalSeriesPerUser int `yaml:"max_series_per_user" json:"max_series_per_user"`
MaxLocalSeriesPerMetric int `yaml:"max_series_per_metric" json:"max_series_per_metric"`
MaxGlobalSeriesPerUser int `yaml:"max_global_series_per_user" json:"max_global_series_per_user"`
MaxGlobalSeriesPerMetric int `yaml:"max_global_series_per_metric" json:"max_global_series_per_metric"`
// Metadata
MaxLocalMetricsWithMetadataPerUser int `yaml:"max_metadata_per_user" json:"max_metadata_per_user"`
MaxLocalMetadataPerMetric int `yaml:"max_metadata_per_metric" json:"max_metadata_per_metric"`
MaxGlobalMetricsWithMetadataPerUser int `yaml:"max_global_metadata_per_user" json:"max_global_metadata_per_user"`
MaxGlobalMetadataPerMetric int `yaml:"max_global_metadata_per_metric" json:"max_global_metadata_per_metric"`
// Out-of-order
OutOfOrderTimeWindow model.Duration `yaml:"out_of_order_time_window" json:"out_of_order_time_window"`
// Querier enforced limits.
MaxChunksPerQuery int `yaml:"max_fetched_chunks_per_query" json:"max_fetched_chunks_per_query"`
MaxFetchedSeriesPerQuery int `yaml:"max_fetched_series_per_query" json:"max_fetched_series_per_query"`
MaxFetchedChunkBytesPerQuery int `yaml:"max_fetched_chunk_bytes_per_query" json:"max_fetched_chunk_bytes_per_query"`
MaxFetchedDataBytesPerQuery int `yaml:"max_fetched_data_bytes_per_query" json:"max_fetched_data_bytes_per_query"`
MaxQueryLookback model.Duration `yaml:"max_query_lookback" json:"max_query_lookback"`
MaxQueryLength model.Duration `yaml:"max_query_length" json:"max_query_length"`
MaxQueryParallelism int `yaml:"max_query_parallelism" json:"max_query_parallelism"`
MaxCacheFreshness model.Duration `yaml:"max_cache_freshness" json:"max_cache_freshness"`
MaxQueriersPerTenant int `yaml:"max_queriers_per_tenant" json:"max_queriers_per_tenant"`
QueryVerticalShardSize int `yaml:"query_vertical_shard_size" json:"query_vertical_shard_size" doc:"hidden"`
// Query Frontend / Scheduler enforced limits.
MaxOutstandingPerTenant int `yaml:"max_outstanding_requests_per_tenant" json:"max_outstanding_requests_per_tenant"`
// Ruler defaults and limits.
RulerEvaluationDelay model.Duration `yaml:"ruler_evaluation_delay_duration" json:"ruler_evaluation_delay_duration"`
RulerTenantShardSize int `yaml:"ruler_tenant_shard_size" json:"ruler_tenant_shard_size"`
RulerMaxRulesPerRuleGroup int `yaml:"ruler_max_rules_per_rule_group" json:"ruler_max_rules_per_rule_group"`
RulerMaxRuleGroupsPerTenant int `yaml:"ruler_max_rule_groups_per_tenant" json:"ruler_max_rule_groups_per_tenant"`
// Store-gateway.
StoreGatewayTenantShardSize int `yaml:"store_gateway_tenant_shard_size" json:"store_gateway_tenant_shard_size"`
// Compactor.
CompactorBlocksRetentionPeriod model.Duration `yaml:"compactor_blocks_retention_period" json:"compactor_blocks_retention_period"`
CompactorTenantShardSize int `yaml:"compactor_tenant_shard_size" json:"compactor_tenant_shard_size"`
// This config doesn't have a CLI flag registered here because they're registered in
// their own original config struct.
S3SSEType string `yaml:"s3_sse_type" json:"s3_sse_type" doc:"nocli|description=S3 server-side encryption type. Required to enable server-side encryption overrides for a specific tenant. If not set, the default S3 client settings are used."`
S3SSEKMSKeyID string `yaml:"s3_sse_kms_key_id" json:"s3_sse_kms_key_id" doc:"nocli|description=S3 server-side encryption KMS Key ID. Ignored if the SSE type override is not set."`
S3SSEKMSEncryptionContext string `yaml:"s3_sse_kms_encryption_context" json:"s3_sse_kms_encryption_context" doc:"nocli|description=S3 server-side encryption KMS encryption context. If unset and the key ID override is set, the encryption context will not be provided to S3. Ignored if the SSE type override is not set."`
// Alertmanager.
AlertmanagerReceiversBlockCIDRNetworks flagext.CIDRSliceCSV `yaml:"alertmanager_receivers_firewall_block_cidr_networks" json:"alertmanager_receivers_firewall_block_cidr_networks"`
AlertmanagerReceiversBlockPrivateAddresses bool `yaml:"alertmanager_receivers_firewall_block_private_addresses" json:"alertmanager_receivers_firewall_block_private_addresses"`
NotificationRateLimit float64 `yaml:"alertmanager_notification_rate_limit" json:"alertmanager_notification_rate_limit"`
NotificationRateLimitPerIntegration NotificationRateLimitMap `yaml:"alertmanager_notification_rate_limit_per_integration" json:"alertmanager_notification_rate_limit_per_integration"`
AlertmanagerMaxConfigSizeBytes int `yaml:"alertmanager_max_config_size_bytes" json:"alertmanager_max_config_size_bytes"`
AlertmanagerMaxTemplatesCount int `yaml:"alertmanager_max_templates_count" json:"alertmanager_max_templates_count"`
AlertmanagerMaxTemplateSizeBytes int `yaml:"alertmanager_max_template_size_bytes" json:"alertmanager_max_template_size_bytes"`
AlertmanagerMaxDispatcherAggregationGroups int `yaml:"alertmanager_max_dispatcher_aggregation_groups" json:"alertmanager_max_dispatcher_aggregation_groups"`
AlertmanagerMaxAlertsCount int `yaml:"alertmanager_max_alerts_count" json:"alertmanager_max_alerts_count"`
AlertmanagerMaxAlertsSizeBytes int `yaml:"alertmanager_max_alerts_size_bytes" json:"alertmanager_max_alerts_size_bytes"`
}
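Because every field carries both a yaml and a json tag, the same key names work in either encoding. The sketch below decodes a JSON override on top of a set of defaults using only the standard library. The `miniLimits` struct is a hypothetical subset of Limits that keeps just plain Go types, and `decodeOverride` is a helper made up for the example; neither is part of Cortex.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniLimits is a hypothetical subset of Limits with only plain Go types,
// so the example needs nothing beyond the standard library.
type miniLimits struct {
	IngestionRate      float64 `json:"ingestion_rate"`
	IngestionBurstSize int     `json:"ingestion_burst_size"`
	AcceptHASamples    bool    `json:"accept_ha_samples"`
	HAClusterLabel     string  `json:"ha_cluster_label"`
}

// decodeOverride starts from the defaults and overwrites only the keys
// that are present in the JSON document.
func decodeOverride(defaults miniLimits, doc []byte) (miniLimits, error) {
	out := defaults
	if err := json.Unmarshal(doc, &out); err != nil {
		return miniLimits{}, err
	}
	return out, nil
}

func main() {
	defaults := miniLimits{IngestionRate: 25000, IngestionBurstSize: 50000, HAClusterLabel: "cluster"}
	l, err := decodeOverride(defaults, []byte(`{"ingestion_rate": 100000, "accept_ha_samples": true}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", l)
}
```

Keys missing from the override document keep their default values, which is the behaviour one would expect from per-tenant overrides layered on defaults.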
flagext.StringSlice
github.com/cortexproject/cortex/pkg/util/flagext/
// StringSlice is a slice of strings that implements flag.Value
type StringSlice []string
relabel.Config
github.com/prometheus/prometheus/model/relabel/
flagext.CIDRSliceCSV
// CIDRSliceCSV is a slice of CIDRs that is parsed from a comma-separated string.
// It implements flag.Value and yaml Marshalers.
type CIDRSliceCSV []CIDR
// CIDR is a network CIDR.
type CIDR struct {
Value *net.IPNet
}
NotificationRateLimitMap
type NotificationRateLimitMap map[string]float64
Use cases
High-availability writes from clients
- Server-side Cortex limits configuration:
distributor:
  ha_tracker:
    enable_ha_tracker: true
    ha_tracker_update_timeout: 15s
    ha_tracker_update_timeout_jitter_max: 5s
    ha_tracker_failover_timeout: 30s
    kvstore:
      store: etcd
      prefix: /cortex/ha-tracker/
      etcd:
        endpoints:
          - 192.168.31.201:2379
        dial_timeout: 10s
        max_retries: 10
limits:
  accept_ha_samples: true
  ha_cluster_label: cluster
  ha_replica_label: __replica__
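- The clients report the following data:
pod1 -> metric_name{cluster="cluster1", __replica__="test1"}
pod2 -> metric_name{cluster="cluster1", __replica__="test2"}
Because pod1 and pod2 belong to the same cluster "cluster1" but report as different replicas ("test1" and "test2"), samples from only one of them are accepted at any given time.
Example rejection log: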
replicas did not mach, rejecting sample: replica=test2, elected=test1
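Accepting late (out-of-order) samples
- Server-side Cortex limits configuration:
limits:
  out_of_order_time_window: 30m
- The client reports the following data:
Current time on the Cortex server: "2023-09-08T11:40:00"
The following samples are accepted: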
metric_name1{} 2023-09-08T11:20:00
metric_name1{} 2023-09-08T11:30:00
metric_name1{} 2023-09-08T11:40:00
The following samples are rejected, because they are older than 2023-09-08T11:10:00 (the current time minus the 30-minute window):
metric_name1{} 2023-09-08T10:50:00
metric_name1{} 2023-09-08T11:00:00
Last modified 2023.09.24: refactor: update cortex (ba4ddf9)