This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

Define node and module streams with specific labels #3

Merged
merged 7 commits into from
Jul 27, 2023

Conversation

DavidePrincipi
Member

@DavidePrincipi DavidePrincipi commented Jul 3, 2023

This PR changes the Promtail configuration generator (the expandconfig helper) so that it tags each journal record with two labels and stops adding a third:

  1. The node_id label is taken from the Promtail module environment variable NODE_ID and written into the config.
  2. The module_id label is set by inspecting journal record fields:
    • For rootless modules, the _UID field is matched against the module's Unix uid or its subuids. An integer range generator function builds the regexp matcher for the subuid range¹.
    • For rootfull modules, SYSLOG_IDENTIFIER, CONTAINER_NAME, and _SYSTEMD_UNIT are checked for MODULE_ID matches.
  3. The nodename label is no longer added.
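As an illustration of the range-matcher idea for rootless modules, here is a minimal standard-library sketch. It is a stand-in for the integer range generator mentioned above, not the regex-engine package's code or API. The function names and the sample uid, subuid start, and subuid count are assumptions chosen for the example, and only positive integers are handled.

```python
import re

def _fill_nines(n, count):
    """Replace the last `count` digits of n with 9s."""
    return int(str(n)[:-count] + "9" * count)

def _fill_zeros(n, count):
    """Replace the last `count` digits of n with 0s."""
    return n - n % 10 ** count

def _split_points(lo, hi):
    """Upper bounds of the sub-ranges that each map to one simple pattern."""
    stops = {hi}
    nines = 1
    stop = _fill_nines(lo, nines)
    while lo <= stop < hi:
        stops.add(stop)
        nines += 1
        stop = _fill_nines(lo, nines)
    zeros = 1
    stop = _fill_zeros(hi + 1, zeros) - 1
    while lo < stop <= hi:
        stops.add(stop)
        zeros += 1
        stop = _fill_zeros(hi + 1, zeros) - 1
    return sorted(stops)

def _pattern(start, stop):
    """Pattern for one sub-range; both bounds have the same digit count."""
    out, wild = "", 0
    for s, e in zip(str(start), str(stop)):
        if s == e:
            out += s                      # shared prefix digit
        elif (s, e) == ("0", "9"):
            wild += 1                     # trailing "any digit" position
        else:
            out += f"[{s}-{e}]"
    return out + "[0-9]" * wild

def uid_matcher(uid, subuid_start, subuid_count):
    """Regex matching the module uid or any uid in its subuid range."""
    lo, hi = subuid_start, subuid_start + subuid_count - 1
    parts, start = [str(uid)], lo
    for stop in _split_points(lo, hi):
        parts.append(_pattern(start, stop))
        start = stop + 1
    return "(" + "|".join(parts) + ")"

# Hypothetical values: uid 1001 with 65536 subuids starting at 165536.
matcher = uid_matcher(1001, 165536, 65536)
print(matcher)
```

The resulting string can be used as the `regex` of a relabel rule whose source label is `__journal__uid`. A range matcher is needed because relabel rules can only compare label values as text, not as numbers.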

The Promtail configuration must be expanded each time a module is added to or removed from the node where Promtail runs. For this purpose the module-added and module-removed events are watched².
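A rough sketch of what re-expanding the configuration involves: rebuild the relabel rules from the current list of modules on the node. The rule layout follows the config.yml example posted in this conversation. The function name, the shape of the `modules` input, and its `id` and `uid_regex` keys are hypothetical and do not reflect the actual expandconfig code.

```python
import re

# Journal-derived labels checked for rootfull modules.
ROOTFULL_SOURCES = [
    "__journal__systemd_unit",
    "__journal_syslog_identifier",
    "__journal_container_name",
]

def build_relabel_configs(node_id, modules):
    """Return the relabel_configs list for the journal scrape job.

    `modules` is a list of dicts (hypothetical shape):
      {"id": "traefik1", "uid_regex": "(1001|...)"}  for a rootless module
      {"id": "promtail1", "uid_regex": None}         for a rootfull module
    """
    # The node_id rule has no regex: it always applies.
    rules = [{"replacement": str(node_id), "target_label": "node_id"}]
    for mod in modules:
        if mod.get("uid_regex"):
            # Rootless: match the record's _UID against uid and subuids.
            regex = mod["uid_regex"]
            sources = ["__journal__uid"]
        else:
            # Rootfull: look for the module id as a whole word.
            regex = r".*\b" + re.escape(mod["id"]) + r"\b.*"
            sources = list(ROOTFULL_SOURCES)
        rules.append({
            "regex": regex,
            "replacement": mod["id"],
            "source_labels": sources,
            "target_label": "module_id",
        })
    return rules

rules = build_relabel_configs(1, [
    {"id": "traefik1", "uid_regex": "(1001|16553[6-9])"},  # abbreviated regex
    {"id": "promtail1", "uid_regex": None},
])
for rule in rules:
    print(rule)
```

Because the rule list is derived entirely from the module list, the configuration has to be regenerated, and Promtail restarted, whenever that list changes.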

Loki instance discovery is performed in expandconfig, using the Python library code.

The action configure-module is removed because it is not used and there are no plans to implement alternative Promtail configuration methods.

See also

Notes:

  1. The Python package regex-engine must be present in the Core Python environment
  2. The module-added event is not ideal for capturing a module's logs from the very beginning, because it fires after create-module returns. Future work could overcome this limitation by merging the Promtail service into the Core itself, like Redis, or by using a dedicated event.

Discover Loki settings from local Redis replica.

Add new labels:
- module_id, by regex-matching journal fields
- node_id, same as the NODE_ID environment variable

The watched journal fields are:
- _UID for rootless modules: subuids are also matched by the regex
  pattern
- _SYSTEMD_UNIT, CONTAINER_NAME, SYSLOG_IDENTIFIER for rootfull modules:
  if any of them contains the MODULE_ID, the journal record matches

Also drop the "nodename" label.

The module is configured at service startup. It discovers the cluster
default Loki instance each time it is restarted.

Update README.md

Reconfigure and restart the promtail service when a module is added to
or removed from the local node.
Member

@gsanchietti gsanchietti left a comment


It seems to me we are losing a bit of flexibility, but it should not be a real problem.
Since Promtail is already part of the core, can we also address note 2 in the PR description?

I'm going to leave the implementation review to @Amygos, who knows the module better than I do.

imageroot/bin/expandconfig (review thread outdated, resolved)
Co-authored-by: Giacomo Sanchietti <[email protected]>
@DavidePrincipi
Member Author

DavidePrincipi commented Jul 5, 2023

This is an example of the Promtail config.yml on node 1:

clients:
- bearer_token: b5389fbb-afd4-4001-8d0a-9aefa6fdde41
  url: http://10.5.3.1:20008/loki/api/v1/push
positions:
  filename: /var/lib/promtail/positions.yml
  ignore_invalid_yaml: true
  sync_period: 10s
scrape_configs:
- job_name: journal
  journal:
    json: true
    max_age: 12h
  relabel_configs:
  - replacement: '1'
    target_label: node_id
  - regex: (1001|1[7-9][0-9][0-9][0-9][0-9]|16[6-9][0-9][0-9][0-9]|165[6-9][0-9][0-9]|1655[4-9][0-9]|16553[6-9]|2[0-2][0-9][0-9][0-9][0-9]|23[0-0][0-9][0-9][0-9]|2310[0-6][0-9]|23107[0-2])
    replacement: traefik1
    source_labels:
    - __journal__uid
    target_label: module_id
  - regex: (1002|2[4-8][0-9][0-9][0-9][0-9]|23[2-9][0-9][0-9][0-9]|231[1-9][0-9][0-9]|2310[8-9][0-9]|23107[2-9]|29[0-5][0-9][0-9][0-9]|296[0-5][0-9][0-9]|29660[0-8])
    replacement: ldapproxy1
    source_labels:
    - __journal__uid
    target_label: module_id
  - regex: (1003|29[7-9][0-9][0-9][0-9]|296[7-9][0-9][0-9]|2966[1-9][0-9]|29660[8-9]|3[0-5][0-9][0-9][0-9][0-9]|36[0-1][0-9][0-9][0-9]|362[0-0][0-9][0-9]|3621[0-3][0-9]|36214[0-4])
    replacement: loki1
    source_labels:
    - __journal__uid
    target_label: module_id
  - regex: .*\bpromtail1\b.*
    replacement: promtail1
    source_labels:
    - __journal__systemd_unit
    - __journal_syslog_identifier
    - __journal_container_name
    target_label: module_id
  - regex: (1004|3[7-9][0-9][0-9][0-9][0-9]|36[3-9][0-9][0-9][0-9]|362[2-9][0-9][0-9]|3621[5-9][0-9]|36214[4-9]|4[0-1][0-9][0-9][0-9][0-9]|42[0-6][0-9][0-9][0-9]|427[0-5][0-9][0-9]|4276[0-7][0-9]|42768[0-0])
    replacement: openldap1
    source_labels:
    - __journal__systemd_unit
    - __journal_syslog_identifier
    - __journal_container_name
    target_label: module_id
  - regex: (1004|3[7-9][0-9][0-9][0-9][0-9]|36[3-9][0-9][0-9][0-9]|362[2-9][0-9][0-9]|3621[5-9][0-9]|36214[4-9]|4[0-1][0-9][0-9][0-9][0-9]|42[0-6][0-9][0-9][0-9]|427[0-5][0-9][0-9]|4276[0-7][0-9]|42768[0-0])
    replacement: openldap1
    source_labels:
    - __journal__uid
    target_label: module_id
  - regex: (1005|4[3-8][0-9][0-9][0-9][0-9]|42[8-9][0-9][0-9][0-9]|427[7-9][0-9][0-9]|4276[9-9][0-9]|42768[0-9]|49[0-2][0-9][0-9][0-9]|493[0-1][0-9][0-9]|4932[0-0][0-9]|49321[0-6])
    replacement: mail1
    source_labels:
    - __journal__uid
    target_label: module_id
server:
  disable: true
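To make the effect of the relabel rules above easier to follow, here is a small Python simulation of the default "replace" action as described in the Prometheus relabeling documentation: the source label values are joined with `;`, the regex must match the whole joined string, and on a match the target label is set to the replacement. This is a simplified model, not Promtail's code. It ignores `$1`-style references in replacements, uses Python's `re` instead of Go's RE2, and the sample records and the abbreviated uid regex are assumptions.

```python
import re

def apply_rules(record_labels, rules):
    """Apply 'replace' relabel rules to one record's labels (simplified)."""
    labels = dict(record_labels)
    for rule in rules:
        # Missing labels count as empty strings; values are joined with ';'.
        joined = ";".join(labels.get(name, "")
                          for name in rule.get("source_labels", []))
        # The regex is anchored at both ends; the default matches anything.
        if re.fullmatch(rule.get("regex", "(.*)"), joined):
            labels[rule["target_label"]] = rule["replacement"]
    return labels

rules = [
    {"replacement": "1", "target_label": "node_id"},
    {   # abbreviated uid matcher, for illustration only
        "regex": "(1001|16553[6-9])",
        "replacement": "traefik1",
        "source_labels": ["__journal__uid"],
        "target_label": "module_id",
    },
    {
        "regex": r".*\bpromtail1\b.*",
        "replacement": "promtail1",
        "source_labels": [
            "__journal__systemd_unit",
            "__journal_syslog_identifier",
            "__journal_container_name",
        ],
        "target_label": "module_id",
    },
]

# A record written by a rootless module's user.
rootless = apply_rules({"__journal__uid": "1001"}, rules)
# A record from a rootfull module's systemd unit.
rootfull = apply_rules(
    {"__journal__uid": "0", "__journal__systemd_unit": "promtail1.service"},
    rules,
)
# A record unrelated to any module.
other = apply_rules(
    {"__journal__uid": "0", "__journal__systemd_unit": "sshd.service"},
    rules,
)
print(rootless["module_id"], rootfull["module_id"], "module_id" in other)
```

Under this model every record gets node_id, while module_id is set only when one of the module rules matches. Records from unrelated services keep no module_id label.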

@DavidePrincipi DavidePrincipi merged commit 67acf00 into main Jul 27, 2023
1 check passed
@DavidePrincipi DavidePrincipi deleted the logs-b2 branch July 27, 2023 16:13