Cloud-native logging

Docker, Kubernetes, sidecar pattern, fluentd, centralized logging

Cloud-native applications call for a different approach to logging: write to stdout instead of files, emit JSON instead of free-form text, and collect logs centrally instead of leaving them on each host. This follows factor XI ("Logs") of the twelve-factor app methodology.

1. Principles of cloud-native logging

1.1. 12-Factor App: Logs

Principle: "Treat logs as event streams"

❌ Legacy approach (monolith):
Application → File on disk → logrotate → scp to a server → grep
         │
         └─ Problems:
            • Logs are tied to a specific server
            • Scaling to N servers means N sets of files
            • Searching across all servers is hard
            • Logs are lost when a server dies

✅ Cloud-native approach:
Application → stdout → Log collector → Centralized storage → Search/Visualize
         │
         └─ Benefits:
            • Logs are centralized
            • Works the same regardless of the number of pods
            • Fast search and aggregation
            • Persistent storage

Rules of cloud-native logging:

#	Rule	Why
1	Write logs to stdout/stderr	Containers are ephemeral, so files inside them are lost
2	The application does not rotate logs	Rotation is the infrastructure's job (Kubernetes, Docker)
3	JSON format	Machine-readable for collectors
4	Centralized collection	A single place to search and analyze
5	Context in every log entry	request_id, trace_id, pod_name
6	No state in logs	Logs are a stream, not storage
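
Rule 5 (context in every log entry) can be sketched with the standard library alone. The example below is a minimal illustration, not a production setup: the names `request_id_var`, `RequestContextFilter`, `JsonLineFormatter` and the logger name `context_demo` are hypothetical, and an in-memory buffer stands in for stdout so the result can be inspected.

```python
import contextvars
import io
import json
import logging

# Hypothetical context variable that holds the ID of the current request.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestContextFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


class JsonLineFormatter(logging.Formatter):
    """Render each record as a single JSON line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": record.request_id,
        })


def build_logger(stream):
    """Create a logger that writes JSON lines with request context to stream."""
    logger = logging.getLogger("context_demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestContextFilter())
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


def demo():
    buffer = io.StringIO()  # stands in for sys.stdout
    logger = build_logger(buffer)
    request_id_var.set("abc-123")  # normally set once per incoming request
    logger.info("order created")
    return json.loads(buffer.getvalue())
```

In a real service the context variable would be set by request middleware, so application code can keep calling `logger.info(...)` without passing the ID around.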

1.2. Evolution of logging

┌─────────────────────────────────────────────────────────────────┐
│                    Evolution of Logging                          │
└─────────────────────────────────────────────────────────────────┘

Generation 1: Monolith (2000-2010)
┌──────────────┐
│   App        │ → /var/log/app.log (local file)
│   + DB       │ → logrotate + grep
└──────────────┘

Generation 2: SOA (2010-2015)
┌──────┐  ┌──────┐  ┌──────┐
│ Svc1 │  │ Svc2 │  │ Svc3 │ → Syslog server
└──────┘  └──────┘  └──────┘ → ELK Stack
     │         │         │
     └─────────┴─────────┘
           Network

Generation 3: Cloud-native (2015+)
┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
│Pod1 │ │Pod2 │ │Pod3 │ │PodN │ → stdout
└──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘
   │       │       │       │
   └───────┴───────┴───────┘
           │
     ┌─────▼─────┐
     │ DaemonSet │ → Fluent Bit / Fluentd
     │ Collector │
     └─────┬─────┘
           │
     ┌─────▼─────┐
     │   Loki    │ → Elasticsearch
     │  Storage  │
     └─────┬─────┘
           │
     ┌─────▼─────┐
     │  Grafana  │ → Kibana
     │  Visual   │
     └───────────┘

1.3. Cloud-native logging architecture

┌─────────────────────────────────────────────────────────────────┐
│              Cloud-native Logging Architecture                  │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                        Kubernetes Cluster                        │
│                                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                │
│  │ Node 1     │  │ Node 2     │  │ Node 3     │                │
│  │            │  │            │  │            │                │
│  │ ┌────────┐ │  │ ┌────────┐ │  │ ┌────────┐ │                │
│  │ │App Pod │ │  │ │App Pod │ │  │ │App Pod │ │                │
│  │ │ stdout │ │  │ │ stdout │ │  │ │ stdout │ │                │
│  │ └───┬────┘ │  │ └───┬────┘ │  │ └───┬────┘ │                │
│  │     │      │  │     │      │  │     │      │                │
│  │ ┌───▼────┐ │  │ ┌───▼────┐ │  │ ┌───▼────┐ │                │
│  │ │Fluent  │ │  │ │Fluent  │ │  │ │Fluent  │ │                │
│  │ │Bit     │ │  │ │Bit     │ │  │ │Bit     │ │                │
│  │ │(DS)    │ │  │ │(DS)    │ │  │ │(DS)    │ │                │
│  │ └───┬────┘ │  │ └───┬────┘ │  │ └───┬────┘ │                │
│  └─────┼──────┘  └─────┼──────┘  └─────┼──────┘                │
│        │               │               │                        │
│        └───────────────┼───────────────┘                        │
│                        │                                        │
└────────────────────────┼────────────────────────────────────────┘
                         │
                         ▼
              ┌──────────────────────┐
              │  Logging Namespace   │
              │                      │
              │ ┌──────────────────┐ │
              │ │  Elasticsearch   │ │ ← StatefulSet
              │ │  (Storage)       │ │
              │ └────────┬─────────┘ │
              │          │           │
              │ ┌────────▼─────────┐ │
              │ │     Kibana       │ │ ← Visualization
              │ └──────────────────┘ │
              └──────────────────────┘

2. Docker logging

2.1. stdout/stderr as the standard

# app.py
import logging
import sys
from pythonjsonlogger import jsonlogger

# Configure logging to stdout
logger = logging.getLogger()
logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
formatter = jsonlogger.JsonFormatter(
    '%(asctime)s %(name)s %(levelname)s %(message)s'
)
handler.setFormatter(formatter)
logger.addHandler(handler)

# Logs go to stdout, where Docker collects them
logger.info("Application started", extra={"version": "1.0.0"})

How Docker collects logs:

┌─────────────────────────────────────────────────────────────┐
│              Docker Logging Flow                             │
└─────────────────────────────────────────────────────────────┘

Application
     │
     │ print() / logging.info()
     ▼
┌─────────────────┐
│  Container      │
│  stdout/stderr  │
└────────┬────────┘
         │
         │ captured by the Docker daemon
         ▼
┌─────────────────┐
│  Logging Driver │
│  (json-file)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  /var/lib/docker/containers/<container-id>/<container-id>-json.log
└─────────────────┘
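
To make the last step of the flow concrete: with the `json-file` driver each captured line is stored as one JSON object with `log`, `stream` and `time` keys. The sketch below parses such a line. `SAMPLE_LINE` is an illustrative record, and the helper names are hypothetical.

```python
import json

# Illustrative record in the shape written by the json-file driver.
SAMPLE_LINE = (
    '{"log":"Application started\\n",'
    '"stream":"stdout",'
    '"time":"2026-03-21T10:00:00.123456789Z"}'
)


def parse_json_file_line(line):
    """Split one json-file record into message, stream and timestamp."""
    entry = json.loads(line)
    return {
        "message": entry["log"].rstrip("\n"),  # the driver keeps the newline
        "stream": entry["stream"],             # "stdout" or "stderr"
        "time": entry["time"],                 # RFC 3339 timestamp string
    }


def decode_app_payload(message):
    """Return the application's own JSON payload, or None for plain text."""
    try:
        payload = json.loads(message)
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None
```

When the application itself logs JSON, the `log` value is a JSON string nested inside the driver's JSON, which is why collectors usually decode it a second time.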

2.2. Docker logging drivers

# docker-compose.yml
services:
  app:
    image: myapp:latest
    logging:
      driver: json-file  # Default driver
      options:
        max-size: "10m"   # Rotate when the file reaches this size
        max-file: "3"     # Number of rotated files to keep

Comparison of logging drivers:

Driver	Description	When to use
`json-file`	Local JSON file	Development, single-node
`syslog`	Sends to syslog	Integrating with an existing syslog setup
`journald`	systemd journal	Linux with systemd
`fluentd`	Sends to Fluentd	Centralized logging
`awslogs`	AWS CloudWatch	AWS ECS/EC2
`gcplogs`	Google Cloud Logging	Workloads on GCP
`splunk`	Splunk HEC	Enterprises running Splunk
`loki`	Grafana Loki (installed as a Docker plugin)	PLG stack

2.3. Fluentd driver

# docker-compose.yml
version: '3.8'
services:
  app:
    image: myapp:latest
    logging:
      driver: fluentd
      options:
        fluentd-address: localhost:24224
        fluentd-async: "true"
        fluentd-buffer-limit: 512m
        fluentd-retry-wait: 1s
        fluentd-max-retries: 10
        tag: myapp

  fluentd:
    image: fluent/fluentd:v1.16
    ports:
      - "24224:24224"
    volumes:
      - ./fluentd/conf:/fluentd/etc

2.4. Fluentd configuration

# fluent.conf
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<filter myapp.**>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
    environment production
  </record>
</filter>

<match myapp.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix myapp
  flush_interval 5s
</match>

2.5. Docker log rotation

Problem: logs can fill up the disk.

Solution: configure rotation at the Docker daemon level.

# /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "compress": "true"
  }
}

Apply the change (the new defaults affect only containers created after the restart):

sudo systemctl restart docker
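
A quick way to sanity-check rotation settings is to compute the worst-case disk usage: `max-size` × `max-file` per container. The sketch below does that arithmetic. It assumes size suffixes are decimal (`10m` treated as 10,000,000 bytes), which is an assumption of this example; the helper names are hypothetical, and compressed rotated files would take less space than this upper bound.

```python
# Assumption: suffixes are decimal multiples (k = 10^3, m = 10^6, g = 10^9).
DECIMAL_UNITS = {"k": 10**3, "m": 10**6, "g": 10**9}


def parse_size(value):
    """Convert a size such as "10m" into bytes."""
    text = value.strip().lower()
    if text and text[-1] in DECIMAL_UNITS:
        return int(text[:-1]) * DECIMAL_UNITS[text[-1]]
    return int(text)


def max_log_bytes(max_size, max_file, containers=1):
    """Upper bound on uncompressed log size for the given rotation settings."""
    return parse_size(max_size) * int(max_file) * containers


per_container = max_log_bytes("10m", "3")              # 30,000,000 bytes
per_host = max_log_bytes("10m", "3", containers=40)    # 1,200,000,000 bytes
```

With the settings above, a host running 40 containers needs roughly 1.2 GB of headroom for container logs.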

3. Kubernetes logging

3.1. Logging patterns in Kubernetes

┌─────────────────────────────────────────────────────────────┐
│              Kubernetes Logging Patterns                     │
└─────────────────────────────────────────────────────────────┘

Pattern 1: Node-level logging agent (DaemonSet)
┌─────────┐  ┌─────────┐  ┌─────────┐
│  Pod 1  │  │  Pod 2  │  │  Pod 3  │
└────┬────┘  └────┬────┘  └────┬────┘
     │           │           │
     └───────────┼───────────┘
                 │
           ┌─────▼─────┐
           │  Agent    │ ← DaemonSet on every node
           │ (Fluentd) │
           └─────┬─────┘
                 │
                 ▼
           Storage (Loki/ES)

Pattern 2: Sidecar container
┌─────────────────────────┐
│         Pod             │
│ ┌─────────┐  ┌────────┐ │
│ │   App   │  │ Sidecar│ │
│ │ stdout  │──│ Agent  │ │
│ └─────────┘  └───┬────┘ │
└──────────────────┼──────┘
                   │
                   ▼
             Storage

Pattern 3: Direct to backend
┌─────────┐
│   App   │ → CloudWatch / GCP Logging
└─────────┘

3.2. Basic application configuration

# logging_config.py
import logging
import sys
import os
from pythonjsonlogger import jsonlogger

def setup_logging():
    """Configure logging for Kubernetes."""
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # Remove existing handlers so repeated calls don't duplicate output
    logger.handlers.clear()

    # JSON formatter for cloud-native environments
    formatter = jsonlogger.JsonFormatter(
        '%(asctime)s %(name)s %(levelname)s %(message)s '
        '%(filename)s %(lineno)d %(funcName)s'
    )

    # stdout handler
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    return logger

# Adding Kubernetes metadata
def get_k8s_metadata():
    """Read Kubernetes metadata from environment variables."""
    return {
        "pod_name": os.getenv("POD_NAME", "unknown"),
        "namespace": os.getenv("NAMESPACE", "default"),
        "node_name": os.getenv("NODE_NAME", "unknown"),
        "container_name": os.getenv("CONTAINER_NAME", "unknown"),
    }
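
The metadata helper above only returns a dict; one way to get those values onto every log line is a `logging.Filter` that copies them onto each record. This is a sketch under assumptions: `read_k8s_metadata`, `K8sMetadataFilter` and the logger name are hypothetical, the environment is passed in explicitly so the example is reproducible, and a plain text format is used instead of JSON to keep it short.

```python
import io
import logging
import os


def read_k8s_metadata(environ=None):
    """Read pod metadata from environment variables (set via the Downward API)."""
    environ = os.environ if environ is None else environ
    return {
        "pod_name": environ.get("POD_NAME", "unknown"),
        "namespace": environ.get("NAMESPACE", "default"),
        "node_name": environ.get("NODE_NAME", "unknown"),
    }


class K8sMetadataFilter(logging.Filter):
    """Attach static pod metadata to every record passing through a handler."""

    def __init__(self, metadata):
        super().__init__()
        self.metadata = dict(metadata)  # read once at startup

    def filter(self, record):
        for key, value in self.metadata.items():
            setattr(record, key, value)
        return True


def demo(environ):
    stream = io.StringIO()  # stands in for sys.stdout
    handler = logging.StreamHandler(stream)
    handler.addFilter(K8sMetadataFilter(read_k8s_metadata(environ)))
    handler.setFormatter(logging.Formatter(
        "%(levelname)s pod=%(pod_name)s ns=%(namespace)s %(message)s"
    ))
    logger = logging.getLogger("k8s_metadata_demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.info("started")
    return stream.getvalue().strip()
```

Because the values are plain record attributes, a formatter that serializes extra record attributes should be able to emit them as structured fields as well.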

3.3. Kubernetes Deployment with the Downward API

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.2.3
    spec:
      containers:
      - name: app
        image: myapp:1.2.3
        env:
        # Downward API for pod metadata
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        # The Downward API has no field for the container name
        # (metadata.name is the pod name), so set it statically
        - name: CONTAINER_NAME
          value: app
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

3.4. Fluent Bit DaemonSet

# fluent-bit-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    app: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      priorityClassName: system-node-critical
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.1
        ports:
        - containerPort: 2020
          name: http
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config
          mountPath: /fluent-bit/etc/
        resources:
          limits:
            memory: 128Mi
          requests:
            cpu: 100m
            memory: 64Mi
        livenessProbe:
          httpGet:
            path: /api/v1/health
            port: http
        readinessProbe:
          httpGet:
            path: /api/v1/health
            port: http
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluent-bit-config

3.5. Fluent Bit Configuration

# fluent-bit-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
        Refresh_Interval  10
        Rotate_Wait       30

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Labels              On
        Annotations         Off

    [OUTPUT]
        Name            es
        Match           *
        Host            elasticsearch-master.logging.svc.cluster.local
        Port            9200
        Logstash_Format On
        Logstash_Prefix k8s-logs
        Retry_Limit     False
        Replace_Dots    On
        # Needed for Elasticsearch 8.x, which no longer accepts mapping types
        Suppress_Type_Name On

    [OUTPUT]
        Name            loki
        Match           *
        Host            loki.logging.svc.cluster.local
        Port            3100
        Labels          job=fluent-bit

  parsers.conf: |
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        Decode_Field_As escaped_utf8 log do_next
        Decode_Field_As json log

    [PARSER]
        Name        json
        Format      json
        Time_Key    timestamp
        Time_Format %Y-%m-%dT%H:%M:%S.%L

3.6. Fluent Bit RBAC

# fluent-bit-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit
rules:
- apiGroups: [""]
  resources:
  # The kubernetes filter only reads pod and namespace metadata
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: logging

4. Logging in managed Kubernetes

4.1. GKE + Cloud Logging

# gke-logging.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
# The Cloud Logging agent is preinstalled on GKE,
# so no additional setup is required.

# For custom log settings:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  # Cloud Logging collects stdout automatically
  LOG_FORMAT: "json"
  LOG_LEVEL: "INFO"

Python application for GKE:

import logging

import google.cloud.logging

# Initialize Cloud Logging and attach its handler to the root logger
client = google.cloud.logging.Client()
client.setup_logging()

# Standard logging calls are now sent to Cloud Logging
logger = logging.getLogger(__name__)
logger.info("Application started")

# With metadata: google-cloud-logging 3.x reads custom structured
# fields from the "json_fields" key of `extra`
logger.info(
    "User action",
    extra={
        "json_fields": {
            "user_id": 123,
            "request_id": "abc-123"
        }
    }
)

4.2. EKS + CloudWatch

# eks-fluentd.yaml
# Fluentd DaemonSet for EKS
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-cloudwatch
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd-cloudwatch
  template:
    metadata:
      labels:
        app: fluentd-cloudwatch
    spec:
      serviceAccountName: fluentd-cloudwatch
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.16
        env:
        - name: AWS_REGION
          value: us-east-1
        - name: FLUENT_AWS_LOG_GROUP_NAME
          value: /eks/my-cluster
        - name: FLUENT_AWS_LOG_STREAM_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          limits:
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 128Mi

IAM policy for Fluentd:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}

4.3. AKS + Azure Monitor

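# aks-logging.yaml
# Assumes Container Insights (Azure Monitor) is enabled on the cluster.
# To customize log collection:

apiVersion: v1
kind: ConfigMap
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system
data:
  schema-version: v1
  config-version: ver1
  # Log collection settings (TOML)
  log-data-collection-settings: |-
    [log_collection_settings]
       [log_collection_settings.stdout]
          enabled = true
          exclude_namespaces = ["kube-system", "logging"]
       [log_collection_settings.stderr]
          enabled = true
          exclude_namespaces = ["kube-system", "logging"]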

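4.4. Comparison of managed solutions

Prices are approximate list prices; they vary by region and change over time.

Provider	Solution	Approximate cost	Notes
GKE	Cloud Logging	$0.50/GB ingested	First 50 GB per month free
EKS	CloudWatch	$0.50/GB ingested	Integrated with AWS
AKS	Azure Monitor	$2.50/GB ingested	Integrated with Azure
Self-hosted	Loki + S3	$0.023/GB-month stored	Cheaper, but more to operate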
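
The table can be turned into a rough monthly estimate. The sketch below uses only the per-GB ingestion prices and the GKE free tier listed above. The `PRICING` keys and the function name are hypothetical, and the result ignores retention, queries and other charges.

```python
# Ingestion prices and free tier taken from the comparison table above.
PRICING = {
    "gke_cloud_logging": {"per_gb": 0.50, "free_gb": 50},
    "eks_cloudwatch": {"per_gb": 0.50, "free_gb": 0},
    "aks_azure_monitor": {"per_gb": 2.50, "free_gb": 0},
}


def monthly_ingest_cost(provider, gb_per_month):
    """Estimate the monthly ingestion cost in USD for a given log volume."""
    plan = PRICING[provider]
    billable_gb = max(0.0, gb_per_month - plan["free_gb"])
    return round(billable_gb * plan["per_gb"], 2)


# Example: 200 GB of logs per month
estimates = {name: monthly_ingest_cost(name, 200) for name in PRICING}
```

For 200 GB per month this gives 75.0 for GKE (150 billable GB), 100.0 for EKS and 500.0 for AKS.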

5. Sidecar pattern

5.1. When to use it

┌─────────────────────────────────────────────────────────────┐
│              DaemonSet vs Sidecar Decision Tree              │
└─────────────────────────────────────────────────────────────┘

Does each application need custom log processing?
    │
    ├─ No → DaemonSet (more resource-efficient)
    │         │
    │         └─ One collector per node
    │         └─ ~50-100MB RAM per node
    │         └─ Simpler to manage
    │
    └─ Yes → Sidecar (more flexible)
              │
              └─ A collector in every pod
              └─ ~50-100MB RAM per pod
              └─ Custom configuration

Comparison:

Criterion	DaemonSet	Sidecar
Resources	Efficient (one per node)	Higher (one per pod)
Flexibility	Shared configuration	Per-pod configuration
Management	Centralized	Distributed
Isolation	No	Yes
Use when	Standard logs	Multi-tenancy, compliance
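
The resource difference is easy to quantify. The sketch below compares collector memory for the two patterns, taking the upper figure of about 100 MB per collector from the decision tree. The cluster size in the example is made up for illustration, and the function name is hypothetical.

```python
def collector_memory_mb(nodes, pods_per_node, mb_per_collector=100):
    """Total collector memory for the DaemonSet and sidecar patterns."""
    daemonset_mb = nodes * mb_per_collector                 # one collector per node
    sidecar_mb = nodes * pods_per_node * mb_per_collector   # one collector per pod
    return {"daemonset_mb": daemonset_mb, "sidecar_mb": sidecar_mb}


# Illustrative cluster: 10 nodes with 30 application pods each
usage = collector_memory_mb(nodes=10, pods_per_node=30)
```

For this cluster the DaemonSet pattern uses about 1,000 MB in total, while sidecars use about 30,000 MB, which is why sidecars are usually reserved for workloads that need per-pod configuration or isolation.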

5.2. Sidecar example

# deployment-with-sidecar.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      # Main application
      - name: app
        image: myapp:latest
        volumeMounts:
        - name: logs
          mountPath: /app/logs
        env:
        - name: LOG_FILE
          value: /app/logs/app.log
        - name: LOG_FORMAT
          value: "json"
        resources:
          limits:
            memory: "512Mi"
            cpu: "500m"

      # Sidecar: Fluent Bit
      - name: fluent-bit
        image: fluent/fluent-bit:2.1
        volumeMounts:
        - name: logs
          mountPath: /app/logs
          readOnly: true
        - name: config
          mountPath: /fluent-bit/etc/
        resources:
          limits:
            memory: "64Mi"
            cpu: "100m"

      volumes:
      - name: logs
        emptyDir: {}
      - name: config
        configMap:
          name: fluent-bit-sidecar-config

# fluent-bit-sidecar-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-sidecar-config
  namespace: production
data:
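  fluent-bit.conf: |
    [SERVICE]
        Flush 1
        Log_Level info
        Parsers_File parsers.conf

    [INPUT]
        Name tail
        Path /app/logs/app.log
        Parser json
        Tag myapp.logs

    [FILTER]
        Name record_modifier
        Match *
        Record hostname ${HOSTNAME}
        Record service myapp

    [OUTPUT]
        Name  forward
        Match *
        Host  fluentd-aggregator.logging.svc.cluster.local
        Port  24224

  # The ConfigMap replaces /fluent-bit/etc/, so the "json" parser
  # referenced by the tail input has to be defined here as well
  parsers.conf: |
    [PARSER]
        Name        json
        Format      json
        Time_Key    timestamp
        Time_Format %Y-%m-%dT%H:%M:%S.%L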

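5.3. Sidecar for multi-tenant applications

# multi-tenant-sidecar.yaml
# Different tenants → different destinations
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-tenant-config
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush 1

    [INPUT]
        Name tail
        Path /app/logs/tenant-*.log
        Parser json
        # Derive the tag from the file name: tenant-a.log → tenant.tenant-a.logs
        Tag_Regex (?<tenant>tenant-[a-z0-9-]+)\.log$
        Tag tenant.<tenant>.logs

    [FILTER]
        # Keep only tenant-a records whose "log" field contains "sensitive"
        Name grep
        Match tenant.tenant-a.*
        Regex log .*sensitive.*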

    [OUTPUT]
        Name   es
        Match  tenant.tenant-a.*
        Host   es-tenant-a.logging.svc
        Index  tenant-a-logs

    [OUTPUT]
        Name   es
        Match  tenant.tenant-b.*
        Host   es-tenant-b.logging.svc
        Index  tenant-b-logs

6. Centralized logging stacks

6.1. Choosing a stack

┌─────────────────────────────────────────────────────────────┐
│              Logging Stack Selection                         │
└─────────────────────────────────────────────────────────────┘

Do you need full-text search?
    │
    ├─ Yes → ELK Stack (Elasticsearch, Logstash, Kibana)
    │         │
    │         ├─ Full-text search over log content
    │         ├─ More resources (RAM, CPU, disk)
    │         └─ Harder to operate
    │
    └─ No → PLG Stack (Promtail, Loki, Grafana)
              │
              ├─ Only labels are indexed
              ├─ Cheaper storage (S3, GCS)
              └─ Easier to operate

6.2. ELK Stack (Elasticsearch, Logstash, Kibana)

# elasticsearch-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  serviceName: elasticsearch
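  # discovery.type=single-node (below) makes every replica its own
  # one-node cluster, so run a single replica with this configuration
  replicas: 1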
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.10.0
        env:
        - name: discovery.type
          value: single-node
        - name: cluster.name
          value: logging-cluster
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ES_JAVA_OPTS
          value: "-Xms1g -Xmx1g"
        - name: xpack.security.enabled
          value: "false"
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        resources:
          requests:
            cpu: "500m"
            memory: "2Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
        livenessProbe:
          httpGet:
            path: /_cluster/health
            port: 9200
          initialDelaySeconds: 60
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /_cluster/health
            port: 9200
          initialDelaySeconds: 30
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi
      storageClassName: gp2

# kibana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:8.10.0
        env:
        - name: ELASTICSEARCH_HOSTS
          value: "http://elasticsearch-0.elasticsearch:9200"
        ports:
        - containerPort: 5601
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
spec:
  selector:
    app: kibana
  ports:
  - port: 5601
    targetPort: 5601
  type: LoadBalancer

6.3. PLG Stack (Promtail, Loki, Grafana)

# loki-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki
  namespace: logging
spec:
  serviceName: loki
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
      - name: loki
        image: grafana/loki:2.9.0
        args:
        - -config.file=/etc/loki/loki.yaml
        ports:
        - containerPort: 3100
          name: http
        volumeMounts:
        - name: config
          mountPath: /etc/loki
        - name: data
          mountPath: /loki
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
      volumes:
      - name: config
        configMap:
          name: loki-config
      - name: data
        # emptyDir is lost when the pod is rescheduled;
        # use a PersistentVolumeClaim outside of demos
        emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
  namespace: logging
data:
  loki.yaml: |
    auth_enabled: false

    server:
      http_listen_port: 3100
      grpc_listen_port: 9096

    common:
      instance_addr: 127.0.0.1
      path_prefix: /loki

    schema_config:
      configs:
      - from: 2023-01-01
        store: tsdb
        object_store: filesystem
        schema: v13
        index:
          prefix: index_
          period: 24h

    storage_config:
      filesystem:
        directory: /loki/chunks

    limits_config:
      # 7 days; enforced only when the compactor runs with retention_enabled: true
      retention_period: 168h
      max_entries_limit_per_query: 5000

    ingester:
      wal:
        enabled: true
        dir: /loki/wal

# promtail-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail
  namespace: logging
spec:
  selector:
    matchLabels:
      app: promtail
  template:
    metadata:
      labels:
        app: promtail
    spec:
      serviceAccountName: promtail
      containers:
      - name: promtail
        image: grafana/promtail:2.9.0
        args:
        - -config.file=/etc/promtail/promtail.yaml
        volumeMounts:
        - name: config
          mountPath: /etc/promtail
        - name: run
          mountPath: /run/promtail
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: pods
          mountPath: /var/log/pods
          readOnly: true
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          limits:
            memory: 128Mi
          requests:
            cpu: 100m
            memory: 64Mi
      volumes:
      - name: config
        configMap:
          name: promtail-config
      - name: run
        hostPath:
          path: /run/promtail
      - name: containers
        hostPath:
          path: /var/lib/docker/containers
      - name: pods
        hostPath:
          path: /var/log/pods

# promtail-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: logging
data:
  promtail.yaml: |
    server:
      http_listen_port: 9080
      grpc_listen_port: 0

    positions:
      filename: /run/promtail/positions.yaml

    clients:
    - url: http://loki.logging.svc.cluster.local:3100/loki/api/v1/push

    scrape_configs:
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
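      pipeline_stages:
      - docker: {}
      - json:
          expressions:
            level: level
      # Promote only low-cardinality fields to labels; a value such as
      # user_id would create a separate stream per user
      - labels:
          level:
      relabel_configs:
      - source_labels:
        - __meta_kubernetes_pod_label_app
        target_label: app
      - source_labels:
        - __meta_kubernetes_namespace
        target_label: namespace
      - source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod
      # Tell Promtail which files on the node belong to each discovered pod
      - source_labels:
        - __meta_kubernetes_pod_uid
        - __meta_kubernetes_pod_container_name
        separator: /
        target_label: __path__
        replacement: /var/log/pods/*$1/*.log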
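
Once Promtail attaches the `app`, `namespace` and `pod` labels, logs are queried in Loki by label selector plus an optional line filter. The sketch below builds a LogQL query and the matching `query_range` URL without sending a request. The helper names and the base URL are hypothetical, and the timestamps are arbitrary nanosecond values.

```python
from urllib.parse import urlencode


def logql_selector(labels, contains=None):
    """Build a LogQL stream selector, optionally with a substring line filter."""
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    query = "{" + matchers + "}"
    if contains:
        query += f' |= "{contains}"'
    return query


def build_query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint."""
    params = urlencode({
        "query": logql,
        "start": start_ns,  # Unix time in nanoseconds
        "end": end_ns,
        "limit": limit,
    })
    return f"{base_url}/loki/api/v1/query_range?{params}"


query = logql_selector({"app": "myapp", "namespace": "production"}, contains="timeout")
url = build_query_range_url(
    "http://loki.logging.svc.cluster.local:3100",
    query,
    start_ns=1_700_000_000_000_000_000,
    end_ns=1_700_000_600_000_000_000,
)
```

The selector narrows the search to a few streams by label first, and the `|=` filter then scans only those streams' content, which is how Loki works without a full-text index.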

6.4. Loki with S3 storage

# loki-s3-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
data:
  loki.yaml: |
    schema_config:
      configs:
      - from: 2023-01-01
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: index_
          period: 24h

    storage_config:
      aws:
        region: us-east-1
        bucketnames: loki-bucket

    limits_config:
      retention_period: 720h  # 30 days

6.5. Stack comparison

Parameter	ELK	PLG (Loki)	Managed cloud service
Storage	Local SSD / EBS	S3 / GCS	Managed
Cost	$$$	$	$$
Search	Full-text	Label selectors plus line filters	Depends on the provider
Resources	High	Low	Managed
Complexity	High	Medium	Low
Scaling	Hard	Easy	Automatic

7. Structured logging для cloud

7.1. JSON формат

import logging
from pythonjsonlogger import jsonlogger
import os
import socket
import uuid
from datetime import datetime

class CloudFormatter(jsonlogger.JsonFormatter):
    """Cloud-native JSON formatter с метаданными."""

    def __init__(self):
        super().__init__()
        # Генерация instance_id для этого процесса
        self.instance_id = str(uuid.uuid4())[:8]

    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)

        # Стандартизация полей (OpenTelemetry compatible)
        log_record['timestamp'] = datetime.utcnow().isoformat() + 'Z'
        log_record['severity'] = record.levelname
        log_record['logger'] = record.name

        # Kubernetes metadata
        log_record['k8s'] = {
            'pod_name': os.getenv('POD_NAME', 'unknown'),
            'namespace': os.getenv('NAMESPACE', 'default'),
            'node_name': os.getenv('NODE_NAME', 'unknown'),
            'container_name': os.getenv('CONTAINER_NAME', 'unknown'),
        }

        # Host info
        log_record['host'] = {
            'hostname': socket.gethostname(),
            'instance_id': self.instance_id,
        }

        # Process info
        log_record['process'] = {
            'pid': os.getpid(),
            'thread': record.thread,
        }

        # Location info
        log_record['location'] = {
            'file': record.filename,
            'line': record.lineno,
            'function': record.funcName,
        }

        # Remove duplicated fields
        for field in ['asctime', 'levelname', 'name', 'filename',
                      'lineno', 'funcName', 'thread']:
            log_record.pop(field, None)

# Usage (a named logger, so the "logger" field reads "app.api")
logger = logging.getLogger("app.api")
handler = logging.StreamHandler()
handler.setFormatter(CloudFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("User logged in", extra={
    "user_id": 123,
    "request_id": "abc-123",
    "trace_id": "xyz-789"
})

Output:

{
  "timestamp": "2026-03-21T10:00:00.123456Z",
  "severity": "INFO",
  "logger": "app.api",
  "message": "User logged in",
  "user_id": 123,
  "request_id": "abc-123",
  "trace_id": "xyz-789",
  "k8s": {
    "pod_name": "myapp-abc123-def456",
    "namespace": "production",
    "node_name": "gke-cluster-default-pool-abc123",
    "container_name": "myapp"
  },
  "host": {
    "hostname": "myapp-abc123-def456",
    "instance_id": "a1b2c3d4"
  },
  "process": {
    "pid": 1,
    "thread": 140234567890
  },
  "location": {
    "file": "api.py",
    "line": 42,
    "function": "login"
  }
}

7.2. Minimal set of fields

# Required fields for cloud-native logs
REQUIRED_FIELDS = {
    'timestamp': 'ISO 8601 format (2026-03-21T10:00:00.123Z)',
    'severity': 'INFO, WARNING, ERROR, CRITICAL',
    'message': 'Human-readable message',
    'logger': 'Logger name',
}

# Recommended fields for tracing
TRACING_FIELDS = {
    'trace_id': 'Distributed trace ID (32 hex chars)',
    'span_id': 'Span ID (16 hex chars)',
    'request_id': 'Unique request ID',
    'parent_span_id': 'Parent span ID (optional)',
}

# Context fields
CONTEXT_FIELDS = {
    'user_id': 'User ID',
    'session_id': 'Session ID',
    'duration_ms': 'Execution time in ms',
    'error_type': 'Error type for classification',
    'error_code': 'Error code for alerting',
}

# Infrastructure fields
INFRA_FIELDS = {
    'k8s.pod_name': 'Pod name',
    'k8s.namespace': 'Namespace',
    'k8s.node_name': 'Node name',
    'host.hostname': 'Host name',
    'process.pid': 'Process PID',
}
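
The field lists above can double as a contract that is checked in tests or in a log pipeline. A minimal sketch, assuming log lines have already been decoded from JSON into dicts; `validate_log_record` and `HEX_LENGTHS` are hypothetical names introduced for illustration, and the hex lengths come from the descriptions of `trace_id` and `span_id` above.

```python
import re

# Required field names taken from the list above
REQUIRED = ("timestamp", "severity", "message", "logger")
# Expected lengths of the tracing IDs, in hex characters
HEX_LENGTHS = {"trace_id": 32, "span_id": 16}


def validate_log_record(record: dict) -> list[str]:
    """Return a list of problems found in one decoded JSON log line."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED if name not in record]
    for name, length in HEX_LENGTHS.items():
        value = record.get(name)
        # Tracing fields are optional, so they are checked only when present
        if value is not None and not re.fullmatch(rf"[0-9a-f]{{{length}}}", str(value)):
            problems.append(f"{name} must be {length} lowercase hex chars")
    return problems


good = {"timestamp": "2026-03-21T10:00:00.123Z", "severity": "INFO",
        "message": "User logged in", "logger": "app.api",
        "trace_id": "0af7651916cd43dd8448eb211c80319c"}
bad = {"severity": "INFO", "message": "oops", "trace_id": "xyz-789"}

print(validate_log_record(good))
print(validate_log_record(bad))
```

The first call reports no problems; the second reports the two missing required fields and the malformed `trace_id`.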

7.3. OpenTelemetry format

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
import logging
from pythonjsonlogger import jsonlogger

# Configure tracing
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter())
)

tracer = trace.get_tracer(__name__)

class OTLPFormatter(jsonlogger.JsonFormatter):
    """Formatter compatible with OpenTelemetry."""

    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)

        # Get the current span context
        span = trace.get_current_span()
        span_context = span.get_span_context()

        # Add the tracing context
        log_record['trace_id'] = format(span_context.trace_id, '032x')
        log_record['span_id'] = format(span_context.span_id, '016x')
        log_record['trace_flags'] = span_context.trace_flags

# Usage with tracing
logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(OTLPFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # the root logger defaults to WARNING

with tracer.start_as_current_span("process_request") as span:
    logger.info("Processing request", extra={"user_id": 123})
    # The record carries the trace_id and span_id of the active span

8. Distributed tracing integration

8.1. Context in logs

import logging
from contextvars import ContextVar
from functools import wraps

# Context variables for tracing
trace_id_var: ContextVar[str] = ContextVar('trace_id', default='')
request_id_var: ContextVar[str] = ContextVar('request_id', default='')
user_id_var: ContextVar[str] = ContextVar('user_id', default='')

class TracingFormatter(logging.Formatter):
    """Formatter with tracing context."""

    def format(self, record):
        # Add the tracing context from contextvars
        record.trace_id = trace_id_var.get()
        record.request_id = request_id_var.get()
        record.user_id = user_id_var.get()
        return super().format(record)

# Middleware that extracts the context
def tracing_middleware(func):
    @wraps(func)
    async def wrapper(request, *args, **kwargs):
        # Extract from headers
        trace_id = request.headers.get('X-Trace-ID', '')
        request_id = request.headers.get('X-Request-ID', '')
        user_id = request.headers.get('X-User-ID', '')

        # Store in the context
        trace_id_var.set(trace_id)
        request_id_var.set(request_id)
        user_id_var.set(user_id)

        return await func(request, *args, **kwargs)
    return wrapper

# Usage
logger = logging.getLogger()
handler = logging.StreamHandler()
# %(asctime)s is the standard LogRecord time field; %(timestamp)s does not exist
handler.setFormatter(TracingFormatter(
    '%(asctime)s %(trace_id)s %(request_id)s %(levelname)s %(message)s'
))
logger.addHandler(handler)
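
Context variables suit this job because each asyncio task works on its own copy of the context, so concurrent requests do not overwrite each other's values. A minimal sketch of that isolation, where the `handle` coroutine is a hypothetical stand-in for a request handler:

```python
import asyncio
from contextvars import ContextVar

trace_id_var: ContextVar[str] = ContextVar("trace_id", default="")


async def handle(trace_id: str) -> str:
    """Set the trace ID, yield to other tasks, then read it back."""
    trace_id_var.set(trace_id)
    await asyncio.sleep(0)  # let the other task run in between
    return trace_id_var.get()


async def main() -> list[str]:
    # gather() wraps each coroutine in a task with its own context copy
    return await asyncio.gather(handle("trace-a"), handle("trace-b"))


results = asyncio.run(main())
print(results)
```

Each task reads back the value it set itself, even though both ran interleaved on the same event loop.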

8.2. Propagation between services

┌─────────────────────────────────────────────────────────────┐
│              Distributed Tracing Flow                        │
└─────────────────────────────────────────────────────────────┘

Client
   │
   │ X-Trace-ID: abc123
   │ X-Request-ID: req-456
   ▼
┌─────────────────┐
│  API Gateway    │
│  (generates     │
│   trace_id)     │
└────────┬────────┘
         │
         │ X-Trace-ID: abc123
         │ X-Request-ID: req-456
         ▼
┌─────────────────┐
│   Users API     │
│   (logs with    │
│    context)     │
└────────┬────────┘
         │
         │ X-Trace-ID: abc123
         │ X-Request-ID: req-456
         ▼
┌─────────────────┐
│   Orders API    │
│   (logs with    │
│    context)     │
└─────────────────┘

All logs share the same trace_id, so the whole request can be followed across services
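
For the flow above to work, every service has to copy the IDs it received into the headers of its own downstream calls. A minimal sketch, assuming the IDs live in context variables as in 8.1; `outgoing_headers` is a hypothetical helper, and the header names are the ones shown in the diagram.

```python
import uuid
from contextvars import ContextVar

trace_id_var: ContextVar[str] = ContextVar("trace_id", default="")
request_id_var: ContextVar[str] = ContextVar("request_id", default="")


def outgoing_headers() -> dict[str, str]:
    """Build headers for a downstream call from the current context."""
    # Reuse the incoming trace ID; create one only when none was received
    trace_id = trace_id_var.get() or uuid.uuid4().hex
    headers = {"X-Trace-ID": trace_id}
    request_id = request_id_var.get()
    if request_id:
        headers["X-Request-ID"] = request_id
    return headers


trace_id_var.set("abc123")
request_id_var.set("req-456")
print(outgoing_headers())
```

The returned dict can be passed as the `headers` argument of whichever HTTP client the service uses.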

8.3. FastAPI example

from fastapi import FastAPI, Request
from contextlib import asynccontextmanager
import logging
import uuid
from pythonjsonlogger import jsonlogger

# Context storage
from contextvars import ContextVar

context_trace_id = ContextVar('trace_id', default='')
context_request_id = ContextVar('request_id', default='')

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Configure logging at startup
    logger = logging.getLogger()
    logger.handlers = []
    handler = logging.StreamHandler()
    handler.setFormatter(jsonlogger.JsonFormatter(
        '%(asctime)s %(trace_id)s %(request_id)s %(levelname)s %(message)s'
    ))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    yield

app = FastAPI(lifespan=lifespan)

@app.middleware("http")
async def tracing_middleware(request: Request, call_next):
    # Extract trace_id and request_id from headers, or generate them
    trace_id = request.headers.get('X-Trace-ID', str(uuid.uuid4()))
    request_id = request.headers.get('X-Request-ID', str(uuid.uuid4()))

    # Store in the context
    context_trace_id.set(trace_id)
    context_request_id.set(request_id)

    logger = logging.getLogger(__name__)
    logger.info("Incoming request", extra={
        "trace_id": trace_id,
        "request_id": request_id,
        "method": request.method,
        "path": request.url.path,
    })

    response = await call_next(request)

    logger.info("Outgoing response", extra={
        "trace_id": trace_id,
        "request_id": request_id,
        "status_code": response.status_code,
    })

    # Add the headers to the response
    response.headers['X-Trace-ID'] = trace_id
    response.headers['X-Request-ID'] = request_id

    return response

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    logger = logging.getLogger(__name__)
    trace_id = context_trace_id.get()
    request_id = context_request_id.get()

    logger.debug(f"Fetching user {user_id}", extra={
        "trace_id": trace_id,
        "request_id": request_id,
        "user_id": user_id,
    })

    return {"user_id": user_id}
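
Passing `trace_id` and `request_id` through `extra=` on every call is easy to forget. One alternative is a `logging.Filter` that copies the context values onto each record. A minimal sketch using only the standard library; `ContextFilter` and the `context_demo` logger name are hypothetical, and a `StringIO` stream stands in for stdout so the result can be inspected.

```python
import io
import logging
from contextvars import ContextVar

context_trace_id = ContextVar("trace_id", default="")
context_request_id = ContextVar("request_id", default="")


class ContextFilter(logging.Filter):
    """Copy the current context values onto every record."""

    def filter(self, record):
        # Do not overwrite values passed explicitly through extra=
        if not hasattr(record, "trace_id"):
            record.trace_id = context_trace_id.get()
        if not hasattr(record, "request_id"):
            record.request_id = context_request_id.get()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(ContextFilter())
handler.setFormatter(logging.Formatter("%(trace_id)s %(request_id)s %(message)s"))

demo_logger = logging.getLogger("context_demo")
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False

context_trace_id.set("abc123")
context_request_id.set("req-456")
demo_logger.info("Fetching user")  # no extra= needed
print(stream.getvalue(), end="")
```

Attaching the filter to the handler rather than to one logger means records from every logger that reaches this handler get the same fields.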

9. Graceful shutdown

9.1. Handling SIGTERM

import signal
import logging
import sys
import time
from logging.handlers import QueueHandler, QueueListener
import queue
import threading

class GracefulLoggingManager:
    """Logging manager with graceful shutdown."""

    def __init__(self):
        self.log_queue = queue.Queue(maxsize=10000)
        self.queue_handler = QueueHandler(self.log_queue)
        self.listener = None
        self._shutdown = threading.Event()
        self._shutdown_timeout = 30  # seconds

    def start(self):
        # Configure the handler
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter('%(message)s'))

        # Start the listener
        self.listener = QueueListener(self.log_queue, handler)
        self.listener.start()

        # Configure the logger
        logger = logging.getLogger()
        logger.addHandler(self.queue_handler)
        logger.setLevel(logging.INFO)

        # Register signal handlers
        signal.signal(signal.SIGTERM, self._signal_handler)
        signal.signal(signal.SIGINT, self._signal_handler)

        logging.getLogger(__name__).info("Logging manager started")

    def _signal_handler(self, signum, frame):
        logger = logging.getLogger(__name__)
        logger.info(f"Received signal {signum}, initiating graceful shutdown...")

        self._shutdown.set()
        self.stop()

        sys.exit(0)

    def stop(self):
        """Graceful shutdown of logging; safe to call more than once."""
        if self.listener is None:
            return

        logging.getLogger(__name__).info("Flushing log queue...")

        # Wait for the listener to drain the queue, bounded by the timeout
        deadline = time.monotonic() + self._shutdown_timeout
        while not self.log_queue.empty() and time.monotonic() < deadline:
            time.sleep(0.1)

        # stop() enqueues a sentinel and joins the listener thread, so
        # records still in the queue are handled before it returns
        listener, self.listener = self.listener, None
        listener.stop()

        # The queue is no longer drained, so write the last line directly
        logging.getLogger().removeHandler(self.queue_handler)
        print("Logging manager stopped", file=sys.stdout, flush=True)

# Usage
if __name__ == '__main__':
    manager = GracefulLoggingManager()
    manager.start()

    logger = logging.getLogger(__name__)
    logger.info("Application started")

    try:
        # ... application work ...
        while True:
            logger.info("Processing...")
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        manager.stop()

9.2. Kubernetes terminationGracePeriodSeconds

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  template:
    spec:
      # Time allowed for graceful shutdown
      terminationGracePeriodSeconds: 30

      containers:
      - name: app
        image: myapp:latest

        # preStop hook for extra time
        lifecycle:
          preStop:
            exec:
              # Sleep before SIGTERM is delivered
              command: ["/bin/sh", "-c", "sleep 5"]

        # Environment for logging configuration
        env:
        - name: LOG_FLUSH_TIMEOUT
          value: "25"  # Must fit in terminationGracePeriodSeconds minus the preStop sleep (30 - 5)

Timeline graceful shutdown:

┌─────────────────────────────────────────────────────────────┐
│              Kubernetes Shutdown Timeline                    │
└─────────────────────────────────────────────────────────────┘

T=0s     Pod marked for termination
         │
         ▼
T=0s     preStop hook (sleep 5)
         │
         ▼
T=5s     SIGTERM sent to the application
         │
         ├─ The application stops accepting new requests
         ├─ Waits for in-flight requests to finish
         └─ Log flush (25 seconds maximum)
         │
         ▼
T=30s    SIGKILL if the process is still alive
         │
         ▼
         Pod removed
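
The arithmetic behind the timeline can be sketched as a small helper. `flush_budget_seconds` and its `margin` parameter are hypothetical names, not part of Kubernetes; the only fact relied on is that the preStop hook runs inside `terminationGracePeriodSeconds`, so its sleep is subtracted from the time the application has after SIGTERM.

```python
def flush_budget_seconds(grace_period: int, pre_stop_sleep: int, margin: int = 0) -> int:
    """Seconds left for draining requests and flushing logs after SIGTERM."""
    budget = grace_period - pre_stop_sleep - margin
    if budget <= 0:
        raise ValueError("preStop hook and margin use up the whole grace period")
    return budget


# Values from the deployment above: 30s grace period, 5s preStop sleep
print(flush_budget_seconds(30, 5))
# The same with a 3-second safety margin before SIGKILL
print(flush_budget_seconds(30, 5, margin=3))
```

With the values from the manifest the budget is exactly 25 seconds, which leaves no slack before SIGKILL; a small margin keeps the last flush from being cut off.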

9.3. Async graceful shutdown

import asyncio
import signal
import logging
from contextlib import asynccontextmanager

class AsyncGracefulShutdown:
    """Graceful shutdown for asyncio applications."""

    def __init__(self):
        self.shutdown_event = asyncio.Event()
        self.logger = logging.getLogger(__name__)

    def setup_signal_handlers(self):
        loop = asyncio.get_running_loop()

        for sig in (signal.SIGTERM, signal.SIGINT):
            loop.add_signal_handler(
                sig,
                lambda s=sig: asyncio.create_task(self.shutdown(s))
            )

    async def shutdown(self, signum):
        signame = signal.Signals(signum).name
        self.logger.info(f"Received {signame}, shutting down...")

        # Wait for in-flight tasks to finish
        self.logger.info("Waiting for pending tasks...")
        await asyncio.sleep(1)

        # Flush logs
        self.logger.info("Flushing logs...")
        for handler in logging.getLogger().handlers:
            if hasattr(handler, 'flush'):
                handler.flush()
            if hasattr(handler, 'close'):
                handler.close()

        self.shutdown_event.set()

@asynccontextmanager
async def lifespan(app):
    shutdown = AsyncGracefulShutdown()
    shutdown.setup_signal_handlers()

    # Configure logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s %(levelname)s %(message)s'
    )

    yield

    # Cleanup on shutdown
    logging.shutdown()

# FastAPI example
from fastapi import FastAPI

app = FastAPI(lifespan=lifespan)

@app.get("/")
async def root():
    return {"message": "Hello"}

10. Security and compliance

10.1. Masking sensitive data

import logging
import re
from pythonjsonlogger import jsonlogger

class SecureFormatter(jsonlogger.JsonFormatter):
    """Formatter that masks sensitive data."""

    # Masking patterns; group 1 keeps the key and separator so the JSON stays valid
    SENSITIVE_PATTERNS = [
        (re.compile(r'(password["\']?\s*[:=]\s*["\']?)[^"\',\s]+', re.I), r'\1***'),
        (re.compile(r'(api[_-]?key["\']?\s*[:=]\s*["\']?)[^"\',\s]+', re.I), r'\1***'),
        (re.compile(r'(token["\']?\s*[:=]\s*["\']?)[^"\',\s]+', re.I), r'\1***'),
        (re.compile(r'\b\d{16}\b'), '****-****-****-****'),  # Credit card
        (re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'), '***@***'),  # Email
    ]

    def format(self, record):
        # Format the record
        message = super().format(record)

        # Mask sensitive values
        for pattern, replacement in self.SENSITIVE_PATTERNS:
            message = pattern.sub(replacement, message)

        return message

# Usage
logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(SecureFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)  # the root logger defaults to WARNING

# These logs will be masked
logger.info("User login", extra={"password": "secret123"})
logger.info("API call", extra={"api_key": "sk-123456"})
logger.info("User email", extra={"email": "user@example.com"})
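
The formatter above masks by regular expression over the already formatted string. When the fields are still available as a dict, masking by key name is an alternative that does not depend on how the value is quoted. A minimal sketch; `mask_fields` and the `SENSITIVE_KEYS` list are assumptions for illustration, not part of python-json-logger.

```python
# Assumed list of key names to mask; extend it for your own schema
SENSITIVE_KEYS = {"password", "api_key", "token"}


def mask_fields(value):
    """Return a copy of value with sensitive keys replaced by '***'."""
    if isinstance(value, dict):
        return {key: "***" if str(key).lower() in SENSITIVE_KEYS else mask_fields(item)
                for key, item in value.items()}
    if isinstance(value, list):
        return [mask_fields(item) for item in value]
    return value


event = {"user": "alice", "password": "secret123",
         "request": {"headers": [{"token": "t-1"}], "path": "/login"}}
print(mask_fields(event))
```

The function builds a new structure instead of editing in place, so the caller's data stays intact for other uses.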

10.2. Audit logging

import logging
from datetime import datetime, timezone
from pythonjsonlogger import jsonlogger

# Separate logger for audit logs
audit_logger = logging.getLogger('audit')
audit_handler = logging.FileHandler('/var/log/audit/audit.log')
audit_handler.setFormatter(jsonlogger.JsonFormatter())
audit_logger.addHandler(audit_handler)
audit_logger.setLevel(logging.INFO)

def log_audit_event(event_type, user_id, resource, action, details=None):
    """Log an audit event."""
    # Pass fields via extra so the formatter emits one JSON object
    # instead of a JSON string nested inside "message"
    audit_logger.info('audit_event', extra={
        'event_type': event_type,
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'user_id': user_id,
        'resource': resource,
        'action': action,
        'details': details or {},
        'compliance': {
            'sox': True,
            'gdpr': True,
            'hipaa': False,
        }
    })

# Usage
log_audit_event(
    event_type='USER_LOGIN',
    user_id=123,
    resource='auth_system',
    action='login',
    details={'ip': '192.168.1.1', 'success': True}
)

log_audit_event(
    event_type='DATA_ACCESS',
    user_id=123,
    resource='users_table',
    action='SELECT',
    details={'rows_affected': 10}
)

10.3. GDPR compliance

import logging
from typing import Set

class GDPRFormatter(logging.Formatter):
    """Formatter with GDPR support."""

    def __init__(self, include_pii: bool = False):
        super().__init__()
        self.include_pii = include_pii
        self.pii_fields: Set[str] = {
            'email', 'phone', 'ssn', 'passport',
            'address', 'birth_date', 'full_name'
        }

    def format(self, record):
        if not self.include_pii:
            # Remove PII fields from a dict message (copy, so the caller's dict is untouched)
            if isinstance(record.msg, dict):
                record.msg = {k: v for k, v in record.msg.items()
                              if k not in self.pii_fields}

            # Fields passed via extra= become attributes of the record
            for field in self.pii_fields:
                record.__dict__.pop(field, None)

        return super().format(record)

# Usage in production (PII disabled)
handler = logging.StreamHandler()
handler.setFormatter(GDPRFormatter(include_pii=False))

# In debug mode PII can be enabled
# handler.setFormatter(GDPRFormatter(include_pii=True))

11. Cost optimization

11.1. Log sampling

import logging
import random
from pythonjsonlogger import jsonlogger

class SamplingFilter(logging.Filter):
    """Filter that samples DEBUG records to reduce cost."""

    def __init__(self, sample_rate: float = 1.0):
        super().__init__()
        self.sample_rate = sample_rate

    def filter(self, record):
        # Sample DEBUG records only; a filter can drop a record,
        # while a formatter must always return a string
        if record.levelno == logging.DEBUG:
            return random.random() < self.sample_rate

        return True

# Usage
handler = logging.StreamHandler()
handler.setFormatter(jsonlogger.JsonFormatter())

# Keep about 10% of DEBUG logs
handler.addFilter(SamplingFilter(sample_rate=0.1))

logger = logging.getLogger()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
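
The sampling above is random per record, so a single request may keep some of its DEBUG lines and lose others. A common variation decides per trace instead, so a trace is either kept whole or dropped whole. A minimal sketch of that decision, assuming a `trace_id` string is available; `keep_trace` is a hypothetical helper, and hashing with SHA-256 is one possible choice among many.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide deterministically whether to keep DEBUG logs for a trace."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1)
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# The same trace ID always gets the same decision
print(keep_trace("abc123", 0.1) == keep_trace("abc123", 0.1))

# Over many trace IDs, roughly sample_rate of them are kept
kept = sum(keep_trace(str(i), 0.1) for i in range(10_000))
print(kept)
```

Because the decision depends only on the ID, every service that sees the same `trace_id` reaches the same verdict without coordination.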

11.2. Log level by environment

import logging
import os
from pythonjsonlogger import jsonlogger

def setup_logging_by_environment():
    """Set the log level according to the environment."""
    env = os.getenv('ENVIRONMENT', 'development')

    logger = logging.getLogger()
    logger.handlers = []
    handler = logging.StreamHandler()

    if env == 'development':
        logger.setLevel(logging.DEBUG)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s %(levelname)s %(message)s'
        ))
    elif env == 'staging':
        logger.setLevel(logging.INFO)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s %(levelname)s %(message)s'
        ))
    elif env == 'production':
        logger.setLevel(logging.WARNING)
        handler.setFormatter(jsonlogger.JsonFormatter())
    else:
        logger.setLevel(logging.INFO)

    logger.addHandler(handler)
    return logger

11.3. Retention policy

# loki-retention.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
data:
  loki.yaml: |
    limits_config:
      # Log retention
      retention_period: 168h  # 7 days

      # Per tenant
      per_tenant_override_config: /etc/loki/overrides.yaml

# overrides.yaml
overrides:
  tenant-a:
    retention_period: 720h  # 30 days for compliance
  tenant-b:
    retention_period: 72h   # 3 days for regular tenants

11.4. Storage cost

┌─────────────────────────────────────────────────────────────┐
│              Storage Cost Comparison (100GB/day)             │
└─────────────────────────────────────────────────────────────┘

Elasticsearch (SSD):
  • 100GB/day × 7 days = 700GB
  • SSD: $0.10/GB/month
  • Total: 700 × $0.10 = $70/month

Loki + S3:
  • 100GB/day × 7 days = 700GB
  • S3: $0.023/GB/month
  • Total: 700 × $0.023 = $16/month

CloudWatch:
  • 100GB/day × 30 days = 3TB
  • Ingestion: $0.50/GB = $1500/month
  • Storage: $0.03/GB = $90/month
  • Total: $1590/month

Savings with Loki: ~99% vs CloudWatch on these figures (~96% with the same 30-day retention: 3000 × $0.023 = $69/month)
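
The totals above can be reproduced with a few lines of arithmetic, which also makes it easy to plug in other volumes or retention periods. A minimal sketch using only the prices quoted in the comparison; `monthly_storage_cost` is a hypothetical helper, and compute costs for running Elasticsearch or Loki are not modelled.

```python
def monthly_storage_cost(gb_per_day: float, retention_days: int,
                         price_per_gb_month: float) -> float:
    """Steady-state monthly cost of keeping retention_days of logs."""
    return gb_per_day * retention_days * price_per_gb_month


elasticsearch = monthly_storage_cost(100, 7, 0.10)
loki_s3 = monthly_storage_cost(100, 7, 0.023)
# CloudWatch adds a per-GB ingestion charge on top of storage
cloudwatch = 100 * 30 * 0.50 + monthly_storage_cost(100, 30, 0.03)

print(round(elasticsearch, 2), round(loki_s3, 2), round(cloudwatch, 2))

# Like-for-like variant: Loki with the same 30-day retention as CloudWatch
loki_s3_30d = monthly_storage_cost(100, 30, 0.023)
print(round(loki_s3_30d, 2))
```

The ratio between the options depends heavily on whether retention is equal, so it is worth comparing both the 7-day and the 30-day Loki figure against CloudWatch.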

12. Troubleshooting

12.1. Logs are not being collected

Problem: Fluent Bit does not collect logs.

Checks:

# Check the Fluent Bit pods
kubectl get pods -n logging -l app=fluent-bit

# Check the Fluent Bit logs
kubectl logs -n logging -l app=fluent-bit

# Check the configuration
kubectl get configmap fluent-bit-config -n logging -o yaml

# Check access permissions
kubectl auth can-i get pods --as=system:serviceaccount:logging:fluent-bit

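12.2. Logs are duplicated

Problem: Logs appear several times.

Solution: Disable propagate on the logger that has its own handler, so its records are not also handled by the root logger's handlers.

logger = logging.getLogger('myapp')
logger.propagate = False  # Do not pass records to the root logger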
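
The duplication comes from the logger hierarchy: a record is handled by the logger's own handlers and then by the handlers of every ancestor. A minimal sketch that reproduces the effect; `ListHandler` and the `dupdemo` logger names are hypothetical and exist only to count how often a record is handled.

```python
import logging


class ListHandler(logging.Handler):
    """Collect messages in a list (for demonstration only)."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


sink = ListHandler()
parent = logging.getLogger("dupdemo")
child = logging.getLogger("dupdemo.api")
parent.addHandler(sink)
child.addHandler(sink)
child.setLevel(logging.INFO)

child.info("hello")        # handled by child, then again by parent
count_with_propagation = len(sink.messages)

child.propagate = False
child.info("hello again")  # handled by child only
count_without_propagation = len(sink.messages) - count_with_propagation

print(count_with_propagation, count_without_propagation)
```

An alternative to `propagate = False` is to attach handlers to the root logger only and leave child loggers without handlers of their own.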

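12.3. High log latency

Problem: Logs appear with a delay.

Causes:

Large Flush interval
Queues are full
Network latency

Solution:

# Fluent Bit settings (comments go on their own lines in the classic format)
[SERVICE]
    # Reduce from 5 to 1
    Flush         1

[INPUT]
    # Assumes the tail input; Mem_Buf_Limit is set per input, not under [SERVICE]
    Name          tail
    # Increase the buffer
    Mem_Buf_Limit 50MB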

12.4. Loki does not accept logs

Checks:

# Check Loki status
curl http://loki.logging.svc.cluster.local:3100/ready

# Check the Loki logs
kubectl logs -n logging -l app=loki

# Check the configuration
kubectl get configmap loki-config -n logging -o yaml

13. Best Practices

13.1. Production checklist

┌─────────────────────────────────────────────────────────────┐
│              Production Logging Checklist                    │
└─────────────────────────────────────────────────────────────┘

[ ] Logs go to stdout/stderr (not to files)
[ ] JSON format for all logs
[ ] trace_id added for distributed tracing
[ ] request_id added for request tracking
[ ] Kubernetes metadata via the Downward API
[ ] Graceful shutdown with log flush
[ ] Sampling for DEBUG logs
[ ] PII masking
[ ] Retention policy configured
[ ] Error alerts configured
[ ] Dashboard created in Grafana/Kibana

13.2. Do's and Don'ts

Do ✅	Don't ❌
Logs to stdout	Files inside the container
JSON format	Plain-text format
Context in every log	Logs without context
Sampling for DEBUG	100% DEBUG in production
Graceful shutdown	No SIGTERM handling
PII masking	Passwords in logs
Centralized collection	Local files

Summary

Cloud-native principles

Principle	Description
Log to stdout	stdout/stderr, not files
Structured logs	JSON for machine readability
Centralized	Collected into a single store
Context	request_id, trace_id, user_id
Graceful shutdown	Flush buffers on termination
Security	PII masking

Architectures

Approach	When to use
DaemonSet	Standard approach, efficient
Sidecar	Custom per-pod processing
Direct	CloudWatch, GCP Logging

Tools

Tool	Purpose
Fluent Bit	Lightweight log collector
Fluentd	Full-featured log processor
Elasticsearch	Storage & search
Loki	Cheap storage for logs
Kibana/Grafana	Visualization

Cost optimization

Strategy	Savings
Sampling DEBUG	50-90%
Loki vs ES	70-80%
Retention policy	Proportional to retention time
Level by env	30-50%
