5. Module Controller V2 Operation and Maintenance

Operation and Maintenance of Modules under the Koupleless Module Controller V2 Architecture

5.1 Module Release

Koupleless Module Online and Offline Procedures

Note: The current ModuleController v2 has only been tested on Kubernetes (K8S) version 1.24, with no testing on other versions. ModuleController V2 relies on certain Kubernetes (K8S) features; thus, the K8S version must not be lower than V1.10.

Module Release

ModuleController V2 supports deploying modules with any Pod deployment method, including but not limited to bare Pods, Deployments, DaemonSets, and StatefulSets. The release process is demonstrated below using a Deployment; for other methods, reuse the Pod template configuration from this Deployment:

kubectl apply -f samples/module-deployment.yaml --namespace yournamespace

The complete content is as follows:

apiVersion: apps/v1  # Specifies the API version, which must be listed in `kubectl api-versions`
kind: Deployment  # Specifies the role/type of resource to create
metadata:  # Metadata/attributes of the resource
  name: test-module-deployment  # Name of the resource, must be unique within the same namespace
  namespace: default # Namespace where it will be deployed
spec:  # Specification field of the resource
  replicas: 1
  revisionHistoryLimit: 3 # Retains historical versions
  selector: # Selector
    matchLabels: # Matching labels
      app: test-module-deployment
  strategy: # Strategy
    rollingUpdate: # Rolling update
      maxSurge: 30% # Maximum additional replicas that can exist, can be a percentage or an integer
      maxUnavailable: 30% # Maximum number of Pods that can become unavailable during the update, can be a percentage or an integer
    type: RollingUpdate # Rolling update strategy
  template: # Template
    metadata: # Metadata/attributes of the resource
      labels: # Sets resource labels
        module-controller.koupleless.io/component: module # Required, declares Pod type for management by module controller
        # Unique ID for Deployment, must match spec.selector.matchLabels
        app: test-module-deployment
    spec: # Specification field of the resource
      containers:
        - name: biz1 # Required, declares the module's bizName, must match the artifactId declared in pom.xml
          image: https://serverless-opensource.oss-cn-shanghai.aliyuncs.com/module-packages/stable/biz1-web-single-host-0.0.1-SNAPSHOT-ark-biz.jar
          env:
            - name: BIZ_VERSION # Required, declares module's biz_version, value must match the version declared in pom.xml
              value: 0.0.1-SNAPSHOT
      affinity:
        nodeAffinity: # Required, declares the base selector to ensure modules are scheduled onto designated bases
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: base.koupleless.io/version
                    operator: In
                    values:
                      - 1.1.1 # Specified base version, mandatory, at least one required
                  - key: base.koupleless.io/name
                    operator: In
                    values:
                      - base  # Specified base bizName, mandatory, at least one required
      tolerations: # Required, allows pods to be scheduled onto base nodes
        - key: "schedule.koupleless.io/virtual-node"
          operator: "Equal"
          value: "True"
          effect: "NoExecute"

Apart from the required fields marked above, the configuration is the same as for a regular Deployment; additional Deployment settings can be added for custom functionality.

Subsequent module updates can be achieved by updating the module Deployment’s Container Image and BIZ_VERSION, utilizing the Deployment’s RollingUpdate for phased updates.

The Module Controller ensures lossless module traffic during rolling updates by controlling the Pod status update sequence on the same base. The process is as follows:

  1. After updating the Deployment, new version module Pods are created based on the update strategy.
  2. The K8S Scheduler schedules these Pods to the VNode, where the old module version is still installed.
  3. The Module Controller detects the successful scheduling of Pods and initiates the installation of the new module version.
  4. Once the installation is complete, the Module Controller checks the status of all modules on the current base, sorts the associated Pods by creation time, and updates their statuses in sequence. This causes the Pods corresponding to the old module version to become Not Ready first, followed by the new version Pods becoming Ready.
  5. The Deployment controller, upon detecting that the new Pods are Ready, begins cleaning up old version Pods. It prioritizes deleting Pods that are Not Ready. At this point, the old version Pods on the same base are already Not Ready and are deleted, preventing Ready state old version Pods on other bases from being deleted first.

At no point in this process is a base left without the module installed, which keeps traffic lossless during the module update.
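The ordering in step 4 can be illustrated with a minimal sketch. It is not the Module Controller's actual code: the `ModulePod` type, its field names, and the `status_update_sequence` function are hypothetical, and the sketch only assumes what the steps above describe, namely that Pods on one base are sorted by creation time and updated in that order.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ModulePod:
    name: str          # hypothetical Pod name
    biz_version: str   # module version carried by the Pod
    created_at: int    # creation timestamp; older Pods have smaller values


def status_update_sequence(pods: List[ModulePod], new_version: str) -> List[Tuple[str, str]]:
    """Return (pod name, status) updates in the order they would be applied."""
    updates = []
    # Oldest Pods first, so old-version Pods are marked NotReady
    # before the new-version Pod is marked Ready.
    for pod in sorted(pods, key=lambda p: p.created_at):
        status = "Ready" if pod.biz_version == new_version else "NotReady"
        updates.append((pod.name, status))
    return updates


pods_on_base = [
    ModulePod("biz1-new", "0.0.2", created_at=200),
    ModulePod("biz1-old", "0.0.1", created_at=100),
]
print(status_update_sequence(pods_on_base, new_version="0.0.2"))
```

With the two Pods above, the old Pod is reported NotReady first and the new Pod Ready second, which is the order the Deployment controller relies on when it chooses which old Pod to delete.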

Checking Module Status

To check module status, examine the Pods whose nodeName corresponds to the base’s node. This requires understanding the mapping between base services and nodes.

In the design of Module Controller V2, each base generates a globally unique UUID at startup as the identifier for the base service. The corresponding node’s Name includes this ID.

Additionally, the IP of the base service corresponds one-to-one with the node’s IP, allowing selection of the corresponding base Node via IP.

Therefore, you can use the following command to view all Pods (modules) installed on a specific base and their statuses:

kubectl get pod -n <namespace> --field-selector status.podIP=<baseIP>

Or

kubectl get pod -n <namespace> --field-selector spec.nodeName=virtual-node-<baseUUID>
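For scripting, the two lookups can be wrapped in a small helper. This is a sketch only: the function name `module_status_command` is hypothetical, and it simply assembles the same kubectl arguments shown above, following the `virtual-node-<baseUUID>` naming from the second command.

```python
import shlex
from typing import List, Optional


def module_status_command(namespace: str,
                          base_ip: Optional[str] = None,
                          base_uuid: Optional[str] = None) -> List[str]:
    """Build the kubectl argument list for listing module Pods on one base."""
    if (base_ip is None) == (base_uuid is None):
        raise ValueError("provide exactly one of base_ip or base_uuid")
    if base_ip is not None:
        selector = f"status.podIP={base_ip}"
    else:
        selector = f"spec.nodeName=virtual-node-{base_uuid}"
    return ["kubectl", "get", "pod", "-n", namespace, "--field-selector", selector]


# Print the command line; pass the list to subprocess.run to execute it.
print(shlex.join(module_status_command("default", base_uuid="example-uuid")))
```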

Module Offline

Removing the module’s Pod or other controlling resources in the K8S cluster completes the module offline process. For instance, in a Deployment scenario, you can directly delete the corresponding Deployment to offline the module:

kubectl delete deployment yourmoduledeployment --namespace yournamespace

Replace yourmoduledeployment with the name of your module Deployment and yournamespace with your namespace.

For customizing module release and operation strategies (such as grouping, Beta testing, pausing, etc.), refer to Module Operation and Scheduling Strategies.

The example above uses kubectl; deleting the Deployment by calling the K8S API Server directly also takes the module group offline.

Module Scaling

Since ModuleController V2 fully leverages K8S’s Pod orchestration scheme, scaling only occurs on ReplicaSets, Deployments, StatefulSets, etc. Scaling can be implemented according to the respective scaling methods; below, we use Deployment as an example:

kubectl scale deployments/yourdeploymentname --namespace=yournamespace --replicas=3

Replace yourdeploymentname with your Deployment name, yournamespace with your namespace, and set the replicas parameter to the desired scaled quantity.

Scaling strategies can also be implemented through API calls.
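As a sketch of the API route, the request below targets the standard scale subresource of a Deployment in the `apps/v1` API. The helper name `build_scale_request` and the returned dictionary layout are hypothetical; sending the request (authentication, API server address) is left to whichever HTTP client is in use.

```python
import json
from typing import Dict


def build_scale_request(namespace: str, deployment: str, replicas: int) -> Dict[str, object]:
    """Describe the PATCH request that sets a Deployment's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return {
        "method": "PATCH",
        # Scale subresource of a Deployment in the apps/v1 API group.
        "path": f"/apis/apps/v1/namespaces/{namespace}/deployments/{deployment}/scale",
        "headers": {"Content-Type": "application/merge-patch+json"},
        "body": json.dumps({"spec": {"replicas": replicas}}),
    }


request = build_scale_request("yournamespace", "yourdeploymentname", 3)
print(request["method"], request["path"])
print(request["body"])
```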

Module Replacement

In ModuleController v2, modules are tightly bound to Containers. To replace a module, update the image address of the module’s Container in the Pod where the module resides.

The specific replacement method varies slightly depending on the module deployment method; for instance, directly updating Pod information replaces the module in-place, while Deployment executes the configured update strategy (e.g., rolling update, creating new version Pods before deleting old ones). DaemonSet also executes the configured update strategy but with a different logic – deleting before creating, which might cause traffic loss.

Module Rollback

Because modules are deployed with native Deployments, rollback can be performed with the Deployment’s own rollback mechanism.

To view deployment history:

kubectl rollout history deployment yourdeploymentname

To roll back to a specific revision:

kubectl rollout undo deployment yourdeploymentname --to-revision=<TARGET_REVISION>

Other Operational Issues

Module Traffic Service Implementation

A native Service can be created for the module, which can provide services only when the base and ModuleController are deployed within the same VPC.

Because bases and ModuleController may currently be deployed in different VPCs, they interact through MQTT message queues. A base node takes the IP of the Pod where the base resides, and a module Pod takes the IP of its base node. Therefore, when the base and ModuleController are not in the same VPC, the module’s IP is effectively invalid, and the module cannot serve external traffic.

A potential solution involves forwarding at the Load Balancer (LB) layer of the Service, redirecting the Service’s traffic to the base service on the corresponding IP of the K8S where the base resides. Further evaluation and optimization of this issue will be based on actual usage scenarios.

Incompatible Base and Module Release

  1. Deploy a module’s Deployment first, specifying the latest version of the module code package address in Container and the name and version information of the new version base in nodeAffinity. This Deployment will create corresponding Pods, but they won’t be scheduled until new version bases are created.
  2. Update the base Deployment to release the new version image, triggering the replacement and restart of the base. Upon startup, the base informs the ModuleController V2 controller, creating a corresponding version node.
  3. After the creation of the corresponding version base node, the K8S scheduler automatically triggers scheduling, deploying the Pods created in step 1 onto the base node for installation of the new version module, thus achieving simultaneous release.


5.2 Module Release Operations Strategy

Koupleless Module Release Operations Strategy

Operations Strategy

To achieve zero-downtime changes in the production environment, the module release operations leverage Kubernetes (K8S) native scheduling capabilities to provide secure and reliable update functionality. Users can deploy module Pods according to business requirements.

Scheduling Strategy

Dispersion Scheduling: Achieved through native Deployment controls, with Pod anti-affinity (podAntiAffinity) configuration enforcing dispersion.

Peer and Non-Peer Deployment

Peer and non-peer deployment strategies can be realized by selecting different deployment methods.

Peer Deployment

Two implementation methods are provided:

  1. Using DaemonSet: Modules can be deployed as DaemonSets, where a DaemonSet controller automatically creates a module Pod for each base node upon its addition, ensuring peer deployment.

    Note that DaemonSet rolling updates occur by uninstalling before reinstalling; choose based on actual business needs.

  2. Via Deployment: Unlike DaemonSet, an additional component is required to maintain module replica count equivalent to the number of base nodes (under development, expected in the next release). Supports install-before-uninstall, avoiding backend traffic loss in a microservices architecture.

    While Deployments strive for dispersion, they do not guarantee complete dispersion; modules might be deployed multiple times to the same base. For strong dispersion, add Pod anti-affinity settings in the Deployment, as shown below:

apiVersion: apps/v1
kind: Deployment
metadata:
    name: test-module-deployment
    namespace: default
    labels:
        module-controller.koupleless.io/component: module-deployment
spec:
    replicas: 1
    revisionHistoryLimit: 3
    selector:
        matchLabels:
            module.koupleless.io/name: biz1
            module.koupleless.io/version: 0.0.1
    strategy:
        rollingUpdate:
            maxSurge: 30%
            maxUnavailable: 30%
        type: RollingUpdate
    template:
        metadata:
            labels:
                module-controller.koupleless.io/component: module
                module.koupleless.io/name: biz1
                module.koupleless.io/version: 0.0.1
        spec:
            containers:
            - name: biz1
              image: https://serverless-opensource.oss-cn-shanghai.aliyuncs.com/module-packages/test_modules/biz1-0.0.1-ark-biz.jar
              env:
              - name: BIZ_VERSION
                value: 0.0.1
            affinity:
                nodeAffinity:
                    requiredDuringSchedulingIgnoredDuringExecution:
                        nodeSelectorTerms:
                        - matchExpressions:
                          - key: base.koupleless.io/version
                            operator: In
                            values: ["1.0.0"] # If modules can only be scheduled to specific node versions, this field is mandatory.
                          - key: base.koupleless.io/name
                            operator: In
                            values: ["base"]
                podAntiAffinity: # Core configuration for dispersion scheduling
                    requiredDuringSchedulingIgnoredDuringExecution:
                    - labelSelector:
                        matchLabels:
                            module.koupleless.io/name: biz1
                            module.koupleless.io/version: 0.0.1
                      topologyKey: topology.kubernetes.io/zone
            tolerations:
            - key: "schedule.koupleless.io/virtual-node"
              operator: "Equal"
              value: "True"
              effect: "NoExecute"

Non-Peer Deployment

Achieved by deploying modules as Deployments or ReplicaSets, with deployments based on the replica count setting.

Batch Updates

The strategy for batch updates requires custom control logic. ModuleController V2 introduces a capability where, when different versions of the same-named module are installed sequentially on a base, the Pod of the earlier-installed module enters BizDeactivate status and transitions to the Failed phase. Exploit this logic to implement batch update strategies.



5.3 Health Check

Background

The purpose of health checks is to obtain the status of an application throughout its lifecycle, including the operational and runtime phases, so that users can make decisions based on this status. For instance, if the application status is DOWN, it indicates a malfunction in the application, and the user may choose to restart or replace the machine.

In the case of a single application, health checks are relatively simple:

  • Operational phase status:
    • If it’s starting up, the status is UNKNOWN;
    • If startup fails, the status is DOWN;
    • If startup is successful, the status is UP.
  • Runtime phase status:
    • If all health checkpoints of the application are healthy, the status is UP;
    • If any health checkpoint of the application is not healthy, the status is DOWN.

In multi-application scenarios, the situation is more complex. The status of each application during both the operational phase and the runtime phase can affect the overall application health. When designing health checks, two questions need to be considered:

  • During the module operational phase, should the module start-up status affect the overall application health status?

    In different operational scenarios, users have different expectations. Koupleless module operations cover three scenarios:

    | Scenario | Impact of the Module on the Overall Application Health Status |
    | --- | --- |
    | Module Hot-Deployment | A configuration option lets users decide whether the hot-deployment result affects the overall application health status (by default, it does not affect the original health status of the application) |
    | Static Merge Deployment | Module deployment occurs during base startup, so the module startup status directly affects the overall health status of the application |
    | Module Replay | Module replay occurs during base startup, so the module startup status directly affects the overall health status of the application |
  • During the module runtime phase, should the module running status affect the overall application health status?

    The module runtime phase status should have a direct impact on the overall application health status.

Under this context, we have designed a health check approach for multi-application scenarios.
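The two questions above can be condensed into one aggregation rule. The following is a minimal sketch rather than the Koupleless implementation: `ModuleHealth`, `overall_status`, and the `module_startup_counts` parameter are hypothetical names, and giving DOWN precedence over UNKNOWN is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModuleHealth:
    startup: str                            # "UP", "UNKNOWN" (starting) or "DOWN" (startup failed)
    runtime_healthy: Optional[bool] = None  # None while the module is not running


def overall_status(base_startup: str, base_runtime_healthy: bool,
                   modules: List[ModuleHealth],
                   module_startup_counts: bool = False) -> str:
    """Aggregate base and module health into UP / UNKNOWN / DOWN."""
    # Base failures and runtime failures of any module always count.
    if base_startup == "DOWN" or not base_runtime_healthy:
        return "DOWN"
    if any(m.runtime_healthy is False for m in modules):
        return "DOWN"
    # Module startup results count only when requested (assumed switch).
    if module_startup_counts and any(m.startup == "DOWN" for m in modules):
        return "DOWN"
    if base_startup == "UNKNOWN":
        return "UNKNOWN"
    if module_startup_counts and any(m.startup == "UNKNOWN" for m in modules):
        return "UNKNOWN"
    return "UP"


failed = [ModuleHealth(startup="DOWN")]
print(overall_status("UP", True, failed))                              # startup result ignored
print(overall_status("UP", True, failed, module_startup_counts=True))  # startup result counted
```

For static merge deployment and module replay, a caller would pass `module_startup_counts=True`, since the module startup status is meant to affect the overall status directly in those scenarios.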

Usage

Requirements

  • Koupleless version >= 1.1.0
  • sofa-ark version >= 2.2.9

Obtain the overall health status of the application

There are 3 types of health status for the base:

| Status | Meaning |
| --- | --- |
| UP | Healthy, indicating readiness |
| UNKNOWN | Currently starting up |
| DOWN | Unhealthy (may be due to startup failure or unhealthy running state) |

Since Koupleless supports hot deployment of modules, users may or may not want the module deployment result to affect the overall application health status. Both behaviors are described below.

Module launch result does not affect the overall application health status (default)

  • Features: For a healthy base, if the module installation fails, it will not affect the overall application health status.
  • Usage: Same as the health check configuration for regular Spring Boot, configure in the base’s application.properties:
# or do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
# If you need to display all information, configure the following content
management.endpoint.health.show-components=always
management.endpoint.health.show-details=always
  • Access: {baseIp:port}/actuator/health
  • Result:
{
    // Overall health status of the application
    "status": "UP",
    "components": {
        // Aggregated health status of the modules
        "arkBizAggregate": {
            "status": "UP",
            "details": {
                "biz1:0.0.1-SNAPSHOT": {
                    "status": "UP",
                    // Can see the health status of all active HealthIndicators in the modules
                    "details": {
                        "diskSpace": {
                          "status": "UP",
                          "details": {
                            "total": 494384795648,
                            "free": 272435396608,
                            "threshold": 10485760,
                            "exists": true
                            }
                        },
                        "pingHe": {
                          "status": "UP",
                          "details": {}
                        }
                    }
                }
            }
        },
        // Startup health status of base and modules
        "masterBizStartUp": {
            "status": "UP",
            // Including the startup status of each module.
            "details": {
                "base:1.0.0": {
                    "status": "UP"
                },
                "biz1:0.0.1-SNAPSHOT": {
                    "status": "UP"
                },
                "biz2:0.0.1-SNAPSHOT": {
                    "status": "DOWN"
                }
            }
        }
    }
}
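A monitoring script can read the same response to find modules whose startup failed even though the overall status stays UP. The sketch below is illustrative: `failed_module_startups` is a hypothetical helper, and the embedded JSON is a trimmed copy of the sample above with the explanatory comments removed, since JSON itself does not allow comments.

```python
import json
from typing import Any, Dict, List


def failed_module_startups(health: Dict[str, Any]) -> List[str]:
    """Return the 'name:version' keys whose startup status is DOWN."""
    startup = health.get("components", {}).get("masterBizStartUp", {})
    details = startup.get("details", {})
    return sorted(name for name, info in details.items() if info.get("status") == "DOWN")


# Trimmed copy of the sample response, without the explanatory comments.
sample = json.loads("""
{
  "status": "UP",
  "components": {
    "masterBizStartUp": {
      "status": "UP",
      "details": {
        "base:1.0.0": {"status": "UP"},
        "biz1:0.0.1-SNAPSHOT": {"status": "UP"},
        "biz2:0.0.1-SNAPSHOT": {"status": "DOWN"}
      }
    }
  }
}
""")

print(sample["status"], failed_module_startups(sample))
```

For this sample the overall status is UP while biz2:0.0.1-SNAPSHOT is reported as a failed startup, which matches the default behavior described in this section.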

Overall Health Status in Different Scenarios

Scenario 1: Start base

| Status | Meaning |
| --- | --- |
| UP | Base is healthy |
| UNKNOWN | Base is starting up |
| DOWN | Base is unhealthy |

Scenario 2: Start base and install modules with static merge deployment

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | Base or module is starting up |
| DOWN | Base startup failed / base is unhealthy / module startup failed / module is unhealthy |

Scenario 3: After base starts, install modules with hot deployment

A configuration option lets users decide whether the result of module hot deployment affects the overall health status of the application. By default, it does not affect the application’s original health status.

Default Configuration: In the scenario of hot deployment, whether or not a module is successfully installed does not affect the overall health status of the application, as follows:

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | Base is starting up |
| DOWN | Base startup failed / base is unhealthy / module is unhealthy |

Scenario 4: Base running

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | - |
| DOWN | Base is unhealthy or module is unhealthy |

Scenario 5: After base started, reinstall module

Module reinstallation refers to automatically pulling the module baseline and installing the modules after the base has started.

Module reinstallation is not supported at the moment.

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | Base or module is starting up |
| DOWN | Base is unhealthy or module startup failed or module is unhealthy |

Module launch result affects the overall application health status

  • Features: For a healthy base, if a module installation fails, the overall application health status will also fail.
  • Usage: In addition to the above configuration, you need to configure koupleless.healthcheck.base.readiness.withAllBizReadiness=true, that is, configure in the base’s application.properties:
# Alternatively, do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
# If you need to display all information, configure the following content
management.endpoint.health.show-components=always
management.endpoint.health.show-details=always
# Do not ignore module startup status
koupleless.healthcheck.base.readiness.withAllBizReadiness=true
  • Access: {baseIp:port}/actuator/health
  • Result:
{
    // Overall health status of the application
    "status": "UP",
    "components": {
        // Aggregated health status of the modules
        "arkBizAggregate": {
            "status": "UP",
            "details": {
                "biz1:0.0.1-SNAPSHOT": {
                    "status": "UP",
                    // Can see the health status of all active HealthIndicators in the modules
                    "details": {
                        "diskSpace": {
                          "status": "UP",
                          "details": {
                            "total": 494384795648,
                            "free": 272435396608,
                            "threshold": 10485760,
                            "exists": true
                            }
                        },
                        "pingHe": {
                          "status": "UP",
                          "details": {}
                        }
                    }
                }
            }
        },
        // Startup health status of base and modules
        "masterBizStartUp": {
            "status": "UP",
            // Including the startup status of each module.
            "details": {
                "base:1.0.0": {
                    "status": "UP"
                },
                "biz1:0.0.1-SNAPSHOT": {
                    "status": "UP"
                }
            }
        }
    }
}

Overall Health Status in Different Scenarios

Scenario 1: Start base

| Status | Meaning |
| --- | --- |
| UP | Base is healthy |
| UNKNOWN | Base is starting up |
| DOWN | Base is unhealthy |

Scenario 2: Start base and install modules with static merge deployment

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | Base or module is starting up |
| DOWN | Base startup failed / base is unhealthy / module startup failed / module is unhealthy |

Scenario 3: After base starts, install modules with hot deployment

A configuration option lets users decide whether the result of module hot deployment affects the overall health status of the application. By default, it does not affect the application’s original health status.

When configuring as koupleless.healthcheck.base.readiness.withAllBizReadiness=true:

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | Base or module is starting up |
| DOWN | Base startup failed / module startup failed / base is unhealthy / module is unhealthy |

Scenario 4: Base running

| Status | Meaning |
| --- | --- |
| UP | Base and module are healthy |
| UNKNOWN | - |
| DOWN | Base is unhealthy or module is unhealthy |

Scenario 5: After base started, reinstall module

Module reinstallation refers to automatically pulling the module baseline and installing the modules after the base has started.

Module reinstallation is not supported at the moment.

Obtaining the Health Status of a Single Module

  • Usage: Consistent with the regular Spring Boot health check configuration, enable the health endpoint by configuring the following in the module’s application.properties:
# or do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
  • Access: {baseIp:port}/{bizWebContextPath}/actuator/health
  • Result:
{
    "status": "UP",
    "components": {
        "diskSpace": {
            "status": "UP",
            "details": {
                "total": 494384795648,
                "free": 270828220416,
                "threshold": 10485760,
                "exists": true
            }
        },
        "ping": {
            "status": "UP"
        }
    }
}

Get information about base, modules, and plugins

  • Usage: Same as the regular Spring Boot configuration, enable the info endpoint by configuring the following in the base’s application.properties:
# Note: If the user configures management.endpoints.web.exposure.include on their own, they need to include the health endpoint, otherwise the health endpoint cannot be accessed
management.endpoints.web.exposure.include=health,info
  • Access: {baseIp:port}/actuator/info
  • Result:
{
    "arkBizInfo": [
      {
        "bizName": "biz1",
        "bizVersion": "0.0.1-SNAPSHOT",
        "bizState": "ACTIVATED",
        "webContextPath": "biz1"
      },
      {
        "bizName": "base",
        "bizVersion": "1.0.0",
        "bizState": "ACTIVATED",
        "webContextPath": "/"
      }
    ],
    "arkPluginInfo": [
        {
            "pluginName": "koupleless-adapter-log4j2",
            "groupId": "com.alipay.sofa.koupleless",
            "artifactId": "koupleless-adapter-log4j2",
            "pluginVersion": "1.0.1-SNAPSHOT",
            "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/koupleless/koupleless-adapter-log4j2/1.0.1-SNAPSHOT/koupleless-adapter-log4j2-1.0.1-SNAPSHOT.jar!/",
            "pluginActivator": "com.alipay.sofa.koupleless.adapter.Log4j2AdapterActivator"
        },
        {
            "pluginName": "web-ark-plugin",
            "groupId": "com.alipay.sofa",
            "artifactId": "web-ark-plugin",
            "pluginVersion": "2.2.9-SNAPSHOT",
            "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/web-ark-plugin/2.2.9-SNAPSHOT/web-ark-plugin-2.2.9-SNAPSHOT.jar!/",
            "pluginActivator": "com.alipay.sofa.ark.web.embed.WebPluginActivator"
        },
        {
            "pluginName": "koupleless-base-plugin",
            "groupId": "com.alipay.sofa.koupleless",
            "artifactId": "koupleless-base-plugin",
            "pluginVersion": "1.0.1-SNAPSHOT",
            "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/koupleless/koupleless-base-plugin/1.0.1-SNAPSHOT/koupleless-base-plugin-1.0.1-SNAPSHOT.jar!/",
            "pluginActivator": "com.alipay.sofa.koupleless.plugin.ServerlessRuntimeActivator"
        }
    ]
}
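The arkBizInfo list can be used to work out where each module's own health endpoint lives. This is a sketch under assumptions: `module_health_urls` is a hypothetical helper, it assumes every activated module exposes a health endpoint under its webContextPath, and the base URL in the usage line is a placeholder.

```python
from typing import Any, Dict


def module_health_urls(base_url: str, info: Dict[str, Any]) -> Dict[str, str]:
    """Map 'bizName:bizVersion' to the assumed health URL of each activated biz."""
    root = base_url.rstrip("/")
    urls = {}
    for biz in info.get("arkBizInfo", []):
        if biz.get("bizState") != "ACTIVATED":
            continue  # skip modules that are not serving traffic
        context = biz.get("webContextPath", "").strip("/")
        prefix = f"{root}/{context}" if context else root
        urls[f"{biz['bizName']}:{biz['bizVersion']}"] = f"{prefix}/actuator/health"
    return urls


# Same arkBizInfo entries as in the sample response above.
info = {
    "arkBizInfo": [
        {"bizName": "biz1", "bizVersion": "0.0.1-SNAPSHOT",
         "bizState": "ACTIVATED", "webContextPath": "biz1"},
        {"bizName": "base", "bizVersion": "1.0.0",
         "bizState": "ACTIVATED", "webContextPath": "/"},
    ]
}
for name, url in module_health_urls("http://127.0.0.1:8080", info).items():
    print(name, url)
```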

5.4 Deployment of Module Controller V2

Deployment methodology for Koupleless Module Controller V2

Note: ModuleController V2 has only been tested on K8S version 1.24 and relies on certain K8S features. Therefore, the K8S version should not be lower than V1.10.

Resource File Locations

  1. Role Definition
  2. RBAC Definition
  3. ServiceAccount Definition
  4. ModuleControllerV2 Deployment Definition

Deployment Method

Use the kubectl apply command to sequentially apply the above four resource files to complete the deployment of a single-instance ModuleController.

To use the Module Controller’s sharded cluster capability, convert the deployment definition above into a Deployment resource, placing the Pod spec content into the Deployment template.

To use load balancing in a sharded cluster, set the IS_CLUSTER parameter to true in the Module Controller ENV configuration.

Configurable Parameter Explanation

Environment Variable Configuration

Below are some configurable environment variables and their explanations:

  • ENABLE_MQTT_TUNNEL

    • Meaning: Flag to enable MQTT operations pipeline. Set to true to enable. If enabled, configure the related environment variables below.
  • MQTT_BROKER

    • Meaning: URL of the MQTT broker.
  • MQTT_PORT

    • Meaning: MQTT port number.
  • MQTT_USERNAME

    • Meaning: MQTT username.
  • MQTT_PASSWORD

    • Meaning: MQTT password.
  • MQTT_CLIENT_PREFIX

    • Meaning: MQTT client prefix.
  • ENABLE_HTTP_TUNNEL

    • Meaning: Flag to enable HTTP operations pipeline. Set to true to enable. Optionally configure the environment variables below.
  • HTTP_TUNNEL_LISTEN_PORT

    • Meaning: Module Controller HTTP operations pipeline listening port, default is 7777.
  • REPORT_HOOKS

    • Meaning: Error reporting links. Supports multiple links separated by ;. Currently only supports DingTalk robot webhooks.
  • ENV

    • Meaning: Module Controller environment, set as VNode label for operations environment isolation.
  • IS_CLUSTER

    • Meaning: Cluster flag. If true, Virtual Kubelet will start with cluster configuration.
  • WORKLOAD_MAX_LEVEL

    • Meaning: Cluster configuration indicating the maximum workload level for workload calculation in Virtual Kubelet. Default is 3. Refer to Module Controller architecture design for detailed calculation rules.
  • ENABLE_MODULE_DEPLOYMENT_CONTROLLER

    • Meaning: Flag to enable the Module Deployment Controller. If true, the deployment controller will start to modify Module deployment replicas and baselines.
  • VNODE_WORKER_NUM

    • Meaning: Number of concurrent processing threads for VNode Modules. Set to 1 for single-threaded.
  • CLIENT_ID

    • Meaning: Optional. Module Controller instance ID. It must be unique within one environment; by default, a random UUID is generated.
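Before applying a Module Controller manifest, the environment variables can be sanity-checked with a small script. The sketch below is hypothetical: `check_module_controller_env` is not part of Koupleless, and treating all five MQTT variables as required whenever ENABLE_MQTT_TUNNEL is true is an assumption, since the list above only says the related variables should be configured.

```python
from typing import Dict, List

# MQTT-related variables named in the list above.
MQTT_VARS = ["MQTT_BROKER", "MQTT_PORT", "MQTT_USERNAME",
             "MQTT_PASSWORD", "MQTT_CLIENT_PREFIX"]


def _enabled(env: Dict[str, str], flag: str) -> bool:
    return env.get(flag, "").strip().lower() == "true"


def check_module_controller_env(env: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means no problem found."""
    problems: List[str] = []
    if _enabled(env, "ENABLE_MQTT_TUNNEL"):
        # Assumption: every MQTT variable must be set when the tunnel is on.
        problems += [f"{name} is not set" for name in MQTT_VARS if not env.get(name)]
    if _enabled(env, "ENABLE_HTTP_TUNNEL"):
        # 7777 is the documented default listening port.
        port = env.get("HTTP_TUNNEL_LISTEN_PORT", "7777")
        if not port.isdecimal() or not 1 <= int(port) <= 65535:
            problems.append(f"HTTP_TUNNEL_LISTEN_PORT is not a valid port: {port}")
    return problems


print(check_module_controller_env({"ENABLE_MQTT_TUNNEL": "true",
                                   "MQTT_BROKER": "broker.example",
                                   "MQTT_PORT": "1883"}))
```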

Documentation Reference

For detailed structure and implementation, refer to the documentation.

5.5 Module Information Retrieval

Koupleless Module Information Retrieval

View the names and statuses of all installed modules on a base instance

kubectl get module -n <namespace> -l koupleless.alipay.com/base-instance-ip=<pod-ip> -o custom-columns=NAME:.metadata.name,STATUS:.status.status

or

kubectl get module -n <namespace> -l koupleless.alipay.com/base-instance-name=<pod-name> -o custom-columns=NAME:.metadata.name,STATUS:.status.status

View detailed information of all installed modules on a base instance

kubectl describe module -n <namespace> -l koupleless.alipay.com/base-instance-ip=<pod-ip>

or

kubectl describe module -n <namespace> -l koupleless.alipay.com/base-instance-name=<pod-name>

Replace <pod-ip> with the IP of the base instance you want to view, <pod-name> with the name of the base instance you want to view, and <namespace> with the namespace of the resources you want to view.

5.6 Error Codes

This article mainly introduces the error codes of Arklet, ModuleController, and KouplelessBoard.

ErrorCode Rules

Error codes have two levels and support dynamic combination. Each level uses PascalCase, and levels are separated only by ".", for example Arklet.InstallModuleFailed.
Level 1: Error Source
Level 2: Error Type
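The format rule can be checked mechanically. The sketch below is illustrative only: `parse_error_code` is a hypothetical helper, and the PascalCase check is approximated as "starts with an uppercase letter and contains only letters and digits".

```python
import re
from typing import Tuple

# Approximation of PascalCase: leading uppercase letter, then letters or digits.
_PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")


def parse_error_code(code: str) -> Tuple[str, str]:
    """Split 'Source.Type' into (error source, error type), validating the format."""
    parts = code.split(".")
    if len(parts) != 2:
        raise ValueError(f"expected exactly two levels separated by '.': {code!r}")
    for part in parts:
        if not _PASCAL_CASE.match(part):
            raise ValueError(f"level is not PascalCase: {part!r}")
    return parts[0], parts[1]


print(parse_error_code("Arklet.InstallModuleFailed"))
```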

Suggestion

For each error, briefly explain the solution so that upstream operators can refer to it.

Arklet Error Codes

Level 1: Error Source

| Code | Meaning |
| --- | --- |
| User | Errors caused by the user |
| Arklet | Exceptions from Arklet itself |
| ModuleController | Exceptions caused by specific upstream components |
| OtherUpstream | Exceptions caused by unknown upstream |

Level 2: Error Type

| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | Arklet | UnknownError | Unknown error (default) | Please check |
| General | ModuleController | InvalidParameter | Parameter validation failed | Please check the parameters |
| General | ModuleController | InvalidRequest | Invalid operation type | Please check the request |
| General | OtherUpstream | DecodeURLFailed | URL parsing failed | Please check if the URL is valid |
| Query Related | Arklet | NoMatchedBiz | Module query failed, no target biz exists | - |
| Query Related | Arklet | InvalidBizName | Module query failed, query parameter bizName cannot be empty | Please add the query parameter bizName |
| Installation Related | Arklet | InstallationRequirementNotMet | Module installation conditions are not met | Please check the necessary parameters for module installation |
| Installation Related | Arklet | PullBizError | Package pulling failed | Please retry |
| Installation Related | Arklet | PullBizTimeOut | Package pulling timed out | Please retry |
| Installation Related | User | DiskFull | Disk full when pulling the package | Please replace the base |
| Installation Related | User | MachineMalfunction | Machine malfunction | Please restart the base |
| Installation Related | User | MetaspaceFull | Metaspace exceeds the threshold | Please restart the base |
| Installation Related | Arklet | InstallBizExecuting | Module is being installed | Please retry |
| Installation Related | Arklet | InstallBizTimedOut | Uninstalling old module failed during module installation | Please check |
| Installation Related | Arklet | InstallBizFailed | New module installation failed during module installation | Please check |
| Installation Related | User | InstallBizUserError | Module installation failed, business exception | Please check the business code |
| Uninstallation Related | Arklet | UninstallBizFailed | Uninstallation failed, current biz still exists in the container | Please check |
| Uninstallation Related | Arklet | UnInstallationRequirementNotMet | Module uninstallation conditions are not met | The current module has multiple versions, and the version to be uninstalled is in the active state, which is not allowed to be uninstalled |

ModuleController Error Codes

Level 1: Error Source

| Code | Meaning |
|---|---|
| User | Errors caused by the user |
| ModuleController | Exceptions from ModuleController itself |
| KouplelessBoard | Exceptions caused by specific upstream components |
| Arklet | Exceptions caused by specific downstream components |
| OtherUpstream | Exceptions caused by unknown upstream |
| OtherDownstream | Exceptions caused by unknown downstream |

Level 2: Error Type

| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | ModuleController | UnknownError | Unknown error (default) | Please check |
| General | OtherUpstream | InvalidParameter | Parameter validation failed | Please check the parameters |
| General | Arklet | ArkletServiceNotFound | Base service not found | Please ensure that the base has Koupleless dependency |
| General | Arklet | NetworkError | Network call exception | Please retry |
| General | OtherUpstream | SecretAKError | Signature exception | Please confirm that there are operation permissions |
| General | ModuleController | DBAccessError | Database read/write failed | Please retry |
| General | OtherUpstream | DecodeURLFailed | URL parsing failed | Please check if the URL is valid |
| General | ModuleController | RetryTimesExceeded | Multiple retries failed | Please check |
| General | ModuleController | ProcessNodeMissed | Lack of available working nodes | Please retry later |
| General | ModuleController | ServiceMissed | Service missing | Please check if ModuleController version contains the template type |
| General | ModuleController | ResourceConstraned | Resource limited (thread pool, queue, etc. full) | Please retry later |
| Installation Related | Arklet | InstallModuleTimedOut | Module installation timed out | Please retry |
| Installation Related | Arklet / User | InstallModuleFailed | Module installation failed | Please check the failure reason |
| Installation Related | Arklet | InstallModuleExecuting | Module is being installed | The same module is being installed, please retry later |
| Installation Related | User | DiskFull | Disk full | Please replace |
| Uninstallation Related | OtherUpstream | EmptyIPList | IP list is empty | Please enter the IP to be uninstalled |
| Uninstallation Related | Arklet | UninstallBizTimedOut | Module uninstallation timed out | Please retry |
| Uninstallation Related | Arklet | UninstallBizFailed | Module uninstallation failed | Please check |
| Base Related | ModuleController | BaseInstanceNotFound | Base instance not found | Please ensure that the base instance exists |
| Base Related | KubeAPIServer | GetBaseInstanceFailed | Failed to query base information | Please ensure that the base instance exists |
| Base Related | ModuleController | BaseInstanceInOperation | Base instance is under operation | Please retry later |
| Base Related | ModuleController | BaseInstanceNotReady | Base data not read or base is not available | Please ensure that the base is available |
| Base Related | ModuleController | BaseInstanceHasBeenReplaced | Base instance has been replaced | Additional base instances will be added later, please wait |
| Base Related | ModuleController | InsufficientHealthyBaseInstance | Insufficient healthy base instances | Please scale out |
| Scaling Related | ModuleController | RescaleRequirementNotMet | Scaling conditions are not met | Please check if there are enough machines for scaling/Check the scaling ratio |

⚠️ Note: The base runs on different base instances, such as pods. Therefore, BaseInstanceInOperation, BaseInstanceNotReady, BaseInstanceHasBeenReplaced, InsufficientHealthyBaseInstance error codes may refer to both the application status of the base and the status of the base instance.
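The Solution column of the ModuleController table suggests a coarse handling policy for callers. The following is a minimal sketch of such a policy, assuming codes are written as Source.Type using the Error Source and Error Type columns. The function name suggested_action and the action names (retry, retry-later, wait, scale-out, check) are hypothetical and not part of ModuleController.

```shell
# Hypothetical helper: maps a ModuleController error code to a coarse action
# derived from the Solution column. Identifiers are spelled as in the table
# (including "ResourceConstraned").
suggested_action() {
  case "$1" in
    # "Please retry"
    Arklet.NetworkError | ModuleController.DBAccessError) echo retry ;;
    Arklet.InstallModuleTimedOut | Arklet.UninstallBizTimedOut) echo retry ;;
    # "Please retry later"
    ModuleController.ProcessNodeMissed | ModuleController.ResourceConstraned) echo retry-later ;;
    Arklet.InstallModuleExecuting | ModuleController.BaseInstanceInOperation) echo retry-later ;;
    # "Additional base instances will be added later, please wait"
    ModuleController.BaseInstanceHasBeenReplaced) echo wait ;;
    # "Please scale out"
    ModuleController.InsufficientHealthyBaseInstance) echo scale-out ;;
    # Everything else needs a manual check of parameters, permissions, or state.
    *) echo check ;;
  esac
}
```

A caller could, for example, retry immediately when suggested_action prints retry, back off when it prints retry-later, and surface the error to an operator when it prints check.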

KouplelessBoard (Dashboard) Error Codes

Level 1: Error Source

| Code | Meaning |
|---|---|
| KouplelessBoard | Exceptions from KouplelessBoard itself |
| ModuleController | Exceptions caused by specific downstream components |
| OtherUpstream | Exceptions caused by unknown upstream |
| OtherDownstream | Exceptions caused by unknown downstream |

Level 2: Error Type

| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | KouplelessBoard | UnknownError | Unknown error (default) | |
| General | OtherUpstream | InvalidParameter | Parameter validation failed | Please check the parameters |
| Work Order | KouplelessBoard | OperationPlanNotFound | Work order not found | Please check |
| Work Order | KouplelessBoard | OperationPlanMutualExclusion | Work order mutual exclusion | Please retry |
| Internal Error | KouplelessBoard | InternalError | Internal system error | Please check |
| Internal Error | KouplelessBoard | ThreadPoolError | Thread pool call exception | Please check |
| Operation and Maintenance | ModuleController | BaseInstanceOperationFailed | Operation failed | Please check |
| Operation and Maintenance | ModuleController | BaseInstanceUnderOperation | Under operation | Please retry |
| Operation and Maintenance | ModuleController | BaseInstanceOperationTimeOut | Operation timed out | Please retry |
| Operation and Maintenance | ModuleController | OverFiftyPercentBaseInstancesUnavaliable | More than 50% of machine traffic is unreachable | Please check the base instance |
| Operation and Maintenance | KouplelessBoard | BaselineInconsistency | Consistency check failed (inconsistent baseline) | Please check |
| External Service Call Error | OtherDownstream | ExternalError | External service call error | Please check |
| External Service Call Error | KouplelessBoard | NetworkError | External service call timed out | Please retry |