5. Module Controller V2 Operation and Maintenance
1 - 5.1 Module Release
Note: ModuleController V2 has so far been tested only on Kubernetes (K8S) version 1.24. ModuleController V2 relies on certain K8S features, so the K8S version must not be lower than v1.10.
Module Release
ModuleController V2 supports deploying modules with any Pod-based workload, including but not limited to bare Pods, Deployments, DaemonSets, and StatefulSets. Below we demonstrate the release process using a Deployment as an example; other methods can reuse the template configuration from the Deployment:
kubectl apply -f samples/module-deployment.yaml --namespace yournamespace
The complete content is as follows:
apiVersion: apps/v1 # Specifies the API version; must be listed in `kubectl api-versions`
kind: Deployment # Specifies the role/type of resource to create
metadata: # Metadata/attributes of the resource
  name: test-module-deployment # Name of the resource; must be unique within the same namespace
  namespace: default # Namespace where it will be deployed
spec: # Specification field of the resource
  replicas: 1
  revisionHistoryLimit: 3 # Number of historical versions to retain
  selector: # Selector
    matchLabels: # Matching labels
      app: test-module-deployment
  strategy: # Update strategy
    rollingUpdate: # Rolling update
      maxSurge: 30% # Maximum additional replicas that can exist; can be a percentage or an integer
      maxUnavailable: 30% # Maximum number of Pods that can become unavailable during the update; can be a percentage or an integer
    type: RollingUpdate # Rolling update strategy
  template: # Pod template
    metadata: # Metadata/attributes of the resource
      labels: # Sets resource labels
        module-controller.koupleless.io/component: module # Required; declares the Pod type for management by the module controller
        # Unique ID for the Deployment
        app: test-module-deployment-non-peer
    spec: # Specification field of the resource
      containers:
        - name: biz1 # Required; declares the module's bizName, must match the artifactId declared in pom.xml
          image: https://serverless-opensource.oss-cn-shanghai.aliyuncs.com/module-packages/stable/biz1-web-single-host-0.0.1-SNAPSHOT-ark-biz.jar
          env:
            - name: BIZ_VERSION # Required; declares the module's biz_version, value must match the version declared in pom.xml
              value: 0.0.1-SNAPSHOT
      affinity:
        nodeAffinity: # Required; declares the base selector to ensure modules are scheduled onto designated bases
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: base.koupleless.io/stack
                    operator: In
                    values:
                      - java # Mandatory in a multi-language environment; specifies the tech stack
                  - key: base.koupleless.io/version
                    operator: In
                    values:
                      - 1.1.1 # Target base version; mandatory, at least one required
                  - key: base.koupleless.io/name
                    operator: In
                    values:
                      - base # Target base bizName; mandatory, at least one required
      tolerations: # Required; allows Pods to be scheduled onto base nodes
        - key: "schedule.koupleless.io/virtual-node"
          operator: "Equal"
          value: "True"
          effect: "NoExecute"
Apart from the mandatory fields, all configuration matches a regular Deployment; additional Deployment configuration can be added for custom functionality.
Subsequent module updates can be achieved by updating the module Deployment’s Container Image and BIZ_VERSION, utilizing the Deployment’s RollingUpdate for phased updates.
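For instance, assuming the sample Deployment above (`test-module-deployment` with container `biz1` in the `default` namespace), a rolling module update could be triggered with kubectl like this; the new image URL and version here are placeholders, not a published artifact:

```shell
# Point the module container at the new module JAR (placeholder URL)
kubectl set image deployment/test-module-deployment \
  biz1=https://example.com/module-packages/biz1-web-single-host-0.0.2-SNAPSHOT-ark-biz.jar \
  --namespace default

# Keep BIZ_VERSION in sync with the version declared in pom.xml
kubectl set env deployment/test-module-deployment \
  BIZ_VERSION=0.0.2-SNAPSHOT --namespace default

# Watch the RollingUpdate progress
kubectl rollout status deployment/test-module-deployment --namespace default
```

Both fields should be updated together, since the Module Controller matches the installed module by bizName and biz_version.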
The Module Controller ensures lossless module traffic during rolling updates by controlling the Pod status update sequence on the same base. The process is as follows:
- After updating the Deployment, new version module Pods are created based on the update strategy.
- The K8S Scheduler schedules these Pods to the VNode, where the old module version is still installed.
- The Module Controller detects the successful scheduling of Pods and initiates the installation of the new module version.
- Once the installation is complete, the Module Controller checks the status of all modules on the current base, sorts the associated Pods by creation time, and updates their statuses in sequence. This causes the Pods corresponding to the old module version to become Not Ready first, followed by the new version Pods becoming Ready.
- The Deployment controller, upon detecting that the new Pods are Ready, begins cleaning up old version Pods. It prioritizes deleting Pods that are Not Ready. At this point, the old version Pods on the same base are already Not Ready and are deleted, preventing Ready state old version Pods on other bases from being deleted first.
Throughout this process, there is no instance where a base lacks a module, ensuring lossless traffic during the module update.
Checking Module Status
You can check module status by examining the Pods whose nodeName corresponds to the base's node. First, it helps to understand the mapping between base services and nodes.
In the design of Module Controller V2, each base generates a globally unique UUID at startup as the identifier for the base service. The corresponding node’s Name includes this ID.
Additionally, the IP of the base service corresponds one-to-one with the node’s IP, allowing selection of the corresponding base Node via IP.
Therefore, you can use the following command to view all Pods (modules) installed on a specific base and their statuses:
kubectl get pod -n <namespace> --field-selector status.podIP=<baseIP>
Or
kubectl get pod -n <namespace> --field-selector spec.nodeName=virtual-node-<baseUUID>
Module Offline
Deleting the module’s Pod or its controlling resource from the K8S cluster takes the module offline. For instance, in a Deployment scenario, you can delete the corresponding Deployment directly:
kubectl delete deployment yourmoduledeployment --namespace yournamespace
Replace yourmoduledeployment with your ModuleDeployment name and yournamespace with your namespace.
For customizing module release and operation strategies (such as grouping, Beta testing, pausing, etc.), refer to Module Operation and Scheduling Strategies.
The demonstrated example uses kubectl; directly calling the K8S API Server to delete the Deployment also takes the module group offline.
Module Scaling
Since ModuleController V2 fully leverages K8S’s Pod orchestration, scaling applies to ReplicaSets, Deployments, StatefulSets, etc., each using its own scaling method. Below we use Deployment as an example:
kubectl scale deployments/yourdeploymentname --namespace=yournamespace --replicas=3
Replace yourdeploymentname with your Deployment name, yournamespace with your namespace, and set the replicas parameter to the desired number of replicas.
Scaling strategies can also be implemented through API calls.
Module Replacement
In ModuleController V2, a module is tightly bound to its container. To replace a module, run an update: change the module’s image address on the Pod where the module resides.
The exact replacement method varies slightly with the deployment method. Directly updating a bare Pod’s spec replaces the module in place, while a Deployment executes its configured update strategy (e.g., a rolling update that creates new-version Pods before deleting old ones). A DaemonSet also follows its configured update strategy, but in the opposite order (delete before create), which may cause traffic loss.
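As a sketch of the bare-Pod case above, the container image of a running Pod can be patched in place (Pod and image names here are placeholders):

```shell
# In-place module replacement on a bare Pod: the image field of a running
# Pod's container is mutable, so patching it triggers a module update.
kubectl patch pod your-module-pod --namespace default --type='json' \
  -p='[{"op":"replace","path":"/spec/containers/0/image","value":"https://example.com/module-packages/biz1-0.0.2-SNAPSHOT-ark-biz.jar"}]'
```

Remember to also update the BIZ_VERSION env value so it matches the new module version.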
Module Rollback
Being compatible with native Deployments, rollback can be achieved using Deployment’s rollback method.
To view deployment history:
kubectl rollout history deployment yourdeploymentname
To rollback to a specific version:
kubectl rollout undo deployment yourdeploymentname --to-revision=<TARGET_REVISION>
Other Operational Issues
Module Traffic Service Implementation
A native Service can be created for the module, which can provide services only when the base and ModuleController are deployed within the same VPC.
As bases and the ModuleController may currently be deployed in different VPCs, interaction between them goes through MQTT message queues. A base node takes the IP of the Pod where the base resides, and module Pods take the IP of their base node. Therefore, when the base and the ModuleController are not in the same VPC, the module’s IP is effectively invalid, preventing external service provision.
A potential solution involves forwarding at the Load Balancer (LB) layer of the Service, redirecting the Service’s traffic to the base service on the corresponding IP of the K8S where the base resides. Further evaluation and optimization of this issue will be based on actual usage scenarios.
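When the base and ModuleController do share a VPC, a plain Service selecting the module Pods suffices. A minimal sketch, reusing the labels from the sample Deployment above; the Service name and ports are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: biz1-service        # hypothetical name
  namespace: default
spec:
  selector:
    app: test-module-deployment   # matches the module Pods' label
  ports:
    - protocol: TCP
      port: 80              # port exposed by the Service (hypothetical)
      targetPort: 8080      # port the module serves on inside the base (hypothetical)
```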
Incompatible Base and Module Release
- Deploy a module’s Deployment first, specifying the latest version of the module code package address in Container and the name and version information of the new version base in nodeAffinity. This Deployment will create corresponding Pods, but they won’t be scheduled until new version bases are created.
- Update the base Deployment to release the new version image, triggering the replacement and restart of the base. Upon startup, the base informs the ModuleController V2 controller, creating a corresponding version node.
- After the creation of the corresponding version base node, the K8S scheduler automatically triggers scheduling, deploying the Pods created in step 1 onto the base node for installation of the new version module, thus achieving simultaneous release.
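As a sketch of step 1, the module Deployment’s nodeAffinity pins the new base name and version, so its Pods stay Pending until a node for the new base registers. The version value 1.1.2 here is a hypothetical example of a not-yet-released base version:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: base.koupleless.io/name
              operator: In
              values:
                - base       # name of the base
            - key: base.koupleless.io/version
              operator: In
              values:
                - 1.1.2      # new base version not yet online (hypothetical)
```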
2 - 5.2 Module Release Operations Strategy
Operations Strategy
To achieve zero-downtime changes in the production environment, the module release operations leverage Kubernetes (K8S) native scheduling capabilities to provide secure and reliable update functionality. Users can deploy module Pods according to business requirements.
Scheduling Strategy
Dispersion Scheduling: Achieved through native Deployment controls, with podAntiAffinity configuration facilitating dispersion scheduling.
Peer and Non-Peer Deployment
Peer and non-peer deployment strategies can be realized by selecting different deployment methods.
Peer Deployment
Two implementation methods are provided:
Using DaemonSet: Modules can be deployed as DaemonSets, where a DaemonSet controller automatically creates a module Pod for each base node upon its addition, ensuring peer deployment.
Note that DaemonSet rolling updates occur by uninstalling before reinstalling; choose based on actual business needs.
Via Deployment: Unlike DaemonSet, an additional component is required to maintain module replica count equivalent to the number of base nodes (under development, expected in the next release). Supports install-before-uninstall, avoiding backend traffic loss in a microservices architecture.
While Deployments strive for dispersion, they do not guarantee complete dispersion; modules might be deployed multiple times to the same base. For strong dispersion, add Pod anti-affinity settings in the Deployment, as shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-module-deployment
  namespace: default
  labels:
    module-controller.koupleless.io/component: module-deployment
spec:
  replicas: 1
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      module.koupleless.io/name: biz1
      module.koupleless.io/version: 0.0.1
  strategy:
    rollingUpdate:
      maxSurge: 30%
      maxUnavailable: 30%
    type: RollingUpdate
  template:
    metadata:
      labels:
        module-controller.koupleless.io/component: module
        module.koupleless.io/name: biz1
        module.koupleless.io/version: 0.0.1
    spec:
      containers:
        - name: biz1
          image: https://serverless-opensource.oss-cn-shanghai.aliyuncs.com/module-packages/test_modules/biz1-0.0.1-ark-biz.jar
          env:
            - name: BIZ_VERSION
              value: 0.0.1
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: base.koupleless.io/stack
                    operator: In
                    values: ["java"]
                  - key: base.koupleless.io/version
                    operator: In
                    values: ["1.0.0"] # If modules can only be scheduled to specific node versions, this field is mandatory.
                  - key: base.koupleless.io/name
                    operator: In
                    values: ["base"]
        podAntiAffinity: # Core configuration for dispersion scheduling
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  module.koupleless.io/name: biz1
                  module.koupleless.io/version: 0.0.1
              topologyKey: topology.kubernetes.io/zone
      tolerations:
        - key: "schedule.koupleless.io/virtual-node"
          operator: "Equal"
          value: "True"
          effect: "NoExecute"
Non-Peer Deployment
Achieved by deploying modules as Deployments or ReplicaSets, with deployments based on the replica count setting.
Batch Updates
Batch updates require custom control logic. ModuleController V2 introduces a capability whereby, when different versions of a same-named module are installed sequentially on a base, the Pod of the earlier-installed module enters the BizDeactivate state and transitions to the Failed phase. This behavior can be used to implement batch update strategies.
3 - 5.3 Health Check
Background
The purpose of health checks is to obtain the status of an application throughout its lifecycle, including the operational and runtime phases, so that users can make decisions based on this status. For instance, if the application status is DOWN, it indicates a malfunction in the application, and the user may choose to restart or replace the machine.
In the case of a single application, health checks are relatively simple:
- Operational phase status:
- If it’s starting up, the status is UNKNOWN;
- If startup fails, the status is DOWN;
- If startup is successful, the status is UP.
- Runtime phase status:
- If all health checkpoints of the application are healthy, the status is UP;
- If any health checkpoint of the application is not healthy, the status is DOWN.
In multi-application scenarios, the situation can be much more complex. We need to consider the impact of the multi-application’s status during both the operational phase and the runtime phase on the overall application health. When designing health checks, we need to consider the following two issues:
During the module operational phase, should the module start-up status affect the overall application health status?
In different operational scenarios, users have different expectations. Koupleless module operations have three scenarios:

| Scenario | Impact of the Module on the Overall Application Health Status |
|---|---|
| Module Hot-Deployment | Configurable: users decide whether the hot-deployment result affects the overall application health status (the default is that it does not affect the application's original health status) |
| Static Merge Deployment | Module deployment occurs during base startup, so the module startup status directly affects the overall application health status |
| Module Replay | Module replay occurs during base startup, so the module startup status directly affects the overall application health status |

During the module runtime phase, should the module running status affect the overall application health status?
The module runtime phase status should have a direct impact on the overall application health status.
Under this context, we have designed a health check approach for multi-application scenarios.
Usage
Requirements
- Koupleless version >= 1.1.0
- sofa-ark version >= 2.2.9
Obtain the overall health status of the application
There are 3 types of health status for the base:
Status | Meaning |
---|---|
UP | Healthy, indicating readiness |
UNKNOWN | Currently starting up |
DOWN | Unhealthy (may be due to startup failure or unhealthy running state) |
Since Koupleless supports hot deployment of modules, users can choose whether the module deployment result affects the overall application health status.
Module launch result does not affect the overall application health status (default)
- Features: For a healthy base, if the module installation fails, it will not affect the overall application health status.
- Usage: Same as the health check configuration for regular Spring Boot, configure in the base’s application.properties:
# or do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
# If you need to display all information, configure the following content
management.endpoint.health.show-components=always
management.endpoint.health.show-details=always
- Access: {baseIp:port}/actuator/health
- Result:
{
  // Overall health status of the application
  "status": "UP",
  "components": {
    // Aggregated health status of the modules
    "arkBizAggregate": {
      "status": "UP",
      "details": {
        "biz1:0.0.1-SNAPSHOT": {
          "status": "UP",
          // Health status of all active HealthIndicators in the module
          "details": {
            "diskSpace": {
              "status": "UP",
              "details": {
                "total": 494384795648,
                "free": 272435396608,
                "threshold": 10485760,
                "exists": true
              }
            },
            "pingHe": {
              "status": "UP",
              "details": {}
            }
          }
        }
      }
    },
    // Startup health status of the base and modules
    "masterBizStartUp": {
      "status": "UP",
      // Includes the startup status of each module
      "details": {
        "base:1.0.0": {
          "status": "UP"
        },
        "biz1:0.0.1-SNAPSHOT": {
          "status": "UP"
        },
        "biz2:0.0.1-SNAPSHOT": {
          "status": "DOWN"
        }
      }
    }
  }
}
Overall Health Status in Different Scenarios
Scenario 1: Start base
Status | Meaning |
---|---|
UP | Base is healthy |
UNKNOWN | Base is starting up |
DOWN | Base is unhealthy |
Scenario 2: Start base and install modules with static merge deployment
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | Base or module is starting up |
DOWN | Base startup failed / base is unhealthy / module startup failed / module is unhealthy |
Scenario 3: After base starts, install modules with hot deployment
Provide configuration to let users decide whether the result of module hot deployment affects the overall health status of the application (The default configuration is: Does not affect the original health status of the application)
Default Configuration: In the scenario of hot deployment, whether or not a module is successfully installed does not affect the overall health status of the application, as follows:
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | Base is starting up |
DOWN | Base startup failed / base is unhealthy / module is unhealthy |
Scenario 4: Base running
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | - |
DOWN | Base is unhealthy or module is unhealthy |
Scenario 5: After base started, reinstall module
Reinstall module refers to automatically pulling the module baseline and installing the module after the base is started.
Reinstall module is not supported at the moment.
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | Base or module is starting up |
DOWN | Base is unhealthy or module startup failed or module is unhealthy |
Module launch result affects the overall application health status
- Features: For a healthy base, if a module installation fails, the overall application health status becomes unhealthy as well.
- Usage: In addition to the above configuration, you need to configure koupleless.healthcheck.base.readiness.withAllBizReadiness=true, that is, configure in the base’s application.properties:
# Alternatively, do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
# If you need to display all information, configure the following content
management.endpoint.health.show-components=always
management.endpoint.health.show-details=always
# Do not ignore module startup status
koupleless.healthcheck.base.readiness.withAllBizReadiness=true
- Access: {baseIp:port}/actuator/health
- Result:
{
  // Overall health status of the application
  "status": "UP",
  "components": {
    // Aggregated health status of the modules
    "arkBizAggregate": {
      "status": "UP",
      "details": {
        "biz1:0.0.1-SNAPSHOT": {
          "status": "UP",
          // Health status of all active HealthIndicators in the module
          "details": {
            "diskSpace": {
              "status": "UP",
              "details": {
                "total": 494384795648,
                "free": 272435396608,
                "threshold": 10485760,
                "exists": true
              }
            },
            "pingHe": {
              "status": "UP",
              "details": {}
            }
          }
        }
      }
    },
    // Startup health status of the base and modules
    "masterBizStartUp": {
      "status": "UP",
      // Includes the startup status of each module
      "details": {
        "base:1.0.0": {
          "status": "UP"
        },
        "biz1:0.0.1-SNAPSHOT": {
          "status": "UP"
        }
      }
    }
  }
}
Overall Health Status in Different Scenarios
Scenario 1: Start base
Status | Meaning |
---|---|
UP | Base is healthy |
UNKNOWN | Base is starting up |
DOWN | Base is unhealthy |
Scenario 2: Start base and install modules with static merge deployment
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | Base or module is starting up |
DOWN | Base startup failed / base is unhealthy / module startup failed / module is unhealthy |
Scenario 3: After base starts, install modules with hot deployment
Provide configuration to let users decide whether the result of module hot deployment affects the overall health status of the application (The default configuration is: Does not affect the original health status of the application)
When configuring as koupleless.healthcheck.base.readiness.withAllBizReadiness=true:
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | Base or module is starting up |
DOWN | Base startup failed / Module startup failed / base is unhealthy / module is unhealthy |
Scenario 4: Base running
Status | Meaning |
---|---|
UP | Base and module are healthy |
UNKNOWN | - |
DOWN | Base is unhealthy or module is unhealthy |
Scenario 5: After base started, reinstall module
Reinstall module refers to automatically pulling the module baseline and installing the module after the base is started.
Reinstall module is not supported at the moment.
Obtaining the Health Status of a Single Module
- Usage: Consistent with the regular Spring Boot health check configuration, enable the health endpoint, i.e., configure in the module’s application.properties:
# or do not configure management.endpoints.web.exposure.include
management.endpoints.web.exposure.include=health
- Access: {baseIp:port}/{bizWebContextPath}/actuator/health
- Result:
{
  "status": "UP",
  "components": {
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 494384795648,
        "free": 270828220416,
        "threshold": 10485760,
        "exists": true
      }
    },
    "ping": {
      "status": "UP"
    }
  }
}
Get information about base, modules, and plugins
- Usage: Same as the regular Spring Boot actuator configuration, enable the info endpoint, i.e., configure in the base’s application.properties:
# Note: If the user configures management.endpoints.web.exposure.include on their own, they need to include the health endpoint, otherwise the health endpoint cannot be accessed
management.endpoints.web.exposure.include=health,info
- Access: {baseIp:port}/actuator/info
- Result:
{
  "arkBizInfo": [
    {
      "bizName": "biz1",
      "bizVersion": "0.0.1-SNAPSHOT",
      "bizState": "ACTIVATED",
      "webContextPath": "biz1"
    },
    {
      "bizName": "base",
      "bizVersion": "1.0.0",
      "bizState": "ACTIVATED",
      "webContextPath": "/"
    }
  ],
  "arkPluginInfo": [
    {
      "pluginName": "koupleless-adapter-log4j2",
      "groupId": "com.alipay.sofa.koupleless",
      "artifactId": "koupleless-adapter-log4j2",
      "pluginVersion": "1.0.1-SNAPSHOT",
      "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/koupleless/koupleless-adapter-log4j2/1.0.1-SNAPSHOT/koupleless-adapter-log4j2-1.0.1-SNAPSHOT.jar!/",
      "pluginActivator": "com.alipay.sofa.koupleless.adapter.Log4j2AdapterActivator"
    },
    {
      "pluginName": "web-ark-plugin",
      "groupId": "com.alipay.sofa",
      "artifactId": "web-ark-plugin",
      "pluginVersion": "2.2.9-SNAPSHOT",
      "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/web-ark-plugin/2.2.9-SNAPSHOT/web-ark-plugin-2.2.9-SNAPSHOT.jar!/",
      "pluginActivator": "com.alipay.sofa.ark.web.embed.WebPluginActivator"
    },
    {
      "pluginName": "koupleless-base-plugin",
      "groupId": "com.alipay.sofa.koupleless",
      "artifactId": "koupleless-base-plugin",
      "pluginVersion": "1.0.1-SNAPSHOT",
      "pluginUrl": "file:/Users/lipeng/.m2/repository/com/alipay/sofa/koupleless/koupleless-base-plugin/1.0.1-SNAPSHOT/koupleless-base-plugin-1.0.1-SNAPSHOT.jar!/",
      "pluginActivator": "com.alipay.sofa.koupleless.plugin.ServerlessRuntimeActivator"
    }
  ]
}
4 - 5.4 Deployment of Module Controller V2
Note: ModuleController V2 has only been tested on K8S version 1.24 and relies on certain K8S features. Therefore, the K8S version should not be lower than V1.10.
Resource File Locations
Deployment Method
Use the kubectl apply command to apply the above four resource files in sequence to complete the deployment of a single-instance ModuleController.
For using the Module Controller’s sharded cluster capability, modify the above deployment definition to a Deployment version, placing the Pod Spec content into the Deployment template.
To use load balancing in a sharded cluster, set the IS_CLUSTER parameter to true in the Module Controller ENV configuration.
Configurable Parameter Explanation
Environment Variable Configuration
Below are some configurable environment variables and their explanations:
ENABLE_MQTT_TUNNEL
- Meaning: Flag to enable the MQTT operations pipeline. Set to true to enable. If enabled, configure the related environment variables below.
MQTT_BROKER
- Meaning: URL of the MQTT broker.
MQTT_PORT
- Meaning: MQTT port number.
MQTT_USERNAME
- Meaning: MQTT username.
MQTT_PASSWORD
- Meaning: MQTT password.
MQTT_CLIENT_PREFIX
- Meaning: MQTT client prefix.
ENABLE_HTTP_TUNNEL
- Meaning: Flag to enable the HTTP operations pipeline. Set to true to enable. Optionally configure the environment variables below.
HTTP_TUNNEL_LISTEN_PORT
- Meaning: Module Controller HTTP operations pipeline listening port, default is 7777.
REPORT_HOOKS
- Meaning: Error reporting links. Supports multiple links separated by ;. Currently only DingTalk robot webhooks are supported.
ENV
- Meaning: Module Controller environment, set as VNode label for operations environment isolation.
IS_CLUSTER
- Meaning: Cluster flag. If true, Virtual Kubelet will start with the cluster configuration.
WORKLOAD_MAX_LEVEL
- Meaning: Cluster configuration indicating the maximum workload level for workload calculation in Virtual Kubelet. Default is 3. Refer to Module Controller architecture design for detailed calculation rules.
ENABLE_MODULE_DEPLOYMENT_CONTROLLER
- Meaning: Flag to enable the Module Deployment Controller. If true, the deployment controller will start to maintain Module deployment replicas and baselines.
VNODE_WORKER_NUM
- Meaning: Number of concurrent processing threads for VNode Modules. Set to 1 for single-threaded.
CLIENT_ID
- Meaning: Optional Module Controller instance ID. It must be unique within one environment; a random UUID is generated by default.
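Taken together, these variables are set as ordinary container env entries in the ModuleController Pod spec. A minimal sketch enabling the HTTP operations pipeline; the values shown are examples only:

```yaml
env:
  - name: ENABLE_HTTP_TUNNEL
    value: "true"
  - name: HTTP_TUNNEL_LISTEN_PORT
    value: "7777"        # default listening port
  - name: ENV
    value: "dev"         # example environment label, set as a VNode label
  - name: IS_CLUSTER
    value: "false"       # single-instance deployment
  # CLIENT_ID is omitted: a random UUID is generated by default
```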
Documentation Reference
For detailed structure and implementation, refer to the documentation.
5 - 5.5 Module Information Retrieval
View the names and statuses of all installed modules on a base instance
kubectl get module -n <namespace> -l koupleless.alipay.com/base-instance-ip=<pod-ip> -o custom-columns=NAME:.metadata.name,STATUS:.status.status
or
kubectl get module -n <namespace> -l koupleless.alipay.com/base-instance-name=<pod-name> -o custom-columns=NAME:.metadata.name,STATUS:.status.status
View detailed information of all installed modules on a base instance
kubectl describe module -n <namespace> -l koupleless.alipay.com/base-instance-ip=<pod-ip>
or
kubectl describe module -n <namespace> -l koupleless.alipay.com/base-instance-name=<pod-name>
Replace <pod-ip> with the IP of the base instance you want to view, <pod-name> with the name of the base instance, and <namespace> with the namespace of the resources you want to view.
6 - 5.6 Error Codes
This article mainly introduces the error codes of Arklet, ModuleController, and KouplelessBoard.
ErrorCode Rules
Two-level error codes that support dynamic combination, written in PascalCase; levels are separated only by ".", e.g., Arklet.InstallModuleFailed.
Level 1: Error Source
Level 2: Error Type
Suggestion
Briefly explain the solution for upstream operations to refer to.
Arklet Error Codes
Level 1: Error Source
Code | Meaning |
---|---|
User | Errors caused by the user |
Arklet | Exceptions from Arklet itself |
ModuleController | Exceptions caused by specific upstream components |
OtherUpstream | Exceptions caused by unknown upstream |
Level 2: Error Type
| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | Arklet | UnknownError | Unknown error (default) | Please check |
| | ModuleController | InvalidParameter | Parameter validation failed | Please check the parameters |
| | ModuleController | InvalidRequest | Invalid operation type | Please check the request |
| | OtherUpstream | DecodeURLFailed | URL parsing failed | Please check if the URL is valid |
| Query Related | Arklet | NoMatchedBiz | Module query failed; no target biz exists | - |
| | Arklet | InvalidBizName | Module query failed; the query parameter bizName cannot be empty | Please add the query parameter bizName |
| Installation Related | Arklet | InstallationRequirementNotMet | Module installation conditions are not met | Please check the necessary parameters for module installation |
| | Arklet | PullBizError | Package pulling failed | Please retry |
| | Arklet | PullBizTimeOut | Package pulling timed out | Please retry |
| | User | DiskFull | Disk full when pulling the package | Please replace the base |
| | User | MachineMalfunction | Machine malfunction | Please restart the base |
| | User | MetaspaceFull | Metaspace exceeds the threshold | Please restart the base |
| | Arklet | InstallBizExecuting | Module is being installed | Please retry |
| | Arklet | InstallBizTimedOut | Uninstalling the old module failed during module installation | Please check |
| | Arklet | InstallBizFailed | New module installation failed during module installation | Please check |
| | User | InstallBizUserError | Module installation failed; business exception | Please check the business code |
| Uninstallation Related | Arklet | UninstallBizFailed | Uninstallation failed; the current biz still exists in the container | Please check |
| | Arklet | UnInstallationRequirementNotMet | Module uninstallation conditions are not met | The current module has multiple versions, and the version to be uninstalled is active, so it cannot be uninstalled |
ModuleController Error Codes
Level 1: Error Source
Code | Meaning |
---|---|
User | Errors caused by the user |
ModuleController | Exceptions from ModuleController itself |
KouplelessBoard | Exceptions caused by specific upstream components |
Arklet | Exceptions caused by specific downstream components |
OtherUpstream | Exceptions caused by unknown upstream |
OtherDownstream | Exceptions caused by unknown downstream |
Level 2: Error Type
| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | ModuleController | UnknownError | Unknown error (default) | Please check |
| | OtherUpstream | InvalidParameter | Parameter validation failed | Please check the parameters |
| | Arklet | ArkletServiceNotFound | Base service not found | Please ensure the base has the Koupleless dependency |
| | Arklet | NetworkError | Network call exception | Please retry |
| | OtherUpstream | SecretAKError | Signature exception | Please confirm that you have operation permissions |
| | ModuleController | DBAccessError | Database read/write failed | Please retry |
| | OtherUpstream | DecodeURLFailed | URL parsing failed | Please check if the URL is valid |
| | ModuleController | RetryTimesExceeded | Multiple retries failed | Please check |
| | ModuleController | ProcessNodeMissed | Lack of available working nodes | Please retry later |
| | ModuleController | ServiceMissed | Service missing | Please check whether the ModuleController version contains the template type |
| | ModuleController | ResourceConstraned | Resources limited (thread pool, queue, etc. full) | Please retry later |
| Installation Related | Arklet | InstallModuleTimedOut | Module installation timed out | Please retry |
| | Arklet / User | InstallModuleFailed | Module installation failed | Please check the failure reason |
| | Arklet | InstallModuleExecuting | Module is being installed | The same module is being installed; please retry later |
| | User | DiskFull | Disk full | Please replace |
| Uninstallation Related | OtherUpstream | EmptyIPList | IP list is empty | Please enter the IP(s) to be uninstalled |
| | Arklet | UninstallBizTimedOut | Module uninstallation timed out | Please retry |
| | Arklet | UninstallBizFailed | Module uninstallation failed | Please check |
| Base Related | ModuleController | BaseInstanceNotFound | Base instance not found | Please ensure the base instance exists |
| | KubeAPIServer | GetBaseInstanceFailed | Failed to query base information | Please ensure the base instance exists |
| | ModuleController | BaseInstanceInOperation | Base instance is under operation | Please retry later |
| | ModuleController | BaseInstanceNotReady | Base data not read or base unavailable | Please ensure the base is available |
| | ModuleController | BaseInstanceHasBeenReplaced | Base instance has been replaced | Additional base instances will be added later; please wait |
| | ModuleController | InsufficientHealthyBaseInstance | Insufficient healthy base instances | Please scale out |
| Scaling Related | ModuleController | RescaleRequirementNotMet | Scaling conditions are not met | Please check whether there are enough machines for scaling / check the scaling ratio |
⚠️ Note: The base runs on different base instances, such as pods. Therefore, BaseInstanceInOperation, BaseInstanceNotReady, BaseInstanceHasBeenReplaced, InsufficientHealthyBaseInstance error codes may refer to both the application status of the base and the status of the base instance.
DashBoard Error Codes
Level 1: Error Source
Code | Meaning |
---|---|
KouplelessBoard | Exceptions from KouplelessBoard itself |
ModuleController | Exceptions caused by specific downstream components |
OtherUpstream | Exceptions caused by unknown upstream |
OtherDownstream | Exceptions caused by unknown downstream |
Level 2: Error Type
| Business Type | Error Source | Error Type | Meaning | Solution |
|---|---|---|---|---|
| General | KouplelessBoard | UnknownError | Unknown error (default) | |
| | OtherUpstream | InvalidParameter | Parameter validation failed | Please check the parameters |
| Work Order | KouplelessBoard | OperationPlanNotFound | Work order not found | Please check |
| | KouplelessBoard | OperationPlanMutualExclusion | Work order mutual exclusion | Please retry |
| Internal Error | KouplelessBoard | InternalError | Internal system error | Please check |
| | KouplelessBoard | ThreadPoolError | Thread pool call exception | Please check |
| Operation and Maintenance | ModuleController | BaseInstanceOperationFailed | Operation failed | Please check |
| | ModuleController | BaseInstanceUnderOperation | Under operation | Please retry |
| | ModuleController | BaseInstanceOperationTimeOut | Operation timed out | Please retry |
| | ModuleController | OverFiftyPercentBaseInstancesUnavaliable | More than 50% of machine traffic is unreachable | Please check the base instances |
| | KouplelessBoard | BaselineInconsistency | Consistency check failed (inconsistent baseline) | Please check |
| External Service Call Error | OtherDownstream | ExternalError | External service call error | Please check |
| | KouplelessBoard | NetworkError | External service call timed out | Please retry |