Deploying KubeEdge, EdgeMesh, and Sedna

Download keadm

Use keadm to install KubeEdge. Official docs: https://kubeedge.io/docs/setup/install-with-keadm/

(The English page lists download links; the Chinese page doesn't, which is confusing.)

wget https://github.com/kubeedge/kubeedge/releases/download/v1.16.2/keadm-v1.16.2-linux-amd64.tar.gz

tar -zxvf keadm-v1.16.2-linux-amd64.tar.gz
cp keadm-v1.16.2-linux-amd64/keadm/keadm /usr/local/bin/keadm
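
A quick way to confirm the binary landed correctly (assuming /usr/local/bin is on your PATH):

```shell
# Verify keadm is installed and resolvable from PATH
which keadm
keadm version
```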

Cloud side (KubeEdge master)

sudo keadm init --advertise-address=YOUR_CLOUD_IP --kubeedge-version=v1.16.2 --set iptablesManager.mode="external" --set cloudCore.modules.dynamicController.enable=true

There is also --kube-config=/root/.kube/config, but that’s the default so I omitted it. Point it elsewhere if needed (keadm --help).
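
Once init returns, a quick sanity check on the cloud side (names below match the default keadm install, which deploys cloudcore into the kubeedge namespace):

```shell
# cloudcore should be Running in the kubeedge namespace
kubectl get pods -n kubeedge

# CloudCore listens on 10000 (websocket) and 10002 (cert/token) by default
ss -tlnp | grep -E '10000|10002'
```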

If init fails, the reasons are endless (ask me how I know). Common ones:

Control-plane taint blocking workloads:

kubectl taint nodes master node-role.kubernetes.io/control-plane:NoSchedule-

The cloudcore image couldn't be pulled: verify containerd's proxy settings, or retag the image from a domestic mirror. I used docker.aityp.com and searched for cloudcore v1.16.2.

kubectl edit <pod|daemonset|deployment>/<NAME> -n <NAMESPACE>

Swap in the mirror URL.
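
Alternatively, the retag can be done directly with ctr against containerd's k8s.io namespace. A sketch, assuming the docker.aityp.com mirror path below (check the mirror site for the exact image name):

```shell
# Pull from the mirror, then retag to the name the manifest expects,
# so no manifest edit is needed
ctr -n k8s.io images pull docker.aityp.com/kubeedge/cloudcore:v1.16.2
ctr -n k8s.io images tag docker.aityp.com/kubeedge/cloudcore:v1.16.2 \
    docker.io/kubeedge/cloudcore:v1.16.2
```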

I also hit disk pressure once; it's rarer, but it happens.

When in doubt: kubectl describe pod/node, read logs, search GitHub issues.

Once cloudcore is healthy, keadm gettoken --kube-config=... gives the join token for edgecore.
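
For example (the kubeconfig path is whatever you used at init):

```shell
# Prints the token edgecore will use to authenticate its join
TOKEN=$(keadm gettoken --kube-config=/root/.kube/config)
echo "$TOKEN"
```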

Edge side (KubeEdge worker)

On the edge host (containerd + keadm already installed):

sudo keadm join --kubeedge-version=v1.16.2 --cloudcore-ipport="CLOUD_IP":10000 --remote-runtime-endpoint=unix:///run/containerd/containerd.sock --cgroupdriver=systemd --token=YOUR_TOKEN

If earlier steps are solid this usually works; otherwise check logs and issues.
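
From the cloud side you can confirm the join worked (the edge node's name comes from its hostname unless you overrode it):

```shell
# The edge node should appear with ROLES including agent,edge
kubectl get nodes -o wide

# On the edge host itself, check that edgecore is up
systemctl status edgecore
```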

Deploy EdgeMesh

This is where I burned the most time. If EdgeMesh isn’t right, the joint-inference demo won’t run.

Docs: https://edgemesh.netlify.app/guide/

Enable the edge Kube-API endpoint

  1. dynamicController on the cloud: already enabled if you used the keadm init line above (--set cloudCore.modules.dynamicController.enable=true).

  2. metaServer on the edge: edit /etc/kubeedge/config/edgecore.yaml, then restart edgecore.

vim /etc/kubeedge/config/edgecore.yaml
modules:
  ...
  edgeMesh:
    enable: false
  ...
  metaManager:
    metaServer:
      enable: true

Restart:

systemctl restart edgecore

  3. On the edge, set clusterDNS and clusterDomain, then restart edgecore again.
$ vim /etc/kubeedge/config/edgecore.yaml
modules:
  ...
  edged:
    ...
    tailoredKubeletConfig:
      ...
      clusterDNS:
      - 169.254.96.16
      clusterDomain: cluster.local
...

Placement in the YAML matters (learned the hard way).

Keep clusterDNS as 169.254.96.16 unless you know what you're doing: that value must match bridgeDeviceIP in the EdgeMesh agent config. If you change one, change the other to match.
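
If you do change it, one way to cross-check the two values (assuming your EdgeMesh version exposes bridgeDeviceIP in the agent configmap, named edgemesh-agent-cfg as in the default manifests):

```shell
# clusterDNS on the edge node...
grep -A2 clusterDNS /etc/kubeedge/config/edgecore.yaml

# ...must match bridgeDeviceIP in the EdgeMesh agent config
kubectl -n kubeedge get configmap edgemesh-agent-cfg -o yaml | grep bridgeDeviceIP
```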

Restart:

systemctl restart edgecore
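
If edgecore doesn't come back cleanly after the restart, the journal usually says why (YAML indentation mistakes in edgecore.yaml show up here immediately):

```shell
# Follow edgecore logs live; stop with Ctrl-C
journalctl -u edgecore -f --no-pager
```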

  4. On the edge, sanity-check the local Kube-API shim:
$ curl 127.0.0.1:10550/api/v1/services
{"apiVersion":"v1","items":[{"apiVersion":"v1","kind":"Service","metadata":{"creationTimestamp":"2021-04-14T06:30:05Z","labels":{"component":"apiserver","provider":"kubernetes"},"name":"kubernetes","namespace":"default","resourceVersion":"147","selfLink":"default/services/kubernetes","uid":"55eeebea-08cf-4d1a-8b04-e85f8ae112a9"},"spec":{"clusterIP":"10.96.0.1","ports":[{"name":"https","port":443,"protocol":"TCP","targetPort":6443}],"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}},{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"prometheus.io/port":"9153","prometheus.io/scrape":"true"},"creationTimestamp":"2021-04-14T06:30:07Z","labels":{"k8s-app":"kube-dns","kubernetes.io/cluster-service":"true","kubernetes.io/name":"KubeDNS"},"name":"kube-dns","namespace":"kube-system","resourceVersion":"203","selfLink":"kube-system/services/kube-dns","uid":"c221ac20-cbfa-406b-812a-c44b9d82d6dc"},"spec":{"clusterIP":"10.96.0.10","ports":[{"name":"dns","port":53,"protocol":"UDP","targetPort":53},{"name":"dns-tcp","port":53,"protocol":"TCP","targetPort":53},{"name":"metrics","port":9153,"protocol":"TCP","targetPort":9153}],"selector":{"k8s-app":"kube-dns"},"sessionAffinity":"None","type":"ClusterIP"},"status":{"loadBalancer":{}}}],"kind":"ServiceList","metadata":{"resourceVersion":"377360","selfLink":"/api/v1/services"}}

An empty list or a response latency of ~10 s means something is misconfigured; re-read the steps above.

After this, the edge Kube-API endpoint is on; continue with EdgeMesh itself.

Install EdgeMesh

Clear legacy taints and label the kubernetes service as the docs require:

$ kubectl taint nodes --all node-role.kubernetes.io/master-
$ kubectl label services kubernetes service.edgemesh.kubeedge.io/service-proxy-name=""

Manual install:

  • Clone EdgeMesh
$ git clone https://github.com/kubeedge/edgemesh.git
$ cd edgemesh

  • Install CRDs
$ kubectl apply -f build/crds/istio/
customresourcedefinition.apiextensions.k8s.io/destinationrules.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/gateways.networking.istio.io created
customresourcedefinition.apiextensions.k8s.io/virtualservices.networking.istio.io created

  • Deploy edgemesh-agent. Edit build/agent/resources/04-configmap.yaml: set relayNodes and regenerate the PSK per the comments in the file. I used one cloud/master node as the relay (advertiseAddress = the master's IP) and commented out the rest. For a toy lab you can even skip rotating the PSK (don't tell anyone).
relayNodes:
- nodeName: cloud # master’s node name
  advertiseAddress:
  - *.*.*.*  # master’s IP

Apply:

$ kubectl apply -f build/agent/resources/
serviceaccount/edgemesh-agent created
clusterrole.rbac.authorization.k8s.io/edgemesh-agent created
clusterrolebinding.rbac.authorization.k8s.io/edgemesh-agent created
configmap/edgemesh-agent-cfg created
configmap/edgemesh-agent-psk created
daemonset.apps/edgemesh-agent created

  • Check status
$ kubectl get all -n kubeedge -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP              NODE         NOMINATED NODE   READINESS GATES
pod/edgemesh-agent-7gf7g   1/1     Running   0          39s   192.168.0.71    k8s-node1    <none>           <none>
pod/edgemesh-agent-fwf86   1/1     Running   0          39s   192.168.0.229   k8s-master   <none>           <none>
pod/edgemesh-agent-twm6m   1/1     Running   0          39s   192.168.0.121   ke-edge2     <none>           <none>
pod/edgemesh-agent-xwxlp   1/1     Running   0          39s   192.168.0.187   ke-edge1     <none>           <none>

NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS       IMAGES                           SELECTOR
daemonset.apps/edgemesh-agent   4         4         4       4            4           <none>          39s   edgemesh-agent   kubeedge/edgemesh-agent:latest   k8s-app=kubeedge,kubeedge=edgemesh-agent

The official checklist is fine, but Running ≠ working. On the edge:

crictl logs EDGEMESH_CONTAINER_ID

You want periodic heartbeat logs, not a stuck init.

Run the EdgeMesh test case

Strongly recommended: it surfaces a ton of issues. When I asked for help, the first question was "did the test case pass?" I ran the starred Cross-Edge-Cloud scenario.

  1. Deploy probe pods:
$ kubectl apply -f examples/test-pod.yaml
pod/alpine-test created
pod/websocket-test created

  2. Cloud zone:
$ kubectl apply -f examples/cloudzone.yaml
namespace/cloudzone created
deployment.apps/tcp-echo-cloud created
service/tcp-echo-cloud-svc created
deployment.apps/busybox-sleep-cloud created

  3. Edge zone:
$ kubectl apply -f examples/edgezone.yaml
namespace/edgezone created
deployment.apps/tcp-echo-edge created
service/tcp-echo-edge-svc created
deployment.apps/busybox-sleep-edge created

At this step, edgezone pods on the edge sat in ContainerCreating. Without a running pod you can’t crictl logs or kubectl logs; kubectl describe was empty-ish. systemctl status edgecore finally surfaced the error: /run/flannel/subnet.env missing on the edge. I copied the file from the cloud; shortly after, pods went Running.
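
For reference, /run/flannel/subnet.env is a small env-style file. Mine looked roughly like this (the values below are examples; use whatever your cloud node actually has):

```shell
# /run/flannel/subnet.env (example values)
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
```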

  4. Cloud → edge
$ BUSYBOX_POD=$(kubectl get all -n cloudzone | grep pod/busybox | awk '{print $1}')
$ kubectl -n cloudzone exec $BUSYBOX_POD -c busybox -i -t -- sh
$ telnet tcp-echo-edge-svc.edgezone 2701
Welcome, you are connected to node ke-edge1.
Running on Pod tcp-echo-edge.
In namespace edgezone.
With IP address 172.17.0.2.
Service default.
Hello Edge, I am Cloud.
Hello Edge, I am Cloud.

Worked for me.

  5. Edge → cloud (the docs use docker ps; with containerd, use crictl)
crictl ps
# find busybox container ID
crictl exec -it CONTAINER_ID sh

The first telnet tcp-echo-cloud-svc.cloudzone 2701 failed with something like "name or service not known": I had misconfigured the edge Kube-API endpoint. After fixing that, I hit "no route to host", the same as edgemesh#533. Following section 3 of the "EdgeMesh Q&A" Zhihu post, I flushed the iptables rules, redeployed EdgeMesh, and:

$ telnet tcp-echo-cloud-svc.cloudzone 2701
Welcome, you are connected to node k8s-master.
Running on Pod tcp-echo-cloud.
In namespace cloudzone.
With IP address 10.244.0.8.
Service default.
Hello Cloud, I am Edge.
Hello Cloud, I am Edge.
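
For the record, the iptables flush I ran was roughly the following. This wipes every rule on the node on the assumption that kube-proxy, flannel, and EdgeMesh will recreate theirs after a restart; don't do this on a host with other firewall rules you care about.

```shell
# Flush all rules in every table, then delete user-defined chains
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

# Restart the components so they repopulate their own rules
systemctl restart edgecore
kubectl -n kubeedge rollout restart daemonset edgemesh-agent
```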

That’s a real EdgeMesh success. I lost count of the hours, but that’s the learning tax. Early on I didn’t read logs and thrashed blindly; now logs are the first stop.

Deploy Sedna

Official install: https://sedna.readthedocs.io/en/latest/setup/install.html

curl https://raw.githubusercontent.com/kubeedge/sedna/main/scripts/installation/install.sh | SEDNA_ACTION=create bash -

Sometimes the script fails to detect versions; read its output. If the version comes up blank, abort, uninstall Sedna, and retry.
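
If the piped install keeps misbehaving, downloading the script first makes the version detection inspectable before anything runs (plain curl + bash, nothing extra assumed):

```shell
# Download once, read what it does, then run it explicitly
curl -fsSL -o sedna-install.sh \
    https://raw.githubusercontent.com/kubeedge/sedna/main/scripts/installation/install.sh
less sedna-install.sh            # check how it detects the version
SEDNA_ACTION=create bash sedna-install.sh
```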

# uninstall
curl https://raw.githubusercontent.com/kubeedge/sedna/main/scripts/installation/install.sh | SEDNA_ACTION=delete bash -

If installs still fail, mirror swaps may be required again.

References