
BOARD

ghproxy

plank

Plank Pod Utilities

tide

Here we see Tide's merge requirements and the repos that have incoming PRs

confirm the version of k8s being tested matches the folder

generate list of tests required for a specific version

generate list of tests submitted as run by a PR

add a blocking label if any required tests were not run

comment with list of missing tests
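The required-vs-submitted comparison in the steps above can be sketched with `comm` (filenames are hypothetical placeholders):

```shell
# required.txt: tests required for this k8s version (hypothetical filename)
# submitted.txt: tests the PR's logs show as run (hypothetical filename)
sort -o required.txt required.txt
sort -o submitted.txt submitted.txt
# comm -23 prints lines present in required.txt but absent from submitted.txt
comm -23 required.txt submitted.txt > missing.txt
if [ -s missing.txt ]; then
  echo "Missing required tests:"
  cat missing.txt   # these would drive the blocking label and the PR comment
fi
```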

strategy

get presubmit job results showing up as comments

optional

required

a simple presubmit job (cat/dog) that counts something in cncf/k8s-conformance
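A hypothetical minimal presubmit for cncf/k8s-conformance might look like this in Prow's job config (job name, image, and command are all placeholders, not the real job):

```yaml
presubmits:
  cncf/k8s-conformance:
  - name: pull-k8s-conformance-count   # placeholder job name
    always_run: true
    decorate: true
    spec:
      containers:
      - image: alpine:3                # placeholder image
        command: ["sh", "-c"]
        args: ["grep -rl cat . | wc -l"]   # trivially "counts something"
```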

Setup a cluster

We used eksctl, though long term we may use cluster-api and aws-service-operator-k8s

aws eks list-clusters

loading secrets

TODO: Where did we get these? How do we want to manage them in the future?

github-hmac / hook

kubectl delete secret hmac-token
kubectl create secret generic hmac-token --from-file=hmac=.secret-hook
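If you need to create the hmac secret file from scratch, a random token is enough (a sketch; `.secret-hook` is the filename these notes already use):

```shell
# Generate a 40-character random hex token for the GitHub webhook HMAC secret
openssl rand -hex 20 > .secret-hook
cat .secret-hook
```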

github-oauth

kubectl delete secret oauth-token
kubectl create secret generic oauth-token --from-file=oauth=.secret-oauth

prow components manifest

cluster/starter.yaml

https://github.com/kubernetes/test-infra/blob/master/prow/getting_started_deploy.md#add-the-prow-components-to-the-cluster

kubectl apply -f manifests/starter.yaml

components

services

kubectl get services

pods

kubectl get pods

deployment

kubectl get deployments

ingress

kubectl get ingress

kubectl get ingress ing -o yaml

Rob -> ALB Ingress => other ingress

AWS Blog - NLB Nginx Ingress Controller on EKS
NGINX Ingress Controller - Install Guide

Network Load Balancer with the NGINX Ingress resource

#  curl -LO https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/aws/deploy.yaml
# curl -LO https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/aws/deploy.yaml
kubectl apply -f manifests/ingress/deploy.yaml  # 404s / docs may have moved
# curl -LO https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml
curl -LO https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/nlb-service.yaml
curl -LO https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/apple.yaml
curl -LO  https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/banana.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml
kubectl apply -f https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/nlb-service.yaml
kubectl apply -f https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/apple.yaml
kubectl apply -f https://raw.githubusercontent.com/cornellanthony/nlb-nginxIngress-eks/master/banana.yaml

Troubleshooting resources

EKS Managed Nodes: in the AWS Console, in order to inspect the nodes you need to look at EC2. Do not bother with the EKS Clusters page.

When you log on to the nodes in the unknown state and run the following:

[ec2-user@ip-192-168-45-255 ~]$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-eksclt.al2.conf
   Active: active (running) since Mon 2020-05-11 03:07:19 UTC; 16h ago
     Docs: https://github.com/kubernetes/kubernetes
 Main PID: 7983 (kubelet)
    Tasks: 83
   Memory: 222.9M
   CGroup: /system.slice/kubelet.service
           ├─ 7983 /usr/bin/kubelet --node-ip=192.168.45.255 --node-labels=role=prow,alpha.eksctl.io/cluster-name=prow-dev,alpha.eksctl.io/nodegroup-name=prow-1,alpha.eksctl.io/instance-id=i-063c273807d19a3...
           └─24396 /usr/bin/python2 -s /usr/bin/aws eks get-token --cluster-name prow-dev --region ap-southeast-2

May 11 19:14:58 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:58.711930    7983 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:445: Failed to list *v1.Se...authorized
May 11 19:14:58 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:58.712010    7983 controller.go:125] failed to ensure node lease exists, will retry in 7s, error: Unauthorized
May 11 19:14:58 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:58.712078    7983 reflector.go:125] object-"default"/"deck-token-g5pc5": Failed to list *v1.Secret: Unauthorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.018466    7983 reflector.go:125] object-"kube-system"/"kube-proxy": Failed to list *v1.ConfigMap: Unauthorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.326603    7983 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:454: Failed to list *v1.No...authorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.326665    7983 reflector.go:125] object-"default"/"sinker-token-8pgvp": Failed to list *v1.Secret: Unauthorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.634835    7983 reflector.go:125] object-"default"/"tide-token-9fqsp": Failed to list *v1.Secret: Unauthorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.943901    7983 reflector.go:125] object-"default"/"hook-token-dz222": Failed to list *v1.Secret: Unauthorized
May 11 19:14:59 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:14:59.944074    7983 reflector.go:125] object-"default"/"plugins": Failed to list *v1.ConfigMap: Unauthorized
May 11 19:15:00 ip-192-168-45-255.ap-southeast-2.compute.internal kubelet[7983]: E0511 19:15:00.254296    7983 reflector.go:125] object-"default"/"hmac-token": Failed to list *v1.Secret: Unauthorized
Hint: Some lines were ellipsized, use -l to show in full.
[ec2-user@ip-192-168-45-255 ~]$ date
Mon May 11 19:15:41 UTC 2020
[ec2-user@ip-192-168-45-255 ~]$
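The Unauthorized kubelet errors above are consistent with the node's instance role being absent from the kube-system/aws-auth ConfigMap (the eksctl delete output below warns that exact ARN was not found there). For reference, a sketch of the mapRoles entry that would normally exist, using the role ARN from that warning:

```yaml
# data section of the kube-system/aws-auth ConfigMap (sketch)
mapRoles: |
  - rolearn: arn:aws:iam::928655657136:role/eksctl-prow-dev-nodegroup-prow-1-NodeInstanceRole-1UFBFQ9Q5BFN1
    username: system:node:{{EC2PrivateDNSName}}
    groups:
      - system:bootstrappers
      - system:nodes
```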
eksctl get --cluster prow-dev nodegroup

# need to check this
eksctl delete nodegroup --cluster prow-dev prow-1
# pasted result
ii@ip-172-31-4-91:~$ eksctl delete nodegroup --cluster prow-dev prow-1
[ℹ]  eksctl version 0.19.0-rc.1
[ℹ]  using region ap-southeast-2
[ℹ]  combined include rules: prow-1
[ℹ]  1 nodegroup (prow-1) was included (based on the include/exclude rules)
[ℹ]  will delete 1 nodegroups from auth ConfigMap in cluster "prow-dev"
[!]  removing nodegroup from auth ConfigMap: instance identity ARN "arn:aws:iam::928655657136:role/eksctl-prow-dev-nodegroup-prow-1-NodeInstanceRole-1UFBFQ9Q5BFN1" not found in auth ConfigMap
[ℹ]  will drain 1 nodegroup(s) in cluster "prow-dev"
[ℹ]  cordon node "ip-192-168-4-247.ap-southeast-2.compute.internal"
[ℹ]  cordon node "ip-192-168-45-255.ap-southeast-2.compute.internal"
[!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-t9mrd, kube-system/kube-proxy-tggtw
[!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-lc6f5, kube-system/kube-proxy-kxmzh
[!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-t9mrd, kube-system/kube-proxy-tggtw
[!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-lc6f5, kube-system/kube-proxy-kxmzh
[✔]  drained nodes: [ip-192-168-4-247.ap-southeast-2.compute.internal ip-192-168-45-255.ap-southeast-2.compute.internal]
[ℹ]  will delete 1 nodegroups from cluster "prow-dev"
[ℹ]  1 task: { delete nodegroup "prow-1" [async] }
[ℹ]  will delete stack "eksctl-prow-dev-nodegroup-prow-1"
[✔]  deleted 1 nodegroup(s) from cluster "prow-dev"

Creating a managed nodegroup

EKS - Creating a cluster

eksctl create nodegroup -f eksctl.yaml
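The eksctl.yaml referenced above is not shown in these notes; a minimal sketch consistent with the names that appear in the logs (cluster prow-dev, nodegroup prow-1, region ap-southeast-2, role=prow label) might be:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prow-dev
  region: ap-southeast-2
managedNodeGroups:
  - name: prow-1
    instanceType: m5.large   # assumption; instance type is not recorded in these notes
    desiredCapacity: 2       # two nodes appear in the drain output above
    labels:
      role: prow
```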

go get go

curl -L https://dl.google.com/go/go1.14.2.linux-amd64.tar.gz | sudo tar -C /usr/local -xzf -

hook up

Setting up a repo with a webhook. Source code for the add-hook tool is linked below: hook main.go

Bazel separates flags to the command being run using --. Here, for example, bazel refuses to parse --help itself (no wonder nobody understands it!), so in order to have --help interpreted by the add-hook code, prepend -- first:

ii@ip-172-31-4-91 ~/test-infra $ bazel run //experiment/add-hook -- --help
INFO: Analyzed target //experiment/add-hook:add-hook (1 packages loaded, 556 targets configured).
INFO: Build completed successfully, 215 total actions
Usage of newhome/ii.cache/bazel/_bazel_ii/8dad4840a73c734f8c8c7e2d452a8/execroot/io_k8s_test_infra/bazel-out/k8-fastbuild/bin/experiment/add-hook/linux_amd64_stripped/add-hook:
  -confirm
        Apply changes to github
  -event value
        Receive hooks for the following events, defaults to ["*"] (all events) (default *)
  -github-endpoint value
        GitHub's API endpoint (may differ for enterprise). (default https://api.github.com)
  -github-graphql-endpoint string
        GitHub GraphQL API endpoint (may differ for enterprise). (default "https://api.github.com/graphql")
  -github-host string
        GitHub's default host (may differ for enterprise) (default "github.com")
  -github-token-path string
        Path to the file containing the GitHub OAuth secret.
  -hmac-path string
        Path to hmac secret
  -hook-url string
        URL to send hooks
  -repo value
        Add hooks for this org or org/repo

echo $PATH
go get -u k8s.io/test-infra/experiment/add-hook
add-hook
  (
  bazel run //experiment/add-hook -- \
    --github-endpoint=http://ghproxy/ \
    --github-token-path=../prow-config/.secret-oauth \
    --hmac-path=../prow-config/.secret-hook \
    --hook-url http://prow.cncf.io/hook \
    --repo cncf/k8s-conformance \
    --repo cncf/apisnoop \
    --repo cncf-infra/prow-config \
  ) 2>&1
# add --confirm to actually apply the hook changes to GitHub

Adding more repos to prow

  • The new repo will need to be defined in the hook above, but also added to plugins

content of plugins.yaml showing cncf/k8s-conformance added

cat plugins.yaml

  • After updating plugins, run the following to apply it to the cluster.

Let's apply the change

kubectl create configmap plugins --from-file=plugins.yaml=./plugins.yaml --dry-run -o yaml | kubectl replace -f -

ghproxy

kubectl apply -f manifests/ghproxy.yaml

Verifying Conformance Certification Requests

Live Repo : https://github.com/cncf/k8s-conformance Test Repo : https://github.com/cncf-infra/k8s-conformance a fork of the cncf repo

https://github.com/cncf/apisnoop/projects/29 kubernetes-sigs/apisnoop#342

Requirements

Check the consistency of the PR against the above repos. Ensure that the version referenced in the PR title corresponds to the version of k8s referenced in the supplied logs.
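The title-version check can be sketched like this (the example title is the test PR that appears in the plugin logs later in these notes; the regex is an assumption about how versions are written):

```shell
# Extract the k8s version from a PR title
TITLE='NOT A REAL CONFORMANCE REQ for v1.18'
ver=$(printf '%s' "$TITLE" | grep -oE 'v1\.[0-9]+' | head -1)
echo "$ver"   # → v1.18
# the same extraction on the supplied e2e logs would then be compared against $ver
```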

Design

Implement as an external plugin that interacts with, but is not linked into, the Hook component of Prow

Implementation

Plugin

name: verify-conformance-request
desc: Checks a k8s-conformance PR to see if it is internally consistent.

Development setup

Code location /home/ii/go/src/k8s.io/test-infra/prow/external-plugins/verify-conformance-request

Building Code

In the meantime, follow the steps below

Literate Build of the go code

Execute the block below using ,, (the org-babel execute keybinding). Note that we are building locally on the host, and developing the plugin in the k8s/test-infra clone, while we figure out how to vendor the k8s/test-infra/prow dependencies.

# Workaround for the above is to place the cncf plugin into the k8s/test-infra code base
cd ~/go/src/k8s.io/test-infra/prow/external-plugins/verify-conformance-request
# make changes
go build
cp verify-conformance-request /home/ii/prow-config/prow/external-plugins/verify-conformance-request/
ls -al /home/ii/prow-config/prow/external-plugins/verify-conformance-request/

Running the external plugin locally

$ ./verify-conformance-request --hmac-secret-file=/home/ii/.secret-hook --github-token-path=/home/ii/.secret-oauth --plugin-config=/home/ii/prow-config/plugins.yaml

*** Building Container

How to test a plugin

Test data has been placed in /home/ii/prow-config/prow/external-plugins/verify-conformance-request/test-data/open-pr.json. You can send a test webhook using phony as follows:

bazel run //prow/cmd/phony -- \
 --address=http://localhost:8888/hook \
 --hmac="secret_text_goes_here" --event=pull_request \
 --payload=/home/ii/prow-config/prow/external-plugins/verify-conformance-request/test-data/open-pr.json

N.B. the --hmac flag requires the text of the hmac secret as a string.
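For reference, GitHub (and phony) sign the webhook payload with HMAC-SHA1 of the request body using that secret; you can compute the expected X-Hub-Signature header yourself with openssl (secret and payload values here are hypothetical):

```shell
SECRET='0123456789abcdef'          # hypothetical; use the contents of your .secret-hook
PAYLOAD='{"action":"opened"}'      # hypothetical webhook body
# openssl prints "(stdin)= <digest>"; awk takes the last field
SIG="sha1=$(printf '%s' "$PAYLOAD" | openssl dgst -sha1 -hmac "$SECRET" | awk '{print $NF}')"
echo "$SIG"
```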

Build and push the container

Make sure that the Building Code step above is done and that you have copied the binary into the prow-config repo

build the container and tag

  • This will be built as a container and published to the cncf-infra ECR repository: ecr/repo cncf-infra
  • The link above will also provide you with a list of commands to run if you get stuck
  • TODO: Bryan is updating the repo to be the plugin name instead of cncf-prow; once it is up, change this push to go to verify-conformance-request (remember to update verify-conformance-deploy.yaml)

Remember to change the version. Bryan is doing work on this, so this will change as he rolls out new procedures. Thanks Bryan!

cd /home/ii/prow-config/prow/external-plugins/verify-conformance-release
aws ecr get-login-password --region ap-southeast-2  | docker login --username AWS --password-stdin 928655657136.dkr.ecr.ap-southeast-2.amazonaws.com
docker build -t cncf-prow .
docker tag cncf-prow:latest 928655657136.dkr.ecr.ap-southeast-2.amazonaws.com/verify-conformance-release:latest
docker push 928655657136.dkr.ecr.ap-southeast-2.amazonaws.com/verify-conformance-release:latest

Run the container to make sure it is working (optional step that can be used for troubleshooting)

docker run \
  -v /home/ii/.secret-hook:/etc/webhook/hmac \
  -v /home/ii/.secret-oauth:/etc/github/oauth \
  -v /home/ii/prow-config/prow/external-plugins/verify-conformance-request/vcr.yaml:/plugin/vcr.yaml \
  -v /home/ii/prow-config/plugins.yaml:/etc/plugins/plugins.yaml \
  847cf1d2cf02 \
  /bin/bash -c "/plugin/verify-conformance-request --hmac-secret-file=/etc/webhook/hmac --github-token-path=/etc/github/oauth --plugin-config=/plugin/vcr.yaml --update-period=1m"

I do not understand why the above docker run is not seeing the repo

  • I did notice if I exec into that container and run the command in -c it works as expected
docker exec -i -t f39470700e75 bash
root@f39470700e75:/plugin# cat /plugin/vcr.yaml
external_plugins:
  cncf-infra/k8s-conformance:
  - name: verify-conformance-request
    events:
    - issue_comment
    - pull_request
root@f39470700e75:/plugin# /plugin/verify-conformance-request --hmac-secret-file=/etc/webhook/hmac --github-token-path=/etc/github/oauth --plugin-config=/plugin/vcr.yaml --update-period=1m
WARN[0000] It doesn't look like you are using ghproxy to cache API calls to GitHub! This has become a required component of Prow and other components will soon be allowed to add features that may rapidly consume API ratelimit without caching. Starting May 1, 2020 use Prow components without ghproxy at your own risk! https://github.com/kubernetes/test-infra/tree/master/ghproxy#ghproxy
WARN[0000] no plugins specified-- check syntax?
INFO[0000] Throttle(360, 360)                            client=github
INFO[0000] verify-conformance-request : HandleAll : Checking all PRs for handling  plugin=verify-conformance-request
INFO[0000] Server exited.                                error="listen tcp :8888: bind: address already in use"
INFO[0000] Search for query "archived:false is:pr is:open repo:"cncf-infra/k8s-conformance"" cost 1 point(s). 4991 remaining.  plugin=verify-conformance-request
INFO[0000] Considering 1 PRs.                            plugin=verify-conformance-request
INFO[0000] IsVerifiable: title of PR is "NOT A REAL CONFORMANCE REQ for  v1.18"  plugin=verify-conformance-request
INFO[0000] AddLabel(cncf-infra, k8s-conformance, 1, verifiable)  client=github
  • For now I am going to call this good and use the above flags to build out the verify-conformance-deploy.yaml

Next steps: update verify-conformance-deployment.yaml to emulate the docker run

Also, ensure that you are referencing the tag of the image that you just built
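A sketch of what that deployment could look like, mapping each docker run -v mount above onto a Secret or ConfigMap volume (secret and configmap names follow the ones created earlier in these notes; the Deployment name, labels, and image tag are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: verify-conformance-request   # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: verify-conformance-request
  template:
    metadata:
      labels:
        app: verify-conformance-request
    spec:
      containers:
      - name: verify-conformance-request
        image: 928655657136.dkr.ecr.ap-southeast-2.amazonaws.com/verify-conformance-release:latest  # use your pushed tag
        args:
        - --hmac-secret-file=/etc/webhook/hmac
        - --github-token-path=/etc/github/oauth
        - --plugin-config=/plugin/vcr.yaml
        - --update-period=1m
        volumeMounts:
        - name: hmac
          mountPath: /etc/webhook
          readOnly: true
        - name: oauth
          mountPath: /etc/github
          readOnly: true
        - name: vcr-config
          mountPath: /plugin/vcr.yaml   # subPath so the plugin binary in /plugin is not shadowed
          subPath: vcr.yaml
      volumes:
      - name: hmac
        secret:
          secretName: hmac-token   # key "hmac" becomes /etc/webhook/hmac
      - name: oauth
        secret:
          secretName: oauth-token  # key "oauth" becomes /etc/github/oauth
      - name: vcr-config
        configMap:
          name: vcr-config
```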

loading config map for vcr.yaml

kubectl delete configmap vcr-config
kubectl create configmap vcr-config --from-file=/home/ii/prow-config/prow/external-plugins/verify-conformance-request/vcr.yaml

apply verify-conformance-deployment.yaml

kubectl apply -f manifests/verify-conformance-release-deployment.yaml
# kubectl apply -f manifests/verify-conformance-test-deployment.yaml

Let's look at the pods.

kubectl get pods

Initially the pod crashed without logs; this helped me get a meaningful error:

kubectl describe pod verify-conformance-request-5b7647499f-lr49f

See if the logs will tell us anything.

kubectl logs verify-conformance-request-5b7647499f-lr49f | tail -20

Random header to stop the content below accidentally getting collapsed with another header

This is how test-infra deploys the needs-rebase external plugin

*** Configuration

plugins:
  cncf-infra/k8s-conformance:
  # - approve
  - verify-conf-request
  - assign
Footnotes

** software

*** direnv

*** aws-iam-authenticator

https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html

** gotchas

*** oauth secret naming

The documentation seems to call it the oauth secret, when in fact it's a GitHub personal access token.

*** cluster authentication / iam

kubernetes-sigs/aws-iam-authenticator#174 (comment)

*** cluster-admin role

kubectl get clusterrolebinding cluster-admin -o yaml

** ENV for aws cli https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html

**AWS_PROFILE**