Hi Jim, great video and very high success rates from the looks of the feedback, although I do have one concern, and that is combining RKE2 and Longhorn all on a single network. I built a K3s/Longhorn cluster and experienced huge performance issues due to Longhorn replication and automatic snapshotting processes... how difficult would it be to segregate the storage network from the RKE2 pod and ingress network? Cheers
There are not enough likes for your video; the amount of work that you put into this is incredible. Thanks, I'm waiting for my new homelab server to try all of this.
@@Jims-Garage now that I have my Proxmox server, I tried this script, but in the end kubectl does not connect to the VIP address. I did the complete process 3 times with fresh VMs and it still gives the same error. Any ideas?
@@raulgil8207 If you can come on Discord and show the output of your logs, that would help. I suspect it's failing early on. Are you able to manually SSH with certificates?
Thank you and happy new year! 🥂🍾 I can confirm that both RKE2 and Longhorn work even on the Debian 12 generic cloud image (with a little bit of tuning of the script - like the SSH part - and the installation of open-iscsi on the workers).
Hey Jim, great video and script again. I'm on my own homelab journey too and your videos have helped me so much, as I'm also a Linux newb (know enough to be dangerous). I'm late to this video because I had some issues with some equipment. Thought I'd just jump in the deep end with this as I had already followed your k3s setup, but figured I'd keep up to date. The script worked perfectly after I sorted out an issue with something two feet in front of the keyboard: I copied and pasted your script into WinSCP like you did, but could not get it to run, hitting the error "/bin/sh^M: bad interpreter", until I worked out it needed Unix line endings. Hope you are still using RKE2 as I am following along, keep up the good work.
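For anyone hitting the same "bad interpreter" error: it comes from Windows CRLF line endings, and converting the script to Unix format fixes it, e.g.:

```bash
# Convert Windows (CRLF) line endings to Unix (LF) so /bin/sh
# stops seeing a stray ^M on the shebang line:
sudo apt install dos2unix
dos2unix rke2.sh

# Or, without installing anything, strip the carriage returns with sed:
sed -i 's/\r$//' rke2.sh
```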
Great one Jim, thanks for this great video. I was just about to hack your k3s script to use RKE2. There's already lots of content about this version; there's a big move going on from K3s to RKE2.
Thanks, that's good to know. It seems like an obvious migration given the benefits and similarities with K3S. I'm going to dual cluster for a while in case of issues (so far, so good).
I love this series, and it's very good for learning about Kubernetes in all shapes and sizes. Excellent to see someone go through it and have an opportunity to play along. I'm wondering, though, why not embed a script-download-and-run step in an image, like with cloud-init? Have your own GitHub repo host the version of the script that needs to run on each node, and have an image for every master/worker that you can apply and copy. On startup it would fetch the GitHub script and run it on first boot to set itself up within the cluster. This makes everything much more parallel, since the scalability of this script ends if you want to do, say, 10 workers and masters, because you have to wait for each one before going on to the next.
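A rough sketch of that idea, assuming cloud-init images (the repo URL and script name are placeholders):

```bash
# Hypothetical cloud-init user-data: each VM fetches its own setup
# script from your GitHub repo and runs it once on first boot,
# so all masters/workers can bootstrap in parallel.
cat <<'EOF' > user-data.yaml
#cloud-config
runcmd:
  - curl -sfL https://raw.githubusercontent.com/<your-repo>/main/node-setup.sh -o /root/node-setup.sh
  - bash /root/node-setup.sh
EOF
```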
I know that must have taken quite a lot of time getting that script to work as expected. There are always things that we overlook, hehe... Appreciate all you do, and it is very helpful indeed!
Thanks once again for the great videos ❤. A little request: please zoom in more when viewing the scripts (the text, I mean), as I am watching you from mobile 😅. Thanks!
I've used Ansible, and while I love the capabilities, I prefer your script as it has a lower bar to execute. Ansible requires learning the syntax and structure, while I already understand scripts well enough.
Hello Jim, Great video. Do you know if it's possible to change the cluster IP from the default 10.43.x.x to something else, in case that range is already in use on the network?
I don't believe so. However, it's an internal Kubernetes range; it will not conflict with existing external networks (much like how Docker works). You expose services through the load balancer, defining the network range you want to use.
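To illustrate the split (the service name and addresses here are hypothetical): the ClusterIP stays inside the internal 10.43.x.x range, while the LoadBalancer IP comes from the range you define.

```bash
# Expose a deployment; the CLUSTER-IP is allocated from the internal
# 10.43.0.0/16 range, the EXTERNAL-IP from your load-balancer range:
kubectl expose deployment nginx --port=80 --type=LoadBalancer
kubectl get svc nginx
# NAME    TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)
# nginx   LoadBalancer   10.43.12.34   192.168.3.60   80:31234/TCP
```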
Thank you for the script. In my deployment I had 2 worker nodes on Proxmox and 5 quad-core thin clients, all with Ubuntu Server 24.04 installed. What I had to modify to get this running: 1. the sudoers file in /etc (%sudo ALL=(ALL:ALL) NOPASSWD:ALL), because the script stopped working at "ssh -tt..."; 2. I had to create the ~/.ssh/config file. Everything else in the script runs great.
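In concrete terms, the two tweaks were roughly as follows (username, hosts, and key path will differ per setup; a sudoers.d drop-in achieves the same as editing /etc/sudoers directly):

```bash
# 1. Passwordless sudo for the sudo group, via a drop-in file:
echo '%sudo ALL=(ALL:ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/nopasswd
sudo chmod 0440 /etc/sudoers.d/nopasswd

# 2. A minimal ~/.ssh/config so the script's ssh calls pick up the key:
cat <<'EOF' >> ~/.ssh/config
Host 192.168.*
    User ubuntu
    IdentityFile ~/.ssh/id_rsa
EOF
```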
So I recently got an old machine and set up Proxmox on it, but I'm a little bit confused on the steps here. Should I just set up 6 VMs and then run the script (with the edits I need) to get this up and running?
How many of Jim's videos do I need to search before I find where he generates the cert files? I have plain old KVM/QEMU, not Proxmox. I can SSH into all of my nodes using SSH keys (passwordless) from the KVM hypervisor host. What sort of cert files are expected?
I simply use the certs generated by Proxmox. You should be able to use the ones you already have (or generate some new ones and use ssh-copy-id; I cover that in my Ansible series).
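A rough sketch of generating and distributing a key pair (user, key type, and IPs are placeholders):

```bash
# Generate a key pair on the admin machine (no passphrase for scripting):
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N ""

# Copy the public key to each node so SSH becomes passwordless:
for node in 192.168.3.21 192.168.3.22 192.168.3.23; do
  ssh-copy-id -i ~/.ssh/id_ed25519.pub ubuntu@"$node"
done
```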
Thanks for the video, I'm really looking forward to deploying it. Do you have any video/guidance on how to set up the SSH certificates to make sure your script works as intended?
Hi, I've been trying out your Cilium version; however, it does not work. The lb-range does not exist in your Cilium config, and the VIP is unable to get created as well. Any fixes regarding this?
Not off the top of my head, although there are many reasons that could interrupt deployment (VMs are fundamentally different to LXCs). I hope to do some testing in future to enable LXCs.
Should the local cluster not be left for Rancher management, and a new cluster with workers etc. be deployed separately, so you aren't giving local access to all your services?
In a proper production environment you want to separate clusters. In a homelab I think this is an acceptable trade-off, given most will be running Docker on a single machine.
Hello Jim, does your script to install RKE2 with Cilium work? I would like to do some tests, but I am not sure whether it is complete or still a "work in progress" (since there are some comments about the kube-vip installation but without really installing it).
@@Jims-Garage Thanks 👍 hope it will be on top of the list soon 😅 About kube-vip, do you think it could make sense to use it at least as a service LB even with Cilium?
@@Jims-Garage Ciao Jim, just one last question to let me better understand: do you know if with Cilium it is possible to assign a VIP for the master nodes (to allow communication between the admin machine and one random master node), as you did in your scripts for the RKE2/K3s installations? Or, to control the cluster from the admin VM, do I still need kube-vip (or something similar)? So Cilium would manage the cloud-system side of the cluster?
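As far as I know, Cilium's LB-IPAM hands out IPs to LoadBalancer Services rather than providing an API-server VIP, so kube-vip (or similar) is still the usual answer for the control-plane address. A rough sketch of the Service side, assuming a recent Cilium with LB-IPAM enabled (the CIDR is a placeholder, and the field is `blocks` on newer releases but `cidrs` on older ones):

```bash
# Define a pool Cilium can allocate LoadBalancer IPs from:
cat <<'EOF' | kubectl apply -f -
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool
spec:
  blocks:
    - cidr: 192.168.3.64/27
EOF
```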
Just ran the script; after 25 mins it ended with "::1]:8080: connect: connection refused. The connection to the server localhost:8080 was refused - did you specify the right host or port?"
I ran kubectl get nodes on master1 and I get this error: "Command 'kubectl' not found, but can be installed with: sudo snap install kubectl". Been trying this since yesterday afternoon; after I checked your GitHub I thought I was doing something wrong, so I waited for the video... still the same error. I even spun up new nodes at least 3 different times. @@Jims-Garage
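For anyone seeing these two errors on an RKE2 server node: kubectl and the kubeconfig ship with RKE2 itself (the snap is not needed), and pointing at RKE2's default paths usually resolves both the "command not found" and the localhost:8080 messages:

```bash
# Use the kubectl binary and kubeconfig bundled with RKE2
# (rke2.yaml is root-owned by default, hence the sudo):
sudo /var/lib/rancher/rke2/bin/kubectl \
  --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
```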
I like your vids; my Traefik now just works with Docker, thanks to you! The next step is Kubernetes :) Traefik and Docker work great, but what about when I want to add a separate domain for Proxmox, not in Docker? How do I do that with your Traefik template? @@Jims-Garage
Hey Jim, thanks so much for the video series, super helpful! I'm having a weird issue with the script, however. It's asking for the password for the admin box while running. It appears to be happening during step 3, at lines 147-149. When I start typing the admin password, it displays the typed text in the clear. Am I missing something obvious here? Testing using all Ubuntu 22.04 server nodes on top of an ESXi cluster.
Actually, a correction: I was able to fix it by modifying the script to install sshpass on all my nodes and pass the password through during that command in the install. Probably not the "right" way to do it, but it seems to be working now. Strange, haha.
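Roughly what that workaround looked like (the password variable and host are placeholders; key-based auth is still the cleaner fix):

```bash
# Feed the password non-interactively to the script's ssh -tt call:
sudo apt install sshpass
sshpass -p "$ADMIN_PASSWORD" ssh -tt ubuntu@192.168.3.21 "echo connected"
```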
The entirety of Step 3 (lines 137-151) results in a prompt for the password on the admin box, which then echoes that password to the screen, and the entire ssh -tt ... section is never executed on master1. I am trying to run this on Synology Ubuntu VMs, all 6 created from the one image, with names and IPs changed as appropriate. The SSH keys have no passphrase.
Great video, worked first time. I struggled a bit on the first go; I realised I needed at least 5 GB RAM and 30 GB disk space to finish the cluster setup comfortably. My setup is behind pfSense, and I use HAProxy to offload certs and redirect ports to access all apps on the network. However, some extra setup is needed with MetalLB in BGP mode. I have the pfSense side ready to accept the requests from MetalLB using the FRR plugin. But I am not sure how/what to modify in MetalLB to advertise the load-balancer IP to pfSense. Any help?
Thanks. The lbrange should be a shared VIP that is dynamically assigned on service request. I haven't tested with OPNsense, but it works out of the gate with Sophos. What have you tried?
@@Jims-Garage I have it fixed and working now; every IP given out by MetalLB now advertises to pfSense. I had to deploy 2 more config files, BGPAdvertisements.yaml and BGPPeers.yaml, which define all the details, and IPAddressPools.yaml has to be edited to add protocol: BGP. After that everything should work, in case anyone is wondering.
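Roughly what those two files look like with MetalLB's CRDs, in case it helps anyone (the ASNs and peer address are placeholders for your pfSense FRR settings):

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: pfsense
  namespace: metallb-system
spec:
  myASN: 64512        # MetalLB's local AS number
  peerASN: 64513      # pfSense/FRR AS number
  peerAddress: 192.168.3.1
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: bgp-adv
  namespace: metallb-system
spec:
  ipAddressPools:
    - first-pool      # name of your IPAddressPool
EOF
```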
@@Jims-Garage What would be the command to expose an app without any certificate? My pfSense HAProxy handles all HTTPS/HTTP offloading for domain pointing. I think a self-signed certificate is the reason why HAProxy doesn't work and I am not able to point any domain to the IP address. Thanks for your help.
@@NoBiggi In the service section of service.yaml you need to specify an IP in the loadBalancerIP range. Then you should be able to access it the same way you would with Docker.
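A minimal sketch of that (the name, port, and IP are placeholders; the IP must fall inside your load-balancer range):

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.3.61   # pick an address from your lbrange
  selector:
    app: myapp
  ports:
    - port: 80          # plain HTTP, no certificate involved
      targetPort: 8080
EOF
```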
I caused myself extra problems by using two sets of SSH keys: one from the main PC to the admin VM and one from the admin VM to the RKE cluster nodes. I had to do a round-robin public key authorization on the admin node for the script to work. As I said, my fault. The script worked flawlessly once I figured that out. Only took me 3 months to figure out. 😅
@@Jims-Garage Thanks, I was following your script to install Rancher, but somehow Rancher got installed on only a worker node, while I wanted it on the master nodes instead. Is there a way to specify some parameters to make Rancher live only on the master nodes? Thanks a lot!
@@Jims-Garage Got it, thanks! But removing the tag would allow all pods to get moved to the masters as well. I kind of just want Rancher to be on the masters; I was trying to play with the taint and toleration stuff but no luck yet... not sure if I'm doing it wrong.
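One approach that may work without untainting anything: pin just the Rancher deployment to the control-plane nodes with a nodeSelector (the label below is RKE2's default for server nodes; verify yours with `kubectl get nodes --show-labels`):

```bash
# Patch the Rancher deployment in cattle-system to schedule only on
# nodes carrying the control-plane role label:
kubectl -n cattle-system patch deployment rancher \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/control-plane":"true"}}}}}'
```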
@@Jims-Garage Sure, but no one calls them certificates. They are typically referred to as keys, or collectively as a key pair. This is most likely where some of the viewers' confusion is coming from.
Can I suggest changing the following line as indicated (to pick up the actual certName)?
Current: ssh-copy-id $user@node
Changed: ssh-copy-id -i $certName $user@node
@@Jims-Garage By default you're right. But to be honest, by default the credential management of Kubernetes is bad too. You end up using Vault on both platforms, and that is the same level of security, am I wrong? By the way, you can use Boundary to get more security in Nomad.