I have started using Tailscale (TS) at work to improve our networking. Until now, my experience with TS has been limited to my personal tailnet. When you start using TS at work, you will probably be responsible for the tailnet that your users (your teammates) depend on. They rely on that improved connectivity, so you want to make sure they don't experience downtime as you consolidate your tailnet. You also want to make sure the connectivity permissions are right: for example, you may not want user nodes to access other user nodes. For this, you want to use ACLs, tags, and tests.
I suspect I will be talking more about TS in the future, but today I will write about tags and some important subtleties I ran into when using them.
You should use tags to give nodes role-based permissions, for example for servers. By default, a node inherits the permissions of the user who logged it in to the tailnet. For a server, you probably don't want that. When you tag the node, you remove the inherited user permissions, and the node uses the tag's permissions instead.
I had not understood that last point, and it caused me some trouble: I had users' nodes tagged without any permissions associated with those tags. My users reported that they lost connectivity to the tailnet right after logging in. They had to log out and log back in for things to get back to normal, but then they would lose connectivity again. This was a tough one to troubleshoot.
The problem was that they started with their user permissions (which gave them the correct access to the tailnet), but once their nodes were tagged they lost all of those permissions, because I had not defined any permissions for the tag.
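To make the failure mode concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not how Tailscale implements it, and the user name, tag names, and rule format are all made up for illustration.

```python
# Toy model (an assumption, not Tailscale's implementation): a node is
# evaluated either as its tags or as the user who logged in, never both.

# Hypothetical policy: the user has a rule, the tag "tag:laptop" has none.
ACL_RULES = [
    {"src": "alice@example.com", "dst": "tag:server:22"},
]

def identity(login_user, tags):
    """Return the identities the rules are matched against."""
    # Tagging replaces the user identity instead of adding to it.
    return list(tags) if tags else [login_user]

def can_reach(login_user, tags, dst):
    """True if any rule matches one of the node's identities and dst."""
    ids = identity(login_user, tags)
    return any(r["src"] in ids and r["dst"] == dst for r in ACL_RULES)

# Right after login the node is untagged and uses the user's rule.
before = can_reach("alice@example.com", [], "tag:server:22")
# Once the node is tagged, only the tag counts, and it has no rules.
after = can_reach("alice@example.com", ["tag:laptop"], "tag:server:22")
print(before, after)  # prints: True False
```

The same user on the same machine goes from allowed to denied purely because the tag took over and nothing was granted to it.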
Note that you are not going to catch this issue with ACL tests unless they cover the tags. I had tests in place, but they tested the connectivity of the user roles, and those were not the roles that the user nodes ended up with. The nodes were getting the tag's permissions, which I had never set up. That was the problem.
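The blind spot can be sketched with a tiny test runner in Python. This is a simplified stand-in I wrote for illustration, not Tailscale's test engine. The `src` and `accept` keys loosely mirror the shape of the tests section in the policy file, and the group, tag, and port are hypothetical.

```python
# Hypothetical policy: only the user group may reach the server.
RULES = {("group:dev", "tag:server:22")}

def allowed(src, dst):
    """True if the toy policy has a rule from src to dst."""
    return (src, dst) in RULES

def run_tests(tests):
    """Return every (src, dst) pair that was expected to pass but did not."""
    return [
        (t["src"], dst)
        for t in tests
        for dst in t["accept"]
        if not allowed(t["src"], dst)
    ]

# Tests written only from the user's point of view, as mine were.
user_only = [{"src": "group:dev", "accept": ["tag:server:22"]}]
# The same suite plus a test whose source is the tag the nodes end up with.
with_tag = user_only + [{"src": "tag:laptop", "accept": ["tag:server:22"]}]

user_only_failures = run_tests(user_only)
with_tag_failures = run_tests(with_tag)
print(user_only_failures)  # prints: []
print(with_tag_failures)   # prints: [('tag:laptop', 'tag:server:22')]
```

The user-only suite is green even though tagged nodes are locked out. Adding a test with the tag as its source is what surfaces the missing permissions.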
You have two options to solve the issue. Either you remove the tags or you give the right permissions to the tags.
In my case, I decided to remove the tags. That presented some other challenges.
I couldn't remove the tag from the nodes in the admin UI: Tailscale told me to re-authenticate. That confused me. I thought I could just take Tailscale down on the node and then remove the tag.
The problem is that when you take Tailscale down, the control plane still has that node's information, and the node is still relying on the tag I wanted to remove for its permissions.
So the solution is to log out of the tailnet and log in again. Once you log in again, the node gets the permissions of the user. The tag is then no longer in use, and it can be removed.