Hi @cole,
Thank you for your insights, you were pointing in the right direction.
Here is a brief explanation of our infrastructure and how we found the source of the issue:
Infrastructure
- AKS (Azure Kubernetes Service) with Azure CNI as the network configuration
- A VM (virtual machine) running RSW (RStudio Workbench)
- A VM acting as the NFS (Network File System) host
All the resources are on the same VNET (Virtual Network); the AKS cluster is in one subnet and the VMs are in another.
The VNET has an NSG (Network Security Group) that allows traffic only on specific ports, while all traffic inside the VNET is allowed by default.
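For anyone setting up something similar, an NSG rule opening the NFS-related ports between the two subnets can be sketched with the Azure CLI. This is only an illustration: the resource group, NSG name, and address prefixes below are placeholders, not our actual values.

```shell
# Placeholders: my-rg, my-nsg, and the subnet CIDRs are hypothetical.
az network nsg rule create \
  --resource-group my-rg \
  --nsg-name my-nsg \
  --name allow-nfs \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol '*' \
  --source-address-prefixes 10.0.1.0/24 \
  --destination-address-prefixes 10.0.2.0/24 \
  --destination-port-ranges 111 2049 1110 4045 892
```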
Troubleshooting
We discovered that with all the ports open between the NFS host and the client, we got the error mentioned before.
However, if you do not open all the ports between the NFS host and the client, you will get a timeout instead.
This is because the client tries to reach the NFS server on several ports commonly used by the NFS protocol (such as 111, 2049, 1110, 4045, 892, etc.), and you will not get a response back from the NFS host when something is wrong with the communication.
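A quick way to check which of those ports actually respond from the client side is a simple `nc` probe. A sketch, assuming `NFS_HOST` is the private IP of your NFS server (the value below is a placeholder):

```shell
NFS_HOST=10.0.2.4   # placeholder: your NFS host's private IP
for port in 111 2049 1110 4045 892; do
  if nc -z -w 2 "$NFS_HOST" "$port"; then
    echo "port $port: open"
  else
    echo "port $port: no response (filtered or closed)"
  fi
done
```

`rpcinfo -p "$NFS_HOST"` is another useful check: it lists the services registered with the portmapper on port 111.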
We were able to see this by trying to mount the NFS directory from a pod, as you suggested, Cole.
On the pod we executed the mount command, and we obtained the IP the pod uses to communicate with the outside via curl ifconfig.me.
Then, on the NFS host, we executed the following command to observe all communication on port 2049 (the default port for the NFS protocol):
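For anyone reproducing this step, the check from inside the pod looked roughly like the following. This is a sketch: the host IP and export path are placeholders, not our actual values.

```shell
# Placeholders: 10.0.2.4 is the NFS host's private IP, /export the shared directory
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs 10.0.2.4:/export /mnt/nfs-test

# Find the IP the pod uses when talking to the outside
curl ifconfig.me
```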
sudo tcpdump -i eth0 -nn -s0 -v port 2049 | grep "<AKS-IP>"
In our case the interface was "eth0".
Issue
Basically, the network topology was not set up correctly, and after a few changes everything worked.
For example, be careful: if the machines or pods are on the same VNET, they can only connect to each other using private IPs. In our case, after the network changes we also had to change the "Host" parameter in "launcher-mounts" from an "A record" registered in the Azure-provided DNS to a private IP, so that the communication between the pods and the NFS host goes through private IPs.
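To make the "Host" change concrete, a launcher-mounts entry along these lines is what we mean. The values are illustrative placeholders, not our actual configuration:

```
# /etc/rstudio/launcher-mounts (illustrative values)
MountType: NFS
Host: 10.0.2.4
Path: /export/home
MountPath: /home
```

The key point is that Host is now the NFS server's private IP rather than a DNS A record.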
Hopefully this can be helpful for someone else with the same issue.