I'm looking at a Docker Swarm host that is the only host in our Swarm cluster (no, not recommended).
Due to a DNS issue in Moby, we send inter-container traffic via either the host's external IP (192.168.1.50) or
docker_gwbridge (172.18.0.1). This works because these requests are DNAT'ed to 172.18.0.2 (which fronts a Docker container handling the VIP / load-balancing functionality).
At random, requests from an nginx container to a memcached container (two different containers) take 1 second or more.
We see this by running mc_conn_tester.pl inside the nginx container.
When traffic between the two containers goes via 192.168.1.50, mc_conn_tester.pl reports a lot of 1 sec requests when the test also targets 192.168.1.50. If we run mc_conn_tester.pl against 172.18.0.1 instead, we do not see any 1 sec requests.
When traffic between the two containers goes via 172.18.0.1, mc_conn_tester.pl reports a lot of 1 sec requests when the test also targets 172.18.0.1. If we run mc_conn_tester.pl against 192.168.1.50 instead, we do not see any 1 sec requests.
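For anyone who wants to reproduce the measurement without mc_conn_tester.pl, here is a rough Python sketch of the same kind of check. It is not the Perl tool itself: it only times plain TCP connects and counts those that take 1 second or more. The function names are made up for illustration, and port 11211 (memcached's default) is an assumption about our setup.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return the seconds taken to open (and immediately close) one TCP connection.

    Connection errors and timeouts are not handled here; they propagate as OSError.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start


def slow_connects(durations, threshold=1.0):
    """Return the durations (in seconds) at or above the threshold."""
    return [d for d in durations if d >= threshold]


def probe(host, port, count=1000, threshold=1.0):
    """Open `count` connections one after another and print a short summary."""
    durations = [time_connect(host, port) for _ in range(count)]
    slow = slow_connects(durations, threshold)
    print(
        f"{host}:{port} {len(slow)}/{count} connects took >= {threshold}s, "
        f"max {max(durations):.3f}s"
    )
    return durations


if __name__ == "__main__":
    # Assumed port: 11211 is memcached's default; adjust if yours differs.
    for target in ("192.168.1.50", "172.18.0.1"):
        probe(target, 11211)
```

Running it inside the nginx container against both addresses should show whether the slow connects follow the address that the regular traffic is using, as described above.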
What could cause this behaviour and give 1 sec (and up) delays, and how can we debug this further?