Byzantine connectivity issues with docker & package managers
Good day nerds :)
I decided to do a write-up about Docker and some weird connectivity issues that I had, so maybe it will be useful for others as well. In this piece I am going to walk through the debugging process rather than give a direct solution; unfortunately there is no single solution to this kind of problem in general...
So the anatomy of the problem:
- Host machine has connectivity to outside world and DNS resolves just fine.
- Container on the host machine has connectivity, DNS resolves just fine.
- You try to update or fetch a package via apt-get or yum, connection freezes.
Before debugging the problem, make sure you are running the latest Docker version available to you. Compare your local version with the latest release:
~|⇒ docker -v
Docker version 1.9.1, build a34a1d5
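If you want to script that comparison, here is a minimal sketch. The helper names `docker_version` and `version_older` are made up for this post, and it assumes GNU `sort` (for `-V`) is available. You still have to look up the latest release number yourself.

```shell
# Extract the bare version number from `docker -v` output on stdin.
docker_version() {
  sed -E 's/^Docker version ([0-9][0-9.]*).*/\1/'
}

# Succeed if version $1 is strictly older than version $2 (needs GNU sort -V).
version_older() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example, with the latest release number filled in by hand:
# latest=...   # whatever the current release is
# version_older "$(docker -v | docker_version)" "$latest" && echo "time to upgrade"
```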
Then the first thing to check is whether I can manually talk to the package repository, whether it's archive.ubuntu.com or the Debian equivalent. And it seems that traffic travels between the container and the destination just fine.
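One way to do this manual check from inside a stock container, where curl or wget may not be installed, is bash's built-in /dev/tcp redirection. This is only a sketch: `can_reach` is a made-up helper name, it assumes `bash` and coreutils `timeout` exist in the image, and success only tells you that a TCP connection could be opened, not that a full download will complete.

```shell
# Succeed if a TCP connection to host $1, port $2 opens within $3 seconds
# (default 5). Uses bash's /dev/tcp, so no extra packages are required.
can_reach() {
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run inside the container:
# can_reach archive.ubuntu.com 80 && echo "TCP connect OK"
```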
OK, that doesn't make much sense. Let's start debugging by looking at the docker0 interface.
~|⇒ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 50:7b:9d:5e:74:f4 brd ff:ff:ff:ff:ff:ff
3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether dc:53:60:49:4a:0d brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.157/24 brd 192.168.0.255 scope global dynamic wlp3s0
       valid_lft 599467sec preferred_lft 599467sec
    inet6 fe80::de53:60ff:fe49:4a0d/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
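Wait, it's state DOWN? Your container is not running, silly. A bridge with no interfaces attached reports NO-CARRIER, so docker0 stays DOWN until a container is started.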
~|⇒ docker run -t -i ubuntu:14.04 /bin/bash
Interface status of docker0:
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:7aff:fee9:5c6b/64 scope link
       valid_lft forever preferred_lft forever
6: veth2415ecd@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
    link/ether 6e:67:6b:e7:70:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::6c67:6bff:fee7:700a/64 scope link
       valid_lft forever preferred_lft forever
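If you would rather not eyeball that wall of output, the state can be pulled out of the one-line form of `ip`. A small sketch, where `link_state` is a helper name I made up:

```shell
# Print the word following "state" in `ip -o link` output read from stdin.
link_state() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "state") { print $(i + 1); exit } }'
}

# Example, on the host while a container is running:
# ip -o link show docker0 | link_state
```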
So the docker0 interface is up and running, but apt-get update still fails. Hmm, let's force the use of IPv4:
apt-get update -o Acquire::ForceIPv4=true
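If forcing IPv4 turns out to help in your case, the same option can be made permanent with an apt configuration snippet instead of typing `-o` every time. A minimal sketch: the file name `99force-ipv4` and the `APT_CONF_DIR` override are my own choices, not anything apt requires.

```shell
# Write an apt configuration snippet that forces IPv4 for every apt run.
# APT_CONF_DIR is a hypothetical override so the function can be tried
# against a scratch directory first; by default it targets the real one.
persist_force_ipv4() {
  dir=${APT_CONF_DIR:-/etc/apt/apt.conf.d}
  printf 'Acquire::ForceIPv4 "true";\n' > "$dir/99force-ipv4"
}

# Example, as root inside the container:
# persist_force_ipv4 && apt-get update
```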
What if the DNS we use is problematic? OK, let's test that. Let's run the container and make it use different DNS servers:
docker run --dns 220.127.116.11 --dns 18.104.22.168 -t -i ubuntu:14.04 /bin/bash
No luck, the update still gets stuck at 0%.
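To double-check that the `--dns` flags actually took effect, you can look at which resolvers the container ended up with. A small sketch, with `nameservers` being a made-up helper name:

```shell
# List the nameserver addresses from a resolv.conf-style file
# (defaults to /etc/resolv.conf).
nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example, run inside the container:
# nameservers
# getent hosts archive.ubuntu.com
```

If the addresses printed match what you passed on the command line and `getent` returns an answer, DNS is probably not the culprit.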
OK, at this point, if you are using Debian or Ubuntu, you have to make sure the docker0 interface is not being handled by NetworkManager :)
Let's see if the NetworkManager is handling the docker0 interface:
~|⇒ nmcli dev status
DEVICE   TYPE      STATE        CONNECTION
docker0  bridge    connected    docker0
wlp3s0   wifi      connected    dlink-39B4
enp0s25  ethernet  unavailable  --
lo       loopback  unmanaged    --
Bad NetworkManager, Bad!
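The same check can be scripted using nmcli's terse output, which prints one `DEVICE:STATE` pair per line. A sketch, with `nm_manages` being a name I made up:

```shell
# Read `nmcli -t -f DEVICE,STATE dev status` output on stdin and succeed
# if the named device is listed in any state other than "unmanaged".
nm_manages() {
  awk -F: -v dev="$1" '
    $1 == dev { found = 1; managed = ($2 != "unmanaged") }
    END { exit !(found && managed) }
  '
}

# Example, on the host:
# nmcli -t -f DEVICE,STATE dev status | nm_manages docker0 && echo "NetworkManager owns docker0"
```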
So let's prevent this. Get the MAC address of the docker0 interface, for example from the ifconfig output:
docker0 Link encap:Ethernet HWaddr 02:42:7a:e9:5c:6b
~|⇒ sudo emacs /etc/NetworkManager/NetworkManager.conf
Add the following to the end of the file:
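For reference, a stanza of this kind can be generated as below. This assumes the standard NetworkManager `unmanaged-devices` setting under `[keyfile]` is what is meant here; `nm_unmanaged_stanza` is a made-up helper name, and the MAC address is the docker0 one shown above.

```shell
# Print a NetworkManager.conf stanza that marks one MAC address as unmanaged.
nm_unmanaged_stanza() {
  printf '[keyfile]\nunmanaged-devices=mac:%s\n' "$1"
}

# Example (appends to the real config, so review the output before running):
# nm_unmanaged_stanza 02:42:7a:e9:5c:6b | sudo tee -a /etc/NetworkManager/NetworkManager.conf
```

If your NetworkManager.conf already has a `[keyfile]` section, it is probably cleaner to add the `unmanaged-devices` line there by hand than to append a second section.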
Stop & start NetworkManager
sudo service network-manager stop
sudo service network-manager start
Alright, now is another good time to check whether the container can successfully perform apt-get update. If the issue continues, we have a few more tricks.
Let's launch the container with a flag that explicitly tells it which networking you want; here, sharing the host's network stack instead of going through docker0.
docker run --net=host -t -i ubuntu:14.04 /bin/bash
Hmm no luck?
OK you know what, let's just try and reset everything
pkill docker
iptables -t nat -F
/etc/init.d/networking restart
# the bridge has to be down before brctl can delete it
ip link set dev docker0 down
brctl delbr docker0
docker -d
So yeah, if you couldn't resolve the problem up to this point, I would recommend taking down your firewall for a brief moment to test without it. Before doing that you should of course look at your iptables rules first. In my case, even though iptables had special rules to allow Docker connectivity (that is why the container had access to the outside world), there was a rule about nginx that clashed with a container rule...
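A cautious way to look at the rules is to snapshot them to a file first and then read through one chain at a time. A rough sketch: `rules_in_chain` is a made-up helper, and the snapshot path is just an example.

```shell
# Print the rules appended to one chain, reading iptables-save output on stdin.
rules_in_chain() {
  grep -E -- "^-A $1 "
}

# Example (needs root); keep the snapshot so the rules can be restored later:
# iptables-save > "$HOME/iptables.before"
# rules_in_chain FORWARD < "$HOME/iptables.before"
# iptables-restore < "$HOME/iptables.before"
```

Going chain by chain makes it a bit easier to spot a rule from another service sitting in front of the Docker ones.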
Anyway, I wrote up these steps to help others; it is possible that you will solve your issue at the nth step, before reaching the end :)
PS: the inline markdown uncontrollably wants to highlight mac addresses with failure :) sorry about that!