Good day nerds :)
I decided to do a write-up about Docker and some weird connectivity issues that I had, so maybe it will be useful for others as well. In this piece I am going to walk through the debugging process rather than give a direct solution; unfortunately there is no single solution to this kind of problem in general...
So the anatomy of the problem:
- Host machine has connectivity to outside world and DNS resolves just fine.
- Container on the host machine has connectivity, DNS resolves just fine.
- You try to update or fetch a package via apt-get or yum, connection freezes.
Before debugging the problem, make sure you are running the latest Docker version available to you. Compare your local version with the latest release:
~|⇒ docker -v
Docker version 1.9.1, build a34a1d5
https://github.com/docker/docker/releases
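If you'd rather not eyeball the comparison, here is a small shell sketch. It assumes GNU `sort -V` is available. The `1.10.0` in the usage comment is only a placeholder for whatever the releases page shows you, and both function names are made up for this post.

```shell
# Extract "1.9.1" from a line like "Docker version 1.9.1, build a34a1d5".
parse_docker_version() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when version $1 is strictly older than version $2 (needs GNU sort -V).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage (replace the placeholder with the latest release number):
#   local_version=$(parse_docker_version "$(docker -v)")
#   version_lt "$local_version" "1.10.0" && echo "time to upgrade"
```

A plain string comparison would get this wrong (it sorts 1.10 before 1.9), which is why the sketch leans on version sort.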
Then the first thing to check is whether I can manually talk to the package repository, whether it's archive.ubuntu.com or the Debian one. And it seems that traffic travels between the container and the destination just fine.
OK, that doesn't make much sense. Let's start some debugging and look at the docker0 interface:
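One way to do that manual check from inside the container without installing anything is sketched below. It assumes the image ships bash (for its /dev/tcp feature) and the coreutils `timeout` command; `tcp_reachable` is a name I made up, not a standard tool.

```shell
# Succeed if a TCP connection to host $1, port $2 opens within $3 seconds (default 5).
# Assumes bash (for /dev/tcp) and coreutils timeout are installed.
tcp_reachable() {
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage, inside the container:
#   tcp_reachable archive.ubuntu.com 80 && echo "repository is reachable"
```

This only proves that a TCP handshake completes. The package download can still hang afterwards, which is exactly the situation described here.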
~|⇒ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether 50:7b:9d:5e:74:f4 brd ff:ff:ff:ff:ff:ff
3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether dc:53:60:49:4a:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.0.157/24 brd 192.168.0.255 scope global dynamic wlp3s0
valid_lft 599467sec preferred_lft 599467sec
inet6 fe80::de53:60ff:fe49:4a0d/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
Wait, its state is DOWN? That's because no container is running, silly. Start one:
~|⇒ docker run -t -i ubuntu:14.04 /bin/bash
Status of the docker0 interface now:
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:7aff:fee9:5c6b/64 scope link
valid_lft forever preferred_lft forever
6: veth2415ecd@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default
link/ether 6e:67:6b:e7:70:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::6c67:6bff:fee7:700a/64 scope link
valid_lft forever preferred_lft forever
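If you want to check the bridge state from a script instead of reading the whole listing, something like this should do. It is a rough sketch that just scans `ip a` style text for the word after `state`; `iface_state` is a hypothetical helper name.

```shell
# Read "ip a" style output on stdin and print the state (UP, DOWN, ...) of interface $1.
iface_state() {
  awk -v name="$1:" '$2 == name {
    for (i = 3; i < NF; i++) if ($i == "state") { print $(i + 1); exit }
  }'
}

# Usage:
#   ip a | iface_state docker0
```

With no container attached it prints DOWN, and once a container is running it prints UP, matching the two listings above.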
So the docker0 interface is up and running, and apt-get update still fails. Hmm, let's force IPv4:
apt-get update -o Acquire::ForceIPv4=true
Still fails...
What if the DNS we use is problematic? OK, let's test that: run the container and make it use different DNS servers:
docker run --dns 8.8.8.8 --dns 8.8.4.4 -t -i ubuntu:14.04 /bin/bash
No luck, the update is still stuck at 0%.
OK, at this point, if you are using Debian or Ubuntu, you have to make sure the docker0 interface is not being managed by NetworkManager :)
Let's see if NetworkManager is handling the docker0 interface:
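To rule out the possibility that the --dns flags never reached the container, you can look at the resolver config it actually got. A tiny sketch, assuming the usual resolv.conf layout (`list_nameservers` is a made-up name):

```shell
# Print the nameserver addresses found in resolv.conf style text on stdin.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Usage:
#   docker run --rm --dns 8.8.8.8 --dns 8.8.4.4 ubuntu:14.04 cat /etc/resolv.conf | list_nameservers
```

If the two addresses you passed show up and the update still hangs, DNS is probably not the culprit.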
~|⇒ nmcli dev status
DEVICE TYPE STATE CONNECTION
docker0 bridge connected docker0
wlp3s0 wifi connected dlink-39B4
enp0s25 ethernet unavailable --
lo loopback unmanaged --
Bad NetworkManager, Bad!
http://support.qacafe.com/knowledge-base/how-do-i-prevent-network-manager-from-controlling-an-interface/
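The same check can be scripted. The sketch below reads `nmcli dev status` text and assumes the column layout shown above (DEVICE, TYPE, STATE, CONNECTION); both function names are hypothetical.

```shell
# Read "nmcli dev status" output on stdin and print the first word of the
# STATE column for device $1.
nm_device_state() {
  awk -v dev="$1" '$1 == dev { print $3 }'
}

# Succeed if the device is listed in any state other than "unmanaged".
nm_manages() {
  state=$(nm_device_state "$1")
  [ -n "$state" ] && [ "$state" != "unmanaged" ]
}

# Usage:
#   nmcli dev status | nm_manages docker0 && echo "NetworkManager is handling docker0"
```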
So let's prevent this. Get the MAC address of the docker0 interface:
~|⇒ ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:7a:e9:5c:6b
.
.
.
.
~|⇒ sudo emacs /etc/NetworkManager/NetworkManager.conf
Add the following to the end of the file:
[keyfile]
unmanaged-devices=mac:02:42:7a:e9:5c:6b
Stop and start NetworkManager:
sudo service network-manager stop
sudo service network-manager start
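If you prefer not to copy the MAC address by hand, the stanza can be generated. This is only a sketch: `unmanaged_stanza` is a name I made up, and blindly appending is only safe if your NetworkManager.conf has no [keyfile] section yet (otherwise merge the line into the existing section by hand).

```shell
# Print the [keyfile] stanza that tells NetworkManager to ignore the device with MAC $1.
unmanaged_stanza() {
  printf '[keyfile]\nunmanaged-devices=mac:%s\n' "$1"
}

# Usage (the kernel exposes the MAC under /sys/class/net):
#   unmanaged_stanza "$(cat /sys/class/net/docker0/address)" |
#     sudo tee -a /etc/NetworkManager/NetworkManager.conf
```

After restarting NetworkManager, `nmcli dev status` should list docker0 as unmanaged.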
Alright, now is another good time to check whether the container can successfully run apt-get update. If the issue persists, we have a few more tricks.
Let's launch the container with a flag that makes it share the host's network stack, bypassing the docker0 bridge:
docker run --net=host -t -i ubuntu:14.04 /bin/bash
Hmm, no luck?
OK, you know what, let's just try to reset everything:
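# Run as root. Careful: this flushes ALL NAT rules, not just Docker's.
pkill docker
iptables -t nat -F
/etc/init.d/networking restart
# A bridge has to be down before brctl will delete it
ip link set docker0 down
brctl delbr docker0
# On Docker 1.8 and later, "docker daemon" replaces the deprecated "docker -d"
docker -d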
So yeah, if you couldn't resolve the problem up to this point, I would recommend taking down your firewall for a brief moment to test without it. Before doing that, you should of course look over your iptables rules first. In my case, even though iptables had special rules to allow Docker connectivity (that is why the container had access to the outside world), there was a rule for nginx that clashed with a container rule...
Anyway, I wrote up these steps to help others; it is quite possible that you solved your issue at some step before the end :)
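For looking over the rules, a couple of filters over `iptables-save` output can narrow things down. This is a sketch with made-up helper names, and it will not find a clash for you; it only shortens the list you have to read.

```shell
# Read "iptables-save" output on stdin and keep only rules that drop or reject traffic.
blocking_rules() {
  grep -E -- '-j (DROP|REJECT)' || true
}

# Read the same kind of input and keep only rules that mention Docker.
docker_rules() {
  grep -E -- 'docker0|DOCKER' || true
}

# Usage:
#   sudo iptables-save | blocking_rules
#   sudo iptables-save | docker_rules
```

Comparing the two lists is a reasonable starting point for spotting a rule from another service that overlaps with the Docker ones.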
PS: the inline markdown stubbornly fails to highlight MAC addresses properly :) Sorry about that!
Happy Hacking,
Emir