Byzantine connectivity issues with docker & package managers

Good day nerds :)

I decided to do a write-up about Docker and some weird connectivity issues that I had, so maybe it will be useful for others as well. In this piece I am going to walk through the debugging process rather than give a direct solution; unfortunately, there is no single solution to this kind of problem in general...

So the anatomy of the problem:

  • Host machine has connectivity to outside world and DNS resolves just fine.
  • Container on the host machine has connectivity, DNS resolves just fine.
  • You try to update or fetch a package via apt-get or yum, connection freezes.

Before debugging the problem, make sure that you are running the latest Docker version available to you. Compare your local version with the latest release:

~|⇒ docker -v
Docker version 1.9.1, build a34a1d5  

https://github.com/docker/docker/releases
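Eyeballing two version strings is easy to get wrong: 1.9.1 is older than 1.10.0, even though it sorts later as plain text. Here is a minimal sketch of the comparison, assuming GNU sort with -V is available. version_lt is a hypothetical helper name, and the 1.10.0 in the example is only a placeholder for whatever you read off the releases page.

```shell
# version_lt A B: succeeds when version A is strictly older than version B.
# Assumes GNU coreutils `sort -V` (version sort).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example (1.10.0 is a placeholder for the latest release you found):
#   local_version=$(docker -v | sed -E 's/^Docker version ([0-9.]+).*/\1/')
#   version_lt "$local_version" 1.10.0 && echo "docker is out of date"
```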

Then the first thing to check is whether I can manually talk to the package repository, whether it's archive.ubuntu.com or a Debian mirror. And it seems that the signal travels between the container and the destination just fine.
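A fresh container usually has neither curl nor telnet, and you cannot install them while apt-get is stuck. Here is a minimal sketch of the manual check using only bash, assuming bash was built with /dev/tcp support and that coreutils timeout is present. can_connect is a hypothetical helper name.

```shell
# can_connect HOST PORT: succeeds if a TCP connection opens within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device, so no extra packages are needed.
can_connect() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run inside the container:
#   can_connect archive.ubuntu.com 80 && echo "repository is reachable"
```

A successful connection only proves that the TCP handshake works. It does not prove that a full download will get through, which is exactly the confusing part of this problem.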

OK, that doesn't make much sense. Let's start some debugging: show me some information on the docker0 interface.

~|⇒ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default  
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s25: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000  
    link/ether 50:7b:9d:5e:74:f4 brd ff:ff:ff:ff:ff:ff
3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000  
    link/ether dc:53:60:49:4a:0d brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.157/24 brd 192.168.0.255 scope global dynamic wlp3s0
       valid_lft 599467sec preferred_lft 599467sec
    inet6 fe80::de53:60ff:fe49:4a0d/64 scope link 
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default  
    link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

Wait, it's in state DOWN? Your container is not running, silly.

~|⇒ docker run -t -i ubuntu:14.04 /bin/bash

Status of the docker0 interface:

4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default  
    link/ether 02:42:7a:e9:5c:6b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:7aff:fee9:5c6b/64 scope link 
       valid_lft forever preferred_lft forever
6: veth2415ecd@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default  
    link/ether 6e:67:6b:e7:70:0a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::6c67:6bff:fee7:700a/64 scope link 
       valid_lft forever preferred_lft forever
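The useful bit of all that output is the word after "state". Here is a small sketch that pulls it out, assuming the one-line format of iproute2's `ip -o link show`. link_state is a hypothetical helper name.

```shell
# link_state: read `ip` output on stdin and print the first operational state
# it finds (the word following "state"), e.g. UP or DOWN.
link_state() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "state") { print $(i + 1); exit } }'
}

# Example on the host:
#   ip -o link show dev docker0 | link_state
```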

So the docker0 interface is up and running, and apt-get update still fails. Hmm, let's force the use of IPv4:

apt-get update -o Acquire::ForceIPv4=true  

Still fails...

What if the DNS server we use is problematic? OK, let's test that: let's run the container and make it use different DNS servers:

docker run --dns 8.8.8.8 --dns 8.8.4.4 -t -i ubuntu:14.04 /bin/bash  

No luck, we are still getting stuck at 0% on update...

OK, at this point, if you are using Debian or Ubuntu, you have to make sure the docker0 interface is not being handled by NetworkManager :)

Let's see if the NetworkManager is handling the docker0 interface:

~|⇒ nmcli dev status
DEVICE   TYPE      STATE        CONNECTION  
docker0  bridge    connected    docker0  
wlp3s0   wifi      connected    dlink-39B4  
enp0s25  ethernet  unavailable  --  
lo       loopback  unmanaged    --  

Bad NetworkManager, Bad!
http://support.qacafe.com/knowledge-base/how-do-i-prevent-network-manager-from-controlling-an-interface/
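If you want to script this check, nmcli has a terse, colon-separated output mode. Here is a small sketch, assuming `nmcli -t -f DEVICE,STATE dev status` prints lines like docker0:connected. nm_state_of is a hypothetical helper name.

```shell
# nm_state_of DEVICE: read terse nmcli output ("DEVICE:STATE" per line) on
# stdin and print the state NetworkManager reports for DEVICE.
nm_state_of() {
  awk -F: -v dev="$1" '$1 == dev { print $2; exit }'
}

# Example on the host; anything other than "unmanaged" means
# NetworkManager is touching the bridge:
#   nmcli -t -f DEVICE,STATE dev status | nm_state_of docker0
```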

So let's prevent this. Get the MAC address of the docker0 interface:

~|⇒ ifconfig

docker0 Link encap:Ethernet HWaddr 02:42:7a:e9:5c:6b
.
.
.
.

~|⇒ sudo emacs /etc/NetworkManager/NetworkManager.conf

Add the following to the end of the file:

[keyfile]
unmanaged-devices=mac:02:42:7a:e9:5c:6b  
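To avoid copying the MAC address by hand, you can generate the stanza. This is only a sketch: unmanaged_stanza is a hypothetical helper name, and it assumes the kernel exposes the address at /sys/class/net/docker0/address.

```shell
# unmanaged_stanza MAC: print the NetworkManager.conf section that tells
# NetworkManager to ignore the device with that MAC address.
unmanaged_stanza() {
  printf '[keyfile]\nunmanaged-devices=mac:%s\n' "$1"
}

# Example on the host (review the output before adding it to the file):
#   unmanaged_stanza "$(cat /sys/class/net/docker0/address)"
```

If your NetworkManager.conf already has a [keyfile] section, add only the unmanaged-devices line to it instead of appending a second section.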

Stop & start NetworkManager

sudo service network-manager stop

sudo service network-manager start

Alright, now is another good time to check whether the container can successfully perform apt-get update. If the issue continues, we have a few more tricks.

Let's launch the container with a flag that puts it directly on the host's network stack, bypassing the docker0 bridge:

docker run --net=host -t -i ubuntu:14.04 /bin/bash  

Hmm no luck?

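OK, you know what, let's just try to reset everything. Run these as root. Note that this flushes every NAT rule on the host, not only Docker's, and that the bridge has to be brought down before brctl will delete it:

pkill docker
iptables -t nat -F
/etc/init.d/networking restart
ip link set docker0 down
brctl delbr docker0
docker -d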

So yeah, if you couldn't resolve the problem up to this point, I would recommend taking down your firewall for a brief moment to test without it. Before doing that, you should of course look over your iptables rules first. In my case, even though iptables had special rules to allow Docker connectivity (that is why the container had access to the outside world), there was a rule for nginx that clashed with a container rule...
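Reading a full ruleset by eye is tedious, so here is a small sketch for narrowing it down. rules_matching is a hypothetical helper name, and the patterns in the examples are guesses at what is relevant on your machine (the default 172.17.x.x bridge subnet, the usual web server ports), not the exact rules from my setup.

```shell
# rules_matching PATTERN: read `iptables-save` output on stdin and print the
# rules matching the extended regex PATTERN, keeping the "*table" header
# lines so you can tell which table each rule belongs to.
rules_matching() {
  grep -E -- "^\*|$1"
}

# Examples on the host (as root):
#   iptables-save > /root/iptables.backup        # keep a copy before testing
#   iptables-save | rules_matching 'docker0|172\.17\.'
#   iptables-save | rules_matching '--dport (80|443)'
```

With the backup in place, `iptables-restore < /root/iptables.backup` puts the rules back after the test.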

Anyway, I wrote up these steps to help others; it is quite possible that you solved your issue at some step before the end :)

PS: the inline markdown insists on highlighting MAC addresses, and fails at it :) Sorry about that!

Happy Hacking,

Emir