Advertising container hostname using avahi-daemon in ResinOS v2


#1

I’ve been trying to get this working for a while now. In ResinOS v1 it works fine by changing the hostname and starting the local avahi-daemon service.

In ResinOS v2, there is already an avahi-daemon running on the host, so I can’t run the local one.

I am running on systemd images, on intel NUC and QEMU:
NUC = Resin OS 2.0.0-rc4.rev1
QEMU = Resin OS 2.0.0-rc5.rev1

I’ve tried changing config.json to set my hostname there. I can’t ping my $(hostname).local from the device or from the network.

After setting DBUS address to the host’s DBUS socket, I’ve tried using dbus-send:

dbus-send --system --print-reply --reply-timeout=2000 --type=method_call --dest=org.freedesktop.Avahi / org.freedesktop.Avahi.Server.SetHostName string:"$(hostname)"

This seems to result in the avahi-daemon process on the host crashing and I can’t seem to recover it using systemctl.

In either case, I can’t ping my $(hostname).local from the device or from the network.

Any ideas?

– ab1


#2

Hey, I’ve been testing this on a Raspberry Pi 3, and it works without a hitch (line breaks added for readability, and I guess you have DBUS_SYSTEM_BUS_ADDRESS defined somewhere else, but otherwise it seems to be the same).

DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket \
  dbus-send \
  --system \
  --print-reply \
  --reply-timeout=2000 \
  --type=method_call \
  --dest=org.freedesktop.Avahi \
  / \
  org.freedesktop.Avahi.Server.SetHostName \
  string:"${newhostname}"

This works just fine and pinging works too. Checking it now in QEMU, just wanted to give you a heads up.
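
If you would rather drive this from a script than from the shell, the same call can be wrapped with the Python standard library. This is only a minimal sketch: it assumes dbus-send is installed in the container and that the host socket is mounted at the path used above, and the function names are made up for illustration.

```python
import os
import subprocess

# Host D-Bus socket, same path as in the shell command above.
HOST_BUS = "unix:path=/host/run/dbus/system_bus_socket"


def build_set_hostname_cmd(new_hostname):
    """Return the dbus-send argv that asks the host Avahi to change its name."""
    return [
        "dbus-send",
        "--system",
        "--print-reply",
        "--reply-timeout=2000",
        "--type=method_call",
        "--dest=org.freedesktop.Avahi",
        "/",
        "org.freedesktop.Avahi.Server.SetHostName",
        "string:" + new_hostname,
    ]


def set_avahi_hostname(new_hostname):
    """Run dbus-send against the host bus; raises CalledProcessError on failure."""
    env = dict(os.environ, DBUS_SYSTEM_BUS_ADDRESS=HOST_BUS)
    subprocess.check_call(build_set_hostname_cmd(new_hostname), env=env)
```

Calling set_avahi_hostname(socket.gethostname()) from the start script would then mirror the shell one-liner.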


#3

I am trying now (again) by building a separate v2 image without avahi-*. I think there is a race condition when there is avahi-daemon trying to start in the container.


#4

I was thinking about it, and one way you could have 1.x and 2.x devices with the same application is to 1) use systemctl, 2) run systemctl disable avahi-daemon in the Dockerfile, and 3) run systemctl start avahi-daemon in the start script on the 1.x devices only. I haven’t had a chance to test it out fully, but it should be on the right path, and it would get rid of the race condition if there is one.

Nevertheless, a separate 2.x image would probably be a better / more reliable way to test these things out.
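
A rough sketch of what step 3 could look like if the start script hands off to Python. Untested on real devices: it assumes the host OS version is available to the container as a string like "Resin OS 2.0.0-rc4.rev1", and reading it from a RESIN_HOST_OS_VERSION environment variable is an assumption, as are the function names.

```python
import os
import re
import subprocess


def needs_local_avahi(os_version):
    """True when the host OS string looks like a 1.x release.

    Assumes strings shaped like "Resin OS 2.0.0-rc4.rev1". Anything
    unparseable is treated as 2.x so we never fight the host daemon.
    """
    match = re.search(r"(\d+)\.\d+", os_version or "")
    return bool(match) and int(match.group(1)) < 2


def maybe_start_avahi():
    # RESIN_HOST_OS_VERSION is an assumption; substitute whatever your
    # image uses to tell the two OS generations apart.
    if needs_local_avahi(os.environ.get("RESIN_HOST_OS_VERSION")):
        subprocess.check_call(["systemctl", "start", "avahi-daemon"])
```

Defaulting to "do not start" when the version cannot be parsed is deliberate, since a second daemon on 2.x is the case that causes trouble.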


#5

I like your first idea, it feels cleaner. Trying it out now…


#6

Just an update - with Resin OS 2.0.0-rc4.rev1 on Intel NUC and avahi-daemon disabled in the Dockerfile, avahi-resolve works but ping (from within the container) hangs.

root@blackbox:~# avahi-resolve -4 --name blackbox.local
blackbox.local  172.17.0.1

Not sure if ping works from outside the container (i.e. from the LAN), as I can’t test that.

– ab1


#7

Yeah, we are looking at that. The hostOS correctly resolves and communicates with .local addresses, but from within the container it does not do that.


#8

Does it resolve externally (i.e. pinging the .local name from another node on the same network segment as the host)?


#9

What I’ve checked so far:

  • externally pinging the resin.io device through its .local address resolves it totally fine (both in the case of the default hostname, and the dbus-adjusted hostname)
  • pinging outside .local addresses from the hostOS is totally fine (ie. avahi works as intended in the host)
  • Edit: I originally wrote that pinging from the container itself cannot resolve .local addresses. In fact, pinging .local addresses from the container should work after installing libnss-mdns, which also sets up the relevant files (mainly /etc/nsswitch.conf). There might be issues with avahi-daemon running both in the host and the container, but it needs testing for the specific use cases.

Thus the issue, imho, is “how to use nss from the container without the avahi-daemon, so as not to conflict with the host avahi”.
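
To check whether a given container is set up for this, one can look at the hosts line in /etc/nsswitch.conf. On Debian, libnss-mdns typically leaves something like "hosts: files mdns4_minimal [NOTFOUND=return] dns" there. A small sketch of that check follows; the helper names are hypothetical and it only inspects the file, it does not test actual resolution.

```python
def hosts_line_uses_mdns(nsswitch_text):
    """Return True if the 'hosts:' line lists an mdns* source."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            sources = line[len("hosts:"):].split()
            return any(s.startswith("mdns") for s in sources)
    return False


def container_resolves_local(path="/etc/nsswitch.conf"):
    """Read nsswitch.conf and report whether mDNS lookups are configured."""
    with open(path) as handle:
        return hosts_line_uses_mdns(handle.read())
```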


#10

OK, great. In that case it doesn’t really affect me, since the discovery happens from outside the host/container.


#11

Yeah, if you are discovering from outside of the container, it should all work. :checkered_flag: It generally helps to test things the way they are going to be used, to avoid spurious issues.

A sidenote: I’ve tested within the container on RPi3 resinOS 2.0.0+rev2 with the Debian image, and installing libnss-mdns will set you up just fine for pinging (and thus connecting to, I guess) .local addresses. It installs and starts avahi-daemon, but it doesn’t seem to cause any conflict. Will update my list of things tried above :slight_smile:


#14

Unfortunately I don’t have a physical device on my local network to test external resolution and I am not trusting QEMU hosted devices. Hence, trying by “proxy” to test internally.


#16

OK, verified external avahi name resolution with QEMU now works. Thanks for your help, I think this is the most elegant solution keeping the same Dockerfile between two ResinOS major versions.


#17

On a related note, since we can’t run our own instance of avahi-daemon anymore, is there any easy way to advertise new services via avahi?

I’ve been trying to create a new service via d-bus to the host avahi, but it seems like the whole EntryGroup interface of methods can’t be seen?

root@3aab805-3aab805:~# dbus-send \
>   --system \
>   --print-reply \
>   --reply-timeout=2000 \
>   --type=method_call \
>   --dest=org.freedesktop.Avahi \
>   / \
>   org.freedesktop.Avahi.Server.EntryGroupNew
method return sender=:1.2 -> dest=:1.4 reply_serial=2
   object path "/Client0/EntryGroup1"

root@3aab805-3aab805:~# DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket \
>   dbus-send \
>   --system \
>   --print-reply \
>   --reply-timeout=2000 \
>   --type=method_call \
>   --dest=org.freedesktop.Avahi \
>   /Client0/EntryGroup1 \
>   org.freedesktop.Avahi.EntryGroup.AddService \
> int32:-1 int32:-1 uint32:0 string:'HTTP Service' string:'_http._tcp' string:'' string:'' uint16:80 array:byte:""
Error org.freedesktop.DBus.Error.UnknownObject: Method "AddService" with signature "iiussssqay" on interface "org.freedesktop.Avahi.EntryGroup" doesn't exist

I thought it might just be me getting the method argument types wrong, but even a method that has only output arguments is reported as nonexistent.

root@3aab805-3aab805:~# DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket \
>   dbus-send \
>   --system \
>   --print-reply \
>   --reply-timeout=2000 \
>   --type=method_call \
>   --dest=org.freedesktop.Avahi \
>    /Client0/EntryGroup1 \
>   org.freedesktop.Avahi.EntryGroup.GetState
Error org.freedesktop.DBus.Error.UnknownObject: Method "GetState" with signature "" on interface "org.freedesktop.Avahi.EntryGroup" doesn't exist

Anybody have any luck with this?


#18

Here’s an example using pydbus. I’m sure it can be made simpler as well (though dbus-send can be somewhat limited in some cases). This works for sure and is modular: you can just call the Python scripts from your start script (like start.sh), and the rest of your code can be whatever you need.
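
For reference, here is a stripped-down sketch of the idea (not the code from the example itself). As far as I understand Avahi's D-Bus API, an entry group belongs to the connection that created it, and every dbus-send invocation is a separate connection, which would explain why /Client0/EntryGroup1 is gone by the time the second command runs. Doing everything over one pydbus connection avoids that. Note also that AddService takes its TXT records as an array of byte arrays (signature iiussssqaay). The function names below are made up, and DBUS_SYSTEM_BUS_ADDRESS is assumed to point at the host socket.

```python
def service_args(name, service_type, port):
    """Arguments for EntryGroup.AddService (D-Bus signature iiussssqaay)."""
    return (
        -1,            # interface: AVAHI_IF_UNSPEC
        -1,            # protocol: AVAHI_PROTO_UNSPEC
        0,             # flags
        name,
        service_type,
        "",            # domain: let Avahi pick the default
        "",            # host: the daemon's own hostname
        port,
        [],            # TXT records: none
    )


def publish(name, service_type, port):
    """Register a service on the host Avahi and return the entry group.

    The service stays advertised only while this process (and its
    D-Bus connection) stays alive.
    """
    from pydbus import SystemBus  # third-party, imported lazily

    bus = SystemBus()
    server = bus.get("org.freedesktop.Avahi", "/")
    group = bus.get("org.freedesktop.Avahi", server.EntryGroupNew())
    group.AddService(*service_args(name, service_type, port))
    group.Commit()
    return group
```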


#19

Thanks @imrehg for the example implementation!

I finally managed to get it to work in my image, but after much hair-pulling.

For some reason, the registered services disappear when the Python script finishes - this didn’t occur in the example image, because the running Flask server (app.run(host='0.0.0.0', port=80)) keeps the script running indefinitely.

However, when I used the sample script in the readme, I couldn’t see my service being registered at all:

from avahi.service import AvahiService
avahiservice = AvahiService("resin webserver", "_http._tcp", 80)

This seems weird to me, because from what I understand from service.py, after the EntryGroup is committed, it should remain registered?

Example myserver.py to showcase this:

import time
from avahi.service import AvahiService

avahiservice = AvahiService("resin webserver", "_http._tcp", 80)
print("Now you see it")
time.sleep(5)
print("And then it disappears")

On my laptop:

$ dns-sd -B _http._tcp .
Browsing for _http._tcp
DATE: ---Fri 05 May 2017---
10:11:36.055  ...STARTING...
Timestamp     A/R    Flags  if Domain               Service Type         Instance Name
10:13:30.284  Add        2   4 local.               _http._tcp.          resin webserver
10:13:35.653  Rmv        0   4 local.               _http._tcp.          resin webserver

My current workaround is to have an indefinite sleep loop, and to background the Python script.

import time
from avahi.service import AvahiService

avahiservice = AvahiService("resin webserver", "_http._tcp", 80)
while True:
    time.sleep(5)

In my startup.sh:

python myserver.py &

#20

Hey, no, when the application that sets the service exits, the service is removed too; that sort of makes sense from the dbus/avahi point of view. I’m guessing Python cleans it up when the script exits.

So yeah, have to keep the service running. In that Flask example it made complete sense, because one should only advertise the service as long as Flask is running.
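
If it helps in the meantime, a slightly more polite variant of the sleep loop is to wait on an event and stop on SIGTERM, so the advertisement goes away cleanly when the container is stopped. A minimal sketch with made-up helper names; register is any callable that creates the service, for example a lambda wrapping the AvahiService call from the readme.

```python
import signal
import threading


def advertise_until_stopped(register, stop_event):
    """Call register(), then block until stop_event is set.

    The object returned by register() stays referenced here, so the
    D-Bus connection (and the advertisement) lives as long as we do.
    """
    service = register()
    while not stop_event.is_set():
        stop_event.wait(1.0)  # short waits keep signal handling responsive
    return service


def install_stop_handlers(stop_event):
    """Set stop_event on SIGTERM/SIGINT so the process exits cleanly."""
    for signum in (signal.SIGTERM, signal.SIGINT):
        signal.signal(signum, lambda *_: stop_event.set())
```

In a script you would create a threading.Event(), pass it to install_stop_handlers, and then call advertise_until_stopped with your registration callable and the same event.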

Will add some functionality to the scripts to have this built in easier and add more info to the readme too :thumbsup:, but in the meantime your solution is totally fine. It’s just a script anyways. dbus and avahi are peculiar beasts.

What kind of server are you running yourself? (out of curiosity)


#21

Hmm, interesting, because I’m wondering how would dbus know how to clean up what a script does, like, for example, if the script did EntryGroup.UpdateServiceTxt.

In any case, things are working well for me - I’m ready for resinOS 2, and looking forward to when resinHUP can do the upgrade!

“What kind of server are you running yourself? (out of curiosity)”

The server supports a point-of-sale system, so service discovery is required for the POS registers to find it and push completed transactions to it.