Device does not reconnect to resin.io after network loss and recovery


#1

Hi,

We have Intel NUC devices connected by an Ethernet cable.
If one loses its internet connection due to a network issue, the device stays offline in resin.io.
We reproduced this by pulling the ethernet cable and then reinserting it after some time (several hours later).
Once we power cycle the device it connects just fine.

Shouldn’t devices (re)try to send updates after some time?


#5

Hi. Can you try to reproduce this and, when it happens, fetch the NetworkManager logs and send them over to us?

journalctl --no-pager -u NetworkManager


#7

I tried it two times:
https://pastebin.com/SUQgvrW1
https://pastebin.com/awq5Xbry
If you need more info, please let me know.


#14

Hi! Thanks for the logs. Could you confirm what resinOS version you’re using?


#16

Hi,
We’re using Resin OS 2.12.7+rev2.


#20

Thank you for confirming your OS version. We recently worked on a similar reported issue. Would you be able to check a possible fix using a custom image?


#21

Yes we can test using a custom image. Let us know where we can find it.


#22

Hi. We will release a new OS version for the Intel NUC this week. This new release will have some fixes regarding the DHCP timeout, which we suspect will also fix your problem. So stay tuned for the new release later this week.


#23

Good to know, thanks! We’ll let you know if this fixed the issue.


#24

@floion We found these releases, but don’t see them in the dropdown on the “add device” screen in resin. When will these releases be made public? Can we test them earlier?


#25

They will be available in the following days. We discovered a bug while testing the version in staging, so we need to do yet another release. Stay tuned for deployment to production.


#26

We updated to Resin OS 2.13.5+rev1. Here are the logs:

root@a8f7528:~# journalctl --no-pager -u NetworkManager
-- Logs begin at Tue 2018-07-17 00:50:11 UTC, end at Tue 2018-07-17 06:49:48 UTC. --
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0866] device (enp1s0): carrier: link connected
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0876] device (enp1s0): state change: unavailable -> disconnected (reason ‘carrier-changed’, sys-iface-state: ‘managed’)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0897] policy: auto-activating connection ‘Wired connection 1’ (ad9ec2cc-1f99-3794-9d1f-90ba4b143746)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0920] device (enp1s0): Activation: starting connection ‘Wired connection 1’ (ad9ec2cc-1f99-3794-9d1f-90ba4b143746)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0924] device (enp1s0): state change: disconnected -> prepare (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0937] manager: NetworkManager state is now CONNECTING
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0946] device (enp1s0): state change: prepare -> config (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0960] device (enp1s0): state change: config -> ip-config (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:48 a8f7528 NetworkManager[643]: [1531810008.0968] dhcp4 (enp1s0): activation: beginning transaction (no timeout)
Jul 17 06:46:53 a8f7528 NetworkManager[643]: [1531810013.3214] dhcp6 (enp1s0): activation: beginning transaction (timeout in 45 seconds)
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4646] dhcp4 (enp1s0): address 192.168.1.134
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4647] dhcp4 (enp1s0): plen 24
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4647] dhcp4 (enp1s0): expires in 86400 seconds
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4647] dhcp4 (enp1s0): nameserver ‘10.31.10.51’
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4648] dhcp4 (enp1s0): nameserver ‘10.97.32.17’
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4648] dhcp4 (enp1s0): nameserver ‘10.31.10.50’
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4648] dhcp4 (enp1s0): domain name ‘redacted
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4649] dhcp4 (enp1s0): gateway 192.168.1.1
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4679] dhcp4 (enp1s0): state changed unknown -> bound
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4702] device (enp1s0): state change: ip-config -> ip-check (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4737] device (enp1s0): state change: ip-check -> secondaries (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4757] device (enp1s0): state change: secondaries -> activated (reason ‘none’, sys-iface-state: ‘managed’)
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4773] manager: NetworkManager state is now CONNECTED_LOCAL
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4818] manager: NetworkManager state is now CONNECTED_SITE
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4821] policy: set ‘Wired connection 1’ (enp1s0) as default for IPv4 routing and DNS
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4826] dns-mgr: Writing DNS information to /sbin/resolvconf
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4976] device (enp1s0): Activation: successful, device activated.
Jul 17 06:46:54 a8f7528 NetworkManager[643]: [1531810014.4994] manager: NetworkManager state is now CONNECTED_GLOBAL
Jul 17 06:47:38 a8f7528 NetworkManager[643]: [1531810058.3075] dhcp6 (enp1s0): request timed out
Jul 17 06:47:38 a8f7528 NetworkManager[643]: [1531810058.3076] dhcp6 (enp1s0): state changed unknown -> timeout
Jul 17 06:47:38 a8f7528 NetworkManager[643]: [1531810058.3077] dhcp6 (enp1s0): canceled DHCP transaction
Jul 17 06:47:38 a8f7528 NetworkManager[643]: [1531810058.3077] dhcp6 (enp1s0): state changed timeout -> done
Jul 17 06:48:36 a8f7528 NetworkManager[643]: [1531810116.9740] manager: (resin-vpn): new Tun device (/org/freedesktop/NetworkManager/Devices/10)

These are the latest logs; this seems to work just fine. Thanks for the quick response!
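
For anyone who wants to check a similar log without reading every line, here is a minimal Python sketch that pulls out the "NetworkManager state is now ..." entries and reports whether the last one is CONNECTED_GLOBAL. The function names (`manager_states`, `reached_global`) are made up for illustration, and the pattern assumes the line layout seen in the log above; other NetworkManager versions may format these lines differently.

```python
import re

# Assumed line layout, taken from the log in this thread:
#   "... NetworkManager[643]: [1531810014.4994] manager: NetworkManager state is now CONNECTED_GLOBAL"
STATE_RE = re.compile(
    r"\[(\d+\.\d+)\]\s+manager: NetworkManager state is now (\w+)"
)


def manager_states(log_text):
    """Return (timestamp, state) pairs in the order they appear in the log."""
    return [(float(ts), state) for ts, state in STATE_RE.findall(log_text)]


def reached_global(log_text):
    """Return True if the last reported manager state is CONNECTED_GLOBAL."""
    states = manager_states(log_text)
    return bool(states) and states[-1][1] == "CONNECTED_GLOBAL"
```

To use it, capture the output of `journalctl --no-pager -u NetworkManager` as a string and pass it to `reached_global`. A device that stays stuck after the cable is reinserted would be expected to show no CONNECTED_GLOBAL entry after the "carrier: link connected" line, though that is an assumption about the failure mode rather than something confirmed in this thread.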