Strange behaviour IOT2040 (IOT2000 baseimage)


#1

I'm working on a multi-container app on the IOT2040 and see some strange behaviour when downloading multiple services on resinstaging. I have built 2 services (github.com/bas-prorail/resin-datacollector-edge), and after the successful build (unicorn) only one service (s7comm) downloads. It reaches approx. 25%, slows down and stops completely at 29% (with the error 'TLS handshake timeout'), then resets and starts at 0% again. The other one (datacollector-edge) shows no download progress at all; it logs 'TLS handshake timeout' and then restarts. The logs show only about 14 seconds between ‘Downloading’ and ‘Failed’ for each attempt.

27.04.18 12:44:52 (+0200) Failed to download image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720' due to '(HTTP code 500) server error - Get https://registry2.resinstaging.io/v2/v2/818daa5c8aeb81eaa1f06e78b0d57df8/manifests/sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720: Get https://api.resinstaging.io/auth/v1/token?account=d_1fed70a079b29cc347ddbbf73bccd5cc&scope=repository%3Av2%2F818daa5c8aeb81eaa1f06e78b0d57df8%3Apull&service=registry2.resinstaging.io: net/http: TLS handshake timeout '
27.04.18 12:44:56 (+0200) Downloading image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720'
27.04.18 12:45:10 (+0200) Failed to download image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720' due to '(HTTP code 500) server error - Get https://registry2.resinstaging.io/v2/v2/818daa5c8aeb81eaa1f06e78b0d57df8/manifests/sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720: Get https://api.resinstaging.io/auth/v1/token?account=d_1fed70a079b29cc347ddbbf73bccd5cc&scope=repository%3Av2%2F818daa5c8aeb81eaa1f06e78b0d57df8%3Apull&service=registry2.resinstaging.io: net/http: TLS handshake timeout '
27.04.18 12:45:14 (+0200) Downloading image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720'
27.04.18 12:45:28 (+0200) Failed to download image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720' due to '(HTTP code 500) server error - Get https://registry2.resinstaging.io/v2/v2/818daa5c8aeb81eaa1f06e78b0d57df8/manifests/sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720: Get https://api.resinstaging.io/auth/v1/token?account=d_1fed70a079b29cc347ddbbf73bccd5cc&scope=repository%3Av2%2F818daa5c8aeb81eaa1f06e78b0d57df8%3Apull&service=registry2.resinstaging.io: net/http: TLS handshake timeout '
27.04.18 12:45:34 (+0200) Downloading image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720'
27.04.18 12:45:49 (+0200) Failed to download image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720' due to '(HTTP code 500) server error - Get https://registry2.resinstaging.io/v2/v2/818daa5c8aeb81eaa1f06e78b0d57df8/manifests/sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720: Get https://api.resinstaging.io/auth/v1/token?account=d_1fed70a079b29cc347ddbbf73bccd5cc&scope=repository%3Av2%2F818daa5c8aeb81eaa1f06e78b0d57df8%3Apull&service=registry2.resinstaging.io: net/http: TLS handshake timeout '
27.04.18 12:45:55 (+0200) Downloading image 'registry2.resinstaging.io/v2/818daa5c8aeb81eaa1f06e78b0d57df8@sha256:078abbad982b3479fb976bad74c7ce339b0a62049c23e3a79c4825edbc099720'
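
The interval between each ‘Downloading’ line and the following ‘Failed’ line can be checked with a short script. This is only a minimal sketch in Python: it assumes the timestamp layout shown in the log above, the helper name `attempt_durations` is hypothetical (not part of any resin tooling), and the image references are abbreviated to '...' for readability.

```python
from datetime import datetime

# Timestamp layout assumed from the log above, e.g. "27.04.18 12:44:56"
STAMP_FORMAT = "%d.%m.%y %H:%M:%S"
STAMP_LENGTH = 17  # number of characters in "DD.MM.YY HH:MM:SS"


def attempt_durations(lines):
    """Return the seconds between each 'Downloading image' line
    and the next 'Failed to download' line (hypothetical helper)."""
    durations = []
    started = None
    for line in lines:
        stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        message = line[STAMP_LENGTH:]
        if "Downloading image" in message:
            started = stamp
        elif "Failed to download" in message and started is not None:
            durations.append(int((stamp - started).total_seconds()))
            started = None
    return durations


# Same timestamps as the log above; image names and errors abbreviated.
log = [
    "27.04.18 12:44:52 (+0200) Failed to download image '...'",
    "27.04.18 12:44:56 (+0200) Downloading image '...'",
    "27.04.18 12:45:10 (+0200) Failed to download image '...'",
    "27.04.18 12:45:14 (+0200) Downloading image '...'",
    "27.04.18 12:45:28 (+0200) Failed to download image '...'",
    "27.04.18 12:45:34 (+0200) Downloading image '...'",
    "27.04.18 12:45:49 (+0200) Failed to download image '...'",
    "27.04.18 12:45:55 (+0200) Downloading image '...'",
]

print(attempt_durations(log))  # [14, 14, 15]
```

Each attempt fails after 14 to 15 seconds, and a new attempt starts 4 to 6 seconds later.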

When I split the 2 services and deploy each on its own (one after the other) as a single ‘main’ service, they download and start as expected. A striking difference between ‘multi-service’ and ‘single service’: with a single service it takes about 1-2 minutes after ‘main – downloading - 0%’ appears before the download actually starts. I would expect ‘multi-service’ to behave the same way, but instead it restarts after about 14 seconds.

I’m working with resinstaging because the IOT2000 image on that platform supports multi-container. The OS is 2.12.5+rev2, with a fresh factory build on a uSD card (CL10, 16GB). I have also tried another uSD card, with the same result.
@imrehg Any suggestions on how to solve this?

Thank you, Bas


#2

@bas_thingshub, I'm not totally sure, but could you switch on Delta downloads in the configuration? That downloads things slightly differently, which may be enough to get things unblocked. It does sound strange, though. I’m also going to check how the IOT2000 release testing is going, so that it can become available on Production. Staging is not guaranteed to always work, since it’s the testing ground, and using it might cost more debugging time than is worthwhile.