New device in multicontainer application won't download/start images

support
raspberrypi3

#1

TL;DR Images are starting and stopping immediately, and other images are stuck on 0%. Device will not get first commit.

I created a new application in Resin and provisioned a new device by downloading the iso. This is a branch off of a working deployment, so the code is not the issue. The device shows up in the dashboard now, and I can restart it, shell in, etc. However, the services will not stay started, and some are stuck on 0% downloading. The application published successfully from a git push; it’s just struggling to start the first time.

I’m trying to make the case that we should get a paying plan, but I can’t do that if the devices won’t even deploy. I need help on this ASAP: there’s nothing I can do on my end to fix it, as it seems to be a supervisor/Resin API issue.

Resin: see https://dashboard.resin.io/devices/1fb106e2fe92775e3182e9f56e8429f7/summary for the exact device having issues.


#4

Great - now my dev deployment is doing the same after an update. All the images are starting then getting killed immediately by the supervisor - what is going on here?? My code deploys are downloading, then the supervisor loses its mind trying to start and stop all the services.

https://dashboard.resin.io/devices/597b2a18f64e87e2ec435c8e3a48316e/summary


#5

Hi @marcymarcy - could you grant support access to the affected devices from the device actions? And if possible, could you share your docker-compose.yml? We’ve seen a few cases where specific fields can confuse the supervisor. They’re usually easy to fix, but we need to pinpoint what’s wrong. Thanks!


#6

CRAP it’s all my deployments, even the ones I haven’t touched today! This is bad, all production is currently down. Here’s the link to a prod device that hasn’t been updated for over a week, and this started happening today:

https://dashboard.resin.io/devices/7d49b3316a8c87a8300ecb88f5e1c0a9/summary (support access granted)

And here’s the docker-compose.yml which hasn’t changed since last week and was working then:
(I had trouble with formatting and pasted it in the next reply)


#7

version: '2'
services:
  nginx:
    build: ./nginx
#    depends_on: 
#      - mitmpassive
#    ports:
#      - "80:80"
    network_mode: host
    volumes:
      - mitmoutput:/files
  
  frontend:
    build: ./frontend
    restart: always
    network_mode: host

  webhooks:
    build: ./webhook
    privileged: true
    restart: always
    command: ["webhook","-hooks","/opt/webhooks/hooks.json","-port","8080","-verbose", "-header", "Access-Control-Allow-Origin=*", "-header", "Access-Control-Allow-Methods=GET,PUT"]
    ports:
      - "12345:8080"
    volumes:
      - mitmoutput:/opt/mitmoutput
    labels:
      io.resin.features.balena-socket: '1'
    environment:
      internalport: 8080

  mitmpassive:
    image: quay.io/realeyes/mitmproxy-rpi3:4.0.3
    command: ["/usr/bin/mitmweb","-p","8888","--web-iface","127.0.0.1","--web-port","8080","--mode","transparent","--showhost","-w","+/swap/defaultcapture"]
#    command: ["/usr/bin/mitmweb","-p","8888","--web-iface","127.0.0.1","--web-port","8080","--mode","transparent","--showhost","-w","+/opt/mitmoutput/Capture"]
    network_mode: host
    environment:
      max_mem_in_kb: 450000
    mem_limit: 550m
    mem_reservation: 500m
#    healthcheck:
#      test: ["CMD-SHELL", "if [ $(free -m | grep Mem: | awk '{print $3}') -le ${max_mem_in_kb} ]; then exit 0; else exit 1; fi"]
#      start_period: 40s
    restart: unless-stopped
    volumes:
      - mitmoutput:/opt/mitmoutput
      - swap:/swap

#  mitmweb:
#    image: quay.io/realeyes/mitmproxy-rpi3:4.0.3
#    command: ["/usr/bin/mitmweb","-p","8888","--web-iface","127.0.0.1","--web-port","8080","--mode","transparent","--showhost","-w","+/swap/defaultcapture"]
#    command: ["/usr/bin/mitmweb","-p","8888","--web-iface","127.0.0.1","--web-port","8080","--mode","transparent","--showhost","-w","/opt/mitmoutput/defaultcapture"]
#    network_mode: host
#    environment:
#      max_mem_in_kb: 450000
#    mem_limit: 550m
#    mem_reservation: 500m
#    healthcheck:
#      test: ["CMD-SHELL", "if [ $(free -m | grep Mem: | awk '{print $3}') -le ${max_mem_in_kb} ]; then exit 0; else exit 1; fi"]
#      start_period: 40s
#    restart: unless-stopped
#    volumes:
#      - mitmoutput:/opt/mitmoutput
#      - swap:/swap
#    ports:
#      - "8888:8888"
#    expose:
#      - "8080"

  iptables:
    build: ./iptables
    privileged: true
    restart: "no"  # quoted so YAML does not parse it as boolean false
    network_mode: host

# File Parse script now run directly on webhooks service
#  fileparse:
#    build: ./fileparse
#    restart: no
#    network_mode: host
#    volumes:
#      - mitmoutput:/opt/mitmoutput:ro

#  file-manager:
#    build: ./file-access
#    command: ["node","--harmony","index.js","-p","8999","-d","/files"]
#    restart: always
#    volumes:
#      - mitmoutput:/files 
#    network_mode: host
#  landingpage:
#    build: ./landing-page
#    network_mode: host

volumes:
  mitmoutput:
  swap:

#9

This may be related to a new env variable I added to all deployments today - the supervisor container is throwing errors when trying to assign the env variable.


#13

@marcymarcy indeed I think it’s that - there’s an env var that seems to have an invalid value. Does it contain a newline? We’re working on adding better error handling for this, but you should be able to fix it by changing that env var value.


#14

It’s a really long var, as it’s the Base64 encoding of a cert file. How would I put in a variable that long correctly? Or better yet, how do I get variables in at build time instead of in the environment at runtime? The reason I’m adding this is that I don’t want to commit a cert to the repo, as that’s rather bad practice.


#15

There’s actually a fix that will be deployed soon that will prevent you from creating variables with newlines. Base64 is fine as long as it doesn’t contain newline characters. Otherwise you can encode the value as JSON, with URL percent-encoding, or in some similar format, and decode it in your app.
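As a rough illustration of the "encode without newlines, decode in your app" approach, here is a minimal Python sketch. The variable name `CERT_B64` and the stand-in certificate bytes are assumptions made for the example, not anything from the actual deployment. It relies on `base64.b64encode`, which emits a single line with no line breaks.

```python
import base64
import os


def encode_single_line(data: bytes) -> str:
    """Return Base64 text for `data` with no newline characters."""
    # b64encode never inserts line breaks (unlike base64.encodebytes).
    return base64.b64encode(data).decode("ascii")


def decode_from_env(name: str) -> bytes:
    """Read a Base64-encoded environment variable and return the raw bytes."""
    # validate=True rejects stray characters instead of silently skipping them.
    return base64.b64decode(os.environ[name], validate=True)


if __name__ == "__main__":
    # Stand-in bytes for illustration; in practice, read the real cert file.
    cert = (
        b"-----BEGIN CERTIFICATE-----\n"
        + b"A" * 200
        + b"\n-----END CERTIFICATE-----\n"
    )

    value = encode_single_line(cert)
    assert "\n" not in value  # safe to paste as a single-line variable value

    # CERT_B64 is a hypothetical variable name; on a device the value would
    # come from the dashboard rather than being set here.
    os.environ["CERT_B64"] = value
    assert decode_from_env("CERT_B64") == cert
```

The encoded value can be pasted into the dashboard as one line, and the service decodes it at startup.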

Build secrets are on our roadmap (they will be part of the resin push command in our CLI).


#16

Apparently printing Base64 output to the terminal introduces newlines - I recreated the value without newlines and reuploaded it, and the devices seem to be taking it now.
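One possible explanation, offered as an assumption rather than a diagnosis of this case, is line wrapping by the encoder: GNU `base64`, for instance, wraps its output at 76 columns unless given `-w 0`. The Python sketch below reproduces wrapped output with `base64.encodebytes` and shows one way to check for line breaks and collapse the value to a single line. The helper names are hypothetical.

```python
import base64


def has_line_break(value: str) -> bool:
    """True if the value contains a newline or carriage return."""
    return "\n" in value or "\r" in value


def to_single_line(wrapped: str) -> str:
    """Remove all whitespace, including line breaks, from Base64 text."""
    # Whitespace is not part of the Base64 alphabet, so dropping it
    # does not change the decoded bytes.
    return "".join(wrapped.split())


if __name__ == "__main__":
    payload = b"x" * 100

    # encodebytes wraps its output in lines and appends a trailing newline,
    # similar to what a wrapping command-line encoder produces.
    wrapped = base64.encodebytes(payload).decode("ascii")
    assert has_line_break(wrapped)

    single = to_single_line(wrapped)
    assert not has_line_break(single)

    # The cleaned value still decodes to the original bytes.
    assert base64.b64decode(single, validate=True) == payload
```

Running a check like `has_line_break` on a value before uploading it would catch this class of problem early.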

Thanks for the reply on build time secrets, that will help immensely to deploy this sort of information into containers.


#17

Thanks for the help! Reuploading the env variable without newline characters fixed all deployments.