Starting a process with systemd on a resin.io managed device

systemd

#1

Hi,

Is there anything else to do besides setting the INITSYSTEM env var somewhere in your Dockerfile?

I’m totally stuck trying to get the TICK stack managed by systemd, and I’m not sure whether it’s Resin-related or just some systemd mystery:

UPDATE: I should also mention that I haven’t restarted the container at all. I just expect “systemctl start” to work right after “systemctl enable”. Am I wrong?

UPDATE 2: so that you don’t have to do the install to see the telegraf.service unit file, here it is. Maybe using a non-root user is a problem inside a container? (I have double-checked that the “ExecStart” command line runs as the “telegraf” user.)

[Unit]
Description=The plugin-driven server agent for reporting metrics into InfluxDB
Documentation=https://github.com/influxdata/telegraf
After=network.target

[Service]
EnvironmentFile=-/etc/default/telegraf
User=telegraf
ExecStart=/usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d ${TELEGRAF_OPTS}
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
KillMode=control-group

[Install]
WantedBy=multi-user.target

#3

@Tristan107,

What does systemctl status telegraf give you? I’m assuming that executing /usr/bin/telegraf as root works as expected, right?
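
For anyone following along, here is a minimal sketch of the first-look checks suggested above, gathered into one helper. The function name `diagnose_unit` and the `SYSTEMCTL` / `JOURNALCTL` override variables are assumptions made for illustration (they are not part of systemd); the subcommands and flags themselves are standard.

```shell
#!/bin/sh
# Sketch: print the usual first-look diagnostics for one systemd unit.
# SYSTEMCTL and JOURNALCTL are hypothetical overrides so the function can be
# dry-run with stubs; by default the real tools are used.
diagnose_unit() {
    unit="$1"
    sc="${SYSTEMCTL:-systemctl}"
    jc="${JOURNALCTL:-journalctl}"

    echo "== is-enabled =="
    "$sc" is-enabled "$unit" || true      # was "systemctl enable" applied?
    echo "== is-active =="
    "$sc" is-active "$unit" || true       # is the service running right now?
    echo "== status =="
    "$sc" status --no-pager "$unit" || true
    echo "== last 50 journal lines =="
    "$jc" -u "$unit" -n 50 --no-pager || true
}
```

Calling `diagnose_unit telegraf` inside the container should show whether the unit is enabled, whether it is running, and the most recent log lines for it.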


#4

Hi @craig-mulligan,

You can find the output here; the command works as root AND as telegraf:

UPDATE: during container start I get this strange error message. Maybe it’s related?

06.09.17 00:27:15 (+0200) Systemd init system enabled.
06.09.17 00:27:15 (+0200) systemd 230 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
06.09.17 00:27:15 (+0200) Detected virtualization docker.
06.09.17 00:27:15 (+0200) Detected architecture arm.
06.09.17 00:27:15 (+0200) Set hostname to <d367deb>.
**06.09.17 00:27:15 (+0200) Failed to install release agent, ignoring: No such file or directory**

#5

Today, after a full restart (which forced me to redo some configuration, since not everything is in my Dockerfile yet), systemd does start my telegraf process when I run systemctl start telegraf.

I did notice something that is Resin-related, though: since systemd can’t be detected at image build time, the telegraf Debian package installs init.d scripts, which then get mixed with your telegraf.service instructions. I prefer to remove these init.d scripts.
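
A small sketch of how one might list such overlaps before deleting anything: it prints every init.d script that has a same-named .service file in one of the given unit directories. The function name `list_shadowed_initd` is hypothetical, and the directories in the usage note are the usual Debian locations, assumed rather than taken from the setup above.

```shell
#!/bin/sh
# list_shadowed_initd INITD_DIR UNIT_DIR...
# Prints each script in INITD_DIR for which NAME.service exists in one of
# the UNIT_DIRs. It only reports; it does not remove anything.
list_shadowed_initd() {
    initd_dir="$1"
    shift
    for script in "$initd_dir"/*; do
        [ -f "$script" ] || continue
        name="$(basename "$script")"
        for unit_dir in "$@"; do
            if [ -f "$unit_dir/$name.service" ]; then
                echo "$script"
                break   # report each script once
            fi
        done
    done
}
```

For example, `list_shadowed_initd /etc/init.d /lib/systemd/system /usr/lib/systemd/system /etc/systemd/system` would be run once the unit files are in place.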

I may post my final Dockerfile here since there is nothing confidential.

UPDATE: here are the relevant parts of my setup (working now):

Dockerfile.template :

# Install telegraf
 && wget -O /tmp/telegraf.deb https://dl.influxdata.com/telegraf/releases/telegraf_${TELEGRAF_VERSION}_armhf.deb \
 && dpkg -i /tmp/telegraf.deb \
 && rm /tmp/telegraf.deb \
 # We use systemd, not init.d services
 && rm -Rf /etc/init.d/telegraf \
 && cp /usr/lib/telegraf/scripts/telegraf.service /usr/lib/systemd/system/telegraf.service

start.sh :

#!/bin/sh

# Maybe needed to take into account the new telegraf.service
systemctl daemon-reload

# Enable telegraf service boot loading
systemctl enable telegraf

# Start telegraf service
systemctl start telegraf

# To prevent Docker from exiting
journalctl -f

#6

@Tristan107 glad you got it working :slight_smile:

I think I’ll copy your systemd setup. I’ve extracted the telegraf stuff I’m doing in another project into a small repo; there may be things you can copy from there. Most of it is pretty standard, but I added some exec scripts to monitor some resin services on the device, see here: https://github.com/craig-mulligan/resin-telegraf/tree/master/scripts/resin-services.

I also have a backend service that polls the resin API and logs device status to InfluxDB, if you’re interested in that.


#7

OK, here’s some more of my setup then: the full Dockerfile.template + start.sh for a TIC install (no Kapacitor yet).

I’m not posting my conf files here, but they are just copied and adapted from the default conf. For telegraf, I’ve split the configuration into one file per plugin in the telegraf.d directory for easier maintenance.

Dockerfile.template :

FROM resin/%%RESIN_ARCH%%-debian:stretch

ENV TIMEZONE=Europe/Paris

ENV TELEGRAF_VERSION=1.3.5-1
ENV INFLUXDB_VERSION=1.3.5
ENV CHRONOGRAF_VERSION=1.3.7.0

# Timezone setting
RUN echo ${TIMEZONE} > /etc/timezone \
  && rm -f /etc/localtime \
  && ln -s /usr/share/zoneinfo/${TIMEZONE} /etc/localtime \
  && dpkg-reconfigure tzdata \

# Install basic tools
 && apt-get update \
 && apt-get install -y \
				apt-transport-https \
				binutils \
				less \
				wget \

# Install Telegraf
 && wget -O /tmp/telegraf.deb https://dl.influxdata.com/telegraf/releases/telegraf_${TELEGRAF_VERSION}_armhf.deb \
 && dpkg -i /tmp/telegraf.deb \
 && rm /tmp/telegraf.deb \
 # We use systemd, not init.d services
 && rm -Rf /etc/init.d/telegraf \
 && cp /usr/lib/telegraf/scripts/telegraf.service /usr/lib/systemd/system/telegraf.service \
 
# Install InfluxDB
 && wget -O /tmp/influxdb.deb https://dl.influxdata.com/influxdb/releases/influxdb_${INFLUXDB_VERSION}_armhf.deb \
 && dpkg -i /tmp/influxdb.deb \
 && rm /tmp/influxdb.deb \
 # We use systemd, not init.d services
 && rm -Rf /etc/init.d/influxdb \
 && cp /usr/lib/influxdb/scripts/influxdb.service /usr/lib/systemd/system/influxdb.service \
	

# Install Chronograf
 && wget -O /tmp/chronograf.deb https://dl.influxdata.com/chronograf/releases/chronograf_${CHRONOGRAF_VERSION}_armhf.deb \
 && dpkg -i /tmp/chronograf.deb \
 && rm /tmp/chronograf.deb \
 # We use systemd, not init.d services
 && rm -Rf /etc/init.d/chronograf \
 && cp /usr/lib/chronograf/scripts/chronograf.service /usr/lib/systemd/system/chronograf.service \

 && echo "End of main RUN"

# Add app
RUN mkdir -p /usr/src/app
ADD /app /usr/src/app

# Add Configuration files for the TICK Stack
ADD /config /etc/telegraf
ADD /config/influxdb.conf /etc/influxdb/influxdb.conf

# Enable systemd init system in container
ENV INITSYSTEM=on

# Start Telegraf, InfluxDB, and Chronograf (Kapacitor is not installed yet)
CMD ["bash", "/usr/src/app/start.sh"]

start.sh :

#!/bin/sh

# Make and chown influxdb data directory
mkdir -p /data/influxdb
chown influxdb:influxdb /data/influxdb

# Make and chown chronograf data directory
mkdir -p /data/chronograf
chown chronograf:chronograf /data/chronograf

# Maybe needed to take into account the new telegraf.service
systemctl daemon-reload

# Enable Telegraf service boot loading
systemctl enable telegraf

# Start Telegraf service
systemctl start telegraf

# Enable InfluxDB service boot loading
systemctl enable influxdb

# Start InfluxDB service
systemctl start influxdb

# Enable Chronograf service boot loading
systemctl enable chronograf

# Start Chronograf service
systemctl start chronograf

# To prevent Docker from exiting
journalctl -f
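
The repeated enable/start pairs in start.sh could be folded into one loop. This is only a sketch of that idea, not the script actually used above: `enable_and_start` is a hypothetical helper, and `SYSTEMCTL` is an assumed override variable for dry runs. It stops at the first failing command so a broken unit is not silently skipped.

```shell
#!/bin/sh
# Sketch: reload unit files once, then enable and start each named service.
# SYSTEMCTL is a hypothetical override for dry runs; it defaults to systemctl.
enable_and_start() {
    sc="${SYSTEMCTL:-systemctl}"
    "$sc" daemon-reload || return 1
    for svc in "$@"; do
        "$sc" enable "$svc" || return 1   # start on every boot
        "$sc" start "$svc" || return 1    # start right now
    done
}

# Example call, in the same order as start.sh above:
# enable_and_start telegraf influxdb chronograf
```

With a helper like this, adding Kapacitor later would mean appending one more name to the call rather than copying another pair of commands.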

#8

Thanks @Tristan107,

I think it’s better practice to run only Telegraf on the device and then post the data to a server running the rest of the stack. Running InfluxDB remotely means you can keep all devices (“hosts”) in a single time series.


#9

Sure, we have a remote InfluxDB too, and some other databases (our Telegraf uses MQTT as an output to keep it decoupled from our remote backend, so we can add any subscribers to our broker server-side at any moment).
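
As a rough illustration of the device side of this, the sketch below writes a one-plugin config file for Telegraf’s MQTT output into a telegraf.d-style directory. The helper name `write_mqtt_output`, the file name, the topic prefix, and the broker address are all assumptions for the example, not the configuration actually in use here; option names should be checked against the outputs.mqtt documentation for your Telegraf version.

```shell
#!/bin/sh
# write_mqtt_output CONF_DIR BROKER
# Sketch: drop an MQTT output plugin config into CONF_DIR.
# All values below are illustrative placeholders.
write_mqtt_output() {
    conf_dir="$1"
    broker="$2"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/output-mqtt.conf" <<EOF
# Publish metrics to an MQTT broker using InfluxDB line protocol.
[[outputs.mqtt]]
  servers = ["$broker"]
  topic_prefix = "telegraf"
  qos = 1
  data_format = "influx"
EOF
}

# Example call (hypothetical broker address):
# write_mqtt_output /etc/telegraf/telegraf.d broker.example.com:1883
```

A Telegraf instance on the server side can then subscribe to the same topics and write to the central InfluxDB, which keeps the device decoupled from whatever consumes the data.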

We had to keep local storage in case of disconnections, and in case some data can’t legally be uploaded while we still need to provide a full local service.


#10

Oh cool! I thought telegraf had some offline support, but looking at it now I can’t find anything, so that’s a good idea.

I’m currently posting over HTTP because it was simpler to set up, but MQTT is probably a better transport in most of these cases.


#11

I’ve just done it, and MQTT is really simple to set up.

As a subscriber, I use telegraf too (input MQTT, output InfluxDB), so data flows easily from my MQTT broker to my central InfluxDB. I may use more complex subscribers if I want to load data in a different format into a database like Warp10.

For the device, I think you are right though: it’s at 50-60% memory usage with only supervision tools installed, so I may remove Chronograf and Kapacitor and keep only Telegraf and the local InfluxDB.