Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion 'head' failed at src/svg.c:79, function svg_header() #20

Open
fsateler opened this issue Oct 3, 2016 · 17 comments
Open

Assertion 'head' failed at src/svg.c:79, function svg_header() #20

fsateler opened this issue Oct 3, 2016 · 17 comments

Comments

@fsateler
Copy link
Member

fsateler commented Oct 3, 2016

I have a reproducible assertion failure with bootchart v231 when invoked as pid1, citing the above message.

The problem seems to only occur when the initrd does not mount / as rw, and it is instead done by systemd-remount-fs.

I have managed to reproduce as follows:

# Build a bootable debian image including systemd-bootchart
mkosi -p systemd-bootchart -b -d debian
# Boot into it and set the kernel command line
qemu-system-x86_64 -m 512 -bios /usr/share/ovmf/OVMF.fd --enable-kvm -drive file=image.raw,format=raw -vga qxl
# this is inside the vm
mount /dev/sda1 /boot/efi
echo "quiet init=/lib/systemd/systemd-bootchart" > /etc/kernel/cmdline
# Ensure systemd remounts / as rw to avoid unrelated failures
echo "/dev/sda2 / ext4 errors=remount-ro 0 1" > /etc/fstab
kernel-install add $(uname -r) /boot/vmlinuz-$(uname -r)
# Disable getty on tty1 so that we do not loose the boot messages
systemctl mask getty@tty1
reboot
# Now we have the assertion error during boot

I notice that the errors do not present themselves if the initrd mounts the root filesystem as rw (ie, add rw to the kernel cmdline).

@sofar
Copy link
Contributor

sofar commented Oct 3, 2016

Does your rootfs stay read-only forever?

If this is a case where you run bootchart just very briefly, and the system just hasn't gotten the rootfs mounted read-write fast enough, you can opt to just make bootchart run longer.

You can also opt to output the bootchart to a different location, perhaps one that is read-write.

It's not up to systemd-bootchart to mount filesystems read-write - you'll have to provide this to systemd-bootchart in some way yourself.

@fsateler
Copy link
Member Author

fsateler commented Oct 3, 2016

Does your rootfs stay read-only forever?

No, it is remounted rw by sytemd-remount-fs.

If this is a case where you run bootchart just very briefly, and the system just hasn't gotten the rootfs mounted read-write fast enough, you can opt to just make bootchart run longer.

You can also opt to output the bootchart to a different location, perhaps one that is read-write.

It's not up to systemd-bootchart to mount filesystems read-write - you'll have to provide this to systemd-bootchart in some way yourself.

I suspect the problem is not (only) this, as the userspace boot is just 653ms in this VM:

Startup finished in 4.093s (firmware) + 392ms (loader) + 1.109s (kernel) + 653ms (userspace) = 6.248s

Moreover, when invoking systemd-bootchart with -o flag to a read-only directory, it correctly tells me it cannot open the file. Also, /run/log is a tmpfs so it is of course rw.

On boot, after the assert, an empty file can be found in /run/log.

It appears something is causing systemd-bootchart to try to write out the svg file before it has collected any samples, but I don't know how to test this...

@sofar
Copy link
Contributor

sofar commented Oct 5, 2016

Hmm indeed, the assert fails because nothing has been logged(likely), or the list head pointer went missing (unlikely).

I've ran systemd-bootchart on plenty of systems which boot in less than 100ms userspace, so 653ms isn't necessarily too short for bootchart, at all.

No idea why this is happening, though. I don't know what mkosi creates either (I'm not a debian user).

@fsateler
Copy link
Member Author

fsateler commented Oct 5, 2016

My laptop shows this problem too and it boots from grub to dracut to
systemd-bootchart. The mkosi image is just a relatively simple way to
reproduce.

Fwiw, the assert happens relatively fast on my laptop that is slower to
boot. There, the assert happens almost immedistely, but the boot takes
about 15 seconds to complete. So something seems to be inducing bootchart
to finish collecting samples too early.

On 5 Oct 2016 1:14 p.m., "Auke Kok" notifications@github.com wrote:

Hmm indeed, the assert fails because nothing has been logged(likely), or
the list head pointer went missing (unlikely).

I've ran systemd-bootchart on plenty of systems which boot in less than
100ms userspace, so 653ms isn't necessarily too short for bootchart, at all.

No idea why this is happening, though. I don't know what mkosi creates
either (I'm not a debian user).


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#20 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABQsHUWBZGseYsn8oDo0CUFkPipfB0Orks5qw8zzgaJpZM4KM43p
.

@fsateler
Copy link
Member Author

@sofar any ideas where I can place some logging printfs to decide where the problem is?
BTW, I have sometimes seen that the svg is in fact produced, but it contains only 500 samples (that should be around 0.02 seconds). My computer does not boot that fast 😉

@sofar
Copy link
Contributor

sofar commented Oct 26, 2016

@fsateler what sample rate are you using? the default is to log only 25 samples per second - 500 samples results in 20seconds logging total.

@fsateler
Copy link
Member Author

huh, I guess I got confused by the '25.000' formatting of the graph info. Attaching graph now instead of describing.

bootchart-20161025-2006.svg.txt

@sofar
Copy link
Contributor

sofar commented Oct 26, 2016

OK, there's something going badly wrong. There's only a single sample visible in the graph/data, but the code claims that 500 samples were recorded. That means that for 499 samples, it didn't see any recordable data. I'm wondering if you're lacking some of the required PROCFS kernel options, or that bootchart is in some other way limited from reading the necessary proc files for most of the time.

Without a full strace of it running at boot time, it'll be difficult to figure out what is the issue...

Maybe try running bootchart manually after it has run (with -r) and seeing if it creates valid output?

@fsateler
Copy link
Member Author

The command works as expected when run from within a running session. I'm going to try to attach an strace to the init bootchart on the next reboot.
bootchart-20161026-1434.txt

@fsateler
Copy link
Member Author

fsateler commented Nov 6, 2016

I have been unsuccessful in running bootchart under strace. Any hints on how to do so?

@sofar
Copy link
Contributor

sofar commented Nov 6, 2016

The most reliable way to do so is to make a special script that mounts your rootfs rw, and then wraps bootchart with strace while also executing /sbin/init.

something like this: untested

#!/bin/sh
mount -o remount,rw /
strace -o /strace.dump -f -tt /usr/lib/systemd/systemd-bootchart &
exec /sbin/init

@fsateler
Copy link
Member Author

fsateler commented Nov 6, 2016

Nice, now I get the strace.
strace.txt

@sofar
Copy link
Contributor

sofar commented Nov 6, 2016

what's the content of /etc/systemd/bootchart.conf?

@fsateler
Copy link
Member Author

fsateler commented Nov 7, 2016

It's all commented out:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
#
# Entries in this file show the compile time defaults.
# You can change settings by editing this file.
# Defaults can be restored by simply deleting this file.
#
# See bootchart.conf(5) for details.

[Bootchart]
#Samples=500
#Frequency=25
#Relative=no
#Filter=yes
#Output=<folder name, defaults to /run/log>
#Init=/path/to/init-binary
#PlotMemoryUsage=no
#PlotEntropyGraph=no
#ScaleX=100
#ScaleY=20
#ControlGroup=no
#PerCPU=no

@sofar
Copy link
Contributor

sofar commented Apr 5, 2017

I wonder if 395d046 is possibly related. Can you test this commit?

@fsateler
Copy link
Member Author

fsateler commented Apr 6, 2017

I have run this, and on my laptop the assert is gone, but collection still appears to stop after 1 sample.

bootchart.txt
strace.txt

@fsateler
Copy link
Member Author

fsateler commented Apr 6, 2017

Hmm, it appears I uploaded an older bootchart. This is the one.
bootchart.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants