LEDE Project

  • Status: Assigned
  • Percent Complete: 0%
  • Task Type: Bug Report
  • Category: Base system
  • Assigned To: Alexander Couzens
  • Operating System: All
  • Severity: Critical
  • Priority: Very Low
  • Reported Version: lede-17.01
  • Due in Version: Undecided
  • Due Date: Undecided
  • Votes
  • Private
Attached to Project: LEDE Project
Opened by Rob White - 24.07.2017

FS#927 - SQUASHFS error: xz decompression failed

Ubiquiti AirGateway and Bullet M2
LEDE 17.01.2
Lighttpd
PHP7

Runs very well on boot up. Then after some interval ranging from minutes to days, the following typical error occurs, repeated numerous times with differing blocks:
Mon Jul 24 07:09:39 2017 kern.err kernel: [43650.730023] SQUASHFS error: xz decompression failed, data probably corrupt
Mon Jul 24 07:09:39 2017 kern.err kernel: [43650.735459] SQUASHFS error: squashfs_read_data failed to read block 0x1e5d9a
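
Since the errors repeat with differing blocks, it may help to tally which block offsets fail and how often. The following is a minimal Python sketch, assuming the kernel log has been saved as text in the format shown above; the function name `failed_blocks` and the regular expression are illustrative and not part of any LEDE tooling.

```python
import re
from collections import Counter

# Matches the block offset in lines such as:
#   SQUASHFS error: squashfs_read_data failed to read block 0x1e5d9a
BLOCK_RE = re.compile(r"squashfs_read_data failed to read block (0x[0-9a-fA-F]+)")


def failed_blocks(log_text):
    """Return a Counter mapping block offset (int) to number of failures."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BLOCK_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


# Sample input: the two log lines quoted in this report.
sample = """\
Mon Jul 24 07:09:39 2017 kern.err kernel: [43650.730023] SQUASHFS error: xz decompression failed, data probably corrupt
Mon Jul 24 07:09:39 2017 kern.err kernel: [43650.735459] SQUASHFS error: squashfs_read_data failed to read block 0x1e5d9a
"""

for offset, count in sorted(failed_blocks(sample).items()):
    print(f"block {offset:#x}: {count} failure(s)")
```

If one offset were to fail repeatedly across reboots, that might point towards bad data on flash; failures scattered over many offsets might fit a transient cause better. Either reading is only a hint, not a diagnosis.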

Following this, the unit becomes unresponsive or very slow, or reboots itself.
I tried two different devices and got the same result.

Images were produced with Imagebuilder, with ipv6, usb, ppp and luci removed to free up space on flash.

The exact same config on OpenWrt CC (but with php5) gives no problems.

root@BlueWave:~# free
             total       used       free     shared    buffers     cached
Mem:         28176      19956       8220        132       1460       4296
-/+ buffers/cache:      14200      13976
Swap:            0          0          0
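
For readers unfamiliar with the `-/+ buffers/cache` row, it is derived from the `Mem:` row by treating buffers and cache as reclaimable. A small Python sketch of that arithmetic, using the figures above (the function name is illustrative):

```python
def buffers_cache_row(total, used, free, buffers, cached):
    """Derive the '-/+ buffers/cache' row (in KB) from the 'Mem:' row.

    `total` is accepted for completeness but is not needed for the result.
    """
    used_by_apps = used - buffers - cached  # memory not reclaimable from cache
    available = free + buffers + cached     # free plus reclaimable memory
    return used_by_apps, available


used_by_apps, available = buffers_cache_row(28176, 19956, 8220, 1460, 4296)
print(used_by_apps, available)  # 14200 13976
```

So roughly 14 MB of the 28 MB visible to the kernel is nominally available at the moment this snapshot was taken.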

root@BlueWave:~# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                 4.8M      4.8M         0 100% /rom
tmpfs                    13.8M    132.0K     13.6M   1% /tmp
/dev/mtdblock5            1.4M    508.0K    964.0K  35% /overlay
overlayfs:/overlay        1.4M    508.0K    964.0K  35% /
tmpfs                   512.0K         0    512.0K   0% /dev


psyborg55 commented on 24.07.2017 08:57

32MB of RAM is the problem

Rob White commented on 24.07.2017 09:54

@psyborg55
Just saying 32MB of RAM is the problem is not helpful.
Can you be more specific?
I can set up a test to drop free ram to less than 1MB and start to see a slowdown but no errors.
I can believe this is the problem, but have no ACTUAL evidence.

psyborg55 commented on 24.07.2017 10:02

Sysupgrading a >8MB image on a 32MB device just yesterday gave me the same errors once or twice: the device booted but wifi did not work. Other attempts resulted in a non-bootable device.

When flashing any of the images from the u-boot webserver, the device booted successfully and wifi worked fine.

Rob White commented on 24.07.2017 10:43

The image I have built is 6.2MB and both tftp and sysupgrade work fine with no issues other than this.
Devices normally sit around 7 to 9 MB free ram, dropping occasionally to 1.5MB under high load, very quickly recovering.
The only problem is the occasional squashfs read errors, some of which are terminal.
My first thoughts were flash failure but multiple devices show the same symptoms.

Currently I am thinking the xz decompression is using excessive amounts of ram and failing.
I do have config and data files built into the squashfs but the largest is only a few KB.

psyborg55 commented on 24.07.2017 11:04

kernels >4 are pure bullshit.

Project Manager
Alexander Couzens commented on 10.08.2017 04:45

@Rob White can you share your images? Or maybe your image config so I can reproduce your problem?

Rob White commented on 10.08.2017 13:05

@Alexander Couzens
Yes, I can build an image with some test scripts to force the problem.
What hardware do you have available?
I have airGateway, airRouter, Bullet M2 and Nanostation M2.
Alternatively I could provide a makeimage script and files folder for Imagebuilder.

It is true that this seems to occur only on 32MB devices. However, the idle free memory does not differ much from OpenWrt CC, which leads me to suspect that something is causing LEDE to use excessive memory, most likely the xz decompression. If that proves true, fixing it could make 32MB devices much more stable with LEDE.

Project Manager
Alexander Couzens commented on 10.08.2017 13:33

@bluewavenet I have a Bullet M2 and a Nanostation M2 available. It might also help to reproduce the bug on a qemu target (e.g. mips qemu) or x86.

Rob White commented on 10.08.2017 14:58

@Alexander Couzens
I'll make an image for the Bullet M2 then. Might take a few days to find the time.... ;-) I could not reproduce the bug on x86 as the smallest ram config I have is 256MB :-D

diizzyy commented on 25.08.2017 09:25

You're most likely running into issues because underlying subsystems are running out of memory, especially if you're running additional services.
https://lede-project.org/meta/infobox/432_warning?s[]=flash

Lucian CRISTIAN commented on 03.10.2017 14:57

The BeagleBone Black has 512MB of RAM and it fails the same way on squashfs:
https://bugs.lede-project.org/index.php?do=details&task_id=1034&order=id&sort=desc

Koen Vandeputte commented on 04.10.2017 20:41

Same error seen a few times on my gw2388 boards (cns3xxx).

Hardware:
- 16MB NOR using squashfs (5MB free)
- 256MB RAM (>80MB free after full boot)

I recall this suddenly popped up a few months ago.
