We've recently had a few switches (MSN 2100 and 2700s) lock up after running out of space. This was purely due to Btrfs becoming very unbalanced: after a few rounds of progressive balancing, well over 50% of the drive space was recovered. It would seem that our situation was caused by snapshots resulting from frequent net commits and several apt installs.
In an attempt to prevent this from happening in the future, we've added the following simple cron job:
root@isp0:~# crontab -l
0 3 * * * /sbin/btrfs balance start -dusage=50 -musage=50 /
We've only had this running for a relatively short time, so I cannot say for sure whether it's effective, but it seems like something Cumulus should configure by default?
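For reference, the progressive balancing mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not something Cumulus ships: the function name progressive_balance, the BTRFS override variable, and the threshold steps (0, 5, 10, 25, 50) are assumptions made for the example. The final step matches the 50% threshold used in the cron job.

```shell
#!/bin/sh
# Sketch of a progressive Btrfs balance: begin with cheap passes that only
# touch nearly empty chunks, then raise the usage threshold step by step.
# BTRFS and progressive_balance are illustrative names, not part of any tool.
BTRFS="${BTRFS:-/sbin/btrfs}"

progressive_balance() {
    mount_point="${1:-/}"
    # Threshold steps are an assumption; tune them for your filesystem.
    for usage in 0 5 10 25 50; do
        echo "balancing chunks at or below ${usage}% usage on ${mount_point}"
        # Stop at the first failing pass (for example, no space left to relocate).
        "$BTRFS" balance start -dusage="$usage" -musage="$usage" "$mount_point" || return 1
    done
}
```

Calling progressive_balance / as root runs the passes against the root filesystem. Setting BTRFS=echo beforehand prints the commands instead of running them, which is a safe way to check what would be executed.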