Hi all,
In formulating our global backup solution, we’ve encountered a directory that contains nearly 600,000 empty files. It is "/var/lib/lava-server/default/media/lava-logs".
Looking in the lava-server code, I *think* the offending line is in models.py:
    log_file = models.FileField(upload_to='lava-logs', default=None, null=True, blank=True)
The worrying thing is, there are a few files that *do* have something in them - a total of 22M worth to date.
The issue is really about eating up inodes.
Can anyone enlighten us as to whether these files serve any purpose, and whether we can perhaps at least ignore the empty ones?
Thanks
Dave
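For anyone wanting to put numbers on the inode concern, here is a minimal sketch that counts empty versus non-empty files in one directory and reports inode usage for the filesystem it lives on. The function names are made up for illustration, it only looks at the top level of the directory, and os.statvfs is Unix-only (some filesystems report zero for the inode totals).

```python
import os


def count_empty_files(directory):
    """Return (empty, non_empty) counts of regular files directly in directory."""
    empty = non_empty = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories, symlinks and anything that is not a plain file.
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_size == 0:
                empty += 1
            else:
                non_empty += 1
    return empty, non_empty


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_files - st.f_ffree, st.f_files
```

Calling count_empty_files("/var/lib/lava-server/default/media/lava-logs") on the server would give the split between empty and non-empty files, and inode_usage() on the same path shows how much of the inode budget is in use overall.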
That snippet comes from lava_scheduler_app/models.py, the TestJob model. That part of the model has not been changed since https://git.linaro.org/lava/lava-server.git/e7006debb3b836147e7c704336d15ec1...
Author: Michael-Doyle Hudson michael.hudson@linaro.org
Author date: 19/08/2011 04:24
The code using it is job.output_file(), which checks whether output.txt exists; otherwise the file under lava-logs is used as the log file for that job. (The database entry is a reference to the file.) https://git.linaro.org/lava/lava-server.git/blob/HEAD:/lava_scheduler_app/mo...
A Django FileField is intended to support upload of files: https://docs.djangoproject.com/en/1.9/ref/models/fields/#filefield
I can't find any non-zero-length files on staging (find . -type f ! -size 0), but there are 35,124 zero-length files.
Non-zero-length files would imply that there is no output.txt for those jobs.
Every test job gets a log_file object created, so it's a question of which jobs have non-zero-length lava-logs files, whether an output.txt exists in ../job-output/job-$JOBID/, and why this happened.
To know if this attribute can be dropped from the model, we'd need a list of filenames which are non-zero, along with a list of which ones have and do not have an equivalent output.txt for the job ID in the non-zero filename.
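The list described above could be gathered with something like the following sketch. It assumes the files in lava-logs are named job-<id>.log and that the matching output lives in job-output/job-<id>/output.txt; both the filename pattern and the function name are assumptions for illustration, not taken from the lava-server code.

```python
import os
import re

# Assumed naming scheme for files under lava-logs: job-<id>.log
JOB_LOG_RE = re.compile(r"^job-(\d+)\.log$")


def audit_log_files(log_dir, job_output_dir):
    """Yield (filename, job_id, has_output_txt) for each non-empty file in log_dir."""
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # Only non-zero-length regular files are of interest.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            continue
        match = JOB_LOG_RE.match(name)
        if match is None:
            # Unexpected filename: report it without a job ID.
            yield name, None, False
            continue
        job_id = int(match.group(1))
        output = os.path.join(job_output_dir, "job-%d" % job_id, "output.txt")
        yield name, job_id, os.path.isfile(output)
```

Run against the lava-logs directory and its sibling job-output directory, each tuple says whether a non-empty log file has an equivalent output.txt for the same job ID.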
On 7 January 2016 at 14:07, Dave Pigott dave.pigott@linaro.org wrote:
Hi all,
In formulating our global backup solution, we’ve encountered a directory that contains nearly 600,000 empty files. It is "/var/lib/lava-server/default/media/lava-logs".
Looking in the lava-server code, I *think* the offending line is in models.py:
    log_file = models.FileField(upload_to='lava-logs', default=None, null=True, blank=True)
The worrying thing is, there are a few files that *do* have something in them - a total of 22M worth to date.
The issue is really about eating up inodes.
Can anyone enlighten us as to whether these files serve any purpose, and whether we can perhaps at least ignore the empty ones?
Thanks
Dave
Okay - just executed 'find . -type f ! -size 0' on master, and we get no output there either.
Dave
On 7 Jan 2016, at 14:37, Neil Williams neil.williams@linaro.org wrote:
That snippet comes from lava_scheduler_app/models.py, the TestJob model. That part of the model has not been changed since https://git.linaro.org/lava/lava-server.git/e7006debb3b836147e7c704336d15ec1...
Author: Michael-Doyle Hudson michael.hudson@linaro.org
Author date: 19/08/2011 04:24
The code using it is job.output_file(), which checks whether output.txt exists; otherwise the file under lava-logs is used as the log file for that job. (The database entry is a reference to the file.) https://git.linaro.org/lava/lava-server.git/blob/HEAD:/lava_scheduler_app/mo...
A Django FileField is intended to support upload of files: https://docs.djangoproject.com/en/1.9/ref/models/fields/#filefield
I can't find any non-zero-length files on staging (find . -type f ! -size 0), but there are 35,124 zero-length files.
Non-zero-length files would imply that there is no output.txt for those jobs.
Every test job gets a log_file object created, so it's a question of which jobs have non-zero-length lava-logs files, whether an output.txt exists in ../job-output/job-$JOBID/, and why this happened.
To know if this attribute can be dropped from the model, we'd need a list of filenames which are non-zero, along with a list of which ones have and do not have an equivalent output.txt for the job ID in the non-zero filename.
--
Neil Williams
neil.williams@linaro.org http://www.linux.codehelp.co.uk/
So that 22M is just the space taken up by the 516,023 zero length files.
The code already has an exception check:
    try:
        open(log_file.name)
    except IOError:
        log_file = None
Therefore, removing these zero-length files will not cause any change in behaviour. Indeed, the code as-is can never actually work: log_file.name is relative to MEDIA_ROOT from settings, and MEDIA_ROOT is not joined to the path before trying to open it.
    >>> from django.conf import settings
    >>> import os
    >>> from lava_scheduler_app.models import TestJob
    >>> job = TestJob.objects.get(id=5896)
    >>> open(os.path.join(settings.MEDIA_ROOT, job.log_file.name))
    <open file u'/var/lib/lava-server/default/media/lava-logs/job-5896.log', mode 'r' at 0x7fe22c979d20>
    >>> open(job.log_file.name)
    Traceback (most recent call last):
      File "<console>", line 1, in <module>
    IOError: [Errno 2] No such file or directory: u'lava-logs/job-5896.log'
Which is odd, seeing as the function directly above this one, output_dir(), does join settings.MEDIA_ROOT.
I'll see about a migration in 2016.1 to remove log_file() - note that this will affect every TestJob. There is no code to remove this file if the TestJob is removed, so the files themselves would get left behind on other installs. I can add it to the packaging, possibly, or just document it.
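For reference, a working version of that check would need to join the media root before opening, roughly as in this standalone sketch. The helper name and its arguments are hypothetical; in lava-server the values would presumably come from settings.MEDIA_ROOT and job.log_file.name, and this is not the code that is actually in models.py.

```python
import os


def open_log_file(media_root, relative_name):
    """Open a FileField-style relative name under media_root, or return None."""
    # A null or blank FileField has no usable name.
    if not relative_name:
        return None
    try:
        # Join the media root first; the stored name alone is a relative path.
        return open(os.path.join(media_root, relative_name))
    except IOError:
        return None
```

With the media root joined, a missing file still falls back to None exactly as the existing exception check intends, but a file that does exist is actually found.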
On 7 January 2016 at 14:55, Dave Pigott dave.pigott@linaro.org wrote:
Okay - just executed 'find . -type f ! -size 0' on master, and we get no output there either.
Dave
--
Neil Williams
neil.williams@linaro.org http://www.linux.codehelp.co.uk/
linaro-validation@lists.linaro.org