Mikael Liljeroth | 13 Jul 2012 16:22

Truncated files after reboot

Hi, I have a problem with my JFS partition and I have a couple of questions.

After an unclean unmount and a restart a couple of files have been truncated to size 0.

I am running linux 2.6.23.
I would really appreciate any help I can get, and I apologize for the huge amount of questions :)

What could be the reason for these 0 sized files? Could a logredo be the reason for this or a fsck?
Could jfs_logdump tell me anything interesting and if so what should I be looking for?

From what I understand JFS does not actually write the log to disk when a transaction is committed. Is this right? And if so, where is the log stored until it is written to disk, and does jfs_logdump show me the log on disk or the log currently buffered somewhere?

The word "buffer" is used in a lot of places regarding the jfs log but I have not found any actual definition of where this buffer is located or how big it is or when it is flushed to the disk. One example taken from "JFS Log" paper:

"In txCommit(), the tlck's are processed and log records are written (at least to the buffer). The affected inodes are "written" to the inode extent buffer (they are maintained in separate memory from the extent buffer). Then a commit record is written".

Then it says:

"After the commit record has actually been written (I/O complete), the block map and inode map are updated as needed, and the meta-data pages are marked 'homeok'."

Why are the block and inode maps updated after the commit record is written? Isn't that information important to the transaction? And what does meta-data "pages" refer to in this context? Are the log records called pages, and are they located on disk or in memory?

What I don't understand is when and in what order the log records are written to disk. My current theory is that the "buffers" related to the files that become empty are never written to disk, and when my system reboots without unmounting the partition properly the meta-data changes (related to the size of the file) are lost.

Is there any way of preventing this from happening in the future?

Best regards
Mikael Liljeroth
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Jfs-discussion mailing list
Jfs-discussion <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jfs-discussion
Peter Grandi | 15 Jul 2012 21:17

Re: Truncated files after reboot

> Hi, I have a problem with my JFS partition and I have a couple
> of questions. After an unclean unmount and a restart a couple
> of files have been truncated to size 0.

In virtually every case that is desired behavior, it is not a
problem.

> What could be the reason for these 0 sized files? [ ... ]

O_PONIES. http://lwn.net/Articles/351422/

> What I don't understand is when and in what order the log
> records are written to disk.

That depends, and does not matter.

> Is there any way of preventing this from happening in the
> future.

Setting 'hdparm -W 0' and mounting with '-o sync' is the only
safe way, with any filesystem type.

Otherwise there are varying degrees of lower safety, such as
modifying the applications you use to 'fsync' at the right
times, or changing page cache flusher parameters.

Whether people like it or not there is a (really huge) tradeoff
between throughput and latency/safety as to how often writes are
flushed to disk.
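
The "page cache flusher parameters" mentioned above can be inspected from a program. What follows is only a minimal sketch, assuming a Linux system that exposes the usual writeback tunables under /proc/sys/vm (dirty_expire_centisecs, dirty_writeback_centisecs, dirty_background_ratio, dirty_ratio). The helper names are hypothetical. These tunables govern writeback of dirty page cache data; whether they also influence when JFS writes its journal is not established by this sketch.

```c
#include <stdio.h>

/* Read one integer tunable from /proc/sys/vm; returns -1 if it cannot be read. */
long read_vm_tunable(const char *name)
{
    char path[256];
    long value = -1;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/sys/vm/%s", name);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print the tunables that bound how long dirty pages may stay in memory. */
void print_writeback_tunables(void)
{
    static const char *const names[] = {
        "dirty_expire_centisecs",    /* age at which dirty data becomes eligible for writeback */
        "dirty_writeback_centisecs", /* wake-up interval of the flusher */
        "dirty_background_ratio",    /* % of memory dirty before background writeback starts */
        "dirty_ratio",               /* % of memory dirty before writers are throttled */
    };
    size_t i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
        printf("%-28s %ld\n", names[i], read_vm_tunable(names[i]));
}
```

Calling print_writeback_tunables() from main() lists the current values. Lowering the two centisecs values (as root, by writing to the same files) shortens the window in which data exists only in memory, at the cost of throughput.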

Mikael Liljeroth | 16 Jul 2012 20:04

Re: Truncated files after reboot

Thanks for the answer.

I'm sorry I should have been more precise. In my case the files are more than a couple of days old and they are not the last modified on the system, is this still the desired behaviour?

Do you mean that the order and timing of the log record writes could not be the reason why my files have been truncated? Just out of curiosity, where can I find information on what exactly the order and timing of log record disk writes depend on? I read in this document (https://www.usenix.org/events/usenix05/tech/general/full_papers/prabhakaran/prabhakaran.pdf) that the journal writes are not triggered by the ordinary kupdate daemon timer and that they are postponed indefinitely. If I understand correctly, the in-memory journal buffer should be written to disk when the buffer is full, when there is sync activity, or when the partition is unmounted.

Since there has been a lot of disk activity after the files in question were written, and the files created after them still exist with their contents intact, it seems strange that this would, by design, cause logredo to truncate my files: their corresponding transactions should already have been committed to disk and not be considered in the replay of the log, right?

The page cache flusher parameters sound very interesting. What exactly are they?

I do not need a ponies flag, just something to improve the situation a little. I would be willing to accept that files were truncated, but not files that are more than a couple of hours old. The best solution for me would be something without having to modify the source of my own applications if possible. Patching the kernel or jfsutils or modifying some parameter would be better for my particular situation.

Regarding the fsync approach, would it be possible to do this system wide or does it have to be the process holding the filedescriptor to that particular file? I'm thinking something like a separate process with a timer that issues a fsync every 10 min or so?

Regards
Mikael Liljeroth

Tino Reichardt | 16 Jul 2012 22:15

Re: Truncated files after reboot

* Mikael Liljeroth <mikael.liljeroth <at> gmail.com> wrote:
> Thanks for the answer.
> 
> I'm sorry I should have been more precise. In my case the files are more
> than a couple of days old and they are not the last modified on the system,
> is this still the desired behaviour?

Yes. If you have more than adequate RAM on that machine, it is possible
that the pages were not flushed to disk before some outage :(

You can flush all current fs cache via sync(8). Most of the RAM used by
the page cache can also be freed on Linux. Here is a good description of
this: http://linux-mm.org/Drop_Caches

> Do you mean that the order and time the log records are written to disk
> could not be the reason why my files have been truncated? Just out of
> curiousity, where can I find information on what exactly the order and
> timing of log records disk writes depend on? I read in this (

This is very difficult to analyse. It may be restorable via the
difference between the pmap[] and wmap[] tables of the dmap structure,
which is part of the allocation bitmap.

I have a tool which can dump the content of the two bitmaps, so you can
get the difference of the two. And of course, there may be some data
which was only partly written.
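
Once the two bitmaps have been dumped, comparing them is a plain XOR. The following is only a rough sketch under stated assumptions: both maps are already loaded into memory as arrays of 32-bit words in host byte order, and block 0 corresponds to the most significant bit of word 0. The function names are hypothetical and this is not the code of the dump tool mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the blocks whose allocation state differs between the two maps. */
size_t count_bitmap_differences(const uint32_t *pmap, const uint32_t *wmap,
                                size_t nwords)
{
    size_t diff = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        uint32_t x = pmap[i] ^ wmap[i];   /* set bits mark disagreement */
        while (x != 0) {
            x &= x - 1;                   /* clear the lowest set bit */
            diff++;
        }
    }
    return diff;
}

/* Return 1 if block number `bit` differs between the maps, else 0.
 * Assumption: block 0 is the most significant bit of word 0. */
int block_differs(const uint32_t *pmap, const uint32_t *wmap, size_t bit)
{
    uint32_t mask = UINT32_C(0x80000000) >> (bit % 32);

    return ((pmap[bit / 32] ^ wmap[bit / 32]) & mask) != 0;
}
```

Blocks for which block_differs() returns 1 are the candidates to look at; whether their content is still usable has to be checked separately.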

> I do not need a ponies flag, just something to improve the situation a
> little. I would be willing to accept that files were truncated, but not
> files that are more than a couple of hours old. The best solution for me
> would be something without having to modify the source of my own
> applications if possible. Patching the kernel or jfsutils or modifying some
> parameter would be better for my particular situation.
> 
> Regarding the fsync approach, would it be possible to do this system wide
> or does it have to be the process holding the filedescriptor to that
> particular file? I'm thinking something like a separate process with a
> timer that issues a fsync every 10 min or so?

create a cronjob which does a sync:

# mkdir /etc/cron.minute
# echo '*/1 * * * * /usr/sbin/run-cron /etc/cron.minute' \
  > /var/spool/cron/root  (or add that line via: crontab -e)
# echo sync > /etc/cron.minute/do-a-sync
# chmod +x /etc/cron.minute/do-a-sync

The prog is available here, but only pmap is dumped, you have to modify
it a bit: http://www.mcmilk.de/projects/frisbee-ng/dl/ ...
If you need help with the bitmap dump tool (fng-info), just send another
mail to this list.

Good luck.

-- 
regards, TR

Mikael Liljeroth | 16 Jul 2012 22:26

Re: Truncated files after reboot

Thank you very much :)

Regards
Mikael Liljeroth


Peter Grandi | 17 Jul 2012 01:28

Re: Truncated files after reboot

> I'm sorry I should have been more precise. In my case the
> files are more than a couple of days old and they are not the
> last modified on the system,

That's still not precise. What does "more than a couple of days
old" mean? And what does "last modified" mean?

My impression is that some of the analysis here is based on a
mistaken impression of how things are supposed to work, or
actually work (at the Linux level, never mind storage).

> is this still the desired behaviour?

If the file was never 'fsync'ed it may still be in-memory, even
if 'close'd, as 'man 2 close' states:

 «A successful close does not guarantee that the data has been
  successfully saved to disk, as the kernel defers writes. It is
  not common for a file system to flush the buffers when the
  stream is closed. If you need to be sure that the data is
  physically stored use fsync(2). (It will depend on the disk
  hardware at this point.)»

However, I seem to remember that when I asked in this list how
often JFS log entries were flushed to the journal the answer was
every few seconds.

  Terminology note: in the JFS code usually 'log' refers to
  in-memory transaction records, while 'journal' refers to the
  on-disk ones.

But that's for metadata only. Data IIRC is flushed strictly only
on 'fsync' (or when there is memory pressure of course).
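
As an illustration of "'fsync' at the right times", here is a minimal sketch of an application-side write path that does not rely on 'close' to get data to disk. The function name write_and_fsync is hypothetical; the calls used are the standard open(2), write(2), fsync(2) and close(2). It assumes the drive honours cache flushes, which is the hardware caveat in the man page quoted above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes to path and force them to stable storage before closing.
 * Returns 0 on success, -1 on any error. */
int write_and_fsync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry */
            close(fd);
            return -1;
        }
        p += n;                    /* short writes are legal: keep going */
        len -= (size_t)n;
    }

    if (fsync(fd) != 0) {          /* data and file metadata to disk */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Making a newly created file's directory entry durable as well may additionally need an 'fsync' on the containing directory; that step is left out of this sketch.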

BTW there was some years ago a relatively small bug as to log
flushing order:

  http://www.mail-archive.com/git-commits-head <at> vger.kernel.org/msg32017.html

> [ ... ] where can I find information on what exactly the order
> and timing of log records disk writes depend on?

The JFS source code is relatively small. You can probably start by
looking at 'jfs_lazycommit', 'jfs_flush_journal', 'jfs_sync' and
'lmLogSync' (mostly in 'jfs_logmgr.c').

> [ ... ] Regarding the fsync approach, would it be possible to
> do this system wide or does it have to be the process holding
> the filedescriptor to that particular file?

A file can be opened via many many filedescriptors. What is
unique to each filedescriptor is the current position in the
file rather than the "dirty" pages in it.

> I'm thinking something like a separate process with a timer
> that issues a fsync every 10 min or so?

That should work...
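
Since the dirty pages belong to the file rather than to a particular descriptor, a separate helper process can open the file itself and 'fsync' it. A minimal sketch, assuming Linux (where 'fsync' is accepted on a descriptor opened read-only; other systems may want write access). The name fsync_path is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open path independently of whoever wrote it and flush its dirty pages.
 * Returns 0 on success, -1 on error. */
int fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);        /* flushes the file's dirty pages, whoever dirtied them */
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A timer-driven process could call fsync_path() on the files that matter every 10 minutes; unlike a global sync it only waits for those files.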

Fiedler Roman | 17 Jul 2012 08:04

Re: Truncated files after reboot

> -----Original Message-----
> From: Peter Grandi [mailto:pg <at> jf2.for.sabi.co.UK]
> Sent: Tuesday, 17 July 2012 01:28
> To: Linux fs JFS
> Subject: Re: [Jfs-discussion] Truncated files after reboot
> 
> > I'm sorry I should have been more precise. In my case the
> > files are more than a couple of days old and they are not the
> > last modified on the system,
> 
> That's still not precise. What does "more than a couple of days
> old" mean? And what does "last modified" mean?

This issue has some similarities to the one I was observing on multiple machines, which is the reason I banned JFS
from all production use. The reproducer uses hard reboots, but the problem also occurred on minimal systems where
a reboot is very fast, e.g. <2 sec (missing sync?).

Regarding "last modified": the corrupted files included e.g. /var/log/dmesg.3.gz (a file which is created once, never touched
again, just renamed, and which was quite old when it got corrupted). Fsck then completes the corruption
by removing the data. Before that, what happens to the corrupted file content is quite random; perhaps
blocks are included in more than one file at the same time.

The Ubuntu issue below has a reproducer, which wrecked every JFS filesystem back then within
minutes. Perhaps one could use it to see whether it still leads to massive fs corruption, even of files not touched?

https://bugs.launchpad.net/ubuntu/+source/jfsutils/+bug/754495

http://www.mail-archive.com/jfs-discussion <at> lists.sourceforge.net/msg01682.html

Mikael Liljeroth | 17 Jul 2012 11:06

Re: Truncated files after reboot

Thanks for all the tips and pointers, much appreciated. I have a lot to learn, but I will try the fsync approach in the meantime.


Regards
Mikael
Mikael Liljeroth | 15 Aug 2012 15:51

Re: Truncated files after reboot

Hi again, I've tried to sync the file system but that does not seem to help. I still get 0 byte files after a sync and a hard reboot.


I looked at the kernel source and found that jfs_sync_fs calls jfs_flush_journal with wait <= 1 and jfs_syncpt with parameter hard_sync set to 0. However, according to the comments to lmLogSync in jfs_logmgr.c hard_sync = 1 means: "push dirty metapages out to disk".

As far as I can see this only happens when the log is closed via lmLogShutdown, which calls jfs_flush_journal(log, 2). This happens when unmounting a rw file system, but it doesn't seem to happen for an ordinary sync.

jfs_syncpt is also called with hard_sync = 1 from txEnd in jfs_txnmgr.c under certain circumstances which I have not investigated yet, but I have not found any connection to sync.

Plus, in jfs_umount_rw in jfs_umount.c the following is also done:

/*
* Make sure all metadata makes it to disk
*/
dbSync(sbi->ipbmap);
diSync(sbi->ipimap);

I'm not entirely sure what dbSync and diSync do, or what the filemap_fdatawrite(sbi->ipbmap->i_mapping) etc. calls do for hard_sync = 1. But isn't that required to properly write everything to disk, which one would want when doing a sync?

Perhaps a call to jfs_syncpt(log,1) in jfs_sync_fs could improve my situation?

Regards
Mikael
