4 Oct 2011 18:45
Re: [ANNOUNCE] jfsutils-1.1.15
Tim Nufire <jfsdiscusss_tim <at> ibink.com>
2011-10-04 16:45:21 GMT
For what it's worth, we have not seen a single journal replay error since the last patch and are using the default journal size…
Tim
On Sep 4, 2011, at 7:46 AM, Sandon Van Ness wrote:
Well, I went through several more fscks because of 2.6.39 (I upgraded for newer SCSI driver support for a new RAID controller), which had a bug with USB plug events that was causing panics.
Anyway, now I am wondering if it's possible that my journal isn't replaying because my journal size is 1024 MB. I remember I made it 1024 MB when I formatted it. I now vaguely remember something about a supposed maximum size of 128 MB for the journal log?
Could a 1024 MB journal log cause this? I assumed I should make it bigger because of how big the volume is, but maybe that was useless?
I want to use the right size now, since I just set up an 84 TB usable (90 TB raw) RAID array and did a test format (everything is working well), but before I do the final format I want to make sure I can avoid running into this issue again (if it's avoidable).
If it is the journal log size, then I can copy the 29 TB of data I currently have on my 36 TB volume over to the 84 TB volume and re-create the file system (and defrag it in the process).
On 07/29/2011 09:13 AM, Dave Kleikamp wrote:
Failed updating the block map. I'll need to look into this.
On 07/28/2011 07:10 AM, Sandon Van Ness wrote:
On 04/22/2011 05:42 AM, Dave Kleikamp wrote:
Doh! You're right. I was thinking it was something it got at compile
time.
Yeah, I trust you, now that you pointed out the hard-coded date in the
header.
I'll have to try to recreate the problem again and see what else needs
fixing.
Thanks,
Shaggy
Ok so my computer kernel panic'd (damn nvidia GPU drivers) and I had to
do an fsck again (the first time since I previously replied to this
thread).
One bit of behavior I noticed is that it sat at the journal replay step
for quite some time before it finally errored out with "logredo failed",
so it still wasn't able to replay the log. I seem to remember that
before, it would say logredo failed almost instantly:
fsck.jfs version 1.1.15, 04-Mar-2011
processing started: 7/25/2011 22:53:12
The current device is: /dev/sdd1
Block size in bytes: 4096
Filesystem size in blocks: 8718748407
**Phase 0 - Replay Journal Log
logredo failed (rc=-220). fsck continuing.
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
fsck.jfs doesn't do anything special to optimize I/O.
**Phase 2 - Count links
Incorrect link counts have been detected. Will correct.
**Phase 3 - Duplicate Block Rescan and Directory Connectedness
**Phase 4 - Report Problems
File system object DF3649600 is linked as:
/boxbackup/mail/sandon/Maildir/.Eastvale yahoogroup/cur
cannot repair the data format error(s) in this directory.
cannot repair DF3649600. Will release.
File system object DF3704486 is linked as:
/boxbackup/mail/sandon/Maildir/.saturation/cur
cannot repair the data format error(s) in this directory.
cannot repair DF3704486. Will release.
File system object DF3704736 is linked as:
/boxbackup/mail/sandon/Maildir/.saturation
**Phase 5 - Check Connectivity
**Phase 6 - Perform Approved Corrections
103120 files reconnected to /lost+found/.
**Phase 7 - Rebuild File/Directory Allocation Maps
**Phase 8 - Rebuild Disk Allocation Maps
**Phase 9 - Reformat File System Log
34874993628 kilobytes total disk space.
1890058 kilobytes in 651997 directories.
26331821630 kilobytes in 6731444 user files.
11924 kilobytes in extended attributes
9376504 kilobytes reserved for system use.
8535673628 kilobytes are available for use.
Filesystem is clean.
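As a quick sanity check on the numbers in this fsck report, here is a minimal Python sketch that recomputes the total size from the block count and block size, and estimates how full the volume is. The figures are copied from the output above; treating fsck.jfs "kilobytes" as 1024-byte units is an assumption, and the variable names are only illustrative.

```python
# Figures copied from the fsck.jfs output above.
BLOCK_SIZE_BYTES = 4096
FS_SIZE_BLOCKS = 8718748407
TOTAL_KB_REPORTED = 34874993628
AVAILABLE_KB_REPORTED = 8535673628

# Assumption: a "kilobyte" in the report is 1024 bytes.
total_kb = FS_SIZE_BLOCKS * BLOCK_SIZE_BYTES // 1024

# 1 TiB = 2**30 KiB.
total_tib = total_kb / 2**30

# Fraction of the volume in use, from the reported free space.
used_fraction = 1 - AVAILABLE_KB_REPORTED / TOTAL_KB_REPORTED

print("computed total matches report:", total_kb == TOTAL_KB_REPORTED)
print(f"{total_tib:.2f} TiB total, {used_fraction:.1%} used")
```

Under that assumption the computed total equals the reported "total disk space" figure exactly, and the result lines up with the description of a volume that is over 32 TiB and over 75% full.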
The three directories that went to lost+found weren't a big deal since
they were just backups. They are also huge directories with tens of
thousands of files in them.
Also I was kind of curious whether the JFS fsck uses libaio or some
other kind of asynchronous or multi-threaded I/O that speeds up I/O on
RAID arrays. The fsck took about 15 minutes, and the disk activity on my
array seemed much higher than with most single-threaded apps that do a
lot of random reads, although it could just be that a lot of my metadata
is arranged sequentially on the array.
Also very soon (less than a month) I will be building a 30x3TB (raid6)
array so 84TB (76.4 TiB) so I will get a chance to try jfs with >64TiB.
Since my current file-system which is over 75% full and over 32TiB is
working ok I don't suspect any problems.
I'm not sure if the problems above might be large file system related,
or not. It's possible that we might hit some new limit with a larger
filesystem, so I'd be interested if you have any more issues.
I do recall Tim mentioning that this did fix his problem but he had
smaller volumes (24TB) so larger than 16TiB smaller than 32TiB (not sure
if that matters or not).
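For reference, the capacity figures quoted for the planned array (30 x 3 TB in RAID 6, 90 TB raw, 84 TB usable, 76.4 TiB) can be reproduced with a small Python sketch. It assumes a single RAID 6 group, where two drives' worth of space goes to parity, and decimal terabytes for the drive size; the function names are hypothetical.

```python
def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    """Usable capacity of one RAID 6 group: two drives' worth is parity."""
    if drives < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drives - 2) * drive_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


raw_tb = 30 * 3                       # 90 TB raw
usable_tb = raid6_usable_tb(30, 3)    # 84 TB usable
usable_tib = tb_to_tib(usable_tb)     # about 76.4 TiB

print(raw_tb, usable_tb, round(usable_tib, 1))
```

The same conversion puts a 24 TB volume at a little under 22 TiB, which is why it falls between the 16 TiB and 32 TiB marks mentioned above.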
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________
Jfs-discussion mailing list
Jfs-discussion <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jfs-discussion