Re: GlusterFS performance with small files.
Jerker Nyberg <jerker@...
2012-04-05 09:23:37 GMT
I only have basic knowledge (I am a system administrator, not a file system
developer), but this is how I understand the current situation:
You don't have to use a distributed parallel cluster file system (Lustre,
GlusterFS, Ceph, Panasas, FhGFS etc.); there are also shared disk file
systems to look into (OCFS2, GFS2 (Red Hat Global File System), StorNext
(known as Xsan on Mac) etc.). I have not really understood where GPFS fits
in; as far as I understand it is block based but can scale to many
servers, but I guess you do not need hundreds of backend servers. Some of
these require quite a lot of knowledge and time to set up correctly.
I have personally only run GlusterFS and Ceph, although Panasas is also
used at our university for HPC. We ran Xsan for several years for some Mac
servers (Podcast Producer and Apache/MySQL) with a Fibre Channel attached
Xraid, but I would not recommend that solution today.
At a hosting company almost ten years ago we split the users between
different backend storage NFS/MySQL servers and then put a couple of
front end servers (load balanced with LVS) in front of each backend,
running Postfix/Courier-IMAP/Apache/etc. It is a proven solution, although
not as scalable beyond a single machine as the modern cloud-inspired file
systems are... But none of those is quite as stable yet.
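A minimal sketch of how users could be assigned to backends in such a
setup, assuming a simple hash of the user name. How the mapping was really
done is not described above (a lookup table in a database would work just
as well), and the backend names here are hypothetical.

```python
import hashlib

# Hypothetical backend names; the real hostnames are not given in this thread.
BACKENDS = ["backend1", "backend2", "backend3"]


def backend_for(user, backends=BACKENDS):
    """Pick a backend deterministically from a hash of the user name."""
    digest = hashlib.sha1(user.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an integer index.
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(name, "->", backend_for(name))
```

Note that with plain modulo hashing, adding a backend moves most users to
a different backend, which is one reason an explicit lookup table may be
preferable in practice.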
Many SSDs can fit in a normal PC server nowadays. For mail using
Maildir, IOPS are usually more important than bandwidth anyway. Keep
your eyes open for FreeBSD/ZFS or illumos/ZFS too. ZFS still seems to be
several years ahead of anything native to Linux. ZFSonLinux.org is
stable on my backup server when I do not use deduplication.
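On testing small-file performance: a rough sketch, not a substitute for a
real benchmark tool, that creates, reads and deletes many small files in
a directory and reports operations per second. The function name, file
count and file size are assumptions chosen for illustration. Point the
directory at the mount under test (NFS, GlusterFS, local disk) and compare
the numbers. Reads will mostly come from the client cache unless the cache
is dropped between phases.

```python
import os
import tempfile
import time


def small_file_benchmark(directory, count=1000, size=4096):
    """Create, read back and delete `count` files of `size` bytes each.

    Returns a dict mapping each phase name to operations per second.
    """
    payload = b"x" * size
    paths = [os.path.join(directory, "msg%06d" % i) for i in range(count)]
    results = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force each file to storage, as a mail server would
    results["create"] = count / max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as f:
            f.read()
    results["read"] = count / max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    for path in paths:
        os.unlink(path)
    results["delete"] = count / max(time.perf_counter() - start, 1e-9)

    return results


if __name__ == "__main__":
    # Replace the temporary directory with a path on the file system under test.
    with tempfile.TemporaryDirectory() as d:
        for phase, rate in small_file_benchmark(d, count=200).items():
            print("%-6s %10.0f ops/s" % (phase, rate))
```

Running several copies in parallel from different clients gives a better
picture of how a mail workload would behave than a single process does.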
On Wed, 4 Apr 2012, David Whiteman wrote:
> Thanks for the reply. Changing to mbox is not really an option, we are stuck
> with MailDir format.
> All current cluster filesystems I've read into seem to have problems with
> small files.
> I guess the only alternative seems to be a DRBD setup, but this would limit
> me to 2 nodes only and was the reason I was looking into GlusterFS.
> Anyone know of any alternatives to GlusterFS that offer similar performance
> (with very small files) to NFS?
> On 03/04/12 17:40, Bryan Whitehead wrote:
>> A bunch of small files gives terrible performance. There is really not
>> much you can do about that. Store each mailbox in a single file. MailDir
>> format is definitely going to suck.
>> On Tue, Apr 3, 2012 at 3:05 AM, David Whiteman <davew@...> wrote:
>>> I am currently looking into GlusterFS to use as a storage cluster for our
>>> email storage. I want to mount the storage from different servers;
>>> services accessing the storage include exim, courier-imapd, courier-pop3d.
>>> Our emails are stored in MailDir format, which is many small files. I have
>>> read that GlusterFS doesn't perform very well with small files, is this
>>> still the case?
>>> I would like to achieve similar (or better) performance to our current NFS
>>> setup, with the added redundancy that GlusterFS provides.
>>> Are there any utilities I can use to test the performance?
>>> Thanks in Advance
>>> Gluster-users mailing list