

CentOS 5.3, vsftpd, high iowait, high LOAD :(




Posted by ignitionservers, 09-26-2009, 05:20 PM
Hi, this is the first time I've actually had to ask for help here; I've been searching the internet for a solution all day and have drawn a blank so far. First off, system info:

CentOS 5.3
Dual Xeon 3.0GHz
4GB RAM
2x1TB SATA drives
300Mbit capped GigE port

This system is to be used as an FTP server. So I set up partitions, moved a ton of data over, installed vsftpd, and ran it. Users started downloading, and BOOM! High iowait, high server load. =/ Every time vsftpd reaches about 700 concurrent users, the server load starts skyrocketing (mainly because of the 80-90% iowait). I thought it might be a disk issue, so I benchmarked both drives with hdparm; both gave me good results [speeds of 100+MB/s]. I thought it might be a kernel issue, so I tried these kernels:

CentOS (2.6.18-128.7.1.el5)
CentOS (2.6.18-164.el5)
CentOS (2.6.18-164.el5PAE)
CentOS (2.6.18-128.el5PAE)

No joy; all give the same results. Right now there are 167 vsftpd processes and everything seems to be running beautifully, but once the number of users goes up, as it will in the next 15 minutes, things slow down to a crawl.

[root@ID5961 vs]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             895G  647G  202G  77% /
/dev/sda1             289M   32M  243M  12% /boot
/dev/sdb1             917G  818G   53G  94% /mnt/new
tmpfs                 1.8G     0  1.8G   0% /dev/shm

I'm not sure what's causing the bottleneck here. If hdparm is giving good results, what is causing the IO bottleneck? Please help!

EDIT: Oh, by the way, I tried changing the IO scheduler as well [it's cfq by default]; I tried every one of them and got worse results on each of the others. Oh yes, and just in case:

[root@ID5961 vs]# uname -a
Linux ID5961.choopa.com 2.6.18-128.7.1.el5 #1 SMP Mon Aug 24 08:20:55 EDT 2009 i686 i686 i386 GNU/Linux
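For reference, on these kernels the scheduler can be inspected and switched per disk at runtime through sysfs; a minimal sketch of how that is typically done (device names taken from the df output above):

# show the available schedulers; the bracketed entry is the active one
cat /sys/block/sda/queue/scheduler
# noop anticipatory deadline [cfq]

# switch a disk to, e.g., the deadline scheduler without rebooting
echo deadline > /sys/block/sda/queue/scheduler
echo deadline > /sys/block/sdb/queue/scheduler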

Posted by DATARTIM, 09-26-2009, 06:01 PM
I have only glanced at this, but it could just be a genuine I/O issue. Are the users just downloading, or downloading and uploading? (i.e. read vs. read/write)

Posted by ignitionservers, 09-26-2009, 06:09 PM
The users are only downloading. By "genuine IO issue", do you mean a hardware limitation? hdparm gives more or less the same numbers for /dev/sda* as well. Note that I've upgraded to an experimental kernel: Linux 2.6.24.7-65.el5rt.centosvanilla. Things do look a bit better on it, except that I have a constant error coming up in dmesg... =/ What I don't get is: if the disk can be read at 100MB/s, why would it start locking up at an access rate of 10MB/s? My MRTG confirms this, since the transfer rate is under 200Mbps right now. Any ideas? Some more info now that the load is already higher: we're at about 205 vsftpd processes. =/
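For context, the hdparm benchmark referenced here measures sequential throughput only, which is why its numbers can look fine while the server struggles; a sketch of the usual invocation (device names assumed from the posts above):

hdparm -tT /dev/sda
hdparm -tT /dev/sdb
# -T measures cached reads (effectively memory bandwidth)
# -t measures buffered sequential disk reads
# neither reflects random-access performance, which is what hundreds of
# concurrent downloaders actually generate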

Posted by ignitionservers, 09-26-2009, 06:22 PM
Oh, and something worth mentioning here: when I was transferring data TO this server [that is, downloading to it, using wget], it would regularly hit transfer rates of up to 330Mbps without any high iowait issues or load problems. How come it hits a bottleneck while UPLOADING? (Granted, the number of open files must be MUCH larger, but the transfer rates are still comparable.)

Posted by MaB, 09-26-2009, 06:23 PM
You may have a genuine IO bottleneck. Just because the throughput (MB/s) of your drives is good doesn't mean that you have enough IOPS to support that many users. Are the users all downloading different files, or are they downloading the same set of files? How big is the dataset? If it's relatively small, you can add enough RAM to the server so that the files are cached instead of read from the HD. If it's a lot of different files and the dataset is very large, you need more hard drives to serve up more IOPS. (A rough back-of-envelope estimate is sketched below.)
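As a rough sketch of why IOPS, not MB/s, is the limit here (typical 7200rpm figures assumed, not measured on this box):

# a 7200rpm SATA drive: ~8.5ms average seek + ~4.2ms rotational latency
#   => ~12.7ms per random read => roughly 75-100 random IOPS per drive
# two independent drives => ~150-200 random IOPS total
# 700 readers scattered over a 1.5TB set defeat the page cache, so almost
# every request costs a seek:
#   200 IOPS x 128KB per read ~= 25MB/s ~= 200Mbps aggregate
# which matches the observed plateau despite 100+MB/s sequential speeds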

Posted by ignitionservers, 09-26-2009, 06:31 PM
They're all different files (some the same, of course), but the dataset is about 1.5TB, mostly different stuff. So maybe these SATA drives aren't good enough to pump through this much data? What kind of setup would be able to handle this?

Posted by UNIXy, 09-26-2009, 06:44 PM
Use sdparm (SATA) instead of hdparm (IDE). Too bad you've already defined a fixed set of partitions; having RAID-1 would increase your read throughput significantly. Also, the numbers hdparm gave you (100MB/s buffered) are not sustained, so most likely the sustained throughput is 1/5 of that. Get 2-3 1U boxes with 4x250GB RAID-10 disks instead. Regards

Posted by geedeedee, 09-26-2009, 08:53 PM
I've had the same issue with vsftpd: high IO load. After switching to Pure-FTPd, the IO load was very low even when pushing a lot more traffic than I did over vsftpd. I believe there's an issue with vsftpd and CentOS working together under medium/heavy operation; heck, even under light operation a clear difference can be seen. I couldn't tell you exactly what's causing it, but something fishy is going on for sure...

Posted by ignitionservers, 09-27-2009, 01:50 AM
Thanks for the tip. I'm going to try out PureFTPd. *Fingers crossed*

Posted by ignitionservers, 09-27-2009, 04:00 AM
Just reporting back. I've installed Pure-FTPd and it's running right now. So far, the results compared to vsftpd are AMAZING! I still can't say anything for sure because we're not at peak time right now, but with 350 users the load is pretty reasonable, considering that vsftpd would spike to about 5.0 with this many users. Let's see how this thing scales; I'm hoping it scales well. I had already given up hope on this box. Thanks geedeedee! I agree, things seem very fishy with how differently these two behave. Just as a matter of curiosity [and I actually tried searching for this elsewhere as well]: how do I rehash Pure-FTPd? I read the docs and couldn't find anything on it. Is it not possible to change parameters while it's running? I'm running it standalone as a daemon, started from the command line. I want to change the number of concurrent sessions per IP to 10, that is, put -C 10 in there, but I don't want to shut down the server and start it again. Would this be possible?

Posted by geedeedee, 09-27-2009, 09:36 AM
You just need to kill the pure-ftpd (SERVER) process; that will still leave all the download/idle/etc. connections open without closing them. Simply start it back up after that, and the changes will take effect for new connections (see the sketch below). One drawback I've noticed with Pure-FTPd is that the further away a user is from your server, the more of a download speed decrease they will see compared to vsftpd. vsftpd in some cases will be significantly faster; for example, if someone was downloading from California and your server was in New York, there would most likely be a noticeable difference in download speed. This holds especially true for people downloading from another country. You can test this for yourself and decide, but I'd rather deal with that than the strange IO issues vsftpd seems to bring on. Last edited by geedeedee; 09-27-2009 at 09:42 AM.
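A minimal sketch of that restart, assuming a standalone daemon launched from the command line (the binary path, pidfile location, and -c limit are assumptions, since the original command wasn't shown; -B, -c, and -C are documented Pure-FTPd flags):

# kill only the master listener; the forked per-connection children
# keep serving their transfers
kill $(cat /var/run/pure-ftpd.pid)

# start it again with the new per-IP limit
/usr/local/sbin/pure-ftpd -B -c 1000 -C 10
# -B = daemonize, -c = max concurrent clients, -C = max sessions per IP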

Posted by ignitionservers, 09-27-2009, 01:05 PM
I already did notice that. It's true that vsftpd was able to push about 5MB/s to another server of mine, but Pure-FTPd was only able to push about 3.5-4MB/s.

Unfortunately, I have some pretty bad news. Pure-FTPd did a very good job of handling connections up until the 500 mark, but after we crossed 500 today, the server load started going up dramatically. Surprisingly, the server is still quite responsive and quick. However, I'm sure that a solid IO bottleneck has been hit, because the server is only using about 200Mbps, yet transfers which were earlier going at 2MB/s are now going at 100KB/s or so. Since the network is not saturated, I can only assume that the disk is HEAVILY saturated [as is obvious from the 95.7% iowait].

Anyhow, I'm still looking for advice on ANYTHING that can be done to fix/improve this situation. Other than that, does anyone here have any idea of what kind of machine would be able to saturate this connection [300Mbps] comfortably? We have static data, served over FTP: some ISO images and other smaller files, more or less like any Linux distro. So there are files which are 650MB or 700MB in size, then there are small files of 15MB, 20MB, etc., and some DVD ISOs as well, which are about 2-3GB in size. All in all, the dataset is about 1.5TB and I'm looking at about 1000 concurrent sessions, tops! 300Mbps is my bandwidth ceiling. Can anyone recommend a hardware config that could do this? Would a newer machine with SATAII be comfortable with this kind of requirement? Also, a big thank-you to all those who've taken the trouble to help me out here so far.
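One way to confirm that the disks, and not the network or CPU, are the wall (a sketch; iostat -x ships in the sysstat package on CentOS 5):

iostat -dxk 5
# per device, watch the last columns of each 5-second sample:
#   %util near 100 means the disk is saturated
#   await much larger than svctm means requests are queuing behind seeks
#   r/s shows the actual random read IOPS being served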

Posted by DATARTIM, 09-27-2009, 01:17 PM
How old is this box, if the drives aren't SATAII? I'd go for 4 x 1TB in RAID 10; that would certainly improve performance hugely. The step up from there would be 15k drives.

Posted by ignitionservers, 09-27-2009, 03:06 PM
It's an old 32-bit Dual Xeon, kinda dated I guess. 2x1TB should really be fine for me, since I just need 2TB of space. Or do 4 drives have some advantage that I'm not aware of? My understanding is that the data is spread over 4 drives, so I gain speed because there are more spindles running over less data? What about the CPU, etc.? Would just a drive change be enough? Last edited by ignitionservers; 09-27-2009 at 03:09 PM.

Posted by geedeedee, 09-27-2009, 05:18 PM
What the poster above suggested is good: RAID 10 uses 50% of each disk's capacity for mirroring, so you'd end up with close to 2TB usable, a bit less after formatting but fairly close. That should help a lot with your high IO load, and you'll have some redundancy. You might also want to look into using XFS for your larger files (500MB+); it handles large files very well compared to other filesystems (a setup sketch follows below). I would keep the files smaller than 500MB on ext3, which is probably what you use for everything now.
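A minimal sketch of an XFS setup on CentOS 5, assuming the data lives on /dev/sdb1 as in the earlier df output (the kmod-xfs package name is from the CentOS extras repository; double-check it for your exact kernel):

# XFS isn't built into the stock CentOS 5 kernel; install module and tools
yum install kmod-xfs xfsprogs

# format the data partition (DESTROYS existing data, back up first)
mkfs.xfs -f /dev/sdb1

# mount with noatime so heavy reads don't generate extra inode writes
mount -o noatime /dev/sdb1 /mnt/new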

Posted by ignitionservers, 09-28-2009, 02:25 AM
I spoke to the techs at my host and they said more or less the same thing: I could either go for SAS or for new drives with RAID. However, they also suggested that we could try RAID over the two existing disks and see if there's a performance improvement. I don't think that would give me enough of an IO bandwidth boost to handle DOUBLE the load it handles right now. So at this point, should I go for SAS, or for 4x1TB in RAID10 as suggested earlier? [I have a feeling SAS would be expensive?]

Posted by ZenMonk, 09-28-2009, 04:23 AM
Go for SAS if it's affordable and the disk transfer is at least 1.5 times faster than SATA2. Otherwise, I would suggest what DATARTIM has already suggested: go for a RAID10 setup with 15k rpm SATA2 drives.

Posted by ignitionservers, 09-28-2009, 07:03 AM
I do think SAS might work out to be quite expensive, and this is a not-for-profit thing, so it's kind of tough to work out all the figures. I wonder how much 15k SATA2 would set me back... I'll look around.

Posted by ignitionservers, 09-28-2009, 08:54 AM
I looked around a bit, but I can't find options for 15k SATA2; those are only available as SAS drives, and they don't have much space and are rather expensive. My host said that the chassis for this server can only take two hard drives, so we'll have to switch to a different box anyway.

Right now the hard disks, which are SATA1 @ 7200rpm, top out at about 10MB/s according to iostat -dk; they never go much higher than that. Since I need to push 300Mbps, which is about 36MB/s, I guess the best way to go would be 4x1TB drives in either RAID0 or RAID1+0, right? Since I only need 2TB of space, RAID1+0 should be fine for me, with of course the added benefit of mirroring.

Also, am I correct in assuming that RAID0 (or even RAID1+0) would give me four times the speed of, say, a single 4TB drive? Basically my question is: for a 4x1TB RAID1+0 configuration, even though the available drive space is 2TB, the total read speed is 4x the speed of a single drive, right? [I'm guessing this might be so because read operations can be served from the mirror disks as well?] So even at 8MB/s per drive, 4 disks should easily be able to push out 36MB/s without a problem?
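If the new box ends up using Linux software RAID rather than a hardware controller, a minimal sketch of the array in question (device names assumed; mdadm is the standard tool):

# 4x1TB RAID10: 2TB usable, data striped across two mirrored pairs
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1
mkfs.ext3 /dev/md0

# on reads, either half of each mirror can serve a request, so random
# read IOPS scale close to 4x a single drive; single-stream sequential
# reads and all writes scale closer to 2x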

Posted by DATARTIM, 09-28-2009, 08:55 AM
There's no 15k SATA, just 15k SAS. Since it's not for profit, I would go for 4 x 1TB SATAII in RAID 10; you should see a marked improvement. Sure, 15k drives would yield a larger improvement, but to get the same usable space you'd have to go for nearer 8 of them, and cost-wise this would be very high. Are you renting or buying the boxes?

Posted by ignitionservers, 09-28-2009, 09:08 AM
I'm renting this server. I think 4x1TB SATA2 should do the job; I've sent in a request asking my host for a quote, so let's see what he says. The chassis would need to be changed, so it's basically a whole new server. I don't really care about the CPU config and the rest: take the same system, swap out the 2x1TB SATA and put in 4x1TB SATAII, even if we keep the same config just in a different box. Any idea what a fair price difference on this would be?

Posted by DATARTIM, 09-28-2009, 09:53 AM
Well, put it this way: for a box in the US with a quad-core Xeon, 4GB RAM, and 4 x 1TB HDDs in RAID 10, I'd expect to pay $250-$300 or so. So yours should be around there.

Posted by ignitionservers, 09-28-2009, 09:58 AM
Actually, this box is a Dual Xeon with HT, I think; it's not really a quad-core Xeon. I'm more concerned about the price difference, since I had already budgeted for what I'm paying here. Is it reasonable to assume that my host would charge me $25/month per 1TB SATA drive? Put together that's a $50 overhead, and then I don't know how much the RAID controller would cost, but I've seen folks charging like $20/mo. for that. So is a $70 overhead a reasonable estimate?

Posted by DATARTIM, 09-28-2009, 10:14 AM
I'd say $75-$100 would be about right. Given that they have to change the chassis, you should check that you can't get this new server cheaper elsewhere.

Posted by ignitionservers, 09-28-2009, 10:27 AM
Actually, the reason I moved to them [I had 3 separate servers in the EU earlier] was that they gave me an extremely good deal on bandwidth, and since most of my users are from the US, this is better for me than hosting in the EU. So I'd like to stay with them as far as possible; let's hope they give me a good quote on this. Luckily, I still haven't cancelled my EU servers and they stay with me for another 3-4 days, so at least I still have a backup.

Posted by ignitionservers, 09-28-2009, 06:55 PM
My host is building me a new system. Any idea whether he should use a Core2Duo or a Core2Quad? Basically my question is: for a 1000-user FTP server, how much CPU power is really required? He asked me how much memory to put in, and I said just 2GB. Why? Because Pure-FTPd and vsftpd use about 50MB of memory with 500 users on the FTP, which is pretty good. So that's roughly 100MB for 1000 users, and even if it's a bit more, 2GB should be more than enough! I just don't know how to find out whether they use a lot of CPU time or not. [Is top reliable for this?] Are there other ways to check how much CPU time these servers are using?
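One way to answer the CPU question is to total the accumulated CPU time across the daemon's processes; a sketch using standard ps (the process name is assumed, the awk sum assumes under a day of accumulated CPU time per process, and the same works for vsftpd):

# per-process: accumulated CPU time, current CPU%, resident memory
ps -C pure-ftpd -o pid,time,pcpu,rss

# total CPU seconds consumed by all pure-ftpd processes
ps -C pure-ftpd -o time= | awk -F: '{ s += $1*3600 + $2*60 + $3 } END { print s, "CPU-seconds total" }'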


