This is a discussion on Solaris 10 FTP Server performance within the comp.unix.solaris forums, part of the Solaris Operating System category.
| ||||
| We just upgraded a host in our environment from a Netra 1280 running Solaris 9 to a Netra 1290 running Solaris 10. (Both rigs house 64GB of RAM and contain 8x1500MHz UltraSPARC IV processors.)

The server receives and transmits hundreds of thousands of files per day, mostly via FTP. Many of these FTP transactions are single-file transactions, so there is a lot of TCP overhead from opening and closing an FTP session for every file (nothing I can do about the client side, that's just the way it works).

Since the upgrade, it seems we are experiencing quite a degradation in the performance of the FTP server, and I'm finding it hard not to be suspicious of the new OS. On the old Solaris 9 system we had the occasional blip, but we seemed to run okay. On the new system, the FTP clients fail over to the standby FTP server far too frequently. I have started to run in.ftpd in debugging mode during the non-peak hours to attempt to get a glimpse of what could be causing this. Here is one issue that sticks out:

Dec 23 02:09:07 ftp-server in.ftpd[12033]: [ID 927837 daemon.info] connect from client1
Dec 23 02:09:12 ftp-server in.ftpd[12082]: [ID 927837 daemon.info] connect from client2
Dec 23 02:09:12 ftp-server in.ftpd[12083]: [ID 927837 daemon.info] connect from client3
Dec 23 02:09:18 ftp-server in.ftpd[12091]: [ID 927837 daemon.info] connect from client4
Dec 23 02:09:29 ftp-server in.ftpd[12130]: [ID 927837 daemon.info] connect from client5
Dec 23 02:09:38 ftp-server ftpd[12091]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
Dec 23 02:09:38 ftp-server ftpd[12083]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
Dec 23 02:09:38 ftp-server ftpd[12082]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
Dec 23 02:09:38 ftp-server ftpd[12033]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
Dec 23 02:09:38 ftp-server ftpd[12130]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
Normally these connect/response lines are right on top of each other, but all too often in the debug logs, I see the above, where the FTP server just seems to go catatonic for 30 seconds or so before answering 5 or 6 clients in rapid succession. Note that the response time between the first client's request and the response is 29 seconds....enough to have the client grow impatient and fail over.

Can anyone help explain why we would see the above behaviour on in.ftpd running on Solaris 10, but never on Solaris 9? I don't expect there to be a magic bullet in /etc/system, but if anyone has suggestions on kernel tunables that might maximize this server's performance, I would welcome them.

Please note the following, however:
- I cannot control the client's timeout setting, retry rate, or any other behaviour.
- I cannot migrate to another FTP server such as proftpd.

Thanks for any information you can provide.

--
Fred |
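The stall described above can be measured rather than eyeballed. The following is a minimal sketch, assuming the debug log keeps the two line shapes shown in the excerpt ("connect from ..." and "<--- 220 ..."); the function name `greeting_delays` and the `year` parameter are illustrative only, and because syslog timestamps carry no year, a run that crosses New Year would need extra handling.

```python
import re
from datetime import datetime

# Matches the two kinds of line in the excerpt:
#   "... in.ftpd[PID]: ... connect from HOST"  and  "... ftpd[PID]: ... <--- 220 ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?:in\.)?ftpd\[(?P<pid>\d+)\]:.*?"
    r"(?:connect from (?P<client>\S+)|<--- 220 )"
)


def greeting_delays(lines, year=2000):
    """Return {pid: (client, seconds from 'connect from' to the 220 greeting)}.

    `year` only supplies the year that syslog omits (an assumption of this sketch).
    """
    connects = {}
    delays = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # unrelated log line
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        pid = int(m.group("pid"))
        if m.group("client"):
            # Connection accepted: remember when and from whom.
            connects[pid] = (m.group("client"), ts)
        elif pid in connects:
            # Greeting sent by the same process: compute the wait.
            client, start = connects.pop(pid)
            delays[pid] = (client, (ts - start).total_seconds())
    return delays
```

Passing an open log file (whichever file syslog.conf routes daemon.debug to) to `greeting_delays` gives one delay per in.ftpd process, which makes it easy to list the connections that waited longer than the clients' failover timeout.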
| |||
| Fred Chagnon <fchagnon@gmail.com> wrote:
> answering 5 or 6 clients in rapid succession. Note that the response
> time between the first client's request and the response is 29
> seconds....enough to have the client grow impatient and fail over.

Can you catch one of these with truss and see what it's waiting on? At first blush, I'd wonder if it was hanging on DNS resolution or similar.

--
Brandon Hume - hume -> BOFH.Ca, http://WWW.BOFH.Ca/ |
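Attaching truss to a stalled in.ftpd is the direct way to see what it is blocked on. As a complement, the name-resolution hypothesis can be probed from the server itself by timing reverse lookups of client addresses. This is only a sketch: it assumes the daemon performs a reverse lookup per connection (suggested by the host names in the "connect from" lines, but not confirmed in the thread), and `time_reverse_lookup`, `slow_lookups`, and the one-second threshold are hypothetical names and values.

```python
import socket
import time


def time_reverse_lookup(addr, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for one reverse lookup.

    `resolver` defaults to the system resolver path; it is a parameter so
    the timing logic can be exercised without touching the network.
    """
    start = time.monotonic()
    try:
        name = resolver(addr)[0]
    except OSError:  # socket.herror / socket.gaierror are OSError subclasses
        name = None
    return name, time.monotonic() - start


def slow_lookups(addrs, threshold=1.0, resolver=socket.gethostbyaddr):
    """Return [(addr, hostname, seconds)] for lookups slower than `threshold`."""
    slow = []
    for addr in addrs:
        name, elapsed = time_reverse_lookup(addr, resolver)
        if elapsed > threshold:
            slow.append((addr, name, elapsed))
    return slow
```

Running `slow_lookups` over the client addresses during a quiet period and again during a stall would show whether lookups through the system's name service configuration are taking seconds at the moments in.ftpd goes quiet.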
| |||
On Dec 23, 4:32 pm, Fred Chagnon <fchag...@gmail.com> wrote:
> We just upgraded a host in our environment from a Netra 1280 running
> Solaris 9 to a Netra 1290 running Solaris 10. (Both rigs house 64GB of
> RAM and contain 8x1500MHz UltraSPARC IV processors.)
>
> The server receives and transmits hundreds of thousands of files per
> day, mostly via FTP. Many of these FTP transactions are single-file
> transactions, so there is a lot of TCP overhead from opening and
> closing an FTP session for every file (nothing I can do about
> the client side, that's just the way it works).
>
> Since the upgrade, it seems we are experiencing quite a degradation in
> the performance of the FTP server, and I'm finding it hard not to be
> suspicious of the new OS. On the old Solaris 9 system we had the
> occasional blip, but we seemed to run okay. On the new system, the FTP
> clients fail over to the standby FTP server far too frequently. I have
> started to run in.ftpd in debugging mode during the non-peak hours to
> attempt to get a glimpse of what could be causing this. Here is one
> issue that sticks out:
>
> Dec 23 02:09:07 ftp-server in.ftpd[12033]: [ID 927837 daemon.info] connect from client1
> Dec 23 02:09:12 ftp-server in.ftpd[12082]: [ID 927837 daemon.info] connect from client2
> Dec 23 02:09:12 ftp-server in.ftpd[12083]: [ID 927837 daemon.info] connect from client3
> Dec 23 02:09:18 ftp-server in.ftpd[12091]: [ID 927837 daemon.info] connect from client4
> Dec 23 02:09:29 ftp-server in.ftpd[12130]: [ID 927837 daemon.info] connect from client5
> Dec 23 02:09:38 ftp-server ftpd[12091]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
> Dec 23 02:09:38 ftp-server ftpd[12083]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
> Dec 23 02:09:38 ftp-server ftpd[12082]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
> Dec 23 02:09:38 ftp-server ftpd[12033]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
> Dec 23 02:09:38 ftp-server ftpd[12130]: [ID 612163 daemon.debug] <--- 220 ftp-server FTP server ready.
>
> Normally these connect/response lines are right on top of each other,
> but all too often in the debug logs, I see the above, where the FTP
> server just seems to go catatonic for 30 seconds or so before
> answering 5 or 6 clients in rapid succession. Note that the response
> time between the first client's request and the response is 29
> seconds....enough to have the client grow impatient and fail over.
>
> Can anyone help explain why we would see the above behaviour on
> in.ftpd running on Solaris 10, but never on Solaris 9? I don't expect
> there to be a magic bullet in /etc/system, but if anyone has
> suggestions on kernel tunables that might maximize this server's
> performance, I would welcome them.

Run netstat -s and look for errors. If there are any, I would check whether the interface is actually running in full-duplex mode, as you probably assume it is.

It might be a DNS issue, but I doubt it. |
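Scanning `netstat -s` output by eye is tedious, so here is a rough sketch that pulls out the non-zero counters whose names look like error or drop counters. It assumes the Solaris layout of `name = value` pairs, several per line; the substring list in `SUSPECT` is a guess at what counts as "errors", not an official classification, and `nonzero_error_counters` is a hypothetical helper name.

```python
import re

# Solaris `netstat -s` prints counters as "name = value", several per line.
PAIR_RE = re.compile(r"(\w+)\s*=\s*(\d+)")

# Substrings treated as "error-like" in this sketch (an assumption).
SUSPECT = ("Err", "Drop", "Fail", "Overflow")


def nonzero_error_counters(netstat_output):
    """Return {counter_name: value} for non-zero, error-like counters.

    If the same name appears more than once, the last value wins.
    """
    counters = {name: int(value) for name, value in PAIR_RE.findall(netstat_output)}
    return {
        name: value
        for name, value in counters.items()
        if value and any(tag in name for tag in SUSPECT)
    }
```

For a server that accepts many short-lived connections, a non-zero and growing tcpListenDrop would be worth a look, since it counts connections dropped because the listen queue was full. Comparing two snapshots taken a few minutes apart is more telling than a single reading, because the counters are cumulative since boot.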
| |||
> Can you catch one of these with truss and see what it's waiting on? At
> first blush, I'd wonder if it was hanging on DNS resolution or similar.

I suspected this at first, but when I snoop the network interface I don't see very many DNS resolution requests. Not near enough to match the number of transactions anyway. Furthermore, I don't understand why the change in OS would have introduced a DNS resolution bottleneck. Is it possible to configure in.ftpd to not need this?

Fred |
| |||
| Fred Chagnon wrote:
>> Can you catch one of these with truss and see what it's waiting on? At
>> first blush, I'd wonder if it was hanging on DNS resolution or similar.
>
> I suspected this at first, but when I snoop the network interface I
> don't see very many DNS resolution requests. Not near enough to match
> the number of transactions anyway. Furthermore, I don't understand why
> the change in OS would have introduced a DNS resolution bottleneck. Is
> it possible to configure in.ftpd to not need this?
>
> Fred

Your DNS resolver may not transmit a request for EACH name to be resolved. If there is caching going on, the resolver may "remember" the answer rather than having to go to a server for it. |
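The point about caching can be illustrated with a toy model. The class below is not how any real resolver or cache daemon is implemented; `CachingResolver`, its `upstream` callable, and the absence of any expiry are simplifications made for this sketch. It shows only why the number of queries seen on the wire can be far smaller than the number of lookups the application performs.

```python
class CachingResolver:
    """Toy resolver that asks `upstream` only on a cache miss.

    Real caches expire entries after a TTL; that is left out here.
    """

    def __init__(self, upstream):
        self._upstream = upstream   # callable: name -> answer
        self._cache = {}
        self.upstream_queries = 0   # what a packet capture would see

    def resolve(self, name):
        if name not in self._cache:
            # Cache miss: this is the only case that generates network traffic.
            self.upstream_queries += 1
            self._cache[name] = self._upstream(name)
        return self._cache[name]
```

With five distinct clients making a thousand connections between them, this model sends five upstream queries. A low query count in a snoop trace therefore does not by itself show that lookups are absent, or that each one returns quickly.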
| ||||
> Your DNS resolver may not transmit a request for EACH name to be
> resolved. If there is caching going on, the resolver may "remember" the
> answer rather than having to go to a server for it.

Hmm. So if my issue is indeed related to DNS resolution, is it a performance problem within the name service cache daemon (nscd)? Has anyone experienced issues with this service when moving to Solaris 10? I've been using Solaris 10 for a couple of years now and haven't ever taken a hit like this.

Fred |