

4 Proc E7-4850 V2 - Weird Performance




Posted by FindersCheapers, 07-07-2016, 08:17 PM
I've never used the Intel E7 CPUs before. For the last week I've been running performance tests on a 4-proc E7-4850 V2 (2.3 GHz) machine with 256 GB RAM and RAID 0 across 3 x 800 GB SSDs. The motherboard is a Supermicro X10QBi. I'm quite perplexed by this setup. On one hand, it appears to be blazingly fast on things like hashtable lookups. It's so fast that I'm amazed. Say I have 90 threads going on the 96 logical processors, looping and doing hashtable lookups. It's almost magical how fast the whole thing is! On the other hand, it seems like a bit of a slow pig when dealing with lots of context switches between threads. This server has 96 logical processors. Say I had 300 threads running. I feel like the context switching slows this server way down compared to, say, the E5-2690 V2 line of processors. What has been everyone's experience with E7s? How do you use them? What do they excel at? I'm currently feeling as though I have a lot of logical cores that I can't figure out how to fully utilize. It's kind of frustrating. Maybe I just don't have enough RAM....
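
The 90-threads-versus-300-threads comparison above comes down to how the worker count relates to the number of logical processors. Here is a minimal sketch of that sizing idea. It is written in Python purely for illustration (the poster's software is C#), and `pick_worker_count`, `lookup_batch`, `run`, and the `reserve=6` default are hypothetical names and values chosen to mirror the "90 threads on 96 logical processors" case. Note that CPython threads do not run bytecode in parallel, so this shows only the sizing logic, not real throughput.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pick_worker_count(logical_cpus, reserve=6):
    """Return a worker count that leaves `reserve` logical CPUs free.

    `reserve` is an assumed knob, not something taken from the thread.
    """
    return max(1, logical_cpus - reserve)


def lookup_batch(table, keys):
    """Stand-in for the hashtable-lookup loop: count keys found in the table."""
    return sum(1 for k in keys if k in table)


def run(table, batches, workers):
    """Process all batches on a pool capped at `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: lookup_batch(table, batch), batches))


if __name__ == "__main__":
    workers = pick_worker_count(os.cpu_count() or 1)
    table = {i: i * i for i in range(1000)}
    batches = [range(start, start + 100) for start in range(0, 2000, 100)]
    hits = run(table, batches, workers)
    print("workers:", workers, "total hits:", sum(hits))
```

Keeping the pool at or below the logical processor count means each runnable thread can own a processor. With 300 runnable threads on 96 logical processors the scheduler has to time-slice, which is where the context-switch cost shows up.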

Posted by (Stephen), 07-07-2016, 10:12 PM
I know many apps in the past were 'multithreaded' and multi-core ready but were not really optimized for 'megathreading'. Even with the ability to spawn a lot of threads, they have a large delay in switching between them and doling out the tasks. Sounds like that could still be the case.

Posted by Mister Bark, 07-09-2016, 02:08 PM
When you deal with a machine like this, using a default kernel and binaries provided by a distribution is out of the question IMO. Did you compile everything yourself from source? This is the way to go if you want to benefit from full performance. Use the right CFLAGS and kernel options, but of course that's gonna take you a while because there's always a lot of research for each option...
Personally, for each server I install, I compile myself from source:
- gcc (twice so it compiles itself)
- kernel
- libc
- all important binaries and libs I will use the most (e.g. ssl, gzip, etc. depending on the application)
The number one optimization is probably the native tuning in the CFLAGS. Most of the time I used these:
CFLAGS='-mtune=native -march=native -O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize'
But in your case there's gonna be plenty of options for multi-threading and also scheduling in the kernel, don't forget that one. And then I would say that for this approach, the best distribution is Slackware because you get a real, pure Linux, not a mess like Ubuntu, for example.

Posted by FindersCheapers, 07-09-2016, 02:30 PM
Mister Bark, you are bringing up some excellent points. This is software written by me in C# and is running on Windows Server 2012 R2. In the build options, there is an option to target x64 CPUs instead of doing the "Any CPU" thing. I should try that and see if it is any better. I'm obviously a slave to the whims of Windows and .NET in this scenario. I have had very good luck though with E5-2690s and am able to peg all the logical processors at close to 100%, so I know it can be done. I'm just a bit perplexed by these E7s. I did spend some time studying the recommended memory configs for this motherboard and I think instead of 256 GB of RAM I should be running 380GB. I'll put that on my todo list as well.

Posted by Mister Bark, 07-09-2016, 02:36 PM
Hmm... Maybe if you were running on Linux you would need half the memory and get double the performance? Why Windows? Is it really something you cannot do on Linux? I cannot think of any large company out there that runs on Windows servers, not even Microsoft (Bing, for example).

Posted by FindersCheapers, 07-09-2016, 02:45 PM
I would, I agree. But I've got 23 years' experience coding on Windows (C -> C++ -> C#). I find C# to be very productive. I could try Java on Linux or I could try C# with Mono on Linux, but honestly, I've got 11 years' worth of C# code and it's cheaper to just throw more hardware at a problem than to convert or migrate code, at least for now. I can see an inflection point at some time in the future, if my hosting costs move beyond, say, $500K a year...

Posted by Mister Bark, 07-09-2016, 02:54 PM
I see. I don't know how much learning it takes to make it portable, so I can't tell. But I believe it might be worth taking the leap and stopping the procrastination! Haha. I mean, if it takes 6 months to figure it out, isn't it worth it to stop dealing with Windows issues? I remember the days when, at 15 years old, I connected the parallel port of my desktop to its reset switch so I could go to school while the websites were being hosted, and a VB program of mine would reset the machine from time to time so the system wouldn't freeze... And one day I switched to Linux. That was a bit painful at the beginning but exciting at the same time, and I really don't regret it! Now, you are at a whole other scale obviously, but it's still the same concept. It's long term vs. short term, and there's always a time when it's time to invest a bit and reach the next level. Growth is always exciting, that's the good news.

Posted by alanwoo, 07-09-2016, 11:34 PM
Facebook recently moved from dual-processor servers to single-processor servers to get better performance and remove NUMA. This may be similar to your context switching case, as more processors can add more overhead than performance in certain areas.

Posted by FindersCheapers, 07-10-2016, 12:35 AM
I looked that up right after you mentioned it. Facebook's blog post on the transition is very interesting reading. If anyone else wants to read it: https://code.facebook.com/posts/1711...king-up-power/ They did seem to have a power consumption problem that was one of the big drivers for the transition. If I interpreted the blog post correctly, it was an instance of "If we keep adding more high-power servers, we are going to need a new source of electricity" or maybe "our electric bill will eat too much of our profits" (or maybe both)... I'm glad you mentioned NUMA. I have been studying the NUMA performance of my server and it is not symmetric, e.g. two of the four NUMA nodes have almost all of the activity. I asked my webhost to verify that the memory was interleaved properly and they took the server offline and said it was configured correctly. I did worry a bit that they don't know or care about the optimal memory settings. I did give them specific instructions, e.g. where the DIMMs should be located, where the memory cards should be placed, etc., but did they even read my suggestions? I have no idea...
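
One way to put a number on "two of the four NUMA nodes have almost all of the activity" is to compare each node's share of activity against an even split. The sketch below assumes per-node activity counters have already been collected somehow (for example from performance counters); the sample figures in it are made up for illustration and are not measurements from the server in this thread. `node_shares` and `imbalance_ratio` are hypothetical helper names.

```python
def node_shares(activity_by_node):
    """Return each NUMA node's fraction of the total observed activity."""
    total = sum(activity_by_node.values())
    if total == 0:
        return {node: 0.0 for node in activity_by_node}
    return {node: count / total for node, count in activity_by_node.items()}


def imbalance_ratio(activity_by_node):
    """Busiest node's share divided by the share it would have under an even split.

    1.0 means perfectly balanced; with four nodes the worst case is 4.0.
    """
    if not activity_by_node:
        raise ValueError("need at least one NUMA node")
    shares = node_shares(activity_by_node)
    even_share = 1.0 / len(shares)
    return max(shares.values()) / even_share


if __name__ == "__main__":
    # Made-up counters for four nodes, with two nodes doing most of the work.
    sample = {0: 450, 1: 450, 2: 50, 3: 50}
    for node, share in sorted(node_shares(sample).items()):
        print(f"node {node}: {share:.0%}")
    print(f"imbalance ratio: {imbalance_ratio(sample):.2f}")
```

A ratio that stays well above 1.0 under steady load would support the suspicion that memory placement or thread placement is skewed toward a subset of the sockets.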

Posted by ReliableSite, 07-10-2016, 11:02 AM
C# doesn't multi-thread automatically. The code would need to account for the available processing power and create threads itself. In many cases, depending on your requirements, the processing may not actually need more than a few threads and actually benefits from fewer cores at faster speeds. Based on your performance testing, the latter seems to be the case.

Posted by FindersCheapers, 07-10-2016, 11:59 AM
The code is massively parallel with lots of "dials" to adjust the number of threads. Bottlenecks are dealt with via thread-safe queues and dedicated work distribution threads. For the most part I try to have no blocking areas for working child threads, although IO is queued and cached in memory with a dedicated write thread. I've been doing this type of big data processing for over 10 years. I've just never used a quad-processor machine like this before. This machine has the largest number of logical cores I've ever dealt with and also the slowest processor speed I've ever used.
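
The layout described above (worker threads fed from a thread-safe queue, with output handed to a single dedicated write thread) can be sketched roughly as follows. This is not the poster's code, which is C# and not shown in the thread; it is a Python illustration in which `process`, `worker`, `writer_loop`, the squaring "work", and the in-memory `sink` list standing in for disk IO are all assumptions.

```python
import queue
import threading

_STOP = object()  # sentinel telling a thread to shut down


def writer_loop(out_queue, sink):
    """Dedicated writer: the only thread that touches the sink."""
    while True:
        item = out_queue.get()
        if item is _STOP:
            break
        sink.append(item)  # stand-in for a buffered disk write


def worker(task_queue, out_queue):
    """Pull tasks, do the work, and hand results to the writer via a queue."""
    while True:
        task = task_queue.get()
        if task is _STOP:
            break
        out_queue.put(task * task)  # stand-in for the real computation


def process(tasks, num_workers=4):
    """Run all tasks through `num_workers` workers and one writer thread."""
    task_queue = queue.Queue()
    out_queue = queue.Queue()
    sink = []
    writer = threading.Thread(target=writer_loop, args=(out_queue, sink))
    workers = [
        threading.Thread(target=worker, args=(task_queue, out_queue))
        for _ in range(num_workers)
    ]
    writer.start()
    for w in workers:
        w.start()
    for task in tasks:
        task_queue.put(task)
    for _ in workers:
        task_queue.put(_STOP)  # one sentinel per worker
    for w in workers:
        w.join()
    out_queue.put(_STOP)  # all results are queued; let the writer finish
    writer.join()
    return sink


if __name__ == "__main__":
    print(sorted(process(range(10), num_workers=3)))
```

Here `num_workers` plays the role of one of the "dials": the workers only block on the task queue, and all output is serialized through the single writer so they do not contend on IO.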

Posted by ReliableSite, 07-10-2016, 12:02 PM
Thanks for the details. The E7s are funky CPUs. Based on your performance testing, it looks like your application works better with faster CPUs than with more cores. You might actually get a nice win-win with the new E5 V4 CPUs: more cores, faster clock speeds, great pricing.


