The Week in Chess

Wednesday, June 18, 2014

Stockfish 14061621 x64 Tests - Possible Regression

The latest development build of Stockfish, 14061621, was released on June 16, 2014. It incorporates several patches made since version 5 was released more than two weeks ago, many of which passed the short and long time control tests. This motivated me to test it in anticipation of a much stronger engine. But my enthusiasm was doused when the first test run, at a 1 minute + 1 second time control, showed the newer Stockfish struggling to outscore Stockfish 5, and in the end it lost the match 51-49. I thought it was just a fluke, so I ran more tests in parallel: one at 3 minutes + 2 seconds on the same computer and another at 1 minute + 1 second on a second computer. The results were 52-48 and 51.5-48.5 respectively, again in favor of Stockfish 5.

I also ran 3 minutes + seconds matches with Stockfish 14061621 against Houdini 4, Komodo 7a and Gull 3. It won against all of them, but by considerably lower scores than Stockfish 5 made against the same opponents.

I began to wonder whether my tests are wrong or whether the Stockfish team has not noticed the regression. No similar negative results have been published in the chess forums, so perhaps others have seen the same thing but were just too shy to share it. This is something for the Stockfish team to verify.
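One way to judge how much a 100-game match can say is to convert the result into an Elo difference with an error margin. The sketch below is only an illustration, not part of the test setup: it assumes the standard logistic Elo model and a normal approximation for the 95% interval, and the function names are made up for this example. It uses the 17-15-68 win-loss-draw record of Stockfish 5 from the first match.

```python
import math


def elo_from_score(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, low, high) for one side's win-loss-draw record.

    Assumption: games are independent and the mean score is roughly
    normally distributed, so z=1.96 gives an approximate 95% interval.
    Not valid when the score is at or very near 0 or 1.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), low, high


# Stockfish 5 in the first 1M1S match: 17 wins, 15 losses, 68 draws.
elo, low, high = elo_with_margin(17, 15, 68)
print(f"{elo:+.1f} Elo, approx. 95% interval {low:+.1f} to {high:+.1f}")
```

Under these assumptions a 51-49 result corresponds to about +7 Elo for Stockfish 5, with an interval that runs from roughly -32 to +46 Elo. The interval includes zero, so a single 100-game match of this kind cannot by itself separate a small regression from noise.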

Here are some samples of my tests:


Stockfish 14061621 x64 vs. Stockfish 5 x64 - Match 100R 1M1S 1

Rank  Engine                  Score     St        St        S-B
1     Stockfish 5 x64         51.0/100  ····      17-15-68  2499.00
2     Stockfish_14061621_x64  49.0/100  15-17-68  ····      2499.00


100 games played / Tournament finished

Tournament start: 2014.06.17, 00:22:38
Latest update: 2014.06.17, 06:54:07
Level: Blitz 1/1
Hardware: AMD Phenom(tm) II X4 945 Processor with 1.8 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5


Stockfish 14061621 x64 vs. Stockfish 5 x64 - Match 100R 1M1S 2

Rank  Engine                  Score     St        St        S-B
1     Stockfish 5 x64         51.5/100  ····      12-9-79   2497.75
2     Stockfish_14061621_x64  48.5/100  9-12-79   ····      2497.75


100 games played / Tournament finished

Tournament start: 2014.06.18, 01:54:37
Latest update: 2014.06.18, 19:40:21
Level: Blitz 1/1
Hardware: Intel(R) Core(TM)2 CPU 4300 @ 1.80GHz with 3.9 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5


Stockfish 14061621 x64 vs. Stockfish 5 x64 - Match 100R 3M2S

Rank  Engine                  Score     St        St        S-B
1     Stockfish 5 x64         52.0/100  ····      14-10-76  2496.00
2     Stockfish_14061621_x64  48.0/100  10-14-76  ····      2496.00


100 games played / Tournament finished

Tournament start: 2014.06.17, 09:40:51
Latest update: 2014.06.18, 03:44:13
Level: Blitz 3/2
Hardware: AMD Phenom(tm) II X4 945 Processor with 1.8 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5
Download the Stockfish 14061621 test games PGN here.
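
The three matches above can also be pooled into one 300-game sample. The sketch below is a rough illustration only: it assumes the games can be combined even though the time controls and hardware differ, and it uses the likelihood of superiority (LOS) approximation common in computer chess, which looks only at wins and losses. The function name and the list of records are written for this example.

```python
import math


def likelihood_of_superiority(wins, losses):
    """Approximate probability that the side with `wins` is the stronger one.

    Uses the common normal approximation based on decisive games only.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


# Win-loss-draw records of Stockfish 5 in the three matches listed above.
matches = [(17, 15, 68), (12, 9, 79), (14, 10, 76)]

wins = sum(m[0] for m in matches)
losses = sum(m[1] for m in matches)
draws = sum(m[2] for m in matches)
games = wins + losses + draws

pooled_score = (wins + 0.5 * draws) / games
los = likelihood_of_superiority(wins, losses)

print(f"Stockfish 5: +{wins} -{losses} ={draws} over {games} games")
print(f"Pooled score: {pooled_score:.3f}, LOS: {los:.3f}")
```

Pooled this way, Stockfish 5 scores 154.5/300 (43 wins, 34 losses, 223 draws), and the LOS comes out at roughly 85%. That leans toward Stockfish 5 but falls short of the 95% level usually asked for before calling a result significant.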

6 comments:

  1. Nothing wrong with latest SDV...no regression. Stockfish lately becoming like Komodo....performs exponentially better with better hardware.....
    high time you trashed your obsolete AMDs and switch to modern Intel processors with at least 6 cores and 32 GB RAM, which are most suited to computer chess gaming !

  2. I think nobody is obliged to invest more and more money in Intel just to find out whether the SF dev version gains 5 Elo or not.

    Replies
    1. How do you know ELO will increase only 5 ELO ? Why not +100 ELO !?....LOL

  3. It was just an example. Any number you wish can be used there. What I meant, in fact, is that there is an effort here regardless of what CPU is used, be it extreme or obsolete. And there are results which are correlated to the conditions. I respect them. Coming to your aggressive suggestion, what would you think if someone said 6 cores are nothing and you need at least 32 cores and 192 GB RAM? This will not sound like a joke in one year, you know? My view is that it is pointless to be stuck on never-ending tech upgrades that feed commercial giants.

  4. I love your blog. This is a cool site and I wanted to post a little note to tell you, good job! Best wishes!!!
    Play Chess

