The Week in Chess

Sunday, May 11, 2014

Stockfish 14051008 x64 vs. Stockfish 14042120 x64 - Test Match, 400 Rounds 05/11/14

The latest Stockfish 14051008 development version is at least 30 ELO rating points stronger than 14042120!

It's time once again to calibrate the Stockfish ELO rating, 20 days after the last publication of the Owl Rating List that featured a Stockfish development version. Several commits landed in Stockfish during that period, and I picked out the ones that may have a significant effect on the ELO rating:

- Allow a slave to 'late join' another splitpoint - by Joona Kiiski

Instead of waiting to be allocated, actively search
for another split point to join when finishes its
search. Also modify split conditions

- Rewrite Score extractors - by Ron Britvich

Less tricky and even a bit faster. With this
version Visual Studio Ultimate 2013 Update 2 RC
runs fine even in O2 optimization.

- Penalize hanging pieces - by snicolet


- Speed up by almost 3% - by Jonathan Calovski

This apparently silly tweak allows
to speed up the bench by almost 3%.
Not clear why, repeating with perft,
the speed up vanishes.

- Speed up picking of killers - by Marco Costalba

Changing the order of the conditions gives
about 1% speed up!

- Remove RookOn7th and merge values into psqt - by Arjun Temurnikar

Tested in no-regression mode:

- Remove penalty for knight when few enemy pawns - by Arjun Temurnikar

Tested in standard mode at STC and no-regression
mode at LTC:

- Shuffle movepicker score - by Jonathan Calovski

Believed to be a speed optimization as benched
on Windows with bench realtime affinity 0x1 deleting
highest and lowest runs:

- Move queen vs. 3 minors rule to imbalance tables - by joergoster

Tuned with CLOP after 57k games.
Simplification: tested in no-regression mode.


--------------------------------------------------------------------------------------------

The version chosen for the test match is Stockfish 14051008, released on May 10, playing against Stockfish 14042120, the current leader in the Owl Rating List. It is a one-on-one match of 400 rounds at a time control of 1 minute base + 1 second increment. The tournament conditions are indicated in the BayesELO match statistics below.

After 13 hours of intense battle, the latest Stockfish 14051008 development version emerged as the winner with a score of +87 -46 =267. To estimate the ELO increase I ran ELOStat, BayesELO and Ordo with an arbitrary ELO median of 3150. All three gave the same result: a 36-point ELO increase. To approximate the rating more closely, I also temporarily merged the match scores into the Owl Rating List database, which gave a tentative ELO rating of 3177 for the new Stockfish 14051008, 31 points higher than the current leader.
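As a rough sanity check on the 36-point figure, the match score can be converted to a rating difference with the plain logistic ELO formula. This is only a sketch: ELOStat, BayesELO and Ordo each use their own models, and the function name `elo_diff_from_score` and the way the two ratings are centred on the 3150 median are assumptions made for illustration.

```python
import math


def elo_diff_from_score(wins, losses, draws):
    """Rating difference implied by a match score (plain logistic ELO model)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored by the winner
    return -400.0 * math.log10(1.0 / score - 1.0)


# Match result from the post: +87 -46 =267 for Stockfish 14051008
wins, losses, draws = 87, 46, 267
points = wins + 0.5 * draws
diff = elo_diff_from_score(wins, losses, draws)

# Assumption: both ratings sit symmetrically around the arbitrary median of 3150
median = 3150
new_rating = median + diff / 2
old_rating = median - diff / 2

print(f"points: {points}/{wins + losses + draws}")
print(f"ELO difference: {diff:.1f}")
print(f"ratings: {new_rating:.0f} vs {old_rating:.0f}")
```

The simple formula gives a difference of a little under 36 points, which rounds to the same 3168 and 3132 reported by the three tools.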

Here are the results from the ELO rating systems:


STOCKFISH ELO
Rating System      14051008   14042120   Diff ELO
ELOStat            3168       3132       36
BayesELO           3168       3132       36
Ordo               3168       3132       36
Owl Rating List    3177       3146       31


Here are the BayesELO match statistics:

Stockfish_14051008_x64 vs. Stockfish 14042120 x64 - Test Match 400R 1M1S, 05-11-2014
Rank  Engine                  Score      St         St         S-B
1     Stockfish_14051008_x64  220.5/400  ·          87-46-267  39579.75
2     Stockfish 14042120 x64  179.5/400  46-87-267  ·          39579.75
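The S-B (Sonneborn-Berger) column can be cross-checked from the scores. The sketch below assumes the usual definition, in which each point scored is weighted by the opponent's final total; in a two-engine match that reduces to one product, so both engines get the same value. The helper name `match_points` is hypothetical.

```python
def match_points(wins, draws):
    """Total points: 1 per win, 0.5 per draw."""
    return wins + 0.5 * draws


# Results from the crosstable: 87-46-267 (wins-losses-draws) for 14051008
new_total = match_points(87, 267)  # Stockfish 14051008
old_total = match_points(46, 267)  # Stockfish 14042120

# With a single opponent, S-B = own points against that opponent * opponent's total
sb_new = new_total * old_total
sb_old = old_total * new_total

print(new_total, old_total, sb_new, sb_old)
```

The product reproduces the 39579.75 shown for both engines in the table.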


400 games played / Tournament finished

Tournament start: 2014.05.11, 00:13:12
Latest update: 2014.05.11, 14:35:38
Level: Blitz 1/1
Hardware: AMD Phenom(tm) II X4 945 Processor with 1.8 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5
Download the test match games PGN here.
