From the email inbox:
"Q. Jeff, In UPR Tools, what are the boxes for GN Hi Score and Low Score used for?"
A. I added GN Hi Score and Low Score to UPR Tools to help with calibration.
The letters GN are an abbreviation for GroupName.
When you persist a GN Hi Score numeric value for a GroupName, that persisted value becomes a hard upper limit for horse scoring by that GroupName whenever you click the Calc Races and/or Build Database buttons.
In other words, all horses that would normally score out higher than the persisted GN Hi Score numeric value are scored out at a value equal to the persisted GN Hi Score numeric value.
When you persist a GN Low Score numeric value for a GroupName, that persisted value becomes a hard lower limit for horse scoring by that GroupName whenever you click the Calc Races and/or Build Database buttons.
All horses that would normally score out lower than the persisted GN Low Score numeric value are scored out at a value equal to the persisted GN Low Score numeric value.
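A minimal sketch of the clamping behavior described above, written in Python. This is not the actual UPR Tools code: the function name clamp_score and its parameters are made up for illustration, and the limit values are only examples.

```python
def clamp_score(raw_score, gn_low_score=None, gn_hi_score=None):
    """Return raw_score limited to the optional lower and upper cutoffs.

    Hypothetical helper: a limit of None means no limit was persisted.
    """
    score = raw_score
    if gn_hi_score is not None and score > gn_hi_score:
        score = gn_hi_score   # scores above the upper limit are set to the limit
    if gn_low_score is not None and score < gn_low_score:
        score = gn_low_score  # scores below the lower limit are set to the limit
    return score


# Example raw scores (made up) run through example limits.
raw_scores = [0.0150, 0.0448, 0.1200, 0.3209, 0.6500]
clamped = [clamp_score(s, gn_low_score=0.0448, gn_hi_score=0.3209)
           for s in raw_scores]
print(clamped)
```

Scores already inside the limits pass through unchanged; only the out-of-range ones are moved to the nearest limit.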
FYI - I'm fully aware the word "Hi" (without the quotes) means "Hello" and that "High" refers to elevation.
I purposely used "Hi" instead of "High" on the face of the UPR Tools Expression Builder because a 2-character label was easier to squeeze into that part of the Expression Builder UI than a 4-character label.
Screenshots --
Upper part of the UPR Tools Expression Builder UI after loading in a GroupName I was working on last year:
Data Window output for the same GroupName - factor breakout data - lower numeric value rows:
Data Window output for the same GroupName - factor breakout data - upper numeric value rows:
Note that in the above screenshots:
GN Hi Score for the GroupName is set to 0.3209
I've highlighted the last row in the Factor Breakout that was populated with data. There are more than 800 plays in this row. All of the other rows prior to that have a few dozen plays at most.
That's because the GroupName horse scoring algorithm enforced the GN Hi Score of 0.3209 as the upper limit during the DB Build routine. Put another way: no horses in the data ended up with a UPR numeric value greater than the GN Hi Score.
GN Low Score for the GroupName is set to 0.0448
I've highlighted the first row in the Factor Breakout that was populated with data. There are more than 1000 plays in this row. The adjacent two rows after that have 343 and 46 plays respectively.
That's because the GroupName horse scoring algorithm enforced the GN Low Score of 0.0448 as the lower limit during the DB Build routine. Put another way: no horses in the data ended up with a UPR numeric value less than the GN Low Score.
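To illustrate why the first and last populated rows end up with so many plays, here is a small Python sketch. The raw scores are made up and the code is not taken from UPR Tools; only the two limit values come from the screenshots above.

```python
from collections import Counter

GN_LOW_SCORE = 0.0448  # lower limit from the notes above
GN_HI_SCORE = 0.3209   # upper limit from the notes above

# Made-up raw scores: three below the lower limit, three above the upper limit.
raw_scores = [0.0130, 0.0150, 0.0300, 0.0500, 0.1200,
              0.2500, 0.3300, 0.4000, 0.6500]

# Enforce both limits, then count how many plays land on each final score.
clamped = [min(max(s, GN_LOW_SCORE), GN_HI_SCORE) for s in raw_scores]
counts = Counter(clamped)

print("plays sitting exactly at the low limit:", counts[GN_LOW_SCORE])
print("plays sitting exactly at the high limit:", counts[GN_HI_SCORE])
print("lowest and highest final score:", min(clamped), max(clamped))
```

Every out-of-range horse collapses onto one of the two limit values, so the rows containing those values collect all of them.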
Q. Why did I set GN Hi Score and Low Score for the GroupName and why did I add GN Hi Score and Low Score to UPR Tools?
A. After the first few DB Build routines for the new GroupName I noticed something.
When I looked at the UPR factor breakout data for the first few rows at the low end between zero and 0.045 in increments of 0.025 - the horses in that range had an avg win rate of about 0.0448 - even though the new GroupName was scoring them out much lower than that.
As I drilled down into the data - when I looked at the rows from 0.0125 to 0.0175 in increments of 0.0025 there were hundreds of horses that were scoring out between 0.0125 and 0.0175 - but the avg win rate for those horses was about 0.0448.
I was also seeing a similar effect in the factor breakout data at the high end.
For example, horses from 0.325 all the way up to 0.65 broken out in increments of 0.05 had an avg win rate of about 0.32.
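The kind of check described above (group horses by score range, then compare the avg win rate in each range against the scores themselves) can be sketched in Python as follows. This is an assumption about the general idea, not the Data Window's actual code; the function name win_rate_by_bin and the sample plays are made up.

```python
from collections import defaultdict


def win_rate_by_bin(plays, bin_width):
    """Group (score, won) pairs into score bins.

    Returns a sorted list of (bin_lower_edge, n_plays, avg_win_rate).
    """
    tallies = defaultdict(lambda: [0, 0])  # bin index -> [plays, wins]
    for score, won in plays:
        idx = int(score // bin_width)
        tallies[idx][0] += 1
        tallies[idx][1] += 1 if won else 0
    return [
        (round(idx * bin_width, 4), n, wins / n)
        for idx, (n, wins) in sorted(tallies.items())
    ]


# Made-up plays: (score assigned by the GroupName, did the horse win?)
plays = [(0.0131, False), (0.0136, True), (0.0160, False),
         (0.0165, False), (0.0168, True)]

breakout = win_rate_by_bin(plays, bin_width=0.0025)
for lower_edge, n_plays, win_rate in breakout:
    print(f"{lower_edge:.4f}  plays={n_plays}  win rate={win_rate:.4f}")
```

When the win rate in a row is far from the score range of that row, the scores in that range are not calibrated.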
After seeing that I started checking other GroupNames - and realized many of them were showing a similar effect.
I realized I needed to start calibrating my win likelihood estimates.
It occurred to me that creating the ability to set upper and lower limits for GroupName horse scoring might be a good first step.
So I added GN Hi Score and Low Score to UPR Tools.
-jp