Part 1.
I created a fun little data project based on 11 races.
HDW V4 .JCP files. Kentucky Derby only. Calendar years 2010 through 2020.
Last year's was the only Kentucky Derby run in September.
Link to a text file that shows Data Window output: http://www.JCapper.com/Messageboard/Reports/FunLittleDataProject-05012021.txt
Because the database contains only 11 races, I didn't spend a whole lot of time looking for max ROI.
After a few minutes of looking at the data I landed on two factor constraints:
TrackLast - Track code most recent running line.
Over the past 11 years, starters with a TrackLast other than GPX-MTH-OPX-SAX-TPX are a combined 0 for 115 in the Kentucky Derby.
Of course none of that means this year's Derby winner won't have a last race running line from a track other than GPX-MTH-OPX-SAX-TPX.
Last time I checked, the current favorite has a last race running line from Keeneland.
Imo, this is exactly the stuff that makes model building fascinating.
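For anyone who wants to run the same kind of TrackLast check outside the Data Window, here is a minimal Python sketch. The record layout (`year`, `horse`, `track_last`, `won`) and the function name are my own assumptions for illustration, not JCapper's actual export format, and the sample rows are made up rather than real Derby results.

```python
from collections import namedtuple

# Hypothetical record layout; a real JCapper export may use different fields.
Starter = namedtuple("Starter", ["year", "horse", "track_last", "won"])

# Track codes taken from the constraint described above.
QUALIFYING_TRACKS = {"GPX", "MTH", "OPX", "SAX", "TPX"}

def tally_by_track_last(starters, tracks=QUALIFYING_TRACKS):
    """Return {"inside": (wins, starts), "outside": (wins, starts)}."""
    counts = {"inside": [0, 0], "outside": [0, 0]}
    for s in starters:
        key = "inside" if s.track_last in tracks else "outside"
        counts[key][0] += int(s.won)  # wins
        counts[key][1] += 1           # starts
    return {k: tuple(v) for k, v in counts.items()}

# Made-up rows for illustration only; not actual Derby data.
sample = [
    Starter(2099, "Horse A", "GPX", True),
    Starter(2099, "Horse B", "KEE", False),
    Starter(2099, "Horse C", "AQU", False),
    Starter(2099, "Horse D", "SAX", False),
]

result = tally_by_track_last(sample)
print(result)
```

Run against the real 11-race sample, the "outside" bucket would be the 0 for 115 group.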
TDX_DIST_BEST10 - Time Decayed Speed Fig at a similar distance (best of last 10 running lines).
Over the past 11 years, starters outside the top 4 for TDX_DIST_BEST10 are a combined 3 for 164 in the Kentucky Derby.
After seeing more KY Derbies than I care to admit, it's obvious, to me at least, that ability to get the distance is important.
And after spending a few minutes looking at the data in this little project, TDX_DIST_BEST10 appears to have done a pretty nice job of pointing out ability to get the distance - at least for the past 11 KY Derbies.
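For reference, 3 for 164 works out to a win rate of roughly 1.8% for starters outside the top 4. A rough Python sketch of how the rank-within-race tally could be reproduced follows. The dict keys (`race`, `fig`, `won`), the function names, and the tie handling (tied figs share the better rank) are assumptions of mine, not necessarily how JCapper ranks TDX_DIST_BEST10, and the sample field is made up.

```python
from collections import defaultdict

def rank_within_race(starters):
    """Add a 'rank' to each starter: 1 = highest fig in its race.

    Assumption: tied figs share the better rank (e.g. 1, 2, 2, 4).
    """
    by_race = defaultdict(list)
    for s in starters:
        by_race[s["race"]].append(s)

    ranked = []
    for field in by_race.values():
        for s in field:
            # Rank = 1 + number of rivals with a strictly higher fig.
            rank = 1 + sum(1 for other in field if other["fig"] > s["fig"])
            ranked.append({**s, "rank": rank})
    return ranked

def tally_outside_top_n(ranked, n=4):
    """Return (wins, starts) for starters ranked worse than n."""
    outside = [s for s in ranked if s["rank"] > n]
    wins = sum(1 for s in outside if s["won"])
    return wins, len(outside)

# Made-up six-horse field for illustration only; not actual Derby data.
sample = [
    {"race": "R1", "fig": 98, "won": False},
    {"race": "R1", "fig": 95, "won": True},
    {"race": "R1", "fig": 95, "won": False},
    {"race": "R1", "fig": 90, "won": False},
    {"race": "R1", "fig": 88, "won": False},
    {"race": "R1", "fig": 80, "won": False},
]

ranked = rank_within_race(sample)
wins, starts = tally_outside_top_n(ranked, n=4)
print(wins, starts)
```

With only 11 races behind it, a tally like this is a description of the sample more than a prediction.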
None of this is to say this year's Derby winner won't shock the world by running a new lifetime top that's 15 lengths faster than anything he's shown in the past.
Keep in mind these are 3-year-olds who are still growing up. And for that reason, anything is possible.
Again, this is exactly the stuff that makes model building fascinating.
-jp