JCapper Message Board

General Discussion
-- Multicap Data Files vs. Data from Handicappers' Data Warehouse


Multicap Data Files vs. Data from Handicappers' Data Warehouse

endearus
11/13/2009
10:31:31 AM

First, thank you Jeff for accommodating data from the Brisnet Multicaps files.

I believe you are contemplating making files for JCapper accessible from HDW.

I am wondering whether I should stay with Brisnet since pricing after next year is uncertain.

What would the differences in the data be? Is data from HDW 'better', more reliable, etc? Could the files from HDW be used by other programs? Does it have more fields?

Thanks,

Eric

jeff
11/13/2009
12:32:34 PM

Hey Eric,

I'll try to address your questions/comments one at a time.

--quote:

"First, thank you Jeff for accommodating data from the Brisnet Multicaps files."

--end quote

My pleasure. I can only imagine how every TSN subscriber must have felt when they first got the news TSN was going away... that feeling of having been thrown under the bus.

--quote:

"I believe you are contemplating making files for JCapper accessible from HDW."

--end quote

I am. As in within the next 30 days.

--quote:

"I am wondering whether I should stay with Brisnet since pricing after next year is uncertain."

--end quote

The people at Brisnet want to allow discount data plan pricing to continue in 2011 and beyond. But it's not up to them. It's up to Churchill Downs management - who aren't exactly known for caring what players want.

--quote:

"What would the differences in the data be? Is data from HDW 'better', more reliable, etc? Could the files from HDW be used by other programs? Does it have more fields?"

--end quote

Differences in the data are that the speed and pace figs in HDW files are Cramer based - while the speed and pace figs in Brisnet files are based on a Brisnet algorithm.

Could the HDW files be used by other programs?

Answer: No.

HDW files are not comma delimited text. They're cut in binary format. Each vendor is given a slightly different file format -- along with a "field map" allowing the software vendor (me) to extract data out of the binary file format.

JCapper HDW files will be different from HDW files for HTR, which in turn are (I'm told) different from HSH, which in turn are (again I'm told) different from files meant for Netcapper, Synergism, and others.
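
To give a feel for what reading a binary file through a "field map" involves, here is a minimal Python sketch. The field names, offsets, and sizes below are invented for illustration; the real HDW layouts are vendor specific and are not described in this thread.

```python
import struct

# Hypothetical field map: name -> (byte offset, struct format).
# These offsets and types are made up for this example only.
FIELD_MAP = {
    "track": (0, "3s"),        # 3-byte track code
    "race_number": (3, "<H"),  # unsigned 16-bit, little-endian
    "speed_fig": (5, "<h"),    # signed 16-bit, little-endian
}

def read_record(buf, field_map=FIELD_MAP):
    """Extract named fields from one binary record using the field map."""
    out = {}
    for name, (offset, fmt) in field_map.items():
        (value,) = struct.unpack_from(fmt, buf, offset)
        if isinstance(value, bytes):
            value = value.decode("ascii").strip()
        out[name] = value
    return out

# Build a fake 7-byte record and read it back.
sample = b"AQU" + struct.pack("<H", 7) + struct.pack("<h", 92)
record = read_record(sample)
```

A different vendor's file would need a different map, which is why a program written against one vendor's layout could not read another's.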

--quote:

"Is data from HDW 'better', more reliable, etc?"

--end quote

Eric, it's honestly too early for me to be able to make that determination. Within the next 10-14 days I expect to have databases built spanning at least a year's back data - all tracks everywhere - to test on. At that point I'll know more.

As soon as I do the requisite R&D - expect me to post some numbers.

I will tell you that about two years ago I reached an agreement with a well known figure maker to make his figures available in JCapper.

Early small sample testing (one month at AQU) indicated his figures were VERY good. But after I compiled a sizeable database (all of 2006) it became apparent that Brisnet speed and pace figures were superior.

This figure maker (who btw is no longer in the speed fig making business) and I had become friends... I had to call him and tell him I had no choice but to scrap the project. It wasn't an easy thing to do.

The point that I'm trying to make here -- and it applies to HDW as well as any other data project that I might become involved in... is that I adhere to certain standards.

I won't publish without first seeing win rate and roi numbers from large data samples -- and those numbers have to very clearly indicate that what I'm about to publish offers significant benefit to my customers.
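
For readers new to these two measures, a minimal Python sketch of how win rate and roi can be computed from a sample of flat win bets. It assumes a $2 stake and expresses roi as dollars returned per dollar wagered (1.00 = break even); that convention and the function name are assumptions for illustration, not a description of JCapper's internals.

```python
def win_rate_and_roi(payoffs, stake=2.0):
    """payoffs: amount returned by each flat `stake` win bet (0.0 for a loser)."""
    if not payoffs:
        raise ValueError("no plays in sample")
    wins = sum(1 for p in payoffs if p > 0)
    wagered = stake * len(payoffs)
    returned = sum(payoffs)
    return wins / len(payoffs), returned / wagered

# Five made-up plays: two winners paying $6.40 and $9.60.
payoffs = [0.0, 6.40, 0.0, 0.0, 9.60]
win_rate, roi = win_rate_and_roi(payoffs)
```

A five-play sample like this one says nothing reliable, which is the whole point of insisting on large samples.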

-jp

.

beissner
11/13/2009
2:30:24 PM

WE TRUST IN JEFF !!!!!! Fred

LunaticFringe
11/14/2009
2:44:14 PM

Those opting to go the HDW route can expect to go how far back in time (past days/months/years) for track files?

jeff
11/14/2009
5:37:31 PM

That's yet to be determined.

When a new HDW data subscriber is the customer of a software vendor that has been offering HDW files for a while, HDW has a history of offering that customer 45 days' back data.

But this is something different for them.

Here, they are being given the chance to potentially win over all of the customers of a software vendor whose customers they've never been given a crack at winning over before.

You guys represent a sizeable chunk of potential new business for them.

You guys have an advantage over the customers of quite a few other software vendors in that:

1. Your software builds databases.

2. Your software has a Data Window that makes it extremely easy for every one of you to run side by side comparisons of output produced by databases built using the data files of different data suppliers.

3. Many of you are former TSN Data Subscribers who were recently given (in some cases) up to three years' back Brisnet data so that you could run side by side comparisons between your new Brisnet data files and your old TSN data files.

Brisnet and HDW are competitors.

Brisnet didn't balk at offering you significant back data to test on when you converted over to Brisnet from TSN.

For that reason I'm going to be pushing them to offer you at least a year's worth of back data.

-jp

.

LunaticFringe
11/14/2009
11:35:22 PM

Hey, you've sold me! Now you "only" need to sell it to HDW.

I'm one of the guys BRIS "made good on" with the back MultiCaps files. And they did it promptly upon request.

As far as I'm concerned, the bar has been raised for HDW. My business is up for grabs.

jeff
11/15/2009
12:48:19 PM

One thing I noticed is that HDW run styles are different than Brisnet Run Styles.

Here's the Run Style description that I received from them:

Note 9: Running Style / Position
Each horse's past performance line is assigned an actual running style. Roughly, the running styles are identified as follows (the actual definitions are much more complex but this is a rough and ready guide):
The case of the running style is important for E (e) but not so much for the others. An eP is closer to P than E, a Ps is closer to P than S, etc. These different cases are not used in any statistics compiled by HDW and can be ignored if so desired.

E On the lead at the � mile

e Near the lead at the � mile and falls further back in position and/or beaten lengths at each successive point of call - always a failure line

EP Within approximately a length of the lead at the � mile

P Within approximately a length of the lead at the � mile

PS Within approximately a length of the lead at the top of the stretch

S None of the above but 1) gained ground during race and 2) not too far back at the � mile or top of stretch

SS None of the above but 1) gained ground during race and 2) far back at the � mile or top of stretch

U None of the above and 1) ugly race (no discernible running style) or 2) unknown running style

The running style assigned to the horse in the Horse Data section is based on the running style(s) of the horse's last 3 or, if a maiden, the best finishes in the last 10 races. A single tick � indicates 1 win only. An exclamation mark ! indicates multiple wins (2 or 3) and only at that running style. Nothing following the RS indicates the horse has won multiple times but at different running styles, with the lowest running style in the hierarchy being assigned to the horse.

Examples:

If the horse has won only as an E, and has won only once, the Running Style will be E�.

If the horse has won only as an E, and has won 2 or 3 times, the Running Style will be E!.

If the horse has won as an E and EP, the Running Style will be EP.

If the horse has won only as a PS, and has won 2 or 3 times, the Running Style will be PS!.

If the horse has won as an E, EP and S, the Running Style will be S.
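
The examples above can be expressed as a short Python sketch. Several things here are assumptions: the hierarchy order (E highest, SS lowest), leaving "e" and "U" out because the excerpt does not say where they fall, the empty result for a horse with no counted wins, and the apostrophe standing in for the tick mark, which did not survive the copy and paste.

```python
# Assumed hierarchy, from front-running to deep-closing styles.
HIERARCHY = ["E", "EP", "P", "PS", "S", "SS"]
TICK = "'"  # placeholder for the single-win tick mark

def horse_running_style(winning_styles):
    """winning_styles: running style of each counted win, e.g. ["E", "EP"]."""
    if not winning_styles:
        return ""  # no wins to classify; this behaviour is an assumption
    distinct = set(winning_styles)
    if len(distinct) == 1:
        style = winning_styles[0]
        # One win gets the tick, multiple wins at one style get "!".
        return style + (TICK if len(winning_styles) == 1 else "!")
    # Wins at different styles: assign the one lowest in the hierarchy.
    return max(distinct, key=HIERARCHY.index)
```

Called with the five cases listed above, this returns E plus the tick, E!, EP, PS!, and S respectively.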

The Position is arrived at by calculating each horse's fastest raw � mile time in the last 10 races and ranking all the horses in the race. The 1 horse is the fastest to the � in this race, based on the last 10 races of all the horses.
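
As a rough illustration of the Position ranking just described, here is a minimal Python sketch. It assumes each horse's early fraction times are supplied in seconds with the most recent race first, and that ties keep their input order; neither detail is stated in the HDW note.

```python
def early_positions(fraction_times):
    """fraction_times: {horse: [raw early-fraction times in seconds]}.

    Returns {horse: position}, where 1 is the fastest horse.
    """
    # Best (lowest) time per horse over at most the last 10 races.
    best = {h: min(times[:10]) for h, times in fraction_times.items() if times}
    ordered = sorted(best, key=best.get)
    return {horse: rank for rank, horse in enumerate(ordered, start=1)}

# Made-up times for three horses.
positions = early_positions({
    "A": [22.8, 23.1],
    "B": [22.4],
    "C": [23.0, 22.6],
})
```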

Too early to tell if or how much significance exists in the run style differences... I'll know once I get enough data to run some numbers.

That said, the Upper Case / lower case and various punctuation mark run style designations have certainly piqued my curiosity. Very Interesting.

-jp

.

JimG
11/15/2009
6:07:15 PM

Jeff,

I am amazed that HDW did not require you to drop support of Bris files in return for signing up with them. To my knowledge, you will be the first software developer signed up with HDW that will also offer an alternative.

Jim

jeff
11/15/2009
6:32:21 PM

Jim,

I met with the guys from HDW in their office this past April when I was in Lexington. I can tell you they're good guys and I'm looking forward to doing business with them.

Very early on, the subject of alternate data files did come up. I told them I had promised a number of you that support in JCapper for Brisnet and TSN files would always be there as long as Brisnet and TSN continued to make data files.

THAT hasn't changed.

The option of which file to use... to switch - or not to switch... is yours.

-jp

.

JustRalph
11/17/2009
9:31:16 PM

How much? for HDW files..........

jeff
11/18/2009
11:11:33 AM

Ralph,

It's yet to be finalized... But it's looking to me like $119.00 a month for unlimited downloads, data and results - all tracks that they offer.

-jp

.

beissner
11/18/2009
11:48:16 AM

Jeff sounds good to me. Include me in !!!!....Fred

Acorn54
12/11/2009
3:47:06 PM

jeff
acorn here
have you finished your research into whether hdw files are just as good or better than the bris files?

jeff
12/11/2009
6:04:16 PM

No. I'm not that far along yet.

Challenges
Before I post a progress update, I'm going to post some of the challenges encountered so that you can (hopefully) get a very real sense of the amount of work involved to make this happen.

Developers like me can be a pain in the ass... as if you didn't already know that!

So far, I have really been impressed by the effort the people at HDW have shown and the lengths they have gone to to win me over. A big part of the process is me looking over the full menu of factors that are available in the data... and picking and choosing what I want from that menu... how I want it formatted... and where in the file I want it.

As of right now the "file spec" for the files produced by HDW for JCapper appears to have been finalized. It did take a little back and forth to make that happen.

File Format and Structure
First, the files themselves are not comma delimited text. Instead, they are binary. Which means they need a completely new set of routines to read (and verify) every piece of info parsed out of them. Second, the HDW info model is one file per race instead of one file per race card that Brisnet uses. Which means that after the info has been parsed out of the files, it needs to be "re-assembled" from a block of info for a single race into a block of info for a single race card - so that it can be fed into the number crunching algorithms present in JCapper.
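
The "re-assembly" step can be pictured with a minimal Python sketch. It assumes each parsed race arrives as a dict carrying track, date, and race_number keys; those key names are hypothetical and are not taken from the JCapper or HDW file specs.

```python
from collections import defaultdict

def assemble_cards(race_records):
    """Group per-race records into race cards keyed by (track, date)."""
    cards = defaultdict(list)
    for rec in race_records:
        cards[(rec["track"], rec["date"])].append(rec)
    # Put each card's races in running order.
    for races in cards.values():
        races.sort(key=lambda r: r["race_number"])
    return dict(cards)

# Three made-up races, arriving out of order, from two cards.
cards = assemble_cards([
    {"track": "AQU", "date": "2009-12-11", "race_number": 2},
    {"track": "CD", "date": "2009-12-11", "race_number": 1},
    {"track": "AQU", "date": "2009-12-11", "race_number": 1},
])
```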

Increased Level Of Detail
HDW data files contain quite a bit more information in them than Brisnet files do... run up distance, gelded since last start, more detailed equipment and medication descriptions, shoe type, tongue ties, more detailed stats for sires, dams, damsires, riders, trainers, owners, as well as stats for the individual horse career-wise under a variety of conditions, and more accurate descriptions of class levels that horses race at...

From what I can see, the other developers out there haven't done enough innovation in the area of exploring/exploiting the increased level of detail available from HDW compared to what I see as being "possible."

I've always felt limited in my R&D because of the "basic" nature of the info found in the files from TSN/Brisnet.

Going forward I am really excited at the prospect of developing new algorithms to take advantage of the added level of detail found in these files. At the very least... they should be... ahem... downright useful.

Progress Update
As I type out this post I have a new module... right now it's called the HDW File Manager.

That module is currently capable of:

1. Finding individual HDW binary race card files on a target folder.

2. Importing JCapper data out of those files and into memory.

3. Repeating the above two steps for every race on the card until there are no more races on that card to be processed.

4. Writing all of the JCapper data resident in memory to a JCapper race card file.

5. Performing a similar set of routines for parsing data out of HDW charts files.

From there the JCapper (.JCP) race card file created on the fly can be used by the program for both Calc Races and Build Database Routines.
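
Steps 1 through 4 above amount to a find, parse, and write loop. The Python sketch below shows only that general shape and is not the HDW File Manager itself: the file naming pattern, the stand-in parser, and the text written to the card file are all invented for illustration. Only the .JCP extension comes from the post.

```python
import tempfile
from pathlib import Path

def parse_race_file(path):
    """Stand-in for the real binary parser: returns one race as a dict."""
    return {"source": path.name, "size": path.stat().st_size}

def build_card(folder, card_prefix, out_path):
    """Find the card's race files, load each one, write a single card file."""
    race_files = sorted(Path(folder).glob(f"{card_prefix}_*.bin"))  # step 1
    races = [parse_race_file(p) for p in race_files]                # steps 2-3
    with open(out_path, "w", encoding="utf-8") as out:              # step 4
        for race in races:
            out.write(f"{race['source']},{race['size']}\n")
    return len(races)

# Demo with three fake per-race files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for n in (1, 2, 3):
        (Path(tmp) / f"AQU20091211_{n:02d}.bin").write_bytes(b"\x00" * n)
    card_path = Path(tmp) / "AQU20091211.JCP"
    race_count = build_card(tmp, "AQU20091211", card_path)
    card_lines = card_path.read_text(encoding="utf-8").splitlines()
```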

I am able to unzip and process HDW files for both data and results completely from inside of JCapper...

I can use HDW data to Calc Races... I can use HDW data to Build Databases... I can create HDW data specific UDMS and run them through the Data Window.

If I decided to publish the version of JCapper that I am running at my end you could do exactly the same thing.

That's the progress update from my side of things.

Still on the To Do List...

Jeff's List
1. JCapper File Downloader - I still need to modify the JCapper File Downloader so that it can point to the HDW Site, enable the user to authenticate, and (hopefully) download all available files not yet downloaded with a single button click.

2. Verify and adjust as needed the algorithms for JPR, JRating, PRating, QRating, CPace, PMI, CMI, Opt Points, etc. so that JCapper users will see equivalent or better results (compared to Brisnet data) for these factors on large data samples.

I have been unable to do this so far... not because the data isn't good enough (it is) but because I haven't been given access yet to a large enough chunk of data to do meaningful tests.

At this point I should probably point out that I have been given access to a small amount of data... and from what I have seen so far it absolutely IS producing equivalent results. But before I start signing people up I absolutely DO have to perform due diligence... and look at the performance of JCapper factors across large sample tests.
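
On item 1 of the list above, the "download all available files not yet downloaded" idea boils down to comparing two lists of file names. A minimal Python sketch, with made-up file names and no claim about how the JCapper File Downloader actually works:

```python
def files_to_download(available, already_have):
    """Return the names in `available` that are missing locally, in order."""
    have = set(already_have)
    return [name for name in available if name not in have]

# Hypothetical file names, for illustration only.
on_server = ["AQU1211.zip", "CD1211.zip", "GP1211.zip"]
on_disk = ["CD1211.zip"]
pending = files_to_download(on_server, on_disk)
```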

HDW's List
1. Ron Tiller at HDW informs me that he needs to do a reinstall of their authentication software to enable their servers to handle delivery of JCapper data and results files to JCapper customers. I don't know how much work this entails but I do know that it is no small task. This step has to be done first before he can tackle the next step.

2. HDW has to "cut" back data and results files using the JCapper "file spec." I have asked for at least one year's worth of back data. After they "cut" the files, Ron then has to post them on the HDW server so that I can navigate to their site, log in, and download them.

From there, I can do my due diligence...

Right now the hard part of the project... at least from my end of things... is behind me.

I'm in a holding pattern... and probably will be for the next 4-5 days waiting for them.

-jp

.

~Edited by: jeff on: 12/11/2009 at: 6:04:16 PM~

Acorn54
12/11/2009
6:04:21 PM

jeff
it amazes me the amount of work you put into the continual development of jcapper. it has certainly come a long way since 2004.
i appreciate the work you do to make jcapper state of the art.
guy

jeff
1/10/2010
1:13:34 PM

An update...

That 4-5 day holding pattern from my previous post somehow turned into several weeks.

In that time, I've been extremely busy. As some of you might have gleaned from the "Interesting Reading" thread in the private area of the site, I've been working on ways to combine multiple probability estimates together... with the end result being overall probability estimates that are cleaner, more accurate, and more useful for betting.

That said, HDW has once again moved forward on the project and is now producing JCapper Data and Results files on a daily basis. The file format now appears to be final. Since 1/5/2010 I have been using a yet to be released version of the File Downloader to perform daily downloads directly from the HDW site. I am also using a yet to be released version of a new JCapper module called the HDW File Manager to process HDW Data and Results files.

Further, I am now building and reviewing JCapper-HDW databases on a daily basis.

There are still a few "kinks" that I have to work out. As mentioned earlier, the scaling of HDW pace figs is different than Brisnet... which in turn affects the scaling of other factors like RV and PaceIndex. But it's nothing insurmountable.

HDW also has a power rating in their files, but from what I can see so far it isn't on par with Brisnet's Prime Power so I have some work to do there too.

Other than that I'm not seeing anything that would throw a diligent JCapper user for a loop. The data produces numbers that are scaled differently than Brisnet numbers.

Aside from the "scaled differently" aspect... based on everything I've seen so far in my testing... I LIKE what I see.

There are also a number of things that HDW appears to do better. For example their F1 and F2 pace figs appear to be very good. Their Pedigree rating appears superior to the Brisnet Pedigree counterpart. But that's not the best part...

I was also able to get HDW to include some custom Stat Categories in the JCapper file spec. Without going into too much detail as to what those stat categories are... these are custom stat categories that I have been using in my own UPR and UserFactors.

This means that you are going to see some VERY interesting new factors on the HDW side of things going forward.

I'll give you a hint... I've been experimenting with feeding said new stat category factors as inputs into the algorithms producing both BettorsToteProb and E~BettorsToteProb... and have had some VERY interesting results since doing so.

HDW still has to come through with the rest of a full year's back data for me so that I can complete due diligence testing. They assure me that they are working on that.

All in all I am very happy with the quality of the data (and results from use of that data) that I've seen to date.

I'll post more as soon as HDW makes further progress... Just wanted to keep you guys in the loop.

-jp

.

jeff
1/19/2010
12:58:59 AM

Ron Tiller at HDW told me he expects to deliver a sizeable chunk of back data for due diligence testing within the next 2-3 days.

If you are interested in being part of a JCapper-HDW data beta testing group shoot me an email.

jeff @ jcapper . com
(remove the blank spaces first)

-jp

.
