BLUP, What Does It Mean? Is It a Valid Prediction Method?

BLUP is the acronym for Best Linear Unbiased Prediction. The BLUP is a method of statistical analysis and estimation. Numerical scores are given to traits and compiled as predictions for future use.

Is It a Valid Prediction Method?

"BLUP is a mathematical model that has been used for predicting future generations in certain agricultural crops - I seem to remember reading examples where it's been used to predict protein content in soybeans, butterfat in cow's milk, the mature height of certain coniferous trees...that sort of thing. Simple traits that can accurately and objectively be measured, and possibly predicted.

For BLUP to work, the following all have to be true. The characteristic to be monitored/predicted has to be simple - only one trait or one goal can be predicted in a model. It has to be something that can be objectively measured with excellent accuracy - like butterfat in cow's milk. And you can only measure a trait that is known to be heritable - we assume that Elsie's calf will grow up and will produce butterfat somewhat affected by Elsie's own genetic tendencies. And, it has to be applied to huge numbers of the animal/plant studied, in order to be statistically meaningful - by huge numbers, normally maybe hundreds of thousands of cases - but multiple-millions of individuals would be better.

BLUP, when applied to Icelandics, meets none of those criteria. The input is evaluation scores. We all know that evaluation scores are affected by who's trained the horse, who's ridden the horse, where the rider sits on the horse's back, what kind of shoes they use, how willing they are to push the horse, and how hard, what kind of nosebands, bell boots...and none of these things are heritable. Strike one.

And, of 100,000-200,000 Icelandics worldwide, only a fraction are evaluated - nowhere near the number of individuals needed to attain statistical accuracy. Strike two.

And - are evaluation scores "simple" and are they objective - like measuring the protein content of soybeans? Do you put the same priorities on what makes a good riding horse as I do? Is measuring a good riding horse as objective as measuring butterfat? No way.

It's not even applied towards predicting one single gait - which alone is probably too complicated for BLUP to work for, since your idea of a "good trot" might not be the same as mine - and that's not even considering the different priorities different people put on temperament, friendliness, intelligence, conformation... The traits we value in horses are completely subjective, and you can't apply a simple, objective model to a complicated, subjective being. Strike three.

I e-mailed a professor at an Agricultural College in Canada who had written an Agricultural genetics textbook, just to see if I was missing anything. He confirmed that I wasn't. I can probably still find the e-mails, if anyone is interested. I also bought his book. I also requested information from the guy in Iceland who is the "father" of BLUP as applied in these horses - he never responded. In this case, I do feel like I can make these statements with confidence as I have. The model just won't work on mammals - they are WAY too complicated. Soybeans maybe - if you just apply it to one single characteristic.

It will not work when applied to characteristics that are subjective (as in "willingness"), in characteristics that are not known to be inheritable, or if the samples are not large enough. And it certainly can't be used to predict multiple traits, and there are certainly many desirable and not-so-desirable traits in any mammal...My horses are much more complicated than soybeans."

The theory behind BLUP is comparatively complex, and the BLUP method demands a lot of calculations.
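
For readers who want to see what those calculations involve, the standard textbook form of the model behind BLUP is sketched below. The symbols are the usual textbook ones and are not taken from the discussion on this page; how the Icelandic system sets up its own model is not described here.

```latex
% Linear mixed model: records y, fixed effects b (for example herd or year),
% random genetic effects u, residual e. X and Z link records to effects.
y = Xb + Zu + e

% Henderson's mixed model equations, solved for \hat{b} and \hat{u}
% (\hat{u} is the BLUP of the genetic effects). A is the relationship
% matrix built from the pedigree, and \lambda = \sigma_e^2 / \sigma_u^2.
\begin{bmatrix}
  X^{\top}X & X^{\top}Z \\
  Z^{\top}X & Z^{\top}Z + \lambda A^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X^{\top}y \\ Z^{\top}y \end{bmatrix}
```

There is one unknown in u for every animal in the pedigree, so the system grows with the population, which is where the heavy computation comes from. The ratio λ is not produced by the equations; it has to be supplied from a separate estimate of how heritable the trait is.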

The following example was given:

>>I've been researching genetics with respect to some giant breed dogs and have seen pedigrees where (with respect to hip problems where 1 = no problem, 3 = some problems) a perfect hipped dog mates with another perfect hipped dog and produces a puppy with some hip problems, and then to confuse things more, at least for me . . . a #1 dog is mated with a #3 dog and the puppy is a #1.

I think BLUP is fun from a mathematical standpoint and I enjoy reading my horoscope too but when it comes right down to it, really how accurate is it???<<

Which resulted in this response:

>>In the example you gave, you were looking at one particular trait and the link wasn't quite as predictable as you might think. BLUP looks at many, many characteristics. From what I've read, or more accurately, what I haven't been able to find, I've come to think your comparison to horoscopes is pretty appropriate. Funny, yes, but sad too. I have looked for the sound science behind BLUP as used with Icelandic horses and I can't find it so far. For a mathematical model to work, the input data has to be 100% relevant. If the data is not relevant, no matter how sound the formula and / or model, the results are skewed. The more parameters with less relevance, the more skewed, to the point of becoming useless. The expression I learned in my first year of Computer Science was "GIGO" : garbage in = garbage out. My first engineering mentor taught me to question the relevance of any formulas or studies.

Many of the input parameters to BLUP are very subjective and not scientifically definable, unlike Hip Dysplasia, which has some established guidelines in its diagnosis. Has anyone ever seen an established scientific definition of "character?" Is there a universally accepted equine IQ test? Those are very subjective descriptions, certainly not anywhere near 100% definable. Is there an absolute definition of "pretty head?" How about the influence of a rider on the evaluation of gaits? Riders aren't hereditary. And horses are allowed to have slightly longer feet with shoes when they are evaluated than is considered appropriate by many trained American farriers. Shoes and trimming are not inheritable traits. In fact, the use of shoes in evaluating breeding horses could even push the breed towards less natural gaitedness. Genetics themselves are complicated enough, but when you let factors such as handling, training and management come into the equation, any value that might possibly have been in the BLUP theory is thrown out the window.

I've sent out questions to some in the Icelandic community to show me the independent scientific organizations who have validated that BLUP is appropriate for evaluating future breeding potential. I haven't yet got that request answered. There are Schools of Genetics at many universities, but I can't find published research on BLUP as it's being used with Icelandic Horses. If any independent group with valid scientific credentials does indicate that the BLUP model itself is appropriate, then the next step would be to determine if the input parameters and data being used are valid. That hasn't been answered either.

I went into this assuming that BLUP would be relevant for my tiny breeding program. However, as I've searched for substantiating background information, I've instead come to feel like I'm on a snipe hunt. My original questions arose from wondering why the Thoroughbred racing industry doesn't use BLUP. Nowhere is there more motivation to improve breeding predictability. BLUP is not used by other breeds in the USA. Is the Icelandic world ahead of the curve - or following a few without questioning? I'd love for someone to show me how BLUP is more relevant than horoscopes.<<

Along with this followup response:

>>I have another problem with BLUP. When using BLUP the "goal" is to produce horses that will do well at the Evaluations. That is not the stated goal, but by determining BLUP based on the horses' performance at the evaluations, that is what becomes the mathematical goal. (I hope that makes sense). Just how relevant is this to an individual breeder? For myself - some things are relevant (i.e. good conformation). Some things are not (i.e. lift). Some things are not measured at all (i.e. ability to be a sensible trail horse).

The problem is that BLUP gives you one number (Katina x Gymir = 110%). Does this mean that my foal will have 110% better chance of being a successful trail horse (when compared to the average Icelandic)? Absolutely not! It means that my foal has a 110% better chance of being successful at the Evaluations than the average Icelandic. (If the mathematics are accurate). Is that important to me? No.

They use a similar system in dairy cows - the characteristics that are important and measurable are used (i.e. milk production, solids) and these are weighted depending upon the "hereditability" of that trait (and the influence of a particular farm is taken into account (the cow is compared to other cows in her herd)) and then the statisticians wave their magic wands and - poof - a highly successful program is launched. BLUP measures things that are not objective and I don't know that the actual "hereditability" of the traits that are being measured is known. (You need a huge number of animals and very, very accurate record keeping). Some traits are determined almost entirely by genetics (i.e. coat colour) and others are not (how does "character" get measured?). The higher the number of traits being measured - the less useful the number is.<<
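
To make the dairy comparison concrete, here is a minimal sketch of the kind of calculation the post describes: each cow's record is taken as a deviation from her own herd's average, scaled by the trait's heritability, and the traits are then combined with weights. This is a simplified selection index, not BLUP itself and not the method of any real dairy or Icelandic registry. The heritabilities, weights, and records are made-up numbers, and the cows other than Elsie are invented for the example.

```python
# Hypothetical values, chosen only for illustration.
HERITABILITY = {"milk_kg": 0.30, "solids_pct": 0.50}
WEIGHT = {"milk_kg": 0.001, "solids_pct": 1.0}  # brings both traits to a similar scale


def herd_means(herd):
    """Average each trait over all cows in one herd."""
    return {t: sum(cow[t] for cow in herd) / len(herd) for t in HERITABILITY}


def simple_index(cow, herd):
    """Weighted sum of heritability-scaled deviations from the herd average."""
    means = herd_means(herd)
    return sum(
        WEIGHT[t] * HERITABILITY[t] * (cow[t] - means[t]) for t in HERITABILITY
    )


# One herd, so every cow is compared only with her own herd mates.
herd = [
    {"name": "Elsie", "milk_kg": 9000.0, "solids_pct": 12.5},
    {"name": "Daisy", "milk_kg": 8000.0, "solids_pct": 12.0},
    {"name": "Buttercup", "milk_kg": 7000.0, "solids_pct": 13.0},
]

for cow in herd:
    print(cow["name"], round(simple_index(cow, herd), 3))
```

The index is only as good as the heritability figures fed into it. If a trait's heritability is unknown or close to zero, its term contributes little or nothing reliable, which is the concern the post raises about subjective traits such as "character".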

Followed by:

>>It's a huge jump between mathematical theory and practice and that's what I'm asking about. I understand that BLUP may have been used to predict some simple traits in both livestock and field crops regarding traits that are known without doubt to have a genetic connection. However, Icelandic evaluation scores take into account many, many characteristics, and some are also influenced by training and management and the strength of a genetic link is not always established. Many evaluation criteria have both genetic and management links, but management is not hereditary. What I haven't been able to find is any research that suggests that the Icelandic breeding evaluations are appropriate to use as data into this model. Can anyone point me to a study by a well known independent scientific organization (a university, etc.) that shows a strong cause and effect relationship between the data input and the output related to horse breeding? Despite many questions, and much searching on the Internet I can't find it.

There are a lot of statistical models that are valid when used in the appropriate situation, but if the correct model is not chosen for the appropriate situation and the data is not determined to be purely related to the traits being predicted, the results are very suspect if not totally invalid.

So far, I can't even find out why this particular model was chosen, who decided that it was the appropriate model to use, and if the model is valid when so many traits are being considered, or who validated that these input parameters can be successfully used for predicting future breedings. In fact, I can't even find what parts of the evaluation numbers are used as field data for determining BLUP. Most research methods and studies have this sort of background documentation available to prove their merit. My engineering training makes me suspicious as to why more background documentation details are not available. If this is valid science, then why aren't other breeds using it?<<

Adding to the discussion:

>>First let me qualify this post by saying: I'm not a horse breeder in any capacity. So, that will explain my ignorance on this subject. : )

However, it is my sincerest hope that the breeding of Icelandic horses will always include the variety of horses we find available now.

I tend not to like "systems". Mother nature does not seem to like them either. And, for the most part, that's who REALLY gave us the Icelandic horse we know and love. Man has only been meddling in the mix for the last 80 years or so.

A system like BLUP and to some extent, the evaluations, is geared towards one "perfect type" of horse which may or may not be ideally suited to all kinds of riding and riders who fancy Icelandic horses. My prayer is that there will be enough "maverick" breeders out there who will not be swayed by the fashion of the show ring in terms of turning out the "perfect" Icelandic and instead, breed what they like. This will ultimately keep the variety of the breed intact. Yes, it will also likely produce some real toads, but if Iceland is still culling 25% of its foal herd, I don't think this idea is any worse.

While the current trend might be for elegant, refined, narrow bodied, long-legged, highly animated, miniature Saddlebreds or Friesian look-a-likes, there will still be smaller breeders offering something that resembles a horse that looks like it survived the millennium and will likely make it to the next. Give me a usin' pony over a hothouse flower any day.<<

And:

>>The only other breed for which I've seen any BLUP references is the Swedish Warmblood. I couldn't find any details on that breed either, but I know its numbers are pretty small too, especially when you compare to the large numbers of American breeds like AQHA, TB, Apps, etc.<<

In regard to looking at the total number, or the numbers assigned to individual traits:

>>I absolutely agree that each individual trait should be considered, not a total score, with or without a formal evaluation, and I think that was the point.

In an ideal scenario, one would hope that the sire's recessive traits would align with the mare's superior dominant traits, and vice versa. We use the best judgement we can, pick the best breeding candidates we can, but we still have to keep our fingers crossed for a good roll of the genetic dice, followed by a healthy pregnancy, normal delivery, and then hope we pick the best practices for raising, handling, and training each individual foal. It's a long and complicated process and I can't see how it can be reduced to a single number, or even a series of numbers. There has to be a little art and intuition used with some genuinely sound science.<<


>>Great post. You have re-stated the same concerns that other vets, mathematicians, and people trained in statistical and scientific research have pointed out when I've asked them about this issue. I took the liberty of re-listing six of your points below.

1. Some things are not measured at all (i.e. ability to be a sensible trail horse).

2. If the mathematics are accurate

3. BLUP measures things that are not objective and I don't know that the actual "hereditability" of the traits that are being measured is known.

4. Some traits are determined almost entirely by genetics (i.e. coat color) and others are not (how does "character" get measured?).

5. The higher the number of traits being measured - the less useful the number is.

6. When using BLUP the "goal" is to produce horses that will do well at the Evaluations. That is not the stated goal, but by determining BLUP based on the horses' performance at the evaluations, that is what becomes the mathematical goal.

With so many basic premises of this model in question, it can't be taken seriously. BLUP as used with Icelandic horses seems to be pretty darned flawed from a scientific point of view from the beginning.<<

In regard to a stallion and his BLUP score and individual traits:

>>You may possibly have found a good example, but I'd have to ask more questions to say. I think we agree on the unimportance of BLUPs, so this isn't to you personally, just another attempt to make people aware of potential pitfalls of using a questionable method.

There could be very little link between BLUPs and pace here. For example, to draw the conclusion that HE and he alone passed on the pace, you'd have to carefully study the mares he was bred to and their genes - each of them contributed 50% to the offspring too. Is it possible that some (many?) mare owners purposely selected him to sire offspring to their mares who had strong pace? For example, might a breeder be tempted to breed a good mare with a slightly stronger pace than he's comfortable with to a stallion that scored a 9.0 on trot and tolt, but only 8.0 on pace? If that was the case very often, the pace might be showing up DESPITE his genes instead of BECAUSE of them...? What were these breeders' goals - strong pace, four-gaits, or an even distribution of five gaits? Did they all have the same goal? Different horse breeders can have very different goals, so it's not like with soybeans where for instance, this year's genetic manipulation might aim for a given protein range and next year, drought-resistance. But what really concerns me (statistically anyway) is that only 28% of his offspring were evaluated, even though that is probably a pretty high percentage for any stallion in the real world. What about the other 72%? Any hard-pacers or hard-trotters among those 492? That's a lot more of his offspring that we know nothing about compared to the ones we do know something about.

If we could show that many or most of the mares he bred had no or weak pace, then it would appear more likely that he passed on pace more strongly than he showed himself.

A good study shows how extraneous data, coincidences, etc. are explained. Isn't it likely that the people who had his offspring evaluated purposely chose the "best" of his offspring and possibly selected more offspring of a given type? (I'm not going into the subjectivity of "best", but assume we're talking "best" relating to "normal" judging of traits scored in evaluations that are evenly applied everywhere.) If you only chose to study what might be the "cream of the crop" and that "cream" is only 28% of the available data, and many or most of those owners were after a particular type horse, then the data is again very skewed from the get-go. Further, we'd have to know if the owners of the offspring tested were more (or less) likely to use any special (harsher or less harsh) training, bitting, trimming, or shoeing methods, to see the role training and environment might have played.

I confess I don't understand the whole idea of 100% accuracy relating to BLUPs. If the BLUP is determined by feeding in the evaluations of his offspring, wouldn't the results always be 100% just by definition? I haven't seen any definition of that "percent accuracy" relating to Icelandic BLUPs and again, since that's not readily available, that omission of information concerns me.

I don't know enough about how evaluations are actually conducted to have a strong opinion on them yet. However, I can at least accept and respect what I understand they claim to be: a measurement of how that particular horse looked and/or performed on that particular day, by stated rules (even if I question some of the rules), hopefully by judges that are not overly subjective or biased. But it's really pushing it to try to predict the future by these scores by projecting them into BLUPs.

We'd all like as many tools as possible when making breeding decisions, but I'm afraid BLUP could actually be misleading in more situations than it's helpful. That's not a tool I want to use.<<
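
The sampling concern in the post above (only 28% of the offspring evaluated, and probably the best ones) can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not a reconstruction of any real stallion's record: the scores are random numbers, the mean of 7.5 and spread of 0.5 are invented, and the total of 683 offspring is simply what 492 unevaluated horses at 72% would imply. The rule that owners send only the top-scoring horses is the post's hypothesis taken to its extreme.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical "true" scores for every offspring of one stallion.
N_OFFSPRING = 683
all_scores = [random.gauss(7.5, 0.5) for _ in range(N_OFFSPRING)]


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Assumption: only the best 28% are ever presented for evaluation.
n_evaluated = round(0.28 * N_OFFSPRING)
ranked = sorted(all_scores, reverse=True)
evaluated = ranked[:n_evaluated]
unseen = ranked[n_evaluated:]

print("evaluated:", n_evaluated, "unseen:", len(unseen))
print("mean of all offspring: ", round(mean(all_scores), 2))
print("mean of evaluated only:", round(mean(evaluated), 2))
```

Under these assumptions the evaluated group always averages higher than the full group, so any figure built only from evaluated offspring overstates what the stallion produces on average. Real owners will not select this strictly, so the actual size of the effect is unknown.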


>>I spent many, many an hour learning all about the system for breeding dairy cows (BS and MS in Animal Science, large animal track through vet school); finally got some use out of it! The other point to consider is that in dairy cows the goals are clear, universally accepted, easily measured, influenced strongly by genetics and few in number - (more milk/higher solids/can't remember the rest - but it's a small number). There are a lot of cows in the measurements (zillions, more or less). This system has been highly effective because of those factors.

The BLUP system has none of these attributes.

I don't want a system for producing Icelandics that will do well at the evaluations to be effective because I don't think that I want that horse. We all have different horse preferences and I think it's great that the Icelandics come in many different packages. Parts of the evaluation are useful for matching up mares and stallions - however - I question the validity of the final score.<<

>>BLUP - the actual % that is given as a final score: IMO that % score is not useful and may, indeed, lead some people down a path they did not intend to go.<<

