Averaging Relationship Between 2 Test Subjects

apple405

New member
Joined
Feb 17, 2020
Messages
1
Hi

I am new to this forum and have a maths problem. So basically, I am trying to quantitatively assess a bull’s effect on their offspring’s milk yield. So I have made a model database of 3 bulls and 3 cows. Each cow has one heifer calf to each bull and we have results for average lactation yield per cow (how much milk they produce every year). We have Cow A, Cow B and Cow C. We have Bull 1, Bull 2 and Bull 3. We have heifer I, heifer II, heifer III, heifer IV, heifer V, heifer VI, heifer VII, heifer VIII and heifer IX. Here are my results:

HeiferBullCowMilk Yield
Heifer IBull 1Cow A
3500​
Heifer IIBull 2Cow A
3000​
Heifer IIIBull 3Cow A
2750​
Heifer IVBull 1Cow B
2750​
Heifer VBull 2Cow B
2800​
Heifer VIBull 3Cow B
2650​
Heifer VIIBull 1Cow C
3750​
Heifer VIIIBull 2Cow C
2000​
Heifer IXBull 3Cow C
2250​

Cow 1Cow 2Cow 3
B1vB2
500​
-50​
1750​
733.3333​
B1vB3
750​
100​
1500​
783.3333​
B2vB3
250​
150​
-250​
50​
B1 is 733l better than B2
B1 is 783l better than B3
B2 is 50l better than B3

I would like to be able to this on a much larger scale and I was wondering if there was any algorithm I could use, or any way of speeding up this process. Or has anyone any suggestions on what I could read up on to help me with this?

Thanks
 
Hi, welcome to FMH !

You seem to have worked out the calculation that you need. And the problem is to perform this exact calculation on more data (if I understand correctly).

A "SQL database" might be a solution. But it very much depends on how tek-savvy you are. You'd perhaps be better off producing a spreadsheet if you find the following too confusing, or if you simply don't have the time to learn SQL. So I'm a little hesitant to post this. But here goes:

You can try it on http://sqlfiddle.com/ Paste this into the left hand pane:

SQL:
/* Schema */
CREATE TABLE yield (
    bull_id INT,
    cow_id INT,
    yield  FLOAT NOT NULL,
    CONSTRAINT PK_yield PRIMARY KEY (bull_id,cow_id)
);

/* your data */
INSERT INTO YIELD VALUES (1,1, 3500);
INSERT INTO YIELD VALUES (2,1, 3000);
INSERT INTO YIELD VALUES (3,1, 2750);

INSERT INTO YIELD VALUES (1,2, 2750);
INSERT INTO YIELD VALUES (2,2, 2800);
INSERT INTO YIELD VALUES (3,2, 2650);

INSERT INTO YIELD VALUES (1,3, 3750);
INSERT INTO YIELD VALUES (2,3, 2000);
INSERT INTO YIELD VALUES (3,3, 2250);

/* Some extra data */
INSERT INTO YIELD VALUES (4,4, 2000);
INSERT INTO YIELD VALUES (5,4, 3000);
/* The following will be ignored because cow 5 has heifer with only one bull so no comparison is possible */
INSERT INTO YIELD VALUES (3,5, 1000);


And the following code produces the results, paste it into the right hand pane:

SQL:
SELECT
  y1.bull_id AS bullA,
  y2.bull_id AS bullB,
  avg(y1.yield)-avg(y2.yield) AS comparison,
  count(y1.cow_id) AS num_cows
FROM
  yield y1, yield y2
WHERE
  y2.bull_id > y1.bull_id
  AND y1.cow_id = y2.cow_id
GROUP BY
  y1.bull_id, y2.bull_id
;


You should get the following output:

bullAbullBcomparisonnum_cows
12733.3333333333333
13783.3333333333333
23503
45-10001

"num_cows" is the number of results that were averaged to form the results. So it should give you a confidence level in the comparison figure.

If you wish to take this idea further then I think you'd be better off asking on a SQL forum, search for "top sql forums"
 
From the mathematics point of view, I recommend caution when using statistics like this. It is very difficult to capture the "whole picture" into one number. When you think you've succeeded to do this you'll probably find, at some future date, that you forgot to factor something in.

For example, does bull X produce offspring that produce a lot of milk for one year - and then dry up? Maybe their heifers would be more likely to get bad teeth/ hooves/ whatever, incurring extra vet bills that more than offset for the extra milk yield down the line.

There's also benefits to diverse genetics in livestock. It's like hedging your bets. If bull X's offspring happen to be very susceptible to a particular external virus - then the whole herd could be wiped out in one go without having that diversity.

I urge that you don't use the above tool in isolation. Please use it with commonsense and factor in advice from other sources. (You'd probably do this anyway!)
 
Top