The victims of the globalisation of dressage
freestyle2music
May. 8, 2009, 09:17 AM
The magic word "Globalisation of Dressage" has some very nasty side-effects.
Often we see at Lower Level Competitions that judges from all kinds of exotic countries get a chance to shine their light on the performances of Junior/Young- and Pony riders or other lower level combinations.
This weekend we have the prestigious CDI JR/YR/Ponies at Weert in the Netherlands.
The first results are published and yes, we see (again) that riders are placed #3, #26 and #18 by the different judges. The winner was placed #8, #1 and #1.
http://www.ichweert.nl/uploads/d__ich-weert_002-resp_htm.pdf
At the last FEI Task Force meeting someone suggested setting up a video database with every single movement scored from 3 to 10.
Wouldn't this be a very good tool to teach these exotic judges and the public? Every single movement on video, marked and explained by the top judges.
Theo
dotneko
May. 8, 2009, 10:00 AM
I have been lobbying for that at the National Level
Dressage Judges Forums every time I go to one.
We really all need to be on the same page.
Dot
ShotenStar
May. 8, 2009, 10:07 AM
Those of us who were/are involved in the US 'Performance Standards' issue are also in favor of this type of rigor in judge training. Without the 'users' of the 'tools' being trained to a well-defined standard, it doesn't matter whether the 'tool' has half-points, is on a 0-100 scale, or is an 'A-B-C' scale. The 'tool' can only be as good as the 'users'.
*Star*
CatOnLap
May. 8, 2009, 10:23 AM
It is a very good idea to train all our judges to a standard. Videotapes and computers make it quite possible to do this for every judge in the world, if the judges would care to take the time. Wouldn't it be nice to have the judges qualify to a standard that actually had visual "examples" that were the same for everyone? instead of just shadow judging a few shows and taking verbal/written tests?
dilligaff2
May. 8, 2009, 10:31 AM
WAIT!
Finland is exotic????
Who knew?:eek:
canyonoak
May. 8, 2009, 10:38 AM
OK.
So which judges are going to watch the videos and set the marks?
I know which judges I want to see be responsible for this...but not so sure these will be the final choices.
CatOnLap
May. 8, 2009, 10:38 AM
PS- Theo, I am quite sure this is not just a problem with judges from "exotic" countries. Seems to me, even the less exotic judging comes into question, especially with the "popular" top riders who often seem, to the rest of us, not to deserve the top placings at times. Sometimes those less exotic judges seem too taken with the cult of celebrity in their home countries.
merrygoround
May. 8, 2009, 10:40 AM
C-O-L Have you ever studied the requirements for licensing for judging by the USEF? The requirements these days are a little more rigorous. There are some quite well qualified judges at their respective levels in your neighborhood.
As far as the YR scoring::D They were relatively unanimous at the end of the list, :lol: seems everyone can recognize a bad ride, it's the good rides that give them trouble:sigh:
Ambrey
May. 8, 2009, 10:48 AM
Wouldn't this be a very good tool to teach these exotic judges and the public? Every single movement on video, marked and explained by the top judges.
It's one method to study the measurement method.
There are other ways, of course, using statistical analysis when there is multiple judging.
Again I'll say... why any organization would have a measurement on which so much rides and NOT be doing reliability/validity/inter-rater reliability studies is waaaay beyond me.
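One common way to put a number on inter-rater agreement when several judges score the same class is Kendall's coefficient of concordance (W), which runs from 0 (no agreement in the rankings) to 1 (identical rankings). The following is only a minimal sketch: the function names and the scores are invented for illustration, nothing here comes from the Weert results or from any FEI study, and the simple formula used does not apply a correction for tied scores.

```python
def ranks(scores):
    """Rank scores from highest (rank 1) down; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one (a tie).
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def kendalls_w(score_table):
    """Kendall's W for a table with one row of scores per judge.

    Every row must list the riders in the same order.
    No tie correction is applied (an assumption of this sketch).
    """
    m = len(score_table)       # number of judges
    n = len(score_table[0])    # number of riders
    rank_rows = [ranks(row) for row in score_table]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical percentages for four riders from three judges.
hypothetical_scores = [
    [68.2, 65.1, 70.4, 61.0],  # judge 1
    [66.0, 67.3, 69.8, 60.2],  # judge 2
    [64.5, 63.9, 71.1, 62.7],  # judge 3
]
print(round(kendalls_w(hypothetical_scores), 3))
```

A value close to 1 would suggest the panel ranks the class in much the same order even if the percentages differ; a low value would point to the kind of disagreement described at the top of the thread.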
Mike Matson
May. 8, 2009, 11:16 AM
Stop worrying Theo. Remember, someone is working on a computer program that will replace the judges. :)
freestyle2music
May. 8, 2009, 11:48 AM
Stop worrying Theo. Remember, someone is working on a computer program that will replace the judges. :)
Yes that's why the computer asked me to make these videos.:D:lol:
freestyle2music
May. 8, 2009, 11:49 AM
It's one method to study the measurement method.
There are other ways, of course, using statistical analysis when there is multiple judging.
Again I'll say... why any organization would have a measurement on which so much rides and NOT be doing reliability/validity/inter-rater reliability studies is waaaay beyond me.
You probably missed the fact that this was the end conclusion of all the thousands of mathematical analyses presented. Ask CanyonOak, she sure has been through all the facts and figures which were presented.
mbm
May. 8, 2009, 12:04 PM
great idea. but who is going to interpret what a "10" is? or a 3 ? isnt that the exact issue we are having now? the interpretation of the standard? .
i also think that it has less to do with globalization and more to do with the fact that there are at least two different "camps" in dressage and probably the scoring reflects that more than anything (besides of course the cult of personality that surrounds various topriders)
it will be interesting to see what happens and what effect it will have.
Ambrey
May. 8, 2009, 12:12 PM
You probably missed the fact that this was the end conclusion of all the thousands of mathematical analyses presented. Ask CanyonOak, she sure has been through all the facts and figures which were presented.
So they have determined, from this mathematical analysis, which judges have the poorest reliability with respect to their fellows and are requiring those judges to get additional training? Which tests have the lowest inter-rater reliability in general, and they are re-writing the judging criteria for those tests? What variables are affecting the inter-rater reliability, and how those variables can be controlled?
What you posted is not the result of an analysis; it was simply an observation. I keep hearing that analysis is done, but have yet to see the results- I'd love to if someone would post them.
mbm
May. 8, 2009, 12:18 PM
i am no statistician, but how did they quantify the "cult of personality" into the score studies?
are they going to require those judges that are blind to what those riders do extracurricular training?
and the flip side of this is: are they going to address the fact that a rider needs to "ride in front of" a judge many times before they can hope to get a fair score?
freestyle2music
May. 8, 2009, 04:39 PM
i am no statistician, but how did they quantify the "cult of personality" into the score studies?
are they going to require those judges that are blind to what those riders do extracurricular training?
and the flip side of this is: are they going to address the fact that a rider needs to "ride in front of" a judge many times before they can hope to get a fair score?
Just read the 214 pages of this study, and you will find the answers.
Theo
Ambrey
May. 8, 2009, 04:42 PM
Is the study online?
dressageUK
May. 8, 2009, 05:32 PM
isn't this what the FEI judges handbook was designed to address? How cool would it be if this was turned into an interactive e-book, with video clips to represent all the movements being discussed?
Stephen Clarke, who was a major player in the FEI handbook & is an O judge has produced a DVD series in the UK for the movements from training to 4th level ridden by different standards of riders & marked accordingly.
Ambrey
May. 8, 2009, 05:34 PM
isn't this what the FEI judges handbook was designed to address? How cool would it be if this was turned into an interactive e-book, with video clips to represent all the movements being discussed?
That won't help if there is no process by which judges are being held to that standard, or checks and balances to make sure that the judges are implementing them consistently.
slc2
May. 8, 2009, 05:39 PM
The judging is an OPINION. OPINIONS differ. Plus, most of the 'huge' differences in placing were caused by a difference of scoring of 3 points or so.
Too, when you look at how the class placed, it often turns out that the 'high score' and the 'low score' don't actually affect the placing of the class much at all.
Now to have any understanding of why some horses got larger scoring differences, I really think you need to see the tests and read them over, score by score, and see why. If there are no comments on a lower score, and the judge cannot explain why he scored that way, then yes, I would be concerned.
If a judge was consistently lower OVERALL, I would not be concerned.
If he was consistently higher overall, I would not be concerned. If he consistently scored everyone from his country higher, I would be concerned.
Looking at some specifics. The ninth placer was a Dutch horse. He got his lowest score from the Finnish judge, and his highest score from the Belgian judge.
The French judge's score placed him the closest to his actual placing in the class. His highest score was 3+ points higher than his average score.
His lowest score was 3+ points lower than his average score.
The judge who placed him closest to his average score was 1+ points off his average score.
He finished in the class, exactly where the judge closest to his average score would have put him, and exactly where the 'middle of the road' scoring judge would have placed him.
In other words, it appears the high and low score he received, did not place him substantially differently than he would have placed, had the two scores agreed more closely with the 'middle of the road' judge.
The Finnish judge gave a great many horses a lower score than they got from the other judges.
I think the Finnish judge was the lowest scorer the most times, I think about for half the horses.
The French judge was the lowest scorer for only about 5 horses.
The Belgian judge was the lowest scorer for I think only about 6 horses.
Did the differences in scoring affect the actual placing of the top 3 horses? I would have to get a calculator, but I don't think it really did.
The biggest disagreements seemed to be actually around the middle of the class. There was more unanimity around the bottom of the class.
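The checks described in this post (how often each judge gave the lowest mark, and where each judge alone would have placed a horse compared with the final result) are easy to script. The following is a rough sketch under stated assumptions: the rider labels, the judge codes FIN, FRA and BEL, and all the percentages are made up for illustration and are not the Weert scores, and the final result is assumed to be the plain average of the judges' percentages.

```python
from collections import Counter


def lowest_scorer_counts(scores):
    """Count how often each judge gave a rider's lowest mark.

    scores maps rider -> {judge: percentage}. A tie for lowest counts for each judge.
    """
    counts = Counter()
    for marks in scores.values():
        low = min(marks.values())
        for judge, mark in marks.items():
            if mark == low:
                counts[judge] += 1
    return counts


def placings(totals):
    """Turn {name: score} into {name: placing}, with 1 for the highest score."""
    ordered = sorted(totals, key=totals.get, reverse=True)
    return {name: place for place, name in enumerate(ordered, start=1)}


def placing_table(scores):
    """For every rider, the final placing and the placing each judge alone would give."""
    judges = sorted(next(iter(scores.values())))
    final = placings({r: sum(m.values()) / len(m) for r, m in scores.items()})
    by_judge = {j: placings({r: m[j] for r, m in scores.items()}) for j in judges}
    return {r: {"final": final[r], **{j: by_judge[j][r] for j in judges}} for r in scores}


# Invented example data: three riders, three judges.
example = {
    "rider A": {"FIN": 66.0, "FRA": 70.0, "BEL": 71.0},
    "rider B": {"FIN": 64.0, "FRA": 68.0, "BEL": 66.0},
    "rider C": {"FIN": 65.0, "FRA": 63.0, "BEL": 67.0},
}
print(lowest_scorer_counts(example))
for rider, row in placing_table(example).items():
    print(rider, row)
```

Comparing the "final" column with each judge's column shows at a glance whether a high or low individual score would actually have moved a horse in the class.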
freestyle2music
May. 9, 2009, 10:02 AM
You can go through every single rider and horse, analyse it and give your opinions, but it's obvious that the Finnish judge is a complete disaster in almost all classes.
Answering another question :
http://www.dressage-analysis.com/articles.html
Have fun,
Theo
CatOnLap
May. 9, 2009, 11:05 AM
C-O-L Have you ever studied the requirements for licensing for judging by the USEF? The requirements these days are a little more rigorous. There are some quite well qualified judges at their respective levels in your neighborhood.
I am not criticizing any particular judges. There are very few qualified judges in my neighbourhood. The ONE FEI qualified judge in my neighbourhood, who was a very good friend for decades, died a few years back. We also have 2 USEF dressage judges (S) within an hour's travel. That does not read to me like "quite a few". I know and like both of them, but the scores I get riding under them bear little resemblance to scores I get from imported judges. We also have a handful of people who judge dressage here with local horse council credentials only.
I have not studied to become a USEF judge. I have studied the local requirements for judging in my country and have gone part way through the program, judging schooling shows and shadow judging and such, but family matters distracted me from pursuing that. I am familiar with the judging problems in the US from my work with the CotH NerdHerd, however, and recognize that although an individual judge may have internal consistency, there is too little inter-rater reliability between the qualified judges, and that there is little agreement on what constitutes the "best" ride, which is illustrated by Theo's score posting. Stephen Clarke's work and the video guides in the UK are a good step towards what we need. As the NerdHerd concluded, there need to be consistent standards first, before we can say that the judging is a disaster. We don't really have a good "ruler" yet to measure against. Therefore, if the judging is off, it is not really the fault of the judges, who know what they like and judge to a consistent internal standard. The Finnish judge doesn't agree with the others, but who is to say what is "right"? Perhaps the Finnish judge is the enlightened one.
PS- Theo thanks for the dressage analysis links-
slc2
May. 9, 2009, 11:18 AM
Why is the Finnish judge 'a disaster'? He was consistent. I am not so sure the ponies are easy to judge. They are looking very exaggerated. I am sure opinions are diverging very strongly and that a more traditional judge might have some very acerbic scores to deliver.
freestyle2music
May. 9, 2009, 11:29 AM
Why is the Finnish judge 'a disaster'? He was consistent. I am not so sure the ponies are easy to judge. They are looking very exaggerated. I am sure opinions are diverging very strongly and that a more traditional judge might have some very acerbic scores to deliver.
#1 It's not a HE but a SHE
#2 SHE was not judging the ponies but the Junior Riders
#3 I was there.
Yes, the Finnish judge was a disaster (also today and probably also tomorrow)
Theo
ridgeback
May. 9, 2009, 11:36 AM
#1 It's not a HE but a SHE
#2 SHE was not judging the ponies but the Junior Riders
#3 I was there.
Yes, the Finnish judge was a disaster (also today and probably also tomorrow)
Theo
So she's a disaster because she's a she? I thought you might play for the other team:D
mbm
May. 9, 2009, 11:38 AM
interesting web site.
here is a question tho from someone who knows nothing about statistics: while i get that accuracy is important, since dressage is so subjective and since there is a pretty big divide between what some folks think is correct (ie: horses ridden in a more classical frame as opposed to a more "modern" frame) doesn't it seem correct that some judges would score either higher or lower than the average judge?
and, i personally think that if "they" implement a system where the judges must be within x score of everyone else that we will see a break up of dressage.
because in effect you would be taking the voice away of the judges and instead require them to be "politically correct" because in fact a score is the judges only way of saying what he/she thinks.
while i agree that judges should have training - who gets to decide what is "correct"?
take for example the "L" program here in the US, while it is a program of teaching consistency in judging there are some folks who disagree with what they are teaching.
how will the above be addressed?
Ambrey
May. 9, 2009, 11:40 AM
So in your world, "Finnish" is "exotic?"
Because from over here, Finland isn't that far from Holland.... it seems more... local.
I was thinking you were having trouble with Sri Lankan judges or Laotian judges. I hardly think expanding to neighboring nations while still staying within such a concentrated area qualifies as "globalization."
Ambrey
May. 9, 2009, 11:44 AM
interesting web site.
here is a question tho from someone who knows nothing about statistics: while i get that accuracy is important, since dressage is so subjective and since there is a pretty big divide between what some folks think is correct (ie: horses ridden in a more classical frame as opposed to a more "modern" frame) doesn't it seem correct that some judges would score either higher or lower than the average judge?
I haven't had a chance to read it yet, I'll get to that- but to answer that question.
No, a judge that consistently scores differently than other judges is damaging the validity of the test. A test that is based on subjective means can't be valid unless people have a reasonable expectation that trained examiners will come to approximately the same conclusion given the same input.
The question is how MUCH difference will be allowed. There won't be complete agreement, there never is. But if one judge consistently scores either higher or lower than the "norm" it means his measurement criteria is not the same one everyone else is using, which is very detrimental to the measurement system.
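To make the distinction concrete: a judge who is always a fixed amount lower than the panel is a different case from a judge whose marks swing unpredictably around the panel. One simple way to separate the two is to look at each judge's average offset from the panel mean and at how much that offset varies from ride to ride. This is only a sketch; the function name, judge labels and percentages are hypothetical, and a real study would more likely use a proper variance-components or intraclass-correlation analysis.

```python
from statistics import mean, pstdev


def judge_offsets(scores):
    """Return {judge: (mean offset, spread of offset)} relative to the panel mean.

    scores maps rider -> {judge: percentage}.
    A large mean offset with a small spread suggests a consistently strict or
    generous judge; a large spread suggests the judge ranks rides differently.
    """
    deviations = {}
    for marks in scores.values():
        panel_mean = mean(marks.values())
        for judge, mark in marks.items():
            deviations.setdefault(judge, []).append(mark - panel_mean)
    return {judge: (mean(d), pstdev(d)) for judge, d in deviations.items()}


# Invented data: judge X is always lower, judge Z is erratic.
example = {
    "ride 1": {"X": 62.0, "Y": 65.0, "Z": 68.0},
    "ride 2": {"X": 66.0, "Y": 69.0, "Z": 66.0},
    "ride 3": {"X": 60.0, "Y": 63.0, "Z": 69.0},
}
for judge, (offset, spread) in judge_offsets(example).items():
    print(judge, round(offset, 2), round(spread, 2))
```

How much offset or spread should be tolerated is exactly the open question in this thread; the numbers only describe the panel, they do not say which judge is "right".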
freestyle2music
May. 9, 2009, 11:56 AM
So in your world, "Finnish" is "exotic?"
Because from over here, Finland isn't that far from Holland.... it seems more... local.
I was thinking you were having trouble with Sri Lankan judges or Laotian judges. I hardly think expanding to neighboring nations while still staying within such a concentrated area qualifies as "globalization."
Exotic judges is a saying which Tineke Bartels initiated manyyyy years ago.
It doesn't have anything to do with Bounty Islands or any other geographical areas.
Ambrey
May. 9, 2009, 11:59 AM
OK, well it makes you sound like quite the nationalist to call a judge from a nearby European nation "exotic" and bemoan globalization of dressage on a US board because you don't like the judge from a country closer to you than many US states are to the rest of us;) Just thought I'd point that out.
mbm
May. 9, 2009, 12:04 PM
I haven't had a chance to read it yet, I'll get to that- but to answer that question.
No, a judge that consistently scores differently than other judges is damaging the validity of the test. A test that is based on subjective means can't be valid unless people have a reasonable expectation that trained examiners will come to approximately the same conclusion given the same input.
The question is how MUCH difference will be allowed. There won't be complete agreement, there never is. But if one judge consistently scores either higher or lower than the "norm" it means his measurement criteria is not the same one everyone else is using, which is very detrimental to the measurement system.
so how will it be dealt with that there is this divide in judging? and again WHO gets to decide what correct is? the breeders of fancy warmbloods? classical trainers/judges? rollkur trainers? who?
again, i understand the need for accuracy - but how do you get accuracy when there is such a divide in what is considered correct?
i dont personally think you can. i think this will divide dressage down the middle and we will have eXtrmeStreSSage, and traditional dressage as two separate entities.
Heck, if i was a judge i would be pissed that my voice was going to be taken away and made vanilla - just doesn't seem very democratic.
perhaps there will need to be 2 "divisions" that a judge can want to test for?
again not knowing anything about how to make things more statistically accurate maybe this has already been worked out in other sports?
and finally - i hope that the website they create will be open to everyone (for a fee?) it will be very interesting to say the least.
and pps - i thought it interesting that Loriston-Clarke was outside the norm and had a couple places where she was tagged for "review"
Ambrey
May. 9, 2009, 12:12 PM
FEI determines what is correct, as they are developing the tests and the measurement scales. I haven't read them, but from what I've heard judging criteria are quite specific.
The judges are not the ones who get to decide what is and is not correct dressage, they just have to judge to the criteria. Otherwise, there's no way to have fair competition!
mbm
May. 9, 2009, 12:19 PM
but ambrey - the entire problem for some of us, is that it appears that the judges are NOT judging to the criteria set forth very clearly by the FEI.
i suggest you read the rules. it is quite interesting.
ETA: also a lot of what is in the rules can be "understood" differently depending on how you look at it.
take the rule that the horse needs to be on or approaching the vertical (i am paraphrasing here)
you can understand this to mean approaching as "in front of " OR approaching from behind....
Ambrey
May. 9, 2009, 12:25 PM
I think that's exactly what is being addressed. But the nature of tests would make it probable that the AVERAGE would be the most approaching the standards, and that those who are not judging to the standards are more likely to be outliers.
But you are correct, additional measures are needed to demonstrate that the criteria are being adhered to. In tests they talk about "reliability" and "validity." "Reliability" is about being able to expect the same score for the same performance every time. "Validity" is about whether the score really measures what it's supposed to measure. You can't have validity without reliability, but you can have reliability without validity.
slc2
May. 9, 2009, 12:30 PM
You can have validity without reliability.
I have a hard time conceiving of Holland as 'the victim' of globalization of dressage.
But perhaps it is just the lingering plot of the evil Witthages and the disguised Belgian desire to rule the world.
Ambrey
May. 9, 2009, 12:40 PM
You can have validity without reliability.
No, because reliability is necessary to make the inferences you would draw from the test valid.
Just to keep this in the context of dressage- validity refers to the strength of the conclusion you can draw from the result of the test. So, if you score a 65%, what can you conclude from that score?
If you consider that there is a theoretical "true score" that a theoretical "perfect tester" would give you, reliability would measure the likelihood that the score you got would approach that true score. Without reasonable reliability, you can't make an inference based on the score because you can't be assured that the score you received is within an acceptable range of your "true score" on the test.
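The "true score" idea above is usually formalised in classical test theory. As a sketch, with symbols introduced here rather than taken from the thread (X is the score a judge awards, T the theoretical true score, E the judging error, assumed to average zero and to be uncorrelated with T):

```latex
\[
X = T + E, \qquad \sigma_X^2 = \sigma_T^2 + \sigma_E^2
\]
\[
\text{reliability} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2},
\qquad
\text{SEM} = \sigma_X \sqrt{1 - \text{reliability}}
\]
```

Under those assumptions, a reliability near 1 means most of the spread in scores reflects real differences between rides, and the standard error of measurement (SEM) gives a rough idea of how far an awarded score may sit from the true score.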
slc2
May. 9, 2009, 12:53 PM
No, actually, I do not consider that there is a 'theoretical true score' at all.
Scores are opinions that range based on a framework of FEI guidelines for judging. There is a built-in variation based on how heavily specific issues are valued by each individual judge, and to a certain degree, that is not restrictively or even partially defined by the rules.
Ambrey
May. 9, 2009, 01:00 PM
No, actually, I do not consider that there is a 'theoretical true score' at all.
Scores are opinions that range based on a framework of FEI guidelines for judging. There is a built-in variation based on how heavily specific issues are valued by each individual judge, and to a certain degree, that is not restrictively or even partially defined by the rules.
Do you really want to have a dressage scoring method in which your score of 65% could mean something completely different than my score of 65%? Of course there will be subjectivity- that's the nature of these types of testing- but a level of consistency is really necessary.
If there is no "theoretical true score" in dressage, how can we have competition? If they are scoring Anky by different criteria than they are scoring Steffan, what is the point? If your 62 means something completely different from my 62, how can we ever compare our rides?
Ambrey
May. 9, 2009, 01:08 PM
p.s. thanks for the link, Theo :) When I discuss reliability and validity, it's what they are saying is "consistency and accuracy." Different terminology.
But indeed this powerpoint presentation puts it all very clearly. It also discusses some of the issues I've brought up in the past regarding discrimination in scoring and the compression of the range of scores. Good work!
Theo, you're working on the FBF video database?
slc2
May. 9, 2009, 03:45 PM
"Do you really want to have a dressage scoring method in which your score of 65% could mean something completely different than my score of 65%?"
It wouldn't mean something completely different; it would merely be the opinion of a different judge. I'm fine with that.
Your judge may be from Resume Speed, Iowa, and never have gone to a judge's seminar and be an L candidate.
My judge may be an international judge with 55 years of riding, training and showing at international levels, and two Olympic gold medals under his belt.
In a sense, those two scores will ALWAYS 'mean something completely different', even when they match.
This is the key. Your judge could have given you a 65% entirely due to his inexperience.
Too, your 65% may have been given to you because of completely different strengths and weaknesses than my 65%. The fact that two scores are the same is just as meaningless as if they are different.
Again, because the judge is rendering an opinion when s/he judges dressage. I don't want or expect judges to give exactly the same scores. I don't even expect them to always be close, and I don't feel them being different means the low one is wrong and the high one is right or vice versa.
I think judging dressage is simply not what you are trying to make it be, and it can't be what you're trying to make it be. I'm not at all sure it's even healthy, productive or good for the sport to have scoring be that 'consistent'.
People already complain that the wrong people win all the time, and that they win for too long, and that their faults are over-looked. If you force more 'consistency', that has far more chance of enforcing that than ending it.
mbm
May. 9, 2009, 04:02 PM
i think what we are talking about is the problem when you take an "art" and make it a "sport" ....
because no matter how they try, dressage can not and will not (hopefully) ever be like jumping or racing where it is very apparent who the winner is.
and if it does get to that point - it will no longer be dressage.....
i think all of this is dancing around the REAL issue. and for me that REAL issue is that many of the judges don't appear to be judging to the FEI standards.
so no matter how you dress it up with fancy statistical verbiage- it will still be not judging to the standards already set forth.
i dont see why we would want all judges to score exactly the same. why even have many judges then? why even have a judge at all?
slc2
May. 9, 2009, 04:19 PM
I agree with you.
I think the accusation that some are not following the standard is the crux of the matter and what is really bugging people, not statistics and finding a way of making the judges all urp up the exact same scores every time.
What they believe is, if that judging were uniform, the people who 'shouldn't win' would stop winning. It has every chance of doing exactly the opposite.
For people that 'shouldn't win' to stop winning, there actually needs to be agreement about what should win as well as what should not win (which is even harder to agree on). And you'd actually have to convince a great number of people that the people who are winning shouldn't be; there are a great many people that don't believe that.
Ambrey
May. 9, 2009, 04:42 PM
Your judge may be from Resume Speed, Iowa, and never have gone to a judge's seminar and be an L candidate.
My judge may be an international judge with 55 years of riding, training and showing at international levels, and two Olympic gold medals under his belt.
This is not at all what the discussion is about. We're talking about 2 international level judges, judging international competition of the highest levels, and whether those scores... scores from the highest trained judges... are valid and reliable... not whether a highly trained judge and a less trained judge will agree with each other.
Alagirl
May. 9, 2009, 04:53 PM
At the last FEI Task Force meeting someone suggested setting up a video database with every single movement scored from 3 to 10.
Wouldn't this be a very good tool to teach these exotic judges and the public? Every single movement on video, marked and explained by the top judges.
Theo
Well, nice and dandy, but since judging is so subjective...
The only way I could see this endeavour having any merit is if it was animated and impersonal. That way all angles of one move could be explored, better than any real video could. It could also counteract the fad du jour, just because favorite rider X had a bit of success with it.
slc2
May. 9, 2009, 04:55 PM
"This is not what the discussion is about"
Of course it is. International judges have different experiences, training and opinions too.
Ambrey
May. 9, 2009, 05:05 PM
Of course it is. International judges have different experiences, training and opinions too.
So then if two judges will give the same ride completely different scores, they are measuring different things.
How can we have competition if judges are not measuring the same thing?
mbm
May. 9, 2009, 05:20 PM
well, i for one would LOVE LOVE LOVE to see video of riders (top/bottom and in between) with the scores for each movement for all judges displayed real time . it would also be extremely cool if the comments could be displayed too
what a way to learn! i hope this does happen and i hope it is available to all that so desire to see it.
AnotherRound
May. 9, 2009, 08:07 PM
I think this is an issue of being afraid to be judged, for some people. I for one am not afraid to be judged. Being in front of a human judge is inherently "risky" in the sense that human bias or mistakes or emotion may contribute to the evaluation. There are levels of competence in judges, just as in bosses and managers and lawmakers and presidents and parents. The performance of dressage skills is an evaluation of a horse's training. Who gets to admire that? Humans, people, and some seek to qualify themselves as expert evaluators. The performance or demonstration of that horse's training is done for people, and in my estimation that evaluation is going to vary from judge to judge and region to region. It doesn't worry me in general, and seeking a single standard of competence for judges in the sport is a good thing. Removing the human element, though, would reduce the value of showing. It would be like taking critics out of the theater. Why not evaluate the new play based on the audience's applause level? Throughout the world and history human success is subject to the approval of people around us. That's what makes society different from isolated existence. That's why most of us live in town, and only a few live up in a cabin on the mountain side scorning group dynamics. Some sports are built based on that model. Some are modeled on non-subjective criteria, such as stadium jumping (clear rounds and time).
I am getting too far away from being comprehensible, sorry. I just think you can't clean out the human element from judging and shouldn't.
Edited to add, what I am trying to say is that on a given day, this particular qualified judge sees this performance and under the structure of scoring as he understands it gives this particular score. I accept this. Another judge most certainly would have seen different parts of the performance and probably scored them differently, looking for different things from the pair in the ring. It's been the same for me in equitation classes and hunter classes. A particular judge favors a "type" or a way of going, or a certain type of turnout, or misses a big gaffe. It's like a policeman, if you try to argue that another driver committed a terrible error, in the end, he didn't see it, so the prize goes to you. I don't feel like I'm connecting to my point so I will try to quit again. Cheers.
Ambrey
May. 9, 2009, 08:43 PM
well, i for one would LOVE LOVE LOVE to see video of riders (top/bottom and in between) with the scores for each movement for all judges displayed real time . it would also be extremely cool if the comments could be displayed too
what a way to learn! i hope this does happen and i hope it is available to all that so desire to see it.
And also the things they discussed in that presentation, with the movement videos and comments by top judges. I think that would be very informative for riders as well!
Carolinadreamin'
May. 9, 2009, 09:11 PM
Removing the human element, though, would reduce the value of showing. It would be like taking critics out of the theater. Why not evaluate the new play based on the audience's applause level? Throughout the world and history human success is subject to the approval of people around us. That's what makes society different from isolated existence. That's why most of us live in town, and only a few live up in a cabin on the mountain side scorning group dynamics.
Well said.
FancyFree
May. 9, 2009, 09:28 PM
Yes, good post AR. Dressage judging is subjective. It will be colored by the judge's experience and opinions. People who say otherwise are just deluding themselves. Really, what's so bad about that? I did my first first level test with Hilda Gurney judging. It was a wonderful learning experience and I valued her comments. It wouldn't have been exactly the same had someone else been judging, but so what? The human element makes it interesting to me. If you want black and white, do Jumpers. It's pretty clear there. :lol:
AnotherRound
May. 9, 2009, 09:53 PM
Or barrel racing!! You win, you win, who cares what people think of your form!
There are folks in every sport who want to change the rules because they don't do well as the rules are or are afraid of competing under the rules as they are. Start a different sport. I agree that standards should be global, but the human element in judging is part of the long, delightful climb up in the sport. At my local level, I will get local judging. I aspire to qualify to ride under an admired top level judge one day. Some more so than others, but the work showing and getting there is what the sport is. Trying to circumvent the human element in judging is like not showing at all and just working at home. Great, satisfying and educational. Showing takes on a new level. It's sort of like those who say showing isn't fair, because their horse doesn't behave at a show as well as he does at home. Well, then don't show. Find a sport where you can make a videotape and submit that for judgement, so you and your horse don't have to compete in stressful environments. See, it's just not the same thing. It isn't bad, or good, just isn't the sport as it has grown up to be. Judges and their evaluation of your horse based on a rulebook. That's where you're going, when you stuff your horse in the trailer and bumpity bump on down the road to the fairgrounds.
FancyFree
May. 9, 2009, 10:00 PM
Or barrel racing!! You win, you win, who cares what people think of your form!
Also you can wear whatever you want. :lol:
Miss Dior
May. 9, 2009, 10:32 PM
With all of the performance standards data that has come to light due to actual data mining, I have been thinking that perhaps we are looking at this entirely from the wrong angle. In the spirit of what is good for the goose is also good for the gander, how about some judging performance standards. Hard data based of course. While judging IS subjective there are statistics that come into play. How about a judges report card being issued annually. Let's track the average scores given by each judge for a wide variety of test levels. How many 5/6/7/8/9 etc. were given? How many times did this particular judge's scores deviate from the majority opinion of the full jury? And by what percentage? How many times and by what percentage did they judge their countrymen? It would be fascinating to put together some of these stats for the past 10 years of international competition alone !!!! The results would probably be mind boggling !!!! But then when show managers go about hiring the judges or the FEI selections begin for impt. events they can hire those with the most STATISTICALLY reliable report cards. It would level the playing field and hold them to a fair work ethic. All jobs have performance reviews...why not this one !!! Just a thought.
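The report card proposed above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not anything the FEI or any federation actually runs: the marks are made-up data, the judge labels and the `report_card` function are hypothetical, and the median mark of the whole jury stands in for "the majority opinion".

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data: marks[judge] holds that judge's marks (0-10)
# for the same five movements, in the same order.
marks = {
    "Judge E": [7, 6, 8, 7, 6],
    "Judge C": [7, 7, 8, 6, 6],
    "Judge B": [5, 6, 6, 5, 4],
}

def report_card(marks):
    """Summarise each judge's marks against the jury as a whole."""
    judges = list(marks)
    n_movements = len(next(iter(marks.values())))
    # Assumption: the jury's median mark per movement represents
    # the "majority opinion" mentioned in the post.
    panel = [median(marks[j][i] for j in judges) for i in range(n_movements)]
    cards = {}
    for j in judges:
        diffs = [m - p for m, p in zip(marks[j], panel)]
        cards[j] = {
            "average": mean(marks[j]),
            # How many 5s, 6s, 7s, ... this judge gave
            "distribution": dict(Counter(marks[j])),
            # Positive = more generous than the jury, negative = stingier
            "mean_deviation": mean(diffs),
            "movements_off_by_1_or_more": sum(abs(d) >= 1 for d in diffs),
        }
    return cards

for judge, card in report_card(marks).items():
    print(judge, card)
```

A real version would need many rides per judge before any of these numbers meant much, and a deviation from the jury only shows that a judge differs, not that the judge is wrong.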
slc2
May. 9, 2009, 10:55 PM
What specifically would be the attraction of 'statistical reliability'?
First the judging is all so bad, then we want it to be bad AND uniform?
yaya
May. 9, 2009, 11:15 PM
How many 5/6/7/8/9 etc. were given?
In the past, individual movement scores were not recorded, as the test sheet went home with the rider and only the total scores were submitted to the associations.
With programs like Fox Village in use now, if the "instant scoring" option is used, we would have a record of individual movement scores. However, this data still is not turned in to the associations.
Yet.
mbm
May. 9, 2009, 11:34 PM
With all of the performance standards data that has come to light due to actual data mining, I have been thinking that perhaps we are looking at this entirely from the wrong angle. In the spirit of what is good for the goose is also good for the gander, how about some judging performance standards. Hard data based of course. While judging IS subjective there are statistics that come into play. How about a judges report card being issued annually. Let's track the average scores given by each judge for a wide variety of test levels. How many 5/6/7/8/9 etc. were given? How many times did this particular judge's scores deviate from the majority opinion of the full jury? And by what percentage? How many times and by what percentage did they judge their countrymen? It would be fascinating to put together some of these stats for the past 10 years of international competition alone !!!! The results would probably be mind boggling !!!! But then when show managers go about hiring the judges or the FEI selections begin for impt. events they can hire those with the most STATISTICALLY reliable report cards. It would level the playing field and hold them to a fair work ethic. All jobs have performance reviews...why not this one !!! Just a thought.
if you read the link theo provided it goes into detail about exactly what you propose.
and in fact my comments were based on the info in that link.
Ambrey
May. 10, 2009, 12:44 AM
if you read the link theo provided it goes into detail about exactly what you propose.
and in fact my comments were based on the info in that link.
Exactly. It suggested the collection of detailed data, which would then be analyzed to monitor judging proficiency. All great ideas :)
Ambrey
May. 10, 2009, 12:45 AM
What specifically would be the attraction of 'statistical reliability'?
First the judging is all so bad, then we want it to be bad AND uniform?
Did you even read the link?
Validity and reliability. Accuracy and consistency. Two different ways to say the same thing.
And the attraction of statistical reliability is that it's necessary for a valid measurement ;)
slc2
May. 10, 2009, 07:23 AM
I read the entire link, every page, and went over every chart, actually. And I still don't agree with you, but I also don't see why you think I have to agree with you.:)
Ambrey
May. 10, 2009, 01:46 PM
I read the entire link, every page, and went over every chart, actually. And I still don't agree with you, but I also don't see why you think I have to agree with you.:)
I don't, but your posts are very confusing. Do you really not want consistent and accurate judging? I can't quite figure out what you're disagreeing with, and am trying to clarify it.
If you'd read the presentation, which I agree with completely, you'd note that consistency and accuracy do NOT eliminate subjectivity, and that that was part of the presentation. However, you can't have competition without a standard... a bar to reach, as it were... and if not all judges are judging to the same standard ("validity" in test theory, "accuracy" in the language used in the presentation) I am not sure what the competition is for. If one person is jumping 5 feet and one is jumping 3 feet, how do you decide who won? The goal needs to be equal.
It's not about different training levels for judges (an L judge vs. O judge). It's not about whether judges bring their own flavor to their judging. It's not about whether someone is a sore loser. It's about whether the system that creates the standards and trains the judges is doing it in such a way that competitors are playing on an even field.
If the data shows that there is no problem, then great- but with stakes this high, you need to have the system of checks and balances to make sure it's all working (and apparently according to the presentation, there are issues that need to be resolved).
I don't have any particular complaint with dressage judging... I've not even started competing yet! I simply believe that any measurement system should be checked for reliability and validity, to make sure it's working :)
slc2
May. 10, 2009, 02:05 PM
I don't think it's confusing at all - it's just that I don't agree with your opinion. Nothing real confusing about that. You have one idea, I have another. So many times, people here say, 'wait, I'm confused', when what they REALLY mean is something entirely different.
I don't believe in measuring judging the way you do, and I DO think it removes the human element your way.
I want judges free to disagree with each other. Why have a judge otherwise? If how good, how 'accurate' they are is measured by how similarly they judge to other judges, I have a serious problem with that.
I want to hear what different judges think when they look at the same horse. Rockwell gave a horse a SEVEN when other judges gave him a ZERO on one movement. And he explained that. He caught a lot of crap for it, but he also had reasons why he did that. And I don't think Gary Rockwell is WRONG, OR the other judges are wrong. It is a different way of interpreting the rules and judging guidelines.
No one has even for a MOMENT pondered WHY the Finnish judge scored so many horses 'so low' (again, usually 3 points, and sometimes 5, 6, 7 points). Maybe he is concerned that the horses are being ridden too tensely, too up in a forced posture. Maybe he feels they're not coming through properly. Maybe he feels there is not enough activity behind. Maybe he has a fundamental issue he sees and he's trying very, very hard to improve our sport through his comments and scores.
Maybe there is something that he wants to see that is missing, and maybe him scoring low isn't such a bad thing. Maybe people should listen to him instead of call him names. Giving a Dutch horse a lower score is not a federal crime, and it isn't necessarily the wrong thing to do. The Dutch can get knocked off a perch just as easily as anyone else.
People sit on this bb and beat their chests constantly about how badly every one at shows rides and how awful forceful, strict, demanding and who-cares-about-basics 'those riders' are, maybe that's what this judge feels, that it is time for a little shake up and people to get some strong messages in scores and comments....though I HARDLY think 3 percentage points is a 'harsh' message.
Ambrey
May. 10, 2009, 02:38 PM
I don't think it's confusing at all - it's just that I don't agree with your opinion. Nothing real confusing about that. You have one idea, I have another. So many times, people here say, 'wait, I'm confused', when what they REALLY mean is something entirely different.
Nope, I am really confused. You've brought so many completely unrelated issues into it, things that actually support the idea of data collection (although you haven't thought it through far enough to realize that ;))
1) you've indicated (I think) that it's possible that the lower scoring judge is scoring more accurately than the higher scoring judges. This was addressed by the figure-by-figure video concept. If it's true that the low score judge was scoring accurately while the higher scoring judges were not, that is something that needs to be determined.
2) One of the issues brought up was that the standard deviation of dressage scores is very low. This means that all judges are using a very narrow range, and judge-judge differences of 3 points are significant.
3) Allowance for subjectivity is completely possible by studying the inter-rater reliability and using common sense to develop allowable variations.
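Points 2 and 3 can be illustrated with a short sketch. Everything here is an assumption for illustration: the ten final percentages are invented, and `gap_in_sd_units`, `outside_allowance` and the one-standard-deviation allowance are hypothetical choices, not figures from the presentation.

```python
from statistics import mean, stdev

# Hypothetical final percentages for ten rides in one class.
field_scores = [62.0, 63.5, 64.0, 64.5, 65.0, 65.5, 66.0, 66.5, 67.0, 68.0]

def gap_in_sd_units(gap, scores):
    """Express a judge-to-judge gap as a multiple of the field's spread."""
    return gap / stdev(scores)

def outside_allowance(judge_score, other_scores, scores, allowed_sd=1.0):
    """True if one judge differs from the other judges' mean by more than
    the allowed number of standard deviations (an arbitrary threshold)."""
    diff = judge_score - mean(other_scores)
    return abs(diff) > allowed_sd * stdev(scores)

# When the whole field sits within a few points, a 3-point gap
# is well over one standard deviation.
print(round(gap_in_sd_units(3.0, field_scores), 2))
print(outside_allowance(62.0, [65.0, 65.5], field_scores))
```

The allowance is where the "common sense" comes in: set it too tight and ordinary differences of viewpoint get flagged, too loose and nothing ever does.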
mbm
May. 10, 2009, 02:52 PM
ambrey one of the basic assumptions you are resting your argument on is false.
and that is: the criteria set forth in the rules are being followed and interpreted exactly the same by one and all.
if in fact that were the case then all of this would be good and productive etc.
however as i said elsewhere, there is a pretty big divide in how people interpret the rules. and therefore you are going to see a pretty big divide when it comes to judging.
and of course the "popular" interpretation is going to win the day and anyone who doesn't follow the party line will be held up for "review" and told how bad they are etc etc.
in other words: where you stand re: interpretation of the rules is how you see the judging.
The first thing that needs to happen is that the rules need to be clarified so that everyone is on the same page.
except that means that a lot of the artistic portion of this gets lost to mechanization.
Ambrey
May. 10, 2009, 03:06 PM
and that is: the criteria set forth in the rules are being followed and interpreted exactly the same by one and all.
That is not my assumption- it's one of the assumptions being tested by the research.
In measurements, those who write the rules need to then go back, see how interpretation is altering the results, and re-write the rules to be clearer. If the rules are such that they can be interpreted in such widely varying ways, they need to be changed.
I'll give an example from my own field, psychoeducational testing. On a particular IQ test there are questions. Answers get scored 0, 1, or 2. Obviously this scoring can be open to interpretation, as what seems like a 1 point answer to one person could seem like a 2 point answer to another person.
Since it's understood that an IQ test is useless without inter-rater reliability, the handbook for the test and the training given to perform the test are quite specific regarding what a 0,1, or 2 point answer would look like, even listing the most frequent answers and the scores they'd give. The guidelines for interpretation are such that if they are followed, inter-rater reliability is quite high.
This was all developed by writing the test, administering it, seeing where the challenges were, refining it, re-analyzing, and so on, until the real-world usage of the test met guidelines for sufficient reliability.
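For a 0/1/2 scoring scheme like the one described above, one common way to put a number on inter-rater reliability is Cohen's kappa, which corrects raw agreement for the agreement two raters would reach by chance. The sketch below is only an illustration: the two score lists are invented, and nothing in the thread says which statistic the test publishers actually use.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Share of items on which the two raters gave the same score
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own score frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)

# Hypothetical scores (0, 1 or 2) from two raters on the same ten answers.
rater_a = [0, 1, 2, 2, 1, 0, 2, 1, 1, 2]
rater_b = [0, 1, 2, 1, 1, 0, 2, 2, 1, 2]
print(cohen_kappa(rater_a, rater_b))
```

Kappa treats a 0-versus-2 disagreement the same as a 1-versus-2 one; a weighted variant would be the natural next step for ordered scores such as dressage marks.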
spaghetti legs
May. 11, 2009, 09:57 AM
I'll give an example from my own field, psychoeducational testing.
Oh to be an expert across so many fields.. this is about the millionth career you've made reference to across your postings... however do you keep track of them all? Or do you just forget what your previous field was, and so make up something new each time?
As you have mentioned Ambrey, you have not even ridden a show; how any of this concerns you is completely beyond me. :lol:
Ambrey
May. 11, 2009, 10:23 AM
Oh to be an expert across so many fields.. this is about the millionth career you've made reference to across your postings... however do you keep track of them all? Or do you just forget what your previous field was, and so make up something new each time?
It's easy, dear. I only have two college degrees. One in Educational Psychology(M.A.), one in Mechanical Engineering (B.S.). What other degrees/fields have I mentioned? Or perhaps you are just not clear on what falls under the purview of those two areas of study?
And it interests me because it's interesting. Why you'd have a problem with that I can not imagine. You know, my husband likes Basketball and he's never played with the NBA either.
ShotenStar
May. 11, 2009, 11:14 AM
...
As you have mentioned Ambrey, you have not even ridden a show; how any of this concerns you is completely beyond me. :lol:
I'm with Ambrey on this one ... this type of analysis is interesting in and of itself. No specific experience in a field is needed in order to be willing to learn, apply analytical reasoning, and formulate theories for further testing.
Those who think that dressage and dressage judging are some sort of rarefied things, removed from the world of statistics and measurements, need to broaden their own horizons and look into the world of Operations Research, Modeling and Simulation (ORMS). There are some powerful tools lurking in that realm .... tools that can be applied to everything from how to make your morning coffee to GP dressage.
*star* - the former chief of a multi-million dollar government ORMS office
AnotherRound
May. 11, 2009, 11:14 AM
I agree with you, Slc:
I don't believe in measuring judging the way you do, and I DO think it removes the human element your way.
I want judges free to disagree with each other. Why have a judge otherwise? If how good, how 'accurate' they are is measured by how similarly they judge to other judges, I have a serious problem with that.
I want to hear what different judges think when they look at the same horse. Rockwell gave a horse a SEVEN when other judges gave him a ZERO on one movement. And he explained that. He caught a lot of crap for it, but he also had reasons why he did that. And I don't think Gary Rockwell is WRONG, OR the other judges are wrong. It is a different way of interpreting the rules and judging guidelines.
This is the point I was making, and Slick took it one step further. I too LIKE the idea that two different judges will see the same performance differently. It's the human element of this subjective sport which I like. It's the difference between clinical analysis and the art of dressage. Throughout the world and history human success is subject to the approval of people around us. That's what makes society different from isolated existence. That's why most of us live in town, and only a few live up in a cabin on the mountain side scorning group dynamics. Some sports are built based on that model. Some are modeled on non-subjective criteria, such as stadium jumping (clear rounds and time).
Ambrey
May. 11, 2009, 11:21 AM
Those who think that dressage and dressage judging are some sort of rarefied things, removed from the world of statistics and measurements, need to broaden their own horizons and look into the world of Operations Research, Modeling and Simulation (ORMS). There are some powerful tools lurking in that realm .... tools that can be applied to everything from how to make your morning coffee to GP dressage.
Yep! I am just in awe of the capability of statistics to show us patterns in large bodies of data. I admit to not really seeing why anyone would think understanding the patterns and correlations within any large body of numbers would be a bad thing ;)
One of the problems with statistics is that people do not understand them. Because sometimes they are improperly used, and most people don't have the understanding to know when they are properly used and when they aren't (even many researchers!), people dismiss them entirely as not being useful.
It was very interesting to me to read Star's comments during the performance standards discussion, because I found so many parallels between her field and mine, and ways to apply those concepts to the problem at hand.
Beasmom
May. 11, 2009, 01:15 PM
Something that hasn't been brought up here is how the position the judge occupies can affect what he/she sees and therefore how movements are judged.
A judge at C will see movements differently than the judge sitting at E or B. Add to that the different emphasis that individuals will place on the variables and voila! Difference in scoring.
It isn't rocket science. It does not make one judge better, worse, right or wrong. It's the reason people, when interviewed after an accident, crime or other event, will "see" and describe the incident somewhat differently.
I enjoy reading and comparing the scores from two or more judges. Since they have different perspectives, their scores and comments will reflect what THEY observed from THEIR position.
I would not want to remove the human element from judging. Around here we know who the "stingy" judges are, and who the "generous" judges are. It's taken into consideration when reviewing the test. As long as they are fair, ie., stingy or generous to everyone, it doesn't matter. The scores between the two extremes average out.
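The "stingy or generous to everyone" argument can be checked with a toy example. The riders, scores and offsets below are all hypothetical, and the sketch assumes the idealised case where each judge is off by the same amount for every rider.

```python
def placings(scores):
    """Rider names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

def average_panel(*judges):
    """Average each rider's score across all judges."""
    riders = judges[0].keys()
    return {r: sum(j[r] for j in judges) / len(judges) for r in riders}

# Hypothetical "true" scores, seen by a generous and a stingy judge.
base = {"Rider A": 68.0, "Rider B": 65.5, "Rider C": 63.0}
generous = {r: s + 2.0 for r, s in base.items()}  # 2 points high for everyone
stingy = {r: s - 3.0 for r, s in base.items()}    # 3 points low for everyone

print(placings(generous))
print(placings(stingy))
print(placings(average_panel(generous, stingy)))
```

The placings hold only because the offset is the same for every rider. If a judge is stingy toward some riders and not others, the order can change, which is the case the data collection discussed earlier in the thread is meant to detect.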
It's the difference of opinion that makes horse races. Anyone that can't abide by the subjective nature of dressage judging needs to change to another sport. Show Jumping or speed events, perhaps.
Ambrey
May. 11, 2009, 01:15 PM
Something that hasn't been brought up here is how the position the judge occupies can affect what he/she sees and therefore how movements are judged.
Yes, it was mentioned in the presentation.
Beasmom
May. 11, 2009, 01:27 PM
What presentation? Freestyle's? There was nothing there regarding the seating/placement of the judges vis a vis scoring, other than to note the scores from the judges seated at C, etc. That has always been done.
What I'm TRYING to say, Ambrey, is, the human variable will never be eliminated as long as you have two or more judges sitting at different spots around the arena. That's the way it is. There's a lot of overthinking going on here.
And if I have missed something, it's because I don't have time to spend hour after hour perusing bulletin boards and Googling stuff. I teach, I ride, I show, I study my scores and work on improvement. Even if I don't like what the judge may say, there is always a kernel of truth in the comments.
fiona
May. 11, 2009, 01:34 PM
I teach, I ride, I show, I study my scores and work on improvement. Even if I don't like what the judge may say, there is always a kernel of truth in the comments.
You freak you.
Beasmom
May. 11, 2009, 01:39 PM
Bwahahahaha!
Ambrey
May. 11, 2009, 01:47 PM
What I'm TRYING to say, Ambrey, is, the human variable will never be eliminated as long as you have two or more judges sitting at different spots around the arena. That's the way it is. There's a lot of overthinking going on here.
And if I have missed something, it's because I don't have time to spend hour after hour perusing bulletin boards and Googling stuff. I teach, I ride, I show, I study my scores and work on improvement. Even if I don't like what the judge may say, there is always a kernel of truth in the comments.
It was a critical part of the presentation... page 7, "Not all inconsistency is “bad judging”"
This goes back to my comment about people dismissing statistics because they do not understand them. Assuming that measuring and maintaining validity and reliability would eliminate all variability is just a failure to understand the methods.
Dressage Art
May. 11, 2009, 02:04 PM
The *** judge doesn't agree with the others, but who is to say what is "right"? Perhaps the f*** judge is the enlightened one.
Yes, that can be seen when 4 out of 5 judges reward behind the vertical and a short neck and only 1 judge scores down for that... and to blame the "more classical" judge for not being on the same page???? Hmmm, I'm not sure if I want to see such "unity" among judges.
Ambrey
May. 11, 2009, 02:05 PM
That's why there were measures for both "consistency" AND "accuracy."
But if the FEI decides to reward behind the vertical, the problem doesn't lie with how they train their judges, but how they develop their standards.
Dressage Art
May. 11, 2009, 02:09 PM
Wouldn't this be a very good tool to teach these *** judges and the public. Every single movement on video, marked and explained by the top judges.
USDF publishes DVDs "On The Levels" with the same idea. There are also 2 other private DVDs published, by S. Clarke and by B. Berry: http://www.dressageart.com/l_dressage_judge_videos.htm
Dressage Art
May. 11, 2009, 02:11 PM
How about a judges report card being issued annually. Let's track the average scores given by each judge for a wide variety of test levels. How many 5/6/7/8/9 etc. were given? How many times did this particular judge's scores deviate from the majority opinion of the full jury? And by what percentage?
Interesting.