
A misbehaving control group

Discussion in 'Biomechanics, Sports and Foot orthoses' started by daisyboi, Jun 15, 2011.

  1. daisyboi

    daisyboi Active Member


    Over the past two years I have been collecting data to look at the effectiveness of manipulation in the treatment of plantar digital neuritis. I have been using algometric pressure threshold readings as part of my measurements to assess improvement. In my wisdom I chose to use the non-affected limb as a control group. Reading over my data it is clear that the non-affected limb sees an increase in its algometric pressure reading too. Although the increase is not as great as on the affected foot, it is still consistently increased across almost the entire control group. I read in Howard Dananberg's response to criticisms of his ankle manipulation paper that he had experienced a similar phenomenon in a previous calf-stretching study. Does anyone have any insight as to why this may happen? There was no sham procedure or contact of any kind with the control foot other than taking the algometric reading itself.

    Many thanks

    Dave
     
  2. daisyboi

    daisyboi Active Member

    Thanks Simon, but the Hawthorne effect is only a short-lived phenomenon, whereas this change has remained consistent throughout the two years. Also, since there is an active interest in the subject from day one, shouldn't this effect be consistent throughout all measurements? Would you suggest that the control group should not be interacted with in any way? I'm not sure how that could be achieved. I was wondering whether the effect I saw may be due to a global lowering of pain thresholds, given that the subject is already experiencing pain in the affected foot. Is this likely, do you think?
     
  3. I said psychological effects "such as Hawthorne"; there are others.

    I should have thought a control group consisting of individuals who were not part of the treatment group might have helped you with some of these problems; a sham treatment group too.

    One possible reason for the results is a global change. But this is just one possibility. Can you theorise on the mechanism of such a global change?
     
  4. Moreover, can you rule out psychological or other confounding factors given your study design?

    Is your "control group" misbehaving? It is not a control group, since its members are the same individuals as your study group.
     
  5. daisyboi

    daisyboi Active Member

    Yes, that's true Simon, thanks. The difficulty with the control group is recognised, but on the other hand the asymptomatic limb is in as close to exactly the same environment as the symptomatic limb, and this could not be anywhere near as well matched using separate individuals. I did consider using a sham treatment group, but a lit review showed this to be stacked with problems for manipulative techniques. How do you perform a sham that definitely has no effect? Since manipulation is all about touch, movement and neurological responses, how do you eliminate these during a sham procedure but still leave the subject believing that they received the actual treatment?

    Since the results seen in the control group have shown to be long-lasting (2 years or more in some cases), I'm not sure that psychological factors are at play here. I have no doubt that my study design has very many limitations, which I hope to address when repeating it for publication rather than for my own interest, but the fact that the control group score increased in the way that it did, and stayed that way, intrigues me. As I said earlier, I suspect a neurological response in some way related to altered pain thresholds to be responsible, but I am not well versed in neurology, or research for that matter, and was hoping you and others here could shed some light on possible pathways for such a mechanism.

    I was thinking that there is a hypersensitivity to normally painless stimuli when the body is already experiencing pain. In other words, our pain threshold is decreased and normal stimuli can become painful ones. If that is correct, then the algometric meter reading of the asymptomatic limb would be "artificially" low due to the overall increase in sensitivity. As the underlying condition improves, so the pain threshold would return to normal. Sorry for the rambling, but I am way out of my comfort zone here and just trying to understand what I am seeing.
To give you an example of a pretty standard subject I have listed some data here of one subject. The Right foot is the affected limb and the 3rd MTP joint is where the measurement is being taken from on both feet.
    Wk  1: R=1.6 kg, L=5.3 kg
    Wk  2: R=1.9, L=5.4
    Wk  3: R=2.4, L=5.4
    Wk  4: R=3.1, L=5.7
    Wk  6: R=4.0, L=5.9
    Wk 10: R=5.1, L=6.3
    Wk 18: R=5.6, L=6.4
    Wk 30: R=5.8, L=6.3
    Wk 42: R=6.0, L=6.4

    So while there has clearly been a much larger increase on the affected side, and the two limbs' measurements are much closer to each other than at the beginning of treatment, there is still a significant (I do not mean statistically, as the stats are not yet done) rise on the control side. I find it interesting that we can have a noticeable and prolonged effect on a limb without seemingly any direct intervention. Or am I missing something really obvious here?
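    The listed readings can be sanity-checked with a few lines of Python (the numbers are copied directly from the data above; nothing else is assumed):

```python
# Algometric pressure-threshold readings (kg) transcribed from the
# post above: right (affected) vs left ("control") 3rd MTPJ, by week.
weeks = [1, 2, 3, 4, 6, 10, 18, 30, 42]
right = [1.6, 1.9, 2.4, 3.1, 4.0, 5.1, 5.6, 5.8, 6.0]
left = [5.3, 5.4, 5.4, 5.7, 5.9, 6.3, 6.4, 6.3, 6.4]

# Overall rise on each side between week 1 and week 42.
rise_right = right[-1] - right[0]
rise_left = left[-1] - left[0]
print(f"affected rise: {rise_right:.1f} kg, control rise: {rise_left:.1f} kg")
```

    This makes the asymmetry explicit: a 4.4 kg rise on the treated side against a 1.1 kg rise on the untreated side.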
     
  6. Yes, you're missing the potential psychological effects of the intervention and of your own biases upon the individual. You're potentially missing the between-day error in your measurement and the natural between-day variation in the variables. And the clinical significance of 1.1 units is..... Good luck with your future studies though.

    Before you start such a study you need to show your between-day repeatability. What is the between-day variation in a population without intervention as measured by you?
     
  7. daisyboi

    daisyboi Active Member

    Ok. This is all pretty new to me, but I would like to design a good-quality study. Am I correct in saying that I should start by ensuring that my measuring tools are consistent, day after day? The next step would be a separate control group. As I am exclusively in private practice this could be an issue, as assigning patients to a group where they get no intervention is not really possible. Is there a way around this? Would a comparative study that compared two different forms of treatment be valid without a further control group, if the second group received a treatment that has previously been accepted in the literature? I don't know that this would be rigorous enough. Maybe I will approach my local hospital and see if we can create a control group from those who are waiting for surgical intervention, although this would restrict the length of time they can be monitored for.
     
  8. Griff

    Griff Moderator

    Correct. It's called reliability/repeatability. Any measuring device you use should be reliable (there is no validity without reliability). You need to perform inter-rater and intra-rater reliability studies.

    No. Comparing two different forms of treatment does not tell you how many of the individuals in the study would have got better anyway (regression to the mean). Hence the need for a control group.
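    The regression-to-the-mean point is easy to demonstrate with a toy simulation: recruit subjects because they score badly on a noisy measure, re-measure them with no treatment at all, and the group mean "improves" anyway. A minimal sketch (every number here is invented for illustration):

```python
import random

random.seed(1)

TRUE_LEVEL = 5.0   # every subject's underlying pain score
NOISE_SD = 2.0     # measurement noise

baseline, followup = [], []
for _ in range(10_000):
    b = TRUE_LEVEL + random.gauss(0, NOISE_SD)
    if b > 7:                       # recruited because they scored badly
        baseline.append(b)
        followup.append(TRUE_LEVEL + random.gauss(0, NOISE_SD))

mean_b = sum(baseline) / len(baseline)
mean_f = sum(followup) / len(followup)
print(f"baseline mean: {mean_b:.2f}, untreated follow-up mean: {mean_f:.2f}")
```

    The baseline mean of the selected group sits well above the true level purely because of selection on noisy scores; the untreated follow-up drifts back toward the true level, which is exactly what a no-treatment control group exists to expose.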
     
  9. daisyboi

    daisyboi Active Member

    Fair point. I'm no researcher, as is rapidly becoming obvious from these posts, but I thought this was a pretty standard approach to research where a control was not feasible. For example, in the following research protocol:

    "120 patients from one Orthopaedic group’s foot and ankle offices with single foot neuromas and no previous history of neuroma or foot disorder treatment will be selected for the study. These patients will be randomized to three treatments, specifically lidocaine injection, corticosteroid injection, or ethanol injection. Outcomes will be assessed at 3, 6 and 12 month time points using validated questionnaires as well as a non-validated disease specific questionnaire. Primary endpoint will be graded change in the physical function portion of the SF-36 form. Secondary endpoints will be the graded change in the McGill Short Form for Pain and ultimate satisfaction with treatment as assessed by a non-validated questionnaire designed for Morton’s neuroma symptoms."

    As you can see there is no control group, just varying types of intervention. Is this research invalid? I felt it would show that a new/alternative treatment is better/worse, more/less cost-effective, faster/slower, etc. than the current standard treatment. Is this not the case?
     
  11. Not necessarily invalid, but methodologically weaker.
     
  12. In your hands, or by someone else? You need to look at the between-day variation too (in your hands).
     
  13. Griff

    Griff Moderator

    I think you are getting reliability and validity slightly muddled. Although closely linked, they are not the same thing. You cannot have intra-/inter-examiner validity. I presume you mean reliability? If so, what are the ICCs/limits of agreement?
     
  14. daisyboi

    daisyboi Active Member

    No; reliability studies have been done by others. I will perform some in my own hands next, in that case. Is this just a matter of measuring the same subjects on subsequent days without any other intervention? Presumably this should be asymptomatic subjects. Would small numbers of 5-6 be sufficient for this, and how could I control against outside factors influencing changes in their response? For example, should I also measure them with a visual analogue scale, so that changes in my measurements should be mirrored by changes in their VAS score?
     
  15. daisyboi

    daisyboi Active Member

    Not sure I really understand what you are looking for, as I'm in pretty new territory here. The following is the abstract from one of many reliability studies:

    Inter- and intra-rater reliability of the pressure threshold meter in measurement of myofascial trigger point sensitivity.
    Delaney GA, McKee AC.
    Source

    Department of Physical Medicine and Rehabilitation, Saint Vincent's Hospital, Ottawa, Ontario, Canada.
    Abstract

    This study was designed to establish the intra-rater and inter-rater reliability of measurements of trigger point sensitivity using a commercially available pressure threshold meter. Fifty healthy adult volunteers (25 men and 25 women, aged 20 to 51 years) underwent repeated pressure threshold readings from two separate trigger point locations in the trapezius muscle, TP2 (left) and TP3 (right) by two independent examiners. Pressure threshold readings, using a 1.0 kg/s application, were done alternately by each experimenter. Measurements from each trigger point were completed 5 minutes apart. Intraclass correlation coefficients (ICC) revealed the inter-rater reliability to be high for both the first (ICC = 0.82) and second trial (ICC = 0.90) of TP2 and for the first (ICC = 0.86) and second trial (ICC = 0.92) of TP3. Intra-rater reliabilities for TP3 (ICC = 0.91) were higher than for TP2 (ICC1 = 0.80; ICC2 = 0.83). These results show that the pressure threshold meter is highly reliable in measuring trigger point sensitivity, between and within experimenters, and may be useful in the diagnosis and monitoring of treatment of myofascial pain syndrome.

    Is this what you were questioning or am I looking for something else entirely?
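    For anyone wanting to produce ICC figures like the ones in that abstract from their own readings, here is a minimal sketch of ICC(2,1) (two-way random effects, absolute agreement, single measurement); the example data are hypothetical, and for real analyses a statistics package's tested implementation is preferable:

```python
import numpy as np

def icc2_1(x):
    """ICC(2,1): two-way random effects, absolute agreement,
    single measurement. x is an (n subjects, k raters) array."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()    # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()    # between raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical example: 5 subjects each measured by 2 raters.
readings = [[1.6, 1.8], [5.3, 5.1], [3.2, 3.4], [4.0, 4.1], [2.5, 2.6]]
print(f"ICC(2,1) = {icc2_1(readings):.2f}")
```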
     
  16. Griff

    Griff Moderator

    Did they use your measurement device in the above study? If not, I am not sure why you have mentioned it. What I was asking (in response to you saying that reliability of your device was good to excellent) was what YOUR ICCs/limits of agreement were.

    Let's just clear up what reliability and validity actually are.

    Reliability (sometimes called repeatability) of a measuring device is essentially how consistently it will give you the same readings. So if you stand on the bathroom scales 10 times you hopefully get 10 very similar (and some maybe identical) readings. This would be considered to be reliable/repeatable.

    Validity of a device is how sure you are that you are measuring what you say you are measuring. The gold standard is considered to be criterion validity - whereby you compare your readings/measurements to something which is proven/established and already measures the same thing you are.

    An example... Let's say I have designed a new gizmo which measures body mass. It looks like a belt with a digital watch on it. For this to become accepted by the public (or used in academic research) I have to show it is reliable and valid i.e. People have to know that it will consistently give them accurate results of their mass or they won't buy it. How do I do this?

    I get some people. I strap them all in and record their measurements. I then have a colleague do the same measurements with the same group of people. I run the appropriate statistical tests on my results vs my colleague's. This is my inter-rater reliability.

    I then get the exact same group of people (or a percentage of them) back in to see me a few days later. I run the appropriate statistical tests on my measuring day 1 results Vs my measuring day 2 results. This is my intra-rater reliability.

    I then put all the subjects on a set of bathroom scales and record their results. I run the appropriate statistical tests on bathroom scales measurements Vs my new device measurements. This is potentially my criterion validity. (The problem here is that often a new measuring device will be measuring something that nothing previously has)

    A measuring device cannot be valid if it isn't reliable/repeatable.
    However it can be reliable/repeatable and not be valid.

    If measurements are not reliable or valid in the research world then conclusions can be tossed out the window to a certain extent (control group or no control group!)
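    The "appropriate statistical tests" mentioned above are typically an ICC together with Bland-Altman 95% limits of agreement. The latter is simple enough to sketch; the paired readings below are invented for illustration:

```python
import statistics

def limits_of_agreement(a, b):
    """Bland-Altman 95% limits of agreement for two sets of paired
    readings (rater 1 vs rater 2, or day 1 vs day 2)."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    return bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings (kg) from two measurement sessions:
day1 = [1.6, 5.3, 3.2, 4.0, 2.5, 6.1]
day2 = [1.8, 5.1, 3.4, 4.1, 2.6, 5.9]
lo, hi = limits_of_agreement(day1, day2)
print(f"95% limits of agreement: {lo:.2f} to {hi:.2f} kg")
```

    The interval (lo, hi) is the range within which about 95% of between-session differences are expected to fall; if it is wider than the treatment effect you hope to detect, the measurement is not repeatable enough for the study.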
     
  17. daisyboi

    daisyboi Active Member

    Yes, this is the same device as I use; sorry, I thought that was a given.
     