Research help

Discussion in 'General Issues and Discussion Forum' started by Nikki10, Apr 16, 2010.

  1. Nikki10

    Nikki10 Well-Known Member


    Dear all,

    I am currently working on my final-year dissertation paper on how experience may affect the reliability of the Foot Posture Index-6, and I am using multi-rater and weighted Kappa to analyse the data. I have no experience with Kappa analysis and would appreciate any advice on good resources, or general suggestions on using Kappa analysis.


    Thank you
     
  2. Griff

    Griff Administrator

    Nikki,

    If memory serves me, Kappa analysis is generally used for qualitative data. What is your methodology? Are you doing intra-rater reliability for the FPI for both an experienced and an inexperienced user, and want to compare them to see which is more reliable? If so, it may be just as simple to do intraclass correlation coefficients (ICCs) or limits of agreement (LOAs).

    Resources/authors to look for:

    Kappa coefficient (Cohen, J.)
    ICCs (Shrout, P.E.)
    LOAs (Bland, J.M. & Altman, D.G.)

    This link may also help: http://www.sportsci.org/resource/stats/precision.html#loa
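
As a rough illustration of the limits of agreement mentioned above, here is a minimal Python sketch in the Bland and Altman style: the mean difference between two raters (the bias) plus or minus 1.96 standard deviations of the differences. The scores and the function name `limits_of_agreement` are made up for illustration and are not from the thread, and the approach assumes the differences are roughly normally distributed.

```python
import statistics

def limits_of_agreement(a, b):
    """Return (bias, lower, upper) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)   # mean difference between the raters
    sd = statistics.stdev(diffs)    # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical FPI-6 totals from two raters scoring the same five feet.
rater_a = [2, 5, -1, 7, 4]
rater_b = [3, 5, 0, 6, 6]

bias, lower, upper = limits_of_agreement(rater_a, rater_b)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

The narrower the interval between the two limits, the closer the raters agree on individual feet.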
     
  3. Nikki10

    Nikki10 Well-Known Member

    Hi,

    I took a convenience sample of 20 subjects with asymptomatic feet and assessed both feet in three trials. Yes, there were 2 groups of raters, an expert group of 4 and a novice group of 2; both groups assessed all subjects on the same day.
    I am investigating the inter-rater and intra-rater reliability. The inter-rater reliability will be analysed using multi-rater kappa and the intra-rater reliability will be analysed using weighted kappa.
    As the data did not have a normal distribution and is categorical, I can't do ICCs or LOAs.

    Thank you for the references, they are very helpful.
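
For readers unfamiliar with the weighted kappa mentioned above, the following is a minimal Python sketch of Cohen's kappa with linear disagreement weights for two sets of ratings on an ordered scale (for example, the same rater on two trials). The function name, the category labels and the ratings are assumptions made for illustration; they are not data from this study.

```python
def weighted_kappa(r1, r2, categories):
    """Linear-weighted Cohen's kappa for two equal-length rating lists.

    `categories` lists every possible category in rank order.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(r1)

    # Observed proportion of ratings in each (first, second) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)      # 0 on the diagonal, 1 at the extremes
            num += w * observed[i][j]     # weighted observed disagreement
            den += w * row[i] * col[j]    # weighted chance disagreement
    # Note: den is zero if every rating falls in a single category.
    return 1 - num / den

# Hypothetical foot-type ratings from one rater on two trials.
types = ["highly supinated", "supinated", "normal", "pronated", "highly pronated"]
trial_1 = ["normal", "pronated", "normal", "supinated", "highly pronated"]
trial_2 = ["normal", "pronated", "pronated", "supinated", "pronated"]
print(round(weighted_kappa(trial_1, trial_2, types), 3))
```

With linear weights, a disagreement of one category is penalised less than a disagreement of several categories, which is the reason for preferring it to unweighted kappa on an ordered scale.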
     
  4. Griff

    Griff Administrator

    Nikki,

    Categorical data (or nominal data) is data which uses labels. When you report FPI data it is usually done qualitatively (i.e. -4 or +6, for example) and would therefore be interval/ratio data. The gold standard (as far as what journals would expect to see reported) for intra/inter-rater reliability is the ICC.

    How are you presenting the data?
     
  5. Griff

    Griff Administrator

    Nikki,

    Here is an example of what I mean (and also one of the better papers on FPI reliability) - notice they report ICCs with 95% confidence intervals

    Ian
     


  6. Nikki10

    Nikki10 Well-Known Member

    Thanks for the feedback
    The data will be presented in numerical form, showing the measurements taken by each rater at three trials.
     
  7. Griff

    Griff Administrator

    Then it isn't categorical
     
  8. Nikki10

    Nikki10 Well-Known Member

    Ok I understand what you mean, I will discuss this with my supervisors.

    Thanks again
     
  9. stewartm

    stewartm Member

    Hi,

    FPI data is categorical. Categorical data refers to data which can be sorted into categories (i.e. foot types). Ian, you state below that FPI data is reported qualitatively, what do you mean by that? Due to the boundaries of the FPI scale I wouldn't agree that this is interval or ratio data.

    The gold standard for reporting reliability data is to make the correct assumptions on the data and run the appropriate statistics.

    Cheers.
     
  10. Griff

    Griff Administrator

    Hi Stewart,

    I'm glad someone with far superior research methods knowledge to me has pitched in on this, as I've been mulling this over for the last few days. By saying it was reported qualitatively I meant numerically (e.g. +12 or -8 etc.). Naturally it could be sorted into categories (or labels, as I referred to them earlier) based on one of the 5 'foot types', but is this the norm with respect to a reliability study? Is it rigorous enough?

    I assumed it was not, because by categorising into foot types (using the highly supinated foot type of the FPI-6 as an example) we cover the range of -5 to -12. Therefore I could score someone differently by up to a value of 7, but as they are still in the same category it would appear to have good intra-rater reliability? Appreciate your thoughts on this. In the meantime I will delve back into the FPI literature and re-read.

    Cheers

    Ian

    PS I'm assuming Nikki is a student of yours at UEL so apologies if I have added to her woes or confusion!
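
The concern above about the highly supinated band can be made concrete with a small Python sketch. Only the -5 to -12 band is taken from the post; the other cut-offs are commonly cited FPI-6 reference ranges and should be treated as an assumption here, as should the function name `foot_type`.

```python
def foot_type(score):
    """Map a total FPI-6 score (-12 to +12) to a foot-type label.

    Only the highly supinated band (-5 to -12) comes from the thread;
    the remaining cut-offs are assumed reference values.
    """
    if not -12 <= score <= 12:
        raise ValueError("FPI-6 totals range from -12 to +12")
    if score <= -5:
        return "highly supinated"
    if score <= -1:
        return "supinated"
    if score <= 5:
        return "normal"
    if score <= 9:
        return "pronated"
    return "highly pronated"

# Hypothetical scores for the same foot on two trials.
trial_1, trial_2 = -5, -12
print(foot_type(trial_1) == foot_type(trial_2))  # same category on both trials
print(abs(trial_1 - trial_2))                    # raw scores differ by 7
```

A category-based statistic would count this pair as perfect agreement, while a score-based statistic would register a 7-point discrepancy.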
     
  11. stewartm

    stewartm Member

    Hi Ian,

    I suspect it is a student of ours however not under a name that I recognise. If it is the case then clearly their meeting with the statistician at University has caused them further confusion.

    I'm sure there are inconsistencies in the methods for reporting the FPI, but as a tool for quantifying foot posture I'd argue that the scores on their own don't mean much. I'm not sure that I grasp what you mean when referring to the norms for a reliability study; however, the study is looking at the reliability across raters in their classification of foot type, and therefore I can't see why the design is a problem (or not rigorous enough).

    I appreciate what you are saying regarding the supinated foot but, in recognition of the limitations of the index, we can't do much about that other than recognise the possibility for reliability values which may not be a true reflection. There are processes where the data can be transformed in order to run parametric tests - possibly worth pursuing - but due to the pending submission deadlines for the BSc project this isn't feasible at this stage.

    Cheers
     
  12. Griff

    Griff Administrator

    I meant the norm for a reliability study of the FPI (not a reliability study in general).

    I am no stats whizz by any stretch, but my intuitive assumption was that you would sort the data as numerical values rather than labels/categories. My understanding is that an inter-rater reliability study is essentially trying to ascertain how repeatable a measurement is between two raters (whatever that measurement may be). So if I was measuring the length of every subject's left leg and so were you, we would have a table of both our measurements for each subject (in mm; i.e. numerically) and then we would run the stats to see how 'similar' (not a stats word, I know) they were. Would this not be the same for FPI? Are we not just seeing how 'similar' rater A and rater B's measurements (or values) are?

    As I said no stats whizz - happy to 'get my coat' if I'm way off

    Ian
     
  13. Griff

    Griff Administrator

    Stewart,

    I've just had another read of the Cornwall et al paper (attached above) and it sort of answers my question. In part of their study they tried to determine whether reliability was better when using the raw 'score' to classify foot types.

    Intra-rater reliability
    When analysing raw score Wilcoxon used
    When analysing category Kappa coefficient used

    Inter-rater reliability
    When analysing raw score Independent t test used
    When analysing category Kappa coefficient used

    The ICCs were calculated for all.

    Conclusion: Classification of feet based on the raw FPI-6 score does not seem to improve the amount of agreement between clinicians.
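
For comparison with the kappa approach, here is a minimal Python sketch of one common form of the intraclass correlation coefficient, ICC(2,1) in Shrout and Fleiss's notation (two-way random effects, absolute agreement, single rater). Whether this is the ICC form used in the paper is not stated in the thread, so that choice is an assumption, as are the function name and the example scores.

```python
def icc_2_1(table):
    """ICC(2,1) for a table with one row per subject, one column per rater."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-raters mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical FPI-6 totals: five feet (rows) scored by three raters (columns).
scores = [
    [2, 3, 2],
    [5, 5, 6],
    [-1, 0, -2],
    [7, 6, 8],
    [4, 6, 4],
]
print(round(icc_2_1(scores), 3))
```

Unlike kappa on foot-type categories, this statistic works on the raw scores and treats them as interval data, which is the assumption under debate in this thread.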
     
  14. stewartm

    stewartm Member

    Hi Ian, I hope I haven't lost the focus with the answer. You are right that an inter-rater reliability study is comparing between raters, and I suppose you would compare the scores, but you need to look at the underlying assumptions of the data (and what is being provided by the measurement tool). I think the example of leg length is an interesting one but not directly comparable, because we are not looking at a summated score with the LLD (we are comparing the analysis of continuous data with categorical). In essence, the data generated by the FPI is discrete, being limited by the boundaries of the score, -12 to +12. This is not the case with leg length measurements. As we are taking a sum of 6 criteria, we need to consider the 6 individual criteria and how much they contribute to the final score. There is the assumption that each individual item of the index, and the divisions within that item, have equal weighting; not necessarily the case! (And this is where the weighted kappa comes in, I think.) If we want to run the analysis like the leg length data then we need to transform the data to logit-transformed scores. This is the process of changing raw FPI-6 scores into a data form suitable for parametric analysis, but for this, large data sets are required (Keenan et al., 2007).
    Keenan AM, Redmond AC, Horton M, Conaghan PG, Tennant A. (2007) The Foot Posture Index: Rasch analysis of a novel, foot-specific outcome measure. Arch Phys Med Rehabil. 88: 88-93.
     
  15. Griff

    Griff Administrator

    Stewart,

    Thanks for your time answering this - I'm gonna need some time to digest it (and stop my ears bleeding). Point taken about the LLD measurement - bad example!

    Cheers

    Ian
     
  16. stewartm

    stewartm Member

    This may be of interest (although at the same time, quite probably not)

    Walters SJ, Campbell MJ, Lall R. (2001) Design and analysis of trials with quality of life as an outcome: a practical guide. Journal of Biopharmaceutical Statistics. 11(3):155-76.

    Cheers.
     