The Importance of Statistics

Wow, sometimes life just takes control and you realize that you haven’t written a blog post in almost a month! At first, I was thinking “no one will care how long it takes for you to write your blog”, but then people started asking what I will write next, or whether I could at least write an update on everything, which is super inspiring for me! Thank you to everyone who has reached out and asked; I will try my best to keep everything flowing more regularly. Before I get into today’s topic, here are some personal life updates on why I haven’t written anything yet this month. I got engaged! I was surprised, it was unexpected, but I am over the moon about it. So, wedding planning has commenced. My football season has also ended, and I will be starting pre-season conditioning next week, but there was a lot of practice and traveling back and forth to games over the last little while.

Now that I have given you that update, I want to talk about what triggered this next blog! I had brunch with an old friend today who reminded me of what kind of student I was in high school. We swapped some typical math and science class stories, and I said to her, “isn’t it crazy that I spend 90% of my time doing statistics and designing experiments to present science to other scientists?!” We laughed about it, but realistically I struggled with math and science for a very long time. Even when I was in university I dreaded going to my introductory statistics class. And because I decided to do a double major, and honours, I then had to take two methods classes and two more statistics classes. I then went to graduate school to do a Master of SCIENCE… which then led to advanced-level statistics classes, a lot of statistics in the lab, and learning and demonstrating a strong scientific method. Needless to say, I absolutely adore statistics now, and I am constantly learning and upgrading my knowledge of the methods and programs involved in data analysis and management.

But let’s get to the point here!!

I want to talk about the statistical practice I learned in class, compare it to the applied statistics done in industry, and explain how important it is not to cut corners when completing contracts. I hope this gives some insight into what I do daily and how much I respect the data that I work with.

1)      The first thing that I have learned is that not many of the people I work with really understand what needs to be considered when starting with a dataset. It is at this moment that I tell my clients that it’s important to me that the information I convey to them is accurate and that the work is done ethically. When you pull a dataset, it is considered “dirty”: there are responses that aren’t complete, there are responses that weren’t taken seriously, sometimes you find glitches, or you have variables that aren’t ready to be analyzed. The scary part is that in undergraduate-level classes we are usually given a cleaned dataset and jump right into analysis, so the practice of cleaning a dataset is typically learned while doing a first experiment, and some people aren’t exposed to it until graduate school. I don’t think that this is common sense, so when I pull a dataset, I typically tell my client that I want a few days to get familiar with the data. This gap is not the fault of the professors teaching us. There is only so much time within a year to really grasp all these concepts, but honestly, I know I would have taken advantage of an elective course just on cleaning data and checking assumptions.

2)      Familiarizing myself with data includes going through the experimental design and the survey, and reading up on specific items and measures, so that I really grasp what it is that we are going to be analyzing. I make sure I look at the exclusion criteria for each participant. I take the study myself to assess how long it takes and to see if anything is confusing. A lot of the time this data cleaning and familiarizing will differ per client depending on when I started with them, because sometimes I am the researcher who created the study, and in that case familiarizing myself with the data does not take as much time.

3)      It’s also important that the researcher is aware of the hypotheses being tested, because they determine the analysis that is run, which in turn determines the assumption checks that need to be completed before the analysis. Because every statistical analysis has its own assumptions, this process is important. I won’t go into much detail about this, but if the researcher is walking into a study halfway (which happens often), it is important to know exactly what the methods are, what ethics requirements are involved, and what it is that the client is trying to answer. That may seem obvious to say, but there are times when clients aren’t ready to answer some of those questions, because sometimes they aren’t scientists and wouldn’t know where to start. So, it is good practice to be aware of how you will help your client be successful with their data.

4)      Ok! We have clean data, we have checked our assumptions, and we have now run the analysis. Now what? Well, I get asked A LOT for statistics help from friends and colleagues who are either still in school or working in industry, and a common question is how to interpret specific results, and whether any assumptions may be violated depending on the method they run. I honestly think this is the most important part of statistics. When someone is handling a dataset and they aren’t quite sure what is going on, to me that is hazardous, and almost scary. Even our most basic statistic needs to be done the right way. We can’t be throwing a bunch of data together to try to find the answers or something new and exciting, because fishing through data without really knowing what is happening isn’t good science. To this day, I have a binder that I have labeled and highlighted to remind me what needs to be done correctly for every kind of statistic that I have learned since 2007. So, the important part here is to make sure that we report the data correctly, and also that the client understands what the findings are. Not everyone understands scientific jargon, and being a good consultant and researcher means being able to communicate results clearly.
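
For anyone curious what the cleaning step from point 1 can look like in practice, here is a minimal sketch in Python. Everything in it is made up for illustration: the field names (seconds, q1 to q4), the 60-second cutoff, and the five example responses are my assumptions, not a real client dataset or a fixed rule. It simply sorts each response into “keep” or “exclude” and records why, covering incomplete responses, ones finished suspiciously fast, and ones with the same answer to every item.

```python
# Hypothetical survey export. Every field name, value, and threshold below
# is an assumption made for illustration only.
RAW_RESPONSES = [
    {"id": 1, "seconds": 412, "q1": 4, "q2": 2, "q3": 5, "q4": 3},
    {"id": 2, "seconds": 38, "q1": 3, "q2": 3, "q3": 3, "q4": 3},      # rushed
    {"id": 3, "seconds": 365, "q1": 5, "q2": None, "q3": 4, "q4": 4},  # missing q2
    {"id": 4, "seconds": 501, "q1": 2, "q2": 2, "q3": 2, "q4": 2},     # same answer throughout
    {"id": 5, "seconds": 298, "q1": 1, "q2": 4, "q3": 2, "q4": 5},
]

ITEMS = ["q1", "q2", "q3", "q4"]
MIN_SECONDS = 60  # assumed cutoff; in practice this comes from piloting the study


def exclusion_reason(row):
    """Return why a response should be flagged, or None if it looks usable."""
    answers = [row[item] for item in ITEMS]
    if any(answer is None for answer in answers):
        return "incomplete"
    if row["seconds"] < MIN_SECONDS:
        return "too fast"
    if len(set(answers)) == 1:
        return "straight-lined"
    return None


def clean(rows):
    """Split responses into kept rows and a log of excluded ids with reasons."""
    kept, excluded = [], {}
    for row in rows:
        reason = exclusion_reason(row)
        if reason is None:
            kept.append(row)
        else:
            excluded[row["id"]] = reason
    return kept, excluded


kept, excluded = clean(RAW_RESPONSES)
print("kept ids:", [row["id"] for row in kept])
print("excluded:", excluded)
```

Keeping the reason next to each excluded id matters as much as the filtering itself, because the exclusions are part of what gets reported to the client. A flag like “straight-lined” is only a prompt to look closer; whether that response is actually dropped is still a judgment call.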

I could probably write about the proper use of statistical methods for days, but I might be the only person who cares about what I have to say on the topic. My take-home message here is that statistics are hard, and a lot of training goes into using them properly. Take the time and effort to learn, or hire someone who knows what they are doing with the hefty numbers.

KC