Your colleague Sheila, an avid hip-hop fan, just sent you a data file with the #1 hits of the year 2000. She’s certain that 2000 was a banner year for hip-hop, and that more #1 hits came from hip-hop than from any other genre.
After a cursory look at her data, you decide you have enough information to make an informed decision about how to proceed. How data is gathered, cleaned, and analyzed depends on myriad factors, including the purpose, context, and reason for its collection. This reference outlines some key considerations that may arise during the steps of your data collection, cleaning, and analysis process.
This 5-step approach can be used to begin having fun with your data at a cursory level.
- Import data and get familiar with it
  - What does it look like?
  - Are there NaN values?
  - How might you be able to use/play with this data?
- Clean and organize your data
  - What non-values need to be removed?
  - Should column values be changed to more general terms?
- Create a problem statement and approach to your data
  - What types of questions might your data answer?
  - What other data could be merged with this dataset to make it even more valuable?
- Condense your data using pivot tables
  - Make your data easier to navigate through and manipulate
- Visualize your data
  - Visualization of data makes it easier to consume. Run your data through any visualization software to help clarify it.
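To make the steps above concrete, here is a minimal sketch in plain Python of how they might look for a file like Sheila’s. Everything specific in it is an assumption for illustration: the column names (`title`, `genre`, `weeks_at_no1`), the placeholder rows, the `GENRE_ALIASES` mapping, and the helper function names are all made up, and your real file will almost certainly look different.

```python
import csv
import io
from collections import Counter

# Made-up placeholder rows standing in for the real file (not actual chart data).
RAW_CSV = """title,genre,weeks_at_no1
Song A,Hip-Hop,3
Song B,pop,4
Song C,,2
Song D,hip hop,1
Song E,R&B,NaN
Song F,Pop,2
"""

# Hypothetical mapping that folds spelling variants into more general genre labels.
GENRE_ALIASES = {"hip-hop": "Hip-Hop", "hip hop": "Hip-Hop", "pop": "Pop", "r&b": "R&B"}


def is_missing(value):
    """Treat None, empty strings, and the literal text 'NaN' as missing."""
    return value is None or value.strip() == "" or value.strip().lower() == "nan"


def load_rows(text):
    """Step 1: import the data as a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Step 1: count missing values per column."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if is_missing(value):
                missing[column] += 1
    return missing


def clean_rows(rows):
    """Step 2: drop rows with any missing value and generalize genre labels."""
    cleaned = []
    for row in rows:
        if any(is_missing(value) for value in row.values()):
            continue
        raw_genre = row["genre"].strip()
        genre = GENRE_ALIASES.get(raw_genre.lower(), raw_genre)
        cleaned.append({"title": row["title"], "genre": genre,
                        "weeks_at_no1": int(row["weeks_at_no1"])})
    return cleaned


def hits_per_genre(rows):
    """Step 4: condense to one count per genre, like a small pivot table."""
    return Counter(row["genre"] for row in rows)


def text_bar_chart(counts):
    """Step 5: a crude text visualization, one '#' per hit."""
    return "\n".join(f"{genre:<8}{'#' * n}" for genre, n in counts.most_common())


rows = load_rows(RAW_CSV)
missing = count_missing(rows)

# Step 3 (problem statement): did any genre have more #1 hits than hip-hop?
cleaned = clean_rows(rows)
counts = hits_per_genre(cleaned)

print(missing)
print(text_bar_chart(counts))
```

Dropping every row with a missing value is the bluntest possible cleaning choice and is used here only to keep the sketch short. If you work in pandas, `DataFrame.pivot_table` and a plotting library would likely replace the last two helpers.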
After the quantitative process is complete, you can begin asking more of your data through qualitative analysis. Unlike quantitative research, qualitative research “involves analysis of data such as words”. It seeks to gather verbal data rather than measurements, to provide detailed descriptions of the research topic.
The National Science Foundation recommends that users continuously explore their qualitative data with questions such as:
- What patterns/common themes emerge around specific items in the data?
  - How do these patterns (or lack thereof) help to shed light on the broader study question(s)?
- Are there any deviations from these patterns?
  - If yes, what factors could explain these atypical responses?
- What interesting stories emerge from the data?
  - How can these stories help to shed light on the broader study question?
- Do any of the patterns/emergent themes suggest that additional data needs to be collected?
  - Do any of the study questions need to be revised?
- Do the patterns that emerge support the findings of other corresponding qualitative analyses that have been conducted?
Main image credit: kenshoo.com