How do research projects proceed?
Every project is a bit different, but there are some general patterns.
1. Pose a preliminary question
2. Find the appropriate data
3. Get the data into usable form
4. Preliminary analysis
5. Revisit and revise preliminary questions
6. Final analysis
7. Report your results
What do you want to know? Why is it interesting?
Your question may (and probably will) change depending on what you find in the data.
You will likely cycle back and forth between question and data in the very early stages.
Some questions are great, but we do not have the data to answer them.
Sometimes we start looking at the data and realize the preliminary question is not quite the right one. That's okay: Go back and revise it!
Getting the data into usable form is where you will spend a lot of time. Survey results reported by Forbes show that data cleaning is the most time-consuming and most unpleasant task. The better we are at these tasks, the easier our lives will be.
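To make "usable form" concrete, here is a minimal sketch of a cleaning pass using only the Python standard library. The raw text, the column names (name, region, sales), and the function clean_rows are all hypothetical, invented for illustration. Real data will need different rules.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, inconsistent case, missing values.
RAW = """name, region ,sales
 Alice,north, 1200
Bob ,NORTH,
carol,South,950
,south,700
"""

def clean_rows(text):
    """Return a list of dicts with tidy keys, trimmed strings, and numeric sales."""
    reader = csv.DictReader(io.StringIO(text))
    cleaned = []
    for row in reader:
        # Trim whitespace from both the header names and the cell values.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Drop rows that are missing a name or a sales figure.
        if not row["name"] or not row["sales"]:
            continue
        cleaned.append({
            "name": row["name"].title(),      # consistent capitalization
            "region": row["region"].lower(),  # consistent category labels
            "sales": float(row["sales"]),     # text to number
        })
    return cleaned

rows = clean_rows(RAW)
for r in rows:
    print(r)
```

Dropping incomplete rows is only one possible choice. Depending on the question, it may be better to keep them and flag the missing values.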
Time to get to work answering our question. Do we see patterns that suggest an answer to our question?
At this point, your plots are for you and your team, so they do not have to be perfect. You should be thinking, however, about how the final plots will look.
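A first look for patterns can be as simple as comparing group averages before any plot is polished. The sketch below assumes hypothetical, already-cleaned records of (region, sales) pairs. The data and the helper mean_by_group are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cleaned records: (region, sales).
records = [
    ("north", 1200.0),
    ("north", 800.0),
    ("south", 950.0),
    ("south", 700.0),
    ("south", 1050.0),
]

def mean_by_group(pairs):
    """Average the values for each group label."""
    groups = defaultdict(list)
    for label, value in pairs:
        groups[label].append(value)
    return {label: mean(values) for label, values in groups.items()}

summary = mean_by_group(records)
for label, avg in summary.items():
    print(f"{label}: {avg:.1f}")
```

A difference between groups at this stage is a hint, not an answer. It tells you where to look more carefully in the final analysis.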
A common mistake when reporting results is presenting them in the order in which they were "discovered." Do not take this approach! Instead, report your results in the way that best leads the reader to your question's answer.
Hugo Bowne-Anderson via Harvard Business Review
"The vast majority of my guests tell [me] that the key skills for data scientists are...the abilities to learn on the fly and to communicate well in order to answer business questions, explaining complex results to nontechnical stakeholders."
The whole article is interesting...