Years ago I read an article by the famous blogger Mark Manson. Since then he has published several books and is pretty well-known now. What he said in that article was that you might think you want something, but you have to make sure you are considering the troubles and challenges that come with it, not just the state of having achieved it. His point was that “you need to love the journey, not just the outcome”. Otherwise, you’ll be miserable.
I interview data professionals from different companies, backgrounds and levels of seniority on my podcast. During these chats, I ask them why they chose data science and what they like about it. An answer I often get is: “I love being able to tell people what to do based on data”. I'm sure everyone can relate to that being a good feeling. Imagine going up to your boss and saying, “well, that campaign you did last month was a mistake, here is why and here is what you should do instead”. Of course, not every data scientist does that. I have never gotten to do that myself. If this is also your aspiration, we might have a problem there.
My take on data science is a little bit different. I love data science for the journey rather than its results. The process of making something out of seemingly nothing delights me every time. In this article, I'd like to share some of the specific things I have observed in my job that make me like it more every day, and things that might not appeal to you.
It is not possible to be a data scientist without being exposed to some sort of domain knowledge. I work with different companies for every new project and it is a fun ride. When I’m done with a project, I’m typically way better informed about that industry than I was before. This is partly because I need to know more to be able to perform my job, but also because just being around people who work in that industry gives me a newfound appreciation for what they do and the intricacies of the job. In a way, I understand how the world works better with every project.
As part of my job, I have to act like an investigator. It may be that no one knows where I can get access to the data, so I need to figure it out, talk to people and find that one person who can give me access. The stakeholders might not know exactly what they need or want; it is up to me to study their area and understand what exactly is going wrong and what type of data science solution they need. It is also on me to keep an eye out for any possible pitfalls in the data or in the training process. I need to make sure everything checks out, and that there are no conflicts and no obvious mistakes. This requires great attention to detail and it is very fulfilling.
This job is not a “we need this so please get it done” kind of job. As a data scientist, I am heavily involved in the decision process. This is because the data scientist is the one with the expertise, but also because what I'm doing has to align closely with reality. If my model is not well tuned to reality, it is very hard to use it in real life. So it is important, and fun, to talk to nearly everyone involved in a project and make sure you have a good overall understanding of the problem at hand.
I know this is not a common favorite of data scientists. Many people strongly dislike the data cleaning process. But I somehow get a weird sense of satisfaction out of it. Kind of like vacuuming a conspicuously very dirty surface. It is not very hard but very rewarding since you see the results immediately. It is also not mundane since you need to come up with smart solutions to problems like missing data points or an unbalanced dataset.
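To make the two cleaning problems mentioned above a bit more concrete, here is a minimal sketch in plain Python. It is only one possible approach, not the way I handle every dataset: missing values are filled with the median of the observed ones, and the imbalance is handled by weighting each class inversely to its frequency. The function names and the toy `ages` and `labels` data are made up for illustration.

```python
from collections import Counter
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def class_weights(labels):
    """Weight each class inversely to its frequency so rare classes count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


# Toy data (hypothetical): two missing ages and a rare "yes" class.
ages = [34, None, 29, 41, None, 38]
labels = ["no", "no", "no", "no", "yes", "no"]

filled = impute_median(ages)
weights = class_weights(labels)
print(filled)
print(weights)
```

Whether a median fill or class weighting is the right call depends on the data. Sometimes dropping rows, or resampling the rare class, is the smarter solution.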
It is just fascinating what breakthroughs you might trigger as a data scientist. You don't even have to prepare an advanced model for it. Just the data-driven perspective on things tends to make a big difference. Many times business stakeholders in the higher levels of an organization neglect thinking in terms of data. That's why, once they are presented with a simple visualization of their data or some insights that they have never heard before, you see a spark in their eyes. It means, yes my friend, you managed to change or add to how they think about that certain thing.
Sometimes it sounds as if data scientists are robots performing the same tasks over and over again. Clean data, train model, test model, next... That’s why many people believe (rather incorrectly) that a data scientist’s job can be easily automated. I mean, I’m sure it can be automated, just not so easily. I believe the job of a data scientist is creative at times. There are rules of thumb and best practices when it comes to dealing with problems in the data, but many times you have to go wild and let your hair down, and try something that you have never heard of before. Those are the moments when my job gets very fun.
I have to say, I detest the phrase “can-do attitude” but it does a good job of defining a data scientist’s attitude. Most of the time the information you need is not going to be included and you’re going to have to find a solution to this on your own. You need to be resourceful and creative.
No date information in the dataset? Look at the best-selling items in that period. Found that Halloween-themed candy sold the most? Now you can estimate the date.
No day of the week information? Look at the day numbers and where they roll over to a new week. Day 1 must be Monday. BUT the data is from the US, so Day 1 is likely Sunday.
One example I like is from Twitter, where they used mentions of soggy fries to suggest that a frying machine company offer support to restaurants.
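The two date tricks above can be sketched in a few lines of Python. This is a rough illustration under my own assumptions, not a recipe: the `SEASONAL_HINTS` and `WEEK_START` tables, the function names and the toy sales list are all hypothetical, and in a real project you would build those lookups from what you know about the business and the country.

```python
from collections import Counter

# Hypothetical lookup: seasonal product -> month in which it typically peaks.
SEASONAL_HINTS = {"halloween candy": 10, "advent calendar": 12}

# Assumed convention for which weekday "day 1" refers to, by country code.
WEEK_START = {"US": "Sunday", "DE": "Monday"}
WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
            "Thursday", "Friday", "Saturday"]


def estimate_month(sold_items):
    """Guess the month from the best-selling item that has a seasonal hint."""
    for item, _count in Counter(sold_items).most_common():
        if item in SEASONAL_HINTS:
            return SEASONAL_HINTS[item]
    return None  # nothing seasonal found, so no estimate


def weekday_name(day_number, country="US"):
    """Map a 1-based day number to a weekday name for the given country."""
    start = WEEKDAYS.index(WEEK_START[country])
    return WEEKDAYS[(start + day_number - 1) % 7]


# Toy sales log (hypothetical): bread sells most, but it carries no seasonal hint.
sales = ["bread"] * 7 + ["halloween candy"] * 5 + ["milk"] * 2
month = estimate_month(sales)
us_day_one = weekday_name(1, "US")
de_day_one = weekday_name(1, "DE")
print(month, us_day_one, de_day_one)
```

The guess is only as good as the lookup tables, so it is worth double-checking it against anything else in the data that hints at a date.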
One thing I haven’t mentioned here is, of course, the community of people in AI who push the limits of this profession. Definitely makes one proud. Not to mention the perk of understanding what actually is happening when there is news about a certain advancement in AI, rather than being scared of “robots taking over the world”.
All these things and more make it exciting for me to get in front of my computer and start working. I believe being a data scientist is especially fulfilling for someone who loves to learn new things and solve problems. This might not be you and that's okay. Before jumping on the data science bandwagon, make sure you understand the journey that comes before the glamorous "I tell my boss what to do, everyone in the company is waiting on my insights" phase. It might save you valuable time spent unhappily in the office.