Big Data is all the rage right now. In fact, data scientist was recently named the sexiest job of the 21st century by the Harvard Business Review. I’d say a good chunk of the angel investment community here in Miami has invested in a Big Data startup at some point in their investment careers. You can take a look for yourself.

Starting a Big Data Company

Starting a Big Data startup is definitely a lot more feasible than before. Advances in machine learning and artificial intelligence allow for better analysis of unstructured text data. Thanks to the open source software Hadoop, founders have a free tool to store and process vast amounts of data.
However, building a company focused on harvesting and mining Big Data is not as easy as it sounds, despite the accessibility and advances in the technology. Here’s why:

  1. Scarce Talent: There are thousands of vacant data scientist positions in the United States. Top companies struggle to find data scientists to fill their openings, so startups are in for a long talent search. You can always look to outsourcing, but make sure your data scientist is fluent in the language your data is written in.
  2. High Salaries: Data scientists are some of the highest-paid professionals in the country, with an average salary of $113,436 according to Glassdoor. A startup with limited capital will have a hard time attracting and hiring top data science talent. The best candidates are probably already working at great companies and earning a competitive salary and benefits package. How can you compete with that? You must give the data scientist a compelling and emotional story about how your startup is the next big thing.
  3. Data Cleaning: After you have acquired your data and parsed through it, you will realize that your data is messy, filled with errors and misspellings. Manual review and correction of the data are necessary to make it presentable and interpretable for analysis. That can be a very mundane and time-consuming task.
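
To make the data cleaning point concrete, here is a minimal Python sketch of how part of the review could be semi-automated. It assumes the messy field should match a short list of known-good values. The `KNOWN_CITIES` list, the function names, and the 0.8 similarity cutoff are all hypothetical choices for illustration, not a recommended standard. It uses only the standard library (`re` and `difflib`).

```python
import difflib
import re

# Hypothetical list of known-good values; in practice a domain expert supplies it.
KNOWN_CITIES = ["Miami", "Orlando", "Tampa", "Jacksonville"]


def normalize(value):
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def suggest_correction(value, vocabulary, cutoff=0.8):
    """Return (cleaned_value, needs_review) for one raw field.

    Exact matches (ignoring case and stray whitespace) are accepted as-is.
    Near matches are replaced by the closest known value but still flagged,
    so a person can confirm the guess. Anything else is flagged unchanged.
    The cutoff of 0.8 is an assumed threshold, not a tuned one.
    """
    cleaned = normalize(value)
    lookup = {known.lower(): known for known in vocabulary}

    if cleaned.lower() in lookup:
        return lookup[cleaned.lower()], False

    matches = difflib.get_close_matches(
        cleaned.lower(), list(lookup), n=1, cutoff=cutoff
    )
    if matches:
        return lookup[matches[0]], True

    return cleaned, True


if __name__ == "__main__":
    for raw in ["  miami ", "Orlanda", "Boston"]:
        print(repr(raw), "->", suggest_correction(raw, KNOWN_CITIES))
```

In this sketch a value like "Orlanda" is mapped to "Orlando" but still flagged for review, while a value that matches nothing in the list is left unchanged and flagged. A script like this does not remove the manual review; it only narrows down the rows a person has to look at.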

Ultimately, the best way to mitigate these challenges is to team up with a great partner and surround yourself with great mentors. A good data scientist should know efficient and effective strategies to extract and manipulate unstructured data. Make sure the communication within your startup is open and transparent in order to encourage fluid dialogue between your domain expert and data scientist.