October 10, 2016

What’s Driving Data Science Salaries Now


Interested in getting a higher salary in your job as a data scientist? (Who isn’t?) According to a recent survey, there’s a strong correlation between higher salaries and working with Spark, Python, and Unix; spending long hours coding and building models; and sitting in lots of meetings.

Conversely, using Excel, working for an older company, and being a woman were strongly correlated with a lower salary, according to O’Reilly’s 2016 Data Science Salary Survey.

Now in its fourth year, O’Reilly’s report is based on data collected from more than 900 respondents, which was subsequently run through models to tease out the pertinent bits. For example, the median salary for the entire sample was $87,000. That’s down from the $91,000 recorded in last year’s survey. But don’t worry: the decline was mostly due to a higher share of young non-U.S. respondents, according to O’Reilly. The average salary for data scientists in the U.S. (who accounted for 60% of the sample) was $106,000.
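To see how a change in the mix of respondents can pull down the overall median even when nobody’s pay changes, consider the minimal sketch below. The salary figures and group sizes are invented purely for illustration and are not taken from the survey; only the 60% U.S. share in the second sample echoes the article.

```python
import statistics

# Invented illustrative salaries in USD; NOT survey data.
us = [100_000, 106_000, 112_000]
non_us = [50_000, 60_000, 70_000]

def pooled_median(us_copies, non_us_copies):
    """Median of a sample built from repeated copies of each group."""
    sample = us * us_copies + non_us * non_us_copies
    return statistics.median(sample)

# Hypothetical "last year": 9 U.S. and 3 non-U.S. respondents (75% U.S.).
last_year = pooled_median(3, 1)
# Hypothetical "this year": 9 U.S. and 6 non-U.S. respondents (60% U.S.).
this_year = pooled_median(3, 2)

print(f"U.S. median (both years):     ${statistics.median(us):,.0f}")
print(f"Non-U.S. median (both years): ${statistics.median(non_us):,.0f}")
print(f"Pooled median, last year:     ${last_year:,.0f}")
print(f"Pooled median, this year:     ${this_year:,.0f}")
```

In this toy sample the median within each group is the same in both years, yet the pooled median drops because the lower-paid group makes up a larger share of the whole.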

Geography Matters


Overall data scientist salary distribution (Source: O’Reilly’s 2016 Data Science Salary Survey)

Obviously, where you work has a big impact on your salary. Data scientists working in California had the highest salaries. In the Golden State, O’Reilly found the interquartile range (IQR) between the 25th percentile and the 75th percentile to be about $105,000 to $150,000, with an average just north of $125,000. The Pacific Northwest, Northeast, and Mid-Atlantic also had average salaries above $100,000. The average data science salary was less than $100,000 in the Midwest, the South, Texas, and the Southwest/Mountain regions, the survey found.
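For readers unfamiliar with the interquartile range, here is a minimal sketch of how the 25th and 75th percentiles are computed from a list of salaries. The salary list is hypothetical and invented for illustration, and the choice of the "inclusive" quartile method is an assumption; the article does not say how O’Reilly calculated its quartiles.

```python
import statistics

# Hypothetical salaries in USD, invented for illustration only (not survey data).
salaries = [95_000, 105_000, 110_000, 120_000, 125_000,
            130_000, 140_000, 150_000, 160_000]

def interquartile_range(values):
    """Return (q1, q3, width) using the 'inclusive' quartile method."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3, q3 - q1

q1, q3, width = interquartile_range(salaries)
print(f"25th percentile: ${q1:,.0f}")
print(f"75th percentile: ${q3:,.0f}")
print(f"IQR width:       ${width:,.0f}")
```

Half of the sample falls between the two percentiles, which is why the IQR gives a better sense of the typical spread than the minimum and maximum.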

Being a woman in the data science business will cost you about $10,000. That’s the average difference that O’Reilly found between male and female survey respondents, with all other variables being equal. Clearly, the IT industry has a long way to go before it can claim to treat men and women equally.

Time Counts

Age matters, too. The highest salaries went to data scientists between the ages of 51 and 60, who had average salaries of about $135,000. What’s more, data scientists over the age of 60 had a higher average salary than those aged 41 to 50, the survey found. Data scientists in their 30s had an average salary of around $85,000, while very few of those under the age of 30 earned more than $100,000. (Tesla drivers, as an informal proxy, would seem to lean heavily toward the aged male demographic, too.)

Those data scientists working 51 to 55 hours per week earned the most. The survey found a median salary of about $125,000 for those working these hours, followed closely by those working 46 to 50 hours and those working 56 to 60 hours. Those gung-ho data scientists working more than 60 hours earn about the same as those working 41 to 46 hours, the survey found. However, if you’re working 40 hours or less, be prepared to earn less.

Titles Matter

Your title also matters. Interestingly, if your job title is “data scientist,” your median salary is about $75,000, with an IQR of about $55,000 to $110,000, according to the survey. Nearly half of all respondents had this title.


What industry you work in has a strong influence on your salary (Source: O’Reilly 2016 Data Science Salary Survey)

But if you’re a data scientist who has a title that’s representative of “upper management”–such as CIO, director, or vice president–or you have “architect” or “senior engineer/developer” in your title, you will earn more money, the survey found. “Principals” also had a median salary above $100,000.

What you do also matters. If you’re involved in organizing or running large projects, setting up systems, identifying business problems to solve, communicating with others inside or outside of the organization, or developing analytics software, then you’re going to be at the upper end of the salary spectrum.

Tasks that correlate with lower salaries include basic exploratory analyses, creating visualizations, using ETL, developing dashboards, or using dashboards created by others, according to the survey.

More Meetings, Please

Interestingly, O’Reilly found a strong correlation between time spent in meetings and higher salaries.

Data scientists who spend more than 20 hours per week in meetings had a median salary close to $130,000, according to the survey. This makes sense when you consider that older people in higher positions (i.e., those with leadership roles) will typically spend more time in meetings than their younger colleagues who have less experience and less responsibility.

Time spent in meetings correlates with salary (Source: O’Reilly 2016 Data Science Salary Survey)

However, don’t expect to boost your salary just by scheduling more meetings, O’Reilly warns. This variable likely moves in concert with other variables, including experience, age, and titles.

Salaries and Tools

Tool usage also has an impact on salary, the survey found.

The most commonly used tools were Excel and SQL (used by 69% of the sample), followed by R (57%) and Python (54%). However, not all had an equal impact: Python carried a positive 4.6% coefficient (basically the percentage increase in salary a user could expect to see), versus a negative 7.4% coefficient for Excel.
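As a rough illustration of what those coefficients mean in dollars, the sketch below applies each one to a hypothetical baseline salary. The $100,000 baseline and the `apply_coefficient` helper are assumptions made for this example, and reading each coefficient as a simple percentage adjustment follows the article’s description rather than the survey’s actual model specification.

```python
def apply_coefficient(base_salary, coefficient_pct):
    """Scale a salary by a percentage coefficient (4.6 means +4.6%)."""
    return base_salary * (1 + coefficient_pct / 100)

# Hypothetical baseline salary in USD; not a survey figure.
BASELINE = 100_000

# Coefficients as quoted in the article.
PYTHON_COEF = 4.6
EXCEL_COEF = -7.4

python_salary = apply_coefficient(BASELINE, PYTHON_COEF)
excel_salary = apply_coefficient(BASELINE, EXCEL_COEF)

print(f"Baseline:       ${BASELINE:,.0f}")
print(f"Python user:    ${python_salary:,.0f}")
print(f"Excel user:     ${excel_salary:,.0f}")
print(f"Difference:     ${python_salary - excel_salary:,.0f}")
```

On that hypothetical baseline, the gap between the two tools works out to about $12,000, all else being equal.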

Learning Scala could have a big impact on your salary. The language, which is closely tied to Spark in the data science realm, carried an average salary of about $110,000. That was second only to Perl, a popular open source scripting language used for Web development. Scala, Perl, Ruby, and Go were the only languages associated with salaries above $100,000.

Which databases you’re experienced with also plays into what you’re paid. In the relational field, Oracle (NYSE: ORCL) Exascale had the highest rating, with a median salary above $150,000. Exascale pros at the 75th percentile had a salary close to $210,000, way more than any other technology. Knowing Amazon (NASDAQ: AMZN) Redshift, EMC’s Greenplum, or Teradata (NYSE: TDC) correlates with a median salary above $100,000.

Big Data Salaries

In the battle over search engines, Lucene won with a median of about $125,000. Taking second here is Solr, which is built on Lucene and has a median of about $110,000. Coming in last was Elasticsearch, which carried a median salary of less than $100,000, according to O’Reilly’s survey.


Programming language has an effect on salary (Source: O’Reilly 2016 Data Science Salary Survey)

In the Hadoop world, data scientists who work with MapR’s converged platform and Amazon’s Elastic MapReduce (EMR) earned the most, with an average salary of about $120,000. Rivals Cloudera and Hortonworks (NASDAQ: HDP) were deadlocked on the salary-o-meter, with an average salary just north of $100,000. Those with plain vanilla Apache Hadoop skills did a little better than their Hortonworks or Cloudera cousins, while Oracle and EMC/Greenplum’s distribution (probably Pivotal but it’s not spelled out) also did well. Only those counting IBM (NYSE: IBM) BigInsights skills had median salaries less than $100,000.

Apache Spark continues to be a value-add for salary-seeking big data scientists, with a salary just north of $100,000. But you can do better for yourself by adding other skills to your big data resume, including Apache Hive (median salary: $105,000), Redshift ($120,000), Apache Kafka ($120,000), Pig ($105,000), Toad ($105,000), Cassandra ($120,000), Zookeeper ($115,000), Redis ($105,000), Google BigQuery ($105,000), Amazon DynamoDB ($105,000), Apache Storm ($120,000), and Couchbase ($110,000).

Among front-end BI tools, if you can tout Alteryx (median salary: $115,000), Jaspersoft ($115,000), or MicroStrategy ($110,000) as skills, then you’re doing fairly well. In the visualization department, Tableau ($100,000) and D3 ($95,000) stood out from the pack.

The machine learning environments having the biggest impact on salaries include Vowpal Wabbit ($110,000), Google Prediction ($105,000), H2O ($105,000), Turi (formerly Dato and GraphLab; $105,000), KNIME ($100,000), Spark MLlib ($100,000), scikit-learn ($95,000), and Mahout ($95,000).

In its analysis, O’Reilly defined four clusters of different types of workers, and analyzed the tools they use in that manner. It would be difficult to assign salary influences to individual tools outside of this cluster analysis, so take the numbers with a grain of salt.
