The Senior Data Science role is primarily responsible for developing, interpreting and implementing statistical models and creating other analytical solutions using the latest statistical and data science techniques for business problems across the organization.
- Develop and maintain predictive models
- Investigate data quality
- Explore feature selection and model tuning for improved performance
- Collect performance data and monitor model assessment
- Work with Product Team to align on data meaning and industry practices and terminology
- Work with Data Platform Team to ensure concept and semantic integrity across data layers
- Research and analyze the effectiveness of products
- Use techniques such as PSM to estimate, understand, and monitor product performance
- Research and implement new machine learning methods to improve risk management products
- Support BI and analytics across departments
- Provide consultative expertise as needed
- Work to provide self-serve reports, dashboards, and information on predictive models and customer performance
- Must have strong experience in one or more of the following:
- R package creation using Software Development Best Practices
- Statistical hypothesis tests
- Training / mentorship
NICE TO HAVES:
- 5+ years of professional experience working as a Data Scientist in machine learning.
- Undergraduate degree from an accredited college/university in Computer Science, Statistics, Mathematics, Engineering, Bioinformatics, Physics, or related fields with a strong mathematical background, with ability to understand data mining algorithms and machine learning.
- Strong proficiency in R, Python, or SAS and demonstrated experience with programming languages.
- Must have either current or past experience with R.
- Knowledge of all of the following machine learning concepts: (1) hyperparameter tuning (CV, bootstrap, etc.); (2) feature selection (stepwise, best subset, etc.); (3) model performance (AUC, accuracy, MSE, MAE, etc.); (4) overfitting and underfitting.
- Broad knowledge of supervised machine learning approaches: boosting, random forests, linear regression, logistic regression, artificial neural networks, convolutional neural networks, naive Bayes, etc.
- Demonstrated experience defining and creating the data needed for modeling. Response variables are not given. They are made by the Data Science team.
- Ability to write moderately complex SQL select statements. Joins and subqueries should feel natural.
- Moderate knowledge of traditional statistics: confidence intervals, hypothesis tests for independent samples vs. related samples, p-values, type I vs. type II error.
- Able to communicate findings and solutions clearly to a variety of audiences, draft clear, comprehensive specifications for engineers, and explain analytics concepts to non-experts.
- Support Data Science Team in at least one area.
- AWS: demonstrated (perhaps via GitHub) production lifecycle experience using AWS services.
- Docker: demonstrated (perhaps via GitHub) production lifecycle experience using Docker.
- R package creation: Have at least one package that is either published on CRAN or a proprietary package used in a professional context. You make software other Data Scientists rely on.
- Statistical hypothesis tests: Advanced degree in Statistics. Ability to translate an analysis question into a testable hypothesis and run a well-designed experiment.
- Training / mentoring: Professional experience mentoring new team members and creating training material.
- Graduate degree from an accredited college/university in Computer Science, Statistics, Mathematics, Engineering, Bioinformatics, Physics, or related fields with a strong mathematical background, with ability to understand data mining algorithms and machine learning.
- Healthcare experience is a plus.
- Public GitHub containing example work.
- Knowledge of unsupervised machine learning approaches: principal component analysis, kernel principal component analysis, clustering, etc.
This position can be remote from anywhere in the US.