Data Scientist for Predictive Modeling

  • Parabon NanoLabs, Inc.
  • Reston, VA, USA
  • Jan 03, 2019
Full-time | C++, Data Science, Deep Learning, Java, Machine Learning, MATLAB, Perl, Python, R, SAS

Job Description

A data scientist is sought to build, enhance, and operate machine learning pipelines for creating models to predict medical outcomes and physical traits from biological data. Successful hires will join an interdisciplinary research group pursuing ambitious, long-term projects aimed at achieving major scientific advances in the fields of medicine and forensics.

  • Must work in the Reston, Virginia office.
  • OPT work and visa sponsorship are not available due to the nature of some of Parabon's work.


This position encompasses both research and day-to-day operational duties. The Data Scientist will conceive, implement, and optimize machine learning pipelines for making predictions from biological data, including determining and implementing optimal approaches to feature selection, model selection, parameter optimization, and model accuracy evaluation. The Data Scientist will also support the company's computational biology and genomics experts in developing computational tools such as data analysis pipelines.

Operationally, the Data Scientist will be responsible for performing analyses using existing pipelines related to Parabon's Snapshot DNA Analysis Service.

The Data Scientist will have the opportunity to work at the forefront of multiple scientific disciplines and is expected to be capable of developing new methodologies, tools, and pipelines, as well as driving the identification and implementation of emerging technologies and statistical tools. The Data Scientist will be part of a dynamic team of genetics and computer science experts and will be expected to work both independently and collaboratively.

Required Qualifications

  • Master's degree, completed or expected within the next 6 months, in Data Science, Bioinformatics, or a related field
  • Experience with statistical software tools such as R, MATLAB, SAS, or equivalent in an academic or professional environment
  • Experience with scientific programming in a language such as Python, C/C++, Perl, Java, or equivalent in an academic or professional environment
  • Experience with implementation and optimization of machine learning models
  • 2+ years of experience applying deep learning models
  • Coursework or relevant experience in applied statistics
  • Excellent written and verbal English communication skills
  • Candidates must be able to provide examples of prior work to demonstrate writing and scientific programming skills

Preferred Qualifications

  • Experience working with biological data types, particularly genomic data
  • Experience working with large data sets
  • Ability to share code samples from previous ML projects
  • Kaggle competition experience