
Data Engineer II (BTI, Bioinformatics)

Children's National Medical Center
remote work
United States, D.C., Washington
Dec 27, 2024
Description

The Brain Tumor Institute (BTI) Bioinformatics Core at Children's National Hospital is seeking a highly skilled Bioinformatics Engineer to join our team. This position will play a critical role in advancing the research of multiple PIs to uncover oncogenic mechanisms in pediatric brain tumors and identify novel therapeutic targets.

The role involves close collaboration with the Director of the BTI, researchers, and clinicians at Children's National. The successful candidate will lead implementation of Gabriella Miller Kids First workflows using AWS and/or CAVATICA, as well as new workflow development and benchmarking in CWL or Nextflow. The candidate will also lead data modeling and database generation for current and incoming BTI genomics data files and may be responsible for project-specific engineering needs, such as API/UI development.

Key Responsibilities:



  • Responsible for implementation of Gabriella Miller Kids First workflows using AWS, CAVATICA, and/or other cloud-based tools.





  • Collaborate with bioinformatics scientists and PIs to benchmark and optimize new production-scale analysis pipelines and workflows that generate high-quality outputs with high data integrity.





  • Support project-specific engineering needs, such as database/API/UI development





  • Collaborate with an AWS architect and/or IT to optimize resource use as needed





  • Be responsible for AWS bucket organization and file storage management





  • Create and maintain clear documentation for data engineering workflows, including codebases, data pipelines, validation, testing, and CI/CD processes.





  • Create and implement a data model and database for BTI genomics data




Application Process:



This position may be remote or hybrid. Candidates should be prepared to share their GitHub handle and present a recent project as part of the interview process.

Qualifications

Required Qualifications:



  • Bachelor's degree in a computational discipline or systems engineering





  • At least three (3) years of experience in a production clinical or research bioinformatics data processing role required.





  • Experience with cloud and high performance compute environments including creation and use of virtual machines, virtual environments, and job submissions





  • Experience profiling performance, benchmarking, and optimizing multiple job types and scenarios in bioinformatics data processing.





  • Experience with current standard parallel computing and data processing workflow systems is essential (e.g., Snakemake, Nextflow, CWL, WDL).





  • Experience with reproducible pipeline development, including software version control, use and creation of Docker and/or Singularity images, and collaborative code review





  • Experience diagnosing and troubleshooting pipeline errors and unexpected behaviors. This includes taking initiative, whether through debugging, searching online, contacting software authors, or otherwise seeking assistance as needed.





  • Willingness to contribute to open source software development for the purposes of improving/meeting community standards when appropriate





  • Attention to detail.





  • Experience benchmarking and implementing harmonization workflows for a variety of NGS data types.





  • Ability to independently plan and execute pipelines and workflows of high complexity required.





  • Ability to independently engineer systems relative to larger enterprise frameworks required.






  • Strong UNIX/LINUX expertise required.





  • Expertise in support mechanisms for applications written in common bioinformatics languages such as R, Python, Perl, Java, or similar required





  • Expertise in common bioinformatics applications, data sources, and data formats required.





  • Knowledge of common NGS or other high-throughput data formats is required.





  • Expertise with resources of genomic data sets and analysis tools, such as GATK, UCSC Genome Browser, Bioconductor, ENCODE, and NCBI databases is required.





  • Demonstrated ability to develop and implement best practices for bioinformatics systems integration, testing, and deployment is required.





  • Ability to lead discussions with various information systems and technology owners to achieve desired bioinformatics outcomes is required.



Preferred Education:



  • Master's degree in a computational discipline or systems engineering



Organizational Accountabilities
Organizational Accountabilities (Staff)
Organizational Commitment/Identification



  • Anticipate and respond to customer needs; follow up until needs are met


Teamwork/Communication



  • Demonstrate collaborative and respectful behavior


  • Partner with all team members to achieve goals


  • Receptive to others' ideas and opinions


Performance Improvement/Problem-solving



  • Contribute to a positive work environment


  • Demonstrate flexibility and willingness to change


  • Identify opportunities to improve clinical and administrative processes


  • Make appropriate decisions, using sound judgment


Cost Management/Financial Responsibility



  • Use resources efficiently


  • Search for less costly ways of doing things


Safety



  • Speak up when team members appear to exhibit unsafe behavior or performance


  • Continuously validate and verify information needed for decision making or documentation


  • Stop in the face of uncertainty and take time to resolve the situation


  • Demonstrate accurate, clear and timely verbal and written communication


  • Actively promote safety for patients, families, visitors and co-workers


  • Attend carefully to important details - practicing Stop, Think, Act and Review in order to self-check behavior and performance

Primary Location : District of Columbia-Washington
Work Locations :
Remote Work Location
111 Michigan Avenue NW
Washington 20010
Job : Information Technology
Organization : Ctr Cancer & Immunology Rsrch
Position Status : R (Regular) - FT - Full-Time
Shift : Day
Work Schedule : 9:00 AM - 5:30 PM
Job Posting : Dec 27, 2024, 6:20:27 PM
Full-Time Salary Range : 92684.8 - 154460.8