Python and PySpark Data Engineer - Remote

NTT DATA, Inc.
United States, Texas, Frisco
Nov 18, 2024

Req ID: 294348

NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now.

We are currently seeking a Python and PySpark Data Engineer - Remote to join our team in Frisco, Texas (US-TX), United States (US).

Key Responsibilities:

Work in an agile environment with a development team to:

  • Design, develop, and maintain scalable data pipelines using software development patterns.
  • Implement data processing solutions using Python and PySpark on a cloud-native Lakehouse data platform.
  • Write efficient SQL queries to extract, transform, and load data.
  • Collaborate with product management and analysts to understand data requirements and deliver solutions.
  • Optimize and troubleshoot data pipelines for performance and reliability.
  • Ensure data quality and integrity through comprehensive testing and validation processes.
  • Follow DevOps principles and use CI/CD to deploy and operate data pipelines.

Required Skills:

  • Proficiency in Python and PySpark.
  • Strong experience with SQL and database management.
  • Knowledge of software development patterns and best practices in data engineering.
  • Experience with ETL/ELT processes and data pipeline orchestration.
  • Proficiency with version control, automated testing, and deployments using Git-based tools such as GitHub and GitHub Actions.

Preferred Skills:

  • Understanding of testing methodologies for data pipelines, including unit testing, integration testing, and end-to-end testing.
  • Knowledge of data governance and data security best practices.
  • Familiarity with data warehousing concepts and tools.
  • Experience with cloud platforms (e.g., Azure, AWS, GCP), with Azure preferred.
  • Knowledge of big data technologies (e.g., Microsoft Fabric, Azure Synapse, Lakehouse, Databricks).
  • Familiarity with advanced data orchestration tooling and development frameworks like dbt or Airflow.
  • Experience working in a healthcare-related industry.

Qualifications:

  • Intense intellectual curiosity and an ability to view old problems with a fresh perspective.
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in data engineering or a related role.
  • Strong problem-solving skills and attention to detail.
  • Excellent communication and teamwork abilities.

About NTT DATA

NTT DATA is a $30+ billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. We invest over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure, and connectivity. We are also one of the leading providers of digital and AI infrastructure in the world. NTT DATA is part of NTT Group and headquartered in Tokyo. Visit us at nttdata.com
