1. PySpark and Spark: Proficiency in PySpark, including the Spark DataFrame API and the RDD (Resilient Distributed Dataset) programming model (see sketch 1 after this list). Knowledge of Spark internals, data partitioning, and optimization techniques is advantageous.
2. Data Manipulation and Analysis: Ability to manipulate and analyze large datasets using PySpark’s DataFrame transformations and actions. This includes filtering, aggregating, joining, and performing complex data transformations (see sketch 2 below).
3. Distributed Computing: Understanding of distributed computing concepts such as parallel processing, cluster management, and data partitioning (see sketch 3 below). Experience with Spark cluster deployment, configuration, and optimization is valuable.
4. Data Serialization and Formats: Knowledge of different data serialization formats such as JSON, Parquet, Avro, and CSV (see sketch 4 below). Familiarity with handling unstructured data and working with NoSQL databases such as Apache HBase or Apache Cassandra.
5. Data Pipelines and ETL: Experience building data pipelines and implementing Extract, Transform, Load (ETL) processes using PySpark (see sketch 5 below). Understanding of data integration, data cleansing, and data quality techniques.
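Sketch 1 (DataFrame API vs. RDD): a minimal example contrasting the schema-aware DataFrame API with the lower-level RDD model. The application name, column names, and sample rows are illustrative assumptions, not requirements from the posting.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dataframe-vs-rdd").getOrCreate()

# DataFrame API: schema-aware and optimized by Spark's Catalyst planner.
# The sample data and column names here are hypothetical.
df = spark.createDataFrame([("alice", 34), ("bob", 28)], ["name", "age"])
df.filter(df.age > 30).show()

# RDD view of the same data: row-at-a-time functional transformations.
rdd = df.rdd.map(lambda row: (row.name, row.age + 1))
print(rdd.collect())

spark.stop()
```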
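Sketch 2 (DataFrame transformations and actions): filtering, joining, and aggregating in one chained expression. The orders and customers tables are made-up illustrations; `show()` is the action that triggers the otherwise lazy plan.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transformations").getOrCreate()

# Hypothetical fact and dimension tables.
orders = spark.createDataFrame(
    [(1, "a", 10.0), (2, "b", 25.0), (3, "a", 5.0)],
    ["order_id", "customer_id", "amount"],
)
customers = spark.createDataFrame(
    [("a", "Alice"), ("b", "Bob")], ["customer_id", "name"]
)

# Filter, join, then aggregate: total spend per customer on orders over 4.0.
result = (
    orders.filter(F.col("amount") > 4.0)
    .join(customers, on="customer_id", how="inner")
    .groupBy("name")
    .agg(F.sum("amount").alias("total_amount"))
)
result.show()  # the action that actually executes the plan

spark.stop()
```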
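Sketch 3 (data partitioning): inspecting and changing a DataFrame's partition count. `repartition` performs a full shuffle, optionally by column, while `coalesce` only merges partitions; the counts chosen here are arbitrary.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partitioning").getOrCreate()

df = spark.range(1_000_000)  # a single "id" column, 0..999999

print(df.rdd.getNumPartitions())  # partition count Spark chose by default

# repartition(n, col) shuffles the data, co-locating rows that share a key
# before a wide operation such as a join or groupBy.
repartitioned = df.repartition(8, "id")

# coalesce(n) reduces the partition count without a full shuffle,
# which is useful just before writing out a small result.
shrunk = repartitioned.coalesce(2)
print(shrunk.rdd.getNumPartitions())

spark.stop()
```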
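Sketch 4 (serialization formats): writing and reading the same DataFrame as JSON, Parquet, and CSV. The /tmp paths are placeholders; Avro follows the same pattern but requires the separate spark-avro package, so it is omitted here.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("formats").getOrCreate()

df = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "label"])

# Write the same data in three formats (paths are hypothetical).
df.write.mode("overwrite").json("/tmp/demo_json")
df.write.mode("overwrite").parquet("/tmp/demo_parquet")
df.write.mode("overwrite").csv("/tmp/demo_csv", header=True)

# Parquet stores the schema with the data; CSV needs it inferred or declared.
parquet_df = spark.read.parquet("/tmp/demo_parquet")
csv_df = spark.read.csv("/tmp/demo_csv", header=True, inferSchema=True)
parquet_df.printSchema()
csv_df.printSchema()

spark.stop()
```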
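Sketch 5 (a small ETL pipeline): extract from CSV, apply cleansing and a basic quality filter, and load as partitioned Parquet. Every path and column name is an assumption made for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl").getOrCreate()

# Extract: read raw CSV (hypothetical path and columns).
raw = spark.read.csv("/data/raw/events.csv", header=True, inferSchema=True)

# Transform: deduplicate, enforce required fields, derive a date column.
clean = (
    raw.dropDuplicates(["event_id"])
    .na.drop(subset=["user_id"])              # data quality: user_id required
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_date").isNotNull())  # reject unparseable timestamps
)

# Load: partitioned Parquet for downstream consumers.
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "/data/curated/events"
)

spark.stop()
```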