Last updated: June 17, 2025
Senior Data Platform Engineer
Via Lever
About
Responsibilities:
- Design & Optimization: Build and fine-tune data clusters to support both batch and streaming workloads, ensuring optimal performance and reliability.
- Platform Development: Build and expand our Spark, Hadoop, Kubernetes, Trino, Delta Lake, and Druid ecosystems to meet evolving business needs, adding new integrations, data ingestion pipelines, and data transforms as needed.
- Innovation: Introduce and scale new data platform solutions, iterating on our OLAP platforms and exploring next-generation data formats.
- Collaboration: Work closely with cross-functional teams, including infrastructure engineers, to align platform capabilities with organizational goals.
Required qualifications:
- Distributed Systems Expertise: Proven experience in scaling and tuning large deployments of Spark-on-Kubernetes and Spark-on-Hadoop.
- Object Storage Solutions: Knowledge of open-source S3 alternatives, including Ceph and MinIO.
- Storage Systems Knowledge: In-depth understanding of Hadoop and the HDFS protocol.
- Performance Tuning: Skilled in designing and optimizing shuffle-heavy systems, utilizing YARN or Kubernetes with remote shuffle services.
- Lakehouse Technologies: Hands-on experience with at least one lakehouse table format, such as Delta Lake, Apache Iceberg, or Apache Hudi.
- OLAP Systems: Familiarity with OLAP technologies, including ClickHouse, Apache Druid, Apache Pinot, or Apache Doris.
- Communication Skills: Strong ability to collaborate with diverse stakeholders and effectively communicate complex technical concepts.
- Problem-Solving: Proven track record of troubleshooting and resolving issues in large-scale, production environments.
Preferred qualifications:
- Advanced Data Formats: Experience with next-generation and multi-modal data formats, such as Lance (the format behind LanceDB).
- Self-Service Platforms: Background in building self-service stateful platforms.
- Accelerated Runtimes: Familiarity with native or accelerated runtimes for Spark, such as Apache DataFusion Comet, Apache Gluten, or NVIDIA RAPIDS.