Last updated: September 29, 2025

Senior Data Engineer

🌍 100% Remote 💬 English ✈️ International position 🧓🏽 Senior

Via Greenhouse

About

What your day-to-day work will look like

  • Architect and evolve scalable infrastructure to ingest, process, and serve large volumes of data efficiently. 
  • Lead improvements to existing frameworks and pipelines to ensure performance, reliability, and cost efficiency.
  • Establish and maintain robust data governance practices that empower cross-functional teams to access and trust data.
  • Transform raw datasets into clean, usable formats for analytics, modeling, and reporting (a minimal sketch follows this list).
  • Investigate and resolve complex data issues, ensuring data accuracy and system resilience. 
  • Maintain high standards for code quality, testing, and documentation, with a strong focus on reproducibility and observability. 
  • Stay current with industry trends and emerging technologies to continuously raise the bar on our engineering practices. 
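
To give a flavor of the transformation work above, here is a minimal PySpark sketch of turning a raw dataset into an analytics-ready, partitioned table. It is illustrative only: the paths and column names (event_id, event_ts) are hypothetical assumptions, not part of the posting.

```python
# Minimal sketch: raw JSON events -> clean, partitioned Parquet.
# All paths and column names are hypothetical illustrations.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("clean_events").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")   # hypothetical source

clean = (
    raw
    .filter(F.col("event_id").isNotNull())                 # drop malformed rows
    .withColumn("event_ts", F.to_timestamp("event_ts"))    # normalize types
    .withColumn("event_date", F.to_date("event_ts"))       # derive partition key
    .dropDuplicates(["event_id"])                          # enforce uniqueness
)

(clean.write
      .mode("overwrite")
      .partitionBy("event_date")                           # partition for pruning
      .parquet("s3://example-bucket/clean/events/"))       # hypothetical sink
```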

 

What would make you the ideal candidate

  • Bachelor's Degree in Computer Science, Engineering, or a related field. 
  • 5+ Years of experience in data engineering, designing, building, and operating scalable data ingestion, processing, and serving layers in production.
  • 5+ Years of experience working with SQL for analytics, transformations, and performance optimization (windowing, partitioning, query tuning); see the windowing sketch after this list.
  • 5+ Years of experience working with Python for data manipulation, pipeline development, packaging, and integration (e.g., PySpark, pandas). 
  • 5+ Years of experience working with data modeling for Data Warehouses/Lakehouse (dimensional, wide tables, partitioning, Z-ordering, schema evolution) and building efficient ELT/ETL pipelines. 
  • 3+ Years of experience working with distributed data processing (Apache Spark) for batch and/or streaming at scale. 
  • 3+ Years of experience working with cloud platforms (AWS and/or GCP) for data engineering workloads (storage, compute, IAM, networking, cost controls). 
  • 2+ Years of experience working with implementing data governance at scale (policies, lineage, quality checks, access controls) across multiple domains/teams. 
  • 2+ Years of experience working with improving existing data frameworks/pipelines for performance, reliability, and cost efficiency (profiling, caching, partitioning, autoscaling). 
  • 1+ Years of experience working with automated testing and CI/CD for data pipelines (unit/integration tests, data quality tests, environment promotion) and observability (logging/metrics/tracing); see the testing sketch after this list.
  • Expert-level SQL (query design, performance tuning, troubleshooting execution plans). 
  • Experience with API-based integrations (JDBC/ODBC, REST, SOAP) and data ingestion patterns. 
  • Intermediate shell scripting (Bash and/or PowerShell). 
  • Logging, auditing, and tracing for data platforms; strong observability mindset. 
  • Data security and privacy (PII handling, encryption, tokenization, IAM, least privilege). 
  • Automation of data pipelines and environments (schedulers, CI/CD, packaging, reproducibility).
  • Hands-on with diverse data types and formats (structured, semi-structured, unstructured; JSON, Avro, ORC, Parquet; columnar vs. row).
  • Proficiency with Git-based workflows (branching strategies, code reviews, PRs). 
  • Understanding of modern data architectures: Data Lake, Data Warehouse, Lakehouse, Data Mesh, Data Fabric, Delta/transactional lakes. 
  • An advanced English level is required for this role, as you will work with US clients. Effective communication in English is essential to deliver the best solutions to our clients and to expand your horizons.
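
To make the SQL windowing requirement concrete, here is a small sketch using Spark SQL to keep only the most recent row per key with ROW_NUMBER(). The orders table and its columns are hypothetical assumptions for illustration.

```python
# Hypothetical windowing example: latest order per customer via ROW_NUMBER().
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("window_example").getOrCreate()

latest_orders = spark.sql("""
    SELECT *
    FROM (
        SELECT o.*,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id   -- one window per customer
                   ORDER BY order_ts DESC     -- newest first
               ) AS rn
        FROM orders o                         -- hypothetical table
    ) t
    WHERE t.rn = 1                            -- keep only the latest row
""")
latest_orders.show()
```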
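Similarly, for the automated-testing requirement, here is a minimal pytest-style data quality check, the kind of unit test a CI/CD stage could run before promoting a pipeline. The dataset contract (event_id, amount) is a hypothetical illustration.

```python
# Hypothetical pytest-style data quality checks for a cleaned dataset.
import pandas as pd

def check_events(df: pd.DataFrame) -> None:
    """Raise AssertionError if the frame violates its data contract."""
    assert df["event_id"].notna().all(), "event_id must never be null"
    assert df["event_id"].is_unique, "event_id must be unique"
    assert (df["amount"] >= 0).all(), "amount must be non-negative"

def test_clean_events_contract():
    # In a real suite this fixture would load a small sample of pipeline output.
    df = pd.DataFrame({
        "event_id": [1, 2, 3],
        "amount": [10.0, 0.0, 5.5],
    })
    check_events(df)  # passes: contract holds for this sample
```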

Other Information

We have highlighted the key details of the position. For the full description, click "acessar" (access).

