The Role: Critical to achieving DueDil's vision is our ability to combine multiple disparate data sources from different providers into a unified view of companies and the people who run them. This requires us to develop web crawlers, automated matching algorithms, machine learning models and complex ETL processes to tie all these components together. As a Senior Data Engineer, you'll be expected to enhance and expand our data processing toolset to support our international expansion effort, while maintaining the quality and reliability of our existing data products and services. This will mean dealing with challenges such as order-of-magnitude increases in data volumes, assessing the quality of data from multiple suppliers and building pipelines to match and extract valuable insight from these datasets. You will be working in a team of experienced Data Engineers and Data Scientists building next-generation tools and transforming the Fintech industry.

We are looking for: 

  • Proven track record of leading complex ETL and Data Infrastructure projects
  • Demonstrable ability to work with high-volume, heterogeneous data using distributed systems such as Hadoop or Spark
  • Expert knowledge in one or more of the following languages – Python, Scala, Java
  • Strong understanding of data structures and algorithms
  • Deep knowledge of data modeling, data access, and data storage techniques
  • Familiarity with Unix systems, common command-line tools (e.g. grep, awk) and source control tools (e.g. git)
  • Familiarity with Machine Learning

Our Tech Stack:

  • Python
  • Scala
  • Spark
  • Hadoop
  • Elasticsearch
  • PostgreSQL
  • TinkerPop
  • Redis
  • Kafka
