Senior Data Engineer



We’re looking for minds that match the company’s phenomenal growth attitude. Built on individual initiative, raw energy and a go-getter attitude, we have a foosball table at the centre of our discussions! They say you can judge a team by its refrigerator; at KhataBook, it’s always stocked.

At KhataBook, we’re building utility solutions for the nation’s 60-million-strong SMB sector. Our first offering – an Android app enabling businesses to digitally record the credit they extend to customers – went viral. In a matter of 6 months, KhataBook has been downloaded 7M times, and more and more businesses are becoming a part of this network. KhataBook recorded transactions totalling $3 Bn in August 2019.

Some context on the market: 90% of India’s ~$1 trillion retail market is controlled by the traditional/unorganized sector, which means ~$900B worth of commerce flows through ~60M small & medium shops/warehouses/kiosks/homes scattered all over the country – from mighty metros to tiny villages. A network powered by millions of businesses, built in turn to power these businesses – that’s our goal at KhataBook.

Why our users love us: The app effectively cuts merchants’ receivables in half within a few weeks of onboarding. Its users recovered approximately $5 Bn stuck in credit in six months, significantly impacting India’s MSME sector.

Where we currently are: KhataBook recently closed a $25 Million Series A Round and is backed by the top investors in the business. Founded by four IIT-Bombay grads, KhataBook is led by Ravish Naresh, Co-founder and COO at, one of India’s most significant property listing websites.

We’re a small team based in Bengaluru, funded by Sequoia, Info Edge & Y Combinator. We’re always looking for great folks to come and join us on this fantastic adventure.

  • Own the technical solution design, act as a technical architect and implement all stages right from data acquisition to integration, both batch and real-time
  • Drive continuous improvements in moving, aggregating, profiling, sampling, testing gigabytes/terabytes of data
  • Build data pipelines and data processing tools using open-source and proprietary technologies such as Oracle, Redshift, Hadoop, Pig, Hive, HBase, Spark, MongoDB etc.
  • Be the go-to person for product owners and analysts for ETL design and other related big data and programming technologies
  • Quickly create functioning prototypes to address rapidly changing business needs, and later revamp those prototypes into production-ready data flows.
  • Proactively identify performance & data quality problems and drive the team to remediate them
  • Harness operational excellence & continuous improvement with a can-do attitude.
  • 3+ years of intense experience with large-scale data delivery platforms and solutions, and with designing modern data systems to support exponential data growth.
  • Hands-on experience with an emphasis on data lake and data warehouse solutions, business intelligence, big data analytics, and enterprise-scale custom data products.
  • Redshift, Hadoop, and Spark platform experience is a must
  • Knowledge of data modeling techniques and high-volume ETL/ELT design.
  • Strong SQL optimization and performance tuning experience in a high-volume data environment that utilizes parallel processing.
  • Experience with version control systems (GitLab), orchestration and deployment tools (e.g. Airflow, Jenkins) and cloud platforms (AWS, GCP, Azure)
  • Hands-on experience with big data technologies like Hadoop MapReduce, Spark, Hive, Pig, HBase, Elasticsearch, and others.
  • Experience with programming languages like Java and Scala, and with scripting in Python and Bash.
  • Ability to work effectively in an unstructured and fast-paced environment, both independently and in a team setting, with a high degree of self-management, clear communication and commitment to delivery timelines.

Market Rate + ESOP


HSR Layout, Bengaluru




We’re always looking for talented people.

Send your resume to