Data engineering concerns itself with the practical applications of data work. To be able to collect, use and analyse data, you first need technical infrastructure and mechanisms to ‘house’ that data. Building, optimizing and maintaining this house is the work of the data engineer.

whiletrue founders Adam and Paul have decades of experience working on data engineering projects for major national governments, multinational companies and global NGOs. Between them, they have held positions including CTO and Engineering Lead at the international Open Knowledge Foundation (OKF), CEO and Senior Engineer at an enterprise-focused data engineering company, and founder of the Israeli open data nonprofit The Public Knowledge Workshop (‘Hasadna’). Paul is also the co-author of ‘A Frictionless Approach to Statistics and SDG Indicators’, a whitepaper commissioned by the World Bank that looks at how statistical organizations can build more efficient data pipelines.

Increasing efficiency within data systems is a key concern for data engineers. A single piece of data undergoes multiple transformations between being ingested into a system and being published, and this process often involves a great deal of friction. With so much effort spent wrangling data, organizations are limited in the time and resources they can invest in actually using and analysing that data. Moreover, many organizations still rely on slow and tedious manual workflows when they could be benefiting from automated processes.

Data engineers can work on numerous projects within any one data management system. Their work starts at the point at which data enters a system (‘data ingestion’). Data collected from different sources may arrive in diverse formats, such as Excel, CSV and JSON files, and need ‘transforming’ into a single format. This helps to make data more ‘interoperable’, i.e. you can more easily compare and combine different pieces of data when they share the same format, which allows for the freer flow of information through a system.
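
To make the idea of ‘transforming’ concrete, here is a minimal sketch in Python of how data arriving as CSV and JSON might be brought into one shared format. It is an illustration only, not a description of any particular whiletrue pipeline: the function names (load_records, to_common_format), the sample field names and the choice of a JSON array of string-valued records as the common format are all assumptions made for this example. Excel files are left out because reading them would need a third-party library.

```python
import csv
import io
import json


def load_records(text, fmt):
    """Parse raw text in a known source format into a list of dict records."""
    if fmt == "csv":
        # DictReader uses the first row as field names.
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]
    if fmt == "json":
        data = json.loads(text)
        # Accept either a single object or an array of objects.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported format: {fmt}")


def to_common_format(sources):
    """Combine (text, fmt) pairs into one JSON array with string values."""
    records = []
    for text, fmt in sources:
        for record in load_records(text, fmt):
            # CSV yields strings while JSON may yield numbers, so cast
            # everything to str to give every record the same shape.
            records.append({key: str(value) for key, value in record.items()})
    return json.dumps(records, sort_keys=True)


# Hypothetical sample inputs, invented purely for illustration.
csv_text = "id,amount\nA,10\n"
json_text = '[{"id": "B", "amount": 20}]'

combined = to_common_format([(csv_text, "csv"), (json_text, "json")])
print(combined)
```

Once both sources share the same shape, records that started life in different files can be compared and combined directly, which is the practical meaning of interoperability here.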

If you want to improve existing data architecture, or are planning to start building a system for managing data, whiletrue is available for both consulting and delivery engagements.

