CKAN
The Comprehensive Knowledge Archive Network (CKAN) is the world’s leading open-source software for building data management solutions. Whether your organization is big or small, public or private, CKAN enables you to create tailored systems for publishing, sharing, discovering and using data.
while true founders Adam and Paul have played pivotal roles in major CKAN projects since the software’s inception in 2014. They have both held technical leadership positions at two organizations with strong links to CKAN, the Open Knowledge Foundation (OKF) and Datopian, during which time Paul was elected as a CKAN steward and steering group member. Adam continues to promote CKAN as a solution for transforming government data into public knowledge through The Public Knowledge Workshop (‘Hasadna’), an Israeli nonprofit of which he is co-founder.
Generally speaking, the approach taken to CKAN depends on whether an organization needs to manage private, public or restricted data use cases. Though it is not a set rule, governments tend to use CKAN to build open data portals (i.e. for a public data use case), while companies and nonprofits mainly use it for internal data management purposes (i.e. for private data use cases). Research institutions are a good example of a type of organization that tends to have restricted data use cases, as they often need to process personally identifiable information (PII).
CKAN is used by organizations large and small. It is the engine behind major open data portals like data.gov, data.gov.uk and data.gov.au, while also powering business intelligence workflows at startups and SMEs. It can be used to set up small-scale, ‘out-of-the-box’ solutions as well as highly customized systems with technically advanced features like data versioning and nuanced access permission layers. In any case, CKAN’s open-source codebase means that organizations are free to extend their CKAN instances as and when they see fit, making it a sustainable, attractive choice for growing organizations.
Whatever your use case, while true will work with you to design, implement and maintain a CKAN solution that meets your needs.
Data Engineering
Data engineering concerns itself with the practical applications of data work. To be able to collect, use and analyse data, you first need technical infrastructure and mechanisms to ‘house’ that data. Building, optimizing and maintaining this house is the work of the data engineer.
while true founders Adam and Paul have decades of experience working on data engineering projects for major national governments, multinational companies and global NGOs. Between them, they have held positions including CTO and Engineering Lead at the international Open Knowledge Foundation (OKF), CEO and Senior Engineer at an enterprise-focused data engineering company, and founder of the Israeli open data nonprofit The Public Knowledge Workshop (‘Hasadna’). Paul is also the co-author of ‘A Frictionless Approach to Statistics and SDG Indicators’, a whitepaper commissioned by the World Bank that looks at how statistical organizations can build more efficient data pipelines.
Increasing efficiency within data systems is a key concern for data engineers. A single piece of data undergoes multiple transformations between being ingested into a system and being published, and there is often a great deal of friction involved in this process. With so much effort spent wrangling data, organizations have limited time and resources left to invest in actually using and analysing that data. Moreover, many organizations still rely on slow and tedious manual workflows when they could be benefitting from automated processes.
Data engineers can work on numerous projects within any one data management system. Their work starts at the point at which data enters a system (‘data ingestion’). Data collected from different sources may arrive in diverse formats, such as Excel, CSV and JSON files, and need ‘transforming’ into a single format. This helps to make data more ‘interoperable’, i.e. you can more easily compare and combine different pieces of data when they share the same format, which allows for the freer flow of information through a system.
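As a rough illustration of the ‘transforming’ step, the sketch below reads the same kind of record from a CSV source and a JSON source and coerces both into one shared shape. The field names (city, population), the sample values and the helper functions are all made up for the example, and a real pipeline would need error handling and schema validation on top of this.

```python
import csv
import io
import json


def rows_from_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def normalize(row):
    """Coerce one record into the shared shape: lowercase keys, typed values."""
    clean = {key.strip().lower(): value for key, value in row.items()}
    return {
        "city": str(clean["city"]).strip(),
        "population": int(clean["population"]),
    }


# Hypothetical inputs: the same kind of data arriving in two formats.
csv_source = "City,Population\nTown A,1200\n"
json_source = '[{"city": "Town B", "population": "3400"}]'

raw_rows = rows_from_csv(csv_source) + rows_from_json(json_source)
combined = [normalize(row) for row in raw_rows]
print(combined)
```

Once every record shares the same keys and types, rows from different sources can be compared, merged or loaded into a single table without per-source special cases.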
If you want to improve existing data architecture, or are planning to start building a system for managing data, while true is available for both consulting and delivery engagements.
Data Management
Simply put, data management refers to the process of collecting and organising data within an organization. Having a data management strategy and systems for managing data is becoming increasingly important for modern organizations as they respond to the increasing volume and variety of data being produced in the world today.
while true founders Adam and Paul have worked on some of the world’s leading data management projects and software. They’ve collaborated with national governments, multinational companies, and global NGOs to design, implement and maintain systems tailored to individual objectives, requirements and budgets. With former titles including CTO and Engineering Lead at the Open Knowledge Foundation (OKF), CEO of the data management company Datopian, and founder of The Public Knowledge Workshop (‘Hasadna’), Adam and Paul have been advising organizations on data management since even before it became an established term.
There are various approaches to data management. This is because different organizations have different use cases for their data: governments often use data to drive policy or to publish it for public consumption; private companies tend to use data to help them make smarter business decisions; and nonprofits usually use data to raise public awareness on certain issues and influence government policy. Individual organizations may also have their own unique requirements, such as a need to share data securely with external partners, integrate different access permission layers, or version their data for auditing purposes.
Other factors also affect approaches to data management, such as the amount and diversity of data being collected, the nature of that data (i.e. whether it is public or private), and its quality at the point of collection. When designing a suitable system, organizations also have to consider the amount of human and financial capital they have already invested in existing systems, as well as the different levels of technical expertise among key stakeholders. They may also face external limitations, such as procurement requirements, interdepartmental dependencies or financial constraints.
Whatever your data use case, while true will work with you to design a data management strategy and/or system that best suits your organization.
Open Data
Open data is public data (such as data relating to taxes, the climate, education, public spending etc) that has been made ‘open’ (available for all to see, use and distribute without cost or need to seek permission) by means of an open licence.
while true founders Adam and Paul have decades of experience supporting organizations worldwide to embrace the benefits of open data. They spent years at the forefront of open data thought leadership while working in senior management at the world’s largest open data nonprofit, the Open Knowledge Foundation (OKF), and worked together as CEO and Senior Engineer of a private open-source software company with ties to OKF. Both are particularly active in the Israeli open data space, with Paul having played a major role in opening up the country’s municipal data, work which Adam continues through the nonprofit The Public Knowledge Workshop (‘Hasadna’), of which he is co-founder.
Governments all over the world publish open data via what are known as open data portals or platforms like data.gov, data.gov.uk and data.gov.au. It is also becoming more common for other types of organization, particularly private companies, to openly publish certain datasets. An energy provider, for example, might publish energy consumption data to support climate change research, or a fashion retailer might publish data to make their supply chain more transparent. In any case, organizations can have multiple motivations for publishing data openly, among them transparency, innovation and compliance.
There are many practical considerations that need to be taken into account when publishing or working with open data. A popular tool for publishing open data is the Comprehensive Knowledge Archive Network (CKAN), open-source software that is trusted by multiple governments and other major organizations. One feature of CKAN that makes it so popular for publishing open data is its rich options for metadata tagging. Having quality metadata (data about data, like when it was produced, who produced it, when it was last updated etc) is crucial for data governance and for ensuring that the end consumer has all the information they need to validate and use the data. Ensuring the published data is licensed correctly is also key to maximising open data’s impact.
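To make the idea of ‘quality metadata’ concrete, here is a minimal sketch of a check that flags missing fields before a dataset is published. The field names are loosely modelled on those used for CKAN datasets, but the required-field list, the sample record and the missing_metadata helper are assumptions made for illustration, not part of CKAN itself.

```python
# Hypothetical publishing policy: fields every dataset record must fill in.
REQUIRED_FIELDS = (
    "title",
    "author",
    "license_id",
    "metadata_created",
    "metadata_modified",
)


def missing_metadata(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in required
        if not str(record.get(field) or "").strip()
    ]


# Made-up dataset record with the licence left blank.
dataset = {
    "title": "Example spending data",
    "author": "Example Finance Office",
    "license_id": "",
    "metadata_created": "2020-01-15",
    "metadata_modified": "2020-06-01",
}

print(missing_metadata(dataset))
```

A check like this could run as a gate in a publishing workflow, so that a dataset without a licence or a last-updated date is caught before it reaches the portal.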
If your organization needs to publish open data for the purposes of compliance or as part of a transparency initiative, or you are simply interested in exploring the possibility of using or publishing open data, while true can help. We can design, implement and maintain the right technical infrastructure, help you choose the right licence, and advise you on best practices.
Open Source
Open-source software refers to a type of computer software that is free to use, is not ‘owned’ by any one organization and can be freely copied, distributed and modified under an open licence.
while true founders Adam and Paul have dedicated their professional careers to advocating for open-source data management solutions. In their respective former roles as Engineering Lead and CTO at the Open Knowledge Foundation (OKF), Adam and Paul were some of open source’s earliest proponents. With Adam’s additional experience as founder of the nonprofit The Public Knowledge Workshop (‘Hasadna’) and Paul’s work as CEO of the open-source consultancy company Datopian, both have helped dozens of organizations worldwide to make the transition to open-source software.
Adopting open-source software brings considerable benefits to all kinds of organization. The most attractive advantage to many is that open-source software provides a more economical and sustainable alternative to proprietary software. It costs no money to use and eliminates the possibility of what is known as ‘vendor lock-in’. Moreover, because open-source software can be freely modified, you can tweak and extend it as your company grows, new use cases arise and your business priorities shift. This makes open source an attractive option for organizations looking to future-proof their software ecosystem.
Open-source software also appeals greatly to socially conscious organizations, particularly governments and nonprofits. With its open codebase and emphasis on community and transparency, open-source software is ‘democratic’ and lends itself nicely to open data initiatives. Additionally, because open-source software improves with use, contributing to the development of open-source projects and giving back to the open-source community can complement philanthropic programs.
If you are interested in learning more about how open-source software could benefit your organization, while true can advise you on best approaches, recommend the right software, and help you get set up.