
Data Engineering

Every AI and analytics application depends on a layer of data engineering to ensure that the data science produces meaningful answers and avoids the "garbage in, garbage out" problem.

Whether we use a data-lake or a data-warehouse model, data feeds must be configured and data collected and stored. Below are some of the areas we work in:

  • Data Verification: this uses statistical, semantic and other techniques to ensure that onboarded data is wrangled and cleansed. Our mathematicians and computer scientists use a variety of tools and approaches depending on the level of structure in the data.
  • Big Data Engineering: this focuses on the scaling of data and the issues that applications encounter with large volumes of data. We have experience working with large datasets, including sensor-derived data and large corpora of text.
  • Cloud Computing: this incorporates flexible sets of tools for rapidly and effectively experimenting with scaled computing environments. Depending on the use case, our experts will work either entirely in the Cloud or instantiate an on-premises workflow.
  • Infrastructure: we work flexibly with computing infrastructure according to the use case. Cloud infrastructure is very useful for starting projects where we will use open-source tools and data, and it scales flexibly so we can trial and test tools rapidly and cheaply. We have a number of in-house experts and also work with partners to cover all domains.
  • On-Premises Hardware/Cluster Computing: this is very useful for running known high-intensity types of base-load computing, or when we need to use sensitive data. We have a long track record of developing and using high-performance clusters, and of working with sensitive data for a range of military, government and commercial customers.
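To make the data-verification idea above concrete, here is a minimal sketch of one common statistical check: flagging readings whose z-score exceeds a threshold so they are reviewed rather than passed silently downstream. The function name, threshold and sample data are illustrative assumptions, not part of any specific QinetiQ pipeline.

```python
from statistics import mean, stdev

def flag_outliers(values, z_threshold=2.0):
    """Flag values whose z-score exceeds z_threshold.

    A toy statistical data-verification step: readings far from
    the sample mean are marked for review instead of being
    silently forwarded to downstream analytics.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values identical: nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mu) / sigma > z_threshold for v in values]

# Hypothetical sensor feed with one suspect reading (95.0).
readings = [20.1, 19.8, 20.3, 20.0, 95.0, 20.2]
flags = flag_outliers(readings)
# The 95.0 reading is flagged; the rest pass.
```

In practice such checks are one layer among many; semantic rules (valid ranges, units, schema conformance) would sit alongside the purely statistical ones.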

Data warehouse

Image: Optasense (a QinetiQ company) data centre