Data Lakes: Store Everything. Analyse Anything. Build What You Need.

By Servaas Venter

A data lake is an enterprise resource that gives business units, functions and departments unprecedented freedom and flexibility to gather, analyse, and use the data they most need. In essence, business data lakes hold the keys to meeting the fast-growing business appetite for new combinations of data and to putting big data analytics to work across the enterprise. Success is measured by how the data lake, as a service, is consumed and converted into other forms of business value. Servaas Venter, Country Manager of EMC Southern Africa, believes data lakes will drive big changes.

“You can look at business data lakes in three ways: as one place to put all the data you may want to use; as a platform for big data analytics; and to help resolve the long-standing tension between the corporate push to get standard data into warehouses and used consistently, and the business unit need for local views and combinations of data that get implemented in all those Excel spreadsheets,” he says. He believes that while organisations and individuals have been generating enormous amounts of data for a long time, it is only recently that we have had technologies and methods for dealing with it with relative ease.

Applications of data lakes include those that need to analyse vast amounts of newly generated or combined data, such as analyses or predictive models of when and where power grids will fail.

Venter points out the opportunities around customer insight and experience, explaining that analysing all the data about your customers can enable you to design and develop a more compelling experience for them.

CIOs will be particularly interested in information systems security applications. Controls like firewalls and authentication are insufficient to protect an enterprise against all of today’s threats. CIOs need to be able to notice and analyse the behaviours of people or programs that have or appear to have valid credentials.

“An organisation that puts all its system logs and network activity into a data lake can get better and faster at spotting anomalies, which leads to a faster, more targeted response.” The loop can then be closed by using the information to build predictive models of when and where problems are most likely to occur.

Organisations opting to implement data lakes need to recognise that data lakes will become at least an order of magnitude bigger than the largest corporate data repositories today. Fortunately, data lakes have a far more favourable cost structure than conventional databases.

Venter explains that the biggest challenge for business leaders may be deciding what to do with data lakes, because they present so many opportunities. Because data lakes remove the constraints on how much data an organisation can work with, the opportunities become endless.

He recommends that CIOs recognise that data lakes are not on the horizon; they are here today. He also advises that investing in data lakes as a Platform-as-a-Service (PaaS) solution is wise.
