Extracting value from data

Building a big data platform to meet advertisers' needs for accountability


Clear Channel Outdoor is one of the world’s leading Out-of-Home media owners. Across their large, diverse Out-of-Home portfolio of half a million sites in 22 countries throughout Europe, Asia, the USA and Latin America, they boost brands by connecting them with the people they want to reach, with media and ideas that enlighten, entertain, charm, challenge and influence. In the UK alone, they operate more than 35,000 sites nationwide, from Inverness in Scotland to Truro in Cornwall and in every major urban area in between.



Accountability is a big challenge in the advertising industry. There have been numerous negative stories about the big digital media owners, questioning whether the viewership figures for their adverts are trustworthy and whether content is being played in appropriate places or in places that could cause reputational risk. Out-of-Home campaign delivery is relatively easy to prove when classic paper-based posters are displayed for a number of weeks: pictures can be taken of the posters, and customers can easily see their adverts. However, the industry has gone through rapid digitisation of its products, and many markets are now nearing 50% of revenue from digital. Being accountable and proving digital campaign delivery is a very different challenge from doing so for classic posters.

Clear Channel needed to quickly create a way of proving to its advertisers and agencies that their digital campaigns were delivering what Clear Channel had committed to. Their goal was to create full transparency for their customers so they could see how their campaigns were performing at a granular level. Additionally, they wanted to have this audited by a third party to provide proof of the methodology followed.



The Riverflex team led the definition, design and rollout of a new Campaign Performance solution. This provides both summary and granular information on how digital advertising campaigns are performing, on a near real-time basis, to:

  • Internal teams such as campaign planners who will adjust campaigns in-flight to ensure they meet their objectives, and operations staff who address problems with screens
  • External stakeholders such as advertisers and agencies who need proof that their advertising campaigns have delivered

To build this solution a number of sources of data needed to be integrated to provide accurate and trustworthy information.

  • An order management system stores information about the products and services the customer has contractually bought (e.g. 100,000 impacts in London, Bristol and Leeds)
  • A campaign planning system automatically schedules the campaign onto the most appropriate screens
  • A content management system associates the right content (e.g. video, image, dynamic file) to the campaign and sends it to the player
  • Over 10,000 digital screens with various types of network connections that are constantly sending data regarding what they have played, and on their technical (e.g. CPU utilisation, alarms) and physical (e.g. power, heat, door open / closed) state
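
As an illustration of why these sources have to be joined, the sketch below compares what was booked in the order management system with what the screens report having played. It is a minimal sketch under stated assumptions, not the solution that was built: the record types `OrderLine` and `PlayEvent`, their fields, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    """Hypothetical record from the order management system."""
    campaign_id: str
    city: str
    booked_impacts: int


@dataclass(frozen=True)
class PlayEvent:
    """Hypothetical playout record reported by a digital screen."""
    campaign_id: str
    city: str
    impacts: int  # assumed: impacts credited to this play


def delivery_report(orders, plays):
    """Return {(campaign_id, city): (booked, delivered, percent_delivered)}."""
    # Sum the impacts reported by the screens per campaign and city.
    delivered = defaultdict(int)
    for play in plays:
        delivered[(play.campaign_id, play.city)] += play.impacts

    # Compare each booked order line with what was actually delivered.
    report = {}
    for order in orders:
        key = (order.campaign_id, order.city)
        done = delivered[key]
        if order.booked_impacts:
            pct = round(100.0 * done / order.booked_impacts, 1)
        else:
            pct = 0.0
        report[key] = (order.booked_impacts, done, pct)
    return report


# Illustrative data only.
orders = [
    OrderLine("C-001", "London", 60_000),
    OrderLine("C-001", "Bristol", 40_000),
]
plays = [
    PlayEvent("C-001", "London", 30_000),
    PlayEvent("C-001", "London", 15_000),
    PlayEvent("C-001", "Bristol", 40_000),
]
report = delivery_report(orders, plays)
for key, (booked, done, pct) in report.items():
    print(key, booked, done, pct)
```

A campaign planner could use a view like this to spot an under-delivering city while the campaign is still in flight.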

To create this solution, a modern big-data pattern was adopted, based on the scale and complexity of the data and the need to process and visualise it on a near real-time basis.

Azure was chosen as the cloud provider. Riverflex team members drove the process to build and support the Azure technology foundation, security implementation, and continuous integration/continuous delivery (CI/CD) pipeline, and to create the application stack that supports the data lake and ETL ingestion pipeline.

The key elements of the cloud infrastructure, data and compute platform that we created are described below.


Cloud-based Account / Project Structure

A root account / sub-account / project structure was implemented to support gains in productivity, innovation, and cost reduction as the workload moved to the Azure cloud. A variety of services and features allow flexible control of cloud computing resources and of the Azure AD account(s) managing those resources. At the account level, these options are designed to help provide proper cost allocation, agility, and security. Projects were mapped one-to-one to sub-accounts. Creating a security relationship between sub-accounts was a key element, added to assess the security of cloud-based deployments, centralize security monitoring and management, manage identity and access, and provide audit and compliance monitoring services.


Project-based Implementation with Infrastructure as Code (IaC)

Infrastructure as Code (IaC) was implemented as a method to provision and manage IT infrastructure through source code, rather than through standard operating procedures and manual processes. IaC helps the DevOps team automate infrastructure deployment in a repeatable, consistent manner. It also makes it easy to deploy standard infrastructure environments in other regions where the cloud provider operates, so they can be used for backup and disaster recovery.


Serverless Compute and Storage

By employing cloud serverless compute and storage, such as Blob and object stores, the organisation can build and run applications and services that scale elastically with demand, without managing physical hardware. In addition, the costs previously associated with managing servers and containers (operating system updates, maintenance updates, image snapshots, backups, restarts, etc.) largely disappeared.


ETL and the Data Pipeline

The data pipeline acts as a utility: a standard suite of data tools that enables the DevOps team to automate the sourcing, processing, and entitlement of data. Automating these processes allows new data sources to be added quickly and their data to be extracted, transformed, combined, validated and loaded (ETL) into the cloud data lake for further use. The data pipeline can process multiple data sources simultaneously.
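
The extract, transform, validate and load steps described above can be sketched as follows. This is a minimal illustration rather than the pipeline that was built: the field names (`screen_id`, `campaign_id`, `plays`), the two source formats and the in-memory list standing in for the data lake are assumptions made for the example.

```python
import csv
import io
import json


def extract_csv(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json_lines(text):
    """Extract: parse newline-delimited JSON into a list of dict rows."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def transform(row):
    """Transform: normalise field types so every source has one shape."""
    return {
        "screen_id": str(row["screen_id"]).strip(),
        "campaign_id": str(row["campaign_id"]).strip(),
        "plays": int(row["plays"]),
    }


def is_valid(row):
    """Validate: reject rows with empty identifiers or negative counts."""
    return bool(row["screen_id"]) and bool(row["campaign_id"]) and row["plays"] >= 0


def run_pipeline(sources, sink):
    """Run each (extractor, raw_text) pair through the same steps.

    Valid rows are appended to `sink`; the number of rejected rows is returned.
    """
    rejected = 0
    for extractor, raw in sources:
        for raw_row in extractor(raw):
            row = transform(raw_row)
            if is_valid(row):
                sink.append(row)  # Load: a list stands in for the data lake
            else:
                rejected += 1
    return rejected


# Illustrative inputs only: one CSV source and one JSON-lines source.
csv_source = "screen_id,campaign_id,plays\nS1,C-001,12\nS2,C-001,-3\n"
json_source = '{"screen_id": "S3", "campaign_id": "C-002", "plays": "7"}\n'

lake = []
rejected = run_pipeline(
    [(extract_csv, csv_source), (extract_json_lines, json_source)], lake
)
print(len(lake), "rows loaded,", rejected, "rejected")
```

Because every source passes through the same `transform` and `is_valid` steps, adding a new source in this sketch only requires a new extractor function.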


Enterprise Data Lake

The introduction of an enterprise data lake provided a central data repository and access to analytics tools that maximized the value of the data. The enterprise data lake is a centralized repository that allows storage of structured and unstructured data at any scale. Data can be stored as-is, without first having to be structured, and different types of analytics can then be run on it, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions.


Data Catalog and Discoverability

The introduction of a data catalog provided a single searchable glossary of the data available to the organization, including each dataset's source, definition and entitlements. Built on top of the data lake, the data catalog allows users to find the data they need and use it in the tools they prefer, while ensuring that information boundaries and data contracts are not violated.
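
To make the idea concrete, the sketch below models a catalog entry that records a dataset's source, definition and entitlements, and a search that only returns datasets the caller's role is entitled to see. It is a simplified, hypothetical model: the class names, fields, roles and sample datasets are assumptions for illustration and do not describe the catalog product that was used.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record for one dataset in the data lake."""
    name: str
    source: str
    definition: str
    entitled_roles: set = field(default_factory=set)


class DataCatalog:
    """Minimal in-memory catalog with role-aware keyword search."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, term, role):
        """Return sorted names of matching datasets visible to `role`."""
        term = term.lower()
        return sorted(
            entry.name
            for entry in self._entries.values()
            if role in entry.entitled_roles
            and (term in entry.name.lower() or term in entry.definition.lower())
        )


# Illustrative entries only.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="playout_logs",
    source="digital screens",
    definition="What each screen played and when, per campaign",
    entitled_roles={"operations", "campaign_planner"},
))
catalog.register(CatalogEntry(
    name="order_lines",
    source="order management system",
    definition="Contracted products and services per campaign",
    entitled_roles={"campaign_planner"},
))

print(catalog.search("campaign", role="campaign_planner"))
print(catalog.search("campaign", role="operations"))
```

Filtering at search time is only one possible design; a catalog could equally list every dataset and enforce entitlements when the data itself is requested.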


Application Programming Interface

An Application Programming Interface (API) is a set of software routines that allows programs to interact. The use of an API at the organisation allowed data from the enterprise data lake to be accessed by the downstream applications that rely on it. Additionally, the API gave end users (both internal and external) access to data for their individual analytics and modelling needs. External parties were able to integrate the APIs with their in-house sales systems to improve programmatic buying and the monitoring of campaign delivery.


Analyze and Visualize Data

By discovering data in the enterprise data lake through the data catalog, individuals can access data based on their role. For low-technology use cases, end users can load datasets into Excel or tools such as Power BI; alternatively, the API allows data to be integrated with programs written in Python, Scala, Java, R, etc. For heavy big-data workloads, data engineers can use Azure Databricks clusters and Azure Synapse to run analytics at scale.


Sandbox for Experimentation

To support innovation, the data lake includes a sandbox environment that provides the functionality of the enterprise data lake but allows one to easily introduce new datasets and technologies for experimentation.


Flexibility of Architecture

Unlike traditional technical approaches to data warehousing, which are inflexible in terms of data schemas, technical capabilities and tools, the cloud-based approach allows flexibility on all of these fronts. Fit-for-purpose cloud-based data tools and technologies can be incorporated with relative ease as needs are identified.



The creation of an enterprise data lake had substantial benefits. Historically, the organisation's data existed in individual business teams or systems. Providing access to this data required point-to-point solutions, and each team spent significant time preparing and reconciling the data it used. Additionally, teams were not aware of data that existed elsewhere within the organization. The enterprise data lake enabled the organisation's Sales and Operations teams to capitalize on the value in data by bringing together internal and external datasets in a single place, in near real-time, and by eliminating redundant reconciliations because everyone uses the same dataset. This value grows as new datasets are added. Easy access to a broad set of data empowers users to innovate in the way the organisation sells and manages its digital campaign delivery and overall performance.

As a direct result of Riverflex’s work, Clear Channel have now been independently audited by professional audit firm PwC. The audit covers Clear Channel’s delivery and reporting of selected UK digital campaigns, meeting advertisers’ requirement for transparency in digital media. It also covers Clear Channel’s delivery and reporting of UK digital campaigns over three months. The audit is part of Clear Channel’s ongoing commitment to ensuring full transparency and accountability for their digital outdoor campaigns. 

By leveraging the data catalog, internal users can now easily discover, access and perform analytics on hundreds of datasets. After discovery, analysts can pull data directly into their own environment via an API, or they can use scalable cloud-based analytics software, such as Spark-based services, to run compute-intensive algorithms against multiple datasets. This greatly speeds up analytics and increases its use in digital campaign performance.

With the implementation of a Continuous Integration/Continuous Delivery (CI/CD) approach to cloud development and deployment, the organisation can deliver new software features in hours or days instead of months. Smaller code changes are simpler (more atomic) and have fewer unintended consequences. Upgrades introduce smaller units of change and are less disruptive. The products improve rapidly through fast feature introduction and fast turnaround on feature changes. End-user involvement and feedback during continuous development lead to usability improvements. New requirements based on customers' needs can be added on a daily basis.

