Architecture of a generic online real-time advertising platform

We have recently posted a LinkedIn article entitled “How does online real-time advertising work?” [3] It provided a simplified overview of online real-time advertising. It should be obvious from this overview that the online real-time advertising ecosystem consists of a large number of components or platforms. That overview described how these components interact with one another, but it did not look into how these platforms work on the inside. This article aims to rectify that.

In this article we provide the necessary details to understand the inner workings of online real-time advertising platforms: we first describe a generic (systems) architecture that can support all the key platforms, with their secret sauce encapsulated in a component called “the decision engine”; we then clarify how these platforms differ from one another by explaining the inner workings of the decision engine. Again, our presentation is simplified to optimize understanding and provide insights. Though simplified, be warned that this article is more technical and longer due to the scope of the details involved.

A generic architecture for an online real-time advertising platform (“the platform”) is shown below. For the sake of simplicity, we again focus on the key components of the platform. Each box in this figure may represent one or more services or functions. We also exclude components such as load balancers and the like; they are needed by any scalable platform, and a good platform architect will readily see where they fit within the architecture.

Below let us review each of the components from the architecture diagram above. The term “obj” in this figure is short for “object”, which refers to the objects of interest, say, a user, an ad, a campaign, an agency, a publisher, etc. Below we will also refer to multiple tools from the (Apache) Hadoop ecosystem. Leading tech companies in this space may, of course, have their own internal tools that provide equivalent functions.

(User) Console Interface. The platform has a console interface to its users. The console interface is usually accessible by human users via a web browser or an app, say, running on a mobile device. The interface can get pretty complex with sophisticated data in/out, analytics, and visualization capabilities, e.g., see the example in [2].

The interface will also support APIs for server-to-server access to automate (hopefully) most of the tasks a human user needs to perform. A human user will use the console interface to create his or her objects of interest, say, the advertising campaigns. The object creation will involve entering many values for the parameters of the objects. For example, for a campaign, a user can enter its name, its goals, its constraints, and its budget together with its budget schedule. The creation process is similar for the sub-objects, say, the line item objects under the campaign object.

The console interface is usually very interactive in that the user can change most or all of the object parameters on demand. For example, the user can stop a campaign, change its budget or its goals, etc. Usually, these changes will be in response to the feedback that the user gets about the performance of his or her objects. This means the console interface is very rich in terms of real-time or batch reporting presented back to the user. 

The console interface is also designed to provide a seamless integration of the interfaces of the services the console supports. A number of key console services are discussed below.

(User) Console. The front-end of the console interface runs on the client side, i.e., on the user’s browser or app. Behind the console interface is the console that runs on the server side. The console, being a web application, is most probably a 3-tier application, with its three tiers: presentation, application, and data. The presentation tier (“the front-end”) supports what the console user sees and provides low-latency interactivity, data in/out, analytic and visualization capabilities.

The application tier supports the presentation tier logic as well as the main business logic for the platform. For example, what to do when an object is created, how to run a search for a given object, or what reports or visualizations to present back to the user and how. The application tier usually relies on multiple console services to perform its function. A seamless integration of these services as well as their interfaces as part of a unified console experience is an important task.

The data tier is discussed below, after the console services, since the console services depend on it.

Console Services. The console supports many key services to complete its offering. We will touch upon four key services among many accessible from the console.

The first key service is an engine to validate the entered data, i.e., to check that it is consistent and as desired. For example, the console user should not be allowed to enter an unrealistic budget figure, say, a billion dollars for a campaign budget. An important part of this checking is to ensure that the entered details are approved by users with the right approval privileges, say, before execution, i.e., before actual money gets spent.
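
As a concrete illustration, here is a minimal sketch of such a validation check in Python; the field names, budget cap, and approval rule are hypothetical placeholders, not any particular platform’s actual rules.

```python
# Minimal sketch of a campaign validation check (hypothetical field names and limits).
MAX_CAMPAIGN_BUDGET = 10_000_000  # assumed upper bound in dollars

def validate_campaign(campaign: dict, approver_roles: set) -> list:
    """Return a list of validation errors; an empty list means the campaign passes."""
    errors = []
    budget = campaign.get("budget", 0)
    if budget <= 0:
        errors.append("budget must be positive")
    elif budget > MAX_CAMPAIGN_BUDGET:
        errors.append(f"budget {budget} exceeds the allowed maximum {MAX_CAMPAIGN_BUDGET}")
    if campaign.get("start_date") and campaign.get("end_date") and campaign["end_date"] < campaign["start_date"]:
        errors.append("end date precedes start date")
    # Spending may start only after a user with approval privileges signs off.
    if "budget_approver" not in approver_roles:
        errors.append("campaign is not yet approved by a user with budget approval privileges")
    return errors

print(validate_campaign({"budget": 1_000_000_000, "start_date": "2018-06-01"}, approver_roles={"analyst"}))
```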

The second key service is an engine to provide insights beyond the regular performance reports. For example, if the performance of a campaign is not good with the user-entered values, this engine can prompt the user with suggestions to fix such performance issues. A suggestion can be in the form of relaxing a (targeting) constraint or focusing more on a different segment of the intended audience. A sophisticated monitoring and alerting service may be part of this service too. For example, a user may be alerted based on user-generated or auto-generated triggers that direct the user towards increasing potential gain or limiting potential loss.
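
The triggers themselves can be as simple as threshold checks on performance metrics. Below is a minimal, purely illustrative sketch; the metric names, thresholds, and messages are assumptions, not any platform’s actual alert rules.

```python
# Minimal sketch of evaluating alert triggers against a performance snapshot.
TRIGGERS = [
    {"metric": "ctr", "op": "lt", "threshold": 0.001,
     "message": "CTR is unusually low; consider relaxing a targeting constraint"},
    {"metric": "spend_rate", "op": "gt", "threshold": 1.5,
     "message": "Spend rate is 50% above plan; consider lowering bids"},
]

def fire_alerts(snapshot: dict) -> list:
    """Return the messages of all triggers whose condition holds for this snapshot."""
    ops = {"lt": lambda a, b: a < b, "gt": lambda a, b: a > b}
    return [t["message"] for t in TRIGGERS
            if t["metric"] in snapshot and ops[t["op"]](snapshot[t["metric"]], t["threshold"])]

print(fire_alerts({"ctr": 0.0004, "spend_rate": 1.1}))  # only the CTR alert fires
```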

The third key service is the forecasting engine that provides forecasts on the goals, say, those of a campaign before the campaign is even started. For example, the forecaster can estimate the potential audience size and parameters, the potential number of ad impressions, potential bids or costs, potential bidders, etc.

The fourth key service is the billing engine that reports how the money from the platform customer is spent by the platform, how it is accounted for in detail, and how it is attributed to each object of interest. If a platform customer is an intermediary for yet another customer, then the billing engine may provide ways for the platform customer to maximize its revenue and also provide the necessary reports so that the correct invoices can be generated and sent downstream.

To generate their outcomes, the console services usually depend on the real-time analytics / streaming service, the big data platform, and the console data tier.

Console Data Tier. The data tier covers the data access layer as well as the actual data storage to store and serve data. This tier usually contains three types of stores: an object metadata store (“the obj metadata store”), where, say, a campaign name or a customer name is stored; an object performance store (“the obj analytics store”), where, say, the campaign, bidder, or ad performance reports are stored; and lastly, an object search store (“the search index store”), where the search indexes are kept for object search. The object search store is not shown in the architecture diagram above for the sake of simplicity. The first two of these stores are a must; the search store is optional since search can also be performed on the metadata store itself, though not as efficiently as a dedicated search engine would.

The (obj) metadata store can be implemented using a regular (row-based) SQL database due to its (primarily) OLTP nature. On this store, most of the queries are transactional, e.g., create, insert, delete, overwrite, or list operations on an object. From the console user’s point of view, the metadata store is always a read/write database in that there are reads as well as writes by the user. For the implementation of this store, there are multiple open source and commercial solutions from vendors on premise or on public clouds.
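
As an illustration of the transactional access pattern, here is a minimal sketch of such a metadata store using SQLite from Python; the table layout and column names are hypothetical.

```python
# Minimal sketch of an object metadata store on a row-based SQL database
# (SQLite here for illustration only; schema and values are made up).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign (
    campaign_id   TEXT PRIMARY KEY,
    advertiser_id TEXT NOT NULL,
    name          TEXT NOT NULL,
    goal          TEXT,
    budget_usd    REAL NOT NULL,
    status        TEXT DEFAULT 'draft'
);
""")
# Typical OLTP-style transactions from the console: create, overwrite, list.
conn.execute("INSERT INTO campaign VALUES (?, ?, ?, ?, ?, ?)",
             ("c-1", "adv-9", "Spring Sale", "clicks", 50000.0, "active"))
conn.execute("UPDATE campaign SET budget_usd = ? WHERE campaign_id = ?", (75000.0, "c-1"))
print(conn.execute("SELECT name, budget_usd, status FROM campaign").fetchall())
```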

The (obj) performance store is best implemented using a column-based SQL database due to its (primarily) OLAP nature. From the console user’s point of view, the performance data store is always a read-only database in that the user, via the console interface, will only be reading the stored performance results; the writes to the store are usually in bulk and in parallel, usually from the “big data platform” component. Since the performance data is usually very large and contains a huge number of reports and columns, it is important to ensure (very) fast read performance for requests on a particular report or a small set of columns. This is part of the reason for the performance database to be column-based. For the implementation of this store, there are multiple open source and commercial solutions from vendors on premise or on public clouds.

All major public cloud vendors usually support row- and column-based databases from the commercial database vendors as well as those of their own. One key advantage of the cloud versions is that they are expected to be well maintained and geographically distributed for higher reliability. However, the potential users should pay close attention to the cost of running their data tier on the public cloud.

Real-Time Analytics. A majority of the content of the performance data store may come from a batch process on the big data platform. The frequency of such a process depends on the amount of data processed and what software tool gets used. However, high-impact decisions may require far higher frequency and lower latency, say, seconds or less. For this requirement, the platform usually has a separate streaming flow that is supported via the “real-time analytics” service, which continuously takes information from the budget/spend control service as well as the web tier service (in the form of events or log file snapshots), executes the necessary analytics logic on the data stream in real time, and feeds the results back to the console to present to the console user. This service can also feed signals, say, for alerts, back into the budget/spend control service.
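
As a rough illustration of the streaming flow, the sketch below aggregates spend events into short tumbling windows. The event format, window size, and in-memory processing are simplifying assumptions; a production system would use a dedicated streaming framework.

```python
# Minimal sketch of a streaming aggregation over spend events, assuming each event
# is a (timestamp_seconds, campaign_id, spend) tuple.
from collections import defaultdict

WINDOW_SECONDS = 10  # illustrative tumbling-window size

def aggregate(events):
    """Yield (window_start, per-campaign spend totals) for each window."""
    window_start, totals = None, defaultdict(float)
    for ts, campaign_id, spend in sorted(events):
        if window_start is None:
            window_start = ts
        if ts - window_start >= WINDOW_SECONDS:
            yield window_start, dict(totals)  # fed back to the console / spend control
            window_start, totals = ts, defaultdict(float)
        totals[campaign_id] += spend
    if totals:
        yield window_start, dict(totals)

events = [(0, "c-1", 1.2), (3, "c-2", 0.8), (12, "c-1", 0.5)]
for start, totals in aggregate(events):
    print(start, totals)
```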

Budget / Spend Control. The platform is also a sophisticated accounting system because it spends large amounts of money on behalf of its “customers” (represented by the console users) and charges money as part of the services performed. The budget/spend control engine makes sure that the given budgets are spent well within limits. This means for a given budget there should not be any overspend or underspend, i.e., spending more or less than the available budget, respectively. These are real or opportunity costs to the platform: the customer will most definitely not pay for any overspend yet will happily take the extra gains delivered (provided conditions such as frequency caps on the audience are not violated drastically); similarly, consistent underspend may signal to the customer that the platform is unable to spend the given budgets, which should then be moved to other platforms.

Beyond the overspend/underspend constraints, this service also needs to deal with four more issues: Budget allocation, day parting, budget pacing, and budget changes.

  • Budget allocation refers to how a given budget is distributed or allocated over multiple choices, say, over channels, campaigns, publishers, ads, etc. Unless determined manually, budget allocation may involve some kind of an automated learning process to ensure, say, that more budget is spent on the choices that have the best ROI.
  • Day parting refers to the fact that a budget schedule can include on/off days or time intervals within a day for the budget spend. Obviously, no budget should be spent on the off periods.
  • Budget pacing refers to how the budget spend is paced or distributed over a given time period, usually a day. Multiple strategies are possible here. For example, pacing can be “even”, i.e., distributed evenly throughout the day, “as soon as possible”, i.e., spend as quickly as possible without any regard for performance, or “optimal”, i.e., distributed in such a way that the performance can be optimized. A minimal pacing sketch is given right after this list.
  • Budget changes refer to how the spend needs to be adjusted or even stopped upon budget changes executed by the console user via the console interface. Any changes need to be sent to all the decision engine and web tier servers as soon as possible, including those servers in a geographically different location. Though this may sound trivial, it is not, due to the nature of distributed systems.
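
Here is the minimal pacing sketch referred to above, for the “even” strategy; it assumes a daily budget and a running spend total and is only meant to show the idea, not a production pacing algorithm.

```python
# Minimal sketch of even pacing: the spend allowed at any moment is the gap between
# the time-proportional target and what has already been spent. Values are illustrative.
def even_pacing_target(daily_budget: float, seconds_into_day: int, spent_so_far: float) -> float:
    """Return how much more spend is allowed right now under even pacing."""
    day_seconds = 24 * 60 * 60
    target_by_now = daily_budget * (seconds_into_day / day_seconds)
    return max(0.0, target_by_now - spent_so_far)

# With a $1,000 daily budget, by noon about $500 should have been spent.
print(even_pacing_target(1000.0, 12 * 3600, 430.0))  # -> 70.0 still allowed at this moment
```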

The money is usually spent by the decision engine, yet the record of the spend is almost always logged in the web tier service as log files and in the streaming service as events. The reasons for this are that the decision engine has to be so extremely fast that it cannot afford to lose time even by writing to log files, that the decision engine has to be idempotent in that the death or restart of a server should not affect the results as long as there are other servers to take over the required task, and that the web tier is already the interface for events anyhow. If a decision engine server is not informed about a budget change or gets disconnected due to a network partition, the server (a “runaway server”) can keep spending, sometimes leading to massive overspend. The trick is to distribute budgets to the decision engine servers in small amounts in a controlled, algorithmic manner: the amount should not be so large that a runaway server can do a lot of overspend damage; it also should not be so small that a server needs budget updates constantly, leading to too many message exchanges and potentially affecting network bandwidth negatively. Striking a balance between these two extremes is usually part of the platform’s secret sauce.
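
The sketch below illustrates the idea of handing out budget in small, controlled allotments; the class, its parameters, and the allotment size are hypothetical, and the real logic for sizing allotments is exactly the secret sauce mentioned above.

```python
# Minimal sketch of granting budget to decision-engine servers in small allotments,
# so a runaway server can overspend by at most one allotment. Sizes are illustrative.
class BudgetController:
    def __init__(self, total_budget: float, allotment: float):
        self.remaining = total_budget
        self.allotment = allotment
        self.outstanding = {}  # server_id -> unspent allotment currently held

    def request_allotment(self, server_id: str) -> float:
        """A decision-engine server asks for more budget when its allotment runs low."""
        grant = min(self.allotment, self.remaining)
        self.remaining -= grant
        self.outstanding[server_id] = self.outstanding.get(server_id, 0.0) + grant
        return grant

    def report_spend(self, server_id: str, spend: float) -> None:
        """Spend reports arrive via the web tier logs / streaming, not from the engine itself."""
        self.outstanding[server_id] = max(0.0, self.outstanding.get(server_id, 0.0) - spend)

ctrl = BudgetController(total_budget=10_000.0, allotment=50.0)
print(ctrl.request_allotment("server-17"))  # 50.0 handed out; worst-case overspend bounded by this
```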

User / Obj(ect) Profile. Consider multiple event streams relevant for an object. It is highly likely that each stream receives events at a different frequency. Snapshots across these streams are taken at certain time intervals as candidates for appending to the relevant object containers called “profiles”. These profiles provide all the relevant data for an object in one place and make it easily accessible to the applications on the platform.

At a high level, each event may arrive as a tuple (object id, event data). The id of an object (“the object id”) is an identification key (usually alphanumeric) to uniquely identify the object. For example, ids for users, advertisers, campaigns, publishers, ads, bidders, etc.; the URL for a web page is also an id.

At the time of a snapshot, once all the event data in the snapshot is joined across the event streams using the object id as the join key, it can be appended as a composite value (“the object value”) of one or more attributes to the object profile. In other words, the object profile can be thought of as a typical key/value pair. This key/value structure is the reason for using key/value stores to store such profiles for fast access. For the implementation of such stores, there are many open source and commercial solutions available on premise or from public cloud vendors, running on servers with large main memory and fast flash-based storage.
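
As an illustration, here is a minimal sketch of joining event streams on the object id at snapshot time and appending the result to a key/value profile; a plain in-memory dictionary stands in for the actual key/value store, and the stream and attribute names are made up.

```python
# Minimal sketch of building object profiles from event-stream snapshots.
from collections import defaultdict

profile_store = defaultdict(dict)  # object_id -> profile (the composite object value)

def take_snapshot(streams: dict) -> None:
    """streams maps a stream name to a list of (object_id, event_data) tuples."""
    joined = defaultdict(dict)
    for stream_name, events in streams.items():
        for object_id, event_data in events:
            joined[object_id][stream_name] = event_data          # join on the object id
    for object_id, attributes in joined.items():
        profile_store[object_id].update(attributes)              # append/overwrite a few attributes

take_snapshot({
    "page_views": [("user-42", {"last_url": "https://example.com/article"})],
    "ad_clicks":  [("user-42", {"last_click_ts": 1717000000})],
})
print(profile_store["user-42"])
```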

Profiles can be grouped into two types: Runtime (real-time) and Analytical (batch). Runtime profiles usually need snapshots collected from event streams at high frequencies while analytical profiles can accommodate snapshots at lower frequencies. A read access to a runtime profile involves returning the whole profile upon receiving an id in the read request; a write access to a runtime profile involves writing a few attributes of the profile (mainly appending but some overwriting).

In the architecture diagram above, the runtime version is supported by the user/obj profile store while the analytical version is usually supported by the big data platform. The runtime version is up to date and roughly subsumed by the analytical version. The analytical version usually contains all the information known about the user or the object, whose limits are determined either by storage preferences or by privacy laws and regulations. The analytical version, as the name implies, is usually intended for all analytical needs.

There is always data exchange between the runtime and analytical profiles, either directly or indirectly via other channels such as the web tier or a streaming pipeline. This is due to at least two reasons: they need to be kept in sync so that both are as up to date as possible, and they need to feed each other with information that the other may not have. It seems obvious why the analytical profile needs to take in the real-time events submitted to the runtime profile, but the other direction is also necessary. For example, an attribute like customer lifetime value may only be computed on the big data platform due to the amount of data the relevant algorithms need; once it is computed, the runtime profile should be enhanced with it so that the runtime systems can use the data in real time.

Web Tier. The web tier serves three main purposes: it is the single interface between the platform and the outside world of servers; it is usually where a record of every event or action is logged; and finally, it is the message or data bus connecting multiple components, enabling data/event exchange among them. The components connected include the decision engine, the real-time analytics service, the profile stores, and the big data platform. Even these components themselves are usually connected to one another via the web tier.

The web tier’s interface is defined using APIs. These APIs allow the web tier to exchange events or data between the platform and external servers (belonging to the external entities in the ecosystem) as well as browsers or mobile apps, which the online users interact with and which are the final destination for the ads. These APIs also allow data to be exchanged in the form of files between the platform and external file servers. It is usually the case that the web tier will have separate servers dedicated to each type of interface: one set of servers for server-to-server communication, another set for browser/app-to-server communication, and yet another set for file-based communication.

The events from the external servers are usually requests such as ad or bid calls. The response to such events may include an ad or bid response. The events from the browsers or apps are usually user events due to pixel or tag firing on the web page the online user is about to see or is seeing. The response to such events may include a segment membership update, a redirect, or no response at all. The files from the external file servers are usually user events that have to be delivered in batch; the files can also be various analytics reports.

As mentioned before, the web tier is also where the platform logs a record of every action it takes that is significant for reasons such as accounting, billing, or monitoring. For robustness, the web tier will start a new log file very often, typically every few minutes, depending on the speed of the writes and the size of the resulting files. The resulting files are intended to be small so that they are easy to transfer and the damage from a corrupted file is bounded.
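
A minimal sketch of such a rotating event log is shown below; the directory, rotation interval, and event format are illustrative assumptions rather than any platform’s actual logging setup.

```python
# Minimal sketch of the web tier's rotating event log: a new file is started every few
# minutes so individual files stay small and a corrupted file bounds the damage.
import json, os, time

LOG_DIR = "/tmp/webtier-logs"   # illustrative path
ROTATE_SECONDS = 300            # start a new file every 5 minutes

class RotatingEventLog:
    def __init__(self):
        os.makedirs(LOG_DIR, exist_ok=True)
        self._open_new_file()

    def _open_new_file(self):
        self.opened_at = time.time()
        self.file = open(os.path.join(LOG_DIR, f"events-{int(self.opened_at)}.log"), "a")

    def log_event(self, event: dict) -> None:
        if time.time() - self.opened_at >= ROTATE_SECONDS:
            self.file.close()
            self._open_new_file()
        self.file.write(json.dumps(event) + "\n")

log = RotatingEventLog()
log.log_event({"type": "impression", "campaign_id": "c-1", "ts": time.time()})
```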

The web tier logs will be pulled by the big data platform for further processing for accounting or analytics or machine learning reasons. It is also possible to push the events to a streaming service or the real-time analytics service for faster processing. The big data platform will usually keep the log files for a long time in case of any need to go back in time for debugging or auditing. The web tier will also keep the log files but for a few days only or until after a successful pull by the big data platform is acknowledged; this is due to storage limitations on the web tier.

Big Data Platform. Though shown as a single component in the architecture diagram above, this platform is a complex platform in its own right. At a high level, it has both storage and processing components; the processing part has both real-time streaming and batch capabilities. We will leave the details of such a platform to a separate article.

For the focus of this article, the big data platform provides the processing pipelines to generate analytical user / object profiles, analytics for all reporting needs, click / action (or other outcomes of interest, say, view) reconciliation or attribution (last touch and multi touch), as well as the data science pipelines (training, simulation, experimentation, and serving). Here reconciliation and attribution refer to figuring out which touchpoints are associated with favorable outcomes and how much each of them has contributed; the topic is complex enough that it deserves a separate discussion in our upcoming book.
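
As a small illustration of the simplest case, last-touch attribution credits a conversion entirely to the most recent touchpoint before it; the sketch below assumes toy timestamps and channel names, and multi-touch models would split the credit instead.

```python
# Minimal sketch of last-touch attribution over a user's touchpoint history.
def last_touch(touchpoints, conversion_ts):
    """touchpoints: list of (timestamp, channel); return the credited channel or None."""
    prior = [tp for tp in touchpoints if tp[0] <= conversion_ts]
    return max(prior)[1] if prior else None  # max by timestamp = most recent touch

touchpoints = [(100, "display"), (250, "search"), (400, "email")]
print(last_touch(touchpoints, conversion_ts=300))  # -> "search"
```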

The Decision Engine. The decision engine is where the platform-specific decisioning takes place. Below we briefly discuss the details of this engine for each key component of the online real-time advertising ecosystem.

The requests for the decision engine are submitted via the web tier, which in turn gets them via its interface. The decision engine sends its response to the web tier, which routes it to the intended entity, usually an external one.

To make its decisions, the decision engine requires profile data about users and other objects, metadata about the objects (“the metadata”), the latest available budget, as well as data (“the resource data”) that its (usually machine learning) algorithms need. The profile data is sent in real time via the web tier as part of the request that the decision engine gets. The latest available budget is provided in real time by the budget/spend control service. The platform contains data distribution services (not shown in the architecture diagram above for simplicity) whose job is to distribute data from any source to any number of destination servers, which are usually geographically distributed over multiple data centers; such services may be built on existing real-time and batch data streaming tools. For the implementation of such a data distribution service, open source or commercial publish/subscribe and messaging tools may be deployed on premise or on public clouds.

The source for the metadata is the object metadata store whereas the source for the resource data is usually the big data platform, where the machine learning models will be trained and the data for their execution will be constructed. The metadata distribution may occur every few minutes whereas the resource data distribution occurs every 10s of minutes to hours, depending on how long the model building and data construction take on the big data platform.

Now let us see what the decision engine looks like for each key component of the online real-time advertising ecosystem. The presentation below is necessarily brief and focuses on a few major functions of the decision engine for each platform.

  • Demand-Side Platform (DSP). This platform makes decisions on behalf of advertisers; its goal is to maximize the value for the advertisers. It continuously receives events for users and other objects; it also receives bid requests from the ad exchanges. The former may lead to a re-evaluation of audience segments (which are usually defined by a set of Boolean rules) and relaying of this outcome to the rest of the ecosystem. The latter leads to an execution trace in which, for the given user within the given context of the bid request, a bid response will be generated. This generation involves an execution of the targeting rules or constraints associated with each ad to find the list of ads eligible for the given bid request; this is then followed by the computation of a ranking score and bid for each eligible ad; this is finally followed by responding to the ad exchange with the winner of the bid / ad ranking. In short, the decision engine mainly encapsulates the audience segment evaluation and bidding decisions.
  • Ad Exchange. This platform matches supply with demand and figures out the winner of an ad opportunity; the goal is to provide a fair market for the demand- and supply-side interests. For a given user within a given context (i.e., for an ad opportunity coming from a publisher directly or via an SSP), the winner is determined in an auction over the bids submitted by the bidders (usually DSPs). As such, the decision engine mainly encapsulates the capability for running the actual auction, determining the winner, and figuring out the cost of the match (see the auction sketch after this list). The decision engine may also encapsulate other capabilities such as the capability for a floor price, i.e., the lowest acceptable bid for the ad opportunity in question, or the capability to rate limit bidders if they cannot deal with the rate of bid requests. The winner of the auction gets submitted back to the supply-side platform from which the ad request originated. The winning bidder also gets informed about the win.
  • Supply-Side Platform (SSP). This platform makes decisions on behalf of publishers; the goal is to maximize the yield for the publishers. An ad opportunity may be satisfied by ads from guaranteed deals between the publisher in question and the partner advertisers; it may also be satisfied via the market, i.e., the route through ad exchanges and bidders (also called non-guaranteed). As such, the decision engine encapsulates the capability to select which partners to reach out to in order to satisfy an ad opportunity, the capability to trade off between the guaranteed and non-guaranteed sides, and the capability to act like an ad exchange for the non-guaranteed side but with a bias towards the publisher in question.
  • Ad Server. This platform can serve both sides of the market for the storage and serving of ad creatives; the goal is to be the single source for that function, which also implies being the single source of truth for who sees what. An ad server is a good place to enforce constraints on outcomes of interest such as frequency capping (which is used to limit how many ad impressions a given user should see and under which conditions). As such, the decision engine encapsulates the capability to determine who sees what, the capability to enforce and evaluate constraints, the capability to perform experiments and A/B tests (to find the best serving creatives), the capability to act according to the result of a satisfied constraint or the best performing A/B test candidate, and the like.
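
Here is the auction sketch referred to in the Ad Exchange item above: a minimal second-price auction with a floor price. The bidder names, bid values, and floor are illustrative, and real exchanges may run different auction formats.

```python
# Minimal sketch of a second-price auction with a floor price, as an ad exchange's
# decision engine might run it for a single ad opportunity.
def run_auction(bids: dict, floor_price: float):
    """bids maps bidder_id -> bid; return (winner, clearing_price) or (None, None)."""
    eligible = {b: v for b, v in bids.items() if v >= floor_price}
    if not eligible:
        return None, None
    ranked = sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
    winner, winning_bid = ranked[0]
    # The winner pays the second-highest eligible bid (or the floor if there is none).
    clearing_price = ranked[1][1] if len(ranked) > 1 else floor_price
    return winner, clearing_price

print(run_auction({"dsp-a": 2.10, "dsp-b": 1.75, "dsp-c": 0.40}, floor_price=0.50))
# -> ('dsp-a', 1.75): dsp-a wins and pays dsp-b's bid
```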

DSPs, ad exchanges, and SSPs also provide experimentation and A/B testing capabilities to their customers; the logic for these capabilities may also reside in their decision engines.

Scale. What is the scale of the platform? The platform (a leading one) may run on 1000s of high-end servers distributed over multiple data centers in three or more geographic regions / continents, and support millions of requests / responses per second within one to 10s of milliseconds [1]. The big data platform itself may use 1000s of servers in multiple data centers, dealing with 10s to 100s of petabytes of data. Given this kind of a scale, the platform usually operates in a fully automated manner with many decisions generated and performed by machine-learned or control-theory-based algorithms. The platform also has strong disaster recovery capabilities. The spend per year through the platform may hit 100s of millions to billions of dollars. It may take 10s to 100s of person-year effort to build from scratch.

Summary. We hope you have found this overview of a typical high-end platform in the online real-time advertising ecosystem useful. Please provide a comment if you have a suggestion on how to improve it.

If you are a computer scientist or software engineer who wants to quickly learn many areas of computer science very well and who does not mind working in a challenging environment, you may consider working for such a platform. Such platforms are operated as part of many leading high-tech companies whose revenue source is mainly online advertising.

References and notes:

  1. How fast is a millisecond? 1 ms = 1 millisecond = one thousandth of a second. Light may travel around the Earth in about 130 milliseconds; over fiber networks, this may slow down to about 170 ms. Also note that a blink of an eye may take between 100 ms and 200 ms.
  2. Component interfaces can get very complex with lots of sophisticated analytics and visualization capabilities. For a look at an example interface from a leading DSP circa Nov 2012, see http://goo.gl/IxJZT or https://nyti.ms/2KN4OEX .
  3. J. Koran and A. Dasdan, How does online real-time advertising work?, May 2018, see https://bit.ly/2sGkcMC .

This article comes from a preliminary version of a book on computational advertising, authored by Joshua Koran and Ali Dasdan, and due out in 2018.

Disclaimer: This article presents the opinions of the author. It does not necessarily reflect the views of the author’s employer(s).