Providing User Data

Supporting Products

  • User-level LTV projections
  • Intelligent Automation
  • Probabilistic Attribution
  • Organic Lift
  • Intelligent Budget

Introduction

We require data on Attribution, Revenue, and User Engagement in a standardized format that is described below. This data is aggregated and delivered daily, with each daily delivery landing at a unique location for that day's data. Instructions on delivering this data are likewise detailed below.

Note: Please exclude test and QA user data from the files to ensure accuracy and optimal results. Alternatively, you can provide a list of user IDs to blacklist.

Data History

AlgoLift provides the most accurate LTV model when we have 2+ years of data. This enables us to observe temporal changes in revenue and user behavior and to fully understand how cohorts mature.

We can train our LTV model with less historical data but this restricts the projection windows based on the amount of data available to train our models. Below is a schedule for how AlgoLift releases new projection windows to clients based on the historical data available for modeling:

Days of Historical Data | Projection Window
45                      | D30
70                      | D60
95                      | D90 / D180
105                     | D365
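The schedule above can be expressed as a simple lookup. This is an illustrative sketch, not an AlgoLift API; the function and constant names are our own:

```python
# Release schedule from the table above: (minimum days of history,
# projection window unlocked at that point).
SCHEDULE = [
    (45, "D30"),
    (70, "D60"),
    (95, "D90"),
    (95, "D180"),
    (105, "D365"),
]

def available_windows(days_of_history: int) -> list[str]:
    """Return the projection windows available given the amount of
    historical data provided for modeling."""
    return [window for days, window in SCHEDULE if days_of_history >= days]
```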

Data Shape

Attribution Data

An attribution event marks the beginning of a user's lifetime within the app or product. Each user is either attributed to a specific ad in a campaign, if they were acquired through advertising, or labeled 'organic.' When preparing Attribution data for export to AlgoLift, please format it according to the schema below.

All columns present in the schema should be present in the corresponding file. Any columns for which data is unavailable should be left blank, but not omitted.

Field | Data Type | Required | Description
user_id* | varchar(255) | Yes | A common user identifier used across all data tables
impression_dt | datetime | No | The timestamp of the ad impression event the user is attributed to (ISO-8601 compliant datetime string)
install_dt | datetime | Yes | The timestamp of the user's install event (ISO-8601 compliant datetime string)
country | varchar(2) | Yes | The country in which the user resides (ISO-3166 alpha-2 country code)
platform | varchar(255) | Yes | The platform of the user's device
source | varchar(255) | Yes | The ad network to which the user has been attributed. Should be "Unattributed Facebook" for Facebook VTA users who are not attributed to any specific campaign.
subpub_id | varchar(255) | No** | A unique anonymous identifier for a sub-publisher provided by your MMP, i.e. the ad network's ID for the publisher app
subpub_bundle_id | varchar(255) | No** | The app store bundle id (com.app.pub, or 1234534534), i.e. Google's or Apple's ID for the app
subpub_name | varchar(255) | No** | The proper name for a sub-publisher (e.g., Bubbles Game)
campaign_id | varchar(255) | No** | The id of the campaign to which the user has been attributed
campaign_name | varchar(255) | No** | The name of the campaign to which the user has been attributed
adgroup_id | varchar(255) | No** | The id of the ad group to which the user has been attributed
adgroup_name | varchar(255) | No** | The name of the ad group to which the user has been attributed
ad_id | varchar(255) | No** | The id of the ad to which the user has been attributed
ad_name | varchar(255) | No** | The name of the ad to which the user has been attributed
ad_type | varchar(255) | No | The type of creative to which the user has been attributed (valid types listed below)
keyword_id | varchar(255) | No** | The id of the keyword to which the user has been attributed
keyword_name | varchar(255) | No** | The name of the keyword to which the user has been attributed
device_brand | varchar(255) | No | The brand of the user's device
device_model | varchar(255) | No | The model of the user's device
os_version | varchar(255) | No | The version of the operating system of the user's device
custom1 | varchar(255) | No | An arbitrary, client-defined field. Will be carried through the pipeline as a dimension and may be configured as a filter in the AlgoLift app.
custom2 | varchar(255) | No | An arbitrary, client-defined field. Will be carried through the pipeline as a dimension and may be configured as a filter in the AlgoLift app.
custom3 | varchar(255) | No | An arbitrary, client-defined field. Will be carried through the pipeline as a dimension and may be configured as a filter in the AlgoLift app.
conversion_value | int | No | The SKAdNetwork conversion value set on iOS 14 and above. This should be the last conversion value that was set.

Valid ad_type values:

  • text - an ad unit containing only text, e.g. a search result
  • banner - a basic format that appears at the top or bottom of the device screen
  • interstitial - a full-page ad that appears during breaks in the current experience
  • video - a standard video, i.e. non-rewarded
  • rewarded_video - an ad unit offering in-app rewards in exchange for watching a video
  • playable - an ad unit containing an interactive preview of the app experience
  • sponsored_content - a link included in a piece of sponsored content, like an advertorial article
  • audio - an audio ad

** data required when using Intelligent Automation

*user_id is a non-PII identifier (i.e., not IDFA/GAID) that uniquely identifies a given user of the app or product. It can be generated by the client, or provided by a third party such as an MMP or analytics provider. Any identifier that satisfies the following rules can be used:

  • The identifier is available at the time of attribution.
  • All subsequent revenue and engagement events for a given user are provided to AlgoLift using the same identifier as that initially provided in that user’s attribution event. Stated another way, all revenue and engagement events should contain a user_id that has previously been included in an attribution event.
  • The identifier is the one AlgoLift will return in any output data.
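The second rule can be verified before delivery with a small sanity check. A hedged sketch (the helper name is illustrative, not part of any AlgoLift tooling):

```python
def unseen_user_ids(attribution_ids, event_rows):
    """Return user_ids that appear in revenue or engagement rows but
    were never included in an attribution event. A non-empty result
    indicates rows that violate the rule above."""
    seen = set(attribution_ids)
    return sorted({row["user_id"] for row in event_rows} - seen)
```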

Not all ad networks use the same nomenclature. The below table shows the names under which you can find certain fields for a given ad network.

AlgoLift | Facebook | Apple Search Ads | Google Ads
Campaign | Campaign | Campaign         | Campaign
Ad Group | Ad Set   | Ad Group         | Ad Group
Ad       | Ad       | Creative Set     | Ad
Keyword  | N/A      | Keyword          | N/A

Revenue Data

Revenue represents revenue generated from the user, whether from in-app purchases, ongoing subscriptions, or payments derived from ads presented to the user. When formatting Revenue data for export to AlgoLift, please format it according to the following schema:

Field | Data Type | Required | Description
user_id | varchar(255) | Yes | A common user identifier used across all data tables
revenue_type | varchar(255) | Yes | The type of the revenue source. Valid types: adrev, iap, subscription
revenue | double | Yes | Aggregate revenue amount in USD
transaction_count | int | Yes | Number of transactions
parcel_name | varchar(255) | Yes* | Unique identifier of the transaction. For iap this should be the item(s) sold; for adrev this should be the ad_type of the ad viewed (banner, rewarded_video, etc.).

* Only required for revenue_types iap or subscription
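Because the schema expects pre-aggregated rows rather than raw transactions, each delivery should collapse the day's transactions per user, revenue type, and parcel. A hedged sketch of that aggregation (the input shape is an assumption; the output field names come from the schema above):

```python
from collections import defaultdict

def aggregate_revenue(transactions):
    """Collapse raw transactions into one row per
    (user_id, revenue_type, parcel_name), summing revenue in USD and
    counting transactions, as the revenue schema expects."""
    totals = defaultdict(lambda: {"revenue": 0.0, "transaction_count": 0})
    for t in transactions:
        key = (t["user_id"], t["revenue_type"], t.get("parcel_name", ""))
        totals[key]["revenue"] += t["revenue"]
        totals[key]["transaction_count"] += 1
    return [
        {"user_id": u, "revenue_type": r, "parcel_name": p, **agg}
        for (u, r, p), agg in totals.items()
    ]
```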

User Engagement Data

User Engagements represent user actions within the app that meaningfully distinguish one user's relationship with the app from another's. Many of these actions are common across most or all apps, such as starting a session or using a social sharing feature. Others may be highly specific to your app. You are encouraged to include whatever actions you feel are most important; if you are unsure which actions would be most useful, please contact us and we can advise. When formatting User Engagement data for export to AlgoLift, please format it according to the following schema:

Field | Data Type | Required | Description
user_id | varchar(255) | Yes | A common user identifier used across all data tables
engagement_type | varchar(255) | Yes | The type of the engagement event
engagement_count | int | Yes | The number of occurrences of the engagement event

Sample Data

Please click here for a copy of sample data (user attribution, revenue, and engagement) for your reference.

Data Delivery

The volume of the data described above can be significant. Having a reliable way to regularly deliver data at scale is imperative for AlgoLift's prediction and optimization products to function. We rely on reading and writing to Amazon S3 to transfer data to and from customers. Instructions for doing so are detailed below, along with technical formatting information.

Data Formatting

  • All data should be formatted in one or more CSV files. These files can be named as you like, ideally including the data type and date (for example, engagement_20201111), but they must have the .csv extension and be placed in the appropriate S3 bucket and prefix for the app, data type, and date they represent.
  • The first row of each CSV file should be a header row containing the column names for the ensuing rows.
  • All field entries should be double quoted according to the CSV spec.
  • Text fields without data should be sent as an empty field, not as the string NULL.
  • Required numeric fields should never be NULL. If there is no data available, pass 0 as the value for the field. Other numeric fields can be passed as NULL (empty fields, as before).
  • Data should be deduplicated prior to transfer.
  • Aggregating the data by day prior to upload will reduce data size, and therefore transfer times and resulting storage space.
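The formatting rules above can be sketched with Python's csv module; csv.QUOTE_ALL implements the double-quoting requirement, and the helper name is illustrative:

```python
import csv
import io

def write_rows(rows, fieldnames):
    """Render rows per the formatting rules: a header row, every field
    double quoted, missing text sent as an empty field (never the
    string NULL), and duplicate rows dropped before transfer."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writeheader()
    seen = set()
    for row in rows:
        # Normalize missing values to empty fields, not "NULL".
        record = tuple((f, "" if row.get(f) is None else row.get(f))
                       for f in fieldnames)
        if record in seen:  # deduplicate prior to transfer
            continue
        seen.add(record)
        writer.writerow(dict(record))
    return buf.getvalue()
```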

Configuring S3 Access

When you begin your integration with AlgoLift, we will create an Amazon S3 bucket specifically for housing your data. The convention we use for bucket naming is algolift-<client>. Sending us data is as simple as writing data to the appropriate location in the S3 bucket.

Since many companies already use AWS for at least some portion of their data infrastructure, the following process assumes you have an existing AWS account. It provides the best control and security, since it does not involve sharing any credentials:

  1. Create a Role in your AWS account that you will be using to write to the bucket. Attach a policy to grant that role permission to write to the new bucket. (see steps 2 & 3 in this AWS support document for more detail.)
  2. Send an email to support@algolift.atlassian.net with the AWS ARN of the role created in step 1. An ARN looks like the following: arn:aws:iam::account-id:role/role-name.
  3. We will grant your role permissions to the bucket and notify you that the access has been granted.
  4. You may begin writing data to your bucket.
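The write policy attached in step 1 can be sketched as the following IAM policy JSON. The bucket name is a placeholder; s3:PutObjectAcl is included on the assumption that your writer will also set the bucket-owner-full-control ACL described under Object Permissions below:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:PutObjectAcl"],
      "Resource": "arn:aws:s3:::algolift-<client>/*"
    }
  ]
}
```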

If you do not have an AWS account, we will create a role for you and provide an access key and secret which your data processes may use to gain read/write access to your bucket. In that case, these alternative directions apply:

  1. Send a message to support@algolift.atlassian.net letting us know you need us to provide access and you do not have an AWS account.
  2. We will create the user and grant the appropriate bucket permissions, then share an access key and secret with you via a secure channel.
  3. You may begin writing data to your bucket. (see this AWS support document for a simple Python example).

Object Location

Within an organization's s3 bucket, data should be written to the following locations for each app:

  • Attribution: s3://algolift-<client>/<app>/ingestion/daily_data/attribution/date=<date>/
  • Revenue: s3://algolift-<client>/<app>/ingestion/daily_data/revenue/date=<date>/
  • Engagement: s3://algolift-<client>/<app>/ingestion/daily_data/user_engagement/date=<date>/

Data should be delivered daily as soon as possible after 00:00:00 UTC. All dates should be in the format of YYYY-MM-DD. For more information about folders in S3, see this AWS support document.
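A small helper can keep delivery prefixes consistent across data types. An illustrative sketch, not an AlgoLift-provided function:

```python
from datetime import date

# The three data types accepted under daily_data/.
VALID_TYPES = {"attribution", "revenue", "user_engagement"}

def delivery_prefix(client: str, app: str, data_type: str, day: date) -> str:
    """Build the S3 prefix for one day's delivery of one data type,
    with the date rendered as YYYY-MM-DD."""
    if data_type not in VALID_TYPES:
        raise ValueError(f"unknown data type: {data_type}")
    return (f"s3://algolift-{client}/{app}/ingestion/daily_data/"
            f"{data_type}/date={day.isoformat()}/")
```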

Object Nesting

Please note that directory names should be all lowercase, with no punctuation or spaces (we ask for strict adherence to this convention because the incoming data is pulled directly into our data warehouse, which has a rigid schema). If you only operate a single app, the string app is sufficient for the app parameter in the paths above.

Object Permissions

Important note: When writing, make sure to set the bucket-owner-full-control permission on each object that is written (usually this is just a setting in the write command of the data processing code used to deliver data). Here's an example of setting correct object permissions with the AWS CLI:

aws s3 cp s3://source_awsexamplebucket/myobject s3://algolift-clientname/path --acl bucket-owner-full-control

Restating Data

There are cases in which data for a past attribution date may need to be restated. Delayed information from an ad network may update your understanding of how a given user is attributed to an ad campaign, the conversion value for a given user may be updated based on actions they take in the first few days of activity, or another reason may apply. AlgoLift's ingestion process allows for an "ingestion window" in which the last X days of data are reprocessed each day, so any restated data for past days within the window will be taken into account. To enable this, simply overwrite the outdated data in S3 with the updated data for a given attribution date. The exact width of the ingestion window is configurable; set it by emailing integrations@algolift.com with your desired window size.

Copyright © 2021 AlgoLift