Project Description

The goal of the project was to build an advanced tracker and TDS (traffic distribution system) to distribute traffic and to optimize marketing campaigns and their costs.
Several domains receive traffic from multiple sources, each with different URL parameters.

For instance:
http://example1.com/sdof-120234-ofjsf?cid=XXXXX&tid=XXXXX&zoneid=XXXXX&id=XXXXX
http://example1.com/sdkfjsd-o2ieo-sokdfsodf?cid=XXXXX&tid=XXXXX&zoneid=XXXXX&id=XXXXX
Traffic is purchased through different platforms. When a visitor arrives at one of our domains, we collect all available information, such as screen resolution, device, mobile carrier, ISP, time zone, and country. Once we have these details, we can decide where to redirect the visitor (to which landing page or offer) and track whether the visit produced a conversion. There was also a requirement to build redirection rules that apply depending on visitor details. We store information about each visitor in the database and build detailed reports for further analysis. As for performance, the requirement was to handle 10 to 100 million visitors a day and to scale to higher capacity.
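
The first step for each incoming visit is to read the tracking parameters from URLs like the ones above. A minimal sketch in Python, assuming only the four parameter names visible in the example URLs; the function name and the shape of the returned dictionary are hypothetical and not taken from the project:

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names taken from the example URLs; their meaning is not assumed here.
TRACKING_PARAMS = ("cid", "tid", "zoneid", "id")


def extract_tracking_params(url):
    """Split an incoming URL into domain, path slug and tracking parameters."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # maps each name to a list of values
    params = {name: query[name][0] for name in TRACKING_PARAMS if name in query}
    return {
        "domain": parts.netloc,
        "path": parts.path.lstrip("/"),
        "params": params,
    }
```

Parameters missing from the URL are simply left out of the result, so the caller can decide how to treat incomplete links.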

Traffic sources
Why did the project need to handle traffic across multiple domains?
1) Tracker admins can buy cheap or expensive traffic of varying quality. However, there is no guarantee that expensive traffic is high quality; it may still bring bots. Low-quality traffic can hurt conversion rates and may also damage domain reputation. For instance, suppose we have only one domain and purchase traffic for it through different systems. A large share of low-quality visitors can lead to the domain being marked as spam, and search engines will act accordingly: the site's ranking can suffer, and the domain owner can lose more than he could potentially earn.
2) Another reason to use multiple domains is to let the tracker administrator create campaigns and landing pages on various topics. A campaign for a mobile application and one for a laundry service should not share a domain; each domain should match the topic of its campaigns.

To make domain handling as simple as possible, we created a script that purchases domains automatically from two registrars, Dynadot and Namecheap. The tracker admin sets up a template for the domains to purchase. Connecting the domains to Amazon Route 53 and to the server IP addresses is also automatic. As a result, with minimal effort, we get working domains such as test001.taipei, test002.taipei …
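
The template step can be illustrated with a short sketch that expands a prefix and a TLD into numbered domain names like the ones above. This is only an assumption about how such a template might look; the function and its parameters are hypothetical, and the registrar purchase and DNS setup are left out:

```python
def expand_domain_template(prefix, tld, count, start=1, width=3):
    """Generate numbered domain names, e.g. test001.taipei, test002.taipei."""
    return [
        f"{prefix}{number:0{width}d}.{tld}"  # zero-pad the counter to `width` digits
        for number in range(start, start + count)
    ]


candidates = expand_domain_template("test", "taipei", count=2)
```

Each generated name would then be checked for availability and purchased through the registrar before being connected to DNS.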

Visitor redirection rules
Initially, our goal was to create a simple set of rules to determine whether a visitor is a real person or a bot. The tracker admin set up templates in the backend that were applied to each visitor, based on the following parameters: user agent, color depth, height, width, platform, app version, plugins, time zone, country, orientation. For instance, if the visitor's ‘time zone’ and ‘country’ parameters imply a time difference of more than 3 hours, we treat the visitor as a bot.

Multiple sets of rules were applied dynamically, i.e. rules took the form ‘if…, then…’. Rules were based on the following parameters: device accelerometer, browser version, mobile carrier, connection type, country, day of the week, device type and model, screen resolution, visit time, IP address, browser language, device OS, user location, device time zone, touchscreen, traffic source. The more parameters a rule can use, the more flexible the system is.
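
The two ideas above, the time zone check and the ‘if…, then…’ rules, can be sketched as follows. This is a simplified illustration, not the project's actual rule engine: the Rule class, the attribute names and the way a country's expected UTC offsets are supplied are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict  # attribute name -> set of accepted values ("if" part)
    target: str       # redirect URL ("then" part)


def looks_like_bot(browser_offset_hours, country_offsets_hours, max_diff=3):
    """Flag a visitor whose browser UTC offset is more than max_diff hours
    away from every UTC offset expected for the visitor's country."""
    closest = min(abs(browser_offset_hours - o) for o in country_offsets_hours)
    return closest > max_diff


def choose_target(rules, visitor, default):
    """Return the target of the first rule whose conditions all match."""
    for rule in rules:
        if all(visitor.get(k) in allowed for k, allowed in rule.conditions.items()):
            return rule.target
    return default


rules = [
    Rule({"country": {"TW"}, "device_type": {"mobile"}}, "https://example1.com/landing-a"),
]
visitor = {"country": "TW", "device_type": "mobile"}
target = choose_target(rules, visitor, default="https://example1.com/default")
```

Rules are evaluated in order, so more specific rules should be listed before general ones; a visitor matching no rule falls through to the default target.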

Real time stats
To monitor campaigns more effectively and take timely action, we implemented stats that update in real time. Reports are flexible and can show stats based on various sets of parameters, from basic reports to in-depth analysis.
We used a MySQL database for data storage. With small amounts of data (up to 5 million records), MySQL was enough to build real-time reports quickly. As the amount of data grew, reports took longer and longer to build. For instance, with 10 million records in the database, a report took a minute to build. Since the planned load was much higher, we implemented a more advanced solution: we integrated SphinxSearch for data indexing. It required one more powerful server, but with it calculations took minimal time, even with grouping and filtering. On a server with a 16-core CPU and 128 GB of RAM, a report over 300 million records took a minute to build.
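
To show what "grouping and filtering" means for a report, here is a small in-memory sketch. It only illustrates the shape of the computation; it is not how MySQL or SphinxSearch execute it, and the field names and the build_report function are hypothetical.

```python
from collections import defaultdict


def build_report(visits, group_by, filters=None):
    """Count visits and conversions per group, after applying equality filters."""
    filters = filters or {}
    report = defaultdict(lambda: {"visits": 0, "conversions": 0})
    for visit in visits:
        # Skip visits that do not match every filter.
        if any(visit.get(name) != value for name, value in filters.items()):
            continue
        key = tuple(visit.get(name) for name in group_by)
        report[key]["visits"] += 1
        if visit.get("converted"):
            report[key]["conversions"] += 1
    return dict(report)


visits = [
    {"country": "TW", "device_type": "mobile", "converted": True},
    {"country": "TW", "device_type": "mobile", "converted": False},
    {"country": "US", "device_type": "desktop", "converted": True},
]
mobile_by_country = build_report(visits, ["country"], {"device_type": "mobile"})
```

A full scan like this is linear in the number of records, which is why an index-backed engine becomes necessary at hundreds of millions of rows.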

Monitoring of revenue and costs
Tracking financial information was an important part of the project as well. We integrated conversion calculations for the CPC, CPA, CPM, and RevShare models. The tracker allows creating a template for a specific partner network or even a specific campaign.
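
The four pricing models differ only in what the rate is multiplied by. A minimal sketch using the standard definitions (CPC: per click, CPA: per action or conversion, CPM: per 1,000 impressions, RevShare: a fraction of revenue); the function name and its parameters are hypothetical and do not describe the tracker's real templates:

```python
def amount_for_model(model, rate, *, clicks=0, conversions=0, impressions=0, revenue=0.0):
    """Return the amount due under a given pricing model."""
    if model == "CPC":
        return rate * clicks              # rate per click
    if model == "CPA":
        return rate * conversions         # rate per conversion (action)
    if model == "CPM":
        return rate * impressions / 1000  # rate per 1,000 impressions
    if model == "RevShare":
        return rate * revenue             # rate is a fraction, e.g. 0.25 for 25%
    raise ValueError(f"unknown pricing model: {model}")


cpm_amount = amount_for_model("CPM", 2.0, impressions=5000)
```

A per-network or per-campaign template would then amount to a stored choice of model and rate that is passed to a function like this one.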