On Nov 11, 2014 10:07 PM, "Andrew Groh" <agroh@xxxxxxxxxxxxxxxx> wrote:
>
> My company is in the process of designing and implementing a bidder for real-time advertising.
>
> Let me describe the problem briefly.
>
> A lot of online advertising (mobile, display, video) is sold through real-time exchanges. The exchange calls bidders every time an ad is needed on a web page, and each bidder responds with the price it is willing to pay, along with the ad to show if that bidder wins. The exchange then chooses the highest bid and shows that ad.
>
> Typically, the bidder is passed information like user id, IP address, geographic area, user agent, browser name (just parsed from the user agent), browser version, etc.
>
> The bidder then looks at that request, compares it to all the ad campaigns it has, sees which campaigns are eligible to bid, computes a bid, and returns this bid. In our case (as is pretty typical) the bid price is computed from a matrix of coefficients that weight how much the ad is worth in this context. This coefficient matrix would all be in memory and the calculation would be fairly simple.
>
> So you might be running two ad campaigns, one which requires Chrome and can only run in the US, and another which can only run in NY/NJ but on any browser. In our case, this targeting would be specified in a web app by a user.
>
> The kicker here is that this all has to be really performant. In the US, we would probably need to handle 1 million bid requests per second and respond to each bid request in less than 100 milliseconds. Now, given that our company does not have infinite resources, I want to build a bidder that is really fast, so that we do not have to buy too many machines to operate this. Ideally each machine would be able to handle 10,000 queries per second, so I would then only need 100 machines to handle the traffic (figure these are bare-metal Linux boxes, not virtual machines).
>
> (Note that I have built a bidder for a previous company that could handle 2000 QPS per machine and did pretty similar things.)
>
> Sorry for the long setup, but I wanted to lay out some of the problem.
>
> So my idea for building this is a combination of nginx, C (and/or C++), and LuaJIT.
>
> Nginx would handle the HTTP requests and parameter parsing; C would be glue code/bookkeeping/utilities to be really fast.
>
> We would then GENERATE Lua code from our web app that would be sent to the bidder, where it would be JIT'd and really fast (I hope).
> Generated code for the above example might look like
>
> -- conceptually the generated code would look something like this:
> -- state, country, and browser would be variables that the C code defines and passes in to the Lua code
> -- hopefully the syntax is correct in my example
> runCampaign1 = country == "US" and browser == "chrome"
> runCampaign2 = state == "NY" or state == "NJ"
>
> Later on I would have code that would calculate prices for any eligible campaigns (based on the booleans just created) and pick the highest-priced campaign and return it.
>
> What I am wondering is if this sounds like a good idea. I feel like generating code in some language which is then compiled could be really fast, and LuaJIT seems like a good fit. Doing this in an interpreted language will be too slow for my needs.
>

I know people who are doing similar ad-serving things with LuaJIT and nginx. Not sure they generate code. Also, CloudFlare runs all their traffic through LuaJIT and Nginx. So you are in known territory.

Justin
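
For concreteness, here is a minimal sketch of what the generated Lua from the quoted example could look like once the eligibility booleans are combined with a price lookup. It is only an illustration under stated assumptions: the function names (eligible_campaigns, pick_winner), the request table layout, and the price values are all hypothetical and do not come from the original message, and the fixed prices merely stand in for the coefficient-matrix calculation described there.

```lua
-- Hypothetical sketch: eligible_campaigns, pick_winner, the req table layout,
-- and the prices below are illustrative assumptions, not from the original post.

-- Targeting rules as they might be generated by the web app.
-- req is assumed to be filled in by the C glue code for each bid request.
local function eligible_campaigns(req)
  return {
    campaign1 = req.country == "US" and req.browser == "chrome",
    campaign2 = req.state == "NY" or req.state == "NJ",
  }
end

-- Return the name and price of the highest-priced eligible campaign,
-- or nil if no campaign is eligible. The prices table is a placeholder
-- for the coefficient-matrix calculation.
local function pick_winner(eligible, prices)
  local best_name, best_price = nil, -math.huge
  for name, ok in pairs(eligible) do
    if ok and prices[name] > best_price then
      best_name, best_price = name, prices[name]
    end
  end
  return best_name, best_name and best_price or nil
end

-- Example request: a Chrome user in New York.
local req = { country = "US", state = "NY", browser = "chrome" }
local prices = { campaign1 = 1.25, campaign2 = 0.80 }
local name, price = pick_winner(eligible_campaigns(req), prices)
print(name, price)
```

Keeping the request fields in a table passed as an argument, rather than as globals, is a design choice made for this sketch; local variables and table fields generally suit LuaJIT better than global lookups, but that should be measured on the real workload.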