Today, I had the privilege of presenting my team's work at USENIX LISA 15. TubeMogul grew from a few servers to over two thousand, handling over one trillion HTTP requests a month, each processed in less than 50 ms. To keep up with this fast growth, the SRE team had to implement an efficient Continuous Delivery infrastructure, which enabled over 10,000 Puppet deployments and 8,500 application deployments in 2014. In this presentation, we cover the nuts and bolts of the TubeMogul operations engineering team and how they overcame their challenges.
As you probably know, on Tuesday, June 30th, a leap second will be added and the day will last 86,401 seconds. Given the Leap Second Bug of 2012, this event could have a significant impact on the internet and on computer systems in general. What does it mean, and what should you know to be better prepared for it?
In the end, there’s only one solution: Hitting the Earth with asteroids.
During Puppet Camp Paris, I had the privilege of presenting the Continuous Delivery workflow of TubeMogul’s Operations Engineering Team. In a few years, we went from a few servers to over two thousand nodes fully managed by Puppet. In our presentation, we went over the challenges we faced, as well as how we implemented our workflow to improve our day-to-day operations while still moving fast.
The International Earth Rotation and Reference Systems Service (IERS) announced that a positive leap second will be introduced on the last day of June 2015 (Official Bulletin C 49), making that day 86,401 seconds long.
In 2012, a similar event caused major outages across most of the internet, with only a few services avoiding problems. See this Forbes post from July 2012: +1: Google Aces ‘Leap Second’ While Reddit, LinkedIn And More Went Down Saturday.
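To make the 86,401-second day concrete, here is a minimal Python sketch of the arithmetic and of how the extra second, written 23:59:60, is treated by two standard-library APIs. The helper names `seconds_in_day` and `datetime_accepts` are made up for this illustration, and the snippet only shows how timestamps are represented. It says nothing about how a kernel or NTP daemon behaves during the event.

```python
import time
from datetime import datetime

# A normal day: 24 hours * 60 minutes * 60 seconds.
SECONDS_PER_NORMAL_DAY = 24 * 60 * 60


def seconds_in_day(leap_seconds_added: int = 0) -> int:
    """Length of a day in seconds, given the leap seconds inserted into it."""
    return SECONDS_PER_NORMAL_DAY + leap_seconds_added


def datetime_accepts(second: int) -> bool:
    """Check whether datetime can represent 2015-06-30 23:59:<second>."""
    try:
        datetime(2015, 6, 30, 23, 59, second)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59.
        return False


# The leap second is written as 23:59:60. time.strptime tolerates it,
# because %S accepts values up to 61.
parsed = time.strptime("2015-06-30 23:59:60", "%Y-%m-%d %H:%M:%S")

print(seconds_in_day(1))     # length of June 30th, 2015
print(datetime_accepts(60))  # datetime cannot hold the leap second
print(parsed.tm_sec)         # struct_time can
```

The mismatch between the two APIs is a small illustration of why the extra second is awkward for software: many time representations simply have no slot for it.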