By Nathan Milford
As a two-sided marketplace that hinges on both customers and contributors around the world, Shutterstock must design, implement, and control a behind-the-scenes workflow system that promises both speed and reliability. When it works best, the customer doesn’t even know that something is going on.
To ensure that our customers find what they’re searching for as quickly as possible, Shutterstock relies on content delivery networks (CDNs) as an integral part of our architecture. Although many companies set up similar processes to keep global customers happy, we have another clientele to consider: our contributors.
Shutterstock is home to millions of creative assets that were created by amateur and semi-professional photographers, filmmakers and musicians around the world. Many of them work on nights and weekends, and they expect our databases to support them wherever they are and whenever they’re submitting.
With so much data flowing in and out at once, storage can become a challenge. Images, video clips and audio clips are large, and they’re constantly being uploaded to and stored in our data centers.
CDNs cache lots of small assets around the world all day long, but you have to be ready for what’s coming next. Here’s a glimpse into how we’re assessing our CDN setup going forward.
For starters, the growth of our CDN use is tied not only to organic growth, but also to engineering decisions we make in terms of the size and quantity of assets we store and provide, as well as to technology improvements. For example, higher-quality displays beget higher-quality images that must be pushed out to the edge.
Practically, that means we must regularly broaden our scope and think even bigger.
· In 2013, Shutterstock.com did 3.6 petabytes of CDN traffic.
· In 2014, it rose to 13.8 petabytes (a 283 percent increase).
· At our current pace, we expect we’ll do around 28 petabytes in 2015 (a further 102 percent increase).
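The percentages above follow from the standard year-over-year formula, (new - old) / old. A minimal sketch to check the arithmetic, using only the traffic figures quoted in the list (the 2015 value is the projection, not a measurement, and the function name is just illustrative):

```python
def percent_increase(old, new):
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100


# CDN traffic in petabytes, as quoted above; 2015 is a projection.
traffic_pb = {2013: 3.6, 2014: 13.8, 2015: 28.0}

years = sorted(traffic_pb)
growth = {
    curr: percent_increase(traffic_pb[prev], traffic_pb[curr])
    for prev, curr in zip(years, years[1:])
}

for year, pct in growth.items():
    print(f"{year}: {pct:.1f}% increase over {year - 1}")
# 2014: 283.3% increase over 2013
# 2015: 102.9% increase over 2014
```

Since the 2015 figure is "around 28 petabytes," the second percentage is only as precise as that estimate.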
As the organization grows into new areas and deeper into higher-quality video, storage becomes a bigger concern. Up until now, we have relied primarily on our relationship with Akamai as a CDN to serve our images out to the world.
However, we made that selection when we were mostly known as an image provider. And, at that time, it was hard to imagine what a 28-petabyte company would look like—much less what kind of storage operation it might require.
This underscores the importance of our getting a handle now on the cost and sprawl of CDNs, when we can still anticipate and prepare for immense growth. Without suitable and reliable back-end support, our engineering and product teams would surely run into undesired resistance down the line.
To ward off any problems, our infrastructure team meets regularly with directors across the technology and product sides to make sure we’re all aligned.
In addition to fostering good communication and collaboration, this process helps keep costs, and worries, down. We won’t have to fear repairing unexpected glitches in systems that break under the pressures of an evolving 24/7 e-commerce site. We plan to invest ahead of time for the long-term good.
Leveraging a Decision Engine
With that in mind, early this year, the infrastructure platform team implemented a service called Cedexis OpenMix, which will help us rein in the situation and give us improved capabilities to boot. OpenMix is essentially a CDN load-balancer that leverages Cedexis’ decision-making engine. The system enables us to optimize with multiple CDN providers based on their strengths.
We can, among other things:
· Move off legacy CDN setups and have some uniformity across business units, which will give us more bang for our buck;
· Switch, automatically, to a different CDN if one has an outage or other problems;
· Switch, automatically, to a different CDN when we are about to exceed our committed usage, in order to prevent overage fees;
· Optimize for each CDN’s strengths, including file size and geographies; and
· Put traffic that’s in slower markets on slower, less expensive CDNs.
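The rules in the list above can be sketched as a small selection function. This is an illustration only, not the OpenMix API or our production logic: the `Cdn` fields, the `choose_cdn` function, the provider names and all numbers are hypothetical, and optimizing by file size is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Cdn:
    name: str
    healthy: bool        # False during an outage
    used_tb: float       # traffic served so far this billing period
    commit_tb: float     # committed volume before overage fees apply
    cost_per_gb: float
    latency_ms: dict     # region -> measured latency (hypothetical data)


def choose_cdn(cdns, region, budget_market=False):
    """Pick a CDN for a request from the given region."""
    # Rule 1: never route to a provider that is having an outage.
    candidates = [c for c in cdns if c.healthy]
    if not candidates:
        raise RuntimeError("no healthy CDN available")

    # Rule 2: prefer providers still under their committed volume,
    # falling back to any healthy provider if all are over.
    under_commit = [c for c in candidates if c.used_tb < c.commit_tb]
    if under_commit:
        candidates = under_commit

    # Rule 3: slower markets go to the cheapest provider;
    # everywhere else goes to the fastest one for that region.
    if budget_market:
        return min(candidates, key=lambda c: c.cost_per_gb)
    return min(candidates, key=lambda c: c.latency_ms.get(region, float("inf")))


cdns = [
    Cdn("cdn-a", True, 900, 1000, 0.05, {"eu": 40, "sa": 120}),
    Cdn("cdn-b", True, 200, 500, 0.02, {"eu": 70, "sa": 150}),
    Cdn("cdn-c", False, 0, 500, 0.01, {"eu": 20}),
]

print(choose_cdn(cdns, "eu").name)                      # cdn-a
print(choose_cdn(cdns, "sa", budget_market=True).name)  # cdn-b
```

In this sketch, "cdn-c" is skipped despite having the best latency because it is marked unhealthy, and if "cdn-a" reached its committed volume, "eu" traffic would shift to "cdn-b".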
In sum, we are letting the market compete for our CDN traffic, which gives us more cost control and flexibility. Once these concerns are out of the way, we can reinvest that money elsewhere in order to improve our product for the customer.
Nathan Milford is the director of infrastructure at Shutterstock, where he is building the foundation for a globally distributed, multi-backend storage system to house and serve hundreds of petabytes of images, music and video. He has built large-scale distributed systems, large-volume data projects, scalable architectures and open source software.