How Facebook Is Handling All That Big Data, and What Tech Stacks Are Behind It

#facebook #facebookdata

Deepak Mishra Jan 10 2022 · 2 min read

Facebook is much like the Starship Enterprise in that it likes to go places no company has gone before.


This is probably because not many IT companies, especially young ones, have had to serve upwards of 950 million registered users daily — including a high percentage on a real-time basis. Not many have to sell advertising to about 1 million customers while also keeping dozens of new products in the works, all at the same time.

Facebook Does Involve Big Data

Big data is simply about having insight and using it to make an impact on your business. It really is that simple: if you aren't taking advantage of the data you are collecting and keeping in your business, then all you have is a pile of data. "We are getting more and more interested in doing things with the data we are collecting."

Facebook doesn’t always know what it wants to do with the user lists, Web statistics, geographic information, photos, stories, messages, Web links, videos and everything else the company collects. "But we want to collect everything; we want to instrument everything: cameras, when that door opens and closes, the temperature in this room, who walks in and out of the lobby."

Facebook launched the Open Compute Project (OCP) on April 7, 2011. It is an unprecedented attempt to open-source the specifications the company uses for its hardware and data centers to efficiently power a social network of more than 950 million people.

As part of the project, Facebook has published specs and mechanical designs used to construct the motherboards, power supplies, server chassis, and server and battery cabinets for its data center. That’s unprecedented enough for a company of Facebook’s growing scale, but the social network is also open-sourcing specs for its data center’s electrical and mechanical construction.

The move is surprising because Facebook closely guards the information inside its walled garden. It has had to endure its share of flak from users over how it handles the personal information the company relies on to earn income.

Key Storage Rule: Facebook Does Not Partition Data

Above and beyond all the well-documented security headaches Facebook has faced is the continuing battle it wages with the enormous amount of data flowing into Prineville and the other data centers it rents.

One thing Facebook established early on is that its data infrastructure is shared across the entire company, with some constraints on user access. "The challenge here is that it is not easily partitionable. We've hit the scaling limitations of this system, mainly because of our growth and because we try to keep it all together."

"A lot of times companies take the easy way out and say, 'OK, it's time to partition because we can't do this. We'll just separate this team from that team; we'll take the bigger thing and divide it into smaller and smaller pieces over time, and that's how we'll manage scale.'"

But breaking up a centralized IT system into smaller parts simply adds more complexity, cost and staff time.
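To make the trade-off concrete, here is a minimal sketch of what hash-based partitioning (sharding) looks like. All names are hypothetical and this is not Facebook's actual system; it only illustrates why splitting a single store into shards forces a routing layer in front and makes cross-partition queries more complex.

```python
# Illustrative sketch of hash-based partitioning (sharding).
# Hypothetical names throughout -- not Facebook's infrastructure.

import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    """Map a key to one of NUM_SHARDS partitions via a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Each shard is a separate store; every read and write must now be routed.
shards = [dict() for _ in range(NUM_SHARDS)]

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:alice", "profile-data")
put("user:bob", "profile-data")

# The complexity cost: a query that once scanned one store must now
# fan out to every shard and merge the results.
all_users = [k for shard in shards for k in shard if k.startswith("user:")]
```

This is the "divide it into smaller pieces" approach the quote describes; the fan-out at the end is exactly the kind of friction a unified, shared infrastructure avoids.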

"That has been unacceptable for us here at Facebook. That's not how our product works, that's not how our team works, and that is a unique thing about how we work and how we face these challenges."

"High-volume financial trading systems, for example, are now down to micro- or even nanosecond response times. That is the kind of competitive advantage hedge funds are getting now by being able to process large volumes of data in near real time."

This goes back to the unified-system approach. "You shouldn't have any friction that prevents somebody from accessing another organization's data that's going to help you drive more sales or better efficiency."
