Data-Infrastructure
Meta
Tue Nov 19 2024
Sequence learning: A paradigm shift for personalized ads recommendations
AI plays a fundamental role in creating valuable connections between people and advertisers within Meta’s family of apps.
Security
Tue Nov 12 2024
How Meta built large-scale cryptographic monitoring
Cryptographic monitoring at scale has been instrumental in helping our engineers understand how cryptography is used at Meta.
Culture
Fri Oct 25 2024
Diff Authoring Time: Measuring developer productivity at Meta
At Meta, we’re always looking for ways to enhance the productivity of our engineers and developers.
Tue Oct 22 2024
IPLS: Privacy-preserving storage for your WhatsApp contacts
Your contact list is fundamental to the experiences you love and enjoy on WhatsApp.
Data-Center-Engineering
Tue Oct 15 2024
OCP Summit 2024: The open future of networking hardware for AI
At the Open Compute Project (OCP) Summit 2024, we’re sharing details about our next-generation network fabric for our AI training clusters.
Meta’s open AI hardware vision
At the Open Compute Project (OCP) Global Summit 2024, we’re showcasing our latest open AI hardware designs with the OCP community.
AI-Research
Thu Oct 03 2024
How open source AI can improve population estimates, sustainable energy, and the delivery of climate change interventions
Data for Good at Meta is open-sourcing the data used to train our AI-powered population maps.
Android
Wed Oct 02 2024
React at Meta Connect 2024
At Meta, React and React Native are more than just tools; they are integral to our product development and innovation.
Tue Sep 17 2024
Inside Bento: Jupyter Notebooks at Meta
This episode of the Meta Tech Podcast is all about Bento, Meta’s internal distribution of Jupyter Notebooks, an open-source web-based comput...
Tue Sep 10 2024
Simulator-based reinforcement learning for data center cooling optimization
We’re sharing more about the role that reinforcement learning plays in helping us optimize our data centers’ environmental controls.
Wed Sep 04 2024
Read Meta’s 2024 Sustainability Report
Wed Aug 28 2024
Meta is getting ready for post-quantum cryptography
The Quantum Apocalypse is coming.
Tue Aug 27 2024
How Meta enforces purpose limitation via Privacy Aware Infrastructure at scale
At Meta, we’ve been diligently working to incorporate privacy into different systems of our software stack over the past few years.
Mon Aug 26 2024
RETINAS: Real-Time Infrastructure Accounting for Sustainability
We are introducing a new metric, real-time server fleet utilization effectiveness, as part of the RETINAS initiative to help reduce emission...
Fri Aug 23 2024
How PyTorch powers AI training and inference
Learn about new PyTorch advancements for LLMs and how PyTorch is enhancing every aspect of the LLM lifecycle.
Thu Aug 22 2024
Inside the hardware and co-design of MTIA
In this talk from AI Infra @ Scale 2024, Joel Colburn, a software engineer at Meta, technical lead Junqiang Lan, and software engineer Jack ...
Wed Aug 21 2024
Bringing Llama 3 to life
Llama 3 is Meta’s most capable openly-available LLM to date and the recently-released Llama 3.
Tue Aug 20 2024
Aparna Ramani discusses the future of AI infrastructure
Delivering new AI technologies at scale also means rethinking every layer of our infrastructure – from silicon and software systems and even...
Wed Aug 14 2024
How Meta animates AI-generated images at scale
We launched Meta AI with the goal of giving people new ways to be more productive and unlock their creativity with generative AI (GenAI).
Mon Aug 05 2024
RoCE networks for distributed AI training at scale
AI networks play an important role in interconnecting tens of thousands of GPUs together, forming the foundational infrastructure for traini...
DCPerf: An open source benchmark suite for hyperscale compute applications
We are open-sourcing DCPerf, a collection of benchmarks that represents the diverse categories of workloads that run in data center cloud de...
ML-Applications
Thu Jul 18 2024
Meet Caddy – Meta’s next-gen mixed reality CAD software
What happens when a team of mechanical engineers get tired of looking at flat images of 3D models over Zoom? Meet the team behind Caddy, a n...
DevInfra
Tue Jul 16 2024
AI Lab: The secrets to keeping machine learning engineers moving fast
The key to developer velocity across AI lies in minimizing time to first batch (TTFB) for machine learning (ML) engineers.
Wed Jul 10 2024
Taming the tail utilization of ads inference at Meta scale
Tail utilization is a significant system issue and a major factor in overload-related failures and low compute utilization.
Meta’s approach to machine learning prediction robustness
Meta’s advertising business leverages large-scale machine learning (ML) recommendation models that power millions of ads recommendations per...
Tue Jun 25 2024
The key to a happy Rust/C++ relationship
The history of Rust at Meta goes all the way back to 2016, when we first started using it for source control.
Mon Jun 24 2024
Leveraging AI for efficient incident response
We’re sharing how we streamline system reliability investigations using a new AI-assisted root cause analysis system.
Wed Jun 19 2024
PVF: A novel metric for understanding AI systems’ vulnerability against SDCs in model parameters
We’re introducing parameter vulnerability factor (PVF), a novel metric for understanding and measuring AI systems’ vulnerability against sil...
Web
Thu Jun 13 2024
MLow: Meta’s low bitrate audio codec
At Meta, we support real-time communication (RTC) for billions of people through our apps, including WhatsApp, Instagram, and Messenger.
Wed Jun 12 2024
How Meta trains large language models at scale
As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challengin...
Maintaining large-scale AI capacity at Meta
Meta is currently operating many data centers with GPU training clusters across the world.
Core-Infra
Tue Jun 11 2024
Unlocking the power of mixed reality devices with MobileConfig
MobileConfig enables developers to centrally manage a mobile app’s configuration parameters in our data centers.
Mon Jun 10 2024
Serverless Jupyter Notebooks at Meta
At Meta, Bento, our internal Jupyter notebooks platform, is a popular tool that allows our engineers to mix code, text, and multimedia in a ...
Wed May 22 2024
Composable data management at Meta
In recent years, Meta’s data management systems have evolved into a composable architecture that creates interoperability, promotes reusabil...
Post-quantum readiness for TLS at Meta
Today, the internet (like most digital infrastructure in general) relies heavily on the security offered by public-key cryptosystems such as...
Tue May 14 2024
Behind the scenes of Threads for web
When Threads first launched, one of the top feature requests was for a web client.
Thu Apr 11 2024
Building new custom silicon for Meta’s AI workloads
Building an infrastructure for AI’s future
Wed Apr 10 2024
Introducing the next-gen Meta Training and Inference Accelerator
Tue Mar 26 2024
Bringing HDR photo support to Instagram and Threads
Meta’s family of apps serves trillions of image download requests every day.
Networking-and-Traffic
Thu Mar 21 2024
Threads has entered the fediverse
Threads has entered the fediverse! As part of our beta experience, now available in a few countries, Threads users aged 18+ with public prof...
Wed Mar 20 2024
Optimizing RTC bandwidth estimation with machine learning
Bandwidth estimation (BWE) and congestion control play an important role in delivering high-quality real-time communication (RTC) across Met...
Video-Engineering
Better video for mobile RTC with AV1 and HD
At Meta, we support real-time communication (RTC) for billions of people through our apps, including Messenger, Instagram, and WhatsApp.
Mon Mar 18 2024
Logarithm: A logging engine for AI training workflows and services
Systems and application logs play a key role in operations, observability, and debugging workflows at Meta.
Tue Mar 12 2024
Building Meta’s GenAI Infrastructure
Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters.
Wed Mar 06 2024
Making messaging interoperability with third parties safe for users in Europe
To comply with a new EU law, the Digital Markets Act (DMA), which comes into force on March 7th, we’ve made major changes to WhatsApp and Me...
Mon Feb 26 2024
How DotSlash makes executable deployment simpler
Andres Suarez and Michael Bolin, two software engineers at Meta, join Pascal Hartig (@passy) on the Meta Tech Podcast to discuss the ins and...
Tue Feb 20 2024
Aligning Velox and Apache Arrow: Towards composable data management
We’ve partnered with Voltron Data and the Arrow community to align and converge Apache Arrow with Velox, Meta’s open source execution engine...
Mon Feb 12 2024
Meta loves Python
By now you’re already aware that Python 3.
Connectivity
Wed Feb 07 2024
Simple Precision Time Protocol at Meta
While deploying Precision Time Protocol (PTP) at Meta, we’ve developed a simplified version of the protocol (Simple Precision Time Protocol ...
Tue Feb 06 2024
DotSlash: Simplified executable deployment
We’ve open sourced DotSlash, a tool that makes large executables available in source control with a negligible impact on repository size, th...
Mon Jan 29 2024
Improving machine learning iteration speed with faster application build and packaging
Slow build times and inefficiencies in packaging and distributing execution files were costing our ML/AI engineers a significant amount of t...
Thu Jan 18 2024
Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta
At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime.
Thu Jan 11 2024
How Meta is advancing GenAI
What’s going on with generative AI (GenAI) at Meta? And what does the future have in store? In this episode of the Meta Tech Podcast, Meta e...
Tue Dec 19 2023
How Meta built the infrastructure for Threads
On July 5, 2023, Meta launched Threads, the newest product in our family of apps, to an unprecedented success that saw it garner over 100 mi...
AI debugging at Meta with HawkEye
HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning ...
Thu Dec 07 2023
Building end-to-end security for Messenger
We are beginning to upgrade people’s personal conversations on Messenger to use end-to-end encryption (E2EE) by default.
Production-Engineering
Tue Nov 21 2023
Writing and linting Python at scale
Python plays a big part at Meta.
Wed Nov 15 2023
Watch: Meta’s engineers on building network infrastructure for AI
Meta is building for the future of AI at every level – from hardware like MTIA v1, Meta’s first-generation AI inference accelerator, to publi...
Wed Nov 08 2023
Enhancing the security of WhatsApp calls
New optional features in WhatsApp have helped make calling on WhatsApp more secure.
Mon Nov 06 2023
How Meta built Threads in 5 months
In about five short months, a small team of engineers at Meta took Threads, the new text-based conversations app, from an idea to the m...
Tue Oct 31 2023
Automating data removal
Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing unused data types.
Tue Oct 24 2023
Automating dead code cleanup
Meta’s Systematic Code and Asset Removal Framework (SCARF) has a subsystem for identifying and removing dead code.
Mon Oct 23 2023
5 Things you didn’t know about Buck2
Meta has a very large monorepo, with many different programming languages.
Wed Oct 18 2023
How Meta is creating custom silicon for AI
Olivia Wu, Meta’s Technical Lead for Infra Silicon, discusses the design and development of Meta’s first-generation AI inference accelerator...
Tue Oct 17 2023
Automating product deprecation
Systematic Code and Asset Removal Framework (SCARF) is Meta’s unused code and data deletion framework.
Thu Oct 05 2023
Meta contributes new features to Python 3.12
Python 3.
Tue Sep 12 2023
Meta Quest 2: Defense through offense
Meta’s Native Assurance team regularly performs manual code reviews as part of our ongoing commitment to improve the security posture of Met...
Thu Sep 07 2023
Using Chakra execution traces for benchmarking and network performance optimization
Meta presents Chakra execution traces, an open graph-based representation of AI/ML workload execution, laying the foundation for benchmarkin...
Arcadia: An end-to-end AI system performance simulator
We’re introducing Arcadia, Meta’s unified system that simulates the compute, memory, and network performance of AI training clusters.
Threads: The inside story of Meta’s newest social app
Earlier this year, a small team of engineers at Meta started working on an idea for a new app.
Tue Sep 05 2023
What’s it like to write code at Meta?
Ever wonder what it’s like to write code at Meta’s scale? On the latest episode of the Meta Tech Podcast, Meta engineer Pascal Hartig (@pass...
Tue Aug 29 2023
Scheduling Jupyter Notebooks at Meta
At Meta, Bento is our internal Jupyter notebooks platform that is leveraged by many internal users.
Thu Aug 24 2023
Code Llama: Meta’s state-of-the-art LLM for coding
Tue Aug 15 2023
Introducing Immortal Objects for Python
Instagram has introduced Immortal Objects – PEP-683 – to Python.
Mon Aug 14 2023
Meta Connect 2023: September 27 – 28
Wed Aug 09 2023
Scaling the Instagram Explore recommendations system
Explore is one of the largest recommendation systems on Instagram.
Tue Aug 08 2023
How Meta is improving password security and preserving privacy
Meta is developing new privacy-enhancing technologies (PETs) to innovate and solve problems with less data.
Mon Aug 07 2023
Fixit 2: Meta’s next-generation auto-fixing linter
Fixit is dead! Long live Fixit 2 – the latest version of our open-source auto-fixing linter.
Using short-lived certificates to protect TLS secrets
Short-lived certificates (SLCs) are part of our latest efforts to further secure our Transport Layer Security (TLS) private keys on our edge...
Mon Jul 17 2023
Bringing HDR video to Reels
Meta has made it possible for people to upload high dynamic range (HDR) videos from their phone’s camera roll to Reels on Facebook and Insta...
Thu Jun 29 2023
Meta’s Evenstar is transitioning to OCP to accelerate open RAN adoption
Meta is transferring its IP for Evenstar, a program to accelerate the adoption of open RAN technologies, to the Open Compute Project (OCP).
Tue Jun 27 2023
Meta developer tools: Working at scale
Every day, thousands of developers at Meta are working in repositories with millions of files.
Mon May 22 2023
Bombyx is being licensed for product development
When we first conceived of our aerial fiber deployment solution, Bombyx (the Latin name for a silk moth), we imagined a robot weaving strand...
Thu May 18 2023
MSVP is Meta’s first video processing ASIC
Meta introduces its first-generation AI inference accelerator
Tue May 16 2023
Building and deploying MySQL Raft at Meta
We’re rolling out MySQL Raft with the aim to eventually replace our current MySQL semisynchronous databases.
Wed May 03 2023
The malware threat landscape: NodeStealer, DuckTail, and more
We’re sharing our latest threat research and technical analysis into persistent malware campaigns targeting businesses across the internet, ...
Mon Apr 17 2023
A fine-grained network traffic analysis with Millisampler
What the research is: Millisampler is one of Meta’s latest characterization tools and allows us to observe, characterize, and debug network...
Thu Apr 13 2023
Deploying key transparency at WhatsApp
WhatsApp has launched a new cryptographic security feature to automatically verify a secured connection based on key transparency.
How Device Verification protects your WhatsApp account
WhatsApp has launched a new security feature that further helps prevent attackers from using vectors like on-device malware.
Tue Apr 11 2023
Why xHE-AAC is being embraced at Meta
We’re sharing how Meta delivers high-quality audio at scale with the xHE-AAC audio codec.