Elon Musk on Friday fulfilled a promise to release some of the code underlying how Twitter filters and presents the information that users see.
Twitter released computer code it called “the algorithm” on GitHub, revealing some of the inner workings of the black box that determines which tweets a user sees, a process far more complicated than simply showing all tweets from the people you follow.
“Most of the recommendation algorithm” was released and “the rest will follow,” Musk said on the platform he owns. “No doubt, many embarrassing issues will be discovered, but we will fix them fast!” he added.
The algorithm is important because Musk has chosen not to simply present a chronological feed from the accounts a user follows, an approach that would also require far less staff. Instead, he has said he will freely de-amplify content. That means that while users might not be censored outright the way they would have been under the old regime, their posts could reach significantly fewer people without their realizing anything has happened, a form of soft “shadow-banning.”
Engineers wrote that the “For You” feed draws half from people you follow and half from people you don’t. “This requires a recommendation algorithm to distill the roughly 500 million Tweets posted daily down to a handful of top Tweets that ultimately show up,” they explained.
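The half-and-half blending the engineers describe can be sketched as a simple candidate mixer. This is an illustrative sketch only, not Twitter’s actual code: it assumes two candidate lists already ranked by relevance and interleaves them so the final feed alternates between in-network and out-of-network tweets.

```python
# Hypothetical sketch of "For You" feed blending, NOT the released code.
# Assumes each candidate pool is already ranked best-first.

def blend_for_you(in_network, out_of_network, feed_size=10):
    """Build a feed that is half followed accounts, half unfollowed ones."""
    half = feed_size // 2
    feed = []
    # Alternate sources so neither pool dominates the top of the feed.
    for followed, unfollowed in zip(in_network[:half], out_of_network[:half]):
        feed.extend([followed, unfollowed])
    return feed[:feed_size]
```

The real pipeline ranks candidates with machine learning models before any blending; the function above only illustrates the stated 50/50 composition.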
The code also deals with “Trust and Safety,” which has in the past been used by Twitter to hide information such as the Hunter Biden laptop scandal just before the 2020 election.
However, the code contains only logic and machine learning models, not the data that powers them, so it is of limited use. For example, you can see code building the “Model to detect toxic tweets. Toxicity includes marginal content like insults and certain types of harassment. Toxic content does not violate Twitter terms of service.”
In part, it looks for keywords dealing with “politics,” “insults,” and “race,” but you can’t see what those keywords are. Many such factors combine into a “toxicity score” assigned to each tweet.
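The idea of combining keyword categories into a single score can be illustrated as follows. Everything here is invented for illustration: the real model is a trained classifier, its keyword lists are not in the released code, and the category weights below are placeholders.

```python
# Hypothetical keyword-category scoring, NOT Twitter's actual model.
# Weights and categories are invented; the real keyword lists are private.

KEYWORD_WEIGHTS = {
    "politics": 0.2,
    "insults": 0.5,
    "race": 0.3,
}

def toxicity_score(category_hits):
    """Combine per-category keyword hit counts into a score in [0, 1].

    category_hits maps a category name to how many of its (non-public)
    keywords matched the tweet's text.
    """
    raw = sum(KEYWORD_WEIGHTS.get(category, 0.0) * hits
              for category, hits in category_hits.items())
    return min(raw, 1.0)  # clamp so the score stays interpretable
```

A linear weighted sum like this is only a stand-in; the released code indicates a machine learning model produces the actual score.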
The code shows that Twitter keeps track of Democrats, Republicans, important users (called VITs), and Musk himself, saying, “These author ID lists are used purely for metrics collection. We track how often we are serving Tweets from these authors and how often their tweets are being impressed by users. This helps us validate in our A/B experimentation platform that we do not ship changes that negatively impacts one group over others.”
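The metrics-only tracking that comment describes, counting how often tweets from each tracked cohort are served so experiments can be checked for disparate impact, can be sketched like this. The cohort names follow the article, but the IDs and the function itself are placeholders, not the real lists or code.

```python
# Illustrative sketch of per-cohort serve counting, NOT the released code.
# Author IDs below are made up; the real lists are not public.

from collections import Counter

COHORTS = {
    "democrats": {101, 102},
    "republicans": {201, 202},
    "vits": {301},  # "very important tweeters"
}

def record_served(served_author_ids, counts=None):
    """Tally how many served tweets came from each tracked cohort."""
    counts = counts if counts is not None else Counter()
    for author_id in served_author_ids:
        for cohort, members in COHORTS.items():
            if author_id in members:
                counts[cohort] += 1
    return counts
```

Comparing these tallies between the control and treatment arms of an A/B test is what would let engineers verify that a change “does not ship changes that negatively impact one group over others,” as the comment puts it.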
It also references blacklisted topics and searches.
Under “SpaceSafetyLabels,” it includes DoNotAmplify, CoordinatedHarmfulActivity, CivicIntegrityMisinfo, MedicalMisinfo, GenericMisinfo, HighToxicity, and UkraineCrisis.
It lists as “Deprecated” safety labels named MisinfoCovid19, MsinfoBrazilianElection, MsnfoCovid19Vaccine, MsnfoFrenchElection, MsnfoPhilippineElection, and MsnfoUsElection.
It sets a “tweet safety level” based on factors like TweetContainsHatefulConductSlurLowSeverity.
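How named labels like these might roll up into a single “tweet safety level” can be sketched with an enum. The label names come from the article; the severity numbers and the roll-up rule are invented for illustration and are not how the released code necessarily works.

```python
# Hypothetical mapping of safety labels to a tweet-level verdict.
# Label strings are from the released code per the article; the
# severity ordering below is an assumption for illustration only.

from enum import Enum

class SafetyLabel(Enum):
    DO_NOT_AMPLIFY = "DoNotAmplify"
    HIGH_TOXICITY = "HighToxicity"
    MEDICAL_MISINFO = "MedicalMisinfo"
    HATEFUL_SLUR_LOW_SEVERITY = "TweetContainsHatefulConductSlurLowSeverity"

# Assumed ordering: higher number = more restrictive handling.
SEVERITY = {
    SafetyLabel.HATEFUL_SLUR_LOW_SEVERITY: 1,
    SafetyLabel.DO_NOT_AMPLIFY: 2,
    SafetyLabel.MEDICAL_MISINFO: 3,
    SafetyLabel.HIGH_TOXICITY: 3,
}

def tweet_safety_level(labels):
    """Assume the most restrictive label on a tweet sets its level."""
    return max((SEVERITY[label] for label in labels), default=0)
```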
“The [[VisibilityReason]] feature represents a VisibilityFiltering [[SPAM.FilteredReason]], which contains safety filtering verdict information including action (e.g. Drop, Avoid) and reason (e.g. Misinformation, Abuse),” a code comment says.
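The verdict structure that comment describes, an action paired with a reason, can be sketched as a small record type. The field names are modeled on the quoted comment; the class itself is an illustration in Python, not the released Scala code.

```python
# Illustrative record of a safety-filtering verdict, NOT Twitter's code.
# Actions and reasons mirror the examples in the quoted comment.

from dataclasses import dataclass

@dataclass(frozen=True)
class FilteredReason:
    action: str   # e.g. "Drop" (exclude the tweet) or "Avoid"
    reason: str   # e.g. "Misinformation", "Abuse"

    def describe(self):
        """Human-readable summary of the verdict."""
        return f"{self.action}: {self.reason}"
```

A “Drop” verdict would remove a tweet from candidates entirely, while a softer action would merely reduce where it can appear, consistent with the de-amplification approach described above.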
It also orchestrates a “misinfo nudge.”
And it considers policy violations, which include ElectionInterference, MisinformationVoting, HackedMaterials, MisinformationCivic, AbusePolicyUkraineCrisisMisinformation, MisinformationMedical, and MisinformationGeneral.
Last week, following the Nashville shooting, many conservatives were banned for calling attention to a “Trans Day of Vengeance” set to take place on Saturday, even though the far-Left accounts that originally promoted the event were not banned.