Intro to Local-First Protein Language Models
- There has been a proliferation of really powerful AI models to assist in scientific discovery.
- We have also seen commercial-scale LLMs shrink to the point that Llama 3.3 can now be run on a machine with 548GB of RAM.
- Some of the most powerful protein ML models, like ProteinMPNN and LigandMPNN, are only ~20MB!
- At the same time, we've gotten really powerful desktop hardware that can increasingly handle ML models locally via GPU or CPU acceleration. Apple's Metal framework seems to be especially good.
- Can we develop local machine learning tools that accelerate scientists' efforts to use these models?
- That was the idea behind ferritin-bio: port common ML models so they compile and run locally.
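The size gap above can be made concrete with a back-of-the-envelope weight-memory estimate. This is a minimal sketch: the parameter counts are illustrative assumptions (a ~5M-parameter small model roughly matching a ~20MB fp32 checkpoint, and a 70B-parameter LLM), not measured values for any specific release.

```python
# Rough RAM needed just to hold a model's weights in memory.
# Parameter counts below are illustrative assumptions, not measured values.

def weight_footprint_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """Approximate memory (GiB) for the weights alone, fp32 by default."""
    return n_params * bytes_per_param / 1024**3

# A small protein model (~5M params ~= a ~20 MB fp32 checkpoint)
small = weight_footprint_gb(5e6)
# A 70B-parameter LLM at fp32
large = weight_footprint_gb(70e9)

print(f"small model: {small:.3f} GB")   # ~0.019 GB -- trivially local
print(f"large LLM:   {large:.1f} GB")   # ~260 GB -- needs a big machine
```

Quantization (fewer bytes per parameter) shrinks the LLM figure, which is why large models are becoming runnable on high-memory desktops at all; the small protein models fit comfortably either way.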
Model Size by Release Date
Protein Language Model Table
Desktop Hardware
This section highlights the release year and basic specs of available desktop hardware, currently focusing on the Mac M-series machines. It relates to a broader interest in making machine learning models that can be used locally.
Hardware Table
Data for these charts can be found in this gSheet. Source code for the graphs is here.