
Welcome to Blackfish! Blackfish is an open source "ML-as-a-Service" (MLaaS) platform that helps researchers use state-of-the-art, open source artificial intelligence and machine learning models. With Blackfish, researchers can spin up their own version of popular public cloud services (e.g., ChatGPT, Amazon Transcribe, etc.) using high-performance computing (HPC) resources already available on campus.
The primary goal of Blackfish is to facilitate transparent and reproducible research based on open source machine learning and artificial intelligence. We do this by providing mechanisms to run user-specified models with user-defined configurations. For academic research, open source models present several advantages over closed source models. First, whereas large-scale projects using public cloud services might cost $10K to $100K for similar quality results, open source models running on HPC resources are free to researchers. Second, with open source models you know exactly what model you are using and you can easily provide a copy of that model to other researchers. Closed source models can and do change without notice. Third, using open-source models allows complete transparency into how your data is being used.
Why should you use Blackfish?
1. It's easy! 🌈
Researchers should focus on research, not tooling. We try to meet researchers where they're at by providing multiple ways to work with Blackfish, including a Python API, a command-line tool (CLI), and a browser-based user interface (UI).
Don't want to install a Python package? Ask your HPC admins to install Blackfish OnDemand!
2. It's transparent 🧐
You decide what model to run (down to the Git commit) and how you want it configured. There are no unexpected (or undetected) changes in performance because the model is always the same. All services are private, so you know exactly how your data is being handled.
3. It's free! 💸
You have an HPC cluster. We have software to run on it.
Requirements
Python
Blackfish requires Python to run locally. Alternatively, Blackfish can be added to your university's Open OnDemand portal, which allows users to run applications on HPC resources through a web browser. For more information, see our companion repo blackfish-ondemand.
Docker & Apptainer
Blackfish uses Docker or Apptainer to run service containers locally. HPC-based services require Apptainer to be installed on your university cluster.
Quickstart
Step 1 - Install blackfish
Step 2 - Create a profile
blackfish init
# Example responses
# > name: default
# > type: slurm
# > host: cluster.organization.edu
# > user: shamu
# > home: /home/shamu/.blackfish
# > cache: /shared/.blackfish
Step 3 - Start the API
Step 4 - Obtain a model
Step 5 - Run a service
Step 6 - Submit a request
# First, check the service status...
blackfish ls
# Once the service is healthy...
curl -X POST 'http://localhost:8080/transcribe' -H 'Content-Type: application/json' -d '{"audio_path": "/data/audio/NY045.mp3", "response_format": "json"}'
Next Steps
Ready to give Blackfish a try? Check out our setup guide.
Acknowledgements
Blackfish is maintained by research software engineers at Princeton University's Data Driven Social Science Initiative.