# APEX: Large-Scale Multi-Task Aesthetic-Informed Popularity Prediction for AI-Generated Music

APEX is the first large-scale multi-task learning framework for jointly predicting popularity and aesthetic quality of AI-generated music from audio alone. It is trained on over 211k AI-generated songs (~10k hours of audio) from Suno and Udio, leveraging MERT-v1-95M audio embeddings.


## What does APEX predict?

Given any audio file, APEX predicts 7 scores:

**Popularity**

| Score | Range | Description |
|---|---|---|
| `score_streams` | 0–100 | Predicted streaming engagement score |
| `score_likes` | 0–100 | Predicted likes engagement score |

**Aesthetic Quality** (from SongEval)

| Score | Range | Description |
|---|---|---|
| `coherence` | 1–5 | Structural and harmonic coherence |
| `musicality` | 1–5 | Overall musical quality |
| `memorability` | 1–5 | How memorable the song is |
| `clarity` | 1–5 | Clarity of production and mix |
| `naturalness` | 1–5 | Naturalness of the generated audio |

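The popularity and aesthetic scores live on different scales (0–100 vs. 1–5), so comparing them side by side takes a rescaling step. A minimal sketch of a hypothetical helper (not part of the APEX API) that maps the 1–5 aesthetic scores linearly onto 0–100:

```python
# Hypothetical helper: rescale SongEval aesthetic scores (1-5) onto the
# same 0-100 scale as the popularity scores, for side-by-side comparison.
AESTHETIC_KEYS = {"coherence", "musicality", "memorability", "clarity", "naturalness"}

def rescale_results(results: dict) -> dict:
    """Map 1-5 aesthetic scores linearly to 0-100; leave popularity scores as-is."""
    return {
        k: (v - 1.0) / 4.0 * 100.0 if k in AESTHETIC_KEYS else v
        for k, v in results.items()
    }

# Example values are made up for illustration.
example = {"score_streams": 62.0, "score_likes": 55.0, "coherence": 3.8,
           "musicality": 4.1, "memorability": 3.2, "clarity": 4.4, "naturalness": 3.9}
print(rescale_results(example)["coherence"])  # (3.8 - 1) / 4 * 100 = 70.0
```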
## Architecture

*APEX architecture diagram*

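The card doesn't spell out the layers beyond the MERT-v1-95M backbone, but a shared trunk feeding per-task regression heads is the standard shape for this kind of multi-task setup. A PyTorch sketch under stated assumptions: pooled MERT embeddings of hidden size 768, illustrative layer sizes, and sigmoid/affine output scaling — none of this is confirmed to match the released model:

```python
import torch
import torch.nn as nn

MERT_DIM = 768  # hidden size of MERT-v1-95M; assumes embeddings are time-pooled

class MultiTaskHead(nn.Module):
    """Shared trunk + per-task regression heads over a pooled MERT embedding.

    Illustrative sketch only: layer sizes and output scaling are assumptions.
    """
    def __init__(self, dim: int = MERT_DIM):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(dim, 256), nn.ReLU())
        # Two popularity heads (0-100) and five aesthetic heads (1-5).
        self.pop_heads = nn.ModuleDict(
            {k: nn.Linear(256, 1) for k in ("score_streams", "score_likes")}
        )
        self.aes_heads = nn.ModuleDict(
            {k: nn.Linear(256, 1)
             for k in ("coherence", "musicality", "memorability",
                       "clarity", "naturalness")}
        )

    def forward(self, emb: torch.Tensor) -> dict:
        h = self.trunk(emb)
        # Squash each head's output into its task's target range.
        out = {k: 100.0 * torch.sigmoid(head(h)).squeeze(-1)
               for k, head in self.pop_heads.items()}
        out.update({k: 1.0 + 4.0 * torch.sigmoid(head(h)).squeeze(-1)
                    for k, head in self.aes_heads.items()})
        return out

scores = MultiTaskHead()(torch.randn(2, MERT_DIM))  # batch of 2 pooled embeddings
print(sorted(scores))  # the seven score names, each mapped to a shape-(2,) tensor
```

Training such a head typically sums a regression loss (e.g. MSE) over all seven outputs, which is what makes the setup multi-task.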

## Usage

### Installation

```bash
pip install torch transformers soundfile torchaudio
```

### Inference

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("amaai-lab/apex", trust_remote_code=True)
results = model.predict("my_song.mp3", save_json="results.json")

print(results["score_streams"])  # popularity score, 0-100
print(results["score_likes"])    # popularity score, 0-100
print(results["coherence"])      # aesthetic score, 1-5
```
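To score a whole folder, `model.predict` can be wrapped in a small loop. A sketch, assuming only that `predict` takes a file path and returns the dict of seven scores (any object with that interface works, which also keeps the helper testable without downloading the model):

```python
from pathlib import Path

def score_directory(model, folder: str, exts=(".mp3", ".wav")) -> dict:
    """Run model.predict on every audio file in `folder`.

    Returns {filename: score dict}; non-audio files are skipped.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in exts:
            results[path.name] = model.predict(str(path))
    return results
```

Usage, after loading the model as above: `score_directory(model, "songs/")`.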