Obrovský únik dat ze Spotify, co vše víme?
Publikováno: 24.12.2025


Spotify je dnes de facto centrální archiv světové hudby – většina poslechu, objevování i hudebních trendů je navázaná na jednu komerční platformu. Právě tato centralizace je ale zároveň slabinou. Pokud by služba změnila podmínky, odstranila část katalogu nebo v budoucnu zanikla, obrovská část digitální hudební historie by mohla zmizet.
Na tento problém upozorňuje blog Anna’s Archive, který zveřejnil text Backing up Spotify. Autoři v něm tvrdí, že se jim podařilo systematicky zazálohovat téměř celý Spotify katalog, a to jak z pohledu metadat, tak samotných audio souborů. Cílem podle nich není konkurovat streamovacím službám, ale dlouhodobá archivace a ochrana kulturního dědictví, podobně jako u knih nebo akademických publikací.
Podle blogu dnes Spotify obsahuje desítky milionů skladeb, přičemž drtivá většina poslechů se soustředí na relativně malou část katalogu. Právě tuto strukturu se archiv snaží kopírovat – data jsou rozdělená podle popularity, kompletnosti a významu. Výsledkem má být archiv v řádu stovek terabajtů, který mapuje prakticky celý hudební ekosystém Spotify.
Jaká jsou ty nejzajímavější leaknutá data?
Přehled kolekce spotify

Písně podle skóre oblíbenosti
Top 10 000 skladeb má popularitu v rozmezí 70-100.

Analýza počtu streamů podle popularity

Můžeme také odhadnout, že top 3 skladby mají vyšší celkový počet streamů než spodních 20-100 milionů skladeb dohromady:
| Umělec | Jméno | Popularita | Počet streamů |
| Lady Gaga, Bruno Mars | Die With A Smile | 100 | 3.075 miliard |
| Billie Eilish | BIRDS OF A FEATHER | 98 | 3.137 miliard |
| Bad Bunny | DtMF | 98 | 1.124 miliard |
Skladby podle délky trvání


Písně podle hodnocení obsahu

Top 30 Hudebních žánrů podle počtu interpretů

Protože každý žánr je velmi specifický, můžeme žánry také seskupit a spočítat výsledky:

Můžeme také seskupit interprety podle popularity. Výsledný graf vypadá velmi podobně jako graf popularity skladeb:

Vydání alb podle roku

Alba podle počtu skladeb

Z tohoto grafu je poznat, že nejvíce skladeb na Spotify jsou singly.
Metadata Files
Stručný přehled tří hlavních APIs Spotify: interpreti, alba, skladby. Každá skladba existuje přesně v jednom albu, ale každá skladba a každé album může mít více interpretů.
CREATETABLE`artists` (
`rowid`integer PRIMARY KEYNOTNULL,
/* The original Spotify base62 ID. */`id`textNOTNULL,
/* When the item was fetched (unixepoch ms). */`fetched_at`integerNOTNULL,
/* "The name of the artist."*/`name`textNOTNULL,
/* followers.total - "The total number of followers." */`followers_total`integerNOTNULL,
/* "The popularity of the artist. The value will be between 0 and 100, with 100 being the most popular. The artist's popularity is calculated from the popularity of all the artist's tracks." */`popularity`integerNOTNULL
);
/* "A list of the genres the artist is associated with. If not yet classified, the array is empty." */CREATETABLE`artist_genres` (
`artist_rowid`integerNOTNULL,
`genre`textNOTNULL,
FOREIGNKEY (`artist_rowid`) REFERENCES`artists`(`rowid`)
);
/* Images of the artist in various sizes, widest first. */CREATETABLE`artist_images` (
`artist_rowid`integerNOTNULL,
`width`integerNOTNULL,
`height`integerNOTNULL,
`url`textNOTNULL,
FOREIGNKEY (`artist_rowid`) REFERENCES`artists`(`rowid`)
);
/*
* Information about artist-albums relationships from /artists/{id}/albums, album.artists[] and album.tracks[].artists[].
* The relationships "album", "single", "compilation" are left out because they can be reconstructed from `album.type`.
*/CREATETABLE"artist_albums" (
`artist_rowid`integerNOTNULL,
`album_rowid`integerNOTNULL,
/* True if this link was retrieved from /artists/{id}/albums with an "album_group" response of "appears_on". */`is_appears_on`integerNOTNULL,
/* True if this link is based on the actual artists of each track in the album. Only exists if the link is not explicit (above). */`is_implicit_appears_on`integerNOTNULL,
/* If neither is_appears_on or is_implicit_appears_on, the index of album.data.artists[] this was retrieved from. */`index_in_album`integer,
FOREIGNKEY (`artist_rowid`) REFERENCES`artists`(`rowid`),
FOREIGNKEY (`album_rowid`) REFERENCES`albums`(`rowid`)
);
/* combinations of available markets in a separate table to save space */CREATETABLE`available_markets` (
`rowid`integer PRIMARY KEYNOTNULL,
/* comma separated ISO 3166-1 alpha-2 country codes. */`available_markets`textNOTNULL
);
/* /albums/{id} - "Get Spotify catalog information for a single album." */CREATETABLE`albums` (
`rowid`integer PRIMARY KEYNOTNULL,
/* The original Spotify base62 ID. */`id`textNOTNULL,
/* When the item was fetched (unixepoch ms). */`fetched_at`integerNOTNULL,
/* "The name of the album. In case of an album takedown, the value may be an empty string." */`name`textNOTNULL,
/* 'The type of the album. Allowed values: "album", "single", "compilation"' */`album_type`textNOTNULL,
/* available markets as an index into the available_markets table to save space. - "The markets in which the album is available: ISO 3166-1 alpha-2 country codes. NOTE: an album is considered available in a market when at least 1 of its tracks is available in that market." */`available_markets_rowid`integerNOTNULL,
/* external_id.upc - Universal Product Code */`external_id_upc`text,
/* external_id.amgid - undocumented - AMG MUSIC GROUP Internal ID */"external_id_amgid"text,
/* "The copyright" */`copyright_c`text,
/* "The sound recording (performance) copyright." */`copyright_p`text,
/* "The label associated with the album." */`label`textNOTNULL,
/* "The popularity of the album. The value will be between 0 and 100, with 100 being the most popular." */`popularity`integerNOTNULL,
/* "The date the album was first released." */`release_date`textNOTNULL,
/* 'The precision with which release_date value is known. Allowed values: "year", "month", "day"' */`release_date_precision`textNOTNULL,
/* tracks.total */`total_tracks`integerNOTNULL,
FOREIGNKEY (`available_markets_rowid`) REFERENCES`available_markets`(`rowid`)
);
/* album.images[] - "The cover art for the album in various sizes, widest first." */CREATETABLE"album_images" (
`album_rowid`integerNOTNULL,
`width`integerNOTNULL,
`height`integerNOTNULL,
`url`textNOTNULL,
FOREIGNKEY (`album_rowid`) REFERENCES`albums`(`rowid`)
);
/* /tracks/{id} */CREATETABLE`tracks` (
`rowid`integer PRIMARY KEYNOTNULL,
/* The original Spotify base62 ID. */`id`textNOTNULL,
/* When the item was fetched (unixepoch ms). */`fetched_at`integerNOTNULL,
/* "The name of the track." */`name`textNOTNULL,
/* "A link to a 30 second preview (MP3 format) of the track. Can be null" */`preview_url`text,
`album_rowid`integerNOTNULL,
/* "The number of the track. If an album has several discs, the track number is the number on the specified disc." */`track_number`integerNOTNULL,
/* http://en.wikipedia.org/wiki/International_Standard_Recording_Code */`external_id_isrc`text,
`external_id_ean`text,
`external_id_upc`text,
/* "The popularity of the track. The value will be between 0 and 100, with 100 being the most popular.
The popularity of a track is a value between 0 and 100, with 100 being the most popular. The popularity is calculated by algorithm and is based, in the most part, on the total number of plays the track has had and how recent those plays are.
Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. Duplicate tracks (e.g. the same track from a single and an album) are rated independently. Artist and album popularity is derived mathematically from track popularity. Note: the popularity value may lag actual popularity by a few days: the value is not updated in real time." */`popularity`integerNOTNULL,
/* A reference into the available_markets table. - "A list of the countries in which the track can be played, identified by their ISO 3166-1 alpha-2 code." */`available_markets_rowid`integerNOTNULL,
/* "The disc number (usually 1 unless the album consists of more than one disc)." */`disc_number`integerNOTNULL,
/* "The track length in milliseconds." */`duration_ms`integerNOTNULL,
/* "Whether or not the track has explicit lyrics ( true = yes it does; false = no it does not OR unknown)." */`explicit`integerNOTNULL,
FOREIGNKEY (`available_markets_rowid`) REFERENCES`available_markets`(`rowid`)
);
/* "The artists who performed the track." */CREATETABLE`track_artists` (
`track_rowid`integerNOTNULL,
`artist_rowid`integerNOTNULL,
FOREIGNKEY (`track_rowid`) REFERENCES`tracks`(`rowid`),
FOREIGNKEY (`artist_rowid`) REFERENCES`artists`(`rowid`)
);
CREATEINDEX`artist_genres_artist_id`ON`artist_genres` (`artist_rowid`);
CREATEINDEX`artist_genres_genre`ON`artist_genres` (`genre`);
CREATEINDEX`artist_images_artist_id`ON`artist_images` (`artist_rowid`);
CREATEUNIQUEINDEX`artists_id_unique`ON`artists` (`id`);
CREATEINDEX`artists_name`ON`artists` (`name`);
CREATEINDEX`artists_popularity`ON`artists` (`popularity`);
CREATEINDEX`artists_followers`ON`artists` (`followers_total`);
CREATEINDEX`artist_album_artist_id`ON"artist_albums" (`artist_rowid`);
CREATEINDEX`artist_album_album_id`ON"artist_albums" (`album_rowid`);
CREATEUNIQUEINDEX`albums_id_unique`ON`albums` (`id`);
CREATEINDEX`album_name`ON`albums` (`name`);
CREATEINDEX`album_popularity`ON`albums` (`popularity`);
CREATEUNIQUEINDEX`available_markets_available_markets_unique`ON`available_markets` (`available_markets`);
CREATEINDEX`track_artists_artist_id`ON`track_artists` (`artist_rowid`);
CREATEINDEX`track_artists_track_id`ON`track_artists` (`track_rowid`);
CREATEUNIQUEINDEX`tracks_id_unique`ON`tracks` (`id`);
CREATEINDEX`tracks_popularity`ON`tracks` (`popularity`);
CREATEINDEX`tracks_album`ON`tracks` (`album_rowid`);
CREATEINDEX`album_images_album_id`ON`album_images` (`album_rowid`);
CREATEINDEX tracks_isrc on tracks(external_id_isrc);Scrape objektů Playlist ze Spotify API. Vyžaduje databázi spotify_clean.sqlite3 pro mapování track_rowid na track_id.
/* /get-playlist/{id} - "Get a playlist owned by a Spotify user." */CREATETABLE"playlists" (
`rowid`integer PRIMARY KEYNOTNULL,
/* the original spotify base62 ID */`id`textNOTNULL,
/* the spotify snapshot ID that was fetched (we only store one copy of each playlist) - "The version identifier for the current playlist. Can be supplied in other requests to target a specific playlist version" */`snapshot_id`textNOTNULL,
/* When the playlist was fetched (unixepoch ms). */`fetched_at`integerNOTNULL,
/* "The name of the playlist." */`name`textNOTNULL,
/* "The playlist description. Only returned for modified, verified playlists, otherwise null. */`description`text,
/* "true if the owner allows other users to modify the playlist." */`collaborative`integerNOTNULL,
-- "The playlist's public/private status (if it is added to the user's profile): true the playlist is public, false the playlist is private, null the playlist status is not relevant. For more about public/private status, see Working with Playlists"`public`integerNOTNULL,
`primary_color`text,
/* owner.id - "The unique string identifying the Spotify user that you can find at the end of the Spotify URI for the user. The ID of the current user can be obtained via the Get Current User's Profile endpoint." */`owner_id`text,
/* owner.display_name - "The name displayed on the user's profile. null if not available." */`owner_display_name`text,
/* followers.total */`followers_total`integer,
`tracks_total`integerNOTNULL
);
CREATEUNIQUEINDEX`playlists_id_unique`ON`playlists` (`id`);
CREATEINDEX`playlists_name`ON`playlists` (`name`);
CREATEINDEX`playlists_owner_id`ON`playlists` (`owner_id`);
CREATEINDEX`playlists_snapshot_id`ON`playlists` (`snapshot_id`);
CREATEINDEX`playlists_followers`ON`playlists` (`followers_total`);
/* "Images for the playlist. The array may be empty or contain up to three images. The images are returned by size in descending order. See Working with Playlists. Note: If returned, the source URL for the image (url) is temporary and will expire in less than a day." */CREATETABLE`playlist_images` (
`playlist_rowid`integerNOTNULL,
`width`integer,
`height`integer,
`url`textNOTNULL,
FOREIGNKEY (`playlist_rowid`) REFERENCES`playlists`(`rowid`)
);
CREATEINDEX`playlist_images_playlist_id`ON`playlist_images` (`playlist_rowid`);
/* The tracks of the playlist. Merged from all pages of getPlaylistItems. */CREATETABLE"playlist_tracks" (
`playlist_rowid`integerNOTNULL,
/* 0-based integer position within the playlists response (playlist.tracks.items[i]) */`position`integerNOTNULL,
/* true if track.type == "episode". false if track.type == "track" */`is_episode`integerNOTNULL,
/* The rowid of this track in the `tracks` table. */`track_rowid`integer,
/* If the rowid is null, this is the spotify base62 ID instead. */`id_if_not_in_tracks_table`text,
/* (unixepoch seconds) - "The date and time the track or episode was added. Note: some very old playlists may return null in this field." */`added_at`integerNOTNULL,
/* added_by.id - "The Spotify user who added the track or episode. Note: some very old playlists may return null in this field." */`added_by_id`text,
/* undocumented */`primary_color`text,
/* video_thumbnail.url - undocumented */`video_thumbnail_url`text,
/* "Whether this track or episode is a local file or not." https://developer.spotify.com/documentation/web-api/concepts/playlists#local-files */`is_local`integerNOTNULL,
/* track.name if is_local is true. For non-local tracks, name can be retrieved from the `tracks` table. */`name_if_is_local`text,
/* track.uri if is_local is true */`uri_if_is_local`text,
/* track.album if is_local is true. Spotify constructs a fake album object for local tracks. For non-local tracks, album can be retrieved from the `albums` table. */`album_name_if_is_local`text,
/* track.artists[0].name if is_local is true. Spotify constructs a single fake artists object for local tracks. For non-local tracks, artist name can be retrieved from the `artists` table. */`artists_name_if_is_local`text,
/* track.duration_ms if is_local is true. For non-local tracks, duration_ms can be retrieved from the `tracks` table. */`duration_ms_if_is_local`integer,
PRIMARY KEY(`playlist_rowid`, `position`),
FOREIGNKEY (`playlist_rowid`) REFERENCES`playlists`(`rowid`)
) WITHOUTROWID;