We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB). It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.
Archives should be lossless unless there’s literally no other source available.
Archiving low quality sources like this just degrades the overall integrity of the whole
Normally you’re mostly right, but in this case I have to agree that lossy existence is better than lossless absence. 300TB puts it at the upper limit of prosumer capacity, but it’s still doable from a personal archive perspective. If you went FLAC lossless, though, you’re looking at 3-6PB. That quantity is almost completely unattainable by hobbyists, and presents challenges even for enterprise entities. This archive is the “photo of the original document” for the collection. It’s not optimal, and there’s a lot of room for improvement, but the alternative is to just not do it at all.
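For anyone who wants to sanity-check the size argument, here is a rough back-of-the-envelope sketch. It assumes the whole ~300TB is 160kbit/s audio and that size scales linearly with bitrate for the same total playing time. The ~900 kbps FLAC average is my own guess, not a number from the archive, and the helper names are made up for illustration.

```python
# Rough sizing sketch. All bitrates other than 160 kbps are assumptions,
# not figures published with the archive.
TB = 10**12  # decimal terabyte, in bytes


def scaled_size_tb(source_tb: float, source_kbps: float, target_kbps: float) -> float:
    """Scale an archive size by the bitrate ratio (same total playing time)."""
    return source_tb * target_kbps / source_kbps


def avg_file_mb(total_tb: float, n_files: int) -> float:
    """Average file size in decimal megabytes."""
    return total_tb * TB / n_files / 10**6


if __name__ == "__main__":
    archive_tb = 300        # from the post (approximate)
    n_files = 86_000_000    # from the post
    ogg_kbps = 160          # from the linked description

    print(f"average file: {avg_file_mb(archive_tb, n_files):.2f} MB")

    targets = [
        ("FLAC at an assumed ~900 kbps average", 900.0),
        ("uncompressed CD audio (1411.2 kbps)", 1411.2),
    ]
    for label, kbps in targets:
        size_pb = scaled_size_tb(archive_tb, ogg_kbps, kbps) / 1000
        print(f"{label}: ~{size_pb:.1f} PB")
```

Under those assumptions you land somewhere between roughly 1.7PB and 2.6PB, so the exact figure depends a lot on what bitrate you plug in, but it is petabyte-scale either way.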
I’d argue that no one is gonna be archiving 300TB either though, and people will likely be picking and choosing which files to download from the torrents.
What I don’t know is if this is how Spotify is storing this music on their end or if they have some other lossless source they pull from. I know Deezer has FLACs available for most stuff and 320kbps MP3 (I think) for what isn’t lossless.
Not many hobbyists are going to dedicate that much space to bad quality audio, or even have that much space to begin with.
Eh - maybe - there are definitely hoarders with the ability to absorb 300TB. They’re not common, but they do exist. There are probably close to zero hoarders that could spare 3PB, especially for a collection that they won’t listen to a majority of. It’s like saying that it isn’t worth digitizing wax tube recordings because the source is so low quality. If preservation is the goal, anything is better than nothing.
Spotify has a lossless quality option in their apps.
The link says “The quality is the original OGG Vorbis at 160kbit/s”, so I guess that’s what Spotify uses for the “high” desktop/mobile setting described at https://support.spotify.com/us/article/audio-quality/
160kbps ogg is not exactly low quality. Most people can’t tell the difference between 160kbps ogg and lossless, nor do they have the equipment to hear it when they listen. And with a huge amount of data like this, it might be impossible, too expensive, or too time consuming for them to archive in lossless quality.
I agree, archiving audio files should be lossless when possible, but that is not a requirement. 160kbps ogg is “good enough”.
I consider anything under 256kbps to be not worth getting unless it’s the only ever rip of something that doesn’t exist anymore. If it’s lossy it should be 320kbps mp3 ideally.
I also try to stay away from VBR rips
You just say it should not, but why? As said, 160kbps ogg is for most people not distinguishable from uncompressed. I think it is worth archiving this, especially if it is en masse like this. Why do you stay away from VBR?
Archival should be as close to source quality as possible. VBR just adds more noise to the audio whether you can hear it or not. That means that after copying it to different mediums you will eventually start to notice the quality reduction over time.