So, at WWDC21, Apple released ShazamKit, a tool that lets you enrich your app experience with audio recognition, enabling your users to find out a song’s name, who sang it, the genre, and more.
The original Shazam app has this capability, but now Apple is giving you the power to bring it into your own apps.
Here is the step-by-step process involved in building a simple Shazam clone (as seen below) using the newly introduced ShazamKit.
NB: The UI is built with SwiftUI, but the steps are easy to follow even if you are new to SwiftUI.
Requirements
The requirements are simple. ShazamKit is currently available on iOS 15. Building for iOS 15 requires Xcode 13, and macOS 11.3 or higher is required to run Xcode 13. You also need a physical device to test with.
The Process
Clone the repo from GitHub so that you can follow along easily.
I have a SwiftUI view (ShazamView) containing a number of views, most importantly a record button you press to get the app to listen to music around you. Tapping the button calls a startListening() function in the ShazamViewModel to begin the listening process.
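The exact layout lives in the repo, but a minimal sketch of how such a view might wire the button to the view model could look like this (the VStack layout, text, and icon here are my assumptions, not the repo's code):

import SwiftUI

struct ShazamView: View {
    @StateObject private var viewModel = ShazamViewModel()

    var body: some View {
        VStack(spacing: 24) {
            Text("Tap the button and play some music")
            // The record button that kicks off the listening process.
            Button {
                viewModel.startListening()
            } label: {
                Image(systemName: "mic.circle.fill")
                    .font(.system(size: 72))
            }
        }
    }
}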
The startListening() function goes through a couple of steps:
1. Requesting Record (Microphone) Permission
Here, the app checks for record permission. If it has been granted, another function is called to begin the recording process; if not, the app requests that the user allow microphone access. This is not the major focus of this blog post, so I'm just going to skip the details of that here.
class ShazamViewModel: NSObject, ObservableObject {
    // ...
    func startListening() {
        let audioSession = AVAudioSession.sharedInstance()
        switch audioSession.recordPermission {
        case .undetermined:
            requestRecordPermission(audioSession: audioSession)
        case .denied:
            viewState = .recordPermissionSettingsAlert
        case .granted:
            DispatchQueue.global(qos: .background).async {
                self.proceedWithRecording()
            }
        @unknown default:
            requestRecordPermission(audioSession: audioSession)
        }
    }
}
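Since I'm skipping the details, here is a minimal sketch of what the requestRecordPermission(audioSession:) helper might look like; treat the body as an assumption rather than the repo's exact code:

class ShazamViewModel: NSObject, ObservableObject {
    // ...
    // Hypothetical implementation: ask for mic access, then start recording if granted.
    private func requestRecordPermission(audioSession: AVAudioSession) {
        audioSession.requestRecordPermission { [weak self] granted in
            guard granted else { return } // user declined; the next tap hits the .denied case
            DispatchQueue.global(qos: .background).async {
                self?.proceedWithRecording()
            }
        }
    }
}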
2. Setting up Audio Engine for recording
In the proceedWithRecording() function, the app is switched into a 'loading' state using the Ripple animation described in a separate blog post. Here, the AVAudioEngine needed to enable the app to continually record sound is also set up.
class ShazamViewModel: NSObject, ObservableObject {
    // ...
    private func proceedWithRecording() {
        DispatchQueue.main.async {
            self.viewState = .recordingInProgress
        }

        if audioEngine.isRunning {
            stopRecording()
            return
        }

        let inputNode = audioEngine.inputNode
        let recordingFormat = inputNode.outputFormat(forBus: .zero)
        inputNode.removeTap(onBus: .zero)
        inputNode.installTap(onBus: .zero, bufferSize: 1024, format: recordingFormat) { [weak self] buffer, time in
            print("Current Recording at: \(time)")
        }

        audioEngine.prepare()
        do {
            try audioEngine.start()
        } catch {
            print(error.localizedDescription)
        }
    }
}
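The code above calls a stopRecording() helper that isn't shown. Based on the AVAudioEngine APIs used here, a minimal sketch of it might be:

class ShazamViewModel: NSObject, ObservableObject {
    // ...
    // Hypothetical helper: tear down the tap and stop the engine.
    private func stopRecording() {
        audioEngine.inputNode.removeTap(onBus: .zero)
        audioEngine.stop()
    }
}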
3. Asking ShazamKit to match recordings with songs in the catalog
ShazamKit requires an instance of SHSession, from which we can call the function that matches the recording against songs in the catalog. It also has two delegate functions within SHSessionDelegate: one is triggered when a match is found, and the other when there is an error or no match is found. It's as simple as that.
class ShazamViewModel: NSObject, ObservableObject {
    // ...
    private func proceedWithRecording() {
        // ...
        inputNode.installTap(onBus: .zero, bufferSize: 1024, format: recordingFormat) { [weak self] buffer, time in
            print("Current Recording at: \(time)")
            self?.session.matchStreamingBuffer(buffer, at: time) // <---- Here
        }
        // ...
    }
}

extension ShazamViewModel: SHSessionDelegate {
    func session(_ session: SHSession, didFind match: SHMatch) {
        guard let firstMatch = match.mediaItems.first else {
            return
        }

        stopRecording()

        let song = Song(
            title: firstMatch.title ?? "",
            artist: firstMatch.artist ?? "",
            genres: firstMatch.genres,
            artworkUrl: firstMatch.artworkURL,
            appleMusicUrl: firstMatch.appleMusicURL
        )
        DispatchQueue.main.async {
            self.viewState = .result(song: song)
        }
    }

    func session(_ session: SHSession, didNotFindMatchFor signature: SHSignature, error: Error?) {
        print(error?.localizedDescription ?? "")
        stopRecording()
        DispatchQueue.main.async {
            self.viewState = .noResult
        }
    }
}
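The snippets above reference a session property without showing where it comes from. Setting it up can be as simple as the following sketch (where exactly the delegate is assigned in the repo is an assumption on my part):

class ShazamViewModel: NSObject, ObservableObject {
    private let audioEngine = AVAudioEngine()
    private let session = SHSession() // matches against Shazam's default catalog

    override init() {
        super.init()
        // Without this, didFind / didNotFindMatchFor are never delivered.
        session.delegate = self
    }
    // ...
}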
The rest of the codebase consists of the Song struct that the Shazam match is mapped into, a SongDetailView where the song details are displayed to the user, a couple of animations to improve the app experience, and a couple of alerts displayed to the user as needed.
Custom Catalog
You may ask, “So is building a Shazam clone the only thing I can do with ShazamKit?” The answer is no.
While ShazamKit matches songs against the default music catalog out of the box, you can pass in your own custom-built catalog for ShazamKit to use during lookup. With this, you can build various applications like games, learning resources, etc.
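As a rough sketch of what that could look like: you build an SHCustomCatalog, register a pre-generated reference signature against the metadata you want back, and hand the catalog to the session. The resource name and media item values below are hypothetical.

import ShazamKit

func makeCustomCatalogSession() throws -> SHSession? {
    // Hypothetical: a reference signature file bundled with the app.
    guard let signatureURL = Bundle.main.url(forResource: "lesson1", withExtension: "shazamsignature") else {
        return nil
    }
    let signatureData = try Data(contentsOf: signatureURL)
    let signature = try SHSignature(dataRepresentation: signatureData)

    // The metadata returned when this signature is matched.
    let mediaItem = SHMediaItem(properties: [
        .title: "Lesson 1",
        .artist: "My App"
    ])

    let catalog = SHCustomCatalog()
    try catalog.addReferenceSignature(signature, representing: [mediaItem])

    // A session backed by the custom catalog instead of Shazam's.
    return SHSession(catalog: catalog)
}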
Android?
ShazamKit is not limited to iOS; it is also available for Android, letting you add audio recognition to your Android apps as well.
With this simple Shazam clone, I'm sure you'll agree with me that using ShazamKit is pretty easy. Why don't you check it out and get your hands dirty building something with it?
Cheers!