Ecosyste.ms: Advisories

An open API service providing security vulnerability metadata for many open source software ecosystems.

Advisories: GSA_kwCzR0hTQS1xNzY0LWc2Zm0tNTU1ds4AAxIk

Path traversal in spotipy

Summary

If a malicious URI is passed to the library, the library can be tricked into performing an operation on a different API endpoint than intended.

Details

The code Spotipy uses to parse URIs and URLs accepts user data too liberally which allows a malicious user to insert arbitrary characters into the path that is used for API requests. Because it is possible to include .., an attacker can redirect for example a track lookup via spotifyApi.track() to an arbitrary API endpoint like playlists, but this is possible for other endpoints as well.

Before the security advisory feature was enabled on GitHub, I was already in contact with Stéphane Bruckert via e-mail, and he asked me to look into a potential fix.

My recommendation is to perform stricter parsing of URLs and URIs, which I implemented in the patch included at the end of the report. If you prefer, I can also invite you to a private fork of the repository.

PoC

The POC expects SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET environment variables to be set to authenticate against the API.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials


def main():
    spotifyApi = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials())

    # This URL contains the example playlist ID from the spotify docs, a malicious
    # playlist could instead contain a XSS payload in their title. A playlist with 
    # such a title was also included in the initial report via mail to maintainer.
    malicious_spotify_url = 'spotify:track:../playlists/3cEYpjA9oz9GiPac4AsH4n'
    

    # Usage of the track function, expecting to get a non-user-controllable track name
    # e.g. for displaying in a website.
    # Our modified track uri however makes the library return the name of a playlist which
    # may be created by anyone containing anything.
    track = spotifyApi.track(malicious_spotify_url)

    # Prints:
    # 'Name of the track: Spotify Web API Testing playlist'
    # A malicious playlist could also have an XSS payload as title, which would result in:
    # 'Name of the track: <img src=x onerror=prompt(1)>'
    print(f"Name of the track: {track['name']}")

if __name__ == '__main__':
    main()

Impact

The impact of this vulnerability depends heavily on what operations a client application performs when it handles a URI from a user and how it uses the responses it receives from the API.

Possible Patch

Caviats of this patch

From 30cf29b16e893dcac974dbd7481fb58a073b853c Mon Sep 17 00:00:00 2001
From: Shaderbug <[email protected]>
Date: Tue, 10 Jan 2023 19:26:18 +0100
Subject: [PATCH] Improve URL and URI handling

---
 spotipy/client.py                            | 61 +++++++++++++++-----
 tests/integration/non_user_endpoints/test.py |  6 +-
 2 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/spotipy/client.py b/spotipy/client.py
index d7025a9..b094947 100644
--- a/spotipy/client.py
+++ b/spotipy/client.py
@@ -6,6 +6,7 @@ __all__ = ["Spotify", "SpotifyException"]
 
 import json
 import logging
+import re
 import warnings
 
 import requests
@@ -96,6 +97,29 @@ class Spotify(object):
         "US",
         "UY"]
 
+    # Spotify URI scheme defined in [1], and the ID format as base-62 in [2].
+    #
+    # Unfortunately the IANA specification is out of date and doesn't include the new types
+    # show and episode. Additionally, for the user URI, it does not specify which characters
+    # are valid for usernames, so the assumption is alphanumeric which coincidentially are also
+    # the same ones base-62 uses.
+    # In limited manual exploration this seems to hold true, as newly accounts are assigned an
+    # identifier that looks like the base-62 of all other IDs, but some older accounts only have
+    # numbers and even older ones seemed to have been allowed to freely pick this name.
+    #
+    # [1] https://www.iana.org/assignments/uri-schemes/prov/spotify
+    # [2] https://developer.spotify.com/documentation/web-api/#spotify-uris-and-ids
+    _regex_spotify_uri = r'^spotify:(?P<type>track|artist|album|playlist|show|episode|user):(?P<id>[0-9A-Za-z]+)$'
+
+    # Spotify URLs are defined at [1]. The assumption is made that they are all
+    # pointing to open.spotify.com, so a regex is used to parse them as well,
+    # instead of a more complex URL parsing function.
+    #
+    # [1] https://developer.spotify.com/documentation/web-api/#spotify-uris-and-ids
+    _regex_spotify_url = r'^(http[s]?:\/\/)?open.spotify.com\/(?P<type>track|artist|album|playlist|show|episode|user)\/(?P<id>[0-9A-Za-z]+)(\?.*)?$'
+
+    _regex_base62 = r'^[0-9A-Za-z]+$'
+
     def __init__(
         self,
         auth=None,
@@ -1940,20 +1964,27 @@ class Spotify(object):
         return path
 
     def _get_id(self, type, id):
-        fields = id.split(":")
-        if len(fields) >= 3:
-            if type != fields[-2]:
-                logger.warning('Expected id of type %s but found type %s %s',
-                               type, fields[-2], id)
-            return fields[-1].split("?")[0]
-        fields = id.split("/")
-        if len(fields) >= 3:
-            itype = fields[-2]
-            if type != itype:
-                logger.warning('Expected id of type %s but found type %s %s',
-                               type, itype, id)
-            return fields[-1].split("?")[0]
-        return id
+        uri_match = re.search(Spotify._regex_spotify_uri, id)
+        if uri_match is not None:
+            uri_match_groups = uri_match.groupdict()
+            if uri_match_groups['type'] != type:
+                raise ValueError("Unexpected Spotify URI type.")
+            else:
+                return uri_match_groups['id']
+
+        url_match = re.search(Spotify._regex_spotify_url, id)
+        if url_match is not None:
+            url_match_groups = url_match.groupdict()
+            if url_match_groups['type'] != type:
+                raise ValueError("Unexpected Spotify URL type.")
+            else:
+                return url_match_groups['id']
+
+        # Raw identifiers might be passed, ensure they are also base-62
+        if re.search(Spotify._regex_base62, id) is not None:
+            return id
+
+        raise ValueError("Unsupported URL / URI")
 
     def _get_uri(self, type, id):
         if self._is_uri(id):
@@ -1962,7 +1993,7 @@ class Spotify(object):
             return "spotify:" + type + ":" + self._get_id(type, id)
 
     def _is_uri(self, uri):
-        return uri.startswith("spotify:") and len(uri.split(':')) == 3
+        return re.search(Spotify._regex_spotify_uri, uri) is not None
 
     def _search_multiple_markets(self, q, limit, offset, type, markets, total):
         if total and limit > total:
diff --git a/tests/integration/non_user_endpoints/test.py b/tests/integration/non_user_endpoints/test.py
index 96ee4da..116e1d9 100644
--- a/tests/integration/non_user_endpoints/test.py
+++ b/tests/integration/non_user_endpoints/test.py
@@ -280,7 +280,7 @@ class AuthTestSpotipy(unittest.TestCase):
         try:
             self.spotify.track(self.bad_id)
             self.assertTrue(False)
-        except SpotifyException:
+        except ValueError:
             self.assertTrue(True)
 
     def test_show_urn(self):
@@ -296,7 +296,7 @@ class AuthTestSpotipy(unittest.TestCase):
         self.assertTrue(show['name'] == 'Heavyweight')
 
     def test_show_bad_urn(self):
-        with self.assertRaises(SpotifyException):
+        with self.assertRaises(ValueError):
             self.spotify.show("bogus_urn", market="US")
 
     def test_shows(self):
@@ -333,7 +333,7 @@ class AuthTestSpotipy(unittest.TestCase):
         self.assertTrue(episode['name'] == '#1 Buzz')
 
     def test_episode_bad_urn(self):
-        with self.assertRaises(SpotifyException):
+        with self.assertRaises(ValueError):
             self.spotify.episode("bogus_urn", market="US")
 
     def test_episodes(self):
-- 
2.34.1

Permalink: https://github.com/advisories/GHSA-q764-g6fm-555v

Source: GitHub Advisory Database
Origin: Unspecified
Severity: Moderate
Classification: General
Published: 12 days ago
Updated: 9 days ago

CVSS Score: 5.4
CVSS vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:L/A:L

Identifiers: GHSA-q764-g6fm-555v, CVE-2023-23608
References:

Affected Packages

pypi:spotipy
Versions: < 2.22.1
Fixed in: 2.22.1