Updated Depot Archiver: fixed loading of cached manifests and added colors for error messages.

Updated Depot Extractor so it can read binary depot keys, which is the format No-Intro's DATs currently use.

Added Depot Validator, which reads the whole contents of a depot folder and validates every file without needing a manifest. This is useful when you need to verify that there are no corrupted chunks, and it saves time because manifests may share many files between them.

Updated .gitignore to also ignore the keys folder, where the binary depot keys are stored.

Andrew Vineyard (1):
  Fixed Cached Manifest checking, adds Error Message Colors, adds Full Depot Validation checking.

 .gitignore         |   1 +
 depot_archiver.py  |  14 ++---
 depot_extractor.py |  12 +++-
 depot_validator.py | 152 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 171 insertions(+), 8 deletions(-)
 create mode 100644 depot_validator.py

-- 
2.43.4
Committed, thanks so much for the changes! I have a few ideas:

1. We should probably update everything that uses depot keys (i.e. get_depot_keys etc.) to use the No-Intro binary format
2. Error messages, in addition to being in red, should probably be sent to stderr. Ideally we can make a function to reuse for this purpose
3. How does one go about getting a No-Intro login to view the datfiles? (by the way, I joined the discord, my username is @benlowry)

Thanks,
-Benjamin
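For idea 2, a reusable helper might look roughly like the sketch below. The name print_error and the isatty() check are assumptions made for illustration and are not part of the committed patch. The sketch colors the whole message and always appends the reset sequence in one place, so a mistyped escape on an individual line cannot leave the terminal red.

```python
import sys

RED = "\033[31m"    # ANSI SGR sequence: red foreground
RESET = "\033[0m"   # ANSI SGR sequence: reset all attributes


def print_error(*parts, file=None):
    """Print an error message to stderr (hypothetical helper, not in the patch).

    The message is wrapped in red only when the stream is a terminal, so
    redirected or piped output stays free of escape sequences.
    """
    stream = sys.stderr if file is None else file
    message = " ".join(str(part) for part in parts)
    if stream.isatty():
        message = f"{RED}{message}{RESET}"
    print(message, file=stream)


if __name__ == "__main__":
    print_error("error: received status code", 404)
```

Call sites such as print("\033[31merror: ...\033[0m", value) could then become print_error("error: ...", value), assuming it is acceptable for the value to be colored as well.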
I hope this works this time. My email client keeps removing the reply-to part, which breaks compatibility, and I can't figure out what is wrong.

Agreed with everything on 1 and 2. The edits I made for reading the saved depot keys were there to keep compatibility with the existing depot_keys.txt. Also, I realized that I didn't properly close the error coloring on one of the lines, so if we get an error 404 for the response status, the terminal gets stuck in red.

For the 3rd question: once you have an account on No-Intro's website/forum, you'd have to reach out to the No-Intro team, either on their forum or on Discord, before you gain access to read the DATs. Currently, we don't have a record for Steam CDN, but I've been regularly using and testing your project, and so far I've gotten things ready for submission to them. You can take a look at other records on No-Intro's DOM to get an idea of how they want things set up.
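To illustrate the compatibility order described above, here is a minimal sketch of a shared key loader: it tries the binary ./keys/<depotid>.depotkey file first and falls back to depot_keys.txt. The function name load_depot_key, its parameters, and the 32-byte length check are assumptions for illustration; the patch itself inlines this logic in depot_extractor.py and depot_validator.py.

```python
from pathlib import Path


def load_depot_key(depotid, keys_dir="./keys", legacy_file="./depot_keys.txt"):
    """Return the depot key as bytes, or None if no stored key is found.

    Hypothetical helper: the lookup order mirrors the patch (binary key
    file first, then depot_keys.txt).
    """
    keyfile = Path(keys_dir) / f"{depotid}.depotkey"
    if keyfile.exists():
        key = keyfile.read_bytes()
        # Assumption: reject anything that is not the 32-byte (256-bit)
        # key described in the patch comments.
        if len(key) != 32:
            raise ValueError(f"{keyfile} is {len(key)} bytes, expected 32")
        return key

    legacy = Path(legacy_file)
    if legacy.exists():
        for line in legacy.read_text(encoding="utf-8").split("\n"):
            fields = line.split("\t")
            try:
                # As in the patch: column 0 is the depot ID, column 2 the hex key.
                if int(fields[0]) == depotid:
                    return bytes.fromhex(fields[2])
            except (ValueError, IndexError):
                continue  # blank or malformed line
    return None
```

A caller could then write args.depotkey = load_depot_key(args.depotid) and report an error only when the result is None.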
Copy & paste the following snippet into your terminal to import this patchset into git:
curl -s https://lists.sr.ht/~blowry/steamarchiver/patches/53782/mbox | git am -3
From: Andrew Vineyard <TechnoMage6@gmail.com>

---
 .gitignore         |   1 +
 depot_archiver.py  |  14 ++---
 depot_extractor.py |  12 +++-
 depot_validator.py | 152 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 171 insertions(+), 8 deletions(-)
 create mode 100644 depot_validator.py

diff --git a/.gitignore b/.gitignore
index 2e78875..b4f6805 100644
--- a/.gitignore
+++ b/.gitignore
@@ -7,6 +7,7 @@ virtualenv/
 extract/
 clientmanifests/
 clientpackages/
+keys/
 *.swp
 depot_keys.txt
 last_change.txt
diff --git a/depot_archiver.py b/depot_archiver.py
index 994cdf0..c8bdb4e 100644
--- a/depot_archiver.py
+++ b/depot_archiver.py
@@ -104,7 +104,7 @@ def archive_manifest(manifest, c, name="unknown", dry_run=False, server_override
                             content = await response.content.read()
                             break
                         elif 400 <= response.status < 500:
-                            print(f"error: received status code {response.status} (on chunk {chunk_str}, server {host})")
+                            print(f"\033[31merror: received status code {response.status} (on chunk {chunk_str}, server {host})\003[0m")
                             return False
                 except Exception as e:
                     print("rotating to next server:", e)
@@ -160,8 +160,8 @@ def try_load_manifest(appid, depotid, manifestid):
     makedirs("./depots/%s" % depotid, exist_ok=True)
     if path.exists(dest):
         with open(dest, "rb") as f:
-            manifest = CDNDepotManifest(c, appid, f.read())
             print("Loaded cached manifest %s from disk" % manifestid)
+            return CDNDepotManifest(c, appid, f.read())
     else:
         while True:
             license_requested = False
@@ -218,18 +218,18 @@ if __name__ == "__main__":
     if args.workshop_id:
         response = steam_client.send_um_and_wait("PublishedFile.GetDetails#1", {'publishedfileids':[args.workshop_id]})
         if response.header.eresult != EResult.OK:
-            print("error: couldn't get workshop item info:", response.header.error_message)
+            print("\033[31merror: couldn't get workshop item info:\033[0m", response.header.error_message)
             exit(1)
         file = response.body.publishedfiledetails[0]
         if file.result != EResult.OK:
-            print("error: steam returned error", EResult(file.result))
+            print("\033[31merror: steam returned error\033[0m", EResult(file.result))
             exit(1)
         print("Retrieved data for workshop item", file.title, "for app", file.consumer_appid, "(%s)" % file.app_name)
         if not file.hcontent_file:
-            print("error: workshop item is not on SteamPipe")
+            print("\033[31merror: workshop item is not on SteamPipe\033[0m")
             exit(1)
         if file.file_url:
-            print("error: workshop item is not on SteamPipe: its download URL is", file.file_url)
+            print("\033[31merror: workshop item is not on SteamPipe: its download URL is\033[0m", file.file_url)
             exit(1)
         archive_manifest(try_load_manifest(file.consumer_appid, file.consumer_appid, file.hcontent_file), c, file.title, args.dry_run, args.server, args.backup)
         exit(0)
@@ -251,7 +251,7 @@ if __name__ == "__main__":
                     if changenumber > highest_changenumber:
                         highest_changenumber = changenumber
             if highest_changenumber == 0:
-                print("error: -l flag specified, but no local appinfo exists for app", appid)
+                print("\033[31merror: -l flag specified, but no local appinfo exists for app\033[0m", appid)
                 exit(1)
             appinfo_path = "./appinfo/%s_%s.vdf" % (appid, highest_changenumber)
         else:
diff --git a/depot_extractor.py b/depot_extractor.py
index faecbcc..3ab857f 100644
--- a/depot_extractor.py
+++ b/depot_extractor.py
@@ -31,6 +31,7 @@ from chunkstore import Chunkstore
 
 if __name__ == "__main__":
     path = "./depots/%s/" % args.depotid
+    keyfile = "./keys/%s.depotkey" % args.depotid
     manifest = None
     with open(path + "%s.zip" % args.manifestid, "rb") as f:
         manifest = DepotManifest(f.read())
@@ -39,7 +40,16 @@ if __name__ == "__main__":
         if manifest.filenames_encrypted:
             manifest.decrypt_filenames(args.depotkey)
     elif manifest.filenames_encrypted:
-        if exists("./depot_keys.txt"):
+        ## Using No-Intro's DepotKey format, which is
+        ## a 32-byte/256-bit binary file.
+        ## Examples require login to No-Intro to view.
+        if exists(keyfile):
+            with open(keyfile, "rb") as f:
+                args.depotkey = f.read()
+                manifest.decrypt_filenames(args.depotkey)
+        ## If depotkey is not found, locate depot_keys.txt
+        ## and check if key is located in there.
+        elif exists("./depot_keys.txt"):
             with open("./depot_keys.txt", "r", encoding="utf-8") as f:
                 for line in f.read().split("\n"):
                     line = line.split("\t")
diff --git a/depot_validator.py b/depot_validator.py
new file mode 100644
index 0000000..aba45bb
--- /dev/null
+++ b/depot_validator.py
@@ -0,0 +1,152 @@
+#!/usr/bin/env python3
+from argparse import ArgumentParser
+from binascii import hexlify, unhexlify
+from datetime import datetime
+from fnmatch import fnmatch
+from glob import glob
+from hashlib import sha1
+from io import BytesIO
+from os import scandir, makedirs, remove
+from os.path import dirname, exists
+from pathlib import Path
+from struct import unpack
+from sys import argv
+from zipfile import ZipFile
+import lzma
+
+if __name__ == "__main__": # exit before we import our shit if the args are wrong
+    parser = ArgumentParser(description='Extract downloaded depots.')
+    parser.add_argument('depotid', type=int)
+    parser.add_argument('depotkey', type=str, nargs='?')
+    parser.add_argument('-b', dest="backup", help="Path to a .csd backup file to extract (the manifest must also be present in the depots folder)", nargs='?')
+    args = parser.parse_args()
+
+from steam.core.manifest import DepotManifest
+from steam.core.crypto import symmetric_decrypt
+from chunkstore import Chunkstore
+
+if __name__ == "__main__":
+    path = "./depots/%s/" % args.depotid
+    keyfile = "./keys/%s.depotkey" % args.depotid
+    if args.depotkey:
+        args.depotkey = bytes.fromhex(args.depotkey)
+    elif exists(keyfile):
+        with open(keyfile, "rb") as f:
+            args.depotkey = f.read()
+    elif exists("./depot_keys.txt"):
+        with open("./depot_keys.txt", "r", encoding="utf-8") as f:
+            for line in f.read().split("\n"):
+                line = line.split("\t")
+                try:
+                    if int(line[0]) == args.depotid:
+                        args.depotkey = bytes.fromhex(line[2])
+                        break
+                except ValueError:
+                    pass
+            if not args.depotkey:
+                print("\033[31mERROR: files are encrypted, but no depot key was specified and no key for this depot exists in depot_keys.txt\033[0m")
+                exit(1)
+    else:
+        print("\033[31mERROR: files are encrypted, but no depot key was specified and no depot_keys.txt or depotkey file exists\033[0m")
+        exit(1)
+
+    chunks = {}
+    if args.backup:
+        chunkstores = {}
+        chunks_by_store = {}
+        for csm in glob(args.backup.replace("_1.csm","").replace("_1.csd","") + "_*.csm"):
+            chunkstore = Chunkstore(csm)
+            chunkstore.unpack()
+            for chunk, _ in chunkstore.chunks.items():
+                chunks[chunk] = _
+                chunks_by_store[chunk] = csm
+            chunkstores[csm] = chunkstore
+    else:
+        chunkFiles = [data.name for data in scandir(path) if data.is_file()
+        and not data.name.endswith(".zip")]
+        for name in chunkFiles: chunks[name] = 0
+
+    # print(f"{len(chunks)}")
+
+    def is_hex(s):
+        try:
+            unhexlify(s)
+            return True
+        except:
+            return False
+
+    badfiles = []
+
+    for file, value in chunks.items():
+        try:
+            if args.backup:
+                chunkhex = hexlify(file).decode()
+                chunk_data = None
+                is_encrypted = False
+                try:
+                    chunkstore = chunkstores[chunks_by_store[file]]
+                    chunk_data = chunkstore.get_chunk(file)
+                    is_encrypted = chunkstore.is_encrypted
+                except Exception as e:
+                    print(f"\033[31mError retrieving chunk\033[0m {chunkhex}: {e}")
+                    ##breakpoint()
+                    continue
+                if is_encrypted:
+                    if args.depotkey:
+                        decrypted = symmetric_decrypt(chunk_data, args.depotkey)
+                    else:
+                        print("\033[31mERROR: chunk %s is encrypted, but no depot key was specified\033[0m" % chunkhex)
+                        exit(1)
+                else:
+                    decrypted = chunk_data
+                chunk_data = None
+
+            else:
+                chunkhex = hexlify(unhexlify(file.replace("_decrypted", ""))).decode()
+                if exists(path + chunkhex):
+                    with open(path + chunkhex, "rb") as chunkfile:
+                        if args.depotkey:
+                            try:
+                                decrypted = symmetric_decrypt(chunkfile.read(), args.depotkey)
+                            except ValueError as e:
+                                print(f"{e}")
+                                print(f"\033[31mError, unable to decrypt file:\033[0m {chunkhex}")
+                                badfiles.append(chunkhex)
+                                continue
+                        else:
+                            print("\033[31mERROR: chunk %s is encrypted, but no depot key was specified\033[0m" % chunkhex)
+                            exit(1)
+                elif exists(path + chunkhex + "_decrypted"):
+                    with open(path + chunkhex + "_decrypted", "rb") as chunkfile:
+                        decrypted = chunkfile.read()
+                else:
+                    print("missing chunk " + chunkhex)
+                    continue
+            decompressed = None
+            if decrypted[:2] == b'VZ': # LZMA
+                decompressedSize = unpack('<i', decrypted[-6:-2])[0]
+                print("Testing (LZMA) from chunk", chunkhex, "Size:", decompressedSize)
+                try:
+                    decompressed = lzma.LZMADecompressor(lzma.FORMAT_RAW, filters=[lzma._decode_filter_properties(lzma.FILTER_LZMA1, decrypted[7:12])]).decompress(decrypted[12:-10])[:decompressedSize]
+                except lzma.LZMAError as e:
+                    print(f"\033[31mFailed to decompress:\033[0m {chunkhex}")
+                    print(f"\033[31mError:\033[0m {e}")
+                    badfiles.append(chunkhex)
+                    continue
+            elif decrypted[:2] == b'PK': # Zip
+                print("Testing (Zip) from chunk", chunkhex)
+                zipfile = ZipFile(BytesIO(decrypted))
+                decompressed = zipfile.read(zipfile.filelist[0])
+            else:
+                print("\033[31mERROR: unknown archive type\033[0m", decrypted[:2].decode())
+                badfiles.append(chunkhex)
+                continue
+                #exit(1)
+            sha = sha1(decompressed)
+            if sha.digest() != unhexlify(chunkhex):
+                print("\033[31mERROR: sha1 checksum mismatch\033[0m (expected %s, got %s)" % (chunkhex, sha.hexdigest()))
+                badfiles.append(chunkhex)
+        except IsADirectoryError:
+            pass
+    for bad in badfiles:
+        print(f"{bad}")
\ No newline at end of file
-- 
2.43.4
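The reason depot_validator.py can work without a manifest is visible at the end of its loop: each chunk's file name is compared against the SHA-1 of its decrypted, decompressed contents. The sketch below isolates just that comparison. The function name chunk_is_intact is hypothetical, and decryption and decompression are assumed to have happened already.

```python
from binascii import unhexlify
from hashlib import sha1


def chunk_is_intact(chunk_name, decompressed):
    """Return True if the chunk's name matches the SHA-1 of its contents.

    Hypothetical helper: chunk_name is the on-disk file name (hex digest,
    optionally with the "_decrypted" suffix used in the depots folder) and
    decompressed is the chunk's fully decoded payload as bytes.
    """
    expected_hex = chunk_name.replace("_decrypted", "")
    try:
        expected = unhexlify(expected_hex)
    except ValueError:  # binascii.Error is a ValueError subclass
        return False    # the name is not a hex digest at all
    return sha1(decompressed).digest() == expected


if __name__ == "__main__":
    payload = b"example chunk payload"
    name = sha1(payload).hexdigest()
    print(chunk_is_intact(name, payload))       # matching name and payload
    print(chunk_is_intact(name, b"corrupted"))  # payload no longer matches
```

Because the expected digest is carried by the file name itself, every chunk in a depot folder can be checked once, regardless of how many manifests reference it.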