An Analysis of the Source 2 Engine (Part 1): The Schema System

Introduction

With the release of the Dota 2 Reborn Beta, users are able to try out the new Source 2 Engine. Just recently, Valve released the Mac and Linux clients for this beta.

What does this mean?

Well, as someone who is familiar with the internals of the Source Engine, getting my hands on the Mac files for one of its games is a dream come true. This is because all the debug symbols are present in these binaries, meaning all function names, class names, and variable names are visible. The tradition has not changed with Valve’s newest engine.

In this series, I am going to be analyzing the new parts of the Source 2 Engine. As a hacker, I will also be talking about the process of transitioning from Valve’s old engine to the new one, as Dota will inevitably move over to Source 2 entirely (other games may as well).

At the time of writing, the Source 2 SDK has not yet been released, so most of what follows comes from reverse engineering.

A Brief History on Source Networking and Classes

In Source 1, classes on the client that needed to be in sync with the server utilized networked variables (and client prediction, but that’s not important right now). This means that when a networked class instance was modified on the server’s end, it would dispatch the modified variables to the proper clients.

An example would be an entity’s health, or its uninterpolated position. Once those are modified on the server, it dispatches the information to clients it deems valid (like ones that have vision of the entity), and that data would be replicated accordingly on the client. For more information, click here.

One of the most important parts about networked variables (for a hacker or modder) is that for a variable to be networked, the game must expose the offset to this variable, relative to its object base. That way the client will know what to modify once it receives an update from the server.
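To make the idea of "an offset relative to the object base" concrete, here is a minimal sketch of reading a member through such an offset. The helper name ReadField and the FakeEntity layout are hypothetical and exist only for illustration; they are not taken from the engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads a value of type T located `offset` bytes past the start of an object.
// memcpy is used instead of a pointer cast to sidestep alignment and aliasing issues.
template <typename T>
T ReadField(const void* objectBase, std::size_t offset) {
    assert(objectBase != nullptr);
    T value;
    std::memcpy(&value,
                static_cast<const std::uint8_t*>(objectBase) + offset,
                sizeof(T));
    return value;
}

// Hypothetical entity layout, used only to demonstrate the helper.
struct FakeEntity {
    char padding[0x10];
    int health;  // lives at offset 0x10 from the object base
};
```

With a real entity pointer and an offset taken from the game's tables, a call such as ReadField<int>(entity, healthOffset) would return the current value without needing the class definition.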

Retrieving these offsets involves calling CHLClient::GetAllClasses, which returns the head of a ClientClass linked list. One would then walk each ClientClass::m_pRecvTable, iterating over its RecvProps and retrieving each prop’s offset.
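The traversal described above could look roughly like the sketch below. The structs are simplified stand-ins, not the real Source 1 SDK definitions (which carry more members and expose some of these through accessors), and the recursion into nested tables is an assumption about how child data tables are usually handled. NetvarEntry, CollectProps and CollectAllNetvars are hypothetical names.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the Source 1 types; the real structs have more fields.
struct RecvTable;

struct RecvProp {
    const char* m_pVarName;
    int         m_Offset;      // offset relative to the owning object or table
    RecvTable*  m_pDataTable;  // non-null when the prop is itself a nested table
};

struct RecvTable {
    RecvProp*   m_pProps;
    int         m_nProps;
    const char* m_pNetTableName;
};

struct ClientClass {
    const char*  m_pNetworkName;
    RecvTable*   m_pRecvTable;
    ClientClass* m_pNext;
};

struct NetvarEntry {
    std::string name;
    int         offset;
};

// Records every prop in a table, descending into nested tables (assumed behaviour).
void CollectProps(const RecvTable* table, int baseOffset,
                  const std::string& className, std::vector<NetvarEntry>& out) {
    if (!table) return;
    for (int i = 0; i < table->m_nProps; ++i) {
        const RecvProp& prop = table->m_pProps[i];
        const int offset = baseOffset + prop.m_Offset;
        out.push_back({className + "::" + prop.m_pVarName, offset});
        if (prop.m_pDataTable)
            CollectProps(prop.m_pDataTable, offset, className, out);
    }
}

// Walks the linked list whose head GetAllClasses would return.
std::vector<NetvarEntry> CollectAllNetvars(const ClientClass* head) {
    std::vector<NetvarEntry> out;
    for (const ClientClass* c = head; c != nullptr; c = c->m_pNext)
        CollectProps(c->m_pRecvTable, 0, c->m_pNetworkName, out);
    return out;
}
```

In a real tool, the head pointer would come from the game's client interface rather than being built by hand.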

While networked variables are certainly advantageous for getting an idea of how a class works and mapping them out, there are still many variables that are not networked within these classes. Such variables do not have their offsets exposed like networked variables. This means there is extra work needed on the reverse engineer’s end to find non-networked member variables residing within a class.

Source 2, however, changes the game completely.

Source 2’s Schema System

With the release of Source 2, there is now a Schema system used by its many modules. This Schema system is used to create very detailed descriptions of Classes, Enumerators, and Types. Many classes have a Schema binding with detailed descriptions of their members and inheritance, even if they don’t contain networked variables. There are even full listings of many Enumerators that were in Source 1, but were not exposed.

CHLClient::GetAllClasses is still in Source 2 (now called CSource2Client::GetAllClasses). While it still contains listings and the Schema layout of many of client.dll’s classes, it does not contain all of them.

Digging deeper, ClientClass is a smaller part of a much more ambitious solution. There is a whole module called schemasystem.dll which is the core of this new Schema system responsible for mapping out and describing many of Source 2’s Classes, Enumerators, and Types.

The schemasystem.dll’s main interface, ISchemaSystem, can be retrieved the same way an interface is retrieved in Source 1, by calling the schemasystem.dll export: CreateInterface.
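As a rough sketch of that retrieval: the function-pointer signature below matches the CreateInterface export as used in Source 1, while the version string "SchemaSystem_001", the GetInterface helper, and the MockFactory stand-in are assumptions made for illustration. In a real process, the factory pointer would come from GetProcAddress on Windows or dlsym on Mac/Linux.

```cpp
#include <cassert>
#include <cstring>

// Signature of the CreateInterface export, as in Source 1 modules.
using CreateInterfaceFn = void* (*)(const char* name, int* returnCode);

// Opaque placeholder; the real interface layout is not reproduced here.
struct ISchemaSystem {};

// Asks a module's factory for an interface and casts it to the expected type.
// Returns nullptr when the module does not know the requested name.
template <typename T>
T* GetInterface(CreateInterfaceFn factory, const char* versionedName) {
    assert(factory != nullptr);
    int returnCode = 0;
    void* iface = factory(versionedName, &returnCode);
    return static_cast<T*>(iface);
}

// Stand-in for the real export so the sketch can run without the game.
// "SchemaSystem_001" is an assumed version string, not confirmed by the article.
void* MockFactory(const char* name, int* returnCode) {
    static ISchemaSystem instance;
    const bool found = std::strcmp(name, "SchemaSystem_001") == 0;
    if (returnCode) *returnCode = found ? 0 : 1;
    return found ? &instance : nullptr;
}
```

On Windows, the factory would typically be obtained with GetProcAddress(GetModuleHandleA("schemasystem.dll"), "CreateInterface") and cast to CreateInterfaceFn before being passed to GetInterface.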

CSchemaSystem’s virtual method table is laid out in the Mac x64 libschemasystem.dylib like so:

The important functions to focus on are:

  • CSchemaSystem::GlobalTypeScope
  • CSchemaSystem::FindTypeScopeForModule

Both of these functions return a CSchemaSystemTypeScope* variable for doing operations under their scope.

  • GlobalTypeScope returns a CSchemaSystemTypeScope* used for doing operations on Schema that were created under this global scope, but can still be assigned a module.
  • FindTypeScopeForModule returns a CSchemaSystemTypeScope* used for doing operations on Schema that were created under the scope of the module name that was passed (like “client.dll”).

That being said, these two functions allow iteration over the many Schema definitions declared within their scope. While there are no direct functions to iterate over the Schema definitions, they are still stored within the CSchemaSystemTypeScope at offset 0x450 for classes, and 0x1C90 for enumerators (32-bit). The Schema each scope operates on differs completely from the other’s, so it is imperative to use the proper scope for the Class/Enumerator/Type to operate on.

CSchemaSystemTypeScope’s virtual method table is laid out like so:

FindDeclaredClass can be used to find a class declared within the CSchemaSystemTypeScope that it’s called from. It returns a CSchemaClassInfo*, which is basically an empty class that inherits from SchemaClassInfoData_t.
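The relationship between the two scopes and FindDeclaredClass can be modelled with a small mock. Everything prefixed with Mock below is a hypothetical stand-in, not the engine's real classes, and the "module scope first, then global scope" fallback in FindClass is a convenience chosen for this sketch rather than documented engine behaviour.

```cpp
#include <map>
#include <string>

// Stand-in for the class description returned by FindDeclaredClass.
struct MockClassInfo {
    std::string name;
};

// Mock of a type scope: it only knows the classes declared inside it.
class MockTypeScope {
public:
    void Declare(const std::string& name) { classes_[name] = MockClassInfo{name}; }

    const MockClassInfo* FindDeclaredClass(const std::string& name) const {
        auto it = classes_.find(name);
        return it == classes_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, MockClassInfo> classes_;
};

// Mock of the schema system: one global scope plus one scope per module.
class MockSchemaSystem {
public:
    MockTypeScope* GlobalTypeScope() { return &global_; }

    MockTypeScope* FindTypeScopeForModule(const std::string& module) {
        auto it = modules_.find(module);
        return it == modules_.end() ? nullptr : &it->second;
    }

    // Test helper: registers (or fetches) a module scope.
    MockTypeScope& AddModule(const std::string& module) { return modules_[module]; }

private:
    MockTypeScope global_;
    std::map<std::string, MockTypeScope> modules_;
};

// Looks in the module's scope first, then falls back to the global scope.
const MockClassInfo* FindClass(MockSchemaSystem& sys, const std::string& module,
                               const std::string& className) {
    if (MockTypeScope* scope = sys.FindTypeScopeForModule(module)) {
        if (const MockClassInfo* info = scope->FindDeclaredClass(className))
            return info;
    }
    return sys.GlobalTypeScope()->FindDeclaredClass(className);
}
```

The point of the mock is that a class declared under "client.dll" is invisible from the global scope, which is why picking the right scope matters.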

SchemaClassInfoData_t is the class that describes, in astonishing detail, many classes in Source 2.

That is exactly how Source 2 has SchemaClassInfoData_t set up, from the variable names to the variable types. That is because not only does this new Schema system describe classes, it actually also describes the very classes that it uses to describe other classes.

For proof, there are 3 hidden console commands that Valve has in Dota 2 Reborn.

  1. schema_list_bindings
  2. schema_dump_binding
  3. schema_detailed_class_layout

These console commands use the CSchemaSystem::GlobalTypeScope shown earlier, so they will not show every single class in the game. To do that, there needs to be iteration over every registered module’s CSchemaSystemTypeScope, which these console commands don’t do.

The “Unaccounted” portions of the dump were named by looking over the class info area in the Mac binaries, to determine what those areas were for.

The second “Unaccounted” piece of data is a SchemaArray_t<SchemaStaticFieldData_t> used for describing static members of a class. This is something that Source 1 did not do. Here’s an example dump.

24 Jun 2015 06:30:41 – C_DOTAGameManagerProxy STATIC MEMBERS
24 Jun 2015 06:30:41 – C_DOTAGameManagerProxy::s_pGameManagerProxy: 0x44AFD364

Source2Gen

Using the previous information, an SDK generator can be created. The goal is to create header files that can be used in a cheat or mod to streamline the process of creating interoperability with the game engine.

Source2Gen does exactly this. It creates headers for many of the exposed classes and enumerators residing within the target Source 2 game.
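To illustrate the general idea of turning field descriptions into a header, here is a minimal sketch. It is not Source2Gen's actual implementation: FieldDesc and GenerateStruct are hypothetical, and the sketch ignores inheritance, alignment, and nested types, all of which a real generator would need to handle.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, flattened description of one class member.
struct FieldDesc {
    std::string type;
    std::string name;
    std::size_t offset;  // byte offset from the start of the class
    std::size_t size;    // size of the member in bytes
};

// Emits a struct definition, filling the gaps between known fields with padding
// so that each known field lands at its recorded offset.
std::string GenerateStruct(const std::string& className, std::size_t classSize,
                           std::vector<FieldDesc> fields) {
    std::sort(fields.begin(), fields.end(),
              [](const FieldDesc& a, const FieldDesc& b) { return a.offset < b.offset; });

    std::ostringstream out;
    out << "struct " << className << "\n{\n";

    std::size_t cursor = 0;  // first byte not yet covered by an emitted member
    auto padTo = [&](std::size_t target) {
        if (target > cursor) {
            out << "    char pad_" << std::hex << cursor
                << "[0x" << (target - cursor) << "];\n";
            cursor = target;
        }
    };

    for (const FieldDesc& f : fields) {
        padTo(f.offset);
        out << "    " << f.type << " " << f.name
            << "; // 0x" << std::hex << f.offset << "\n";
        cursor = f.offset + f.size;
    }
    padTo(classSize);  // trailing padding up to the full class size

    out << "};\n";
    return out.str();
}
```

Feeding it the member names, offsets, and sizes recovered from the Schema system would yield a struct whose known members sit at the right offsets, provided those members are naturally aligned.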

An example class that it will generate:

The repository can be found here.

praydog / June 24, 2015 / hacking, reverse engineering

Comments

  1. badstreff - June 7, 2022 @ 8:11 pm

    Is there something special you needed to do to enable the 3 hidden console commands? I’ve done a bit of poking around and it looks like they are created/registered when the CSchemaSystem class is initialized, but I am not 100% sure here. Just taking a guess.
