Instance and Device

In the previous chapter we set up the development environment. We installed the Vulkan headers and library, familiarized ourselves with rust-bindgen and installed its dependencies. Then we generated bindings to the Vulkan API and linked our application to it.

The next step is to start writing our vk_tutorial crate. Cargo only generated a main file with a hello world main function in it. We are going to start calling into Vulkan, familiarize ourselves with a few fundamental Vulkan objects along the way and build up our mental model of how they work.

Afterwards we will try out one of the most useful tools for debugging Vulkan applications, the Validation layers.

This tutorial is in open beta. There may be bugs in the code and misinformation and inaccuracies in the text. If you find any, feel free to open a ticket on the repo of the code samples.

Importing Vulkan bindings

The Rust bindings generated from the C headers are made available through the vk_bindings crate. You can import it and use its functions and structs.


use vk_bindings::*;

fn main()
{
    // ...
}

Before we actually start writing our Vulkan application, we need to mention one thing: rust-bindgen isn't perfect. In its C headers, Vulkan uses black magic called C macros for certain constants and simple helpers. One of these is VK_MAKE_API_VERSION, which looks like this in C.


#define VK_MAKE_API_VERSION(variant, major, minor, patch) \
    ((((uint32_t)(variant)) << 29) | (((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

Bindgen does not generate a Rust function out of this, and occasionally struggles with other macros as well. Since we need to use this VK_MAKE_API_VERSION right at the beginning, here is a Rust function that does the exact same thing.


fn make_version(
    variant: u32,
    major: u32,
    minor: u32,
    patch: u32
) -> u32
{
    ((variant) << 29) |
    ((major) << 22) |
    ((minor) << 12) |
    (patch)
}
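
The bit layout of the macro can be checked by unpacking a version again. The sketch below repeats the packing function in compact form and adds a hypothetical unpack_version helper (it is not part of the bindings). The masks assume the layout shown in the C macro: 3 bits for the variant, 7 for the major, 10 for the minor and 12 for the patch version.

```rust
/// Same packing as the C macro VK_MAKE_API_VERSION.
fn make_version(variant: u32, major: u32, minor: u32, patch: u32) -> u32 {
    (variant << 29) | (major << 22) | (minor << 12) | patch
}

/// Hypothetical helper: splits a packed version into
/// (variant, major, minor, patch).
fn unpack_version(version: u32) -> (u32, u32, u32, u32) {
    (
        version >> 29,           // variant: top 3 bits
        (version >> 22) & 0x7f,  // major: 7 bits
        (version >> 12) & 0x3ff, // minor: 10 bits
        version & 0xfff,         // patch: 12 bits
    )
}

fn main() {
    let version = make_version(0, 1, 3, 250);
    println!("{:#010x} -> {:?}", version, unpack_version(version));
}
```

A helper like this may come in handy later for printing the apiVersion a device reports in a readable form.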

Once we have copied make_version into our application, it's time to start coding inside our main function.

Instance

The first thing you need to do is create a Vulkan instance.

The Vulkan instance is the entry point of Vulkan. It initializes the Vulkan library and serves as the root object that lets you query or create fundamental objects your graphics application will use to read hardware info, allocate GPU resources, issue commands to the GPU and present the rendered image to a window.


fn main()
{
    //
    // Instance creation
    //

    let app_name_bytes = b"vk rust\0";
    let app_name = unsafe
    {
        core::ffi::CStr::from_bytes_with_nul_unchecked(
            app_name_bytes
        )
    };
    let engine_name_bytes = b"Tutorial engine\0";
    let engine_name = unsafe
    {
        core::ffi::CStr::from_bytes_with_nul_unchecked(
            engine_name_bytes
        )
    };

    let application_info = VkApplicationInfo {
        sType: VK_STRUCTURE_TYPE_APPLICATION_INFO,
        pNext: core::ptr::null(),
        pApplicationName: app_name.as_ptr(),
        applicationVersion: make_version(0, 0, 1, 0),
        pEngineName: engine_name.as_ptr(),
        engineVersion: make_version(0, 0, 1, 0),
        apiVersion: make_version(0, 1, 0, 0)
    };

    let create_info = VkInstanceCreateInfo {
        sType: VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
        pNext: core::ptr::null(),
        flags: 0x0,
        pApplicationInfo: &application_info,
        enabledExtensionCount: 0,
        ppEnabledExtensionNames: core::ptr::null(),
        enabledLayerCount: 0,
        ppEnabledLayerNames: core::ptr::null()
    };

    println!("Creating instance.");
    let mut instance = core::ptr::null_mut();
    let result = unsafe
    {
        vkCreateInstance(
            &create_info,
            core::ptr::null(),
            &mut instance
        )
    };

    if result != VK_SUCCESS
    {
        panic!("Failed to create instance. Error: {:?}.", result);
    }

    // ...
}

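Vulkan is a C API, and it expects null terminated C strings as the application and engine name, so we write them as byte string literals ending in \0 and use Rust's ffi library to create CStr instances from them. The unchecked constructor skips validation, so it is our responsibility to make sure each literal ends with exactly one nul byte and contains no other nul bytes.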
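
If you would rather not rely on unsafe for string literals, a checked alternative could look like the sketch below. It uses CStr::from_bytes_with_nul, which validates the terminator at runtime. The checked_c_str helper is a hypothetical name introduced only for this illustration.

```rust
use core::ffi::CStr;

/// Hypothetical helper: builds a CStr from a byte string and verifies
/// that the only nul byte is the one at the very end.
fn checked_c_str(bytes: &'static [u8]) -> &'static CStr {
    CStr::from_bytes_with_nul(bytes)
        .expect("string must end with a single trailing nul byte")
}

fn main() {
    let app_name = checked_c_str(b"vk rust\0");
    let engine_name = checked_c_str(b"Tutorial engine\0");

    // as_ptr() yields the *const c_char that the Vulkan structs expect.
    let _app_name_ptr = app_name.as_ptr();

    println!("{:?} / {:?}", app_name, engine_name);

    // A missing terminator is reported as an error instead of being
    // silently accepted.
    assert!(CStr::from_bytes_with_nul(b"no terminator").is_err());
}
```

The cost is one scan of each string at startup, which is negligible for a handful of names.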

Then we fill a VkApplicationInfo structure with the application and engine name, version, and the required Vulkan API version.

Afterwards we create a VkInstanceCreateInfo struct, which will refer to our previously created application info, and contains the instance extensions and layers our application will use. Since we currently do not use instance extensions or layers, we set them to null pointers.

Then we call vkCreateInstance, which consumes our create info structure and creates the Vulkan instance. The last parameter is a pointer to the VkInstance variable that the new Vulkan instance handle will be written to. The function returns a VkResult, which is VK_SUCCESS if the operation succeeded and an error code describing the failure otherwise. In case of an error we terminate the program.
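
Printing the raw VkResult number is not very readable. Below is a minimal sketch of a hypothetical check helper that turns a few result codes into messages. The type alias and constants are stand-ins so the example compiles without the bindings. Their numeric values are the ones from the Vulkan headers, but the integer type bindgen generates for VkResult may differ on your platform.

```rust
// Stand-ins for the generated bindings (assumption: VkResult is an i32).
type VkResult = i32;
const VK_SUCCESS: VkResult = 0;
const VK_ERROR_OUT_OF_HOST_MEMORY: VkResult = -1;
const VK_ERROR_LAYER_NOT_PRESENT: VkResult = -6;
const VK_ERROR_EXTENSION_NOT_PRESENT: VkResult = -7;
const VK_ERROR_INCOMPATIBLE_DRIVER: VkResult = -9;

/// Hypothetical helper: maps a VkResult to Ok or a readable error message.
fn check(result: VkResult, what: &str) -> Result<(), String> {
    let reason = match result {
        VK_SUCCESS => return Ok(()),
        VK_ERROR_OUT_OF_HOST_MEMORY => "out of host memory",
        VK_ERROR_LAYER_NOT_PRESENT => "requested layer not present",
        VK_ERROR_EXTENSION_NOT_PRESENT => "requested extension not present",
        VK_ERROR_INCOMPATIBLE_DRIVER => "incompatible driver",
        _ => "unrecognized result code",
    };
    Err(format!("{} failed: {} ({})", what, reason, result))
}

fn main() {
    // In the real program the result would come from vkCreateInstance.
    if let Err(message) = check(VK_ERROR_INCOMPATIBLE_DRIVER, "vkCreateInstance") {
        println!("{}", message);
    }
}
```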

Since we are interfacing with a C API without Rust destructors cleaning up after us, we need to manually free resources that we allocate and Vulkan objects are no exception. At the end of our main function we need to destroy our newly created Vulkan instance using vkDestroyInstance.


fn main()
{
    // ...

    //
    // Cleanup
    //

    println!("Deleting instance.");
    unsafe
    {
        vkDestroyInstance(
            instance,
            core::ptr::null()
        );
    }
}

Now we have a well-defined skeleton main. From here on it should be self-explanatory where to put the rest of the code.

Physical device selection

Now that we have a Vulkan instance, it's time to query our first fundamental object, a Vulkan physical device.

A Vulkan physical device represents a Vulkan capable device in your computer. Using a Vulkan physical device you can query device capabilities and features, and later create logical devices to do actual work.

Physical devices do not need to be destroyed, because their lifetime is tied to the Vulkan instance: when the instance is destroyed, the physical devices are also freed.


    //
    // Enumerating physical devices
    //

    let mut phys_device_count: u32 = 0;
    let result = unsafe
    {
        vkEnumeratePhysicalDevices(
            instance,
            &mut phys_device_count,
            core::ptr::null_mut()
        )
    };

    if result != VK_SUCCESS
    {
        panic!("Failed to enumerate physical devices. Error: {:?}.", result);
    }

    if phys_device_count == 0
    {
        panic!("No vulkan capable device found.");
    }

    let mut phys_devices = vec![core::ptr::null_mut(); phys_device_count as usize];
    let result = unsafe
    {
        vkEnumeratePhysicalDevices(
            instance,
            &mut phys_device_count,
            phys_devices.as_mut_ptr()
        )
    };

    if result != VK_SUCCESS
    {
        panic!("Failed to enumerate physical devices. Error: {:?}.", result);
    }

    let phys_device_index = 0; // You can choose any of the returned devices. I choose the first one.
    let chosen_phys_device = phys_devices[phys_device_index];

There are two calls to vkEnumeratePhysicalDevices, because it takes two mutable pointers, pPhysicalDeviceCount and pPhysicalDevices, and behaves differently depending on whether pPhysicalDevices is a null pointer.

If pPhysicalDevices is null, then the function writes the number of available physical devices into the variable pPhysicalDeviceCount points to. This is how you query the number of available devices.

If pPhysicalDevices is not null, then it must point to the beginning of a VkPhysicalDevice array, where vkEnumeratePhysicalDevices will write the available physical devices, and pPhysicalDeviceCount must point to a variable that the application has set to the capacity of the pPhysicalDevices array before the call. After the call vkEnumeratePhysicalDevices overwrites this variable with the actual number of VkPhysicalDevice handles written. If the array was too small to hold every device, the function returns VK_INCOMPLETE instead of VK_SUCCESS. Using the function this way actually gets the physical devices.

So the first call queries the amount of memory we need to store every physical device, then we create the phys_devices vector where the physical device handles will be stored, and the second call will actually read every physical device into it. If we do not find any devices, or get some error, then we terminate the application.
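
The two-call scheme can be tried out without a GPU. The sketch below uses a mock function, mock_enumerate, which is an assumption made for illustration: it follows the same count-and-array contract described above and pretends that three devices exist. The enumerate_all wrapper is a hypothetical helper as well.

```rust
// Stand-ins so the sketch runs without a Vulkan driver.
const VK_SUCCESS: i32 = 0;
const VK_INCOMPLETE: i32 = 5;
type Handle = usize;

/// Mock following the contract of vkEnumeratePhysicalDevices.
/// It pretends that the machine has three devices.
unsafe fn mock_enumerate(p_count: *mut u32, p_items: *mut Handle) -> i32 {
    let available: [Handle; 3] = [0x10, 0x20, 0x30];
    unsafe {
        if p_items.is_null() {
            // Query mode: only report how many items exist.
            *p_count = available.len() as u32;
            return VK_SUCCESS;
        }
        // Fill mode: write at most *p_count items, then report how
        // many were actually written.
        let written = (*p_count as usize).min(available.len());
        for (i, handle) in available.iter().take(written).enumerate() {
            *p_items.add(i) = *handle;
        }
        *p_count = written as u32;
        if written < available.len() { VK_INCOMPLETE } else { VK_SUCCESS }
    }
}

/// Hypothetical helper wrapping both calls.
fn enumerate_all() -> Vec<Handle> {
    let mut count: u32 = 0;
    let result = unsafe { mock_enumerate(&mut count, core::ptr::null_mut()) };
    assert_eq!(result, VK_SUCCESS);

    let mut items = vec![0; count as usize];
    let result = unsafe { mock_enumerate(&mut count, items.as_mut_ptr()) };
    assert_eq!(result, VK_SUCCESS);

    // Keep only the elements that were actually written.
    items.truncate(count as usize);
    items
}

fn main() {
    println!("Found handles: {:x?}", enumerate_all());
}
```

The same shape applies to the other enumeration functions in this chapter, only the element type changes.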

If everything went well, then in this tutorial I will just grab the first device and run the application with it. Once your graphics app grows serious you may want to do something else, such as making the physical device selectable in a game launcher or in the options menu, but that's up to you and not the point of this tutorial.

Now that we have chosen a Vulkan physical device let's discover its capabilities! Let's start with printing out our device's name for the sake of familiarizing ourselves with the API! (This might actually come in handy if you have multiple GPUs, such as one dedicated and one integrated, and you are testing the application on both. Both cards can be enumerated and they may have different names.)


    //
    // Checking physical device capabilities
    //

    // Getting physical device properties
    let mut phys_device_properties = VkPhysicalDeviceProperties::default();
    unsafe
    {
        vkGetPhysicalDeviceProperties(
            chosen_phys_device,
            &mut phys_device_properties
        );
    }

    let device_name = unsafe
    {
        core::ffi::CStr::from_ptr(
            phys_device_properties.deviceName.as_ptr()
        )
    };
    println!("Chosen device name: {:?}", device_name);

The function vkGetPhysicalDeviceProperties gets several parameters of a physical device into a VkPhysicalDeviceProperties struct, such as the device name, supported API version, device limits and so on. Then we wrap the device name into a CStr and print it to the standard output. Later we will use this struct to determine whether the selected device meets certain system requirements.
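
CStr::from_ptr trusts that a terminator exists somewhere after the pointer. Since deviceName is a fixed-size array, a bounded conversion is possible too. The sketch below is one way to do it. The name_from_fixed_array helper is hypothetical, and a local array stands in for the deviceName field.

```rust
use core::ffi::c_char;

/// Hypothetical helper: copies a nul terminated name out of a fixed-size
/// C char array into a String, never reading past the end of the array.
fn name_from_fixed_array(raw: &[c_char]) -> String {
    let bytes: Vec<u8> = raw
        .iter()
        .map(|&c| c as u8)       // c_char is i8 or u8 depending on the platform
        .take_while(|&b| b != 0) // stop at the terminator
        .collect();
    String::from_utf8_lossy(&bytes).into_owned()
}

fn main() {
    // Stand-in for VkPhysicalDeviceProperties.deviceName.
    let mut device_name = [0 as c_char; 256];
    for (dst, src) in device_name.iter_mut().zip(b"Example GPU".iter()) {
        *dst = *src as c_char;
    }

    println!("Chosen device name: {}", name_from_fixed_array(&device_name));
}
```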

Now it's time to query some data that brings us closer to making a functional app! Let's query the kinds of queues the device supports for command submission!

Vulkan queues allow you to submit batches of commands for asynchronous execution on the device.

Queues are organized into queue families. A queue family is a set of queues that share the same supported commands and a few other properties.


    // Checking queues
    let mut queue_family_count: u32 = 0;
    unsafe
    {
        vkGetPhysicalDeviceQueueFamilyProperties(
            chosen_phys_device,
            &mut queue_family_count,
            core::ptr::null_mut()
        );
    }

    let mut queue_families = vec![VkQueueFamilyProperties::default(); queue_family_count as usize];
    unsafe
    {
        vkGetPhysicalDeviceQueueFamilyProperties(
            chosen_phys_device,
            &mut queue_family_count,
            queue_families.as_mut_ptr()
        );
    }

    let mut chosen_graphics_queue_family: i32 = -1;
    let mut chosen_graphics_queue_index: u32 = 0;

    for i in 0..queue_families.len()
    {
        let queue_family_index = i as i32;
        let queue_family = &queue_families[i];

        if queue_family.queueFlags & VK_QUEUE_GRAPHICS_BIT as VkQueueFlags != 0
        {
            chosen_graphics_queue_family = queue_family_index;
            chosen_graphics_queue_index = 0;
        }
    }

    if chosen_graphics_queue_family == -1
    {
        panic!("Chosen physical device is not suitable.");
    }

    let chosen_graphics_queue_family = chosen_graphics_queue_family as u32;

The scheme is similar to what we did during physical device enumeration: first we query the number of queue families, then we allocate enough memory, and read the queue family data into this memory.

The parameters of vkGetPhysicalDeviceQueueFamilyProperties are pQueueFamilyPropertyCount and pQueueFamilyProperties. If pQueueFamilyProperties is null, the number of queue families is written to pQueueFamilyPropertyCount. We use this to allocate memory for the queue family data in queue_families. Unlike vkEnumeratePhysicalDevices, this function does not return a VkResult, so there is no error code to check.

If pQueueFamilyProperties is not null, it must point to a VkQueueFamilyProperties array and pQueueFamilyPropertyCount must contain the capacity of this array. The function writes the queue family data into pQueueFamilyProperties and overwrites pQueueFamilyPropertyCount with the number of queue families actually written. This way we get the data into the queue_families array.

Then we need to select a suitable queue to submit work to in the future. At first we will search for a graphics queue and nothing else. If we find one, we will deem our device suitable. If we don't, we terminate the application.

The VkQueueFamilyProperties struct contains a number of fields, such as the kinds of commands the queue family can consume and the number of queues in the queue family. Since every family must support at least one queue, we only care about the former. If the queue family supports graphics operations, we deem it suitable. To check this we look at whether the VK_QUEUE_GRAPHICS_BIT bit is set in the VkQueueFamilyProperties.queueFlags field. If it is set, then graphics operations are supported and we are happy.

Pay attention! In the if condition you see an explicit cast of VK_QUEUE_GRAPHICS_BIT to VkQueueFlags. This is intentional, and you must add these explicit casts whenever you use Vulkan flags, because rust-bindgen has a flaw. It has platform specific behavior: on Linux both VkQueueFlags and VK_QUEUE_GRAPHICS_BIT are unsigned integers, whereas on Windows VkQueueFlags is an unsigned integer and VK_QUEUE_GRAPHICS_BIT is a signed integer.

The reason is that VK_QUEUE_GRAPHICS_BIT, along with pretty much every Vulkan flag, is a C enum constant, and these enum constants generated in the Rust bindings are unsigned on Linux and signed on Windows. The VkQueueFlags type on the other hand is just a typedef for VkFlags, which is a 32 bit unsigned integer, and this will be a u32 on both platforms.

If you omit this casting, you will receive platform specific compile errors. You will see such casts around Vulkan flags throughout this tutorial.
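
To see the cast in isolation, here is a small sketch with stand-in definitions. The constants are deliberately declared as i32 to imitate the Windows flavor of the bindings described above. On Linux they would be u32 and the very same expression would still compile, because casting a u32 to u32 is a no-op.

```rust
// Stand-ins imitating the bindings as they would look on Windows.
type VkQueueFlags = u32;
const VK_QUEUE_GRAPHICS_BIT: i32 = 0x1;
const VK_QUEUE_TRANSFER_BIT: i32 = 0x4;

/// Returns true if the graphics bit is set in the given flags.
fn supports_graphics(queue_flags: VkQueueFlags) -> bool {
    // Without the cast this line would be a type error here,
    // because u32 & i32 is not defined.
    (queue_flags & VK_QUEUE_GRAPHICS_BIT as VkQueueFlags) != 0
}

fn main() {
    let graphics_and_transfer =
        VK_QUEUE_GRAPHICS_BIT as VkQueueFlags | VK_QUEUE_TRANSFER_BIT as VkQueueFlags;
    let transfer_only = VK_QUEUE_TRANSFER_BIT as VkQueueFlags;

    println!("graphics + transfer: {}", supports_graphics(graphics_and_transfer));
    println!("transfer only: {}", supports_graphics(transfer_only));
}
```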

The queue family can be later identified by the array index into queue_families, so if the queue family is suitable, then we store this array index. The queue family index will be stored in chosen_graphics_queue_family. The individual queues inside a queue family are identified by an integer id from zero to VkQueueFamilyProperties.queueCount - 1. The index of the chosen graphics queue will be stored in chosen_graphics_queue_index.
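
The selection loop can also be written with iterators and an Option instead of the -1 sentinel. This is only a sketch under simplified assumptions: QueueFamily is a stand-in for VkQueueFamilyProperties, the constants are plain u32 values, and find_graphics_family is a hypothetical helper. Note that it returns the first matching family, whereas the loop in the tutorial keeps the last one it finds.

```rust
// Stand-ins for the generated bindings, so the sketch compiles on its own.
const VK_QUEUE_GRAPHICS_BIT: u32 = 0x1;
const VK_QUEUE_COMPUTE_BIT: u32 = 0x2;
const VK_QUEUE_TRANSFER_BIT: u32 = 0x4;

struct QueueFamily {
    queue_flags: u32,
    queue_count: u32,
}

/// Hypothetical helper: index of the first family that supports graphics.
fn find_graphics_family(families: &[QueueFamily]) -> Option<u32> {
    families
        .iter()
        .position(|family| family.queue_flags & VK_QUEUE_GRAPHICS_BIT != 0)
        .map(|index| index as u32)
}

fn main() {
    let families = [
        QueueFamily { queue_flags: VK_QUEUE_TRANSFER_BIT, queue_count: 2 },
        QueueFamily {
            queue_flags: VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT,
            queue_count: 16,
        },
        QueueFamily { queue_flags: VK_QUEUE_COMPUTE_BIT, queue_count: 8 },
    ];

    match find_graphics_family(&families) {
        Some(index) => println!(
            "Graphics queue family: {} ({} queues)",
            index, families[index as usize].queue_count
        ),
        None => panic!("Chosen physical device is not suitable."),
    }
}
```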

Device

Once we determined the specific device that we want to use, and the queue(s) that we want to use, it's time for device creation.

A Vulkan device is a logical device that lets us create GPU resources, and record and submit GPU work.

One physical device can have multiple logical devices on top and every logical device is backed by a physical device. From now on, I will refer to logical devices as simply devices.


    //
    // Device creation
    //

    let queue_priority: f32 = 1.0;

    let queue_create_info = VkDeviceQueueCreateInfo {
        sType: VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO,
        pNext: core::ptr::null(),
        flags: 0x0,
        queueFamilyIndex: chosen_graphics_queue_family,
        queueCount: 1,
        pQueuePriorities: &queue_priority
    };

    let phys_device_features = VkPhysicalDeviceFeatures::default();

    let device_create_info = VkDeviceCreateInfo {
        sType: VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
        pNext: core::ptr::null(),
        flags: 0x0,
        queueCreateInfoCount: 1,
        pQueueCreateInfos: &queue_create_info,
        enabledLayerCount: 0,
        ppEnabledLayerNames: core::ptr::null(),
        enabledExtensionCount: 0,
        ppEnabledExtensionNames: core::ptr::null(),
        pEnabledFeatures: &phys_device_features
    };

    println!("Creating device.");
    let mut device = core::ptr::null_mut();
    let result = unsafe
    {
        vkCreateDevice(
            chosen_phys_device,
            &device_create_info,
            core::ptr::null(),
            &mut device
        )
    };

    if result != VK_SUCCESS
    {
        panic!("Failed to create vulkan device. Error: {:?}.", result);
    }

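First we create a float queue_priority, because the spec requires a priority for every queue we request: pQueuePriorities must point to queueCount values between 0.0 and 1.0. You can read about how priorities are used in the spec, but it's not important right now.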

Then we create a VkDeviceQueueCreateInfo to describe what queues our device will use. The Vulkan spec mandates that at least one queue must be specified during device creation, so we specify the graphics queue we will use in the future. The queue family is identified by the previously mentioned array index chosen_graphics_queue_family, and the number of queues required by the application is specified in this struct as well.
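
One thing worth keeping in mind is that the struct only stores a raw pointer to the priorities, so the count and the pointed-to data must agree and the data must outlive the create call. The sketch below illustrates this with stand-ins: QueueCreateInfo mimics the count-plus-pointer shape of VkDeviceQueueCreateInfo, and describe_queues and mock_read_priorities are hypothetical helpers, the latter playing the role of vkCreateDevice.

```rust
/// Stand-in with the same count-plus-pointer shape as VkDeviceQueueCreateInfo.
#[repr(C)]
struct QueueCreateInfo {
    queue_family_index: u32,
    queue_count: u32,
    p_queue_priorities: *const f32,
}

/// Hypothetical helper: requests one queue per entry of `priorities`.
/// The returned struct only holds a raw pointer, so `priorities` must
/// stay alive until the create call has returned.
fn describe_queues(queue_family_index: u32, priorities: &[f32]) -> QueueCreateInfo {
    assert!(!priorities.is_empty(), "at least one queue must be requested");
    assert!(
        priorities.iter().all(|p| (0.0..=1.0).contains(p)),
        "priorities must be between 0.0 and 1.0"
    );
    QueueCreateInfo {
        queue_family_index,
        // Deriving the count from the slice keeps the two in sync.
        queue_count: priorities.len() as u32,
        p_queue_priorities: priorities.as_ptr(),
    }
}

/// Mock standing in for vkCreateDevice: reads the priorities back.
unsafe fn mock_read_priorities(info: &QueueCreateInfo) -> Vec<f32> {
    let priorities = unsafe {
        core::slice::from_raw_parts(info.p_queue_priorities, info.queue_count as usize)
    };
    priorities.to_vec()
}

fn main() {
    let priorities = [1.0_f32, 0.5];
    let info = describe_queues(0, &priorities);
    let seen = unsafe { mock_read_priorities(&info) };
    println!(
        "family {} gets {} queue(s) with priorities {:?}",
        info.queue_family_index, info.queue_count, seen
    );
}
```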

We then create a VkPhysicalDeviceFeatures struct which is not very useful right now, but we will enable hardware features in later tutorials by setting the fields of this struct.

Then we reference everything we have previously created in a VkDeviceCreateInfo struct. We do not need any device layers and do not enable any device extensions, so they are null. This struct can finally be consumed by a call to the vkCreateDevice function, whose last parameter is a pointer; on success, the newly created VkDevice is written to the variable it points to. This is our newly created device, and we are happy! If anything goes wrong, we terminate the application.

We also need to clean it up. Make sure you destroy it before you destroy the Vulkan instance!


    //
    // Cleanup
    //

    println!("Deleting device.");
    unsafe
    {
        vkDestroyDevice(
            device,
            core::ptr::null()
        );
    }

    // Destroying the instance should come after...
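
The tutorial keeps cleanup manual, but if you later decide to wrap handles yourself, Rust's drop order can enforce the rule for you: local variables are dropped in reverse order of declaration. The sketch below demonstrates this with stand-in guard types that only record the order. InstanceGuard and DeviceGuard are hypothetical names, and real wrappers would call vkDestroyInstance and vkDestroyDevice in drop.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for an owned VkInstance.
struct InstanceGuard {
    log: Log,
}

impl Drop for InstanceGuard {
    fn drop(&mut self) {
        // A real wrapper would call vkDestroyInstance here.
        self.log.borrow_mut().push("instance destroyed");
    }
}

/// Stand-in for an owned VkDevice.
struct DeviceGuard {
    log: Log,
}

impl Drop for DeviceGuard {
    fn drop(&mut self) {
        // A real wrapper would call vkDestroyDevice here.
        self.log.borrow_mut().push("device destroyed");
    }
}

fn run(log: &Log) {
    let _instance = InstanceGuard { log: log.clone() };
    let _device = DeviceGuard { log: log.clone() };
    // At the end of the scope the device is dropped first,
    // then the instance.
}

fn main() {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    run(&log);
    println!("{:?}", log.borrow()); // prints ["device destroyed", "instance destroyed"]
}
```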

During device creation we specified a single queue because the specification mandates it. We can query this queue from the Vulkan device. Currently it's pretty useless, but later we will use it to submit graphics work to the GPU. The VkQueue does not need to be destroyed, because it is freed when the VkDevice gets destroyed.


    let mut graphics_queue = core::ptr::null_mut();
    unsafe
    {
        vkGetDeviceQueue(
            device,
            chosen_graphics_queue_family,
            chosen_graphics_queue_index,
            &mut graphics_queue
        );
    }

So finally we have a program that can enumerate physical devices, choose one of them and create a device on top! We can query one of its queues... and not do anything with it. This may be a very minimalistic sample application that doesn't do much, but it's a gentle introduction and we can already familiarize ourselves with some important things.

Related: Resources on GPU hardware overview, different GPU architectures and queues.

Some of you may be advanced learners coming from other graphics APIs, some of you may read this tutorial for a second time and some of you may be just curious beginners. Maybe some of these resources are of interest to you.

For those who are interested in GPU hardware, how it lives right next to the CPU, queues and asynchronous execution, additional information can be found here. It talks about many things we will get into, such as synchronization primitives and GPU parallelization. Maybe you'll like it! :)

Some readers might already be familiar with specific hardware, where a Vulkan queue directly maps to a hardware queue, such as some desktop and console GPUs. For them it might be interesting that on many mobile GPUs a single Vulkan queue aggregates two different hardware queues. For instance this Mali blog post about using compute post processing for Mali starts with explaining how in Mali GPUs a single Vulkan queue actually represents two hardware queues, a Vertex/Tiling/Compute queue and a Fragment queue. As the blog post progresses, this detail will have far reaching consequences when it comes to synchronization schemes and their performance.

This PowerVR blog post also briefly mentions that "It's not quite a 1:1 mapping of hardware front-ends to API queues for us, as the Vulkan graphics queue is represented by two hardware front-ends (tiling and rasterization)." So Vulkan queues are still quite abstract and can be backed by many kinds of arrangements under the hood. As you get into more advanced cases for synchronization, understanding GPU hardware will give you better intuition on what synchronization will do and what kind of rendering techniques work well on what GPU architecture.

Validation layers

Now let's introduce a very important tool for debugging Vulkan applications, the Validation layers. For the sake of experiment let's comment out the part where we destroy our device!


    println!("Deleting device.");
    /*
    unsafe
    {
        vkDestroyDevice(
            device,
            core::ptr::null()
        );
    }
    */

When I last ran it, it worked. Everything is fine, isn't it?

No. I can't even tell if it actually ran for you when you tried to run it. (It probably didn't crash your OS, but who knows?) So what's wrong?

Vulkan has strict rules about API usage. One class of such rules is about when objects need to be destroyed. Certain objects need to outlive other objects. For instance, a VkInstance needs to outlive every VkDevice created from it. Quoting the spec: "All child objects created using instance must have been destroyed prior to destroying instance." By commenting out the vkDestroyDevice call, we violated this rule. Now our code may be buggy.

There are many other rules and classes of rules. Some constrain that certain objects free other objects when destroyed. (Such as when you destroy a VkInstance, all of the enumerated physical devices are deallocated.) Some constrain that certain function parameters need to take certain values or a range of values. (Such as how vkEnumeratePhysicalDevices takes two pointers, whether they can be null and how the function behaves when they are.) Some are related to multithreaded uses of Vulkan objects, synchronization and so on.

You will see many other kinds of rules as you progress and your application becomes more serious.

Violating these rules is incorrect API usage that can cause your application to crash, produce artifacts or make your computer unresponsive. (I had to restart mine a few times when I messed up. :P) It is possible that on an untested/future video card your application will crash, produce artifacts or make your or someone else's OS unresponsive.

Vulkan drivers are not required to handle such errors, so drivers from GPU vendors skip much of the error checking for the sake of performance. But how do you find and fix these errors when they may or may not cause problems for you but may have severe consequences for someone else?

Validation layers come to the rescue!

Vulkan is built in such a way that libraries called layers can intercept Vulkan calls. The Validation layers are layers that intercept Vulkan calls and check whether you use the API correctly. If you violate any rules, they write an error message to the console. Let's see what they print out for our missing vkDestroyDevice call!


UNASSIGNED-ObjectTracker-ObjectLeak(ERROR / SPEC): msgNum: 699204130 - Validation Error: [ UNASSIGNED-ObjectTracker-ObjectLeak ] Object 0: handle = 0x55cd33ad2580, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x29ad0222 | OBJ ERROR : VK_DEBUG_REPORT_OBJECT_TYPE_DEVICE_EXT object VkDevice 0x55cd33ad2580[] has not been destroyed.
    Objects: 1
        [0] 0x55cd33ad2580, type: 3, name: NULL

There is our error message. Such errors should be fixed in your application. The Vulkan Validation layers are pretty much a must during development, but they incur a massive performance hit which is especially noticeable in multithreaded code, so you should not enable them in the released application (or at least not by default).

Now that we see the uses of Validation layers, let's see how we can use them in our application!

Installing validation layers

If you have the Vulkan SDK installed, then chances are Validation layers are already installed on your system.

If you have a different arrangement, for example you are using a Linux distro and installed the Vulkan libraries and headers from the repos, you may need to consult the manual on how to install them. For instance, on Arch Linux, the wiki clearly states which package contains the Validation layers.

Once you install them by whatever means, there are two ways to use them: using environment variables and requiring them in code.

Using Validation layers from environment variables

Vulkan can be instructed to use layers using an environment variable. The advantage is that you don't need to write extra code just to enable validation (or any other) layers.

The environment variable we need to set is VK_INSTANCE_LAYERS, which expects a list of layer names separated by colons on Linux and macOS, and by semicolons on Windows. The layer we are looking for is VK_LAYER_KHRONOS_validation, and on Linux we can enable it like this:


VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation cargo run --bin vk_tutorial

After running the application like this, the validation layers should give the previously mentioned error about not deleting our logical device.

Using Validation layers from code

If for whatever reason you don't want to use environment variables, you can enable validation layers from code as well.

The layer we want to enable is the VK_LAYER_KHRONOS_validation. Let's see how we can do this from code.


fn main()
{
    //
    // Layers
    //

    let std_validation_layer = b"VK_LAYER_KHRONOS_validation\0";
    let layers = [std_validation_layer.as_ptr() as *const core::ffi::c_char];

    let mut available_layer_count = 0;
    let mut available_layers = Vec::new();
    unsafe
    {
        vkEnumerateInstanceLayerProperties(
            &mut available_layer_count,
            core::ptr::null_mut()
        );
    }

    available_layers.resize(available_layer_count as usize, VkLayerProperties::default());
    unsafe
    {
        vkEnumerateInstanceLayerProperties(
            &mut available_layer_count,
            available_layers.as_mut_ptr()
        );
    }

    for layer in layers.iter()
    {
        let layer = unsafe { core::ffi::CStr::from_ptr(*layer) };
        let mut found = false;
        for available_layer in available_layers.iter()
        {
            let available_layer = unsafe
            {
                core::ffi::CStr::from_ptr(
                    available_layer.layerName.as_ptr()
                )
            };

            if layer == available_layer
            {
                found = true;
            }
        }

        if !found
        {
            println!("Layer {:?} is not supported.", layer);
        }
    }

    // Instance creation comes after...

The list of supported layers needs to be queried, and we need to check whether the layers we want to enable are available.

The parameters of the function vkEnumerateInstanceLayerProperties are conceptually similar to vkEnumeratePhysicalDevices and vkGetPhysicalDeviceQueueFamilyProperties. The function either queries the number of layers or queries the actual layers based on its parameters. Since I already described this scheme during device enumeration and queue selection, I will not go into details. You can find it in the spec instead.

First we get the number of layers, then we allocate memory and read the available layers into it. Finally we check whether our layers are among the supported layers. If one of the layers we want to enable is not supported, we write an error message to stdout.
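
The check above only prints a message, and instance creation would still fail with VK_ERROR_LAYER_NOT_PRESENT if a requested layer is missing. One possible refinement is to enable only the layers that were actually found. The sketch below shows the idea on plain CStr values. The partition_layers helper and the VK_LAYER_EXAMPLE_missing name are made up for illustration, and the available list stands in for the names read from VkLayerProperties.layerName.

```rust
use core::ffi::{c_char, CStr};

/// Hypothetical helper: splits the requested layers into
/// (available, missing) based on the list of supported layers.
fn partition_layers<'a>(
    requested: &[&'a CStr],
    available: &[&CStr],
) -> (Vec<&'a CStr>, Vec<&'a CStr>) {
    requested
        .iter()
        .copied()
        .partition(|name| available.iter().any(|supported| *supported == *name))
}

fn main() {
    let validation = CStr::from_bytes_with_nul(b"VK_LAYER_KHRONOS_validation\0").unwrap();
    let made_up = CStr::from_bytes_with_nul(b"VK_LAYER_EXAMPLE_missing\0").unwrap();

    // Stand-in for the layers reported by vkEnumerateInstanceLayerProperties.
    let available = [validation];

    let (enabled, missing) = partition_layers(&[validation, made_up], &available);
    for name in &missing {
        println!("Layer {:?} is not supported, skipping it.", name);
    }

    // Only these pointers would go into ppEnabledLayerNames.
    let layer_pointers: Vec<*const c_char> =
        enabled.iter().map(|name| name.as_ptr()).collect();
    println!("Enabling {} layer(s).", layer_pointers.len());
}
```

Whether silently skipping a missing layer is acceptable depends on your application. During development you may prefer to fail loudly instead.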

Afterwards we need to add this array of layer names to our instance create info.


    let create_info = VkInstanceCreateInfo {
        sType: VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
        pNext: core::ptr::null(),
        flags: 0x0,
        pApplicationInfo: &application_info,
        enabledExtensionCount: 0,
        ppEnabledExtensionNames: core::ptr::null(),
        enabledLayerCount: layers.len() as u32, // We added this...
        ppEnabledLayerNames: layers.as_ptr() // ...and this
    };

If we compile and run our application, the aforementioned error is displayed and you know there is an application bug you need to fix.

The sample applications take the code approach to enable validation layers and will let you toggle the validation layers and select Vulkan devices using command line flags. We do not cover this in the tutorial, because it's just a convenience, but be prepared for this little extra in the samples!

Wrapping up

That's it for now. If we take a look at the resulting source code, you might see that there is a lot of setup code that would be better off extracted into a reusable function. As explained on the home page, the tutorial and the sample code will not extract these most of the time for two reasons.

It would lead to a "tutorial framework" emerging, and understanding such a framework adds extra burden for a new learner. Learning Vulkan will be hard enough without that. Instead every sample application is self contained (minus 1-2 functions), and can be read from beginning to end.
Even if I extracted some utility functions and structs, they would not be optimal for your problem.

For these reasons any restructuring and any utility function extracting will be your homework instead.

In the next chapter we will create a window, interface Vulkan with it and we are going to touch almost every part of our setup code.

The sample code for this tutorial can be found here.

The tutorial continues here.