An approach for evolution within the industry with a research process tailored for the current era
―Please tell us what made you decide to give your presentation on this particular theme?
Masuno：The theme grew out of time I initially spent on self-study. We work on a fixed work schedule here at Luminous Productions, so I used my free time after office hours to study a field that interests me. During my studies, I was fascinated by a theme similar to what I ended up presenting, namely “improved quality achieved by integrating rendering and machine learning”, so I moved forward with testing and implementing something in that vein. I then applied to CEDEC as a platform to present the results of my research. I feel that I was able to bring my research to fruition specifically because of the environment that our studio offers. After I was selected for CEDEC, the studio also allowed me to prepare for the presentation as one of my tasks at work.
―The environment in the studio encourages you to learn and apply that new knowledge to your work, I see. So, you chose the theme of your presentation based on the studies you were engaging in with the anticipation that there could be a potential implementation of the subject for your future projects?
Masuno：I do have our future projects in mind, but another reason I started this was that machine learning is the current trend in the IT industry and, lately, some GPUs are being equipped with cores dedicated to processing machine learning efficiently. Processors themselves are gradually shifting from expanding their general-purpose computing capability toward more specialized forms suited to specific kinds of processing, and APIs for using such specialized cores have been released as well. So, with the current technical trends and the future evolution of processors and APIs in mind, I chose machine learning using DirectX 12 as the theme of my presentation.
―There are different types of graphics APIs available nowadays – was there a particular reason behind your decision to use DirectX 12?
Masuno：There is indeed a variety of graphics APIs; Vulkan, which is supported on the Stadia version of FINAL FANTASY XV, and Metal are particularly popular these days, but I don’t think the differences between the APIs are that significant. Platform support aside, I personally like Metal as it is easier to write code for. (laugh)
The reason I used DirectX 12 this time was that I wanted to implement a combination of path tracing and machine learning for my presentation. While you can execute ray tracing with Vulkan, it still isn’t usable as a standard method because it partly relies on extensions. I went with DirectX 12 as it was easier to test in our current environment.
―So it’s a combination of what the trends are and applying the right technology at the right time. Apart from the graphics APIs, what is the current trend when it comes to research in improving image quality?
Masuno：Needless to say, ray tracing and the use of machine learning, which is also the theme of my presentation, are drawing my attention personally as well. Although it will come with restrictions, real-time ray tracing will become common on next-generation hardware and, looking even further into the future, games that fully support path tracing will eventually emerge. I predict that by then the rendering pipeline will evolve to include processes for improving image quality, such as denoising and super-resolution, using machine learning in subsequent passes.
―Is the result of your research ready to be implemented in the Luminous Engine?
Masuno：I’m hoping to feed the results back into the engine sometime in the future, but in order to use them in our game, some more research will be necessary before practical implementation. I also have several challenges to overcome. After actually implementing my results this time, I now have a better understanding of what works and what still needs further refinement going forward; now I’ll need to look into an approach for tackling that “what to focus on from now on” part.
―Could you elaborate on that “what to focus on from now on” part you mentioned?
Masuno：This time, I implemented denoising and super-resolution on the framebuffer of the rendering result by applying convolution and deconvolution multiple times. However, the target framerate couldn’t be achieved: there were too many convolution filters, which put a significant amount of pressure on memory, and the calculation cost of the convolutional neural network (CNN) that I implemented myself in the compute shader ended up being too high.
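To give a rough sense of why filter count drives both memory pressure and calculation cost, here is a minimal back-of-the-envelope sketch. It is not Masuno's implementation: the function name `conv_layer_cost`, the 1920x1080 resolution, the 32 channels, the 3x3 kernel, and 16-bit storage are all assumptions chosen for illustration, and the layer is assumed to keep the framebuffer resolution (stride 1, padded).

```python
def conv_layer_cost(width, height, in_channels, out_channels,
                    kernel_size, bytes_per_value=2):
    """Estimate the cost of one full-resolution convolution layer.

    Hypothetical helper: assumes stride 1 with padding, so the output
    has the same width and height as the input framebuffer.
    Returns (multiply-accumulates, output activation bytes).
    """
    pixels = width * height
    # Every output value needs kernel_size^2 * in_channels multiply-accumulates.
    macs = pixels * out_channels * in_channels * kernel_size * kernel_size
    # Each filter produces one full-resolution output channel to store.
    activation_bytes = pixels * out_channels * bytes_per_value
    return macs, activation_bytes


# Illustrative numbers only: a 1080p framebuffer, 32 -> 32 channels, 3x3 kernel.
macs, act_bytes = conv_layer_cost(1920, 1080, 32, 32, 3)
print(f"{macs / 1e9:.1f} GMACs per layer per frame")
print(f"{act_bytes / 2**20:.1f} MiB of activations per layer")
```

Under these assumptions a single layer already costs about 19 billion multiply-accumulates and about 127 MiB of output activations per frame, and both figures scale linearly with the number of filters, which is consistent with the bottlenecks described above.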
―Have you already found the right approach to tackle that challenge?
Masuno：There are several methods, but I think the biggest challenge lies in a more lightweight implementation using Depthwise Separable Convolutions and a model with fewer filters. I’ll definitely need to investigate this further down the line.
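As a minimal sketch of why depthwise separable convolutions are lighter, the snippet below compares the per-pixel multiply-accumulate count (equal to the weight count when biases are ignored) of a standard convolution with that of a depthwise pass followed by a 1x1 pointwise pass. The function names and the 32-channel, 3x3 example are assumptions for illustration, not figures from the presentation.

```python
def standard_conv_cost(in_channels, out_channels, kernel_size):
    """Per-pixel multiply-accumulates of a standard convolution (no bias)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_cost(in_channels, out_channels, kernel_size):
    """Per-pixel multiply-accumulates of a depthwise separable convolution.

    Depthwise step: one kernel_size x kernel_size filter per input channel.
    Pointwise step: a 1x1 convolution that mixes channels.
    """
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Illustrative numbers only: 32 -> 32 channels with a 3x3 kernel.
standard = standard_conv_cost(32, 32, 3)
separable = depthwise_separable_cost(32, 32, 3)
print(standard, separable, f"{separable / standard:.3f}")
```

The ratio works out to 1/out_channels + 1/kernel_size², so with a 3x3 kernel the separable form needs roughly a seventh to a ninth of the arithmetic for typical channel counts. Whether image quality holds up with the reduced model is a separate question that this sketch does not address.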
―Which means, you should continue with your study and research?
Masuno：Some progressive companies overseas have already put something similar to my presentation theme into actual use. The application of machine learning to real-time rendering is only starting to emerge, so I try to keep an eye out and gather information not only from SIGGRAPH and GDC, but also from a wide range of sources such as GTC, ICCV, and NeurIPS so as not to limit myself to what’s happening in the field of games alone.
―In the previous installment of our Interview series, Iwasaki mentioned anticipating that game development would require technologies from various fields in the future. It seems like the graphics programmers share the same expectation as well?
Masuno：Yes, my impression is that the technical skill set required of developers is becoming more extensive. Even if you look at graphics programmers alone, the job requires a wide-ranging skill set depending on what initiatives are pursued and what the work involves.
―With new technologies emerging at a rapid pace, it must be becoming harder for the developers to catch up to and surpass what is already available?
Masuno：The more we can do, the harder it gets. Even though it’s hard and requires a lot of work, if the expressive capabilities of the game itself are improved, we can potentially deliver something new that has never been seen before, which is both fascinating and rewarding. I want to maintain a positive attitude toward the evolution of technology and proactively keep up with what’s new so that, one day, I’ll hopefully be in a position to lead.
The “from now on into the future” that Masuno aspires to see
―Sounds like your pursuit of knowledge is nowhere near finished; are there any other technologies you’re interested in studying?
Masuno：What I’m hoping to get my hands on down the line is the automated generation of 3D geometry using a generative model such as a GAN. Taking background LOD production as an example, the major battle now is figuring out how to achieve greater density and more information at mid and long distance as well as super-high-density geometry representation at short distance. There were many restrictions previously; for instance, a place that looked like a forest at short distance would either be replaced with simplified geometry or, in some cases, we’d add processing to delete the forest altogether at long distance. So, as an initiative going forward, I’m looking into methods such as retaining only two types of information (high density from close distance and silhouette from long distance) and auto-generating a LOD model at runtime from the silhouette based on the distance and the camera range.
―In a game that is striving for realism, the gaming experience can definitely get compromised instantly if the background occasionally looks cheap depending on its behavior during gameplay – is it difficult to prevent a problem like that?
Masuno：It was quite difficult as, previously, we had to take into account the quality and processing load on top of balancing the production flow and such. Of course some aspects could be handled through automation, but things like the optimal range of MassiveLOD had to be adjusted manually.
For that reason, the background artists were often required to push their creative work aside and take care of things like adjusting the packaging of detailed model data and fixing the LOD distance until right before the game went gold. So, it’d be ideal to eliminate such tasks and, with the help of AI, auto-generate at runtime a LOD model of the appropriate range that matches the progression of the game.
―Right, the implementation consists of a combination of various content. Is that kind of automation possible theoretically?
Masuno：Automated generation of 3D models is a pretty hot topic and is therefore being researched at different companies, with some solutions already available for practical use. However, whether or not it is applicable at the actual product level is a completely different question, and I think we are far from making that happen. While we might be able to create sufficiently realistic model data, it could be judged unusable in the end if it feels uncanny when it’s placed in the game world.
―Regardless of how realistic the model is, it could be completely spoiled if it feels even a little uncanny – is this a never-ending challenge for realistic games?
Masuno：That’s possibly the most important challenge to overcome in our work. Just as you said, no matter how beautiful our asset might be, if it shows even the tiniest hint of being unnatural, then the human eye gets drawn to that automatically. It’s often the case that even when each individual element is well-made, we still end up finding a sense of awkwardness when looking at the whole picture. This is the part that is directly linked to the final quality of the image and to what the gaming experience will be like for our players. We need to carefully eliminate the uncanniness one detail at a time.
―But eliminating such uncanniness one by one seems to require tremendous effort?
Masuno：You’re right... I talked about LOD generation as an example earlier, but when you try to create an open world game where a vast land extends across several square kilometers or even tens of square kilometers, it simply isn’t realistic to try to create all of that by hand to begin with. So, the integration of procedural technologies is a must and, furthermore, we’ll have to create a system that ensures efficiency and quality by introducing some sort of AI-related approach.
That being said, no matter how much we can automate, it will never get close to the sensitivity and sense that an actual human has, and that’s where we’ll still need human hands to get involved.
In a never-ending search for new surprises and gaming beyond imagination
―I hear it’s only been a few years since you joined the studio – what do you think the unique identity or characteristic of Luminous Productions is?
Masuno：This is quite a personal topic... but my child was born in February. We have a lot of experienced dads in the office and it helps me tremendously that I have people who are willing to listen to my child-rearing struggles. This is a company that is taking on leading-edge challenges and yet it offers a solid fixed work schedule, and we also have a number of employees on maternity/paternity leave. I think this work-life balance and the environment that provides parenting support for both male and female employees is something unique about this studio in Japan. Sorry, it’s a personal topic. (laugh)
―The studio makes you feel at home? (laugh). I think that kind of comfortable atmosphere plays quite an important role in allowing employees to make the high-quality input and output necessary to create a fun game.
―Lastly, please tell us about your goals, passion, or any challenges you want to take on at Luminous Productions in the future?
Masuno：It’s a cliché, but at the end of the day what I want is to create a high-quality game. Video games these days are continuing to evolve and to adopt a variety of technologies. However, no matter how impressive the expression a game incorporates is, the human eye will eventually get used to the quality, and the fascination and excitement that players feel at first will soon fade away. For this exact reason, I think it’s vital that we continuously take on the latest technology while it’s current and always deliver new surprises and experiences to our players that surpass their imagination. Also, I personally enjoy taking on technological challenges, so I want to try my hand at lots of different things. I also want a chance to someday give presentations at GDC and SIGGRAPH, and get my research papers published at ICCV, CVPR, and NeurIPS. For that, I plan on putting my effort into learning something new every day and continuing to apply what I learn to my work. I’m privileged to be in an environment where the company supports such endeavors, and the ideal cycle for me would be to have my personal challenges reflected in the game engine and have that eventually lead to the cutting-edge game production we are pursuing.